Spring Boot Metrics Monitoring

Metrics are numerical measurements collected over time — request counts, latencies, error rates, JVM heap usage, database pool sizes, and custom business counters. Spring Boot uses Micrometer as its metrics facade, which supports dozens of monitoring backends (Prometheus, Datadog, InfluxDB, CloudWatch) through pluggable registries. Micrometer collects metrics; a backend like Prometheus stores and queries them.

Micrometer Concepts

Micrometer provides five core meter types. Each answers a different question about the system. Choosing the right type gives monitoring dashboards accurate data and prevents misleading visualisations — using a Counter when a Gauge is needed, for example, makes dashboards show total events when users expect current state.
Java
// ── Five Micrometer meter types: ─────────────────────────────────────
//
// 1. COUNTER
//    A value that only ever increases.
//    Question: "How many times did X happen?"
//    Examples: requests processed, errors, orders placed, logins
//    Key methods: increment(), increment(double amount)
//
// 2. GAUGE
//    A value that can go up or down — represents current state.
//    Question: "What is the current value of X right now?"
//    Examples: active connections, queue depth, cache size, heap used
//    Key methods: gauge(name, object, valueFunction)
//    Note: Gauge is observed at scrape time — not recorded.
//
// 3. TIMER
//    Measures duration and frequency of events.
//    Question: "How long does X take? How often does it happen?"
//    Examples: HTTP request duration, DB query time, external API calls
//    Key methods: record(duration), record(Supplier), recordCallable()
//    Publishes: count, totalTime, max; percentiles (p50, p95, p99) when configured
//
// 4. DISTRIBUTION SUMMARY
//    Measures distribution of values (not time).
//    Question: "What is the distribution of X?"
//    Examples: request payload size, response body size, order value
//    Key methods: record(double amount)
//    Publishes: count, totalAmount, max; percentiles when configured
//
// 5. LONG TASK TIMER
//    Measures duration of tasks that are currently running.
//    Question: "How long have active tasks been running?"
//    Examples: batch jobs, scheduled tasks, long-running DB migrations
//    Key methods: start() returns a Sample; sample.stop() ends the task

// ── Tags — the dimensions of a metric: ───────────────────────────────
// Every meter can have tags (key-value pairs) that add dimensions.
// Tags let you slice metrics by endpoint, status, region, customer tier, etc.
//
// Counter.builder("orders.placed")
//     .tag("channel",  "mobile")      // filter by channel
//     .tag("region",   "us-east-1")   // filter by region
//     .tag("tier",     "premium")     // filter by customer tier
//     .register(meterRegistry);
//
// In Prometheus/Grafana, you can query:
//   sum(orders_placed_total{channel="mobile"}) — mobile orders only
//   rate(orders_placed_total[5m])              — order rate per second
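
The tag mechanism described above can be pictured with a small plain-Java model. This is only an illustrative sketch, not Micrometer's implementation: the class `TaggedCounterModel` and its methods are hypothetical names. It shows that each distinct combination of tag values becomes its own series, and that a filtered `sum(...)` query adds up the matching series.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Illustrative model of one tagged counter (for example "orders.placed").
// Hypothetical class: it mimics the idea of tags, not Micrometer's internals.
final class TaggedCounterModel {

    // One LongAdder per unique tag combination, i.e. one per time series.
    private final Map<Map<String, String>, LongAdder> series = new ConcurrentHashMap<>();

    // Increment the series identified by this exact set of tags.
    void increment(Map<String, String> tags) {
        // Copy into a TreeMap so the key is independent of the caller's map.
        series.computeIfAbsent(new TreeMap<>(tags), k -> new LongAdder()).increment();
    }

    // Number of distinct series created so far.
    int seriesCount() {
        return series.size();
    }

    // Rough equivalent of sum(metric{tagKey="tagValue"}) in PromQL.
    long sumWhere(String tagKey, String tagValue) {
        return series.entrySet().stream()
            .filter(e -> tagValue.equals(e.getKey().get(tagKey)))
            .mapToLong(e -> e.getValue().sum())
            .sum();
    }
}
```

Because every new tag value creates another series, tags work best with a small, bounded set of values (channel, region, tier) rather than unbounded ones such as user IDs.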

Built-in Spring Boot Metrics

Spring Boot auto-configures a rich set of metrics for the JVM, HTTP server, database connection pool, and any Spring component it detects. These metrics require no code — they start flowing the moment the Actuator and a metrics registry are on the classpath.
Java
// ── Auto-configured metric groups: ───────────────────────────────────

// JVM metrics (jvm.*):
// jvm.memory.used            → heap and non-heap memory used
// jvm.memory.max             → max available memory per pool
// jvm.gc.pause               → GC pause duration (Timer)
// jvm.gc.memory.promoted     → bytes promoted to old gen
// jvm.threads.live           → current live thread count
// jvm.threads.daemon         → daemon thread count
// jvm.threads.states         → threads by state (RUNNABLE, BLOCKED, etc.)
// jvm.classes.loaded         → loaded class count
// jvm.buffer.memory.used     → direct/mapped buffer pool memory

// HTTP server metrics (http.server.requests):
// Dimensions: uri, method, status, exception, outcome
// Measurements: count, totalTime, max
// Example queries:
//   rate(http_server_requests_seconds_count{status="500"}[5m])
//   histogram_quantile(0.99, sum by (le) (rate(http_server_requests_seconds_bucket[5m])))

// HikariCP connection pool (hikaricp.*):
// hikaricp.connections.active    → currently in-use connections
// hikaricp.connections.idle      → idle connections in pool
// hikaricp.connections.pending   → threads waiting for a connection
// hikaricp.connections.max       → maximum pool size
// hikaricp.connections.timeout   → connection timeout events

// System metrics (system.*  /  process.*):
// system.cpu.usage           → system-wide CPU usage (0.0 to 1.0)
// process.cpu.usage          → this JVM's CPU usage
// process.uptime             → seconds since JVM start
// process.files.open         → open file descriptors
// disk.free                  → free disk space

// Cache metrics (cache.*):
// cache.gets{result="hit"}   → cache hit count
// cache.gets{result="miss"}  → cache miss count
// cache.puts                 → cache put count
// cache.evictions            → eviction count

// ── application.yml — enable percentile histograms: ──────────────────
// management:
//   metrics:
//     distribution:
//       percentiles-histogram:
//         http.server.requests: true
//         orders.placement.duration: true
//       percentiles:
//         http.server.requests: 0.50, 0.95, 0.99
//       slo:                              # service level objectives
//         http.server.requests: 100ms, 250ms, 500ms, 1000ms
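
When histogram buckets are published, the monitoring backend estimates quantiles from cumulative bucket counts rather than from raw samples. The sketch below is a simplified plain-Java model of the linear interpolation that a function such as `histogram_quantile` performs; it is not Prometheus's actual code. The class name `HistogramQuantileSketch` is hypothetical, and the bucket bounds simply reuse the SLO values from the configuration above.

```java
// Simplified model of quantile estimation from cumulative histogram buckets.
// Hypothetical helper for illustration only.
final class HistogramQuantileSketch {

    /**
     * @param q                quantile in [0, 1], e.g. 0.99
     * @param upperBounds      ascending bucket upper bounds in seconds; last is +Infinity
     * @param cumulativeCounts observations with value <= the matching upper bound
     */
    static double quantile(double q, double[] upperBounds, long[] cumulativeCounts) {
        if (q < 0.0 || q > 1.0) {
            throw new IllegalArgumentException("q must be in [0, 1]");
        }
        if (upperBounds.length == 0 || upperBounds.length != cumulativeCounts.length) {
            throw new IllegalArgumentException("bounds and counts must match and be non-empty");
        }
        long total = cumulativeCounts[cumulativeCounts.length - 1];
        if (total == 0) {
            return Double.NaN; // no observations recorded
        }
        double rank = q * total;

        // Find the first bucket whose cumulative count reaches the rank.
        int b = 0;
        while (b < cumulativeCounts.length - 1 && cumulativeCounts[b] < rank) {
            b++;
        }
        if (Double.isInfinite(upperBounds[b])) {
            // The quantile lies beyond the last finite bound; report that bound.
            return b == 0 ? Double.NaN : upperBounds[b - 1];
        }
        double lower = (b == 0) ? 0.0 : upperBounds[b - 1];
        long below = (b == 0) ? 0L : cumulativeCounts[b - 1];
        long inBucket = cumulativeCounts[b] - below;
        if (inBucket == 0) {
            return lower;
        }
        // Assume observations are spread evenly inside the bucket.
        return lower + (upperBounds[b] - lower) * ((rank - below) / inBucket);
    }
}
```

The estimate can only be as precise as the bucket boundaries, which is one reason to place SLO boundaries at the latencies that matter for alerting.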

Custom Metrics

Business metrics are as important as technical metrics. Tracking orders placed, payments processed, and active users in the same system as JVM and HTTP metrics enables unified dashboards that correlate technical symptoms (high p99 latency) with business impact (order placement rate dropped 40%).
Java
@Service
@RequiredArgsConstructor
@Slf4j
public class OrderMetricsService {

    private final MeterRegistry meterRegistry;
    private final AtomicInteger activeOrders = new AtomicInteger(0);

    @PostConstruct
    public void initMetrics() {
        // GAUGE — observe current value at scrape time:
        Gauge.builder("orders.active", activeOrders, AtomicInteger::get)
            .description("Number of orders currently being processed")
            .register(meterRegistry);

        // Gauge reading a method on this bean
        // (getQueueDepth() is assumed to be defined elsewhere in this class):
        Gauge.builder("queue.depth", this, obj ->
                obj.getQueueDepth())
            .description("Current order processing queue depth")
            .tag("queue", "order-processing")
            .register(meterRegistry);
    }

    // COUNTER: track placed orders by channel and customer tier.
    public void recordOrderPlaced(String channel, String tier) {
        Counter.builder("orders.placed")
            .description("Total orders placed")
            .tag("channel", channel)    // mobile | web | api
            .tag("tier",    tier)       // free | premium | enterprise
            .register(meterRegistry)
            .increment();
    }

    public void recordOrderFailed(String channel, String reason) {
        Counter.builder("orders.failed")
            .description("Total failed orders")
            .tag("channel", channel)
            .tag("reason",  reason)     // payment_failed | out_of_stock
            .register(meterRegistry)
            .increment();
    }

    // TIMER: measure end-to-end order placement duration
    // (doPlaceOrder is assumed to be defined elsewhere in this class).
    public OrderResponse placeOrderWithMetrics(
            CreateOrderRequest request) {
        return Timer.builder("orders.placement.duration")
            .description("Time to place an order end to end")
            .tag("channel", request.getChannel())
            .publishPercentiles(0.5, 0.95, 0.99)
            .register(meterRegistry)
            .record(() -> doPlaceOrder(request));
    }

    // DISTRIBUTION SUMMARY — track order value distribution:
    public void recordOrderValue(BigDecimal amount, String currency) {
        DistributionSummary.builder("orders.value")
            .description("Distribution of order values")
            .tag("currency", currency)
            .baseUnit("cents")          // assumes amount is already expressed in cents
            .publishPercentiles(0.5, 0.95, 0.99)
            .register(meterRegistry)
            .record(amount.doubleValue());
    }

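    // LONG TASK TIMER: track running batch jobs.
    // Registered on Metrics.globalRegistry because a field initializer runs before
    // constructor injection; Spring Boot adds its registry to the global one by default.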
    private final LongTaskTimer batchJobTimer = LongTaskTimer
        .builder("batch.job.active")
        .description("Currently running batch jobs")
        .register(Metrics.globalRegistry);

    public void runBatchJob(Runnable job) {
        LongTaskTimer.Sample sample = batchJobTimer.start();
        try {
            job.run();
        } finally {
            sample.stop();
        }
    }
}
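
The `orders.active` gauge in the listing reads `activeOrders`, but the listing does not show where that value is updated. One possible pattern, sketched here in plain Java under the assumption that each order is processed inside a single method call, is to increment before the work starts and decrement in a `finally` block so that failures cannot leave the gauge stuck. The class `ActiveOrderTracker` and its method names are hypothetical.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical helper: maintains the value that a gauge would observe.
final class ActiveOrderTracker {

    private final AtomicInteger activeOrders = new AtomicInteger();

    // The function a gauge would call at scrape time.
    int currentlyActive() {
        return activeOrders.get();
    }

    // Runs the work while counting it as active, even if it throws.
    <T> T track(Supplier<T> work) {
        activeOrders.incrementAndGet();
        try {
            return work.get();
        } finally {
            activeOrders.decrementAndGet();
        }
    }
}
```

In the service above, the same increment and decrement could wrap the call that processes an order, with the existing `activeOrders` field as the counter.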

Metrics with @Timed and @Counted

Micrometer provides AOP-based annotations that instrument methods without any MeterRegistry boilerplate. @Timed records the duration and invocation count of a method. @Counted counts how many times it is called. Both require the spring-boot-starter-aop dependency to activate the AOP proxy.
XML
<!-- pom.xml: AOP is required for annotation support -->
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-aop</artifactId>
</dependency>

Java
@Service
@RequiredArgsConstructor
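public class ProductService {

    // Injected through the Lombok-generated constructor (repository type name assumed):
    private final ProductRepository productRepository;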

    // ── @Timed — records duration + call count: ───────────────────────
    @Timed(
        value       = "product.search.duration",
        description = "Time to search products",
        percentiles = {0.5, 0.95, 0.99},
        histogram   = true,
        extraTags   = {"layer", "service"}
    )
    public List<ProductResponse> search(ProductSearchRequest request) {
        return productRepository.search(request);
    }

    // ── @Timed with defaults: only the metric name is set ─────────────
    @Timed(value = "product.findById.duration")
    public ProductResponse findById(Long id) {
        return productRepository.findById(id)
            .map(ProductResponse::from)
            .orElseThrow(() ->
                new ResourceNotFoundException("Product not found: " + id));
    }

    // ── @Counted — counts method invocations: ─────────────────────────
    @Counted(
        value       = "product.views.total",
        description = "Number of product detail views",
        extraTags   = {"source", "api"}
    )
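    public ProductDetailResponse getDetail(Long id) {
        // Note: this internal call bypasses the Spring AOP proxy, so the
        // @Timed on findById is not applied when invoked from here.
        return buildDetailResponse(findById(id));
    }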
}

// ── Enable @Timed and @Counted processing: ────────────────────────────
@Configuration
public class MetricsConfig {

    // Required to process @Timed annotations:
    @Bean
    public TimedAspect timedAspect(MeterRegistry registry) {
        return new TimedAspect(registry);
    }

    // Required to process @Counted annotations:
    @Bean
    public CountedAspect countedAspect(MeterRegistry registry) {
        return new CountedAspect(registry);
    }
}
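
Because these aspects are applied through a proxy around the bean, a call from one method of a bean to another method on `this` does not pass through the aspect. In `ProductService`, that affects `getDetail`, which calls `findById` directly. The sketch below reproduces the effect with a plain JDK dynamic proxy instead of Spring; the types `ProductOps`, `PlainProductOps`, and `SelfInvocationDemo` are hypothetical and exist only for this illustration.

```java
import java.lang.reflect.Proxy;
import java.util.List;

// Hypothetical interface standing in for an instrumented service.
interface ProductOps {
    String findById(long id);
    String getDetail(long id);
}

final class PlainProductOps implements ProductOps {
    @Override
    public String findById(long id) {
        return "product-" + id;
    }

    @Override
    public String getDetail(long id) {
        // Calls findById on "this", so no proxy sees this inner call.
        return "detail of " + findById(id);
    }
}

final class SelfInvocationDemo {

    // Wraps the target in a proxy that records the name of every intercepted method.
    static ProductOps instrument(ProductOps target, List<String> intercepted) {
        return (ProductOps) Proxy.newProxyInstance(
            ProductOps.class.getClassLoader(),
            new Class<?>[] { ProductOps.class },
            (proxy, method, args) -> {
                intercepted.add(method.getName());
                return method.invoke(target, args);
            });
    }
}
```

Calling `getDetail` on the proxy records only `getDetail`; the nested `findById` call is invisible to the interceptor. A common workaround is to move the inner method to a separate bean so the call crosses a proxy boundary.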