Observability Stack — Logs, Metrics, Traces

Set up a complete observability stack: structured logging, Prometheus metrics, and distributed tracing with OpenTelemetry, all correlated by trace ID.

by Promptsy Team

## Task
Implement a complete observability setup: logs, metrics, and traces, all correlated by trace ID.

## Requirements
- Language: TypeScript/Node.js
- Logs: Structured JSON → stdout (collected by platform)
- Metrics: Prometheus format, exposed at /metrics
- Traces: OpenTelemetry → Jaeger or Tempo

## Implementation

### 1. Structured Logging
Every log line is a JSON object that carries the trace correlation fields:

```json
{
  "level": "info",
  "message": "Order created",
  "traceId": "abc123",
  "spanId": "def456",
  "userId": "user_789",
  "orderId": "ord_012",
  "duration_ms": 45,
  "timestamp": "2026-04-01T12:00:00Z"
}
```
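One way to produce lines of that shape is a small formatter that merges the trace context into every entry. The sketch below is a minimal illustration rather than a full logger: `formatLogLine`, `TraceContext`, and the `getContext` callback are hypothetical names, and in a real service the callback would presumably read the IDs from the active OpenTelemetry span.

```typescript
// Hypothetical shapes for illustration; not part of any library.
type TraceContext = { traceId: string; spanId: string };
type LogFields = Record<string, unknown>;
type LogLevel = "debug" | "info" | "warn" | "error";

// Builds one JSON log line. The trace IDs are spread after the caller's
// fields, so a stray `traceId` field cannot overwrite the real ones.
function formatLogLine(
  level: LogLevel,
  message: string,
  fields: LogFields,
  getContext: () => TraceContext | undefined,
  now: Date = new Date(),
): string {
  const ctx = getContext();
  return JSON.stringify({
    ...fields,
    level,
    message,
    ...(ctx ? { traceId: ctx.traceId, spanId: ctx.spanId } : {}),
    timestamp: now.toISOString(),
  });
}

// Writes the line to stdout, where the platform's collector picks it up.
function logLine(
  level: LogLevel,
  message: string,
  fields: LogFields,
  getContext: () => TraceContext | undefined,
): void {
  console.log(formatLogLine(level, message, fields, getContext));
}
```

Passing the context lookup in as a callback keeps the formatter free of tracing dependencies, which makes it easy to test with fixed IDs.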

### 2. Prometheus Metrics
```typescript
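// HTTP request metrics
// `path` is the route template (e.g. /orders/:id), not the raw URL, to keep label cardinality bounded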
http_request_duration_seconds{method, path, status} // histogram
http_requests_total{method, path, status}           // counter

// Business metrics
orders_created_total{payment_method}                // counter
cart_value_dollars{currency}                        // histogram

// Infrastructure metrics
db_query_duration_seconds{query_name}               // histogram
cache_hit_ratio{cache_name}                         // gauge
```
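To make the `/metrics` output concrete, here is a dependency-free sketch of a labelled counter rendered in the Prometheus text exposition format. It is a hand-rolled illustration of the output shape only; the `Counter` class and its helpers are hypothetical, and a real service would more likely use an existing Prometheus client library than code like this.

```typescript
type Labels = Record<string, string>;

// Escapes a label value for the text format (backslash, double quote, newline).
function escapeLabelValue(value: string): string {
  return value.replace(/\\/g, "\\\\").replace(/"/g, '\\"').replace(/\n/g, "\\n");
}

// Renders labels sorted by name, so the same label set always maps to the same series.
function renderLabels(labels: Labels): string {
  const pairs = Object.entries(labels)
    .sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0))
    .map(([name, value]) => `${name}="${escapeLabelValue(value)}"`);
  return pairs.length > 0 ? `{${pairs.join(",")}}` : "";
}

// Hypothetical minimal counter: one monotonically increasing value per label set.
class Counter {
  private readonly name: string;
  private readonly help: string;
  private readonly series = new Map<string, number>();

  constructor(name: string, help: string) {
    this.name = name;
    this.help = help;
  }

  inc(labels: Labels = {}, amount: number = 1): void {
    if (amount < 0) throw new Error("a counter can only increase");
    const key = renderLabels(labels);
    this.series.set(key, (this.series.get(key) ?? 0) + amount);
  }

  // Produces the HELP and TYPE header lines followed by one sample line per series.
  render(): string {
    const lines = [
      `# HELP ${this.name} ${this.help}`,
      `# TYPE ${this.name} counter`,
    ];
    this.series.forEach((value, labels) => {
      lines.push(`${this.name}${labels} ${value}`);
    });
    return lines.join("\n") + "\n";
  }
}
```

A handler for `/metrics` would return the concatenated `render()` output of every registered metric as plain text.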

### 3. Distributed Tracing
```typescript
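// Auto-instrument: HTTP, database, Redis, external APIs
// Manual spans for business logic:
import { trace } from "@opentelemetry/api";

const tracer = trace.getTracer("order-service"); // tracer name is illustrative
const span = tracer.startSpan("process-order");
try {
  span.setAttribute("order.id", orderId);   // orderId and total come from the caller
  span.setAttribute("order.total", total);
  // ... business logic ...
} finally {
  span.end(); // end the span even if the business logic throws
}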
```
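Spans from different services only join into one trace if the trace context travels with each request. With W3C Trace Context that happens through the `traceparent` header, which auto-instrumentation normally reads and writes for you. The sketch below only illustrates the header's layout; `parseTraceParent` and `formatTraceParent` are hypothetical helpers, and they handle version `00` only.

```typescript
interface TraceParent {
  traceId: string;  // 32 lowercase hex characters
  spanId: string;   // 16 lowercase hex characters (the caller's span)
  sampled: boolean; // bit 0 of the trace-flags byte
}

// Layout: version "00", trace-id, parent-id, trace-flags, separated by dashes.
const TRACEPARENT_RE = /^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/;

// Returns the parsed context, or undefined when the header is malformed.
function parseTraceParent(header: string): TraceParent | undefined {
  const match = TRACEPARENT_RE.exec(header.trim());
  if (!match) return undefined;
  const [, traceId, spanId, flags] = match;
  // All-zero IDs are defined as invalid by the specification.
  if (/^0+$/.test(traceId) || /^0+$/.test(spanId)) return undefined;
  return { traceId, spanId, sampled: (parseInt(flags, 16) & 0x01) === 1 };
}

// Serializes a context for an outgoing request header.
function formatTraceParent(ctx: TraceParent): string {
  return `00-${ctx.traceId}-${ctx.spanId}-${ctx.sampled ? "01" : "00"}`;
}
```

The same `traceId` parsed here is the value that would appear in each service's log lines, which is what makes cross-service log correlation possible.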

## Implementation Notes
1. Use the OpenTelemetry SDK as the single instrumentation framework for all three signals
2. Trace context propagation: W3C Trace Context headers across services
3. Log → trace correlation: inject traceId and spanId into every log line
4. Metric naming: follow Prometheus conventions (snake_case, _total suffix for counters)
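5. Sampling: keep 100% of error traces and 10% of normal traffic to reduce cost; keeping every error requires tail-based sampling (for example in an OpenTelemetry Collector), because head-based sampling decides before the outcome is known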
6. Dashboards: pre-built Grafana dashboards for RED metrics (Rate, Errors, Duration)
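The sampling rule in note 5 can be sketched as a decision made once a trace has finished, which is where error status is known. This is an illustrative sketch under that assumption: `FinishedTrace` and `shouldKeepTrace` are hypothetical names, and the ratio check is a simple deterministic scheme based on the trace ID, not the algorithm of any particular sampler.

```typescript
interface FinishedTrace {
  traceId: string;   // 32 hex characters
  hasError: boolean; // true if any span in the trace recorded an error
}

// Maps the low 32 bits of the trace ID to a number in [0, 1).
// Every service that sees the same trace ID reaches the same decision.
function traceIdToUnitInterval(traceId: string): number {
  const low32 = parseInt(traceId.slice(-8), 16);
  return low32 / 0x100000000;
}

// Keeps every error trace and roughly `normalRatio` of the rest.
function shouldKeepTrace(trace: FinishedTrace, normalRatio: number = 0.1): boolean {
  if (trace.hasError) return true;
  return traceIdToUnitInterval(trace.traceId) < normalRatio;
}
```

Deriving the decision from the trace ID rather than a random number means a trace is either kept in full or dropped in full, never partially.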

Compatible models

Copilot (GitHub), Claude Code
