Set up a complete observability stack. Structured logging, Prometheus metrics, distributed tracing with OpenTelemetry — correlated with trace IDs.
## Task
Complete observability setup: logs + metrics + traces, correlated.
## Requirements
- Language: TypeScript/Node.js
- Logs: Structured JSON → stdout (collected by platform)
- Metrics: Prometheus format, exposed at /metrics
- Traces: OpenTelemetry → Jaeger or Tempo
## Implementation
### 1. Structured Logging
```typescript
// Every log line is JSON with trace correlation
{
"level": "info",
"message": "Order created",
"traceId": "abc123",
"spanId": "def456",
"userId": "user_789",
"orderId": "ord_012",
"duration_ms": 45,
"timestamp": "2026-04-01T12:00:00Z"
}
```
### 2. Prometheus Metrics
```typescript
// HTTP request metrics
http_request_duration_seconds{method, path, status} // histogram
http_requests_total{method, path, status} // counter
// Business metrics
orders_created_total{payment_method} // counter
cart_value_dollars{currency} // histogram
// Infrastructure metrics
db_query_duration_seconds{query_name} // histogram
cache_hit_ratio{cache_name} // gauge
```
### 3. Distributed Tracing
```typescript
// Auto-instrument: HTTP, database, Redis, external APIs
// Manual spans for business logic:
const span = tracer.startSpan("process-order");
span.setAttribute("order.id", orderId);
span.setAttribute("order.total", total);
// ... business logic ...
span.end();
```
## Implementation Notes
1. Use OpenTelemetry SDK — single library for all three signals
2. Trace context propagation: W3C Trace Context headers across services
3. Log → trace correlation: inject traceId into every log line
4. Metric naming: follow Prometheus conventions (snake_case, _total suffix for counters)
5. Sampling: 100% for errors, 10% for normal traffic (reduce cost)
6. Dashboards: pre-built Grafana dashboards for RED metrics (Rate, Errors, Duration)No gallery images yet.