Monitoring
The Docker Compose deployment includes optional monitoring with Prometheus, Grafana, and pre-built dashboards.
Enabling Monitoring
Start the monitoring stack alongside your services:

    docker compose --profile monitoring up -d

Or start only the monitoring services:

    docker compose --profile monitoring up -d prometheus grafana

Accessing Services

| Service | URL | Credentials |
|---|---|---|
| Grafana | http://localhost:3001 | admin / admin |
| Prometheus | http://localhost:9091 | — |
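A quick way to confirm that both services are reachable is to probe their health endpoints. The sketch below is illustrative and not part of the deployment: `http_status` and `check_stack` are hypothetical helper names, and it assumes the standard health routes (`/-/healthy` for Prometheus, `/api/health` for Grafana) are not blocked or remapped in your setup.

```python
import urllib.error
import urllib.request
from typing import Callable


def http_status(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status code of a GET request, or 0 if the request fails."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status (4xx/5xx).
        return err.code
    except OSError:
        # Connection refused, DNS failure, timeout, and similar.
        return 0


def check_stack(
    prometheus_base: str,
    grafana_base: str,
    get_status: Callable[[str], int] = http_status,
) -> dict[str, bool]:
    """Map each service name to True when its health endpoint returns 200."""
    endpoints = {
        "prometheus": prometheus_base.rstrip("/") + "/-/healthy",
        "grafana": grafana_base.rstrip("/") + "/api/health",
    }
    return {name: get_status(url) == 200 for name, url in endpoints.items()}
```

Call `check_stack` with the two base URLs from the table above. The `get_status` parameter exists only so the probe can be replaced in tests.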
Pre-built Dashboards
Grafana comes pre-configured with dashboards for all services:
System Overview
Overall system health:
- Container CPU and memory usage
- Network I/O across services
- Container restarts and uptime
API Service
API performance metrics:
- Request rate (requests/second)
- Response latency (p50, p95, p99)
- Error rate by status code
- Active connections
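The dashboard queries themselves are not shown here, so the following is only a sketch of how an error rate can be derived from per-status request counts. It assumes the same definition as the `HighErrorRate` rule in the Alert Rules section: requests with a 5xx status divided by all requests in the same window. The function name and the sample counts are hypothetical.

```python
def error_ratio(requests_by_status: dict[str, float]) -> float:
    """Return the fraction of requests with a 5xx status code.

    `requests_by_status` maps a status code string (for example "200")
    to the number of requests seen with that status in some time window.
    Returns 0.0 when there was no traffic at all.
    """
    total = sum(requests_by_status.values())
    if total == 0:
        return 0.0
    # Mirrors the PromQL matcher status=~"5..": three characters, starting with 5.
    errors = sum(
        count
        for status, count in requests_by_status.items()
        if len(status) == 3 and status.startswith("5")
    )
    return errors / total


# Illustrative counts: 40 of 1000 requests failed with a 5xx status.
sample = {"200": 940, "404": 20, "500": 30, "503": 10}
ratio = error_ratio(sample)
```

With these sample counts the ratio is 40 / 1000, which stays under the 5% threshold used by the alert rule.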
Execution Runtime
Runtime metrics:
- Jobs in progress
- Execution duration histogram
- Queue depth
- Success/failure rates
PostgreSQL
Database performance:
- Active connections
- Query rate
- Transaction throughput
- Cache hit ratio
Prometheus Targets
Prometheus scrapes metrics from:
| Target | Endpoint | Metrics |
|---|---|---|
| API | api:9090/metrics | Request counts, latencies, errors |
| Runtime | runtime:9090/metrics | Execution metrics |
| PostgreSQL Exporter | postgres-exporter:9187/metrics | Database metrics |
| cAdvisor | cadvisor:8080/metrics | Container metrics |
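Each endpoint in the table serves the Prometheus text exposition format: one sample per line, with optional `# HELP` and `# TYPE` comment lines. The sketch below shows roughly how such output can be read in Python. It is a simplified illustration, not the parser Prometheus uses: `parse_metrics` is a hypothetical helper, it ignores metadata lines, and it leaves escape sequences inside label values as-is.

```python
import re

# name, optional {labels}, value, optional integer timestamp
_SAMPLE = re.compile(
    r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)(?:\s+-?\d+)?$'
)
# key="value" pairs inside the braces
_LABEL = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text: str) -> list[tuple[str, dict[str, str], float]]:
    """Parse exposition-format text into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = _SAMPLE.match(line)
        if match is None:
            continue  # skip anything this simplified pattern cannot read
        name, raw_labels, value = match.groups()
        labels = dict(_LABEL.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples
```

The input could be the body returned by any of the `/metrics` endpoints above, for example to check that an expected metric name is actually exported.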
Configuration
Custom Prometheus Config
Edit monitoring/prometheus/prometheus.yml:

    global:
      scrape_interval: 15s
      evaluation_interval: 15s

    scrape_configs:
      - job_name: 'api'
        static_configs:
          - targets: ['api:9090']

      - job_name: 'runtime'
        static_configs:
          - targets: ['runtime:9090']

      # Add custom targets here
      - job_name: 'custom-service'
        static_configs:
          - targets: ['my-service:9090']

Custom Grafana Dashboards
Add dashboard JSON files to monitoring/grafana/provisioning/dashboards/:

    # Download a dashboard from Grafana.com
    curl -o monitoring/grafana/provisioning/dashboards/my-dashboard.json \
      'https://grafana.com/api/dashboards/1860/revisions/latest/download'

    # Restart Grafana to pick up changes
    docker compose restart grafana

Alert Rules
Add Prometheus alerting rules in monitoring/prometheus/alerts/:

    groups:
      - name: api
        rules:
          - alert: HighErrorRate
            expr: |
              sum(rate(http_requests_total{status=~"5.."}[5m]))
              / sum(rate(http_requests_total[5m])) > 0.05
            for: 5m
            labels:
              severity: critical
            annotations:
              summary: "High error rate (> 5%)"

Environment Variables
Configure monitoring via .env:

    # Prometheus
    PROMETHEUS_RETENTION=15d
    PROMETHEUS_PORT=9091

    # Grafana
    GRAFANA_PORT=3001
    GRAFANA_ADMIN_PASSWORD=your-secure-password

    # Metrics export
    METRICS_ENABLED=true
    METRICS_PORT=9090

Metrics Endpoints
The API and Runtime services expose Prometheus metrics on a separate port (9090):

    # Check API metrics
    curl http://localhost:9090/metrics

    # Check Runtime metrics from inside the container
    # (access from the host requires a port mapping)
    docker compose exec runtime curl localhost:9090/metrics

Resource Usage
The monitoring stack adds little overhead. Approximate usage per service (CPU in millicores, where 1000m equals one core):
| Service | CPU | Memory |
|---|---|---|
| Prometheus | ~100m | ~256MB |
| Grafana | ~50m | ~128MB |
| cAdvisor | ~50m | ~128MB |
Production Considerations
Persistence
Enable persistent storage for Prometheus data:

    services:
      prometheus:
        volumes:
          - prometheus-data:/prometheus

    volumes:
      prometheus-data:

External Access
For production, place Grafana behind a reverse proxy with TLS:

    server {
        listen 443 ssl;
        server_name grafana.example.com;

        location / {
            proxy_pass http://localhost:3001;
            proxy_set_header Host $host;
        }
    }

Alertmanager
Add Alertmanager for alert routing:

    services:
      alertmanager:
        image: prom/alertmanager:v0.26.0
        ports:
          - "9093:9093"
        volumes:
          - ./monitoring/alertmanager:/etc/alertmanager
        command:
          - '--config.file=/etc/alertmanager/alertmanager.yml'

Troubleshooting
No metrics in Grafana
1. Check Prometheus is scraping targets:

       curl http://localhost:9091/api/v1/targets

2. Verify services expose metrics:

       docker compose exec api curl -s localhost:9090/metrics | head -20

3. Check Grafana datasource:
   - Go to Settings → Data Sources
   - Verify the Prometheus URL is http://prometheus:9090
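The JSON returned by the targets endpoint in the first step can be long. The sketch below filters it down to the targets that are not healthy, together with the last scrape error. It assumes the standard Prometheus HTTP API response shape (`data.activeTargets` entries with `health`, `lastError`, `scrapeUrl`, and `labels`); `unhealthy_targets` is a hypothetical helper name.

```python
import json


def unhealthy_targets(targets_json: str) -> list[dict[str, str]]:
    """Return one summary dict per active target whose health is not "up"."""
    payload = json.loads(targets_json)
    if payload.get("status") != "success":
        raise ValueError("Prometheus API did not report success")

    problems = []
    for target in payload["data"]["activeTargets"]:
        if target.get("health") == "up":
            continue
        problems.append(
            {
                "job": target.get("labels", {}).get("job", ""),
                "scrape_url": target.get("scrapeUrl", ""),
                "health": target.get("health", "unknown"),
                "last_error": target.get("lastError", ""),
            }
        )
    return problems
```

Pass it the response body of the `curl` command from the first step. An empty list means every active target is being scraped successfully, which points the investigation toward the Grafana data source instead.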
High memory usage
Reduce Prometheus retention:

    services:
      prometheus:
        command:
          - '--storage.tsdb.retention.time=7d'
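Retention also bounds how much data Prometheus keeps on disk. The Prometheus storage documentation gives a rough sizing formula: retention time in seconds, times ingested samples per second, times bytes per sample (typically 1 to 2 bytes). The sketch below applies that formula to compare two retention settings. The ingestion rate of 1000 samples per second is an illustrative assumption, not a measurement of this stack, and the helper names are hypothetical.

```python
# Seconds per unit for single-unit durations such as "15d" or "12h".
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800, "y": 31536000}


def retention_seconds(retention: str) -> int:
    """Convert a single-unit duration like "15d" into seconds."""
    value, unit = int(retention[:-1]), retention[-1]
    return value * _UNIT_SECONDS[unit]


def estimated_disk_bytes(
    retention: str, samples_per_second: float, bytes_per_sample: float = 2.0
) -> float:
    """Rough on-disk size: retention * ingestion rate * bytes per sample."""
    return retention_seconds(retention) * samples_per_second * bytes_per_sample


# Assumed ingestion rate of 1000 samples/second, at 2 bytes per sample.
size_15d = estimated_disk_bytes("15d", 1000)  # 2_592_000_000 bytes, about 2.6 GB
size_7d = estimated_disk_bytes("7d", 1000)    # 1_209_600_000 bytes, about 1.2 GB
```

To use a real ingestion rate, query `rate(prometheus_tsdb_head_samples_appended_total[5m])` in Prometheus and substitute the result for the assumed value.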