
Production Tracing for GoFr

Summary

GoFr ships built-in OpenTelemetry tracing — every HTTP request, gRPC call, and datasource operation is traced automatically. Configure the exporter via TRACE_EXPORTER (otlp, jaeger, zipkin, or gofr) and TRACER_URL, and set TRACER_RATIO for head-based sampling; W3C Trace Context propagation flows through GoFr's HTTP service client without extra code.

When to use this guide

You have GoFr running in Kubernetes (or any container platform) and want traces flowing into a backend — Jaeger, Grafana Tempo, an OpenTelemetry Collector, or a vendor that accepts OTLP. This guide covers exporter configuration, sampling, and propagation across multiple services.

For adding application-level spans inside handlers, see Custom Spans In Tracing.

What GoFr traces automatically

Once tracing is enabled, GoFr instruments without code changes:

  • HTTP server — every incoming request becomes a root span (or a child if upstream sent W3C trace headers).
  • HTTP client — outgoing calls via the GoFr service client (with circuit breaker / retry / rate limit) are traced and propagate context.
  • gRPC — server and client interceptors emit spans.
  • Datasources — SQL, Redis, Mongo, Cassandra, Pub/Sub publishers and subscribers (Kafka, NATS, SQS, Google Pub/Sub) emit spans for each operation.
  • Migrations — recorded as spans, useful for debugging long-running schema changes.

Custom spans (ctx.Trace("name")) add the application-logic layer on top: business operations that span multiple datasource calls, or pure-CPU work you want to time.
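
For instance, one business-level span can wrap several datasource calls that GoFr already traces individually. A minimal sketch; the route, table, and span name below are illustrative:

Go
package main

import "gofr.dev/pkg/gofr"

func main() {
	app := gofr.New()

	app.POST("/orders", func(ctx *gofr.Context) (any, error) {
		// One application-level span; the SQL call below still emits its
		// own child span automatically.
		span := ctx.Trace("orders.place")
		defer span.End()

		// Hypothetical table and column, for illustration only.
		if _, err := ctx.SQL.ExecContext(ctx,
			"INSERT INTO orders (status) VALUES ('pending')"); err != nil {
			return nil, err
		}

		return map[string]string{"status": "pending"}, nil
	})

	app.Run()
}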

Configuration

GoFr reads tracing config from environment variables. The relevant keys (verified against pkg/gofr/otel.go):

Variable          Purpose                                        Default
TRACE_EXPORTER    One of otlp, jaeger, zipkin, gofr              unset (tracing disabled)
TRACER_URL        Endpoint for the chosen exporter               unset
TRACER_HOST       Deprecated — use TRACER_URL                    unset
TRACER_PORT       Deprecated — use TRACER_URL                    9411
TRACER_RATIO      Head-based sampling ratio (0.0–1.0)            1
TRACER_HEADERS    Custom OTLP headers, Key1=Value1,Key2=Value2   unset
TRACER_AUTH_KEY   Shortcut for Authorization header              unset

Tracing is disabled if neither TRACE_EXPORTER nor TRACER_URL is set; GoFr logs "tracing is disabled, as configs are not provided" at debug level. The sampler is ParentBased(TraceIDRatioBased(TRACER_RATIO)), so a sampling decision made upstream is honored.
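
In raw OTel SDK terms, the installed sampler behaves like the following sketch (illustrative only; GoFr wires this up internally, and the 0.1 stands in for TRACER_RATIO):

Go
package main

import (
	"fmt"

	sdktrace "go.opentelemetry.io/otel/sdk/trace"
)

func main() {
	// Honor the upstream decision when the incoming request carries a
	// sampled traceparent; otherwise keep ~10% of new traces.
	sampler := sdktrace.ParentBased(sdktrace.TraceIDRatioBased(0.1))

	fmt.Println(sampler.Description())
}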

zipkin is supported but deprecated; the framework logs a warning recommending otlp instead. The gofr exporter ships traces to GoFr's hosted tracer at https://tracer-api.gofr.dev/api/spans (override with TRACER_URL).

Backend recipes

Jaeger (OTLP gRPC)

Modern Jaeger (1.35+) accepts OTLP natively on port 4317:

YAML
# ConfigMap fragment
TRACE_EXPORTER: "jaeger"
TRACER_URL: "jaeger-collector.observability.svc.cluster.local:4317"
TRACER_RATIO: "0.1"

jaeger and otlp use the same OTLP gRPC exporter under the hood — they differ only in log labeling.

Grafana Tempo / OpenTelemetry Collector

Point at any OTLP gRPC endpoint:

YAML
TRACE_EXPORTER: "otlp"
TRACER_URL: "otel-collector.observability.svc.cluster.local:4317"
TRACER_RATIO: "0.1"

Running an OTel Collector as a sidecar or DaemonSet is the recommended pattern: it does tail-based sampling, batching, and can fan out to multiple backends without changing the app.

Honeycomb / Datadog / Vendor OTLP

For SaaS backends that accept OTLP and require an API key:

YAML
TRACE_EXPORTER: "otlp"
TRACER_URL: "api.honeycomb.io:443"
TRACER_HEADERS: "x-honeycomb-team=YOUR_API_KEY,x-honeycomb-dataset=orders"
TRACER_RATIO: "0.1"

Or with a single auth header:

YAML
TRACER_AUTH_KEY: "Bearer YOUR_TOKEN"

GoFr's OTLP exporter currently uses an insecure (cleartext) gRPC connection. That is acceptable inside a cluster, but for SaaS endpoints reached over the public internet, route through an OTel Collector that terminates TLS, or rely on a service mesh.
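
A collector-side sketch of that pattern, assuming an OTLP gRPC exporter to Honeycomb (names and the key are placeholders; the collector's OTLP exporter uses TLS by default for remote endpoints):

YAML
# Collector config fragment
exporters:
  otlp/honeycomb:
    endpoint: api.honeycomb.io:443
    headers:
      x-honeycomb-team: YOUR_API_KEY

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlp/honeycomb]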

Sampling: head-based vs tail-based

TRACER_RATIO is head-based: the sampling decision is made when the trace starts. With TRACER_RATIO=0.1, 10% of root spans are kept; the other 90% are dropped at the source. Cheap, predictable, but you cannot retroactively keep a slow or errored trace that wasn't sampled.

For production-grade observability, tail-based sampling — done in an OpenTelemetry Collector with the tail_sampling processor — lets you keep all traces that contain errors or exceed a latency threshold while sub-sampling the happy path. The pattern is: app sends 100% (or a high ratio) to the local collector; collector decides what to ship onward.
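
A collector-config sketch of that pattern; the policy names and thresholds are illustrative, and the tail_sampling processor docs list the full policy set:

YAML
# Collector config fragment
processors:
  tail_sampling:
    decision_wait: 10s
    policies:
      - name: keep-errors
        type: status_code
        status_code: {status_codes: [ERROR]}
      - name: keep-slow
        type: latency
        latency: {threshold_ms: 500}
      - name: sample-the-rest
        type: probabilistic
        probabilistic: {sampling_percentage: 10}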

A starting matrix:

Environment                          TRACER_RATIO   Notes
Local dev                            1              See everything
Staging                              1              Catch issues before prod
Production (low traffic, < 50 RPS)   1              Volume is fine
Production (high traffic)            0.05–0.1       Or sample 100% to a collector and tail-sample there

Propagation across services

GoFr sets up a CompositeTextMapPropagator(TraceContext{}, Baggage{}), so the W3C traceparent and baggage headers are honored on incoming requests and written on outgoing requests through the GoFr HTTP service client. No extra code is needed:

Go
package main

import (
	"encoding/json"

	"gofr.dev/pkg/gofr"
)

func main() {
	app := gofr.New()

	app.AddHTTPService("payments", "http://payments.default.svc.cluster.local")

	app.GET("/checkout", func(ctx *gofr.Context) (any, error) {
		span := ctx.Trace("checkout.compute-total")
		defer span.End()

		// The downstream span on payments will be a child of this trace.
		// GetWithHeaders takes (ctx, path, queryParams, headers) and returns (*http.Response, error).
		httpResp, err := ctx.GetHTTPService("payments").
			GetWithHeaders(ctx, "/charge", nil, nil)
		if err != nil {
			return nil, err
		}
		defer httpResp.Body.Close()

		var resp any
		if err := json.NewDecoder(httpResp.Body).Decode(&resp); err != nil {
			return nil, err
		}

		return resp, nil
	})

	app.Run()
}

The downstream payments service — also a GoFr app pointed at the same exporter — will record its spans as children of the same trace. In Jaeger or Tempo, you'll see the full chain end-to-end.
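
The downstream side needs no propagation code either. A minimal sketch of the payments service, with an illustrative route and response body:

Go
package main

import "gofr.dev/pkg/gofr"

func main() {
	app := gofr.New()

	// The incoming traceparent header is picked up automatically; this
	// handler's spans become children of the checkout service's trace.
	app.GET("/charge", func(ctx *gofr.Context) (any, error) {
		return map[string]string{"status": "charged"}, nil
	})

	app.Run()
}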

Production tips

  • One exporter, many services: point all your services at the same collector. Querying a trace that hops services is the whole point.
  • Resource attributes: GoFr sets service.name from APP_NAME (default gofr-app). Set APP_NAME per-deployment so traces are attributable.
  • Don't sample on the client when you can sample on the collector — once dropped at the source, a trace is gone forever.
  • Watch the exporter error log: GoFr installs a custom OTel error handler (otelErrorHandler) that logs exporter failures via the standard logger. If you see these in volume, your collector is unreachable or overwhelmed.
  • Trace IDs in logs: include the trace ID in your logs so you can jump from a noisy log line to its trace. GoFr's structured logger and the trace context are both available on *gofr.Context, so you can read span.SpanContext().TraceID() and log it; see the sketch below.
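
A minimal sketch of that, using the standard OTel trace API; the route and log message are illustrative:

Go
package main

import (
	"go.opentelemetry.io/otel/trace"

	"gofr.dev/pkg/gofr"
)

func main() {
	app := gofr.New()

	app.GET("/orders", func(ctx *gofr.Context) (any, error) {
		// gofr.Context embeds context.Context, so the active span is
		// reachable through the standard OTel API.
		span := trace.SpanFromContext(ctx)
		ctx.Logger.Infof("listing orders trace_id=%s", span.SpanContext().TraceID())

		return []string{}, nil
	})

	app.Run()
}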

Verification

Bash
# 1. Confirm env is set inside the pod.
kubectl exec deploy/orders -- env | grep -E "TRACE_|TRACER_"

# 2. Generate traffic.
kubectl port-forward svc/orders 8080:80
for i in $(seq 1 50); do curl -s http://localhost:8080/checkout > /dev/null; done

# 3. Confirm spans are flowing in the collector or backend logs.
kubectl logs -n observability deploy/otel-collector | grep -i orders

# 4. Open Jaeger UI and search service=orders.
kubectl port-forward -n observability svc/jaeger-query 16686:16686
# http://localhost:16686
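
# 5. Optional: send an explicit W3C traceparent (the ID below is the spec's
#    example value) and search for that trace ID in the backend.
curl -s -H "traceparent: 00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01" \
  http://localhost:8080/checkout > /dev/null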

Frequently asked