OpenTelemetry for PHP and Node: An Instrumentation Baseline Without Vendor Lock-In
OpenTelemetry PHP support has matured enough in 2026 that there is no longer a good reason to reach for a proprietary SDK first. If you are instrumenting a PHP or Node.js service today, starting with a vendor-specific agent is a choice you will likely pay for later - in migration effort, in surprise pricing, or in the friction of switching when your monitoring vendor gets acquired.
This post walks through a practical instrumentation baseline for PHP and Node.js using OpenTelemetry. By the end you will have traces, metrics, and structured logs flowing to a collector - and you will be able to swap the backend without touching application code.
Why OpenTelemetry and Not a Vendor SDK?
The honest answer is: OpenTelemetry forces good instrumentation hygiene.
Vendor SDKs tend to encourage passive instrumentation - install the agent, let it auto-instrument everything, and hope the traces make sense. OpenTelemetry works the same way for auto-instrumentation, but because the data model is open and widely documented, you end up reasoning about span attributes, propagation headers, and sampling strategies instead of treating observability as a black box.
The other reason is cost control. Once your signals go through the OpenTelemetry Collector, you can sample aggressively before exporting. Sampling at the collector level (tail-based or probabilistic) can cut your ingestion bill by 60-80%, depending on the policies you choose, without changing a line of application code. Proprietary agents generally give you less control here.
What the Baseline Covers
A minimum viable observability setup for a production PHP or Node.js service needs three things:
- Distributed traces - request flows across service boundaries, including database queries, external HTTP calls, and queue operations
- Runtime metrics - memory, CPU, event loop lag (Node), garbage collection frequency (PHP/Node)
- Structured logs with trace context - log lines correlated to the trace ID so you can jump from a span to the relevant log lines
This post focuses on traces and metrics. Log correlation is a one-liner once the trace context is propagated correctly.
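To make the log-correlation point concrete, here is a minimal sketch in plain Node.js. It assumes you already have a span context object with `traceId` and `spanId` fields (the shape used by the OpenTelemetry API's SpanContext). The helper name `formatLogLine` and the `trace_id`/`span_id` field names are illustrative, not a fixed convention.

```javascript
// Sketch: attach trace context to a structured (JSON) log line.
// formatLogLine and the trace_id/span_id field names are illustrative choices.
function formatLogLine(level, message, spanContext) {
  const entry = {
    timestamp: new Date().toISOString(),
    level,
    message,
  };
  // Only add correlation fields when a span context is available.
  if (spanContext) {
    entry.trace_id = spanContext.traceId;
    entry.span_id = spanContext.spanId;
  }
  return JSON.stringify(entry);
}

// Example span context using the ID lengths defined by W3C Trace Context
// (32 hex characters for the trace ID, 16 for the span ID).
const exampleContext = {
  traceId: '4bf92f3577b34da6a3ce929d0e0e4736',
  spanId: '00f067aa0ba902b7',
};

console.log(formatLogLine('info', 'order created', exampleContext));
```

In an instrumented service you would typically take the span context from the active span rather than hard-coding it, and most logging libraries let you add these fields through a hook or mixin instead of a custom formatter.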
PHP Instrumentation with opentelemetry-php
The PHP SDK is stable for traces and beta for metrics as of early 2026. For most production use cases that is enough.
Install the SDK
composer require open-telemetry/sdk open-telemetry/exporter-otlp
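Auto-instrumentation relies on the opentelemetry PHP extension (installable via PECL), so enable that first. HTTP clients are instrumented through PSR-18 (package open-telemetry/opentelemetry-auto-psr18), not PSR-7, which only defines the message interfaces. For Symfony, PDO, and Redis, add the relevant contrib packages: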
composer require open-telemetry/opentelemetry-auto-symfony \
open-telemetry/opentelemetry-auto-pdo \
open-telemetry/opentelemetry-auto-redis
Auto-instrumentation hooks into Symfony's kernel events, PDO statement execution, and Redis commands without requiring manual span creation in your code.
Configure the SDK via Environment Variables
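OpenTelemetry respects the standard environment variable spec, so you configure it outside your application code. In PHP, the SDK only builds itself from these variables when autoloading is switched on with OTEL_PHP_AUTOLOAD_ENABLED=true:
OTEL_PHP_AUTOLOAD_ENABLED=true
OTEL_SERVICE_NAME=my-php-app
OTEL_EXPORTER_OTLP_ENDPOINT=http://otel-collector:4318
OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf
OTEL_TRACES_SAMPLER=parentbased_traceidratio
OTEL_TRACES_SAMPLER_ARG=0.1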
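With parentbased_traceidratio, setting OTEL_TRACES_SAMPLER_ARG=0.1 samples roughly 10% of new traces, meaning root spans with no parent. Spans that do have a parent follow the parent's sampling decision, so a trace is either kept or dropped as a whole across services. Adjust the ratio based on your traffic volume and the cost model of your backend.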
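The decision logic behind a parent-based ratio sampler can be sketched in a few lines. This is a simplified illustration, not the SDK's actual algorithm: the function name `shouldSample` and the choice to hash the last 8 hex characters of the trace ID are assumptions made for the example.

```javascript
// Simplified sketch of a parent-based, ratio sampler.
// NOT the SDK's exact algorithm; it only illustrates the decision logic.
function shouldSample(traceId, ratio, parentSampled) {
  // A span with a parent inherits the parent's decision.
  if (typeof parentSampled === 'boolean') {
    return parentSampled;
  }
  // Root span: derive a deterministic value in [0, 1) from the trace ID, so
  // the same trace ID always produces the same decision.
  const bucket = parseInt(traceId.slice(-8), 16) / 0x100000000;
  return bucket < ratio;
}

// Root span, no parent: the decision depends only on the trace ID and ratio.
console.log(shouldSample('4bf92f3577b34da6a3ce929d00000000', 0.1));
// Child span: the parent's decision wins regardless of the ratio.
console.log(shouldSample('4bf92f3577b34da6a3ce929dffffffff', 0.1, true));
```

Because the decision is a pure function of the trace ID, it needs no coordination between services, which is what makes head sampling cheap.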
Verify Context Propagation
The most common mistake with PHP instrumentation is broken trace context across async boundaries - jobs dispatched to a queue or events handled in a separate process. If a background job starts a new root span instead of continuing the trace from the HTTP request that triggered it, you lose the full picture.
To propagate correctly, serialize the current context when dispatching:
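use OpenTelemetry\API\Globals;

// Write the active trace context (for example the traceparent header) into a plain array
$propagator = Globals::propagator();
$carrier = [];
$propagator->inject($carrier);
// Store $carrier alongside the job payload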
And extract it at the start of the job handler:
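use OpenTelemetry\API\Globals;

$propagator = Globals::propagator();
$context = $propagator->extract($carrier); // $carrier was stored with the job payload
$tracer = Globals::tracerProvider()->getTracer('job-worker'); // tracer name is illustrative
$span = $tracer->spanBuilder('process-job')
    ->setParent($context)
    ->startSpan();
// ... run the job, then call $span->end() so the span is exported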
This is the part that vendor auto-instrumentation gets wrong most often. Explicit propagation is worth the extra lines.
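It helps to see what actually travels with the job. Assuming the default W3C Trace Context propagator, the carrier holds a `traceparent` entry of the form `version-traceid-spanid-flags`. The sketch below mimics that round trip in plain Node.js; `injectTraceparent` and `extractTraceparent` are illustrative names, and the validation is simplified. Real code should use the SDK's propagator.

```javascript
// Sketch of what a W3C Trace Context propagator writes to and reads from a carrier.
// Function names are illustrative; validation is simplified.
function injectTraceparent(carrier, spanContext) {
  const flags = spanContext.sampled ? '01' : '00';
  carrier.traceparent = `00-${spanContext.traceId}-${spanContext.spanId}-${flags}`;
  return carrier;
}

function extractTraceparent(carrier) {
  const pattern = /^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/;
  const match = pattern.exec(carrier.traceparent ?? '');
  if (!match) {
    return null; // missing or malformed: the consumer would start a new trace
  }
  return {
    traceId: match[1],
    spanId: match[2],
    sampled: (parseInt(match[3], 16) & 1) === 1,
  };
}

// Producer side: store the carrier next to the job payload.
const producerContext = {
  traceId: '4bf92f3577b34da6a3ce929d0e0e4736',
  spanId: '00f067aa0ba902b7',
  sampled: true,
};
const job = { payload: { orderId: 42 }, carrier: injectTraceparent({}, producerContext) };

// Consumer side: the job has been serialized and deserialized by the queue.
const received = JSON.parse(JSON.stringify(job));
const restoredContext = extractTraceparent(received.carrier);
console.log(restoredContext);
```

The point of the exercise: the carrier is ordinary serializable data, so it survives any queue that can carry a string map, and a job that arrives without it silently becomes a new root trace.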
Node.js Instrumentation with @opentelemetry/sdk-node
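The Node.js SDK is among the more mature in the OpenTelemetry ecosystem. The auto-instrumentations-node bundle covers Express, Fastify, Koa, HTTP, gRPC, Redis, and the common database drivers (pg, mysql2, MongoDB, and others). ORMs such as Prisma and Sequelize are not part of that bundle and need their own instrumentation packages.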
Install
npm install @opentelemetry/sdk-node \
  @opentelemetry/auto-instrumentations-node \
  @opentelemetry/exporter-trace-otlp-http \
  @opentelemetry/exporter-metrics-otlp-http \
  @opentelemetry/sdk-metrics
Bootstrap File
Create a tracing.js (or tracing.ts) that initializes the SDK before any other imports:
import { NodeSDK } from '@opentelemetry/sdk-node';
import { getNodeAutoInstrumentations } from '@opentelemetry/auto-instrumentations-node';
import { OTLPTraceExporter } from '@opentelemetry/exporter-trace-otlp-http';
import { OTLPMetricExporter } from '@opentelemetry/exporter-metrics-otlp-http';
import { PeriodicExportingMetricReader } from '@opentelemetry/sdk-metrics';

// Fall back to the default OTLP/HTTP endpoint so an unset variable does not
// produce a URL like "undefined/v1/traces".
const endpoint = process.env.OTEL_EXPORTER_OTLP_ENDPOINT ?? 'http://localhost:4318';

const sdk = new NodeSDK({
  traceExporter: new OTLPTraceExporter({
    url: `${endpoint}/v1/traces`,
  }),
  metricReader: new PeriodicExportingMetricReader({
    exporter: new OTLPMetricExporter({
      url: `${endpoint}/v1/metrics`,
    }),
    exportIntervalMillis: 30000, // export metrics every 30 seconds
  }),
  instrumentations: [getNodeAutoInstrumentations()],
});

// Start before the application loads the modules to be instrumented
sdk.start();
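Run your application with this file loaded first. The bootstrap above uses ES module syntax, so preload it with --import (this assumes Node treats the file as ESM, for example via "type": "module" in package.json or an .mjs extension):
node --import ./tracing.js index.js
If your project is CommonJS, write the bootstrap with require() calls instead and preload it with --require:
node --require ./tracing.js index.js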
Event Loop Metrics
Recent versions of the auto-instrumentation bundle include Node.js runtime metrics such as event loop delay and utilization, heap usage, and garbage collection duration; the exact set depends on the version you install. These are the metrics that tell you whether your service is healthy under load before errors start appearing.
Whatever framework you use, add a process-level check that alerts when event loop lag exceeds 100ms. That threshold catches most blocking operations before users notice.
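A process-level lag check can be built on Node's built-in `perf_hooks` module, independent of any framework. The sketch below is one possible shape, under stated assumptions: `startLagMonitor`, `lagExceedsThreshold`, the `onLag` callback, and the 10-second check interval are illustrative choices. The 100ms threshold comes from the text, and the use of the 99th percentile is an assumption you may want to tune.

```javascript
const { monitorEventLoopDelay } = require('node:perf_hooks');

const LAG_THRESHOLD_MS = 100; // threshold suggested in the text

// The histogram reports nanoseconds; convert to milliseconds before comparing.
function lagExceedsThreshold(histogram, thresholdMs = LAG_THRESHOLD_MS) {
  return histogram.percentile(99) / 1e6 > thresholdMs;
}

// Starts sampling event loop delay and checks it every intervalMs.
// onLag is a hypothetical hook: wire it to your logger or alerting of choice.
// Returns a function that stops the monitor.
function startLagMonitor(onLag, intervalMs = 10000) {
  const histogram = monitorEventLoopDelay({ resolution: 20 });
  histogram.enable();
  const timer = setInterval(() => {
    if (lagExceedsThreshold(histogram)) {
      onLag(histogram.percentile(99) / 1e6);
    }
    histogram.reset(); // each check looks at a fresh window
  }, intervalMs);
  timer.unref(); // do not keep the process alive just for monitoring
  return () => {
    clearInterval(timer);
    histogram.disable();
  };
}
```

A typical call would be `const stop = startLagMonitor((ms) => console.warn('event loop lag p99', ms));` at startup, with `stop()` invoked during shutdown. Alerting on a percentile rather than the maximum avoids paging on a single GC pause.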
The OpenTelemetry Collector: Your Sampling and Routing Layer
Both the PHP and Node services above export to an OpenTelemetry Collector. This is not optional if you want vendor independence.
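A minimal collector configuration that accepts OTLP, applies tail sampling, and forwards to Grafana Tempo (or any OTLP-compatible backend). Note that the tail_sampling processor ships in the Collector's contrib distribution (otelcol-contrib), not in the core distribution: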
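receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318

processors:
  tail_sampling:
    decision_wait: 10s
    # Traces held in memory while awaiting a decision. Size this above
    # (new traces per second x decision_wait); 50000 is the processor default.
    num_traces: 50000
    policies:
      - name: errors-policy
        type: status_code
        status_code: { status_codes: [ERROR] }
      - name: slow-traces-policy
        type: latency
        latency: { threshold_ms: 1000 }
      - name: probabilistic-policy
        type: probabilistic
        probabilistic: { sampling_percentage: 5 }
  batch:

exporters:
  otlp:
    endpoint: tempo:4317
    tls:
      insecure: true
  # Tempo stores traces only, so metrics need their own destination.
  # Assumption: a Prometheus instance at this hostname with its OTLP receiver enabled.
  otlphttp/metrics:
    endpoint: http://prometheus:9090/api/v1/otlp

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [tail_sampling, batch]
      exporters: [otlp]
    metrics:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlphttp/metrics]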
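The tail sampling policy here keeps 100% of error traces and slow traces (over 1 second), and samples 5% of everything else. For a service handling 1,000 requests per minute, that means you store roughly 50 traces per minute from the probabilistic policy, plus all errors and slow traces - more than enough for debugging. Keep in mind that the collector can only keep what it receives: if the SDK already head-samples at 10%, as in the PHP configuration above, the tail sampler sees only that 10% and errors in the dropped traces are lost. For tail sampling to catch every error, send all traces to the collector and sample there.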
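A back-of-the-envelope way to estimate what the tail sampler keeps is sketched below. The function name and the `errorRate` and `slowRate` inputs are hypothetical; you would plug in rates measured for your own service. It assumes the collector receives every trace and that a trace is kept when any one policy matches.

```javascript
// Rough estimate of traces kept per minute by the tail sampling policies.
// errorRate and slowRate are fractions (0..1) and are hypothetical inputs.
function estimateKeptTracesPerMinute({ requestsPerMinute, errorRate, slowRate, baselinePercent }) {
  // Error and slow traces are always kept. Treating the two groups as
  // non-overlapping slightly overestimates the total.
  const alwaysKept = requestsPerMinute * Math.min(1, errorRate + slowRate);
  // The probabilistic policy only adds traces not already kept above.
  const remainder = requestsPerMinute - alwaysKept;
  const baseline = (remainder * baselinePercent) / 100;
  return { alwaysKept, baseline, total: alwaysKept + baseline };
}

// The scenario from the text, before adding any errors or slow traces.
const estimate = estimateKeptTracesPerMinute({
  requestsPerMinute: 1000,
  errorRate: 0,
  slowRate: 0,
  baselinePercent: 5,
});
console.log(estimate);
```

Running the same function with your real error and slow-request rates shows how quickly the "keep everything interesting" policies can dominate the baseline during an incident, which is worth knowing before you size the backend.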
Common Pitfalls
Starting with all signals at full volume. Turn on sampling from day one. A trace backend ingesting unsampled production traffic can quickly become one of the largest items on your infrastructure bill.
Not setting OTEL_SERVICE_NAME. Without a service name, the SDK reports your telemetry under a default name such as unknown_service, so traces from different services land in the same bucket and correlation across services becomes impossible.
Skipping span attributes on custom spans. Auto-instrumentation gives you HTTP method, URL, status code, and database query. But business-level context - user ID, tenant ID, order ID - requires manual attribute setting. Add these on the spans that matter for your debugging workflow, not everywhere.
Forgetting to handle SDK shutdown. In Node.js, register a process.on('SIGTERM') handler that calls sdk.shutdown(). Without this, the last batch of telemetry before a pod restart is lost.
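A shutdown hook along these lines is one way to handle that last pitfall. It is a sketch under stated assumptions: `sdk` is any object with an async `shutdown()` method (which NodeSDK provides), while the helper name `createShutdownHandler` and the injectable `exit` parameter are illustrative and exist mainly to keep the handler testable.

```javascript
// Sketch of a graceful shutdown hook for telemetry.
// `sdk` is any object with an async shutdown() method; the helper name and
// the `exit` parameter are illustrative.
function createShutdownHandler(sdk, exit = (code) => process.exit(code)) {
  let shuttingDown = false;
  return async function onSignal() {
    if (shuttingDown) return; // ignore repeated signals
    shuttingDown = true;
    try {
      await sdk.shutdown(); // flushes pending spans and metrics
      exit(0);
    } catch (err) {
      console.error('Telemetry shutdown failed', err);
      exit(1);
    }
  };
}

// In the bootstrap file, after sdk.start():
//   process.once('SIGTERM', createShutdownHandler(sdk));
```

If your application already has its own SIGTERM handling (closing the HTTP server, draining queues), call the telemetry shutdown from there instead, so the process exits only once.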
Choosing a Backend
Once you are exporting to the collector, the backend is a configuration change. Common options:
- Grafana Tempo + Prometheus - open source, runs on your infrastructure, no per-seat pricing
- Jaeger - simpler setup than Tempo, good for smaller teams
- Honeycomb, Grafana Cloud, Datadog - managed options that accept OTLP directly
Because your data flows through the collector, switching from Jaeger to Tempo or from Tempo to a managed service takes about 10 minutes of configuration changes and a collector restart. No code changes required. This is the entire value proposition.
Integrating Observability with Code Quality Work
Observability and code quality consulting are more connected than teams often realize. A well-instrumented service makes performance regressions visible before they reach production, turns vague "the app is slow" complaints into specific spans, and gives you the data to justify refactoring work. If your codebase currently has no tracing and you are trying to figure out where time is spent in a request, that is also a sign that the architecture may benefit from a broader review.
If you are building or modernizing a service and want instrumentation built in from the start rather than bolted on later, get in touch at hello@wolf-tech.io or visit wolf-tech.io. An instrumented codebase is a measurable codebase - and that changes how confidently you can ship.
Quick Reference
| | PHP | Node.js |
|---|---|---|
| Core package | open-telemetry/sdk | @opentelemetry/sdk-node |
| Auto-instrumentation | opentelemetry-auto-symfony, opentelemetry-auto-pdo | @opentelemetry/auto-instrumentations-node |
| Trace exporter | open-telemetry/exporter-otlp | @opentelemetry/exporter-trace-otlp-http |
| Metrics exporter | open-telemetry/exporter-otlp | @opentelemetry/exporter-metrics-otlp-http |
| Trace SDK maturity | Stable | Stable |
| Metrics SDK maturity | Beta | Stable |
| Context propagation | Manual for async/queue | Automatic for HTTP, manual for queues |
FAQ
Is OpenTelemetry PHP production-ready? The traces SDK is stable. The metrics SDK is beta but has been used in production by larger teams since late 2025 without major issues. If you need metrics stability guarantees, start with traces only and add metrics once you have verified the integration.
Does OpenTelemetry work with Symfony? Yes. The opentelemetry-auto-symfony package hooks into kernel request/response events, controller resolution, and console commands. You get HTTP traces with no manual code changes.
Can I use OpenTelemetry alongside an existing APM agent? In principle yes, but in practice they often conflict on instrumentation hooks. The cleaner path is to migrate fully rather than run both simultaneously.
How much overhead does OpenTelemetry add? With sampling enabled and the OTLP exporter configured for async batching, the overhead is typically under 1ms per request for PHP and under 0.5ms for Node.js. Unsampled, synchronous export is a different story - always use batching.

