OpenTelemetry for PHP and Node: An Instrumentation Baseline Without Vendor Lock-In

#OpenTelemetry PHP
Sandor Farkas - Founder & Lead Developer at Wolf-Tech


OpenTelemetry PHP support has matured enough in 2026 that there is no longer a good reason to reach for a proprietary SDK first. If you are instrumenting a PHP or Node.js service today, starting with a vendor-specific agent is a choice you will likely pay for later - in migration effort, in surprise pricing, or in the friction of switching when your monitoring vendor gets acquired or raises prices.

This post walks through a practical instrumentation baseline for PHP and Node.js using OpenTelemetry. By the end you will have traces, metrics, and structured logs flowing to a collector - and you will be able to swap the backend without touching application code.


Why OpenTelemetry and Not a Vendor SDK?

The honest answer is: OpenTelemetry forces good instrumentation hygiene.

Vendor SDKs tend to encourage passive instrumentation - install the agent, let it auto-instrument everything, and hope the traces make sense. OpenTelemetry works the same way for auto-instrumentation, but because the data model is open and widely documented, you end up reasoning about span attributes, propagation headers, and sampling strategies instead of treating observability as a black box.

The other reason is cost control. Once your signals go through the OpenTelemetry Collector, you can sample aggressively before exporting. Tail-based sampling at the collector level can cut your ingestion bill by 60-80% without changing a line of application code, because the keep-or-drop decision is made on complete traces after they have left your service. Proprietary agents generally give you less control here.


What the Baseline Covers

A minimum viable observability setup for a production PHP or Node.js service needs three things:

  • Distributed traces - request flows across service boundaries, including database queries, external HTTP calls, and queue operations
  • Runtime metrics - memory, CPU, event loop lag (Node), garbage collection frequency (PHP/Node)
  • Structured logs with trace context - log lines correlated to the trace ID so you can jump from a span to the relevant log lines

This post focuses on traces and metrics. Log correlation is a one-liner once the trace context is propagated correctly.
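
As a rough illustration of that correlation step, the sketch below stamps a structured log record with the IDs of the active span. The withTraceContext helper and the field names trace_id and span_id are assumptions made for this example. In a real service the span context would come from trace.getActiveSpan()?.spanContext() in @opentelemetry/api instead of being passed in by hand.

```javascript
// Merge trace identifiers into a structured log record.
// `spanContext` mirrors the shape returned by a span's spanContext():
// { traceId: 32 hex chars, spanId: 16 hex chars }.
function withTraceContext(record, spanContext) {
  if (!spanContext) return { ...record }; // no active span: log without IDs
  return {
    ...record,
    trace_id: spanContext.traceId,
    span_id: spanContext.spanId,
  };
}

// Hypothetical values, in the hex format OpenTelemetry uses for IDs.
const line = withTraceContext(
  { level: 'info', msg: 'order created' },
  { traceId: '4bf92f3577b34da6a3ce929d0e0e4736', spanId: '00f067aa0ba902b7' },
);
console.log(JSON.stringify(line));
```

With the trace ID on every log line, a backend that indexes that field can jump from a span to its logs and back.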


PHP Instrumentation with opentelemetry-php

The PHP SDK is stable for traces and beta for metrics as of early 2026. For most production use cases that is enough.

Install the SDK

composer require open-telemetry/sdk open-telemetry/exporter-otlp

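For auto-instrumentation of Symfony, PDO, and Redis, add the relevant contrib packages (outgoing PSR-18 HTTP clients have their own package, open-telemetry/opentelemetry-auto-psr18). All of them depend on the opentelemetry PHP extension, installed via PECL and enabled in php.ini, which provides the hooks they attach to: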

composer require open-telemetry/opentelemetry-auto-symfony \
  open-telemetry/opentelemetry-auto-pdo \
  open-telemetry/opentelemetry-auto-redis

Auto-instrumentation hooks into Symfony's kernel events, PDO statement execution, and Redis commands without requiring manual span creation in your code.

Configure the SDK via Environment Variables

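OpenTelemetry respects the standard environment variable spec, so you configure it outside your application code. The PHP SDK only builds itself from these variables when autoloading is switched on, hence the first line:

OTEL_PHP_AUTOLOAD_ENABLED=true
OTEL_SERVICE_NAME=my-php-app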
OTEL_EXPORTER_OTLP_ENDPOINT=http://otel-collector:4318
OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf
OTEL_TRACES_SAMPLER=parentbased_traceidratio
OTEL_TRACES_SAMPLER_ARG=0.1

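With parentbased_traceidratio, setting OTEL_TRACES_SAMPLER_ARG=0.1 samples 10% of new traces, that is, of root spans. A span that arrives with a parent context inherits the parent's sampling decision, so a trace is kept or dropped as a whole. Adjust the ratio based on your traffic volume and the cost model of your backend.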
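
The decision flow behind a parent-based ratio sampler can be sketched in a few lines. This is a simplified illustration, not the SDK's implementation: real SDKs derive the threshold from the trace ID in their own way, and the shouldSample function and its parameter names are made up for the example.

```javascript
// Simplified parent-based ratio sampling. Real SDKs hash the trace ID
// differently; this only illustrates the decision flow.
function shouldSample({ traceId, parentSampled, ratio }) {
  // A span with a parent follows the parent's decision.
  if (parentSampled !== undefined) return parentSampled;
  // Root span: map the first 32 bits of the trace ID onto [0, 2^32)
  // and keep the trace if it falls inside the configured ratio.
  const bucket = parseInt(traceId.slice(0, 8), 16);
  return bucket < ratio * 0x100000000;
}

// Root span with a hypothetical trace ID, sampled at 10%.
console.log(shouldSample({ traceId: '0a3ce929d0e0e4736a3ce929d0e0e473', ratio: 0.1 }));
// Child span: the ratio is ignored, the parent's decision wins.
console.log(shouldSample({ traceId: '0a3ce929d0e0e4736a3ce929d0e0e473', parentSampled: false, ratio: 1 }));
```

Because the decision depends only on the trace ID and the parent flag, every service in a request path reaches the same verdict without coordinating.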

Verify Context Propagation

The most common mistake with PHP instrumentation is broken trace context across async boundaries - jobs dispatched to a queue or events handled in a separate process. If a background job starts a new root span instead of continuing the trace from the HTTP request that triggered it, you lose the full picture.

To propagate correctly, serialize the current context when dispatching:

use OpenTelemetry\API\Globals;

$propagator = Globals::propagator();
$carrier = [];
$propagator->inject($carrier); // writes the current trace context into the array
// Store $carrier alongside the job payload

And extract it at the start of the job handler:

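// Needs: use OpenTelemetry\API\Globals;
$context = Globals::propagator()->extract($carrier);
$tracer = Globals::tracerProvider()->getTracer('job-worker'); // tracer name is illustrative
$span = $tracer->spanBuilder('process-job')
    ->setParent($context)
    ->startSpan();
// ... handle the job, then end the span so it gets exported
$span->end();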

This is the part that vendor auto-instrumentation gets wrong most often. Explicit propagation is worth the extra lines.
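
When verifying propagation, it helps to know what the carrier should contain. With the W3C Trace Context propagator, it holds a traceparent entry of the form version-traceid-parentid-flags. The sketch below parses that value; it is written in JavaScript so it can double as a debugging helper on the Node side. The parseTraceparent function and the sample carrier are illustrative, and the validation is deliberately simplified.

```javascript
// Parse a W3C traceparent value: version-traceid-parentid-flags.
const TRACEPARENT = /^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/;

function parseTraceparent(value) {
  const match = TRACEPARENT.exec(value ?? '');
  if (!match) return null; // missing or malformed: the job would start a new trace
  const [, version, traceId, spanId, flags] = match;
  // All-zero IDs are invalid per the spec.
  if (/^0+$/.test(traceId) || /^0+$/.test(spanId)) return null;
  return { version, traceId, spanId, sampled: (parseInt(flags, 16) & 1) === 1 };
}

// Hypothetical carrier as it might be stored next to a job payload.
const carrier = { traceparent: '00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01' };
console.log(parseTraceparent(carrier.traceparent));
```

If the trace ID seen by the job handler differs from the one on the originating HTTP request, the carrier was lost or overwritten somewhere between dispatch and handling.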


Node.js Instrumentation with @opentelemetry/sdk-node

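The Node.js SDK is among the most mature in the OpenTelemetry ecosystem. Auto-instrumentation covers Express, Fastify, Koa, HTTP, gRPC, Redis, and most database drivers you are likely to use. ORMs such as Prisma and Sequelize are typically instrumented through separate packages, so check the list bundled with @opentelemetry/auto-instrumentations-node before assuming coverage.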

Install

npm install @opentelemetry/sdk-node \
  @opentelemetry/auto-instrumentations-node \
  @opentelemetry/exporter-trace-otlp-http \
  @opentelemetry/exporter-metrics-otlp-http \
  @opentelemetry/sdk-metrics

Bootstrap File

Create a tracing.js (or tracing.ts) that initializes the SDK before any other imports:

import { NodeSDK } from '@opentelemetry/sdk-node';
import { getNodeAutoInstrumentations } from '@opentelemetry/auto-instrumentations-node';
import { OTLPTraceExporter } from '@opentelemetry/exporter-trace-otlp-http';
import { OTLPMetricExporter } from '@opentelemetry/exporter-metrics-otlp-http';
import { PeriodicExportingMetricReader } from '@opentelemetry/sdk-metrics';

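// Fall back to the collector's default OTLP/HTTP port if the variable is unset;
// otherwise the URL would become "undefined/v1/traces".
const endpoint = process.env.OTEL_EXPORTER_OTLP_ENDPOINT ?? 'http://localhost:4318';

const sdk = new NodeSDK({
  traceExporter: new OTLPTraceExporter({ url: `${endpoint}/v1/traces` }),
  metricReader: new PeriodicExportingMetricReader({
    exporter: new OTLPMetricExporter({ url: `${endpoint}/v1/metrics` }),
    exportIntervalMillis: 30000, // export metrics every 30 seconds
  }),
  instrumentations: [getNodeAutoInstrumentations()],
});

sdk.start();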

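Run your application with this file loaded first. The bootstrap above uses ESM import syntax, so it has to be loaded as an ES module (a .mjs extension, or "type": "module" in package.json) with --import:

node --import ./tracing.js index.js

If your project is CommonJS, rewrite the imports as require() calls and preload the file with --require instead:

node --require ./tracing.js index.js

ESM applications may additionally need the loader hook described in the @opentelemetry/instrumentation documentation before library patching takes effect.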

Event Loop Metrics

The default auto-instrumentation includes Node.js runtime metrics - event loop lag, active handles, garbage collection duration. These are the metrics that tell you whether your service is healthy under load before errors start appearing.

If you are using a framework like Fastify, add a process-level check that alerts when event loop lag exceeds 100ms. That threshold catches most blocking operations before users notice.
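
A process-level check along those lines could look like the sketch below, which uses Node's built-in perf_hooks histogram instead of any framework plugin. The startLagWatch and lagExceeded names, the 10-second check interval, and the use of the 99th percentile are assumptions for the example. Readings may include the sampling resolution itself, so calibrate the threshold against an idle process.

```javascript
const { monitorEventLoopDelay } = require('node:perf_hooks');

const LAG_THRESHOLD_MS = 100;

// The histogram reports nanoseconds; compare in milliseconds.
function lagExceeded(p99Nanoseconds, thresholdMs = LAG_THRESHOLD_MS) {
  return p99Nanoseconds / 1e6 > thresholdMs;
}

// Periodically inspect the p99 event loop delay and report when it is too high.
// Returns a function that stops the watcher.
function startLagWatch({ intervalMs = 10_000, onLag = console.warn } = {}) {
  const histogram = monitorEventLoopDelay({ resolution: 20 });
  histogram.enable();
  const timer = setInterval(() => {
    const p99 = histogram.percentile(99);
    if (lagExceeded(p99)) {
      onLag(`event loop p99 delay ${(p99 / 1e6).toFixed(1)}ms exceeds ${LAG_THRESHOLD_MS}ms`);
    }
    histogram.reset(); // start a fresh window for the next check
  }, intervalMs);
  timer.unref(); // do not keep the process alive just for this check
  return () => {
    clearInterval(timer);
    histogram.disable();
  };
}
```

Calling startLagWatch() once during startup is enough. Passing an onLag callback that increments a counter or logs at error level lets your existing alerting pick it up.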


The OpenTelemetry Collector: Your Sampling and Routing Layer

Both the PHP and Node services above export to an OpenTelemetry Collector. This is not optional if you want vendor independence.

A minimal collector configuration that accepts OTLP, applies tail sampling, and forwards to Grafana Tempo (or any OTLP-compatible backend):

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318

processors:
  tail_sampling:
    decision_wait: 10s
    num_traces: 50000  # traces held in memory while waiting for a decision
    policies:
      - name: errors-policy
        type: status_code
        status_code: { status_codes: [ERROR] }
      - name: slow-traces-policy
        type: latency
        latency: { threshold_ms: 1000 }
      - name: probabilistic-policy
        type: probabilistic
        probabilistic: { sampling_percentage: 5 }
  batch:

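exporters:
  otlp:
    endpoint: tempo:4317
    tls:
      insecure: true
  # Tempo stores traces only, so metrics need their own destination.
  # Hostname and path assume a Prometheus instance with its OTLP receiver enabled.
  otlphttp/prometheus:
    endpoint: http://prometheus:9090/api/v1/otlp

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [tail_sampling, batch]
      exporters: [otlp]
    metrics:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlphttp/prometheus]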

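The tail sampling policy here keeps 100% of error traces and slow traces (over 1 second), and samples 5% of everything else. For a service handling 1,000 requests per minute, with every trace reaching the collector, that means you store roughly 50 baseline traces per minute plus all errors and slow traces - more than enough for debugging. Keep in mind that a head sampler in the SDK, such as the 10% ratio shown in the PHP section, drops traces before the collector sees them, so the collector can only keep the errors that arrive. If you want every error retained, export everything from the SDK and let the collector make the decision.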
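
The retained volume is easy to estimate for your own traffic. The sketch below assumes that error and slow traces do not overlap and that the collector receives every trace. The keptPerMinute helper and the 1% error and 2% slow rates are hypothetical inputs, not measurements.

```javascript
// Estimate traces kept per minute by the three policies above.
// Assumes error and slow traces do not overlap, and that the collector
// receives every trace (no head sampling in the SDK).
function keptPerMinute({ requestsPerMinute, errorRate, slowRate, baselinePercent }) {
  const errors = requestsPerMinute * errorRate;
  const slow = requestsPerMinute * slowRate;
  const rest = requestsPerMinute - errors - slow;
  const baseline = rest * (baselinePercent / 100);
  return { errors, slow, baseline, total: errors + slow + baseline };
}

// Hypothetical traffic profile: 1% errors, 2% slow requests.
console.log(keptPerMinute({
  requestsPerMinute: 1000,
  errorRate: 0.01,
  slowRate: 0.02,
  baselinePercent: 5,
}));
```

Multiplying the total by your average trace size gives a first approximation of backend ingestion, which is usually the number the bill depends on.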


Common Pitfalls

Starting with all signals at full volume. Turn on sampling from day one. A trace backend ingesting unsampled production traffic can quickly cost more than the rest of your infrastructure.

Not setting OTEL_SERVICE_NAME. Without a service name, the SDK falls back to a generic default such as unknown_service, so all your traces land in one bucket and correlation across services becomes impossible.

Skipping span attributes on custom spans. Auto-instrumentation gives you HTTP method, URL, status code, and database query. But business-level context - user ID, tenant ID, order ID - requires manual attribute setting. Add these on the spans that matter for your debugging workflow, not everywhere.

Forgetting to handle SDK shutdown. In Node.js, register a process.on('SIGTERM') handler that calls sdk.shutdown(). Without this, the last batch of telemetry before a pod restart is lost.
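
A minimal sketch of such a shutdown handler follows. It assumes the sdk object exposes a shutdown() method that returns a promise, as the NodeSDK instance from the bootstrap file does. The registerShutdown helper and its proc parameter (there only to make the function testable) are inventions for this example.

```javascript
// Flush pending telemetry before the process exits.
// `sdk` is anything with an async shutdown(), e.g. the NodeSDK instance.
function registerShutdown(sdk, { signals = ['SIGTERM', 'SIGINT'], proc = process } = {}) {
  let shuttingDown = false;
  const handler = async () => {
    if (shuttingDown) return; // ignore repeated signals
    shuttingDown = true;
    try {
      await sdk.shutdown(); // flushes buffered spans and metrics
    } catch (err) {
      console.error('telemetry shutdown failed', err);
    } finally {
      proc.exit(0);
    }
  };
  for (const signal of signals) proc.once(signal, handler);
  return handler;
}
```

In the bootstrap file this would be called as registerShutdown(sdk) right after sdk.start(). If your application has its own SIGTERM handling, such as closing the HTTP server, call sdk.shutdown() from that handler instead so the two do not race to exit.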


Choosing a Backend

Once you are exporting to the collector, the backend is a configuration change. Common options:

  • Grafana Tempo + Prometheus - open source, runs on your infrastructure, no per-seat pricing
  • Jaeger - simpler setup than Tempo, good for smaller teams
  • Honeycomb, Grafana Cloud, Datadog - managed options that accept OTLP directly

Because your data flows through the collector, switching from Jaeger to Tempo or from Tempo to a managed service takes about 10 minutes of configuration changes and a collector restart. No code changes required. This is the entire value proposition.


Integrating Observability with Code Quality Work

Observability and code quality consulting are more connected than teams often realize. A well-instrumented service makes performance regressions visible before they reach production, turns vague "the app is slow" complaints into specific spans, and gives you the data to justify refactoring work. If your codebase currently has no tracing and you are trying to figure out where time is spent in a request, that is also a sign that the architecture may benefit from a broader review.

If you are building or modernizing a service and want instrumentation built in from the start rather than bolted on later, get in touch at hello@wolf-tech.io or visit wolf-tech.io. An instrumented codebase is a measurable codebase - and that changes how confidently you can ship.


Quick Reference

                     | PHP                                                | Node.js
Core package         | open-telemetry/sdk                                 | @opentelemetry/sdk-node
Auto-instrumentation | opentelemetry-auto-symfony, opentelemetry-auto-pdo | @opentelemetry/auto-instrumentations-node
Trace exporter       | open-telemetry/exporter-otlp                       | @opentelemetry/exporter-trace-otlp-http
Metrics exporter     | open-telemetry/exporter-otlp                       | @opentelemetry/exporter-metrics-otlp-http
Trace SDK maturity   | Stable                                             | Stable
Metrics SDK maturity | Beta                                               | Stable
Context propagation  | Manual for async/queue                             | Automatic for HTTP, manual for queues

FAQ

Is OpenTelemetry PHP production-ready? The traces SDK is stable. The metrics SDK is beta but has been used in production by larger teams since late 2025 without major issues. If you need metrics stability guarantees, start with traces only and add metrics once you have verified the integration.

Does OpenTelemetry work with Symfony? Yes. The opentelemetry-auto-symfony package hooks into kernel request/response events, controller resolution, and console commands. You get HTTP traces with no manual code changes.

Can I use OpenTelemetry alongside an existing APM agent? In principle yes, but in practice they often conflict on instrumentation hooks. The cleaner path is to migrate fully rather than run both simultaneously.

How much overhead does OpenTelemetry add? With sampling enabled and the OTLP exporter configured for async batching, the overhead is typically under 1ms per request for PHP and under 0.5ms for Node.js. Unsampled, synchronous export is a different story - always use batching.