Introducing otex

We built this for ourselves.

At Parseable, we are building the first datalake purpose-built for observability, which means we spend a lot of time thinking about where and how that data comes to us.

If you have ever spent an afternoon wiring up OpenTelemetry traces across a codebase, manually wrapping handlers in tracer.startSpan() blocks, chasing all the breaking SDK changes and debugging exporter configs, you know how quickly instrumentation becomes a full-time job of its own.

We built otex, a VS Code extension that instruments a repository automatically with OpenTelemetry distributed tracing. otex deeply understands the OTel specs, SDKs and config. It analyzes your codebase, identifies where spans should go, and produces a clean, production-ready instrumentation setup with zero manual span blocks.

01 The problem

Instrumentation is difficult

Instrumenting an application is a complex yet boring activity. Complex, because it demands deep familiarity with the application and an understanding of OTel, its semantics, and its ecosystem. Boring, because it's mostly boilerplate code that doesn't add business value. Until it does (:

The natural result: developers put off instrumentation until the last moment, then rush through it to meet a deadline. What comes out is often low-quality instrumentation: it doesn't follow semantic conventions, lacks a coherent flow because different developers have different opinions on what needs to be instrumented, misses critical spans, or breaks when the SDK updates.

LLMs getting great at code generation puts high-quality instrumentation within reach. But as we'll see, the generalist nature of LLMs creates its own set of challenges that make them a costly and unreliable solution for instrumentation specifically.

02 Coding agents

Coding agents should instrument, right?

The natural instinct is to reach for a coding agent, i.e. point Claude Code or Codex at the application, tell it to add tracing, and expect it to figure out the rest.

In practice, this approach has three failure modes that compound on each other.

1. Hallucinated packages

A USENIX study analyzing 576,000 code samples from 16 different LLMs found that 19.7% of generated package references were hallucinated. The code was importing packages that do not exist. JavaScript packages were especially prone, with a 21.3% hallucination rate. And 43% of these hallucinated packages recurred consistently across repeated queries, meaning the model is confidently wrong in the same way every time.

For OpenTelemetry specifically, the package ecosystem is sprawling. There are @opentelemetry/sdk-node, @opentelemetry/sdk-trace-node, @opentelemetry/sdk-trace-base, @opentelemetry/auto-instrumentations-node, and dozens of individual instrumentation packages. An agent driven directly by an LLM has to pick the right combination from memory, and frequently picks wrong.
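One way to catch this class of error mechanically is to compare every module the generated code loads against what the project actually declares. The sketch below is a minimal illustration, not otex's implementation: the function names are hypothetical, the regex only handles simple require/import forms, and @opentelemetry/express-tracer is a deliberately made-up package standing in for a hallucinated one.

```javascript
// Hypothetical guard: flag generated code that loads packages the project
// does not declare. All names are illustrative, not otex internals.

// Collect module specifiers from require("...") and import ... from "..." forms.
function findModuleSpecifiers(source) {
  const pattern = /(?:require\(\s*|from\s+|import\s+)["']([^"']+)["']/g;
  const found = new Set();
  for (const match of source.matchAll(pattern)) {
    found.add(match[1]);
  }
  return [...found];
}

// Return specifiers that are neither relative paths, node: built-ins,
// nor declared dependencies. Subpath imports and unprefixed built-ins
// (such as "fs") are not handled in this sketch.
function findUndeclaredPackages(source, packageJson) {
  const declared = new Set([
    ...Object.keys(packageJson.dependencies || {}),
    ...Object.keys(packageJson.devDependencies || {}),
  ]);
  return findModuleSpecifiers(source).filter(
    (name) => !name.startsWith(".") && !name.startsWith("node:") && !declared.has(name)
  );
}

// Example input: the second package is invented on purpose.
const generated = `
const { NodeSDK } = require("@opentelemetry/sdk-node");
const { ExpressTracer } = require("@opentelemetry/express-tracer");
`;
// The version string is illustrative.
const packageJson = {
  dependencies: { "@opentelemetry/sdk-node": "^0.200.0" },
};

const undeclared = findUndeclaredPackages(generated, packageJson);
console.log(undeclared);
```

A check like this only proves that a package is declared locally. Confirming that it exists in the registry, and at a compatible version, would be a separate step.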

2. Version drift

OTel keeps evolving, and new versions bring backward-incompatible changes that complex codebases have to absorb. For example, the OpenTelemetry JS SDK shipped a major breaking change in v2.0 (released February 2025): the Resource class constructor was removed entirely.

// What LLMs still generate (v1.x — BROKEN on v2.x):
const { Resource } = require("@opentelemetry/resources");
new Resource({ "service.name": "my-app" });

// What v2.x actually requires:
const { resourceFromAttributes } = require("@opentelemetry/resources");
resourceFromAttributes({ "service.name": "my-app" });

This is a full constructor removal, and just one of a dozen breaking changes in the same release: Resource.default() became defaultResource(), addSpanProcessor() was removed from TracerProvider, the minimum Node.js version jumped to ^18.19.0 || >=20.6.0, and View and Aggregation classes were replaced with object-based configs.
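A deterministic alternative is to key the generated code on the version the project actually resolves. The following is a rough sketch under stated assumptions: majorVersion and resourceSnippet are hypothetical helpers, the version strings are illustrative, and a real tool would read the resolved version from a lockfile rather than parse a range string.

```javascript
// Hypothetical helper: choose the Resource construction snippet from the
// major version of @opentelemetry/resources that the project resolves to.

// Read the leading major number out of a version or simple range like "^2.0.0".
function majorVersion(versionOrRange) {
  const match = /(\d+)\./.exec(versionOrRange);
  if (!match) {
    throw new Error(`Cannot read a major version from "${versionOrRange}"`);
  }
  return Number(match[1]);
}

// Return source text that builds a resource with the API valid for that major.
function resourceSnippet(resourcesVersion, serviceName) {
  const attrs = JSON.stringify({ "service.name": serviceName });
  if (majorVersion(resourcesVersion) >= 2) {
    // v2.x: the Resource constructor is gone, use the factory function.
    return [
      'const { resourceFromAttributes } = require("@opentelemetry/resources");',
      `const resource = resourceFromAttributes(${attrs});`,
    ].join("\n");
  }
  // v1.x: the class constructor is still available.
  return [
    'const { Resource } = require("@opentelemetry/resources");',
    `const resource = new Resource(${attrs});`,
  ].join("\n");
}

console.log(resourceSnippet("^2.0.0", "my-app"));
console.log(resourceSnippet("1.30.1", "my-app"));
```

The point of the design is that the branch is taken by a lookup, not by a model's recollection of which API was current at training time.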

Agents and their driving models are trained on vast codebases, but their training data has a cutoff. When you ask Claude Code to instrument a project today, it has to reason about which SDK version your package.json will resolve to, which APIs are available in that version, and which semantic conventions are current. That is a lot of moving parts for a coding agent to get right in one shot.

3. Semantic convention drift

OpenTelemetry semantic conventions are versioned. Attributes get renamed, stabilized, and deprecated across releases. During the HTTP semantic conventions stabilization alone, 17 attributes were renamed (e.g., http.status_code became http.response.status_code). This is a natural part of the standardization process, but it creates a moving target for instrumentation, which agents just can't keep up with.

Given the non-deterministic nature of AI coding agents, you get a mix of old and new attribute names. The traces will ingest fine, but dashboards and alerts will silently miss data because they query for attributes that have been renamed. This is even worse than no instrumentation, because it gives you false confidence that you are tracking a critical signal when in reality your instrumentation is broken.
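Mixed attribute names are easy to detect once the renames live in a table instead of in a model's memory. The sketch below assumes a small, intentionally partial rename map covering a handful of HTTP attributes. The function name and the sample span are hypothetical.

```javascript
// A few renames from the HTTP semantic conventions stabilization.
// This table is intentionally partial; it is not the full list.
const RENAMED_HTTP_ATTRIBUTES = {
  "http.method": "http.request.method",
  "http.status_code": "http.response.status_code",
  "http.url": "url.full",
  "http.scheme": "url.scheme",
  "http.user_agent": "user_agent.original",
};

// Hypothetical check: report every deprecated attribute key found on a span.
function findDeprecatedAttributes(spanAttributes) {
  return Object.keys(spanAttributes)
    .filter((key) => Object.hasOwn(RENAMED_HTTP_ATTRIBUTES, key))
    .map((key) => ({ deprecated: key, replacement: RENAMED_HTTP_ATTRIBUTES[key] }));
}

// A span that mixes current names with one old name.
const span = {
  "http.request.method": "GET",
  "http.status_code": 200,
  "url.full": "https://example.com/orders",
};

console.log(findDeprecatedAttributes(span));
```

Run against generated instrumentation before it is written to disk, a check like this turns a silent dashboard gap into a visible failure.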

03 Skills

How about adding Skills to the agent?

Even if you add detailed OTel specs, give the agent specific tool access, and create a thorough skill, the core problem remains. Instrumentation is a multi-step, stateful process that requires a chain of deterministic decisions: detect the framework, identify instrumentation points, decide what to instrument, generate spans with correct semantic conventions, and validate that they compile. Each step needs the output of the previous one to be correct.

Skills don't solve this because they can't hold the chain together. A skill hands everything over to the LLM, including the parts where it is most unreliable: exact SDK APIs, convention names, and version-specific syntax. Framework detection, SDK version pinning, and semantic convention enforcement should be deterministic, not probabilistic.

Without a multi-step feedback loop grounded in actual information about the code, framework, SDK, and semantic conventions, skills fail at production-grade instrumentation.
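To make the idea of a stateful, gated chain concrete, here is a minimal sketch of a pipeline in which each step must pass its own check before the next one runs. Everything in it is an assumption for illustration: the runner, the step names, and the two-entry framework table are hypothetical and do not describe otex's internals.

```javascript
// Hypothetical lookup: framework name -> instrumentation package.
const INSTRUMENTATION_BY_FRAMEWORK = {
  express: "@opentelemetry/instrumentation-express",
  koa: "@opentelemetry/instrumentation-koa",
};

// Each step receives the accumulated state, returns new fields, and must
// pass its own check before the next step is allowed to run.
function runPipeline(steps, initialState) {
  let state = { ...initialState };
  for (const step of steps) {
    const output = step.run(state);
    const problem = step.check(output);
    if (problem) {
      throw new Error(`${step.name} failed: ${problem}`);
    }
    state = { ...state, ...output };
  }
  return state;
}

const steps = [
  {
    name: "detect-framework",
    // Deterministic: read declared dependencies instead of asking a model.
    run: (state) => {
      const deps = Object.keys(state.packageJson.dependencies || {});
      const framework = Object.keys(INSTRUMENTATION_BY_FRAMEWORK).find((f) => deps.includes(f));
      return { framework: framework || null };
    },
    check: (out) => (out.framework ? null : "no supported framework found"),
  },
  {
    name: "select-packages",
    // Deterministic: a table lookup keyed by the detected framework.
    run: (state) => ({
      packages: ["@opentelemetry/sdk-node", INSTRUMENTATION_BY_FRAMEWORK[state.framework]],
    }),
    check: (out) => (out.packages.every(Boolean) ? null : "unknown instrumentation package"),
  },
];

const result = runPipeline(steps, { packageJson: { dependencies: { express: "^4.19.0" } } });
console.log(result.framework, result.packages);
```

In a fuller version, the judgment steps (what and where to instrument) would call a model, but they would sit between gates like these instead of replacing them.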

04 Why otex

otex is an agent, built to instrument

otex's layered architecture uses deterministic scaffolding for the mechanical parts, like code and framework detection and identifying the right SDK version. The LLM handles the important judgment calls about what and where to instrument. The agentic design ensures that state is maintained, that there are guardrails at every step, and that a clean feedback loop evaluates the output before moving on to the next step. The LLM is a critical part of the process, but it is not on its own. The IDE serves as the execution surface, writing directly into the right files with proper diffs.

Built-in knowledge of OTel SDKs. otex ships with purpose-built knowledge of the SDK surface area, auto-instrumentation libraries, and exporter configurations. It already knows to use OTLP exporters and follow semantic conventions.

AST analysis before code generation. Before making any changes, otex parses your codebase's abstract syntax tree to understand the project structure: what framework you are using, which routes exist, where the entry point is. Every decision is grounded in your actual code, so package selection and API usage are always correct for your SDK version.

Predictable, auditable output. Every run produces the same result for the same codebase. The output follows a consistent pattern: one tracing.js setup file, one require statement in your entry point, the correct auto-instrumentation packages. You can review it in seconds.
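As a rough illustration of what "same input, same output" can look like, the sketch below renders a tracing.js file from a small plan object using a pure function. The renderer, the plan shape, and the exact file contents are assumptions made for this example. They are not otex's actual output, although the SDK calls in the template (NodeSDK, OTLPTraceExporter, getNodeAutoInstrumentations) are the standard Node.js setup.

```javascript
// Hypothetical renderer: turns a small plan object into the text of tracing.js.
// It is a pure function of its input, so equal plans give identical files.
function renderTracingFile(plan) {
  const lines = [
    '"use strict";',
    'const { NodeSDK } = require("@opentelemetry/sdk-node");',
    'const { OTLPTraceExporter } = require("@opentelemetry/exporter-trace-otlp-http");',
    'const { getNodeAutoInstrumentations } = require("@opentelemetry/auto-instrumentations-node");',
    "",
    "const sdk = new NodeSDK({",
    `  serviceName: ${JSON.stringify(plan.serviceName)},`,
    "  traceExporter: new OTLPTraceExporter(),",
    "  instrumentations: [getNodeAutoInstrumentations()],",
    "});",
    "",
    "sdk.start();",
    "",
  ];
  return lines.join("\n");
}

// The single line added to the application's entry point (path is illustrative).
const entryPointLine = 'require("./tracing");';

const plan = { serviceName: "my-app" };
console.log(renderTracingFile(plan));
console.log(entryPointLine);
```

Because the file is produced from a template and a plan, reviewing it means reading a dozen lines, and rerunning the tool on an unchanged codebase yields no diff.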

Try otex now