In part 1 of the LangSmith observability series, we covered the basic setup and how to view trace information in the LangSmith portal.
Now let's extend this to a very important aspect of Responsible AI: audit trails. An audit trail is a record of an AI application's runs, so that any third-party governance auditor can verify the details of a run, check for safety-threshold violations, and assess overall transparency.
Audits are now a compliance requirement (EU AI Act, NIST AI RMF, ISO/IEC 42001), yet most teams lack the operational visibility to perform them.
So by starting with audit in this post, we are really beginning with the end state you want to achieve in RAI.
Below we look at the key RAI audit requirements and how LangSmith can help meet them.
LangSmith provides multiple control levels when tracing.
In part 1 we saw that simply setting the tracing environment variable to "true" automatically traces every call, but gives the developer little control: you cannot easily skip a specific sub-function or add custom metadata.
The @traceable decorator gives medium control. You can wrap specific functions as a Run, capturing their inputs and outputs. You can also attach custom metadata or tags to annotate the trace, give the run a custom name, and specify the run type, such as "tool", "chain", or "llm".
Next is tracing_context, which allows dynamic, programmatic control over tracing boundaries and metadata within a block of code.
Taking fine-grained control one step further are the RunTree API and conversation threads, which let you see the full set of parent and child runs, and even the multiple steps involved in a conversation thread.
| Method | Control Level | Use Case |
|---|---|---|
| Environment variable (`LANGCHAIN_TRACING_V2=true`) | Low | Quick enablement for all traces |
| `@traceable` decorator | Medium | Capture key function inputs/outputs + add custom tags |
| `tracing_context()` | High | Dynamically control what to trace and what to exclude |
| `RunTree` & conversation threads | Very High | Custom tracing + parent-child reasoning chains |
The tracing technique can be applied to any function that performs a critical, non-deterministic, or sensitive step where an auditor can ask, “why did this happen?”.
A few examples:
Let's say the user's request triggers a decision based on some sensitive variable; custom metadata can then record these runtime facts that traditional logging would have missed.
Other candidates: a RAG system's custom logic for re-ranking documents, a human-in-the-loop validation step, or a conditional branching function that determines which model to use (e.g., a small model for low-risk requests, a large model for high-risk ones). Tracing these custom functions provides the "reasoning path" for the auditor to check.
When a human in the loop reviews and overrides the AI output, even this event can be captured in the tracing context.
All the audit logs can then be filtered and even passed to an LLM to translate them into a human-readable format, something a risk auditor from a non-technical background will prefer over JSON files!
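As a minimal, stdlib-only sketch of that translation step (the record fields mirror typical run data; a real pipeline might feed this summary to an LLM for richer prose):

```python
import json

def summarize_run(run_json: str) -> str:
    """Turn one raw trace record into a one-line summary for a non-technical auditor."""
    run = json.loads(run_json)
    tags = ", ".join(run.get("tags", [])) or "none"
    return (f"Run '{run['name']}' ({run['run_type']}), tags: {tags} | "
            f"inputs: {run['inputs']} -> outputs: {run['outputs']}")

record = json.dumps({
    "name": "credit_decision", "run_type": "chain", "tags": ["audit"],
    "inputs": {"score": 700}, "outputs": {"decision": "approve"},
})
summary = summarize_run(record)
```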
How to handle latency trade-offs
LangSmith also addresses latency concerns around all this tracing by submitting trace data asynchronously, off the critical path.
An architect could also prioritize trace logging by risk level. For high-volume, low-risk services, logging every request is expensive and inefficient, so intelligent sampling reduces overhead while staying compliant: log 100% of traces tagged with PII, sensitive variables, or policy overrides, and log at most around 5% of general, low-risk requests to monitor overall system health.
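A sketch of that sampling policy in plain Python (the tag names and the 5% rate are the assumptions from the text above); the returned flag can be fed to `tracing_context(enabled=...)`:

```python
import random

# Hypothetical set of high-risk tags that must always be traced.
ALWAYS_TRACE = {"pii", "sensitive", "policy_override"}

def should_trace(tags: set, sample_rate: float = 0.05) -> bool:
    """Trace 100% of high-risk runs and roughly sample_rate of the rest."""
    if tags & ALWAYS_TRACE:
        return True
    return random.random() < sample_rate
```

For example, `with tracing_context(enabled=should_trace(run_tags)): ...` applies the policy per request.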
An architect can also consider the OpenTelemetry industry standard, which allows trace data to be buffered and exported to a centralized collector service outside the core application, providing near-zero overhead and good isolation.
Conclusion
RAI auditability is most effective when designed upfront, not added later as a patch.
By combining fine-grained tracing controls, custom metadata, risk-based sampling, and asynchronous export, you build systems that are transparent, governable, and regulator-ready, without compromising performance.