AI Observability for LLM Evals with Langfuse
This article documents an evaluation harness for a Remote EU job classifier, but the real focus is AI observability: how to design traces, spans, metadata, scores, and run-level grouping in Langfuse so you can debug, compare, and govern LLM behavior over time.
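Before diving in, here is a minimal sketch of the data model those concepts imply. This is an illustrative, hypothetical model (the class names, fields, and helper functions are assumptions, not the Langfuse SDK API): a trace per classified item, spans nested inside it, scores attached to the trace, and a run identifier in metadata that lets you aggregate whole evaluation runs.

```python
# Hypothetical data model illustrating traces, spans, scores, and run-level
# grouping. These classes are an illustration, NOT the Langfuse SDK API.
from dataclasses import dataclass, field
from collections import defaultdict

@dataclass
class Span:
    name: str                       # one unit of work, e.g. "classify"
    input: str = ""
    output: str = ""

@dataclass
class Trace:
    name: str                       # e.g. one trace per job posting evaluated
    metadata: dict = field(default_factory=dict)  # e.g. {"run": "eval-2024-06"}
    spans: list = field(default_factory=list)
    scores: dict = field(default_factory=dict)    # e.g. {"accuracy": 1.0}

def group_by_run(traces):
    """Group traces by the run identifier stored in metadata, so one eval
    run can be compared against another over time."""
    runs = defaultdict(list)
    for t in traces:
        runs[t.metadata.get("run", "unknown")].append(t)
    return dict(runs)

def run_accuracy(traces):
    """Roll per-trace 'accuracy' scores up into a single run-level metric."""
    vals = [t.scores["accuracy"] for t in traces if "accuracy" in t.scores]
    return sum(vals) / len(vals) if vals else None
```

The point of the sketch is the shape, not the implementation: trace-level metadata carries the run name, scores live on traces, and run-level comparison is just a group-by over that metadata.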
