
25 posts tagged with "LangGraph"

Building stateful, multi-step AI agents and workflows with LangGraph — graphs, reducers, checkpointing, and durable execution.


Observable AI Memory: mem0, LangGraph, and Qdrant with Enterprise-Grade Telemetry

13 min read
Vadim Nicolai
Senior Software Engineer

Most "AI memory" demos stop at memory.add() and memory.search(). That works on a laptop. It does not survive contact with production. The real questions are: When this recall is slow, which store is to blame? When a graph's spend triples overnight, which feature caused it? When a customer asks "what did your agent remember about me, and when?", can you answer from an audit log instead of a shrug?

TL;DR — This field report shows how to build an agent memory layer where every operation honors a contract: fail-open, PII-safe, and fully instrumented. Three stores (mem0, Qdrant, LangGraph) are funneled through single chokepoints, and each chokepoint fans out to five telemetry sinks. The result is a stack that answers the hard production questions without guesswork.
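As a rough illustration of that contract, here is a minimal sketch of a single read chokepoint that fails open, redacts PII before anything reaches telemetry, and fans each event out to a list of sinks. It is not the post's actual code: `recall`, `emit`, `redact`, the `Sink` type, and the event field names are all hypothetical, and the email-only regex stands in for a real PII scrubber.

```python
import logging
import re
import time
from typing import Any, Callable, Dict, List

logger = logging.getLogger("memory.chokepoint")

# Hypothetical sink type: any callable that accepts one telemetry event.
Sink = Callable[[Dict[str, Any]], None]

# Assumption: emails are the only PII masked in this sketch.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def redact(text: str) -> str:
    """Mask email addresses before the text reaches any telemetry sink."""
    return EMAIL_RE.sub("<email>", text)


def emit(sinks: List[Sink], event: Dict[str, Any]) -> None:
    """Fan one event out to every sink; a failing sink never blocks the rest."""
    for sink in sinks:
        try:
            sink(event)
        except Exception:
            logger.exception("telemetry sink failed")


def recall(
    store_name: str,
    search: Callable[[str], List[Any]],
    query: str,
    sinks: List[Sink],
) -> List[Any]:
    """Single chokepoint for reads: timed, instrumented, and fail-open."""
    start = time.perf_counter()
    ok = True
    try:
        results = search(query)
    except Exception:
        # Fail open: a broken store yields an empty recall, not a crashed agent.
        logger.exception("recall failed for store %s", store_name)
        ok = False
        results = []
    emit(sinks, {
        "op": "recall",
        "store": store_name,  # answers "which store is to blame?"
        "ok": ok,
        "latency_ms": (time.perf_counter() - start) * 1000.0,
        "query": redact(query),  # only the redacted query is recorded
    })
    return results
```

Because every read passes through `recall`, the store name and latency travel with each event, which is what makes a slow recall attributable to one store. Writes would get an analogous wrapper.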

Multi-Modal Evaluation for AI-Generated LEGO Parts: A Production DeepEval Pipeline

19 min read
Vadim Nicolai
Senior Software Engineer

Your AI pipeline generates a parts list for a LEGO castle MOC. It says you need 12x "Brick 2 x 4" in Light Bluish Gray, 8x "Arch 1 x 4" in Dark Tan, and 4x "Slope 45 2 x 1" in Sand Green. The text looks plausible. But does the part image next to "Arch 1 x 4" actually show an arch? Does the quantity make sense for a castle build? Would this list genuinely help someone source bricks for the build?

These are multi-modal evaluation questions — they span text accuracy, image-text coherence, and practical usefulness. Standard unit tests cannot answer them. This article walks through a production evaluation pipeline built with DeepEval that evaluates AI-generated LEGO parts lists across five axes, using image metrics that most teams haven't touched yet.

The system is real. It runs in Bricks, a LEGO MOC discovery platform built with Next.js 19, LangGraph, and Neon PostgreSQL. The evaluation judge is DeepSeek — not GPT-4o — because you don't need a frontier model to grade your outputs.

CrewAI's Genuinely Unique Features: An Honest Technical Deep-Dive

14 min read
Vadim Nicolai
Senior Software Engineer

TL;DR — CrewAI's real uniqueness is that it models problems as "build a team of people" rather than "build a graph of nodes" (LangGraph) or "build a conversation" (AutoGen). The Crews + Flows dual-layer architecture is the core differentiator. The role-playing persona system and autonomous delegation are ergonomic wins, not technical breakthroughs. The hierarchical manager is conceptually appealing but broken in practice. This post separates what's genuinely novel from what's marketing.