11 posts tagged with "rust"

Multi-Probe Bayesian Spam Gating: Filtering Junk Before Spending Compute

· 44 min read
Vadim Nicolai
Senior Software Engineer

In a B2B lead generation pipeline, every email that arrives costs compute. Scoring it for buyer intent, extracting entities, predicting reply probability, matching it against your ideal customer profile — each module is a DeBERTa forward pass. If 40% of inbound email is template spam, AI-generated slop, or mass-sent campaigns, you are burning 40% of your GPU budget on garbage.

The solution is a gating module: a spam classifier that sits at stage 2 of the pipeline and filters junk before anything else runs. But a binary spam/not-spam classifier is too blunt. You need to know why something is spam (template? AI-generated? role account?), how confident you are (is it ambiguous, or have you never seen this pattern before?), and which provider will block it (Gmail is stricter than Yahoo on link density).

This article documents a hierarchical Bayesian spam gating system with 4 aspect-specific attention probes, information-theoretic AI detection features, uncertainty decomposition, and a full Rust distillation path. The Python model trains on DeBERTa-v3-base. The Rust classifier runs at batch speed with 24 features and zero ML dependencies.
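To make the "zero ML dependencies" claim concrete, here is a minimal sketch of a hand-rolled feature gate in plain Rust. The two features (link density, template-phrase hits) and their weights are invented for illustration; the actual system uses 24 features distilled from the DeBERTa model.

```rust
/// Illustrative link-density feature: links per word.
fn link_density(email: &str) -> f64 {
    let words = email.split_whitespace().count().max(1);
    let links = email.matches("http").count();
    links as f64 / words as f64
}

/// Illustrative template-marker feature; a real gate would learn these.
fn template_phrase_hits(email: &str) -> usize {
    const PHRASES: [&str; 3] = ["limited time", "act now", "dear sir"];
    let lower = email.to_lowercase();
    PHRASES.iter().filter(|p| lower.contains(*p)).count()
}

/// Combine features into a log-odds score and gate above a threshold.
fn is_spam(email: &str) -> bool {
    let mut log_odds = -1.0_f64; // prior: most inbound mail is ham
    log_odds += 8.0 * link_density(email); // high link density pushes toward spam
    log_odds += 1.5 * template_phrase_hits(email) as f64;
    let p_spam = 1.0 / (1.0 + (-log_odds).exp());
    p_spam > 0.5
}

fn main() {
    let junk = "Act now! Limited time offer http://x http://y http://z";
    let real = "Hi team, attached are the Q3 pipeline notes for review.";
    println!("junk gated: {}", is_spam(junk)); // true
    println!("real gated: {}", is_spam(real)); // false
}
```

The point of the sketch is the shape, not the weights: everything downstream of the gate only ever sees mail that cleared this cheap check, so the expensive DeBERTa modules never touch the junk.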

Building a ZoomInfo Alternative with Qwen and MLX: Local Buyer Intent Detection

· 11 min read
Vadim Nicolai
Senior Software Engineer

ZoomInfo charges $300+ per user per month for intent data — buying signals that tell sales teams which companies are actively in-market. It is the platform's number one feature and the reason enterprises pay six figures annually for access. But the underlying technology — classifying company content into intent categories — is a text classification problem. One that a 3-billion-parameter open-source model can solve on a single laptop.
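To see why this reduces to text classification, here is a deliberately crude sketch: a keyword baseline standing in where the post fine-tunes a 3B Qwen model. The category labels and keywords are invented for illustration.

```rust
/// Illustrative only: intent tagging as plain text classification.
/// Labels and keyword lists are hypothetical placeholders for a model.
fn intent_category(page_text: &str) -> &'static str {
    let lower = page_text.to_lowercase();
    let signals = [
        ("hiring", ["careers", "we're hiring", "open roles"]),
        ("expansion", ["new office", "series b", "funding round"]),
    ];
    for (label, keywords) in signals {
        if keywords.iter().any(|k| lower.contains(k)) {
            return label;
        }
    }
    "no-intent"
}

fn main() {
    println!("{}", intent_category("Acme announces Series B funding round"));
    // prints "expansion"
}
```

A keyword list obviously misses paraphrase and context; the post's argument is that the jump from this baseline to a fine-tuned 3B model is what closes that gap, without a $300/seat subscription.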

Fine-Tune Qwen3 with LoRA for AI Cold Email Outreach

· 27 min read
Vadim Nicolai
Senior Software Engineer

An AI cold email engine does one thing: it reads what you know about a company and writes a personalized outreach email — automatically, at scale. If you've ever spent an afternoon manually tweaking 50 nearly-identical emails, you understand the problem. If you've paid for Instantly, Smartlead, or Apollo, you've already solved it — just not on your own terms.

Those SaaS tools charge $30-200/month, send your prospect list to their servers, and give you a black-box model you can't touch. You can't train it on your best-performing emails. You can't add custom quality gates. You can't run it offline. For engineers and technical founders, that's a bad deal.

This system is the alternative: a locally-run pipeline where you own every layer — model weights, scoring logic, and approval gates. The core is Qwen3-1.7B, fine-tuned with LoRA adapters on MLX (Apple's framework for M1/M2 Metal acceleration). A Rust orchestration layer drives the full batch loop: pulling company records, invoking the model, running quality filters, and surfacing emails for human review before anything sends.

The result is not a toy. On a single M1 MacBook Pro, the pipeline generates 200+ personalized emails per batch in under 10 seconds — no GPU cloud, no API latency, no per-email cost. Fine-tuning converges in under 30 minutes on the same machine.
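The shape of that orchestration loop can be sketched in a few lines of Rust. The types and the quality rules here are hypothetical, and the model call is stubbed out where the real pipeline would invoke Qwen3 via MLX.

```rust
/// Hypothetical record and draft types for illustration.
struct Company { name: String, note: String }
struct Draft { to: String, body: String }

/// Stub for the MLX model invocation.
fn generate(c: &Company) -> Draft {
    Draft { to: c.name.clone(), body: format!("Hi {}, saw that {}.", c.name, c.note) }
}

/// Example quality gate: reject drafts that are too short or
/// that fail to mention the company by name.
fn passes_quality(d: &Draft) -> bool {
    d.body.len() > 20 && d.body.contains(&d.to)
}

/// Pull records, generate, filter, and queue survivors for human review.
fn run_batch(companies: &[Company]) -> Vec<Draft> {
    companies.iter()
        .map(generate)
        .filter(passes_quality)
        .collect() // nothing sends until a human approves
}

fn main() {
    let batch = vec![
        Company { name: "Acme".into(), note: "you ship Rust tooling".into() },
        Company { name: "X".into(), note: "".into() }, // will fail the gate
    ];
    let approved = run_batch(&batch);
    println!("{} drafts queued for review", approved.len());
}
```

The design point is that every stage is ordinary code you own: swapping in a different model, a stricter gate, or an extra filter is an edit, not a support ticket.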

From Research Papers to Production: ML Features Powering a Crypto Scalping Engine

· 33 min read
Vadim Nicolai
Senior Software Engineer

Every feature in a production trading system has an origin story — a paper, a theorem, a decades-old insight from probability theory or market microstructure. This post catalogs 14 ML features implemented in a Rust crypto scalping engine, traces each back to its foundational research, shows the actual formulas, and includes real production code. The engine processes limit order book (LOB) snapshots, trade ticks, and funding rate data in real time to generate scalping signals for crypto perpetual futures.
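As a taste of what those features look like, here is one classic microstructure quantity: top-of-book queue imbalance, (bid_qty − ask_qty) / (bid_qty + ask_qty), bounded in [−1, 1]. This is a standard LOB feature from the literature, sketched here for illustration rather than taken from the engine's exact implementation.

```rust
/// Minimal top-of-book snapshot for the sketch.
struct TopOfBook { bid_qty: f64, ask_qty: f64 }

/// Queue imbalance: +1 means all resting size is on the bid,
/// -1 means all of it is on the ask, 0 means balanced.
fn queue_imbalance(lob: &TopOfBook) -> f64 {
    let total = lob.bid_qty + lob.ask_qty;
    if total == 0.0 { 0.0 } else { (lob.bid_qty - lob.ask_qty) / total }
}

fn main() {
    let snap = TopOfBook { bid_qty: 30.0, ask_qty: 10.0 };
    // Positive imbalance: buy pressure dominates at the touch.
    println!("imbalance = {}", queue_imbalance(&snap)); // 0.5
}
```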

Two Paradigms of Multi-Agent AI: Rust Parallel Agents vs Claude Code Agent Teams

· 28 min read
Vadim Nicolai
Senior Software Engineer
TL;DR

Three multi-agent coordination patterns, one codebase. A static Rust/Tokio fan-out assigns 20 agents at compile time with zero coordination overhead. A team.rs library implements the full Claude Code agent-teams model in pure Rust — TaskQueue, Mailbox, PlanGate, ShutdownToken — and the study pipeline now uses it to run a 2-step search→write flow with inter-worker messaging. Claude Code agent teams invert every assumption of static fan-out: dynamic task claiming, file-locked concurrency, full bidirectional messaging. The decision rule is one question: do your agents need to talk to each other? If not: tokio::spawn + Arc<T>. If so: build team.rs or use TeamCreate.

Multi-agent AI engineering has become a core discipline in production software development. The interesting question is no longer whether to build multi-agent systems; it is how, and specifically which architectural pattern to reach for given the nature of the work. The clearest way to see the trade-offs is to put multiple, fundamentally different paradigms inside the same codebase.
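The static fan-out side of that comparison is small enough to sketch. The post's version uses tokio::spawn; the version below uses std::thread so it runs with no dependencies, but the shape is identical: workers fixed up front, shared read-only state behind Arc<T>, and no inter-worker messaging. The task list and agent count are illustrative.

```rust
use std::sync::Arc;
use std::thread;

/// Static fan-out: N workers decided at spawn time, work assigned by
/// index, results collected by joining. Zero coordination between agents.
fn fan_out(agent_count: usize) -> usize {
    let shared = Arc::new(vec!["task-a", "task-b", "task-c"]);
    let handles: Vec<_> = (0..agent_count)
        .map(|id| {
            let state = Arc::clone(&shared); // each agent gets a cheap handle
            thread::spawn(move || {
                // Assignment is fixed at spawn: no queue, no claiming.
                let task = state[id % state.len()];
                format!("agent {id} ran {task}").len()
            })
        })
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).sum()
}

fn main() {
    println!("total output bytes: {}", fan_out(3));
}
```

This is the "if not" branch of the decision rule: when agents never need to talk to each other, the entire coordination layer collapses into spawn-and-join.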