Reduced human screening by ~96%
Role: CTO & Lead Architect · HireFinch — AI voice interviewer
The problem
Recruiters were drowning in early screens. Hiring managers lacked structured signal. Ad-hoc interviews wasted everyone's time and produced inconsistent data that made panel prep a guessing game.
What we built
An agentic voice interviewer that maps to job-specific rubrics. Every score cites the exact transcript span and rubric level, so managers get an executive summary in minutes, not hours. We chose multi-model failover (OpenAI ⇄ Gemini) because no single provider met our uptime bar. A provider-agnostic inference shim with circuit breakers handles routing; the candidate never notices a failover.
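For readers unfamiliar with the pattern, here is a minimal sketch of a provider-agnostic shim with per-provider circuit breakers. It is an illustration under assumptions, not the production code: the names `CircuitBreaker`, `Provider`, and `InferenceShim`, the failure threshold, and the cooldown are all hypothetical, and each provider is modeled as a plain callable rather than a real OpenAI or Gemini client.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Sequence, Tuple


class AllProvidersUnavailable(RuntimeError):
    """Raised when every provider failed or was skipped by an open breaker."""


@dataclass
class CircuitBreaker:
    failure_threshold: int = 3       # assumed: consecutive failures before opening
    cooldown_seconds: float = 30.0   # assumed: how long an open breaker skips the provider
    clock: Callable[[], float] = time.monotonic
    failures: int = 0
    opened_at: Optional[float] = None

    def allows_request(self) -> bool:
        if self.opened_at is None:
            return True  # closed: traffic flows normally
        # Half-open: once the cooldown has elapsed, let a trial request through.
        return self.clock() - self.opened_at >= self.cooldown_seconds

    def record_success(self) -> None:
        self.failures = 0
        self.opened_at = None

    def record_failure(self) -> None:
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = self.clock()  # open (or re-open) the breaker


@dataclass
class Provider:
    name: str
    call: Callable[[str], str]  # hypothetical stand-in for a real model client
    breaker: CircuitBreaker = field(default_factory=CircuitBreaker)


class InferenceShim:
    """Tries providers in priority order, skipping any whose breaker is open."""

    def __init__(self, providers: Sequence[Provider]):
        self.providers: List[Provider] = list(providers)

    def complete(self, prompt: str) -> Tuple[str, str]:
        for provider in self.providers:
            if not provider.breaker.allows_request():
                continue
            try:
                reply = provider.call(prompt)
            except Exception:
                provider.breaker.record_failure()
                continue  # fail over to the next provider
            provider.breaker.record_success()
            return provider.name, reply
        raise AllProvidersUnavailable("no provider could serve the request")
```

In this sketch, once the primary has failed `failure_threshold` times in a row it is skipped entirely until the cooldown elapses, so later requests go straight to the backup without first waiting on a failing provider.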
The evidence
Realtime reply latency at p95 ≤ 1.2s with failover. Availability SLO of 99.95% with error-budget guardrails. Rubric-adherence F1 ≥ 0.85. PII leakage = 0 on the policy suite. Proctoring blends stylometry, latency profiles, and webcam snapshots with ≤7-day retention.
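To make the availability figure concrete: a 99.95% SLO leaves an error budget of 0.05% of the window, which is about 21.6 minutes over 30 days. The sketch below shows that arithmetic and one possible guardrail. The 30-day window, the function names, and the 25% freeze threshold are illustrative assumptions, not details from the project.

```python
def error_budget_minutes(slo: float, window_days: int = 30) -> float:
    """Allowed downtime in minutes for an availability SLO over a window."""
    if not 0.0 < slo < 1.0:
        raise ValueError("slo must be a fraction strictly between 0 and 1")
    return (1.0 - slo) * window_days * 24 * 60


def budget_remaining(slo: float, downtime_minutes: float, window_days: int = 30) -> float:
    """Fraction of the error budget still unspent (negative once it is blown)."""
    budget = error_budget_minutes(slo, window_days)
    return (budget - downtime_minutes) / budget


def should_freeze_releases(slo: float, downtime_minutes: float,
                           window_days: int = 30, min_remaining: float = 0.25) -> bool:
    """Hypothetical guardrail: pause risky rollouts when little budget is left."""
    return budget_remaining(slo, downtime_minutes, window_days) < min_remaining


if __name__ == "__main__":
    print(round(error_budget_minutes(0.9995), 1))        # 21.6
    print(should_freeze_releases(0.9995, downtime_minutes=18.0))  # True
```

A guardrail like this ties release decisions to measured downtime rather than to a calendar.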
Read the full build notes
$5M saved, 91% live task success
Role: Product Engineering Lead · Enterprise LLM Copilot for Ops, Freight & Finance
The problem
Ops, freight, and finance teams burned time chasing scale tickets and reconciling buy/sell lines. Month-end spikes delayed invoicing. Chase emails were the default workflow — and they didn't scale.
What we built
An agentic workflow with RAG over OMS/TMS, email, and knowledge bases. The copilot handles tool calls for TMS status, proof-of-delivery retrieval, and portal/EDI fetches — all read-only. We enforced PII scrubbing on ingest, human-in-the-loop gates for sensitive actions, and audit logs on every tool call.
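The guardrails described here (read-only tools, human-in-the-loop gates, an audit entry for every tool call) can be sketched as a small tool runner. This is a simplified illustration under assumptions, not the deployed system: `Tool`, `ToolRunner`, the `approver` callback, and the tool names used in the usage note are hypothetical.

```python
import json
import time
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional


@dataclass(frozen=True)
class Tool:
    name: str
    handler: Callable[..., Any]
    read_only: bool = True
    needs_approval: bool = False  # sensitive lookups go through a human gate


class ToolRunner:
    def __init__(self, approver: Optional[Callable[[str, Dict[str, Any]], bool]] = None):
        self._tools: Dict[str, Tool] = {}
        self._approver = approver          # hypothetical human-approval callback
        self.audit_log: List[str] = []     # one JSON line per attempted call

    def register(self, tool: Tool) -> None:
        # Enforce the read-only policy at registration time.
        if not tool.read_only:
            raise ValueError(f"{tool.name}: only read-only tools may be registered")
        self._tools[tool.name] = tool

    def _audit(self, name: str, args: Dict[str, Any], status: str) -> None:
        entry = {"ts": time.time(), "tool": name, "args": args, "status": status}
        self.audit_log.append(json.dumps(entry, default=str))

    def call(self, name: str, /, **args: Any) -> Any:
        tool = self._tools[name]
        if tool.needs_approval:
            approved = self._approver is not None and self._approver(name, args)
            if not approved:
                self._audit(name, args, "denied")
                raise PermissionError(f"{name}: human approval required")
        try:
            result = tool.handler(**args)
        except Exception:
            self._audit(name, args, "error")
            raise
        self._audit(name, args, "ok")
        return result
```

As a usage example, a runner could register a `tms_status` lookup with no gate and a `fetch_pod` retrieval with `needs_approval=True`. Every attempt, whether approved, denied, or failed, then leaves one line in `audit_log`.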
The evidence
Scale-ticket chase emails dropped 45%. Invoice cycle time fell by 1.3 days (~28%). Exception rate halved, from 12% to 6%. Interactive latency held at p95 1.2–2.0s with 99.95% availability and zero critical incidents post-GA. Offline eval accuracy improved from 92% to 97%; hallucination rate stayed under 1.5% after red-teaming.
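For context on the latency figure, a p95 value means 95% of measured requests completed at or below it. Below is a minimal sketch of one common way to compute it, the nearest-rank method. The function names and the sample data are illustrative assumptions, and the project's own measurement method may differ.

```python
import math
from typing import Sequence


def percentile_nearest_rank(samples: Sequence[float], pct: float) -> float:
    """Smallest sample such that at least pct% of samples are <= it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based nearest rank
    return ordered[rank - 1]


def meets_latency_target(samples: Sequence[float], target_seconds: float) -> bool:
    """True when the p95 of the samples is at or below the target."""
    return percentile_nearest_rank(samples, 95) <= target_seconds


if __name__ == "__main__":
    # Illustrative latencies in seconds: 19 fast requests and one slow outlier.
    latencies = [0.8] * 19 + [4.0]
    print(percentile_nearest_rank(latencies, 95))  # 0.8
    print(meets_latency_target(latencies, 2.0))    # True
```

The example shows why p95 is reported instead of the maximum: one slow outlier in twenty requests does not move it.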
Read the full build notes