About
I'm an AI engineer focused on retrieval systems and LLM-powered applications. I build things, measure whether they work, and write about what I learn.
What I work on
- RAG systems — chunking strategies, embedding models, retrieval methods, reranking
- Evaluation pipelines — synthetic QA generation, retrieval metrics (MRR, NDCG, Recall@K)
- LLM applications — structured outputs, tool use, prompt engineering
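To make the retrieval metrics above concrete, here is a minimal sketch of MRR and Recall@K computed from a ranked result list. The document IDs and data are hypothetical, purely for illustration:

```python
def mrr(ranked_ids, relevant_ids):
    """Reciprocal-rank contribution for one query: 1/rank of the
    first relevant result, or 0.0 if none was retrieved.
    (Averaging this over queries gives Mean Reciprocal Rank.)"""
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            return 1.0 / rank
    return 0.0

def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the relevant documents that appear in the top k."""
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant_ids)
    return hits / len(relevant_ids)

# Toy example: the system ranked d3 first, but the first relevant
# document (d7) came second, and d9 fell outside the top 3.
ranked = ["d3", "d7", "d1", "d9"]
relevant = {"d7", "d9"}
print(mrr(ranked, relevant))             # 0.5 (first hit at rank 2)
print(recall_at_k(ranked, relevant, 3))  # 0.5 (1 of 2 relevant in top 3)
```

NDCG follows the same per-query pattern but discounts each hit by its rank position, which is why it rewards ordering and not just presence.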
How I approach problems
Every project starts with the same questions:
- What does success look like? (Define the metric)
- Where are we now? (Measure the baseline)
- What changed? (Prove the delta)
This isn't revolutionary — it's just discipline. But it's the difference between "I think this works" and "here's the evidence."
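The three questions above boil down to a small loop, sketched here with a hypothetical scorer and a toy eval set — the config names and `evaluate` function are stand-ins, not a real framework:

```python
def evaluate(config, eval_set):
    """Stand-in for the metric: fraction of eval cases this config passes."""
    passed = sum(1 for case in eval_set if case["passes"][config])
    return passed / len(eval_set)

# Toy eval set recording which (hypothetical) config each case passes.
eval_set = [
    {"q": "q1", "passes": {"baseline": True,  "reranked": True}},
    {"q": "q2", "passes": {"baseline": False, "reranked": True}},
    {"q": "q3", "passes": {"baseline": False, "reranked": False}},
    {"q": "q4", "passes": {"baseline": True,  "reranked": True}},
]

baseline = evaluate("baseline", eval_set)   # where are we now?
candidate = evaluate("reranked", eval_set)  # what changed?
delta = candidate - baseline                # prove the delta
print(f"baseline={baseline:.2f} candidate={candidate:.2f} delta={delta:+.2f}")
```

The point isn't the code — it's that "here's the evidence" means a fixed eval set, a fixed metric, and a number that moved.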
Background
I spent 25 years in enterprise consulting — MarkLogic, Oracle, RightNow — leading technical delivery across APAC and North America. I translated business requirements into working systems for banks, telcos, and government agencies, and managed international teams doing the same.
Now I'm applying that lens to AI engineering. I build evaluation pipelines, RAG systems, and synthetic data workflows — the infrastructure that tells you whether an LLM actually works before it reaches production. My projects emphasize measurable outcomes over demos: retrieval accuracy metrics, structured evaluation frameworks, failure-mode analysis.
I'm not coming from a research background. I'm coming from delivery — where things need to work reliably, at scale, for real users. That turns out to be exactly what's missing in most LLM deployments.