Agentic AI · MLOps · Quant Systems · Memory-Layer R&D
I build AI systems the way production infrastructure should be built — as composable architectures with explicit state, durable memory, evaluation loops, and deployment boundaries that hold under load. The work here is defined by how systems are structured and where they break, not just what they output.
The hard part of agentic AI isn't the model. It's the seams: memory bleeding into prompts, evaluation bolted on after release, deployment treated as an afterthought. I treat those boundaries as first-class design problems — because that's where reliability, cost, and trust are actually won or lost.
Every layer earns its place by the failure mode it removes. The architecture is just that idea, made explicit:
| Layer | Role | Failure it removes |
|---|---|---|
| Intent → workflow | Decompose a goal into bounded, inspectable steps | Agents that wander with no stopping condition |
| Memory | Episodic, semantic, and graph state the agent reads and writes | Context bloat, prompt leakage, run-to-run amnesia |
| Tools & data | Typed, permissioned external calls | Unaudited side effects and silent data drift |
| Evaluation gate | Score and approve output before anything downstream trusts it | Regressions shipping unnoticed |
| Traces & monitoring | A durable record of every decision the system made | Failures you can't reproduce, explain, or bound |
| Iteration loop | Route eval and trace signals back into the workflow | A system no better than the day it shipped |
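To make the table concrete, here is a minimal sketch of three of those layers working together: bounded steps, an evaluation gate, and a trace. It is an illustration under stated assumptions, not code from any of the repositories. The names `Trace`, `run_workflow`, `score`, `threshold`, and `max_steps` are all hypothetical, and the trace is an in-memory list standing in for durable storage.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Trace:
    """Record of every decision (in memory here; a real system would persist it)."""
    events: list[dict] = field(default_factory=list)

    def record(self, step: str, **details) -> None:
        self.events.append({"step": step, **details})


def run_workflow(
    goal: str,
    steps: list[Callable[[str], str]],
    score: Callable[[str], float],
    threshold: float = 0.8,   # hypothetical approval threshold
    max_steps: int = 5,       # hypothetical hard bound on work per run
) -> tuple[str | None, Trace]:
    trace = Trace()
    output = goal
    # Bounded: at most max_steps steps run, so the workflow always terminates.
    for i, step in enumerate(steps[:max_steps]):
        output = step(output)
        trace.record(f"step_{i}", output=output)
    # Evaluation gate: nothing is returned downstream unless it clears the score.
    value = score(output)
    approved = value >= threshold
    trace.record("eval_gate", score=value, approved=approved)
    return (output if approved else None), trace


# Toy usage: two string transforms stand in for real agent steps.
toy_steps = [str.strip, str.lower]
result, trace = run_workflow(
    "  Summarise Q3 risk  ", toy_steps, score=lambda out: 1.0 if out else 0.0
)
blocked, blocked_trace = run_workflow("draft", toy_steps, score=lambda out: 0.0)
```

In the second call the gate rejects the output, so the caller receives `None`, while the trace still records why.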
- Agentic orchestration — explicit state, bounded tool use, human oversight, and decisions you can trace and roll back.
- Memory architecture — episodic, semantic, and graph memory with conflict resolution, designed as a contract rather than an ever-growing context window.
- Evaluation-first ML — metrics, gates, and regression checks that decide whether a system earns trust before it ships.
- Quant research systems — backtesting, risk constraints, and a hard separation between research and execution.
- Operational shape — APIs, containers, CI, and runbooks a small team can actually run without me in the room.
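One way to read "memory as a contract" is that every write goes through an explicit conflict rule instead of being appended to a growing context. The sketch below assumes a simple rule (higher confidence wins, ties broken by recency). That rule, and the names `Fact`, `SemanticMemory`, and `conflict_log`, are hypothetical illustrations rather than the design used in memory-layer-rnd.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    key: str
    value: str
    confidence: float  # assumed to lie in [0.0, 1.0]
    written_at: int    # logical timestamp, e.g. a monotonic counter


class SemanticMemory:
    """Keeps one fact per key; every write must win an explicit conflict rule."""

    def __init__(self) -> None:
        self._facts: dict[str, Fact] = {}
        self.conflict_log: list[Fact] = []  # facts that lost a conflict

    def write(self, fact: Fact) -> bool:
        current = self._facts.get(fact.key)
        incoming_rank = (fact.confidence, fact.written_at)
        if current is None or incoming_rank > (current.confidence, current.written_at):
            if current is not None:
                self.conflict_log.append(current)  # keep the loser for audit
            self._facts[fact.key] = fact
            return True
        self.conflict_log.append(fact)
        return False

    def read(self, key: str) -> str | None:
        fact = self._facts.get(key)
        return fact.value if fact else None


mem = SemanticMemory()
first = mem.write(Fact("user.timezone", "UTC", 0.6, 1))
second = mem.write(Fact("user.timezone", "Europe/Berlin", 0.9, 2))
third = mem.write(Fact("user.timezone", "US/Pacific", 0.4, 3))
```

Here the third write is newer but less confident, so it is logged as a lost conflict and `read("user.timezone")` still returns `"Europe/Berlin"`.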
| Repository | Architectural theme |
|---|---|
| agentic-quant-lab | End-to-end quant agent: planner → simulator → risk engine → research report |
| memory-layer-rnd | Multi-store memory with retrieval and conflict resolution |
| agentic-mlops-foundry | Production scaffolding: API boundary → runtime → eval gate → deployment path |
| llm-finetuning-eval-lab | Data → baseline → metrics → model card → CI gate |
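The "baseline → metrics → CI gate" path can be pictured as a small regression check that a CI job runs before merge. This is a hedged sketch, not the gate from llm-finetuning-eval-lab: the function name `regression_gate`, the `tolerance` value, the metric names, and the assumption that higher is better for every metric are all illustrative.

```python
from __future__ import annotations


def regression_gate(
    baseline: dict[str, float],
    candidate: dict[str, float],
    tolerance: float = 0.01,  # hypothetical allowed drop per metric
) -> list[str]:
    """Return the metrics that are missing or dropped by more than `tolerance`.

    Assumes higher is better for every metric.
    """
    failures = []
    for name, base in baseline.items():
        value = candidate.get(name)
        if value is None or value < base - tolerance:
            failures.append(name)
    return failures


# Made-up numbers, for illustration only.
baseline_metrics = {"accuracy": 0.90, "f1": 0.85}
candidate_metrics = {"accuracy": 0.91, "f1": 0.80}
failed = regression_gate(baseline_metrics, candidate_metrics)
```

A CI step could then exit with a non-zero status whenever `failed` is non-empty, which blocks the merge until the regression is explained or fixed.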
- Agents: LangGraph-style workflows, role decomposition, RAG, tool calling, eval harnesses
- ML/MLOps: Python, FastAPI, Docker, GitHub Actions, experiment tracking
- Quant: pandas, NumPy, backtesting, risk metrics, RL research
- Systems: API design, ADRs, test strategy, operational runbooks
- Architecture should surface failure modes early, while they're still cheap to fix.
- Memory is a contract, not a transcript dump.
- Autonomy is only safe with evaluation, traces, and rollback paths.
- Finance systems must separate research, paper trading, and live execution — without exception.
- Real engineering shows up in docs, tests, and explicit trade-offs, not demos.
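The research, paper, and live separation in the principles above can be enforced in code rather than by convention. The sketch below is one possible shape, with every name (`Mode`, `ModeViolation`, `OrderRouter`, `live_enabled`) hypothetical. It deliberately leaves live execution unimplemented.

```python
from __future__ import annotations

from enum import Enum


class Mode(Enum):
    RESEARCH = "research"
    PAPER = "paper"
    LIVE = "live"


class ModeViolation(RuntimeError):
    """Raised when code tries to cross a research/paper/live boundary."""


class OrderRouter:
    def __init__(self, mode: Mode, live_enabled: bool = False) -> None:
        # Live mode must be opted into explicitly; it is never a default.
        if mode is Mode.LIVE and not live_enabled:
            raise ModeViolation("live mode requires live_enabled=True")
        self.mode = mode
        self.paper_fills: list[tuple[str, int]] = []

    def submit(self, symbol: str, qty: int) -> str:
        if self.mode is Mode.RESEARCH:
            raise ModeViolation("research code may not submit orders")
        if self.mode is Mode.PAPER:
            # Simulated fill: recorded locally, nothing leaves the process.
            self.paper_fills.append((symbol, qty))
            return "paper-filled"
        raise NotImplementedError("live execution is out of scope for this sketch")


paper = OrderRouter(Mode.PAPER)
status = paper.submit("TEST", 10)  # "TEST" is a placeholder symbol

try:
    OrderRouter(Mode.RESEARCH).submit("TEST", 10)
    research_blocked = False
except ModeViolation:
    research_blocked = True
```

Making the boundary a constructor argument means a misconfigured run fails at startup instead of at the first order.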
