mastroke/README.md

Masoob Alam

Agentic AI · MLOps · Quant Systems · Memory-Layer R&D

I build AI systems the way production infrastructure should be built — as composable architectures with explicit state, durable memory, evaluation loops, and deployment boundaries that hold under load. The work here is defined by how systems are structured and where they break, not just what they output.

The hard part of agentic AI isn't the model. It's the seams: memory bleeding into prompts, evaluation bolted on after release, deployment treated as an afterthought. I treat those boundaries as first-class design problems — because that's where reliability, cost, and trust are actually won or lost.

How I Think About Systems

Every layer earns its place by the failure mode it removes. The architecture is just that idea, made explicit:

| Layer | Role | Failure it removes |
| --- | --- | --- |
| Intent → workflow | Decompose a goal into bounded, inspectable steps | Agents that wander with no stopping condition |
| Memory | Episodic, semantic, and graph state the agent reads and writes | Context bloat, prompt leakage, run-to-run amnesia |
| Tools & data | Typed, permissioned external calls | Unaudited side effects and silent data drift |
| Evaluation gate | Score and approve output before anything downstream trusts it | Regressions shipping unnoticed |
| Traces & monitoring | A durable record of every decision the system made | Failures you can't reproduce, explain, or bound |
| Iteration loop | Route eval and trace signals back into the workflow | A system no better than the day it shipped |
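
To make the layering concrete, here is a minimal sketch of three of those layers working together: a workflow with a hard step limit, an evaluation gate, and a trace. It is an illustration under stated assumptions, not code from any of the repositories. The names `Trace`, `run_bounded`, `threshold`, and `max_steps` are hypothetical, and the trace is an in-memory list standing in for a durable store.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class Trace:
    """Record of each decision. An in-memory list stands in for durable storage."""
    events: List[str] = field(default_factory=list)

    def log(self, message: str) -> None:
        self.events.append(message)


def run_bounded(
    steps: List[Callable[[str], str]],
    score: Callable[[str], float],
    goal: str,
    threshold: float = 0.8,  # hypothetical approval threshold
    max_steps: int = 5,      # hard stopping condition
) -> Tuple[Optional[str], Trace]:
    """Run at most max_steps steps, then release the result only if it passes the gate."""
    trace = Trace()
    state = goal
    for index, step in enumerate(steps[:max_steps]):
        state = step(state)
        trace.log(f"step {index}: {state!r}")
    result_score = score(state)
    approved = result_score >= threshold
    trace.log(f"gate: score={result_score:.2f} approved={approved}")
    # Downstream code only ever sees approved output.
    return (state if approved else None), trace


# Usage: two toy steps and a toy scorer.
steps = [str.strip, str.lower]
output, trace = run_bounded(
    steps,
    score=lambda s: 1.0 if s == "rebalance portfolio" else 0.0,
    goal="  Rebalance Portfolio ",
)
rejected, rejected_trace = run_bounded(steps, score=lambda s: 0.0, goal="anything")
```

In this sketch a rejected run returns `None` but still returns its trace, so a failed gate remains explainable after the fact.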

Where I Focus

  • Agentic orchestration — explicit state, bounded tool use, human oversight, and decisions you can trace and roll back.
  • Memory architecture — episodic, semantic, and graph memory with conflict resolution, designed as a contract rather than an ever-growing context window.
  • Evaluation-first ML — metrics, gates, and regression checks that decide whether a system earns trust before it ships.
  • Quant research systems — backtesting, risk constraints, and a hard separation between research and execution.
  • Operational shape — APIs, containers, CI, and runbooks a small team can actually run without me in the room.
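
One way to read "memory as a contract" is a store that accepts writes only through a typed interface with an explicit conflict policy. The sketch below assumes a hypothetical policy (higher confidence wins, ties go to the newer fact) and hypothetical names `Fact` and `SemanticMemory`. It covers only a semantic store, not episodic or graph memory, and is not taken from memory-layer-rnd.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Fact:
    subject: str
    attribute: str
    value: str
    confidence: float  # 0.0 to 1.0, assigned by the writer
    timestamp: int     # logical clock; larger means newer


class SemanticMemory:
    """Keeps exactly one fact per (subject, attribute) key."""

    def __init__(self) -> None:
        self._facts: Dict[Tuple[str, str], Fact] = {}

    def write(self, fact: Fact) -> bool:
        """Store the fact if it wins against the current one. Returns True if stored."""
        key = (fact.subject, fact.attribute)
        current = self._facts.get(key)
        if current is not None:
            # Assumed policy: higher confidence wins; on a tie, the newer fact wins.
            incoming_rank = (fact.confidence, fact.timestamp)
            current_rank = (current.confidence, current.timestamp)
            if incoming_rank <= current_rank:
                return False
        self._facts[key] = fact
        return True

    def read(self, subject: str, attribute: str) -> Optional[Fact]:
        return self._facts.get((subject, attribute))


# Usage: three conflicting writes to the same key.
memory = SemanticMemory()
first = memory.write(Fact("user", "risk_limit", "2%", confidence=0.9, timestamp=1))
second = memory.write(Fact("user", "risk_limit", "5%", confidence=0.4, timestamp=2))
third = memory.write(Fact("user", "risk_limit", "3%", confidence=0.9, timestamp=3))
stored = memory.read("user", "risk_limit")
```

Because each key holds a single resolved fact, the store stays bounded by the number of distinct keys rather than growing with every interaction.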

Repositories

| Repository | Architectural theme |
| --- | --- |
| agentic-quant-lab | End-to-end quant agent: planner → simulator → risk engine → research report |
| memory-layer-rnd | Multi-store memory with retrieval and conflict resolution |
| agentic-mlops-foundry | Production scaffolding: API boundary → runtime → eval gate → deployment path |
| llm-finetuning-eval-lab | Data → baseline → metrics → model card → CI gate |
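
As an illustration of the "metrics → CI gate" step, a regression check can compare candidate metrics against a stored baseline and fail the build on any drop. The sketch below is an assumption about how such a gate could look, not the code in llm-finetuning-eval-lab. The function name `regression_failures`, the metric names, the values, and the `tolerance` default are all hypothetical, and every metric is assumed to be higher-is-better.

```python
from typing import Dict, List


def regression_failures(
    baseline: Dict[str, float],
    candidate: Dict[str, float],
    tolerance: float = 0.01,  # hypothetical allowed drop per metric
) -> List[str]:
    """Return the metrics where the candidate is missing or worse than baseline beyond tolerance."""
    failures = []
    for name, base_value in baseline.items():
        candidate_value = candidate.get(name)
        if candidate_value is None or candidate_value < base_value - tolerance:
            failures.append(name)
    return failures


# Usage with made-up numbers: accuracy holds, macro_f1 regresses.
baseline = {"accuracy": 0.90, "macro_f1": 0.85}
candidate = {"accuracy": 0.905, "macro_f1": 0.80}
failures = regression_failures(baseline, candidate)
```

A CI job could exit with a non-zero status whenever `failures` is non-empty, which blocks the merge until the regression is explained or fixed.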

Stack

  • Agents: LangGraph-style workflows, role decomposition, RAG, tool calling, eval harnesses
  • ML/MLOps: Python, FastAPI, Docker, GitHub Actions, experiment tracking
  • Quant: pandas, NumPy, backtesting, risk metrics, RL research
  • Systems: API design, ADRs, test strategy, operational runbooks

Operating Principles

  • Architecture should surface failure modes early, while they're still cheap to fix.
  • Memory is a contract, not a transcript dump.
  • Autonomy is only safe with evaluation, traces, and rollback paths.
  • Finance systems must separate research, paper trading, and live execution — without exception.
  • Real engineering shows up in docs, tests, and explicit trade-offs, not demos.
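
The separation between research, paper trading, and live execution can be enforced in code rather than by convention. The following sketch assumes a hypothetical `Mode` enum and a hypothetical `submit_order` function that only returns strings; no broker API is involved, and it is not taken from agentic-quant-lab.

```python
from enum import Enum


class Mode(Enum):
    RESEARCH = "research"
    PAPER = "paper"
    LIVE = "live"


class ExecutionBoundaryError(RuntimeError):
    """Raised when code tries to cross from research into execution."""


def submit_order(mode: Mode, symbol: str, quantity: int) -> str:
    """Route an order according to the mode. Research mode can never submit."""
    if mode is Mode.RESEARCH:
        raise ExecutionBoundaryError("research code may not submit orders")
    if mode is Mode.PAPER:
        # Simulated fill only; nothing leaves the process.
        return f"paper fill: {quantity} {symbol}"
    # Placeholder for a real broker call, which this sketch does not make.
    return f"live order: {quantity} {symbol}"


# Usage: paper mode succeeds, research mode is refused.
paper_result = submit_order(Mode.PAPER, "SPY", 10)
try:
    submit_order(Mode.RESEARCH, "SPY", 10)
    research_blocked = False
except ExecutionBoundaryError:
    research_blocked = True
```

Making the mode an explicit argument means a backtest cannot reach the execution path by accident; it has to be handed a different `Mode` on purpose.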

Pinned

  1. agentic-mlops-foundry (Public)

    Production template for agentic AI services with eval gates, Docker, FastAPI and CI.

    Python

  2. agentic-quant-lab (Public)

    Agentic quantitative research lab with risk guardrails, reproducible backtests and paper-trading boundaries.

    Python

  3. llm-finetuning-eval-lab (Public)

    Reproducible fine-tuning and evaluation workflow for LLM classification tasks.

    Python

  4. memory-layer-rnd (Public)

    Research harness for persistent memory layers in AI agents: episodic, semantic and graph memory.

    Python

  5. graphiti (Public)

    Forked from getzep/graphiti

    Build Real-Time Knowledge Graphs for AI Agents

    Python

  6. RLtrading (Public archive)

    Agentic quantitative research lab with risk guardrails, reproducible backtests and paper-trading boundaries.

    Python