Amaresh Hebbar amareshhebbar

Amaresh Hebbar

AI Engineer · Agentic LLM Systems · Multi-Agent Infrastructure · Medical AI
Author of TrueNorth — I design, fine-tune, and ship production-grade AI systems end to end.

About

I build agentic AI systems — multi-agent pipelines, LLM infrastructure, on-device inference, and domain fine-tuning — and take them all the way to production.

Open-source author — shipped TrueNorth to PyPI & NPM: an LLM infrastructure engine with 1,258 passing tests, a 13-stage safety pipeline, and 8-provider routing (~90% cost reduction).
Fine-tuning at scale — published a 16-model medical AI suite (Qwen2.5) on Hugging Face: ICD-10/CPT/DRG coding, SNOMED mapping, clinical NLP, PM-JAY classification, and Hindi-medical. Each model trained with a QLoRA → DoRA → ORPO → merge pipeline on a real, paired SFT dataset (also published).
Led a 10-person team across frontend, backend, and mobile — delivering two concurrent AI product lines.
Hackathons — SANS FIND EVIL! (DFIR Automation) · Google Cloud Rapid Agent (GitLab Partner) · INDIA RUNS (Redrob AI × Hack2Skill, Data & AI).
Research-grade rigor — published benchmarks (100% precision on SANS DFIR triage), open SFT datasets, and W&B-tracked training runs.
Based in Bengaluru, India · Open to remote-first AI engineering roles (IST, comfortable with US/EU overlap).

Featured Projects

Project	What it does	Stack	Highlights
TrueNorth	Developer-first LLM infrastructure engine — declare the outcome in YAML, it owns the full multi-turn conversation lifecycle	Python · TS · Go · RN	1,258 tests · 4 SDKs · hallucination firewall (94%) · 8-provider routing · PyPI + NPM
ShiftLeft	Autonomous 5-agent bug-fixing pipeline: reads repo → triages → generates fix → opens MR	Python · LangGraph · Gemini · GitLab MCP	End-to-end in ~60s, zero human steps · Google Cloud Rapid Agent Hackathon
LogPoseSIFT	Autonomous DFIR orchestrator — MCP server wraps 200+ SANS SIFT tools as typed Go endpoints	Go · Claude · Gemini · MCP · Volatility 3	100% precision · 92.8% recall · 0 hallucinations · SANS FIND EVIL! Hackathon
HireSignal	Ranks 100K candidates against a Senior AI Engineer JD in ~35s on CPU — multi-signal scoring + honeypot detection + semantic embeddings	Python · sentence-transformers · NumPy	No GPU, no API, no network during ranking · 85 honeypots caught · 10 tests · INDIA RUNS Hackathon
PocketLLM	100% offline Android AI chat running LLMs on-device via MediaPipe C++ bridge	React Native · Expo · MediaPipe C++ · AWS S3	9 open-weight models (0.4–5.2 GB) · prompts never leave device
Medical AI Suite	16 fine-tuned Qwen2.5 specialist models for medical coding, billing & clinical NLP	QLoRA · DoRA · ORPO · Unsloth · HF	13 published models + 16 open SFT datasets · 2 live demos · Apache 2.0

Fine-tuned models live on Hugging Face · Training runs tracked on Weights & Biases

Hugging Face — Medical AI Fine-tuned Model Suite

A suite of Qwen2.5 specialist models, one per clinical task. Each model is trained through a consistent QLoRA → DoRA → ORPO → merge pipeline (via Unsloth + TRL) on a dedicated, published SFT dataset — no synthetic training data. Released under Apache 2.0; training tracked on W&B.

📦 Collection: Medical AI Fine-tuned Model Suite · 📊 Datasets: AxisMapper Medical AI Suite

Model	Size	Task	Dataset (rows)	Method	GPU
icd10-coder-qwen25-7b	7B	Clinical text → ICD-10-CM code + justification	icd10-coder-sft (74.7k)	QLoRA → DoRA → ORPO → merge	A40 48GB
icd10-coder-qwen25-7b-merged	8B	Merged full-weights build of the ICD-10 coder (no adapter load)	icd10-coder-sft (74.7k)	QLoRA → DoRA → ORPO → merge	A40 48GB
snomed-mapper-qwen25-7b	7B	Clinical concept → SNOMED CT mapping	snomed-mapper-sft (74.7k)	QLoRA → DoRA → ORPO → merge	A40 48GB
clinical-summarizer-qwen25-7b	7B	Clinical-note summarization	clinical-summarizer-sft (30k)	QLoRA → DoRA → ORPO → merge	A40 48GB
medical-billing-qwen25-3b	3B	Medical billing code generation	medical-billing-sft (17k)	QLoRA → DoRA → ORPO → merge	A40 48GB
cpt-coder-qwen25-3b	3B	Procedure text → CPT code	cpt-coder-sft (17k)	QLoRA → DoRA → ORPO → merge	A40 48GB
radiology-coder-qwen25-3b	3B	Radiology report → diagnostic code	radiology-coder-sft (25.1k)	QLoRA → DoRA → ORPO → merge	A40 48GB
pmjay-classifier-qwen25-3b	3B	India PM-JAY scheme package classification	pmjay-classifier-sft (11.1k)	QLoRA → DoRA → ORPO → merge	A40 48GB
discharge-qa-qwen25-3b	3B	QA over discharge summaries	discharge-qa-sft (30k)	QLoRA → DoRA → ORPO → merge	A40 48GB
medical-ner-qwen25-3b	3B	Clinical named-entity recognition	medical-ner-sft (16.7k)	QLoRA → DoRA → ORPO → merge	A40 48GB
hindi-medical-qwen25-3b	3B	Hindi-language medical assistant	hindi-medical-sft (19.7k)	QLoRA → DoRA → ORPO → merge	A40 48GB
icd10-to-drg-qwen25-1b	1.5B	ICD-10 → DRG for reimbursement grouping	icd10-to-drg-sft (5.39k)	QLoRA → DoRA → ORPO → merge	A40 48GB
insurance-classifier-qwen25-1b	1.5B	CPT/HCPCS → Stark Law DHS classification	insurance-classifier-sft (1.6k)	QLoRA → DoRA → ORPO → merge	A40 48GB
ayurveda-icd-qwen25-1b	1.5B	Ayurveda term → ICD mapping	ayurveda-icd-sft (3k)	QLoRA → DoRA → ORPO → merge	A40 48GB
pharmacy-ner-qwen25-1b	1.5B	Pharmacy / drug entity recognition	pharmacy-ner-sft (3.5k)	QLoRA → DoRA → ORPO → merge	A40 48GB

Pipeline (all models): Qwen2.5-Instruct base → QLoRA SFT (4-bit NF4, rank 16, α 32) → DoRA → ORPO preference alignment → adapter merge. Optimizer paged_adamw_8bit · cosine 2e-4 · trained on RunPod (NVIDIA A40 48GB) via Unsloth + TRL. Per-model wall-time ranges from ~0.4 h (1.5B configs) to ~1.9 h (7B configs). Training tracked at wandb.ai/amareshhebbar-/axiomapper. Datasets built from authoritative real-world sources (e.g. CMS FY2026 ICD-10-CM, HCPCS Stark Law DHS list), not LLM-generated.
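The adapter hyperparameters in the pipeline note (rank 16, α 32, cosine 2e-4) can be made concrete with a little arithmetic. The sketch below is illustrative only and is not the training code: it assumes the standard LoRA scaling of α / rank, reads "cosine 2e-4" as a cosine decay from a peak learning rate of 2e-4 with no warmup, and uses a hypothetical 4096 × 4096 projection rather than an actual Qwen2.5 layer shape. The names `AdapterConfig`, `lora_param_count`, and `cosine_lr` are made up for this example.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class AdapterConfig:
    """Hyperparameters quoted in the pipeline note (container name is hypothetical)."""
    rank: int = 16                       # LoRA rank
    alpha: int = 32                      # LoRA alpha
    peak_lr: float = 2e-4                # assumed to be the peak of the cosine schedule
    optimizer: str = "paged_adamw_8bit"  # optimizer name as listed in the note

    @property
    def scaling(self) -> float:
        # Standard LoRA scales the low-rank update by alpha / rank.
        return self.alpha / self.rank


def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    # A rank-r adapter adds two matrices: (d_in x r) and (r x d_out).
    return rank * (d_in + d_out)


def cosine_lr(step: int, total_steps: int, peak_lr: float) -> float:
    # Cosine decay from peak_lr at step 0 down to 0 at total_steps (no warmup).
    progress = min(max(step / total_steps, 0.0), 1.0)
    return 0.5 * peak_lr * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    cfg = AdapterConfig()
    d = 4096  # hypothetical square projection, not a Qwen2.5 dimension
    added = lora_param_count(d, d, cfg.rank)
    print(f"scaling factor: {cfg.scaling}")
    print(f"adapter params: {added} ({added / (d * d):.2%} of the full matrix)")
    print(f"lr at halfway: {cosine_lr(500, 1000, cfg.peak_lr):.2e}")
```

Under these assumptions the adapter for one such matrix trains well under 1% of the parameters in the frozen weight, which is the main reason a rank-16 configuration fits on a single 48 GB GPU alongside a 4-bit base model.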

Live demos (Spaces): icd10-coder-demo · hiresignal

Hackathons

Submission	Hackathon	Track	What it does
ShiftLeft	Google Cloud Rapid Agent	GitLab Partner	Label a GitLab issue → autonomous 5-agent pipeline reads the repo, triages the bug, writes the fix, and opens an MR in under 60 seconds
Poneglyphs — ShiftLeft	Google Cloud Rapid Agent	GitLab Partner	Label a GitLab issue `shiftleft` → 5-agent pipeline reads GitLab Orbit, triages, writes fix, opens MR
LogPoseSIFT	SANS FIND EVIL!	DFIR Automation	Autonomous DFIR orchestrator — deploys an AI crew via strict MCP endpoints, runs SIFT diagnostics, triages and self-corrects in seconds
AllBlue	SANS FIND EVIL!	DFIR Automation	Splunk alerts trigger autonomous AI forensic triage — IOC findings pushed back as structured events. 100% precision, 0 hallucinations
HireSignal	INDIA RUNS · Redrob AI × Hack2Skill	Data & AI Challenge	Ranks 100K candidates against a Senior AI Engineer JD in ~35s on CPU — multi-signal scoring, 85 honeypots detected, per-candidate reasoning. Live sandbox
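The HireSignal row describes multi-signal scoring over semantic embeddings. A minimal sketch of that general idea follows, combining cosine similarity with one extra signal. It is not HireSignal's implementation: the skill-overlap signal, the 0.7 / 0.3 weights, and every function name are assumptions chosen for illustration, and plain Python lists stand in for the sentence-transformers embeddings and NumPy arrays listed in the project's stack. Honeypot detection is left out because the source does not describe how it works.

```python
import math
from typing import Dict, List, Sequence, Set, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    # Cosine similarity between two embedding vectors; 0.0 if either is all zeros.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def skill_overlap(candidate: Set[str], required: Set[str]) -> float:
    # Fraction of required skills the candidate lists (hypothetical second signal).
    return len(candidate & required) / len(required) if required else 0.0


def rank_candidates(
    jd_vec: Sequence[float],
    jd_skills: Set[str],
    candidates: Dict[str, Tuple[Sequence[float], Set[str]]],
    w_semantic: float = 0.7,  # hypothetical weight
    w_skill: float = 0.3,     # hypothetical weight
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    # Score each candidate as a weighted sum of signals, then sort best-first.
    scored = [
        (name, w_semantic * cosine(jd_vec, vec) + w_skill * skill_overlap(skills, jd_skills))
        for name, (vec, skills) in candidates.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    # Toy 2-dimensional "embeddings"; real ones would have hundreds of dimensions.
    pool = {
        "a": ([1.0, 0.0], {"python", "llm"}),
        "b": ([0.0, 1.0], {"python"}),
        "c": ([1.0, 1.0], set()),
    }
    for name, score in rank_candidates([1.0, 0.0], {"python", "llm"}, pool):
        print(name, round(score, 3))
```

Because the job-description vector is fixed, scoring is a single pass over the candidate pool with no model calls, which fits the row's claim of ranking on CPU with no network access.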

Highlights

Published TrueNorth to PyPI and NPM (Apache 2.0)
Released a 16-model medical AI suite + 16 open SFT datasets on Hugging Face
1000+ problems solved on LeetCode
B.E. Computer Science & Engineering, Dayananda Sagar College of Engineering (2021–2025)

Open to remote-first AI engineering roles — LLM infrastructure, agentic systems, fine-tuning, or AI product engineering.
Let's build something intelligent.
