Staff GenAI Systems Engineer by day, M.Sc. AI student by night. I build production AI systems that scale and research the stuff that'll matter in two years.
I work at the intersection of AI research and real-world systems engineering. That means taking bleeding-edge models and making them actually work in production—fast, reliable, and without burning through compute budgets.
Day job: Architecting GenAI systems that handle real user traffic. Multi-agent orchestration, RAG pipelines that don't hallucinate, and LLM inference that doesn't cost a fortune.
Research: Currently obsessed with making LLMs smarter through better architectures. I created and actively develop Helios Engine, a Rust framework for LLM agents that prioritizes speed and reliability. Also exploring graph-based RAG, advanced reranking, and RL for domain-specific models.
- Languages: Rust (when performance matters), Python (for everything else)
- AI/ML: Advanced RAG systems, multi-agent orchestration, model optimization, distributed training, RLHF pipelines
- Production: End-to-end system design, efficient inference, model serving at scale
- Graph-based retrieval systems with dynamic reranking (toy sketch after this list)
- Chain-of-thought reasoning architectures
- Ultra-low-latency inference in Rust
- Model distillation for specialized domains
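For flavor, here's a minimal, self-contained Rust sketch of the dynamic-reranking idea above: blend a first-pass retrieval score with a second-pass relevance score and re-sort. Everything in it (`Candidate`, `rerank_score`, the keyword-overlap stub) is a hypothetical illustration, not Helios Engine code; a real pipeline would swap in a cross-encoder or LLM judge for the second pass.

```rust
// Toy sketch only: blends a first-pass retrieval score with a second-pass
// relevance score, then reorders candidates. Names and logic here are
// hypothetical illustrations, not the Helios Engine API.

#[derive(Debug)]
struct Candidate {
    doc_id: String,
    text: String,
    retrieval_score: f32, // e.g. cosine similarity from the vector index
}

/// Placeholder second-pass scorer: fraction of query terms found in the doc.
/// A real reranker would be a cross-encoder or LLM-based relevance model.
fn rerank_score(query: &str, text: &str) -> f32 {
    let text_lower = text.to_lowercase();
    let terms: Vec<String> = query
        .to_lowercase()
        .split_whitespace()
        .map(String::from)
        .collect();
    if terms.is_empty() {
        return 0.0;
    }
    let hits = terms.iter().filter(|t| text_lower.contains(t.as_str())).count();
    hits as f32 / terms.len() as f32
}

/// Blend both signals and return candidates sorted best-first.
fn rerank(query: &str, mut candidates: Vec<Candidate>, alpha: f32) -> Vec<Candidate> {
    candidates.sort_by(|a, b| {
        let blended = |c: &Candidate| {
            alpha * c.retrieval_score + (1.0 - alpha) * rerank_score(query, &c.text)
        };
        blended(b)
            .partial_cmp(&blended(a))
            .unwrap_or(std::cmp::Ordering::Equal)
    });
    candidates
}

fn main() {
    let candidates = vec![
        Candidate {
            doc_id: "doc-a".into(),
            text: "GraphRAG links entities across documents".into(),
            retrieval_score: 0.71,
        },
        Candidate {
            doc_id: "doc-b".into(),
            text: "Classic dense retrieval with a vector index".into(),
            retrieval_score: 0.80,
        },
    ];
    // doc-a overtakes doc-b once the second-pass signal is blended in.
    for c in rerank("how does graphrag link entities", candidates, 0.5) {
        println!("{c:?}");
    }
}
```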
Good AI systems should be:
- Fast — nobody wants to wait 30 seconds for a response
- Accessible — open source beats paywalls
- Practical — research is cool, but only if it ships
Everything I build is open source. Knowledge hoarding is how we end up with mediocre AI.
- 🤗 HuggingFace — models and experiments
- 💼 LinkedIn — the professional version
- 📫 Open to collaborations on open source AI infrastructure
Building the future of AI systems — one commit at a time.




