About
Hi, I'm Aaron. I design AI systems end to end, from data pipelines and feature platforms through model training, evaluation, and inference, with a focus on production-hardened, performance-critical infrastructure.
Recent work has been on LLM systems and training infrastructure: RLHF-style training and reasoning-model experiments, GPU/CUDA optimization for training and inference, agentic workflows and multi-step reasoning, and distributed compute with Ray.
Alongside that, I continue to architect large-scale real-time and batch systems where latency, throughput, and cost must be balanced under load, in production, with people depending on the answer.
What I'm interested in
- The intersection of LLM systems and distributed infrastructure
- Training efficiency and large-scale compute optimization
- Low-latency systems, with a longer-term interest in financial and trading infrastructure
I work best in principal-level roles where I can own systems end-to-end, drive the architecture, and build platforms that make real-world AI work at scale.
Reach me
- LinkedIn — /in/aaron-brooks-atx
- GitHub — @rabrooks
- Hugging Face — @doublea322
Built with Astro; hosted on AWS (S3 + CloudFront). Source on GitHub.