About
Hi, I'm Aaron. I design AI systems end to end, from data pipelines and feature platforms through model training, evaluation, and inference, with a focus on production-hardened, performance-critical infrastructure.
Recent work has been on LLM systems and training infrastructure: RLHF-style training and reasoning-model experiments, GPU/CUDA optimization for training and inference, agentic workflows and multi-step reasoning, and distributed compute with Ray.
Alongside that, I continue to architect large-scale real-time and batch systems where latency, throughput, and cost must be balanced under load, in production, with people depending on the answer.
What I'm interested in
- The intersection of LLM systems and distributed infrastructure
- Training efficiency and large-scale compute optimization
- Low-latency systems, with a longer-term interest in financial and trading infrastructure
I work best in principal-level roles where I can own systems end-to-end, drive the architecture, and build platforms that make real-world AI work at scale.
Reach me
- LinkedIn — /in/aaron-brooks-atx
- GitHub — @rabrooks
- Hugging Face — @doublea322
Built with Astro; hosted on AWS (S3 + CloudFront). Source on GitHub.