Extensive hands-on experience training LLMs (pre-training, fine-tuning, or post-training) in a research or production setting.
Deep expertise in modern deep learning frameworks such as PyTorch and in specialized LLM training stacks (e.g., Megatron, NeMo, or verl).
Strong theoretical and practical understanding of LLM fundamentals: architectures, tokenization, data pipelines, batching, mixed precision, distributed training, and debugging unstable runs.
The ability to own projects end to end, starting from a high-level problem or product pain point and carrying the work through design, experimentation, implementation, and iteration.
A product-aware mindset: you care about how developers actually use agents and can translate product needs and failure modes into modeling and evaluation work.
At least 3 years of experience writing clean, maintainable Python in modern ML codebases.