Research Engineer
Role Description
This is a Research Engineer role focused on post-training and reasoning. It can be full-time or part-time, and we’re open to candidates who are currently pursuing a PhD or Master’s or working in industry. Your core responsibilities will include conducting research in post-training optimization and reasoning techniques, developing new algorithms, and collaborating with cross-functional teams to apply findings to advanced AI systems. The role also involves analyzing complex datasets, improving AI models, and contributing to R&D projects aimed at optimizing AI performance and interpretability.
What we’re looking for:
- Strong ML engineering fundamentals. Comfortable training and fine-tuning LLMs end-to-end (PyTorch, Hugging Face, vLLM, DeepSpeed/FSDP, or similar)
- Real exposure to post-training methods (SFT, preference optimization, RL fine-tuning), not just having read the papers
- A track record of shipping research or research-grade engineering: publications, strong open-source contributions, or production ML systems at a lab/frontier company
- Comfortable working with a part-time research lead. You can take a direction and run, surface tradeoffs early, and don’t need someone in the room every day
- Excited by applied work in a domain with real-world consequences (you don’t have to come from healthcare; you do have to care about it)
The work spans the full post-training stack as applied to expert reasoning:
- Designing and running SFT, DPO, and RL (GRPO/PPO and successors) experiments on reasoning traces from our expert network
- Building benchmarks and evals that meaningfully measure clinical and adjudication reasoning — not just final-answer accuracy, but the reasoning path
- Turning raw expert outputs into high-quality training datasets: schema design, quality controls, scaling pipelines
- Working directly with customers (frontier labs, healthcare AI companies) on bespoke data and eval engagements
- Publishing where it makes sense
Why Cobalt?
- High-Impact Work: Your work will be used by leading institutional investors and the world’s most influential organisations
- Innovation at the Frontier: Work at the intersection of artificial intelligence, institutional finance, and cutting-edge technology
- Remote-First Culture: Work from anywhere with flexible hours and a globally distributed team
- Growth Opportunity: Shape team culture and help build a world-class team as we scale
- Competitive Compensation: Industry-leading salary, equity, and benefits package
Location and work arrangement
- Remote or in-person. While we’re headquartered in San Francisco, we welcome talented candidates from around the world. Team collaboration occurs across time zones with overlapping core hours