Research Engineer

  • IT
  • FlexTime
  • FullTime
  • PartTime

Role Description

This is a Research Engineer role focused on post-training and reasoning. The position can be full-time or part-time, and we’re open to candidates who are currently pursuing a PhD or Master’s as well as those working in industry. Your core responsibilities will include conducting research in post-training optimization and reasoning techniques, developing new algorithms, and collaborating with cross-functional teams to apply findings to advanced AI systems. The role also involves analyzing complex datasets, improving AI models, and contributing to R&D projects aimed at improving AI performance and interpretability.

What we’re looking for:

  • Strong ML engineering fundamentals. Comfortable training and fine-tuning LLMs end-to-end (PyTorch, Hugging Face, vLLM, DeepSpeed/FSDP, or similar)
  • Real exposure to post-training methods (SFT, preference optimization, RL fine-tuning), not just having read the papers
  • A track record of shipping research or research-grade engineering: publications, strong open-source contributions, or production ML systems at a lab/frontier company
  • Comfortable working with a part-time research lead. You can take a direction and run, surface tradeoffs early, and don’t need someone in the room every day
  • Excited by applied work in a domain with real-world consequences (you don’t have to come from healthcare; you do have to care about it)

The work spans the full post-training stack as applied to expert reasoning:

  • Designing and running SFT, DPO, and RL (GRPO/PPO and successors) experiments on reasoning traces from our expert network
  • Building benchmarks and evals that meaningfully measure clinical and adjudication reasoning — not just final-answer accuracy, but the reasoning path
  • Turning raw expert outputs into high-quality training datasets: schema design, quality controls, scaling pipelines
  • Working directly with customers (frontier labs, healthcare AI companies) on bespoke data and eval engagements
  • Publishing where it makes sense

Why Cobalt?

  • High-Impact Work: Your work will be used by leading institutional investors and the world’s most influential organisations
  • Innovation at the Frontier: Work at the intersection of artificial intelligence, institutional finance, and cutting-edge technology
  • Remote-First Culture: Work from anywhere with flexible hours and a globally distributed team
  • Growth Opportunity: Shape design culture and build a world-class design team as we scale
  • Competitive Compensation: Industry-leading salary, equity, and benefits package

Location and work arrangement

  • Remote or in-person. While we’re headquartered in San Francisco, we welcome talented candidates from around the world. The team collaborates across time zones with overlapping core hours