Data Scientist || 15+ Years of Experience
Role: Data Scientist
Location: San Jose, CA
Responsibilities
- Design and build end-to-end ML/data pipelines (ingestion, transformation, labeling, feature engineering, training, validation, deployment).
- Develop reproducible ML research workflows using Jupyter/Colab and version-controlled environments.
- Define and manage data labeling strategy (guidelines, quality checks, automation).
- Perform exploratory data analysis (EDA) and create datasets for supervised & unsupervised ML.
- Train, evaluate, and optimize ML models across environments (dev, staging, prod).
- Apply MLOps best practices (CI/CD for ML, pipeline automation, monitoring model drift, experiment tracking).
- Collaborate with ML engineers, software developers, and product managers to deliver production-ready ML solutions.
Requirements
- PhD in Computer Science, Data Science, Statistics, Applied Math, or related field.
- 3 years of applied research or industry experience in ML or Data Science.
- Proven expertise in end-to-end ML pipeline design and deployment.
- Strong Python skills (NumPy, Pandas, scikit-learn, TensorFlow/PyTorch).
- Proficiency in SQL and large-scale data processing.
- Experience with MLOps tools (MLflow, Kubeflow, Airflow, Vertex AI, SageMaker).
- Familiarity with cloud platforms (Google Cloud Platform preferred) and Docker/Kubernetes.
- Experience automating data labeling workflows (active learning, weak supervision).
Nice-to-have: Ex-Google or top-tier tech company background.
USA | Canada | UK | India