Data Engineer
About The Company
At Hatch, we are pioneering the future of AI-driven customer engagement by developing intelligent systems that interact directly with customers in real-world scenarios. Our innovative approach combines cutting-edge artificial intelligence with robust data infrastructure to deliver seamless, conversational experiences. Backed by prestigious investors such as Y Combinator, Bessemer, and NextView, Hatch is experiencing rapid growth, doubling revenue annually and establishing itself as a category leader in AI for customer service. Our team is committed to building scalable, reliable, and high-performance systems that transform how businesses connect with their clients. We foster a dynamic and collaborative environment where innovation, ownership, and speed are core values. Join us in shaping the future of customer engagement technology.
About The Role
We are seeking a highly skilled Data Engineer to join our expanding data team in New York City. This role requires a professional with strong full-stack and back-end engineering experience who is passionate about building and maintaining scalable data pipelines and platform services. As a Data Engineer at Hatch, you will be instrumental in powering our analytics, reporting, and AI initiatives by designing robust data systems that handle increasing data volumes and complex AI models. Your work will involve developing production-grade data pipelines, APIs, and backend services, ensuring they are optimized for performance, reliability, and cost-efficiency. This position offers an exciting opportunity to work on high-impact projects in a fast-paced environment, with a focus on operational excellence and innovative architecture. Please note that this role is not centered on business intelligence or dashboard creation but on engineering scalable data systems for AI applications.
Qualifications
- 5+ years of experience in backend/full-stack and data engineering roles.
- 3+ years of experience building and maintaining production services in Python or Go.
- Proficiency in designing and implementing APIs, SDKs, and backend services following modern software engineering principles.
- Strong understanding of computer science fundamentals including data structures, algorithms, concurrency, networking, and distributed systems.
- Hands-on experience with distributed data technologies such as Kafka, Kinesis, Pub/Sub, Spark, Flink, Dask, Redis, and MongoDB.
- Experience with cloud platforms, particularly AWS and GCP, including monitoring and troubleshooting services (CloudWatch, Prometheus, Grafana).
- Solid SQL skills and practical knowledge of data modeling techniques for dimensional, event-driven, and domain-specific datasets.
- Experience with containerization, CI/CD pipelines, and deployment strategies using Kubernetes, Terraform, and related tools.
- Excellent communication skills, with the ability to clearly explain complex technical concepts and advocate for best practices.
- Ability to work effectively in a fast-paced startup environment, contributing to both application and data systems development.
Responsibilities
- Design, build, and maintain scalable batch and real-time data pipelines utilizing technologies such as Kinesis, Pub/Sub, Flink, Spark, Airflow, and dbt.
- Develop and maintain APIs, SDKs, and backend services that integrate seamlessly with data infrastructure and AI systems.
- Apply software engineering best practices including modular design, testing, CI/CD, observability, and code reviews across all data and service development activities.
- Model, partition, and optimize datasets in data lakes and warehouses such as BigQuery and Aurora PostgreSQL, focusing on performance, cost, and governance.
- Collaborate with backend and platform engineering teams to define data contracts, streaming interfaces, and shared service boundaries.
- Implement infrastructure-as-code and container orchestration solutions using Terraform, Docker, and Kubernetes/EKS for deployment and scaling.
- Establish and monitor service level objectives (SLOs) for data quality, latency, and availability, proactively resolving production issues.
- Participate in architectural discussions to ensure data systems follow scalable, event-driven, microservice, and domain-driven design patterns.
Benefits
- Competitive salary and equity packages.
- Flexible work arrangements including remote work (Eastern or Central Time Zone preferred) or hybrid model (3 days/week in NYC office).
- Comprehensive medical, dental, and vision insurance plans.
- 401(k) retirement plan with company matching.
- Flexible paid time off to support work-life balance.
- Opportunity to contribute to a high-growth, mission-driven company at the forefront of AI innovation.
Equal Opportunity
Hatch is an equal opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all employees. We do not discriminate based on race, ethnicity, gender, sexual orientation, age, disability, religion, or any other status protected by applicable law. All qualified applicants will receive consideration for employment without regard to these factors.