Data Engineer
Full-time Data Engineer
Location: Cleveland, OH; Pittsburgh, PA
Key Responsibilities
- Design, develop, and maintain ETL/ELT pipelines using PySpark, Python, and Hadoop ecosystem tools (HDFS, Hive, HBase, etc.).
- Manage and optimize large-scale data ingestion, transformation, and storage solutions.
- Collaborate with cross-functional teams to integrate data from multiple sources (databases, APIs, streaming platforms, etc.).
- Ensure data quality, consistency, and reliability across data pipelines.
- Implement and monitor job scheduling, logging, and alerting for data workflows.
- Tune performance of Spark jobs and Hadoop clusters for scalability and efficiency.
- Work with cloud platforms (AWS EMR, Azure HDInsight, or Google Cloud Platform Dataproc) for big data processing.
- Develop and maintain data models, schemas, and metadata documentation.
- Collaborate with DevOps teams to automate data deployment and version control.