Information Technology_USA – USA_Engineer
ALL CAPS, NO SPACES B/T UNDERSCORES PTN_USGBAMSREQID
Candidate BeelineID i.e. PTN_US_9999999_SKIPJOHNSON0413
MSP Owner: Thomas Hodges
Targeted Bill Rate – market rate/hr
REQUIREMENT_CITY – Owings Mills, MD
REQUIREMENT_ID – 10460302
Role Name – Lead PySpark Engineer
ROLE_DESCRIPTION –
- 10+ years of experience in big data and distributed computing.
- Very strong hands-on experience with PySpark, Apache Spark, and Python.
- Strong hands-on experience with SQL and NoSQL databases (DB2, PostgreSQL, Snowflake, etc.).
- Proficiency in data modeling and ETL workflows.
- Proficiency with workflow schedulers such as Airflow.
- Hands-on experience with AWS cloud-based data platforms.
- Experience in DevOps, CI/CD pipelines, and containerization (Docker, Kubernetes) is a plus.
- Strong problem-solving skills and the ability to lead a team.
- Lead the design, development, and deployment of PySpark-based big data solutions.
- Architect and optimize ETL pipelines for structured and unstructured data.
- Collaborate with the client, data engineers, data scientists, and business teams to understand requirements and provide scalable solutions.
- Optimize Spark performance through partitioning, caching, and tuning.
- Implement best practices in data engineering (CI/CD, version control, unit testing).
- Work with cloud platforms such as AWS.
- Ensure data security, governance, and compliance.
- Mentor junior developers and review code for best practices and efficiency.
SysMind