Lead Data Engineer - Scalable Data Pipelines - Contract to Hire

Remote Full-time
Lead Data Engineer (PySpark, Airflow, Azure) – Scalable Data Pipelines

We’re looking for an experienced Senior Data Engineer to design, build, and optimize large-scale data pipelines powering analytics and machine learning workloads. This role is ideal for someone who is hands-on, performance-oriented, and comfortable leading other engineers while owning end-to-end data workflows.

You’ll work on both batch and real-time processing, take ownership of Spark performance tuning, and help enforce best practices around data quality, governance, and reliability.

Responsibilities
• Design, develop, and optimize scalable data pipelines using Python, PySpark, Apache Spark, and Airflow
• Build and maintain batch and streaming data processing systems on Spark
• Design and manage Airflow DAGs to orchestrate complex, dependency-heavy workflows
• Implement data partitioning, caching, and Spark performance tuning to handle large datasets efficiently
• Ensure data quality, governance, security, and reliability across the data lifecycle
• Monitor, troubleshoot, and optimize data jobs, SLAs, and pipeline dependencies
• Manage cloud infrastructure (Azure) for data workloads, including cost optimization
• Implement CI/CD pipelines for data workflows using Git, Docker, and Infrastructure-as-Code tools
• Support analytics and ML use cases by working with structured and unstructured data
• Lead and mentor other data engineers, providing architectural guidance and code reviews
• Promote best practices in coding standards, documentation, and version control
• Collaborate effectively with distributed, remote teams in an Agile environment

✅ Requirements
• 8+ years of hands-on experience in Data Engineering
• Strong expertise with Apache Spark / PySpark, including internals such as:
  • RDDs, DataFrames, DAG execution, partitioning, shuffles, and caching
• Proven experience building and operating Airflow DAGs (scheduling, dependencies, retries, SLAs)
• Advanced Python and SQL skills with a focus on performance and maintainability
• Solid experience with Azure data and compute infrastructure
• Working knowledge of Docker, Kubernetes, Terraform, and CI/CD best practices
• Strong problem-solving skills and ability to optimize large-scale data processing systems
• Prior experience leading or mentoring engineers
• Comfortable working in Agile/Scrum environments
• Excellent communication skills and ability to collaborate with remote teams

⭐ Nice to Have
• Experience with streaming frameworks (Spark Structured Streaming, Kafka, Event Hubs)
• Familiarity with data governance, lineage, and observability tools
• Experience supporting ML or advanced analytics pipelines
• Background in cost-efficient Spark optimization at scale
