Senior Data Engineer
Overview

We are looking for an onshore + remote Senior Data Engineer to join our team. You will be responsible for designing, developing, and maintaining the data pipelines and structures that support our data science practice and customer-facing data products.

Responsibilities

- Design and Development: Develop, test, and maintain data pipelines using Python and orchestration tools such as Airflow or Kestra.
- Code Quality: Write clean, maintainable, and efficient code, following best practices for coding standards, security, testing, and deployment.
- Database Management: Design and optimize database tables, write efficient SQL queries, and manage database migration scripts.
- Documentation: Develop and maintain documentation covering data sources, their composition requirements, and the transformations applied from ingestion through to final data structures, ensuring accurate tracking across the data lifecycle.
- Collaboration: Work with the product team, data scientists, and other stakeholders to define and implement data solutions using new and existing data sources and technologies.
- Continuous Improvement: Participate in code reviews, contribute to team learning, and stay current with industry trends and technologies.

Requirements

Technical Qualifications

- Python
  - Demonstrable experience with data-focused libraries (Pandas, DuckDB, etc.)
  - Experience working with DAGs or equivalent structures
  - Experience with process automation in Python
  - Experience integrating with third-party APIs
- Proven experience in a data engineering or similar role (ideally 5+ years)
- Understanding of data lineage and lineage strategies
- ETL pipeline design and development
- Data modeling experience
- Experience building scalable data lakes/warehouses
- Experience analyzing and organizing large data sets
- Experience in event-based data processing
- Strong data documentation experience
- Strong SQL (Postgres RDBMS) experience
  - Table design and optimization
  - Advanced query building and optimization
  - Advanced data aggregation strategies
- Experience with ETL/workflow automation and tools (Kestra, Airflow, or similar)
- Git SCM (GitLab)
- Experience in regulated industries (Healthcare, Banking, etc.)

Bonus

- AWS (S3, Step Functions, Batch, Athena, Glue)
- Experience in data analysis
- Experience working with data science/ML teams
- Experience with TypeScript