Job Description
About the Role :
We are seeking a highly experienced Data Engineer with 5+ years of hands-on experience to join our data platform and analytics engineering team. The ideal candidate will specialize in designing, building, and maintaining scalable, reliable, and high-performance data pipelines and analytical data platforms. This role requires close collaboration with data scientists, analysts, product teams, and business stakeholders to deliver trusted, well-modeled, production-ready data for analytics and AI use cases.
Key Responsibilities :
- Design, develop, and maintain end-to-end data pipelines for batch and streaming data.
- Build scalable, reliable, and efficient ETL/ELT workflows using modern data engineering tools.
- Develop and optimize data models for analytics, reporting, and machine learning use cases.
- Implement data ingestion from multiple sources including databases, APIs, files, and event streams.
- Work closely with data scientists and analysts to support analytical and ML workloads.
- Ensure data quality, consistency, validation, and monitoring across data pipelines.
- Optimize performance and cost for large-scale data processing systems.
- Collaborate with cloud and platform teams to deploy and manage data infrastructure.
- Implement data governance, security, access controls, and compliance best practices.
- Develop and maintain documentation for data pipelines, models, and architectural decisions.
- Monitor pipeline health, troubleshoot failures, and implement proactive alerting.
- Mentor junior data engineers and contribute to data engineering best practices and standards.
Required Skills & Qualifications :
- Bachelor's or Master's degree in Computer Science, Data Engineering, Information Systems, or a related field.
- 5+ years of professional experience in data engineering, analytics engineering, or data platform roles.
- Strong proficiency in Python for data processing and pipeline development.
- Advanced hands-on experience with SQL for transformations, analytics, and performance tuning.
- Strong experience with Databricks (Spark, Delta Lake, workflows, notebooks).
- Strong hands-on experience with Snowflake, including data modeling and performance optimization.
- Experience building transformation layers using dbt (models, tests, macros, documentation).
- Solid understanding of data warehousing concepts, dimensional modeling, and analytical data models.
- Experience working with batch and streaming data pipelines.
- Familiarity with orchestration tools such as Airflow or an equivalent scheduler.
- Experience working in cloud environments (AWS, GCP, or Azure).
- Strong problem-solving skills and ability to work with cross-functional stakeholders.
Good to Have :
- Experience with big data technologies such as Spark, Kafka, or Hadoop.
- Exposure to real-time data streaming and event-driven architectures.
- Knowledge of MLOps concepts and supporting ML workflows with data pipelines.
- Familiarity with MLflow, Feature Stores, or model data versioning.
- Hands-on experience with Python data science libraries such as Pandas, NumPy, Scikit-learn, and Statsmodels.
- Experience supporting data science and machine learning teams with curated datasets.
- Experience building analytics-ready datasets for dashboards and executive reporting.
- Knowledge of CI/CD practices for data pipelines.
- Experience with monitoring, logging, and observability for data platforms.
- Exposure to business analytics, KPI tracking, or executive dashboards.
- Domain experience in finance, retail, healthcare, manufacturing, or SaaS.
- Prior experience mentoring teams or leading data engineering initiatives.
Relevant Certifications (Preferred) :
- Databricks Certified Data Engineer
- Snowflake SnowPro certifications
- AWS Data Analytics / Data Engineer certifications
- Google Professional Data Engineer
- Azure Data Engineer Associate