
McCormick & Company

Senior/Lead Data Engineer

Posted 7 days ago

Job Description

Multiple openings for Senior Data Engineer/Lead Data Engineer

Location: DLF Cyber City, Gurgaon

Work model: Hybrid

Shift: 1:00 PM - 9:30 PM IST

Position Summary:

You will play a pivotal role in building and delivering data products, from simple to complex, and in supporting McCormick business units with their data and analytics needs.

Your responsibilities will include delivering and supporting data for existing analytics solutions and tooling, researching new features, and implementing automations. You will support business users, Data Scientists, and Data Analysts in converting business expectations into data products and data models that the business can use to deliver AI, analysis, reporting, and data-driven recommendations to stakeholders and executives.

This role will be accountable for building and maintaining scalable data pipelines from source systems. The Data Engineer will ensure the availability, reliability, and performance of data products by integrating raw data from various sources. Key responsibilities include data modeling, ETL (Extract, Transform, Load) development, and ensuring data quality and security.

Responsibilities:

Design and Execute

• Partner with data product managers to gather requirements and deliver data pipelines.

• Design ETL solutions that address data quality, data security, and data pipeline resiliency.

• Execute ETL solutions that meet data security, data quality, and performance requirements.

Data Extraction, Load and Transformation

• Design, build and implement ELT pipelines to efficiently ingest and transform data from a wide variety of data sources and deliver datasets that meet business requirements.

• Optimize large datasets and data workflows for performance, scalability, and reliability to support business needs.

• Develop and maintain scalable data pipelines using Azure Synapse, PySpark, APIs, and SQL, performing advanced data cleaning, transformation, and manipulation to ensure high-quality, reliable data flows.

• Implement CI/CD processes to streamline and automate data pipeline deployment.

• Apply data validation frameworks (Great Expectations, Fabric-native tools) to maintain accuracy.

• Use partitioning, indexing, and clustering strategies to enhance query performance.

Process Improvement, Performance, and Cost Optimization

• Collaborate with Data Science, AI, and Data product teams to optimize performance and cost effectiveness of their solutions.

• Identify and support the design of internal process improvements, including automating manual processes, optimizing data product delivery, and redesigning solutions for enhanced scalability.

• Implement solution adjustments to improve performance and cost-effectiveness of data products.

Desired key skills/background of the candidate:

  • Bachelor's degree in Mathematics, Statistics, Computer Science, Data Analytics/Science, or related field
  • Microsoft Certified: Azure Data Engineer (DP-203), Microsoft Certified: Fabric Data Engineer Associate, or a certification in related cloud technologies; Fabric IQ/Databricks certifications a plus
  • 3+ years of data engineering experience.
  • Demonstrated ability coding in one or more languages (PySpark preferred).
  • Experience with building data pipelines.
  • Experience with knowledge graphs a plus.
  • Demonstrated ability to manage multiple priorities simultaneously.
  • Demonstrated ownership of production-grade pipelines.
  • Experience supporting AI/ML workloads preferred.
  • Lakehouse design principles.
  • Data mesh concepts (where appropriate).
  • Medallion architecture maturity.
  • Semantic layer design.
  • Security architecture patterns.


Job ID: 148996405
