Valethi Technologies

Data Engineer

  • Posted a day ago

Job Description

Role Overview

We are looking for experienced Data Engineers with strong expertise in Python, PySpark, and Databricks to support enterprise-scale data ingestion and processing platforms. The ideal candidate should have hands-on experience building scalable data pipelines, orchestration frameworks, and cloud-based data solutions using modern Lakehouse architectures.

The role will primarily focus on:

  • Ingest Factory
  • Data Processing Factory
  • Data Pipeline Orchestration
  • Databricks-based ETL/ELT Solutions

Key Responsibilities

  • Design, develop, and optimize scalable data ingestion and processing pipelines.
  • Build and maintain ETL/ELT workflows using Python, PySpark, and Databricks.
  • Develop orchestration workflows using LakeFlow, Jobs, Tasks, and Declarative Pipelines.
  • Implement reusable frameworks, metadata-driven orchestration, and automation patterns.
  • Ensure data quality, monitoring, alerting, and lineage across data pipelines.
  • Collaborate with business and technical teams to understand data requirements and solution design.
  • Participate in Agile/SCRUM ceremonies, code reviews, and deployment activities.
  • Optimize Spark workloads, SQL queries, and compute performance within Databricks.
  • Support CI/CD and Infrastructure-as-Code practices using Git and Terraform.
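
Several of the responsibilities above come down to metadata-driven orchestration: pipeline behaviour is described in configuration rather than hard-coded. A minimal sketch of the pattern in plain Python (the task shape, loader names, and paths here are invented for illustration, not taken from the role; a real implementation would wrap Spark readers or Databricks connectors):

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class IngestTask:
    """One row of pipeline metadata: what to load and how."""
    name: str
    source: str
    loader: str                          # key into the loader registry below
    options: Dict[str, str] = field(default_factory=dict)

# Registry of reusable loader functions keyed by metadata value.
LOADERS: Dict[str, Callable[[IngestTask], str]] = {
    "csv": lambda t: f"loaded {t.source} as csv",
    "jdbc": lambda t: f"loaded {t.source} via jdbc",
}

def run_pipeline(tasks: List[IngestTask]) -> List[str]:
    """Execute each metadata-defined task with its registered loader."""
    results = []
    for task in tasks:
        loader = LOADERS[task.loader]    # dispatch driven by metadata, not code
        results.append(loader(task))
    return results

tasks = [
    IngestTask("orders", "s3://raw/orders", "csv"),
    IngestTask("customers", "db.customers", "jdbc"),
]
print(run_pipeline(tasks))
```

The point of the pattern is that adding a new feed means adding a metadata row, not new pipeline code.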

Required Technical Skills

Core Technologies

  • Python
  • PySpark
  • Databricks
  • SQL
  • Spark
  • Git & CI/CD
  • Terraform

Detailed Skill Requirements

Databricks Expertise

  • Strong hands-on experience with:
    • Databricks Notebooks
    • Jobs & Workload Optimization
    • Connectors & Data Acquisition
    • LakeFlow Orchestration
    • Declarative Pipelines
    • Delta Lake
  • Experience implementing:
    • Data lineage
    • Monitoring & alerting frameworks
    • Data quality checks
    • Data product concepts
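
The data quality checks listed above can be sketched framework-style in plain Python (the rule names and record shape are illustrative assumptions, not part of the role; in Databricks the same idea would sit behind Delta expectations or a pipeline quality layer):

```python
from typing import Any, Callable, Dict, List, Tuple

# Each rule maps a column name to a predicate; rows failing any rule
# are quarantined for inspection instead of silently dropped.
Rules = Dict[str, Callable[[Any], bool]]

def check_quality(rows: List[Dict[str, Any]],
                  rules: Rules) -> Tuple[List[Dict[str, Any]], List[Dict[str, Any]]]:
    """Split rows into (passed, failed) against per-column rules."""
    passed: List[Dict[str, Any]] = []
    failed: List[Dict[str, Any]] = []
    for row in rows:
        ok = all(rule(row.get(col)) for col, rule in rules.items())
        (passed if ok else failed).append(row)
    return passed, failed

rules: Rules = {
    "order_id": lambda v: v is not None,
    "amount": lambda v: isinstance(v, (int, float)) and v >= 0,
}
rows = [
    {"order_id": 1, "amount": 10.0},
    {"order_id": None, "amount": 5.0},   # missing key -> quarantined
    {"order_id": 2, "amount": -3},       # negative amount -> quarantined
]
good, bad = check_quality(rows, rules)
print(len(good), len(bad))  # 1 passing row, 2 quarantined
```
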

Python & PySpark

  • Strong understanding of distributed processing using Spark.
  • Good knowledge of Python coding standards and package management.
  • Ability to differentiate single-node vs distributed Spark execution patterns.
  • Experience building scalable PySpark transformations and frameworks.
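
The single-node vs distributed distinction above can be illustrated without a cluster: a partition-wise transform followed by a driver-side combine mimics how Spark applies work per partition and merges the results. This is a pure-Python stand-in under that analogy; a real job would use DataFrame transformations or `RDD.mapPartitions`:

```python
from functools import reduce
from typing import Callable, List, TypeVar

T = TypeVar("T")

def split_into_partitions(data: List[T], n: int) -> List[List[T]]:
    """Round-robin the data into n partitions, as a cluster would shard it."""
    parts: List[List[T]] = [[] for _ in range(n)]
    for i, item in enumerate(data):
        parts[i % n].append(item)
    return parts

def map_partitions(parts: List[List[int]],
                   fn: Callable[[List[int]], int]) -> List[int]:
    """Apply fn independently to each partition (the 'distributed' step)."""
    return [fn(p) for p in parts]

data = list(range(1, 11))                         # 1..10
parts = split_into_partitions(data, 3)
partial_sums = map_partitions(parts, sum)         # one partial result per partition
total = reduce(lambda a, b: a + b, partial_sums)  # driver-side combine
print(total)  # 55, same answer as single-node sum(data)
```

The pattern only works because the per-partition function needs no data from other partitions; operations that do (joins, global sorts) force a shuffle, which is where the single-node mental model breaks down.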

Data Engineering & Data Literacy

  • Strong understanding of:
    • Lakehouse and Data Warehouse architectures
    • ETL/ELT patterns
    • Source system integration patterns
    • Data models and schema design
  • Experience with data quality validation and monitoring practices.

Software Engineering Practices

  • Strong understanding of:
    • SOLID principles
    • DRY principles
    • Reusable framework design
    • Metadata-driven orchestration
  • Experience with:
    • Agile/SCRUM methodologies
    • Git workflows and pull requests
    • Unit, integration, and end-to-end testing
    • VS Code / Cursor IDE tools

Spark & SQL Optimization

  • Experience troubleshooting distributed Spark workloads.
  • Expertise in:
    • Query optimization
    • Join optimization
    • Compute and table performance tuning
    • Efficient filtering and workload reduction strategies
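
Filter-early workload reduction, the last item above, is easy to show in miniature: applying a predicate before an expensive step cuts the work done while producing the same result. The row counts and cost model here are invented for illustration; in Spark the same idea surfaces as predicate pushdown and partition pruning:

```python
from typing import Dict, List

rows: List[Dict[str, int]] = [{"region": i % 4, "value": i} for i in range(1000)]

calls = {"expensive": 0}

def expensive_transform(row: Dict[str, int]) -> int:
    """Stand-in for a costly per-row computation; counts invocations."""
    calls["expensive"] += 1
    return row["value"] * 2

# Filter-late: transform every row, then keep only region 0.
calls["expensive"] = 0
all_transformed = [expensive_transform(r) for r in rows]
late = [v for r, v in zip(rows, all_transformed) if r["region"] == 0]
cost_late = calls["expensive"]

# Filter-early: prune first, transform only the surviving rows.
calls["expensive"] = 0
early = [expensive_transform(r) for r in rows if r["region"] == 0]
cost_early = calls["expensive"]

print(cost_late, cost_early)  # 1000 vs 250 invocations, identical results
```
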

DevOps & Automation

  • Hands-on experience with:
    • CI/CD pipelines
    • Git version control
    • Terraform Infrastructure-as-Code (IaC)
    • Deployment automation practices

Required Experience

  • 5+ years of experience in cloud data engineering.
  • Proven experience designing and building production-grade data platforms.
  • Experience working both independently and collaboratively within Agile teams.

Job ID: 147317319
