Associate Data Engineer

  • Posted 16 hours ago

Job Description

Roles & Responsibilities

  • Develop, test, and maintain data pipelines using Databricks, PySpark, and Python.
  • Ingest, transform, and process structured and semi-structured data from multiple sources.
  • Support the development of scalable ETL/ELT workflows for analytics, reporting, and machine learning use cases.
  • Work with data engineers, analysts, and data scientists to understand data requirements and deliver reliable datasets.
  • Perform data cleansing, validation, and quality checks to ensure accuracy and consistency.
  • Optimize Spark jobs and Databricks notebooks for performance, reliability, and cost efficiency.
  • Create and maintain documentation for data pipelines, workflows, data definitions, and processes.
  • Assist in troubleshooting pipeline failures, data issues, and performance bottlenecks.
  • Follow best practices for version control, code quality, testing, and deployment.
  • Support basic AI/ML data preparation activities, including feature engineering, dataset creation, and model input preparation.
  • Monitor scheduled jobs and workflows to ensure timely and successful data delivery.
  • Collaborate with cross-functional teams in an Agile or iterative development environment.

Basic Qualifications and Experience

  • 2-6 years of experience, with a Bachelor's degree in Computer Science, Data Engineering, Information Systems, Engineering, Mathematics, or a related field, or equivalent practical experience.

Must-Have Qualifications

  • Bachelor's degree in Computer Science, Data Engineering, Information Systems, Engineering, Mathematics, or a related field, or equivalent practical experience.
  • Hands-on experience with Python for data processing, scripting, and automation.
  • Strong working knowledge of PySpark and distributed data processing concepts.
  • Proven hands-on experience using Databricks for data engineering, including notebooks, clusters, jobs, workflows, Delta tables, and performance optimization.
  • Ability to build, maintain, and troubleshoot scalable ETL/ELT pipelines in Databricks.
  • Experience working with Delta Lake and lakehouse architecture concepts.
  • Working knowledge of SQL for querying, transforming, and validating data.
  • Ability to work with structured and semi-structured data formats such as CSV, JSON, Parquet, and Delta.
  • Understanding of data engineering concepts such as ETL/ELT, data pipelines, data lakes, data warehouses, batch processing, and data quality.
  • Basic understanding of AI and machine learning concepts, including features, training datasets, model inputs/outputs, and model evaluation basics.
  • Experience supporting data preparation or feature engineering for AI/ML use cases.
  • Familiarity with cloud-based data platforms, preferably AWS, Azure, or GCP.
  • Understanding of Git or other version control tools.
  • Strong analytical, problem-solving, and troubleshooting skills.
  • Good communication skills and ability to work collaboratively with technical and non-technical stakeholders.
  • Willingness to learn new tools, technologies, and data engineering best practices.

Preferred Qualifications

  • Exposure to Delta Lake, Unity Catalog, or Lakehouse architecture.
  • Experience with workflow orchestration tools or Databricks Jobs.
  • Familiarity with CI/CD practices for data engineering projects.
  • Exposure to machine learning workflows using MLflow, scikit-learn, or similar tools.
  • Experience with Tableau, Power BI, or similar data visualization tools to create dashboards, support reporting needs, validate datasets, and perform exploratory analysis.
  • Understanding of data governance, security, and access control concepts.
  • Experience working in an Agile/Scrum environment.

Shift Information

This position may require working during later shifts (evening or night) depending on business needs.

Job ID: 147203841
