
A minimum of 5 years in Data Engineering, including at least 4 years of hands-on experience with Databricks and the Azure ecosystem, is a must.
• End-to-End Data Pipeline Delivery – Proven track record of building and owning ETL/ELT pipelines, data models, and platform components at enterprise scale across complex business domains.
• Expert SQL Skills – Advanced proficiency in complex joins, window functions, CTEs, query optimization, and performance tuning across large datasets.
• Strong PySpark & Python – Hands-on experience building distributed data processing pipelines, transformations, and automation scripts using PySpark and Python.
• Databricks Expertise – Deep working knowledge of Delta Lake, Unity Catalog, Workflows, medallion architecture (bronze-silver-gold), notebook development, and lakehouse best practices.
• Clean Code & Engineering Discipline – Writes modular, well-documented, and testable code. Follows coding standards, maintains version control hygiene, and builds reusable components that other developers can extend.
• Problem Solving & Debugging – Strong ability to diagnose and resolve complex data pipeline failures, distributed system issues, and production incidents under pressure.
• Performance Tuning – Demonstrated expertise in optimizing Spark jobs, SQL queries, and storage layers for cost efficiency, throughput, and reliability.
• Developer Ownership Mindset – Takes end-to-end responsibility from development through unit testing, deployment validation, and production support — not just writing code and handing off.
• Adaptability – Comfortable navigating evolving technology stacks, shifting priorities, and cross-functional collaboration in a dynamic, fast-paced Agile environment.
• Team Player – Collaborative engineer who actively contributes to code reviews, architecture discussions, knowledge sharing, and mentoring within a globally distributed team.
Nice to Have:
• AI Exposure – Familiarity with AI/ML concepts such as model training pipelines, feature engineering, RAG architectures, or tools like Azure OpenAI and LangChain.
• Supply Chain Domain Experience – Experience with supply chain data domains such as procurement, logistics, inventory, or demand planning is a strong plus. Understanding of SAP or ERP data structures in a supply chain context is an added advantage.
• Full Project Lifecycle & Cross-Team Collaboration – Has worked through the complete project lifecycle — from requirements gathering to production deployment. Experienced in collaborating with API and UI/UX teams, and understands how data flows end-to-end across upstream sources, backend services, APIs, and front-end applications.
Job ID: 148475463