

Job Responsibilities:
Data Pipeline Development: Design, develop, and optimize data pipelines to ingest, process, and transform data from various sources (e.g., APIs, databases) into the data warehouse.
Data Integration: Integrate data from structured and unstructured sources into the Databricks Lakehouse environment, ensuring data accuracy and reliability.
Data Lakehouse Storage Management: Design and maintain data warehouse solutions using medallion architecture practices, optimizing storage, cloud utilization, costs, and query performance.
Collaboration with Data Teams: Work closely with data scientists and analysts to understand requirements, translate them into technical solutions, and implement those solutions.
Data Quality and Monitoring: Cleanse, transform, and enrich data. Implement data quality checks and establish monitoring processes to ensure data integrity and accuracy. Monitor data pipelines and troubleshoot issues or failures promptly to ensure data reliability.
Optimization and Performance Tuning: Optimize data processing workflows for performance, reliability, and scalability, including tuning Spark jobs, caching, and partitioning data appropriately.
Data Security and Privacy: Manage and organize data lakes using Unity Catalog, ensuring proper governance, security, role-based access, and compliance with data management policies.
Preferred Qualifications:
Technical Skills:
Job ID: 108710827
Skills:
Snowflake, SQL Server, Postgres, Tableau, BigQuery, MySQL, Oracle, ETL Tools, Informatica, DataStage, Oozie, Redshift, Amazon Q, MicroStrategy, Airflow, Apache Hop, Looker, Pentaho, GitHub Copilot, ChatGPT
Skills:
SciPy, AWS Athena, Apache Airflow, Data Cleansing, DevOps, Pandas, GCP, NumPy, Spark, Python, Google Datalab, Apache Druid, scikit-learn, MLlib, Python Notebooks, Google BigQuery, Imply, Zeppelin, Jupyter
Skills:
PySpark, Azure Databricks, Data Warehousing Concepts, SQL, ETL frameworks, Delta Lake architecture, Business Intelligence Platforms
Skills:
GitLab, SQL, SQL Server, DataFlow, PySpark, REST APIs, AWS, MySQL, CloudFormation, Oracle, Spark SQL, Azure, GCP, Terraform, Jenkins, PostgreSQL, Web Services, Azure DevOps, Informatica Cloud Services IDMC IICS, Delta Lake, CI/CD pipelines
Skills:
SQL, Google Cloud, ELT, Apache Airflow, NoSQL, Jenkins, Git, Docker, Azure, Python, AWS, ETL