Kezan Consulting

Data Engineer (DevOps)

4-9 Years
  • Posted 5 hours ago
  • Over 50 applicants

Job Description

Key Responsibilities

• Ensure platform uptime and application health as per SLOs/KPIs

• Monitor infrastructure and applications using ELK, Prometheus, Zabbix, etc.

• Debug and resolve complex production issues, performing root cause analysis

• Automate routine tasks and implement self-healing systems

• Design and maintain dashboards, alerts, and operational playbooks

• Participate in incident management, problem resolution, and RCA documentation

• Own and update SOPs for repeatable processes

• Collaborate with L3 and Product teams for deeper issue resolution

• Support and guide L1 operations team

• Conduct periodic system maintenance and performance tuning

• Respond to user data requests and ensure timely resolution

• Address and mitigate security vulnerabilities and compliance issues

Technical Skillset

• Hands-on with Spark, Hive, Cloudera Hadoop, Kafka, Ranger

• Strong Linux fundamentals and scripting (Python, Shell)

• Experience with Apache NiFi, Airflow, YARN, and ZooKeeper

• Proficient in monitoring and observability tools: ELK Stack, Prometheus, Loki

• Working knowledge of Kubernetes, Docker, Jenkins CI/CD pipelines

• Strong SQL skills (Oracle/Exadata preferred)

• Familiarity with DataHub, data mesh, and security best practices is a plus

• Strong problem-solving and debugging mindset

• Ability to work under pressure in a fast-paced environment

• Excellent communication and collaboration skills

• Ownership, customer orientation, and a bias for action

Job ID: 114642883
