Airtel Wireless

Sr. Engineer, TechOps

4-6 Years
  • Posted 4 hours ago

Job Description

We are looking for a skilled L2 TechOps engineer with 4+ years of experience managing large-scale production systems. The ideal candidate has a strong background in Linux servers, SQL, and big data tools (Hive, Spark), along with hands-on experience in monitoring, troubleshooting, and automation.

Key Responsibilities

  • Manage and maintain production environments ensuring high availability and reliability.
  • Perform system monitoring, performance tuning, and capacity planning.
  • Analyze and debug production issues by leveraging Airflow logs, Spark UI, and Hive query performance metrics.
  • Build and maintain dashboards and alerts in Grafana and Kibana for proactive monitoring and issue detection.
  • Monitor and troubleshoot OCP (OpenShift Container Platform) clusters and associated components.
  • Write and optimize SQL queries to analyze and troubleshoot data issues.
  • Collaborate with development, data engineering, and operations teams to ensure system reliability and scalability.
  • Participate in on-call rotations and incident management processes.
  • Automate routine operational tasks using scripting (Shell, Python, etc.).
  • Ensure adherence to best practices in observability, monitoring, and incident response.

Required Skills & Experience

  • 4–6 years of experience in an SRE, DevOps Engineer, or similar role.
  • Strong expertise in Linux systems.
  • Solid understanding of SQL with the ability to write and optimize queries.
  • Good working knowledge of Hive and Spark; ability to use Spark UI for debugging performance issues.
  • Hands-on experience in monitoring and analyzing logs using Kibana and Grafana.
  • Experience in Airflow log analysis and DAG issue resolution.
  • Familiarity with OCP (OpenShift) or other Kubernetes-based platforms for cluster monitoring.
  • Strong analytical, debugging, and problem-solving skills.
  • Scripting skills in Shell or Python for automation.
  • Understanding of CI/CD and deployment best practices is a plus.
  • Good working knowledge of querying tools such as JupyterHub and Metabase.

Preferred Qualifications

  • Experience with cloud platforms (AWS, GCP, or Azure).
  • Knowledge of Prometheus, Elastic Stack, or similar observability tools.
  • Exposure to incident management and postmortem analysis.
  • Familiarity with big data pipelines and distributed systems.

Job ID: 148307971

Pune, India

Skills:

Kibana, Shell, Hive, Linux, Spark, Grafana, SQL, Python, Airflow, OCP OpenShift