IT Infrastructure Support Site Reliability Engineer

Experience: 6-11 Years
  • Posted 9 hours ago

Job Description

Key Responsibilities

Service Reliability & Automation

  • Establish, monitor, and enforce Service Level Indicators (SLIs) and Service Level Objectives (SLOs) for infrastructure tooling, including configuration compliance, patch success rates, and deployment latency
  • Provide Level 3 expertise for tooling-specific incidents, focusing on automating incident remediation and reducing MTTR
  • Automate repetitive tasks across managed infrastructure to measurably reduce operational overhead (e.g., server build time reductions)
  • Conduct root cause analysis and lead blameless postmortems for service-impacting incidents to drive systemic improvements

Infrastructure & Configuration Management

  • Engineer and maintain automated scripts for asset management, configuration databases, and monitoring systems
  • Design, develop, and deploy full-stack applications, custom plugins, and automation scripts for direct device interaction
  • Maintain Infrastructure-as-Code (IaC) configurations for Windows and Linux servers using tools such as Ansible, Terraform, or Puppet
  • Implement drift detection and auto-remediation capabilities for configuration compliance

Network & Security Device Automation

  • Build API-driven tools for network configuration, firmware updates, pre/post-change validation, and real-time health monitoring
  • Deploy monitoring agents, centralized logging, and dashboards with alerts based on critical SLIs (latency, error rates, traffic, saturation)
  • Develop automation scripts for intelligent ticket handling, validation, and escalation workflows within enterprise ticketing systems

Monitoring & Continuous Improvement

  • Implement and manage monitoring solutions (Prometheus, Grafana, Datadog) and centralized logging platforms (ELK Stack)
  • Build custom dashboards, alerts, and reporting for infrastructure and security devices
  • Participate in continuous improvement initiatives to enhance automation, tooling reliability, and system resilience

More Info

Open to candidates from: Indian

Job ID: 143731195