Role : Specialist (Zabbix)
Experience : 7+ Yrs
Location : Bangalore
Job Description:
- Design and implement production‑grade monitoring and observability architectures using Zabbix.
- Integrate Zabbix with foundational infrastructure layers, external monitoring platforms, and enterprise toolchains.
- Develop and maintain robust integration workflows to ensure seamless data exchange across monitoring and analytics tools.
- Build, manage, and optimize Grafana dashboards, including metric visualization, alerting, and performance monitoring.
- Deliver greenfield implementations built from scratch.
- Work across multiple environments, including on-premises and cloud (Azure/AWS).
- Work with log and analytics stacks such as ELK, Fluentd, and Loki to enable end‑to‑end observability.
- Leverage and integrate AIOps platforms including BigPanda, Moogsoft, Dynatrace, and New Relic to operationalize intelligent monitoring solutions.
- Assist in designing advanced alerting frameworks, anomaly detection mechanisms, and automated incident response workflows.
- Own end‑to‑end deployment activities, ensuring scalability, reliability, and adherence to best practices.
- Perform ongoing maintenance and lifecycle management, including upgrades, patching, and configuration changes, while ensuring system stability and minimal downtime.
- Utilize scripting skills (Shell/Python preferred) to automate monitoring workflows, integrations, and routine operational tasks.
- Act as a system integrator across the full monitoring ecosystem, ensuring tool compatibility, scalability, and operational readiness.
Qualifications and Experience
- At least 4-5 years of experience with the Linux OS (command line) - Mandatory
- Automation scripting or programming experience (Python, Java, etc.) - Preferable
- 3-5 years of experience supporting large-scale deployments, preferably of software solutions - Mandatory
- Experience supporting production-critical systems in a 24/7 environment - Mandatory
- Knowledge of streaming solutions and client operation - Preferable
- Strong, demonstrable analytical skills, with the ability to collate and analyze data from various sources.
- Experience with high-availability clusters, client-server systems, web server application installation, Linux administration, and log analysis.
- Storage experience is an advantage
- SQL database experience, preferably in operational support, is an advantage
- Knowledge of data center and cloud environments is an advantage
- A recognized university degree, or equivalent experience, in Broadcast / Software / Systems / Communications / Electronics Engineering
Competence and Skills
- Excellent communication skills
- Ability to effectively handle multiple customer demands and priorities
- Strong teamwork skills
- Strong understanding of the changing pace and culture of an operator environment
- Excellent command of English, written and verbal
- Good diagnostics and problem-solving skills
- Self-motivated learner