- Design, develop, and implement Grafana dashboards and panels that provide actionable insights into system performance, reliability, and key performance indicators (KPIs).
- Bring experience with and in-depth knowledge of the LGTM stack: Loki for logs, Grafana for visualization, Tempo for traces, and Mimir for metrics.
- Collaborate with consumers to understand monitoring needs and translate them into effective Grafana solutions.
- Provide technical guidance and support to teams adopting Grafana for monitoring.
- Conduct training sessions for internal teams on best practices, usage guidelines, and advanced features of Grafana.
- Create comprehensive documentation to facilitate self-service adoption.
- Stay informed about the latest developments in the Grafana ecosystem, including new features, updates, and best practices.
- Evaluate and recommend upgrades or new integrations to enhance our monitoring capabilities.
- Talk to our consumers to identify their observability needs, and suggest how to use the tools effectively.
- Guide consumers through onboarding and support them until they realize actual business value from adopting observability.
- Define and implement Service Level Objectives (SLOs) for critical services.
Skills Required:
- Observability as a core skill, not just from a tooling perspective but also as a practice.
- Proven experience as a Grafana expert with hands-on design and implementation of monitoring solutions.
- Proficiency in Grafana, Prometheus, and related monitoring technologies.
- Experience with Terraform and Ansible; proficiency in OpenTelemetry (OTel) and other instrumentation concepts.
- Scripting, automation, and programming skills are a must-have.
- Experience with data visualization and dashboard design principles.
- Knowledge of containerization technologies is a must-have.
- Experience with other observability toolsets apart from Splunk and Grafana is a plus.