Role Overview
We are seeking a Senior Engineer to design and build intelligent monitoring and observability systems for large-scale telecom environments. This role blends telecom domain expertise, distributed systems engineering, and applied machine learning to enable proactive fault detection, root cause analysis, and automated remediation.
Key Responsibilities
- Network Monitoring & Observability
  - Design and implement scalable monitoring solutions using tools such as Prometheus and Grafana
  - Integrate telemetry sources including SNMP, NetFlow, syslog, and streaming metrics
  - Implement distributed tracing using OpenTelemetry
  - Build dashboards and alerting systems for key telecom KPIs (latency, packet loss, jitter, throughput)
- Distributed Systems & Data Streaming
  - Design and maintain high-throughput, real-time data pipelines using platforms like Apache Kafka
  - Develop event-driven architectures with strong fault tolerance and scalability
  - Ensure reliable ingestion and processing of large-scale telemetry data
- AI / ML for Network Intelligence
  - Develop anomaly detection models for time-series network data
  - Build predictive models for congestion, failures, and capacity planning
  - Reduce alert noise using clustering and classification techniques
  - Implement ML pipelines using frameworks such as TensorFlow or PyTorch
- Backend Engineering
  - Develop microservices and APIs to support monitoring and analytics platforms
  - Build and maintain data processing pipelines
  - Integrate machine learning models into production systems
Required Qualifications
- Bachelor's or Master's degree in Computer Science, Telecommunications, or a related field
- 7+ years of experience in telecom engineering, backend systems, or network monitoring
- Strong understanding of 4G/5G network architecture and telecom KPIs
- Hands-on experience with observability and monitoring tools (e.g., Prometheus, Grafana, ELK)
- Proficiency in at least one of the following programming languages: C#, Python, Go, or Java
- Experience with distributed systems and streaming platforms (e.g., Apache Kafka)
Preferred Qualifications
- Experience applying machine learning to time-series or network data
- Familiarity with AIOps concepts and platforms
- Knowledge of cloud-native observability practices
- Relevant certifications in monitoring, observability, or AIOps tools