Job Description: Service Reliability Engineer (Mid / Senior)
Location: Kochi – Infopark (Work from Office)
Experience: 3+ Years
Team: Engineering & Risk Operations
Employment Type: Full‑time
Role Summary
We are looking for a Mid‑level / Senior Service Reliability Engineer (SRE) to ensure the availability, performance, and security of our fintech and payment platforms. This role uniquely combines Service Reliability Engineering with Fraud Detection and Risk Monitoring, offering the opportunity to work at the intersection of engineering, risk, and operations.
The ideal candidate will thrive in a mission‑critical, high‑transaction environment, take ownership of system reliability, and contribute to fraud detection and incident response efforts.
Key Responsibilities
Service Reliability Engineering
- Ensure high availability and performance of customer‑facing services and payment systems
- Implement and maintain monitoring, observability, and alerting using Datadog, PagerDuty, and custom dashboards
- Investigate alerts, logs, and metrics across infrastructure, applications, databases, and APIs
- Perform incident response, root‑cause analysis (RCA), and post‑incident reviews
- Define, manage, and improve SLIs, SLOs, and SLAs
- Develop and maintain automation scripts for system provisioning, health checks, and failover processes
- Collaborate with engineering and infrastructure teams to ensure scalable and fault‑tolerant system designs
Fraud Detection & Risk Monitoring
- Monitor real‑time transactional activity using Splunk, Datadog, and internal data sources
- Investigate suspicious patterns, anomalies, and user behaviors to detect fraud, abuse, and financial risk
- Tune and optimize fraud detection rules, triggers, and thresholds based on evolving threat patterns
- Work closely with Fraud Operations, Compliance, and Customer Support teams on investigations and resolutions
- Build and maintain dashboards and reports tracking fraud KPIs, incidents, and operational metrics
- Participate in cross‑functional fraud response processes and continuous improvement initiatives
Required Qualifications
- 3+ years of experience in SRE, DevOps, Security Operations, or Fraud/Risk Monitoring roles
- Strong hands‑on experience with Datadog, PagerDuty, and Splunk
- Proficiency in Linux systems, shell scripting, and cloud platforms (AWS, GCP, or Azure)
- Solid understanding of incident management, observability, and CI/CD pipelines
- Experience working with fraud detection or risk monitoring systems in fintech, payments, or transaction‑heavy platforms
- Ability to write and analyze SQL and Splunk SPL queries against large log and event datasets
- Strong communication and collaboration skills across technical and non‑technical teams
Nice to Have
- Exposure to KYC / AML systems in regulated financial environments
- Knowledge of secure API design and authentication mechanisms (OAuth 2.0, JWT)
- Familiarity with compliance standards such as PCI DSS and ISO 27001, and with the ISO 8583 card‑transaction messaging standard
- Experience with rule engines, anomaly detection, or risk models
- Certifications such as AWS Certified DevOps Engineer (Professional), CFE (Certified Fraud Examiner), or equivalent