
Position Overview
Job Title: Site Reliability Engineer (SRE)
Department: Technology
Location: Bangalore
Reporting To: Head of Infra
Tookitaki is looking for a Site Reliability Engineer (SRE) with 3–6 years of experience to help maintain and scale the infrastructure that powers our flagship products, FinCense and the AFC Ecosystem. As an SRE, you will work at the intersection of software engineering and infrastructure, ensuring high availability, performance, and scalability of our platforms. You will collaborate with engineering, DevOps, and client success teams to operationalize deployments across on-premises, VPC, and Compliance as a Service (CaaS) environments while improving monitoring, automation, and incident response.
Position Purpose
The SRE role is responsible for ensuring the reliability and efficiency of Tookitaki's production systems and environments. This includes building monitoring systems, improving deployment pipelines, automating routine operations, and responding to production incidents. You'll help build a resilient infrastructure that supports our mission to provide AI-driven solutions that prevent financial crime.
Key Responsibilities
System Monitoring & Incident Management
Infrastructure & Deployment Automation
Container & Orchestration Management
Cloud & Platform Operations
Security & Reliability Enhancements
Collaboration & Documentation
Qualifications and Skills
Education
Experience
Technical Skills
Soft Skills
Key Competencies
Success Metrics
Benefits
Job ID: 147194219
Skills:
Unix, ELK, Prometheus, Grafana, Datadog, Docker, Terraform, Python, AWS, Java, CloudFormation, Bash, Pulumi, DevOps, GCP, Linux, ARM, Azure, Kubernetes, Monitoring and observability tools, Infrastructure as Code, SRE, Go, Azure Monitor