Job Description
Site Reliability Engineer (SRE)
We are seeking an experienced Site Reliability Engineer (SRE) to ensure the availability, performance, scalability, and reliability of customer-facing platforms. The SRE will work closely with DevOps, DBA, Development, and Security teams to provision infrastructure, deploy applications, automate workflows, and maintain operational excellence. This role has a direct impact on system stability, customer satisfaction, and overall platform performance.
Key Responsibilities & Deliverables
Manage, monitor, and maintain highly available Windows and Linux environments
Ensure system scalability by analyzing performance metrics and trends
Handle routine service requests while identifying and implementing automation opportunities
Design, implement, and maintain Infrastructure as Code (IaC) using Terraform, ARM Templates, and AWS CloudFormation
Manage data backups and implement disaster recovery strategies
Design and deploy CI/CD pipelines using GitHub Actions, Jenkins, Octopus, Ansible, and Azure DevOps
Enforce security best practices throughout the Software Development Lifecycle (SDLC)
Follow and promote ITIL best practices and standards
Act as a subject matter expert for emerging cloud technologies with a focus on AWS
Technical Skills & Proficiencies
Mandatory Skills
Strong hands-on experience with AWS
Experience administering Windows and Linux servers
CI/CD experience with GitHub Actions, Jenkins, and Octopus
Infrastructure automation using Ansible, Terraform, or similar tools
Cloud & DevOps
Experience with Azure and AWS cloud services
Experience with observability and monitoring tools such as New Relic, Application Insights, AppDynamics, or Datadog
Hands-on experience with Docker and Kubernetes
Scripting skills in Bash, PowerShell, or Python
Database & Networking
SQL Server database maintenance and administration (preferred)
Strong understanding of networking concepts including VNet, Subnets, Private Link, and VNet Peering
Azure Services Knowledge
Azure AD, OAuth, Certificates
AKS, App Services, ASE
Load Balancer, Application Gateway, Firewall
API Management
Azure SQL and Databases
Logging & Security
Experience analyzing application logs, IIS logs, system logs, security logs, and AWS CloudTrail events
Experience Requirements
5+ years of experience in SRE, DevOps, or System Administration
Proven expertise in supporting high-availability Windows and Linux environments
Strong experience with the WISA stack (Windows, IIS, SQL Server, ASP.NET)
3+ years of experience working with cloud platforms (AWS, Azure, and GCP)
1+ year of experience working with container technologies such as Docker and Kubernetes
Experience working in Agile methodologies such as Scrum, Kanban, or Lean
Education
Bachelor's Degree or Diploma in Computer Science, Information Systems, or equivalent practical experience
Skills: DevOps, GCP, Azure, Database, Cloud, AWS