Intangles

Site Reliability Engineer

2-4 Years
  • Posted 3 hours ago

Job Description

· Intangles Lab is looking for a hands-on Site Reliability Engineer with a FinTech background to manage large-scale 24×7 cloud operations.

· The candidate should have 2+ years of experience as a Site Reliability Engineer and hands-on experience with the following technologies and skills:

Required Skills:

· AWS Cloud (Advanced): Certification is preferred.

· Networking (Intermediate): Proficiency in networking concepts is necessary.

· Ubuntu/Linux & OS (Advanced): Strong Linux and networking fundamentals; prior working experience is preferred.

· Database (Basic Knowledge): Familiarity with SQL and NoSQL databases is required, with working experience in at least one of them.

· Database Administration (MongoDB, PostgreSQL, Elasticsearch): Hands-on experience with at least one is required.

· Containerization Tools: Docker

· Kubernetes (Advanced)

· Knowledge of Amazon EKS is compulsory.

· Working knowledge of StatefulSets is required.

· Familiarity with Helm charts is necessary.

CI/CD (Advanced):

· Proficiency in at least one CI/CD tool, such as CircleCI, Argo Project, GitHub Actions, or similar, is essential.

Programming:

· Basic programming knowledge is required, with the ability to write code.

· Scripting languages: Python, Shell

Monitoring Stack:

· Prometheus, Grafana, Alertmanager, Istio, Jaeger, Datadog, PagerDuty, Elastic APM (or similar).

Optional Skills:

· Intermediate to advanced application development experience in languages such as JavaScript, Python, and Java is a bonus.

· Understanding of N-tier Architectures

· Understanding of REST & gRPC API Frameworks

· Understanding of web servers in Node.js

Responsibilities:

· To work in a production environment with technologies such as Linux, AWS, Terraform, and Kubernetes, including MongoDB, Elasticsearch, and PostgreSQL administration.

· To keep the production environment up and running, i.e., to ensure its reliability.

· To troubleshoot, debug, and fix failures in the production and QA environments and provide technical solutions.

· To own the responsibilities of on-call as per the team's policy.

· To write and enhance automation as needed.

· To work closely with internal teams and customers to follow agreed processes and meet uptime SLAs.

· To write, update, and enhance documentation, including runbooks/playbooks, and prepare postmortem reports for production incidents.

· To be ready to work in a 24×7 environment when required, since the role exists to ensure the platform's reliability.

Additional Requirements:

· Should be familiar with change, incident, problem, issue, and risk management, and with escalation procedures.

· Should be flexible about working rotational shifts and night hours (including weekends).

· Excellent thinking and problem-solving skills.

Job ID: 147136589
