Sr Principal Site Reliability Engineer

UKG

Noida

10-12 Years

Posted 2 days ago

Job Description

About UKG Site Reliability Engineering

At UKG, Site Reliability Engineers are pivotal team members possessing a breadth of knowledge encompassing all aspects of service delivery. We develop software solutions to enhance, harden, and support our service delivery processes. This includes building and managing CI/CD deployment pipelines, automated testing, capacity planning, performance analysis, monitoring, alerting, chaos engineering, and auto-remediation.

We have a passion for learning and evolving with current technology trends, and we strive to innovate in relentless pursuit of a flawless customer experience. We operate with an "automate everything" mindset, which helps us bring immense value to our customers by deploying services with incredible speed, consistency, and availability.

Primary/Essential Duties and Key Responsibilities

Engage in and improve the lifecycle of services from conception to End-of-Life (EOL), including system design consulting and capacity planning.
Define and implement standards and best practices related to system architecture, service delivery, metrics, and the automation of operational tasks.
Support services, product & engineering teams by providing common tooling and frameworks to deliver increased availability and improved incident response.
Improve system performance, application delivery, and efficiency through automation, process refinement, postmortem reviews, and in-depth configuration analysis.
Collaborate closely with engineering professionals within the organization to deliver reliable services.
Identify and eliminate operational toil by treating operational challenges as a software engineering problem.
Actively participate in incident response, including on-call responsibilities.
Partner with stakeholders to influence and help drive the best possible technical and business outcomes.
Guide junior team members and serve as a champion for Site Reliability Engineering.

Qualifications

Engineering degree, or a degree in a related technical discipline, and 10+ years of experience in SRE.
Experience coding in higher-level languages (e.g., Python, JavaScript, C++, or Java).
Knowledge of cloud-based applications and containerization technologies.
Demonstrated understanding of best practices in metric generation and collection, log aggregation pipelines, time-series databases, and distributed tracing.
Ability to analyze the technology and engineering practices currently used within the company and to develop steps and processes to improve and expand upon them.
Working experience with industry-standard tools such as Terraform and Ansible.

(Experience, Education, Certification, License and Training)

Must have hands-on experience working within Engineering or Cloud.
Experience with public cloud platforms (e.g., GCP, AWS, Azure).
Experience in configuration and maintenance of applications & systems infrastructure.
Experience with distributed system design and architecture.
Experience building and managing CI/CD pipelines.

More Info

Job Type:

Permanent Job

Industry:

Human Resources

Role:

Other Software / Hardware / EDP

Function:

Employment Type:

Full time

Open to candidates from:

Indian

About Company

UKG

UKG is an HR technology company on a mission to inspire every organization to become a great place to work. When you join our dynamic team of 3,000 U Krewers in India, you’ll help create outstanding workplace experiences for more than 80,000 organizations and their people around the world.

Job ID: 117050871
