ZS

Senior Cloud Site Reliability Engineer

4-8 Years
  • Posted 13 hours ago

Job Description

SENIOR CLOUD SITE RELIABILITY ENGINEER

At ZS, we honor the visible and invisible elements of our identities, personal experiences and belief systems, the ones that comprise us as individuals, shape who we are and make us unique. We believe your personal interests, identities, and desire to learn are part of your success here. Learn more about our diversity, equity, and inclusion efforts and the networks ZS supports to assist our ZSers in cultivating community spaces, obtaining the resources they need to thrive, and sharing the

ZS's CCoE (Cloud Center of Excellence) team builds, maintains and helps architect the systems enabling ZS client-facing software solutions. We define and implement best practices to ensure performant, resilient and secure cloud solutions. The CCoE team at ZS is made up of analytical problem solvers who come from diverse backgrounds and share a passion for quality delivery, whether our customer is a client or another ZS employee. The team has a presence in ZS's Evanston, Illinois and Pune, India offices.

What You'll Do:

As a Senior Cloud Site Reliability Engineer, you will work with a team of operations engineers and software developers to analyze, maintain and nurture our Cloud solutions/products to support the company's ever-growing clientele. As a technical expert, you will work closely with various teams to ensure the stability of the environment by:

  • Analyzing the current state, designing appropriate solutions and working with the team to implement them
  • Coordinating emergency responses, performing root cause analysis, and identifying and implementing solutions to prevent recurrences
  • Working with the team to identify ways to increase MTBF and lower MTTR for the environment
  • Reviewing each application stack in its entirety and executing initiatives to reduce failures, defects and overall performance issues
  • Identifying and working with the team to implement more efficient system procedures
  • Maintaining environment monitoring systems to provide the best visibility into the state of the deployed products/solutions
  • Performing root cause analysis on incoming infrastructure alerts and working with teams to resolve them
  • Maintaining performance analysis tools, identifying any adverse changes to performance and working with the teams to resolve them
  • Researching industry trends and technologies, and promoting adoption of best-in-class tools and technologies
  • Taking the initiative to advance the quality, performance, or scalability of our Cloud Solutions by influencing the architecture or design of our products
  • Designing, developing and executing automated tests to validate solutions and environments
  • Troubleshooting issues across the entire stack: infrastructure, software, application and network

What You'll Bring

  • 3+ years of experience working as a Site Reliability Engineer or in an equivalent position
  • 2+ years of experience with AWS cloud technologies; at least one AWS certification (Solutions Architect / DevOps Engineer) is required
  • 1+ years of experience functioning as a senior member of an infrastructure/software team
  • Hands-on experience with AWS services like EC2, RDS, EMR, CloudFront, ELB, API Gateway, CodeBuild, AWS Config, Systems Manager, Service Catalog, Lambda, etc.
  • Full-stack IT experience with *nix, Windows, network/firewall concepts, source control (Bitbucket), build/dependency management and continuous integration systems (TeamCity, Jenkins)
  • Expertise in at least one scripting language, Python preferred
  • Firm understanding of application reliability, performance tuning and scalability
  • Exposure to the big data technology stack (Spark, Hadoop, Scala, etc.) is preferred
  • Solid knowledge of infrastructure and cloud-native services along with network technologies
  • Solid understanding of RDBMS and Cloud Database engines like PostgreSQL, MySQL, etc.
  • Firm understanding of Clusters, Load balancers and CDN
  • Experience in fault-tolerant system design
  • Familiarity with Splunk data analysis, Datadog or similar tools is a plus
  • A Bachelor's degree (Master's preferred) in a related technical field
  • Excellent analytical, troubleshooting and communication skills
  • Possess strong verbal, written and team presentation communication skills. ZS is a global firm; fluency in English is required
  • This role requires healthy doses of initiative and the ability to remain flexible and responsive in a very dynamic environment
  • Ability to quickly learn new platforms, languages, tools, and techniques as needed to meet project requirements

Perks & Benefits:

ZS offers a comprehensive total rewards package including health and well-being, financial planning, annual leave, personal growth and professional development. Our robust skills development programs, multiple career progression options, internal mobility paths and collaborative culture empower you to thrive as an individual and global team member.

We are committed to giving our employees a flexible and connected way of working. A flexible and connected ZS allows us to combine work from home and on-site presence at clients/ZS offices for the majority of our week. The magic of ZS culture and innovation thrives in both planned and spontaneous face-to-face connections.

Considering applying?

At ZS, we're building a diverse and inclusive company where people bring their passions to inspire life-changing impact in global healthcare and beyond. We are most interested in finding the best candidate for the job and recognize the value that candidates with all backgrounds, including non-traditional ones, bring. If you are interested in joining us, we encourage you to apply even if you don't meet 100% of the requirements listed above.

ZS is an equal opportunity employer and is committed to providing equal employment and advancement opportunities without regard to any class protected by applicable law.

To Complete Your Application:

Candidates must possess work authorization for their intended country of employment. An on-line application, including a full set of transcripts (official or unofficial), is required to be considered.

Role: Site Reliability Engineer

Industry Type: Management Consulting

Department: Engineering - Software & QA

Employment Type: Full Time, Permanent

Role Category: DevOps

Education

UG: Any Graduate

PG: Any Postgraduate

Key Skills

Performance tuning, Root cause analysis, Data analysis, Analytical, Cloud, MySQL, Financial planning, Healthcare, Windows, Python

About Company

An experience with ZS means you’ll be encouraged to bring fresh thinking and co-create with industry-leading clients from day one. Here you’ll work side-by-side with a powerful collective of thinkers and experts shaping solutions from start to finish. At ZS, we believe that making an impact demands a different approach; and that’s why here your ideas elevate actions, and here you’ll have the freedom to pursue cutting-edge work and define your own path. Work side-by-side with like-minded people who rise in care of humanity’s greatest challenges to define what’s next. Join us and find a path where your passion can change lives.
ZS is a management consulting and technology firm that partners with companies to improve life and how we live it. We transform ideas into impact by bringing together data, science, technology and human ingenuity to deliver better outcomes for all. Founded in 1983, ZS has more than 13,000 employees in over 35 offices worldwide.

Job ID: 107490835