Applexus Technologies

Site Reliability Engineering Manager

10-15 Years
  • Posted a month ago
  • Over 50 applicants

Job Description

  • As a Site Reliability Engineering (SRE) Manager, the candidate will be responsible for building, developing, and retaining a high-performing team of software engineers and for creating an environment where they can thrive and succeed
  • While the primary role is leading and managing employees, you should have deep technical knowledge of distributed systems, cloud computing, and security platforms, and be able to quickly understand and respond to peer teams' needs
  • You should also have strong experience working with short release cycles and should not hesitate to:
      - Actively participate in architectural and functional design, implementation, and troubleshooting sessions
      - Review hardware, software infrastructure, and application functionality to identify and address performance bottlenecks
      - Drive major incident management to restore order
      - Spearhead the design and implementation of comprehensive monitoring for applications, integrations, and anomalies
      - Innovate, find opportunities, and drive automation efforts across various platform and security applications
      - Work closely with the cross-functional IT organization, business groups, Apple's production support team, application engineers, systems engineers, database administrators, and the QA team to ensure the implementation and reliability of platforms and applications
  • A proven track record of managing, motivating, and providing technical guidance to a team of software engineers to draw out their best work will be key to success
  • Ensuring quality in every deliverable, creative thinking, strong problem solving, and the ability to collaborate with other global cross-functional teams in a fast-paced environment will be meaningful attributes for success in this role
  • 10+ years of demonstrated experience in a Site Reliability Engineering, DevOps, or infrastructure-focused role.
  • 3+ years of experience leading and managing high performance SRE teams.
  • Proven track record of leading sophisticated SRE projects and enterprise services at large scale
  • Strong analytical, troubleshooting and problem solving skills
  • Good knowledge of at least one object-oriented programming language (preferably Java or Python)
  • Unix Performance Monitoring & Tuning
  • Good understanding of database concepts, PL/SQL, and NoSQL technologies.
  • Hands-on experience with monitoring and data analysis tools (e.g., Prometheus, Splunk, Grafana, CloudWatch)
  • Experience building and operating container orchestration systems such as Kubernetes or EKS.
  • Deep understanding of security concepts and protocols: authentication, authorization, signing, encryption, SSL/TLS, SSH/SFTP, PKI, X.509 certificates, and PGP.
  • Good fundamentals in release management and continuous integration
  • Familiarity with modern web services architectures, cloud platforms such as AWS, GCP, and Azure, and distributed storage systems (ScaleIO, Amazon S3).
  • Ability to communicate with large cross-functional teams about various engineering topics such as system architecture, detailed design, APIs, project schedules etc.
  • Ability to make the right trade-offs when dealing with functional complexity, conflicting priorities, and aggressive schedules
  • Represent the team and remove hurdles to enable each team member to operate at the highest level of efficiency and productivity
  • Ability to hire, mentor and manage the performance of a large team.
  • Ability to connect with senior executives and business stakeholders.
  • A learning attitude to continuously improve self, team and the organisation.
  • Ability to work under pressure and manage difficult situations in a fast-paced work environment.
  • Bachelor's or Master's degree, or equivalent experience, in Computer Science or a related field.

Preferred Qualifications

  • Experience with runtime configuration and troubleshooting of Java and JVM technologies is a plus
  • Good fundamentals in data modelling and machine learning algorithms
  • Strong knowledge of securing applications and a thorough understanding of the OWASP Top 10 risks and solutions.

More Info

Job ID: 107758089
