jellylogic solutions

Site Reliability Engineer

  • Posted a day ago

Job Description

Location - Remote

Timezone - Mandatory 4-hour overlap with the EST timezone

Mandatory Skills -

  1. Expert knowledge of J2EE/Spring/Hibernate.
  2. Expert knowledge of AWS EKS & Kubernetes administration.

Role Overview

This is an infrastructure-centric role. The successful candidate will collaborate directly with SRE and Cloud Centers of Excellence (CoE) to manage, upgrade, and maintain the infrastructure that powers backend applications. The primary objective is ensuring seamless application deployment and infrastructure stability, rather than feature development.

Primary Responsibilities

- Manage and upgrade enterprise-grade infrastructure, focusing on high availability and scalability.

- Lead application deployments across Kubernetes clusters using GitOps principles.

- Coordinate with SRE teams to maintain system uptime and implement infrastructure-level patches and upgrades.

- Provide expert-level troubleshooting for backend applications (J2EE/Spring/Hibernate) from an operational and performance perspective.

- Configure and optimize CDN layers to ensure global delivery performance.

Technical Requirements

- Infrastructure Management: Expert knowledge of AWS EKS and Kubernetes administration.

- Automation/CD: Mastery of GitOps tools, specifically ArgoCD or Flux, for automated deployments.

- Operational Troubleshooting: Deep understanding of J2EE, Spring, and Hibernate to diagnose performance issues and system crashes.

- CDN Governance: Technical proficiency in managing and configuring Cloudflare or CloudFront.

- Observability Mastery: Advanced use of Splunk, Dynatrace, or Datadog for proactive system monitoring and alerting.

Preferred Qualifications

- Experience working within a formal Cloud Center of Excellence (CoE) or SRE team.

- Demonstrated experience performing major version upgrades of Kubernetes clusters or critical middleware.

Professional Attributes

- Operational Mindset: Prioritizes stability, security, and scalability over feature delivery.

- Collaboration: Ability to work effectively with global Ops teams and handle technical handovers.

- Reliability: Proven track record of managing production-critical infrastructure without downtime.

Job ID: 148881697
