Falabella India

Site Reliability Engineer

  • Posted 9 hours ago

Job Description

Requirements

  • 1-2 years of experience in SRE, DevOps, Platform Engineering or software development roles.
  • Solid Linux fundamentals: process management, networking (TCP/IP, DNS, HTTP), file systems.
  • Hands-on experience with at least one scripting or programming language; Python or Go preferred.
  • Familiarity with containerization (Docker, Kubernetes/GKE) and basic kubectl operations.
  • Basic CI/CD exposure (GitHub Actions, Cloud Build, Argo CD or equivalent).
  • Understanding of observability concepts: metrics, logs, traces, alerting.

Technical Nice To Have

  • Experience with OpenTelemetry instrumentation or collector configuration.
  • Knowledge of GCP services (Cloud SQL, GCS, Pub/Sub, VPC, IAM).
  • Exposure to Infrastructure-as-Code (Terraform, Helm).
  • Playwright or Selenium for synthetic/E2E monitoring.
  • Familiarity with ITIL / incident management processes.

Soft Skills

  • Strong written English; able to write clear incident summaries and technical documentation.
  • Structured problem-solving under pressure; comfortable with ambiguity.
  • Collaborative mindset: proactive async communication across time zones.
  • Eagerness to learn; not afraid to ask questions or propose improvements.

This job was posted by Priyanka R N from Falabella.

Job ID: 149070133
