Java, Spring, Hibernate, Hadoop, Spark Streaming, Unix Shell Script, MSSQL, Azure, Synapse, Cosmos DB
Description
GSPANN is hiring a Site Reliability Engineer with expertise in Java and Spark. The role involves ensuring service reliability, automating operations, and supporting Java-based big data applications using Spark. You'll work closely with cross-functional teams to enhance system performance, observability, and scalability.
Location: Hyderabad
Role Type: Full Time
Published On: 1 July 2025
Experience: 5+ Years
Role and Responsibilities
- Gain a deep understanding of the business and map the full customer journey end-to-end.
- Apply software development principles to operations, leveraging broad experience in software engineering and Site Reliability Engineering (SRE) practices.
- Collaborate with stakeholders to enhance the design, observability, availability, scalability, and performance of critical services.
- Clearly communicate your availability to both the team and your manager.
- Automate manual workflows, investigate incidents thoroughly, and lead blameless post-mortems for continuous learning.
- Use standardized telemetry data to improve alert management, incident analysis, decision-making, and system optimization.
- Support planned changes by managing deployments, monitoring systems post-deployment, and creating or updating dashboards and alerts as needed.
- Develop new services, enhance existing ones, and deploy tools that automate the support of systems and services.
- Meet and uphold organizational Service Level Objectives (SLOs) consistently.
- Create value-focused deliverables including Standard Operating Procedures (SOPs), presentations, case studies, and accelerators.
Skills and Experience
- 5+ years of experience in software development, technical operations, and managing large-scale application environments.
- 5+ years in Service Engineering, IT Support, or Production Operations.
- 5+ years of hands-on experience with Java application development and support, including knowledge of Spring and Hibernate frameworks.
- 4+ years of experience setting up and debugging Apache Spark jobs, with a solid understanding of data processing, cleansing, and integrity validation.
- 3+ years of hands-on experience writing and maintaining Unix shell scripts.
- Working knowledge of Microsoft Azure, Azure Cosmos DB, Azure Synapse Analytics, and Apache Kafka is preferred.
- Creative problem-solving skills for resolving cross-functional technical challenges in dynamic, fast-changing environments.
- Effective communication skills, with the ability to take ownership of triage calls and drive critical incidents to logical closure.
- Willingness to work in rotational shifts as required.