Data Platform and Data SRE - Senior Architect

10-14 Years
  • Posted 2 days ago

Job Description

Platform Architecture and Design

  • Design and architect scalable, fault-tolerant data platforms using technologies such as Snowflake, Databricks, and cloud-native services
  • Establish architectural patterns that ensure high availability and resiliency across data systems
  • Develop technical roadmaps for platform evolution with reliability as a core principle

Reliability Engineering

  • Implement comprehensive SLA/SLO frameworks for data services
  • Design and execute chaos engineering experiments to identify and address potential failure modes
  • Create automated recovery mechanisms for critical data pipelines and services
  • Establish incident management processes and runbooks

Monitoring and Observability

  • Develop advanced monitoring solutions, including LLM-powered anomaly detection
  • Design comprehensive observability strategies across the data ecosystem
  • Implement proactive alerting systems to identify issues before they impact users
  • Create dashboards and visualization tools for reliability metrics

Data Quality and Governance

  • Establish data quality monitoring processes and tools
  • Implement data lineage tracking mechanisms
  • Develop automated validation protocols for data integrity
  • Collaborate with data governance teams to ensure compliance with policies

Innovation and Improvement

  • Research and implement AI/ML approaches to improve platform reliability
  • Lead continuous improvement initiatives for data infrastructure
  • Mentor team members on reliability engineering best practices
  • Stay current with emerging technologies and reliability patterns in the data platform space

Qualifications

  • 10+ years of experience in data platform engineering or related fields
  • Proven expertise with enterprise data platforms (Snowflake, Databricks, etc.)
  • Strong background in reliability engineering, SRE practices, or similar disciplines
  • Experience implementing data quality monitoring frameworks
  • Knowledge of AI/ML applications for system monitoring and reliability
  • Excellent communication skills and the ability to explain technical concepts to diverse stakeholders

More Info

Open to candidates from: Indian

Job ID: 118201129