Principal Software Engineer - Network Reliability Engineering - AI/ML

8-10 Years

This job is no longer accepting applications

  • Posted 4 months ago

Job Description

Responsibilities:

  • Architect, build, and support distributed systems for process control and execution based on Product Requirement Documents (PRDs).
  • Develop and sustain DevOps tooling, new product process integrations, and automated testing.
  • Develop ML in Python 3; build backend services in Go (Golang); create command-line interface (CLI) tools in Rust or Python 3; and integrate with other services as needed using Go, Python 3, or C.
  • Build and maintain schemas/models to ensure every platform and service write is captured for monitoring, debugging, and compliance.
  • Build and maintain dashboards that monitor the quality and effectiveness of service execution for the process as code your team delivers.
  • Build automated systems that route code failures to the appropriate oncall engineers and service owners.
  • Ensure high availability, reliability, and performance of developed solutions in production environments.
  • Support serverless workflow development for workflows that call and utilize the above-mentioned services to support our GNOC, GNRE, and onsite operations and hardware support teams.
  • Participate in code reviews, mentor peers, and help build a culture of engineering excellence.
  • Operate in an Extreme Programming (XP) asynchronous environment (chat/tasks) without daily standups, and keep work visible by continuously updating task and ticket states in Jira.

Required Qualifications:

  • 8-10 years of experience in process as code, software engineering, automation development, or similar roles
  • Bachelor's degree in Computer Science and Engineering or a related engineering field
  • Strong coding skills in Go and Python 3
  • Experience with distributed systems, micro-services, and cloud-native technologies
  • Proficiency in Linux environments and scripting languages
  • Proficiency with database creation, maintenance, and coding using SQL and Go or Python 3 libraries
  • Understanding of network operations or large-scale IT infrastructure
  • Excellent problem-solving, organizational, and communication skills
  • Experience using AI coding assistants or AI-powered tools to help accelerate software development, including code generation, code review, or debugging.

Preferred Qualifications:

  • Process engineering experience (control systems, proportional-integral-derivative (PID) control, statistical process control (SPC))
  • Proficiency with data modeling, data analysis, and reporting frameworks (e.g., SQL, Spark, Prometheus, Grafana, etc.)
  • Experience with C, C++, Java, or Rust
  • Experience developing automation and tools for network or cloud operations at scale
  • Background in creating dashboards, alerts, and real-time reporting platforms
  • Familiarity with workflow automation (e.g., Apache Airflow), CI/CD pipelines, or infrastructure as code
  • Previous experience supporting or building tools for any hyperscale or at-scale cloud network, compute, or storage operations
  • Knowledge of REST APIs, remote procedure calls (RPCs), and service-oriented architectures (SOA)
  • Familiarity with Extreme Programming (XP), Agile, and DevOps processes
  • Experience with ticketing and version control systems (e.g., Jira, Git)

More Info

Open to candidates from:
Indian

About Company

We’re a cloud technology company that provides organizations around the world with computing infrastructure and software to help them innovate, unlock efficiencies and become more effective. We also created the world’s first – and only – autonomous database to help organize and secure our customers’ data. Oracle Cloud Infrastructure offers higher performance, security, and cost savings. It is designed so businesses can move workloads easily from on-premises systems to the cloud, and between cloud and on-premises and other clouds. Oracle Cloud applications provide business leaders with modern applications that help them innovate, attain sustainable growth, and become more resilient. The work we do is not only transforming the world of business--it's helping defend governments, and advance scientific and medical research. From nonprofits to companies of all sizes, millions of people use our tools to streamline supply chains, make HR more human, quickly pivot to a new financial plan, and connect data and people around the world.

Job ID: 130146401