We're looking for problem solvers, innovators, and dreamers who are searching for anything but business as usual. Like us, you're a high performer who's an expert at your craft, constantly challenging the status quo. You value inclusivity and want to join a culture that empowers you to show up as your authentic self. You know that success hinges on commitment, that our differences make us stronger, and that the finish line is always sweeter when the whole team crosses together.
The Systems Engineer (Platform Reliability) is responsible for ensuring the reliability, scalability, and operational excellence of enterprise platforms and server systems by replacing manual operational tasks with automated, event-driven workflows.
Success in this role is measured by the effectiveness of automation delivered, the depth of platform integrations, and the ability to enable internal teams and customers to do their best work using reliable, secure, and observable platforms, tools, and services.
Key Responsibilities
- Design, build, and maintain automated workflows using Ansible Automation Platform (AAP) and Event-Driven Ansible (EDA)
- Implement standardized, automated system remediation and support processes triggered by CloudWatch and ServiceNow events
- Ensure operating system and application patch compliance using Automox policies and custom Worklets
- Protect enterprise data through automated backup validation, reporting, configuration, and disaster recovery readiness using Rubrik Security Cloud
- Own the end-to-end Alteryx Server lifecycle, including:
  - Image-based worker node creation
  - Infrastructure and platform upgrades
  - Scalability, reliability, and observability
- Develop and maintain observability pipelines and incident flows across CloudWatch, PagerDuty, StatusPage, and ServiceNow
- Manage and secure secrets using platforms such as 1Password and AWS Secrets Manager
- Administer and support Windows and Linux server environments
- Manage DNS, public domain registrations, and SSL certificate lifecycle
- Operate using GitOps practices, managing repositories, CI/CD pipelines, automation code, and documentation (AI-assisted where applicable)
- Participate in a rotating on-call schedule to support after-hours operational incidents
Required Skills & Experience
- 7+ years of experience administering cloud infrastructure, services, servers, and networking (AWS preferred)
- Advanced hands-on experience with Red Hat Ansible Automation Platform, including workflows, job templates, projects, and Event-Driven Ansible
- Hands-on experience with platforms such as Rubrik Cloud Protection, PagerDuty, StatusPage, 1Password, AWS Secrets Manager, and Route 53
- Experience managing SSL certificate lifecycle, Microsoft DNS, and external DNS providers (e.g., GoDaddy, Network Solutions)
- Experience incorporating AI-driven capabilities into automation workflows (API-based preferred)
- Strong scripting and development skills using PowerShell, Bash, Python, and JavaScript
- Solid understanding of GitOps, Agile methodologies, and cloud networking fundamentals
- Experience with Alteryx products, Automox, Kubernetes/EKS, or Docker is a plus
Education & Professional Attributes
- Bachelor's degree in Computer Science, Information Technology, or equivalent practical experience
- Independent, proactive problem-solver with a strong sense of ownership
- Clear, concise communicator and collaborative team member
- Strong commitment to continuous improvement of self, team, and the platforms, tools, and services delivered
Key Platforms & Technologies
Automation & Configuration
- Ansible Automation Platform (AAP): Workflows, Job Templates, Projects
- Event-Driven Ansible (EDA)
Endpoint & Patch Management
- Automox: Agent deployment, patch compliance, custom Worklets
Cloud Infrastructure
- Amazon Web Services (AWS): primary
- Google Cloud Platform (GCP): secondary
Observability & Incident Management
- Amazon CloudWatch: Metrics, alerts, dashboards, triage automation
- PagerDuty, StatusPage, ServiceNow
Data Protection & Resiliency
- Rubrik Security Cloud: Policy validation, backup integrity, DR readiness, and testing
Application Platform
- Alteryx Server: Infrastructure provisioning, health monitoring, version upgrades, image-based worker node automation
Security, Secrets & Networking
- 1Password, AWS Secrets Manager, AppViewX
- Route 53, GoDaddy
- DNS management, secret rotation, and certificate lifecycle automation
Find yourself checking a lot of these boxes but doubting whether you should apply? At Alteryx, we support a growth mindset for our associates through all stages of their careers. If you meet some of the requirements and you share our values, we encourage you to apply. As part of our ongoing commitment to a diverse, equitable, and inclusive workplace, we're invested in building teams with a wide variety of backgrounds, identities, and experiences.
This position involves access to software/technology that is subject to U.S. export controls. Any job offer made will be contingent upon the applicant's capacity to serve in compliance with U.S. export controls.