We are seeking a Senior Cloud Network Engineer to build, automate, and maintain secure network infrastructure. This is a high-execution role focused on Infrastructure as Code. You will be responsible for the actual delivery and lifecycle of cloud networking and security components using Terraform.
Key Responsibilities:
- Write and Maintain Production-Grade IaC: Develop and maintain modular Terraform code to manage the entire networking lifecycle, including cloud-native constructs (VPCs/VNets, TGW, Direct Connect/ExpressRoute, Route Tables, NACLs/NSGs/SGs) and third-party appliance deployment.
- Palo Alto VM-Series Automation: Hands-on responsibility for the automated bootstrapping and deployment of VM-Series firewalls (managing init-cfg, licenses, and software versions via S3/Azure Storage).
- Autoscaling & Resilience: Implement and manage Auto Scaling Groups (AWS) or Virtual Machine Scale Sets (Azure) for firewalls, including integration with Gateway Load Balancer (GWLB) and management of lifecycle hooks.
- Multi-Cloud Expansion: Standardize networking patterns across AWS and Azure, with the opportunity to apply these skills to GCP, OCI, and Alibaba Cloud environments.
- Automated Policy Enforcement: Use cloud-native tooling (AWS Firewall Manager, Azure Policy, Microsoft Defender for Cloud) to centrally manage and enforce network security policies across all accounts and VPCs/VNets, ensuring consistent security group rules and WAF configurations.
- Compliance-as-Code & Monitoring: Implement and manage AWS Config rules and custom Lambda checks to continuously monitor network state. You will be responsible for building automated remediation for non-compliant resources (e.g., auto-applying the default SG to ALBs).
- Guardrail Implementation: Develop and deploy Service Control Policies (SCPs) and IAM boundaries to prevent shadow networking and ensure all deployments adhere to the organizational security baseline.
- Documentation: Create and maintain detailed network documentation, including topology diagrams, configuration standards, and operational procedures.
- Tier-3 Forensic Troubleshooting: Act as the final escalation point for complex cloud/hybrid network failures. You must be able to perform packet-level analysis (tcpdump/Wireshark) and use cloud-native observability (Flow Logs, Reachability Analyzer) to conduct data-driven Root Cause Analysis (RCA).
Person Specification/Requirements:
Education & Experience:
- 8+ Years Engineering: Must have a background in heavy-duty network engineering, with the last 3+ years dedicated to writing IaC (Terraform/HCL).
Technical Skills:
- Strong Terraform/IaC Proficiency: Ability to write reusable, DRY, and version-controlled modules. Deep understanding of state management and providers.
- Automated Firewall Specialist: Proven experience bootstrapping virtual appliances and managing stateful firewall clusters in an autoscaling environment.
- Python/Scripting: Proficiency in Python for interacting with Cloud APIs (Boto3) and automating tasks that IaC cannot handle alone.
- Cloud Fluency: Deep expertise in AWS and Azure; experience with GCP, OCI, or Alibaba Cloud is a significant plus.
- Version Control Proficiency: Comfortable using Git for daily work. You should understand how to manage your own branches, commit clean code, and participate in the Pull Request (PR) process for peer reviews.
- Collaborative Automation: Experience working in a team environment where network changes are tracked in a repository rather than performed manually in a console.
- Compliance Tooling: Hands-on experience with AWS Config, AWS Firewall Manager, and AWS Security Hub (or Azure equivalents like Azure Policy and Microsoft Defender for Cloud).
- Automated Remediation: Ability to write Python/Lambda functions or use Systems Manager (SSM) documents to automatically fix out-of-compliance network resources.
- Policy Auditing: Experience translating regulatory requirements (e.g., NIST) into automated technical checks within a cloud environment.
- Automate-First Mindset: A strong aversion to manual point-and-click configuration (ClickOps). You naturally look for ways to turn repetitive tasks into code.
- Engineering Rigor: A disciplined approach to changes. You believe that if a change isn't in Git, it didn't happen. You value peer code reviews and understand that "done" includes testing and documentation.
- Operational Empathy: You write code and documentation that your future self (and your teammates) can understand at 3:00 AM during a production incident.
- Pragmatic Problem Solving: The ability to balance perfect automation with the immediate needs of the business, knowing when to build a robust module versus a quick remediation script.
- Collaborative Mindset: Willingness to mentor others in IaC best practices and participate in a blameless post-mortem culture when automation fails.
Certifications (Preferred):
- AWS Certified Advanced Networking – Specialty
- Azure Network Engineer Associate
- Google Professional Cloud Network Engineer
- OCI Cloud Operations Associate
- ACP Cloud Networking
- CCNP/CCIE
- Palo Alto Network Security Professional
- Terraform Associate