Key Responsibilities (Roles & Responsibilities) L3 Technical Lead (Azure + GCP + Windows)
1) Client Ownership & Service Delivery
- Act as the primary L3 technical lead for the client and ensure end-to-end delivery of Azure, GCP, and Windows managed services.
- Own SLA adherence, service health, operational KPIs, and incident/problem outcomes across both clouds.
- Lead client governance: daily/weekly status calls, monthly service reviews and reporting, Continuous service improvement plans.
- Understand client expectations, translate them into actionable technical deliverables (roadmap/backlog), and ensure timely execution with measurable outcomes.
- Ensure effective stakeholder management: clear communication, expectation alignment, risk visibility, and proactive advisories to improve customer confidence.
2) Technical Leadership (Azure + GCP + Windows)
- Provide L3 escalation support for complex Azure, GCP, and Windows issues across compute, and platform services.
- Drive design, implementation, and optimization of:
Azure (IaaS/PaaS)
- Azure VMs, Scale Sets, Storage (Blob/Files/Disks), VNets, Load Balancer/App Gateway, NSG/ASG, Private Endpoints, DNS.
- Backup/DR (Azure Backup, ASR), Monitoring (Azure Monitor, Log Analytics, Alerts), Automation (Runbooks/Functions), Governance (Policy/Blueprints where applicable).
GCP (IaaS/PaaS)
- Compute Engine, Instance Groups, Cloud Storage, VPC, Cloud VPN/Interconnect, Load Balancing, Cloud DNS, Firewall rules, Private Service Connect (where applicable).
- IAM, Organization policies, Resource hierarchy (Org/Folders/Projects), service accounts, key management (Cloud KMS), Secret Manager.
- Observability & operations: Cloud Monitoring, Cloud Logging, alert policies, log-based metrics, uptime checks; integration to SIEM (e.g., Splunk) when required.
Windows / Microsoft Platform
- Windows Server administration: AD DS, DNS, GPO, CA/certificates (as applicable), Build servers, Migrate to updated OS versions, IIS, patching, performance tuning, hardening, troubleshooting.
- Security baselines and vulnerability remediation support (OS and platform-level).
- Perform deep root cause analysis (RCA) for major incidents and ensure preventive controls are implemented (monitoring, automation, configuration hardening).
- Lead platform improvements: standardization, automation, hardening, monitoring improvements, reliability enhancements, and cross-cloud best practices.
3) Team Management & Leadership
- Manage engineers (L1/L2/L3 as applicable) under the client account across Azure + GCP + Windows scope: task allocation, mentoring, quality reviews, and capability building.
- Ensure shift/roster coverage, on-call effectiveness, timely ticket handling, and consistent quality output.
- Drive operational discipline: runbooks, SOPs, KB articles, handovers, cross-training, and continuous skill uplift (cloud + Windows + Security).
- Coach the team on incident handling, change hygiene, documentation standards, and customer communication etiquette.
4) Incident, Problem, Change & Risk Management
- Lead Major Incident Management (bridge calls, triage, comms, workaround, resolution, post-incident review) across Azure/GCP/Windows.
- Own Problem Management: trend analysis, recurring incidents, permanent fixes, Known Errors and preventive action tracking.
- Ensure effective Change Management: risk assessment, CAB support, rollout/backout plans, pre/post validation, and change documentation for both clouds and OS changes.
- Identify risks proactively and implement mitigation actions:
- Availability & resilience (multi-zone/region strategy where applicable)
- Security & compliance (policy enforcement, IAM hygiene, key/secret controls)
- Performance & capacity (scale patterns, bottleneck fixes)
- Operational risk (single points of failure, process gaps, skill gaps)
5) Consultation & Advisory
- Provide consultative support to client stakeholders:
- Architecture guidance, best practices, migration/modernization suggestions across Azure and GCP
- Security controls, governance, compliance alignment (RBAC/IAM, policies, logging, encryption)
- Backup/DR strategy, resilience recommendations, RTO/RPO alignment
- Participate in technical discussions, solution workshops, and roadmap planning.
- Propose and drive improvements: automation opportunities, tooling enhancements, operational maturity uplift, and reliability engineering practices.
6) Documentation & Process Excellence
- Maintain and continuously improve:
- Architecture diagrams (Azure + GCP + Windows)
- SOPs/runbooks, escalation matrix, support model, service catalog, operational reports (weekly/monthly)
- Configuration standards, hardening checklists, monitoring & alerting catalog, backup/DR runbooks
- Drive continuous improvement initiatives:
- Automation (scripts, runbooks, functions, pipelines)
- Monitoring enhancements (signal-to-noise tuning, SLO-based alerting)
- Standard configurations and policy enforcement (Azure Policy / GCP Org Policies)
- Audit readiness support (access reviews, logging evidence, compliance reporting)