Infrastructure Engineer II

Bank of America

Job Description

  • Provide SRE support for container platforms, applying SRE knowledge to identify potential gaps in the observability design or implementation.
  • Work with clients and application development teams to onboard applications and integrate them with the CI/CD platform.
  • Provide technical expertise to configure, deploy, and support Bank workloads so that they run and operate securely on container infrastructure (K8s/Red Hat OpenShift/AKS).
  • Engineer new capabilities for the OpenShift/container platforms and deliver those capabilities in a fully automated and supportable fashion.
  • Implement cluster services to manage on-prem bare-metal OpenShift cluster deployments and off-prem deployments.
  • Work with monitoring tools and Application Development teams to enhance monitoring capabilities and modify monitoring dashboards for new observability plans created in support of initiatives or continuous improvement efforts.
  • Develop software or system scripts to simplify or eliminate the dependence on human intervention for recurring tasks.
  • Work with Production Support teams to perform knowledge transfer, playbook updates and training for new monitoring capabilities.
  • Identify vulnerabilities and opportunities for reliability improvement, for example by investigating low-level error rates and noise in monitoring, and help define solutions to improve system reliability.
  • Develop and maintain a catalog of extensible reliability scripts, tools, and libraries that can be leveraged for common instrumentation, automation and operational needs.
Requirements
Education: B.E. / B. Tech / M.E. / M. Tech / MCA
Certifications If Any: N/A
Experience Range: 8 to 10 years
Foundational Skills
  • Experience as a Site Reliability Engineer within large, multinational organizations, preferably implementing new technologies, with a proven track record of success.
  • Demonstrated ability to design and develop significant components within an application.
  • Expertise in supporting container production environments (K8s/Red Hat OpenShift) and the associated maintenance, change control, incident, and problem management.
  • Strong experience in Linux administration, programming experience in at least one language (Python, Shell scripting, Java, etc.), and experience with cloud-native technologies.
  • Strong experience in onboarding applications to container and multi-cloud platforms - Azure, AWS, GCP, IBM Cloud.
  • Strong experience in infrastructure automation using any of Terraform/Packer, Ansible, or Python.
  • Understanding of, or exposure to, Agile as well as ITSM incident/change/request management processes.
  • Experience implementing platform resiliency, self-healing, health compliance dashboards, and automation of day-to-day operational tasks across hybrid cloud for enterprise-class, production-grade environments is desired.
  • Experience in PaaS logging, monitoring, and observability tools such as ELK, FluentD, Prometheus, Splunk, Nagios, Datadog, etc.
  • Experience in building large-scale distributed enterprise platforms with a focus on performance, scale, security, and reliability.
  • Self-motivated and results oriented with excellent analytical, problem solving, interpersonal, presentation and communication skills.
  • Ability to operate in a fast-paced environment with multiple concurrent priorities.
Desired Skills
  • Experience in designing, analyzing, and troubleshooting large-scale distributed systems, and a good understanding of multi-vendor cloud offerings.
  • Experience in cloud-native network, storage, and virtualization technologies.
  • Experience in DevOps and GitOps models with IaaS, Config-as-Code, Policy-as-Code, and CI/CD tools - Bitbucket, JFrog, Jenkins, Artifactory, Ansible.
  • Experience with modern performance monitoring and diagnostics tools (examples: Splunk, Splunk ITSI, AppD, Dynatrace, SolarWinds, etc.)
  • Understand relevant application technologies and development life cycles.
  • Operational Process Routines: Strong adherence to operating controls, risk management, process review and creation, documentation and collaborative knowledge sharing.
  • Interpersonal and communication skills.
  • Red Hat OpenShift/Kubernetes certifications; cloud (Azure/AWS/GCP) certifications.
  • Ability to use qualitative and quantitative analytical skills to assess the effectiveness of operations, manage competing priorities, and adapt to changes in project scope.
  • Proven ability to work independently with minimal supervision and as part of a team with direct responsibilities.
Last Updated: 21-07-2024 09:31:48 AM