Cloud Senior Site Reliability Engineer

Charlotte, North Carolina

Job Description:

About Us

At Bank of America, we are guided by a common purpose to help make financial lives better through the power of every connection.  Responsible Growth is how we run our company and how we deliver for our clients, teammates, communities, and shareholders every day.

One of the keys to driving Responsible Growth is being a great place to work for our teammates around the world. We’re devoted to being a diverse and inclusive workplace for everyone. We hire individuals with a broad range of backgrounds and experiences and invest heavily in our teammates and their families by offering competitive benefits to support their physical, emotional, and financial well-being.

Bank of America believes both in the importance of working together and offering flexibility to our employees. We use a multi-faceted approach for flexibility, depending on the various roles in our organization.

Working at Bank of America will give you a great career with opportunities to learn, grow and make an impact, along with the power to make a difference.  Join us!

About Global Technology:

Global Technology delivers technology services globally across the bank’s eight lines of business that serve individuals, companies, and institutions. The team also focuses on digital banking, payments, infrastructure, data management and technology that enhances cyber security, and risk and capital management. Innovation is at the heart of all Global Technology does.

Enterprise Cloud Platforms Team:

The Enterprise Cloud Platforms team in the CTO organization offers Private and Public Cloud platforms that help Bank of America’s developers achieve faster time-to-market, innovate with private and public cloud capabilities, and reduce complexity with built-in integrations. We believe in a high-quality engineering culture: we engineer our platforms with a customer and platform mindset, design for large enterprise scale and resilience, and accelerate market innovation into the technical platforms we deliver.

As part of this team, you will have a large impact on the evolution of next generation Cloud services for Bank of America and explore an extensive list of new technologies that will drive innovation across our company.

We are seeking an experienced Senior Cloud Site Reliability Engineer (SRE) to support and administer our Public Cloud (Azure/AWS/Google) and Containers (OpenShift) platforms.

Our Cloud Service Reliability Engineers (cSREs) ensure that our Cloud services meet the reliability and uptime requirements of our demanding enterprise customers. This is achieved through the best engineering practices, resilient design, and a well-defined, effective global on-call rotation that runs 24x7.

The role provides an opportunity to work with a wide range of technologies and offers a unique perspective on how various services (on-prem/external) interact with each other. You will work with colleagues who are as smart, hardworking, and driven as you. You will get an opportunity to work in a team that keeps growing and innovating, and that gives you room to be proactive and creative.

Are you ready for the next step in your career? Then we’d love to hear from you!

Position Summary

  • Be responsible for reliability and support of the Cloud Platform, including Public Cloud (Azure/AWS/Google) services.
  • Monitor and troubleshoot performance, connectivity, and security issues in Azure/AWS/Google environments.
  • Perform deep dives into systemic and latent reliability issues, incident management, and problem management.
  • Identify, analyze, and resolve infrastructure vulnerabilities and application deployment issues.
  • Perform blameless root cause analysis (RCA) and partner with engineering and operations teams across the organization to roll out fixes.
  • Identify and drive opportunities to improve automation for the cloud services; scope and create automation for deployment, management, and visibility of our services.
  • Evaluate and automate the scaling and capacity requirements within Azure environments.
  • Engage with engineering teams throughout the full lifecycle, from design and engineering through deployment and operations.
  • Partner with risk and compliance teams to bring visibility and implement the right controls and policies in the Cloud Platform.
  • Ensure resiliency during implementation, and identify and fix resiliency problems by collaborating with engineering teams.
  • Be a key stakeholder in the design of cloud services and work with architecture, engineering, and product teams.
  • Participate in 24x7 on-call coverage under a follow-the-sun model.

Required Skills:

  • BS/MS degree in Computer Science or a related technical field involving systems, or equivalent practical experience.
  • Minimum of 8 years of hands-on experience maintaining cloud platforms on a major cloud service provider.
  • Experience working on Azure operations and administration.
  • Azure/Terraform/AWS/Google certifications are a plus.
  • Strong experience in implementing, monitoring, and maintaining Microsoft Azure solutions, including major services related to Compute, Storage, Network, and Security.
  • Experience with monitoring tools such as Prometheus or Dynatrace, as well as cloud-native tools like Azure Monitor and Log Analytics.
  • Understanding of cost management, inventory management, and the FinOps model.
  • Strong understanding of and background in working with a complex IAM infrastructure, including Active Directory, Azure AD Connect, Azure AD, and PingIdentity, Okta, or other SSO solutions.
  • Advanced knowledge of DNS, DHCP, Kerberos, and Windows Authentication.
  • Experience with IaC using Terraform.
  • Python, Ansible, and shell scripting.
  • Experience with CI/CD tools such as Git and Jenkins, and familiarity with using a GitOps model.
  • Excellent understanding of Linux/Windows operating systems administration.
  • Systematic problem-solving approach, sense of ownership, and drive.
  • Excellent interpersonal, organizational, and communication (written, verbal, and presentation) skills are a must.

Desired Skills

  • Experience with Terraform and Ansible.
  • Experience working in a highly available multi-datacenter environment.
  • Proven ability to work independently with minimal supervision and as part of a team with direct responsibilities.
  • Ability to juggle competing priorities and adapt to changes in project scope.

Job Band:

H4

Shift: 

1st shift (United States of America)

Hours Per Week:

40

Weekly Schedule:

Referral Bonus Amount:

0

Learn more about this role

Full time

JR-23008959

Band: H4

Manages People: No

Travel: Yes, 5% of the time

Manager:

Talent Acquisition Contact:

Geeta Upadhye

Referral Bonus:

0

Jersey City pay and benefits information

Jersey City pay range:

$147,900 - $181,600 annualized salary, offers to be determined based on experience, education and skill set.

Discretionary incentive eligible

This role is eligible to participate in the annual discretionary plan. Employees are eligible for an annual discretionary award based on their overall individual performance results and behaviors, the performance and contributions of their line of business and/or group, and the overall success of the Company.

Benefits

This role is currently benefits eligible. We provide industry-leading benefits, resources and support to our employees so they can make a genuine impact and contribute to the sustainable growth of our business and the communities we serve.