Cloud Reliability Engineer - Core Technology Infrastructure

Charlotte, North Carolina

Job Description:

  • Responsible for the reliability and support of Internal Cloud, Public Cloud (Azure/IBM), and OpenShift container (Docker) services.
  • Maintain services once they are live by measuring and monitoring availability, latency, and overall system health.
  • Troubleshoot issues across the entire stack: hardware, software, application, and network
  • Perform deep dives into both systemic and latent reliability issues; perform blameless root cause analysis (RCA) and partner with engineering and operations teams across the organization to roll out fixes.
  • Drive standardization efforts across multiple disciplines and services in conjunction with embedded SREs throughout the organization.
  • Identify and drive opportunities to improve automation for the cloud services
  • Provide on-call coverage as per rotation
  • Be a key stakeholder in the design of cloud services so that they are resilient from day 0, and identify and fix resiliency problems by collaborating with product teams

Required Qualifications

  • BS/MS degree in Computer Science or a related technical field involving systems, or equivalent practical experience.
  • Minimum of 6 years of hands-on experience maintaining infrastructure services
  • Excellent understanding of Linux/Windows operating system administration
  • Experience with VMware, Azure cloud, OpenShift, Docker, and Kubernetes
  • Experience with automation in one or more of the following: Python, Java, Ansible, and shell scripting, plus source control (Git)
  • Experience with SQL/NoSQL databases such as MySQL and MongoDB, and with CI/CD tools such as Git/Jenkins
  • Experience with Ansible Tower, Red Hat Satellite, Foreman, and capsule architecture is a plus.
  • Experience with HashiCorp Vault/Terraform/Consul/Nomad is a plus.
  • Systematic problem-solving approach, sense of ownership and drive
  • Excellent interpersonal, organizational and communication (written, verbal, and presentation) skills are a must.
  • Proven ability to work independently with minimal supervision and as part of a team with direct responsibilities.

Job Band:

H5

Shift: 

1st shift (United States of America)

Hours Per Week:

40

Weekly Schedule:

Referral Bonus Amount:

0

Learn more about this role

Full time

JR-21049595

Band: H5

Manages People: No

Travel: No

Manager:

Talent Acquisition Contact:

Angela Kathmann

Referral Bonus:

0