Sr Performance Test & Site Reliability Engineer, Core Technology Infrastructure

Charlotte, North Carolina

Job Description:

Team leadership experience in performance and capacity, with a key role in the Dynatrace conversion in support of NvD/Resiliency charters. Requires experience in root cause analysis, tuning and optimization, and end-to-end visibility of application flows.

Experience and subject matter expertise in:

  • SRE (Site Reliability Engineering)

  • Performance & Capacity

  • Development / Coding / Business Acumen

  • Architecture

  • Soft / Interpersonal Skills – Proactive Partner, Consultant, Mentor

  • Tuning / Optimization

Role Responsibilities

  • Address and improve the performance, availability, latency, efficiency, monitoring, troubleshooting, and planning of production software and services.

  • Work directly with engineering, design, architecture, development, and operations partners to ensure proper service design decisions, documentation, tooling, CI/CD automation, performance & capacity, and production standard practices.

  • Assist in architectural decisions within domain expertise, from both a technical and a business perspective.

  • Consult with project teams to ensure emerging software conforms to NFRs for availability, security, maintainability, and performance.

  • Partner with operations / production support partners to ensure delivery and deployment pipelines run smoothly, and to pinpoint, troubleshoot, and remediate performance, capacity, and resiliency issues in LLE and Production environments.

  • Proactively identify and resolve risks to services and systems before they become issues.

  • Develop the software, tools, and processes (e.g., data collection and extensive monitoring) needed to maintain systems and services and their availability, performance, and resiliency.

  • Capture and analyze major metrics, such as availability, mean time between failures, and mean time to repair, and develop new metrics and KPIs as necessary. Add these metrics to monitoring dashboards and reporting systems.

  • Use detailed monitoring to improve the availability and performance of applications, services, systems, and infrastructure. Create new alerts to find anomalies and understand the root cause of system failures.

  • Create and deploy automation, alerting, self-healing architectures, and other technologies to make the environment more maintainable.

  • Monitor, manage, and troubleshoot regular processes to improve workflows.

  • Create and maintain documentation for processes, automation, infrastructure, resources, and services.

  • Act as a subject matter expert and coach to mentor developers, testers, and engineers, as well as assist junior developers with software performance, troubleshooting, and debugging.

Required Skills

Senior Performance Engineer (with a strong SRE slant)

Role Requirements

Performance Engineering

  • Strong performance engineering experience (performance planning & strategy throughout the full SDLC, application performance management, tuning and optimization, troubleshooting/triage, performance analysis, critical thinking, the ability to see the bigger picture, and an understanding of performance beyond testing).

  • Exhibits exemplary triage ownership / leadership ability (ability to own issue resolution, direct triage, drive participation and discussion, execute crisp and effective communications, strong organizational skills).

  • Solid working knowledge and understanding of software development, software system platforms, and architectural concepts (Java/JEE, application performance management, manual/automated code review, code analysis, architectural/enterprise integration patterns, 3-tier architecture, anti-pattern detection).

  • Skilled in performance testing tools, methodologies, and deliverables (experience leading work efforts in all aspects of performance testing including requirements, planning, complex scripting, test harness design, test execution, results analysis).

SRE (Site Reliability Engineering)

  • The rapid and iterative process of modern Agile development leaves a reliability gap – teams can end up deploying new, yet unreliable, services at a quick pace.

  • The Site Reliability Engineer capability builds and implements quality software that enhances the reliability, repeatability, and flexibility of production services and systems in a DevOps environment.

  • The SRE creates a bridge between development and operations by applying a software engineering mindset to system administration topics.

  • The SRE maintains a unique blend of development and operational focus.

      o SREs all share a set of basic responsibilities for the service(s) they support and adhere to the same core tenets.

      o In general, the SRE is responsible for the availability, latency, performance, efficiency, change management, monitoring, emergency response, and capacity planning of their service(s).

  • The SRE must maintain the skill set necessary to write software to replace previously manual work, even when the solution is complicated.

Desired Skills

  • Nice-to-have:

  • MicroFocus LoadRunner

  • Parasoft or CA Virtualize / iTKO LISA (used in a Service Virtualization capacity)

  • IBM MQ

  • Microservices, Containers & Orchestration (e.g., Docker, OpenShift, Kubernetes)

  • Mainframe – z/OS, CICS, IMS, DB2

  • Oracle

      o OEM/AWR

      o SQL Developer or TOAD

  • Cassandra

  • DevCenter

  • OpsCenter

  • JBoss, Apache Tomcat

  • Wireshark or comparable network packet analysis tool

Core Technology Infrastructure Organization:

  • Strives to bring new thoughts and ideas to teams in order to drive innovation and unique solutions

  • Excels in working among diverse viewpoints to determine the best path forward

  • Experience in connecting with a diverse set of clients to understand future business needs – is a continuous learner

  • Commitment to challenging the status quo and promoting positive change

  • Participate in and drive collaborative efforts to advance tools, technology, and ways of working to better serve an evolving client base

  • Believes in the value of diversity so we can reflect, connect, and meet the diverse needs of our clients and employees around the world

The Sr Performance Test & Site Reliability Engineer will support systems programming and data center capabilities. Responsible for components of highly complex engineering and/or analytical tasks and activities. Establishes input/output processes and working parameters for hardware/software compatibility, coordination of subsystems design, and integration of the total system. Viewed as a technology subject matter expert; able to provide and communicate complex technology solutions across differing audiences including technical, managerial, business executives, and/or vendors. Will have responsibility for multiple, complex projects; will direct activities of teams related to special initiatives or operations and may have direct reports. Leads the resolution process for complex problems where analysis of situations or data requires an in-depth evaluation of various factors. Exercises judgment within broadly defined practices and policies in selecting methods, techniques, and evaluation criteria for obtaining results. Information Technology degree and/or technology certifications preferred, or substantial equivalent experience. Typically 7-10 years of IT experience.

Job Band:

H5

Shift: 

1st shift (United States of America)

Hours Per Week:

40

Weekly Schedule:

Referral Bonus Amount:

0

Learn more about this role

Full time

JR-21055429

Band: H5

Manages People: No

Travel: No

Manager:

Talent Acquisition Contact: Taimour Khan

Referral Bonus: 0

Street Address

Primary Location: 800 W TRADE ST, NC, Charlotte, 28255