Job Description:
At Bank of America, we are guided by a common purpose to help make financial lives better through the power of every connection. We do this by driving Responsible Growth and delivering for our clients, teammates, communities and shareholders every day.
Being a Great Place to Work is core to how we drive Responsible Growth. This includes our commitment to being an inclusive workplace, attracting and developing exceptional talent, supporting our teammates’ physical, emotional, and financial wellness, recognizing and rewarding performance, and how we make an impact in the communities we serve.
Bank of America is committed to an in-office culture with specific requirements for office-based attendance, while allowing an appropriate level of flexibility for our teammates and businesses based on role-specific considerations.
At Bank of America, you can build a successful career with opportunities to learn, grow, and make an impact. Join us!
This job is responsible for defining an architectural vision and solution that supports the strategic outcomes of the Business' Products and Services. Key responsibilities include defining the target operating environment, designing for client resiliency, assisting with solution design, and defining non-functional requirements. Job expectations include working with stakeholders and service providers aligned to the Business' strategic objectives, evaluating the impact of strategic design decisions, and contributing to the architecture roadmap.
Responsibilities:
- Works across the business, operations and technology to create the solution intent and architectural vision for complex solutions and prioritize functional and non-functional requirements into a technology backlog to enable the technology roadmap and functionality to support evolving capabilities and services
- Contributes to the creation of the architecture roadmap of defined domains (Business, Application, Data, and Technology) in support of the product roadmap and the development of best practices including standardized templates
- Clarifies the architecture, assists with system design to support implementation, and provides solution options to resolve any architectural impediments
- Facilitates solution-driven discussions, leads the design of complex architectures, and finds creative solutions through knowledge of domain, practical experiments, and proofs of concept while ensuring architecture is flexible, modular, and adaptable
- Educates team members on the technology practices, standardization strategies, and best practices to create innovative solutions
- Supports the team as needed to select the technology stack required for solutions and helps select preferred technology products
- Performs design and code reviews to ensure all non-functional requirements are sufficiently met (for example, security, performance, maintainability, scalability, usability, and reliability)
- Develops and deploys AI architectures that meet business requirements, leveraging various AI technologies such as machine learning, deep learning, and natural language processing.
- Establishes and enforces standards for AI system design, development, and deployment, ensuring consistency and scalability.
- Provides model developers and solution developers with the consulting support they need to make the best use of the suite of model development and production capabilities, with a focus on building, staging, and deploying AI-enabled solutions.
- Works closely with model and application developers to better understand their needs and use cases and to improve user experience.
- Collaborates with cross-functional teams, including application developers, data engineers, product managers, and other stakeholders, to understand business needs.
- Deploys machine learning models to solve complex problems and improve business outcomes.
- Continuously monitors AI system performance, identifies areas for improvement, and implements optimizations to ensure systems meet business requirements.
- Creates and maintains documentation for AI systems, including architecture diagrams, technical specifications, and user guides.
Required Qualifications:
- 10-15+ years of overall experience
- Proficiency in languages such as Python, with experience in libraries like NumPy and scikit-learn.
- Knowledge of various machine learning algorithms, including supervised and unsupervised learning, neural networks, decision trees, clustering, and dimensionality reduction.
- Experience with deep learning frameworks such as TensorFlow, PyTorch, or Keras, and knowledge of their architectures and APIs.
- Proficiency with the Slurm workload manager, including REST and Flask APIs for automated and secure job scheduling.
- Experience with scalable infrastructure for deploying and managing large language models (LLMs).
- Hands-on HPC engineering experience designing and managing GPU-accelerated clusters for large-scale AI/ML workloads.
- Experience with deploying machine learning models in production environments, including containerization, microservices, and API design.
- Experience using Prometheus and Grafana to collect and analyze metrics, identify performance issues, and implement fixes. Experience creating Slurm and Triton metrics is a plus.
- Familiarity with Triton Inference Server, including its architecture, configuration, and deployment.
- Knowledge of model optimization techniques, including pruning, quantization, and knowledge distillation.
- Exploratory data analysis with Plotly, Seaborn, and Matplotlib
- Deep Learning, Neural Networks, Decision Trees, Ensemble Methods, Gradient Boosting, Support Vector Machines, Random Forest, Logistic Regression, Transfer Learning, Transformer-based models, BART, Hyperparameter Tuning, Gen-AI, CNN, Computer Vision, NLP
- Tools and platforms such as Docker, Kubernetes, Jupyter, MLflow, GitHub, Terraform, Jenkins, and Hugging Face
- Flask API Development and Security
- Container Runtimes: Enroot, Pyxis, Podman
- Linux (RHEL/CentOS) System Administration
- Model optimization techniques using Triton with TensorRT-LLM (TRT-LLM)
Desired Qualifications:
- Experience with data cleaning, feature scaling, and normalization
- Programming skills creating UI/UX using the Angular framework, HTML, CSS, and JavaScript
- Creating vector embeddings
- Tools and platforms such as AWS (SageMaker, Lambda, EC2)
- Database Technologies – Oracle, MS-SQL, MongoDB, Redis and MySQL
- SQL and PL/SQL Scripting
Skills:
- Analytical Thinking
- Architecture
- Result Orientation
- Solution Design
- Technical Strategy Development
- Application Development
- Collaboration
- Data Management
- DevOps Practices
- Risk Management
- Agile Practices
- Automation
- Influence
- Solution Delivery Process
- Test Engineering
Shift:
1st shift (United States of America)
Hours Per Week:
40