Senior Engineer - AI Inference at Bank of America

What you'd actually do

Ensures that the design and engineering approach for complex features are consistent with the larger portfolio solution

Defines the technology tool stack for the solution and evaluates and adapts new testing tools, frameworks, and practices for team(s)

Enables team(s)/applications with Continuous Integration/Continuous Delivery (CI/CD) capabilities and engages with other technical stakeholders to keep the CI/CD pipeline functioning efficiently

Guides and influences team(s) on design and best practices for high code performance, e.g. pairing and code reviews

Provides end-to-end delivery of complex features, including automation, for either a single team or multiple teams, at the program level

Skills

Required

Python development on Linux
Model Ops
AI/ML
advanced analytics
design patterns
software engineering practices
fundamental algorithms
code optimization
vLLM
Triton Inference Server
performance tuning
inference metrics
monitoring
observability
serving multiple tenants/clients
secure boundaries
Authentication & Authorization
Policy as Code
Systems Integration
Model Routing
Model Evaluation frameworks
RAG
Model Monitoring
Model Drift
KPIs
Test Driven Development
continuous integration
clean code principles

Nice to have

open-source data science platform architecture
storage & compute separation
interactive development workbenches
containers
Jupyter
VSCode
Redis
Solr
Postgres DB
FAISS
Teradata
Oracle
SQL Server
Hadoop
Gen AI training and Inferencing platform
open-source model
Gen AI Model servicing capabilities

What the JD emphasized

Minimum 8 years of relevant experience required

Experience in Model Ops, design, and software development, with proven effectiveness in delivering technology in fast-paced, demanding, industry-driven environments for AI/ML and advanced analytics

Hands-on experience in Python development on Linux

Experience with deploying models using vLLM/Triton Inference Server

Performance tuning of those models and deployments to provide higher throughput

Experience with various inference metrics, and related monitoring and observability

Experience with serving multiple tenants/clients with model endpoints with secure boundaries

Experience with Model Evaluation frameworks to compare different models and their tradeoffs between efficiency and metrics

Experience building RAG for various knowledge bases and document types

Model Monitoring: ability to collect metrics that measure things such as Model Drift and KPIs

Job Description:

At Bank of America, we are guided by a common purpose to help make financial lives better through the power of every connection. We do this by driving Responsible Growth and delivering for our clients, teammates, communities and shareholders every day.

Being a Great Place to Work is core to how we drive Responsible Growth. This includes our commitment to being an inclusive workplace, attracting and developing exceptional talent, supporting our teammates’ physical, emotional, and financial wellness, recognizing and rewarding performance, and how we make an impact in the communities we serve.

Bank of America is committed to an in-office culture with specific requirements for office-based attendance and which allows for an appropriate level of flexibility for our teammates and businesses based on role-specific considerations.

At Bank of America, you can build a successful career with opportunities to learn, grow, and make an impact. Join us!

Position Summary:

Join a groundbreaking team at Bank of America, at the forefront of innovation in AI. We are building the next-generation Gen AI platform, empowering new AI initiatives across Consumer, Small Business, Global Banking, and Wealth organizations. This is a unique opportunity to contribute to a critical platform that will enable secure, scalable, and high-performance AI capabilities across the organization. We value curiosity, collaboration, and a passion for pushing the boundaries of what’s possible with AI.

This position is focused on designing, building, and serving the Gen AI inferencing capabilities.

This job is responsible for defining and leading the engineering approach for complex features to deliver significant business outcomes. Key responsibilities of the job include delivering complex features and technology, enabling development efficiencies, providing technical thought leadership based on conducting multiple software implementations, and applying both depth and breadth in a number of technical competencies. Additionally, this job is accountable for end-to-end solution design and delivery.

Responsibilities:

Ensures that the design and engineering approach for complex features are consistent with the larger portfolio solution
Defines the technology tool stack for the solution and evaluates and adapts new testing tools, frameworks, and practices for team(s)
Enables team(s)/applications with Continuous Integration/Continuous Delivery (CI/CD) capabilities and engages with other technical stakeholders to keep the CI/CD pipeline functioning efficiently
Guides and influences team(s) on design and best practices for high code performance, e.g. pairing and code reviews
Provides end-to-end delivery of complex features, including automation, for either a single team or multiple teams, at the program level
Conducts research, design prototyping and other exploration activities such as evaluating new toolsets and components for release management, CI/CD, and features
Works with stakeholders to establish high-level solution needs and with architects for technical requirements
Collaborate with product teams, data analysts, and data scientists to design and build solutions.
Design and execute implementation plans that advance the strategy while ensuring the current technology stack supports current needs.
Manage multiple priorities and simultaneously engage with multiple teams worldwide.
Be vocal and actively participate in all sessions with business stakeholders and agile teams.
Manage next-generation architectural decisions for the advanced analytics platform, create strategy and roadmaps, and present to technical and non-technical leaders.
Coach and mentor team members.

Required qualifications:

Minimum 8 years of relevant experience required.
Experience in Model Ops, design, and software development, with proven effectiveness in delivering technology in fast-paced, demanding, industry-driven environments for AI/ML and advanced analytics.
Hands-on experience in Python development on Linux. Strong understanding of modern open-source data science platform architecture for storage & compute separation, interactive development workbenches, containers, and toolsets such as Jupyter, VSCode, etc.
Experience with data sources and Vector Store platforms such as Redis, Solr, Postgres DB, FAISS, Teradata, Oracle, SQL Server, Hadoop, etc.
Experienced in using design patterns and following best software engineering practices.
An understanding of fundamental algorithms and ability to optimize existing code.
Proficient written and verbal communication skills to support and shape the platform, clearly articulate technical designs and concepts, and communicate effectively with all levels within the organization.
Experience with deploying models using vLLM/Triton Inference Server.
Performance tuning of those models and deployments to provide higher throughput.
Experience with various inference metrics, and related monitoring and observability.
Experience with serving multiple tenants/clients with model endpoints with secure boundaries.
Experience with Authentication & Authorization, Policy as Code, Systems Integration, and Model Routing.
Experience with Model Evaluation frameworks to compare different models and their tradeoffs between efficiency and metrics.
Experience building RAG for various knowledge bases and document types.
Model Monitoring: ability to collect metrics that measure things such as Model Drift and KPIs.
Self-starter with excellent communication skills and the ability to challenge conventions.
Strong analytical skills that support problem solving, reasoning, initiative, sound judgment, and performing concurrent tasks.
Follows Test Driven Development practices, including continuous integration and clean code principles.

Desired Qualifications:

Experience developing Gen AI training and inferencing platforms with open-source models and Gen AI Model servicing capabilities, and designing RAG frameworks and MCP modules for enterprise data systems.

Skills:

Automation
Influence
Result Orientation
Stakeholder Management
Technical Strategy Development
Application Development
Architecture
Business Acumen
Risk Management
Solution Design
Agile Practices
Analytical Thinking
Collaboration
Data Management
Solution Delivery Process

Shift:

1st shift (United States of America)

Hours Per Week: