Software Engineer III - Gen AI Inferencing

Bank of America · Banking · Addison, Charlotte

Software Engineer III focused on building and operating reusable toolkits for Gen AI RAG capabilities within an enterprise AI platform. The role involves designing, developing, and deploying models using inference frameworks such as vLLM and Triton Inference Server, alongside MLOps practices and fine-tuning techniques, with a strong emphasis on CI/CD and performance tuning for production environments. Experience with RAG processes and application development in related technologies is required.
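
As context for the deployment stack the posting names (vLLM in containers, in production), here is a minimal, hypothetical sketch of serving a model behind vLLM's OpenAI-compatible HTTP server in a container; the image tag, model name, port, and flags are illustrative assumptions, not details from the posting.

```shell
# Hypothetical example: launch vLLM's OpenAI-compatible server in a
# container. Model name, port, and context length are placeholders.
docker run --gpus all -p 8000:8000 \
  vllm/vllm-openai:latest \
  --model meta-llama/Llama-3.1-8B-Instruct \
  --max-model-len 8192

# Query the OpenAI-style chat completions endpoint it exposes:
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "meta-llama/Llama-3.1-8B-Instruct", "messages": [{"role": "user", "content": "Hello"}]}'
```

In a setup like this, performance tuning largely means adjusting batching and context-length parameters against throughput and latency targets, which is the kind of work the role describes.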

What you'd actually do

  1. Codes solutions and unit tests to deliver a requirement/story per the defined acceptance criteria and compliance requirements
  2. Designs, develops, and modifies architecture components, application interfaces, and solution enablers while ensuring principal architecture integrity is maintained
  3. Mentors other software engineers and coaches the team on Continuous Integration and Continuous Deployment (CI/CD) practices and automation of the tool stack
  4. Executes story refinement, definition of requirements, and estimation of the work necessary to realize a story through the delivery lifecycle
  5. Performs spikes/proofs of concept as necessary to mitigate risk or implement new ideas

Skills

Required

  • OOP in Python/Scala/Java
  • AI/ML/GenAI lifecycle management and development
  • MLOps
  • Fine-tuning techniques
  • Inference frameworks
  • vLLM/Triton Inference Server
  • Containers
  • Production deployment
  • Automation
  • Performance tuning
  • Python/Unix-based systems
  • Generative AI RAG process
  • Chunking, embedding, retrieval, reranking, and summarization
  • Application development
  • MongoDB
  • Redis
  • Angular/React frameworks
  • Containerization
  • Building API-based applications leveraging FastAPI services
  • JWT integration
  • API gateway
  • CI/CD practices
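
To make the RAG stages in this list concrete, here is a minimal, self-contained sketch of chunking, embedding, retrieval, reranking, and summarization. It is purely illustrative: a real pipeline would use a neural embedding model, a vector store (e.g. MongoDB or Redis, which the posting mentions), a cross-encoder reranker, and an LLM call for summarization; the toy bag-of-words vectors and sample text below are assumptions made to show the data flow.

```python
# Toy sketch of the five RAG stages named in the skills list.
# Bag-of-words vectors stand in for real embeddings.
import math
from collections import Counter

def chunk(text: str, size: int = 8) -> list[str]:
    """Chunking: split text into fixed-size word windows."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def embed(text: str) -> Counter:
    """Embedding (toy): lowercase term-frequency vector."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, chunks: list[str], k: int = 3) -> list[str]:
    """Retrieval: top-k chunks by cosine similarity to the query."""
    qv = embed(query)
    return sorted(chunks, key=lambda c: cosine(qv, embed(c)), reverse=True)[:k]

def rerank(query: str, candidates: list[str]) -> list[str]:
    """Reranking: exact query-term overlap (stand-in for a cross-encoder)."""
    q = set(query.lower().split())
    return sorted(candidates,
                  key=lambda c: len(q & set(c.lower().split())),
                  reverse=True)

def summarize(chunks: list[str]) -> str:
    """Summarization stub: in practice an LLM call over the context."""
    return " ".join(chunks[:1])

docs = ("vLLM serves large language models with high throughput. "
        "Triton Inference Server deploys models in production. "
        "Redis caches embeddings for fast retrieval.")
chunks = chunk(docs)
query = "how are models deployed in production"
context = rerank(query, retrieve(query, chunks))
print(summarize(context))
```

Each function maps one-to-one onto a stage the posting names, which is roughly what a "reusable toolkit for Gen AI RAG capabilities" would wrap behind a stable interface.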

Nice to have

  • Scala
  • Java
  • Unix
  • Angular
  • React
  • MongoDB
  • Redis

What the JD emphasized

  • Gen AI Inferencing
  • Gen AI platform
  • AI/ML/GenAI Lifecycle Management and Development
  • MLOps
  • Fine-tuning techniques
  • Inference Frameworks
  • vLLM/Triton Inference Server
  • production
  • automation
  • Performance Tuning
  • generative AI RAG process
  • chunking, embedding, retrieval, reranking and summarization
  • application development
  • FastAPI services
  • AI/ML and GenAI work
  • Continuous Integration (CI)
  • Continuous Deployment (CD)

Other signals

  • Design, build, and operation of reusable toolkits for Gen AI RAG capabilities
  • Developing and delivering complex requirements to accomplish business goals
  • Experience with AI/ML/GenAI lifecycle management and development and its ecosystem
  • Building frameworks using MLOps, fine-tuning techniques, and inference frameworks
  • Deploying models using vLLM/Triton Inference Server in containers in production with automation
  • Hands-on experience with and knowledge of the generative AI RAG process for various use cases, including chunking, embedding, retrieval, reranking, and summarization