What you'd actually do

Design and lead enterprise data ingestion pipelines for structured, semi-structured, and unstructured data.

Build scalable ingestion patterns for sources such as databases, APIs, documents, PDFs, SharePoint, file shares, data lakes, wikis, ticketing systems, code repositories, emails, and application logs.

Define transformation patterns for document parsing, text extraction, normalization, deduplication, enrichment, chunking, classification, and metadata generation.

Ensure ingestion pipelines comply with security, privacy, retention, and regulatory requirements.

Provide technical leadership across data engineering, AI platform, cloud, and application teams.

Skills

Required

Data Ingestion Pipeline Design
ETL/ELT Processes
Structured and Unstructured Data Handling
RAG Optimization
LLM Data Preparation
Data Transformation
Data Validation
Data Classification
Metadata Generation
Pipeline Orchestration
Monitoring and Error Handling
Data Quality Checks
Security and Privacy Compliance
Regulatory Compliance (Fintech)
Technical Leadership
Cloud Platforms (AWS/Azure/GCP)
Document Intelligence/OCR
Entity Extraction
Content Classification
Source Lineage
Data Freshness
Versioning
Access Controls
Auditability
Incremental Ingestion
Change Detection
Delta Processing
Reprocessing Strategies

Nice to have

Experience with specific RAG databases
Experience with LLM context engineering
Familiarity with various data sources (SharePoint, wikis, ticketing systems, code repositories, emails, application logs)

What the JD emphasized

design and lead the engineering of scalable ingestion pipelines

prepare enterprise data for RAG and AI workloads

extracting, transforming, enriching, validating, chunking, classifying, and loading structured and unstructured data into downstream RAG databases

clean, traceable, secure, current, and optimized for retrieval and LLM consumption

Partner with RAG database engineers to ensure ingested data is optimized for embedding, indexing, and retrieval.

Partner with context engineers to ensure data is structured and enriched in ways that support effective LLM reasoning.

Ensure ingestion pipelines comply with security, privacy, retention, and regulatory requirements.

Job Description:

At Bank of America, we are guided by a common purpose to help make financial lives better through the power of every connection. We do this by driving Responsible Growth and delivering for our clients, teammates, communities and shareholders every day.

Being a Great Place to Work and providing a culture of caring is core to how we drive Responsible Growth. We are intentional about fostering an inclusive workplace where every teammate has the opportunity to succeed, build a career and contribute to our shared success. This includes attracting and developing exceptional talent, recognizing and rewarding performance, and supporting our teammates’ physical, emotional, and financial wellness through affordable, competitive and flexible benefits.

We value the unique perspectives individuals bring from all backgrounds and career paths - whether shaped by military service, community college education, or a wide range of work and life experiences. These journeys foster resilience, leadership and innovation, strengthening our workforce and positively impacting the communities we serve.

Bank of America is committed to an in-office culture that supports collaboration, engagement, and career development. Our approach includes clear in-office expectations, while providing an appropriate level of flexibility based on role-specific responsibilities and business needs.

At Bank of America, you can build a successful career with opportunities to learn, grow, and make an impact. Join us!

This job is responsible for defining and leading the engineering approach for solutions at the program or portfolio level, to deliver significant business outcomes. Key responsibilities include continuously improving the design, quality, and reuse of the solution and delivering technology enablers that improve development efficiencies for the solution. Job expectations include familiarity with at least one area of engineering, acting as a “go to” reference across the organization, and applying knowledge to improve technical competencies through recruitment and development activities.

Developer Experience (DevEx) provides enterprise technical standards and common technical services, platforms, and tools that are leveraged by delivery teams across all lines of business. Within the Software Delivery Lifecycle (SDLC) program, this role leads portfolio product delivery strategy and execution for enterprise software delivery capabilities, ensuring the right investments, operating model, governance, and prioritization are in place to improve how internal technical users build, test, and deliver software at scale.

The Data Ingestion & AI Pipeline Principal Engineer is responsible for designing and leading the engineering of scalable ingestion pipelines that prepare enterprise data for RAG and AI workloads. This role focuses on extracting, transforming, enriching, validating, chunking, classifying, and loading structured and unstructured data into downstream RAG databases.

This engineer ensures that source data becomes clean, traceable, secure, current, and optimized for retrieval and LLM consumption.

Responsibilities:

Develops the engineering approach for the entire program/portfolio solution and works with Architecture to develop/analyze/deliver the implementation of technical enablers
Leads the planning, definition, and design of complex features that span multiple teams and explores solution alternatives
Creates ideas on designing complex technology and solution development approaches
Leads the technical oversight for teams in solution development including design reviews and code within own domain
Defines the technology tool stack for the solution within the range of internally approved and supported technologies
Explores state-of-the-art technologies to improve development efficiencies, quality of test/QA coverage, and release management
Leads and is responsible for the end-to-end test strategy/creation/adherence, and the integration between teams for a program/portfolio solution
Improve the experience for our developers, making it easier to deliver industry-leading solutions, while managing work efficiently and with the right controls
Advance our technology platforms through innovation
Reduce risk and improve quality across our technology portfolio by aligning to a single enterprise architecture strategy and delivering governance that enables consistency, integration and automation
Design and lead enterprise data ingestion pipelines for structured, semi-structured, and unstructured data.
Build scalable ingestion patterns for sources such as databases, APIs, documents, PDFs, SharePoint, file shares, data lakes, wikis, ticketing systems, code repositories, emails, and application logs.
Define transformation patterns for document parsing, text extraction, normalization, deduplication, enrichment, chunking, classification, and metadata generation.
Establish ingestion standards for source lineage, data freshness, versioning, access controls, and auditability.
Implement pipeline orchestration, monitoring, retry logic, error handling, and operational dashboards.
Design incremental ingestion, change detection, delta processing, and reprocessing strategies.
Partner with RAG database engineers to ensure ingested data is optimized for embedding, indexing, and retrieval.
Partner with context engineers to ensure data is structured and enriched in ways that support effective LLM reasoning.
Evaluate and implement OCR, document intelligence, parsing, entity extraction, content classification, and data quality capabilities.
Define data quality checks for completeness, accuracy, duplication, stale content, unsupported formats, and sensitive information.
Ensure ingestion pipelines comply with security, privacy, retention, and regulatory requirements.
Provide technical leadership across data engineering, AI platform, cloud, and application teams.
Serve as a senior technical authority for enterprise AI platform engineering.
Own architecture decisions that impact multiple teams, systems, or domains.
Create reusable patterns, reference architectures, standards, and engineering guardrails.
Mentor senior engineers and influence technical direction without requiring direct reporting authority.
Balance innovation with operational reliability, security, compliance, scalability, and cost management.
Communicate complex AI and data engineering concepts clearly to engineering, product, risk, security, and executive stakeholders.

Required Qualifications:

10+ years of software engineering, data engineering, platform engineering, or AI engineering experience.
5+ years designing large-scale enterprise systems.
2+ years working with LLM, RAG, vector search, semantic search, or AI platform capabilities.
Extensive experience designing and operating production-grade data pipelines.
Strong understanding of ETL/ELT, streaming, batch processing, orchestration, and data quality engineering.
Experience ingesting and processing structured and unstructured enterprise content.
Hands-on experience with cloud data platforms and pipeline technologies such as Azure Data Factory, Synapse, Databricks, Fabric, Spark, Kafka, Event Hubs, Airflow, dbt, or equivalent.
Experience with document processing, OCR, schema inference, parsing, content extraction, and metadata enrichment.
Understanding of RAG architectures, embeddings, vector databases, and LLM-oriented data preparation.
Strong knowledge of data governance, lineage, access control propagation, and enterprise security patterns.
Proven ability to design reusable engineering frameworks and standards.
Proven ability to lead architecture across multiple engineering teams.
Strong written and verbal communication skills.
Experience operating systems in regulated, security-conscious, or enterprise-scale environments.
Bachelor’s degree in Computer Science, Engineering, Information Systems, Applied Mathematics, or a related technical field.

Desired Qualifications:

Experience with Azure AI Document Intelligence, Microsoft Graph, SharePoint ingestion, Purview, or enterprise content management systems.
Experience with large-scale data lakehouse architectures.
Experience with semantic chunking, entity extraction, taxonomy generation, or knowledge graph creation.
Experience with sensitive data detection, PII handling, DLP, and compliance-aware ingestion.
Reliable ingestion pipelines that continuously feed high-quality enterprise knowledge into RAG systems.
Reduced manual data preparation and improved automation of content onboarding.
Strong lineage, governance, and access-control-aware ingestion patterns.
Faster time-to-ingest for new enterprise data sources.
Improved downstream retrieval and LLM response quality through better data preparation.
Enterprise architecture
Distributed systems design
AI platform engineering
Data governance and security
Cloud-native engineering
Observability and operational excellence
Technical strategy and roadmap development
Cross-functional influence
Vendor and platform evaluation
Production support and continuous improvement

Skills:

Automation
Influence
Result Orientation
Stakeholder Management
Technical Strategy Development
Application Development
Architecture
Business Acumen
Risk Management
Solution Design
Agile Practices
Analytical Thinking
Collaboration
Data Management
Solution Delivery Process

Shift:

1st shift (United States of America)

Hours Per Week: