What you'd actually do

Design, provision, and maintain the platform infrastructure required for end-to-end machine learning lifecycles. Optimize the platform for distributed training, model evaluation, and batch/real-time inference.

Design and manage the enterprise Feature Store. Ensure consistent, low-latency feature delivery, preventing data leakage between training pipelines and real-time production inference.

Architect and maintain vector databases and indexing pipelines required to support Large Language Models (LLMs), Retrieval-Augmented Generation (RAG) patterns, and semantic search.

Serve as the SME for how external applications interact with the data lakehouse. Design, build, and secure high-throughput APIs, data connectors, and reverse-ETL patterns to sync data back into business systems (e.g., CRMs, ERPs, marketing automation).

Partner closely with Data Scientists and MLOps teams to establish CI/CD automation for ML (MLOps). Transition experimental, unoptimized data science notebooks into resilient, production-grade automated workflows.

Skills

Required

5+ years of data engineering experience
2+ years supporting machine learning platforms, MLOps, or complex platform integrations
AWS SageMaker, MLflow, or equivalent cloud-native ML platforms
Feature store frameworks (e.g., Feast, SageMaker Feature Store)
Vector databases (e.g., Pinecone, Milvus, Qdrant, or pgvector)
Apache Spark / AWS EMR, Ray, or Dask
Building REST APIs
Webhooks
Streaming tools (e.g., AWS Kinesis, Kafka)
Python (Pandas, NumPy, Scikit-Learn)
SQL
GitHub Actions, GitLab CI, or Jenkins

Nice to have

Deploying and fine-tuning open-source LLMs
Orchestrating AI agents using frameworks like LangChain or LlamaIndex
Reverse-ETL tools (e.g., Census, Hightouch)
Enterprise integration platforms

Your work days are brighter here.

We’re obsessed with making hard work pay off, for our people, our customers, and the world around us. As a Fortune 500 company and a leading AI platform for managing people, money, and agents, we’re shaping the future of work so teams can reach their potential and focus on what matters most. The minute you join, you’ll feel it. Not just in the products we build, but in how we show up for each other. Our culture is rooted in integrity, empathy, and shared enthusiasm. We’re in this together, tackling big challenges with bold ideas and genuine care. We look for curious minds and courageous collaborators who bring sun-drenched optimism and drive. Whether you're building smarter solutions, supporting customers, or creating a space where everyone belongs, you’ll do meaningful work with Workmates who’ve got your back. In return, we’ll give you the trust to take risks, the tools to grow, the skills to develop and the support of a company invested in you for the long haul. So, if you want to inspire a brighter work day for everyone, including yourself, you’ve found a match in Workday, and we hope to be a match for you too.

About the Team

We are a newly formed, forward-looking Cybersecurity Data Engineering & Enablement Team driving the future of our enterprise defense strategy. Our mission is to build a next-generation, centralized data lakehouse that unifies all security telemetry into a single, high-performance ecosystem. Operating across two specialized verticals—Data Engineering (ingestion, enrichment, and semantic layers) and Data Platform (foundational infrastructure, security architecture, and AI enablement)—we are designing a scalable, cloud-native foundation from the ground up. By combining cutting-edge data architecture with advanced analytics, we empower our threat hunters, data scientists, and incident responders with the real-time, trusted intelligence needed to protect the enterprise at scale.

About the Role

We are seeking a highly specialized Senior Data Engineer - Cybersecurity to serve as the Subject Matter Expert (SME) for AI/ML and Platform Integration. This critical role sits at the intersection of core data platform infrastructure, advanced analytics, and external system integrations. Your primary mission is to optimize our data platform to serve as a high-performance engine for Data Science, Machine Learning (ML), and Generative AI (GenAI) workloads.

Additionally, you will own the integration fabric of the platform—building the robust APIs, webhook ingestion engines, and data connectors that seamlessly sync our central lakehouse with downstream business applications, SaaS platforms, and third-party ecosystems.

Key Responsibilities

AI/ML Data Infrastructure & Tooling: Design, provision, and maintain the platform infrastructure required for end-to-end machine learning lifecycles. Optimize the platform for distributed training, model evaluation, and batch/real-time inference.
Enterprise Feature Store Architecture: Design and manage the enterprise Feature Store. Ensure consistent, low-latency feature delivery, preventing data leakage between training pipelines and real-time production inference.
Vector Infrastructure for GenAI: Architect and maintain vector databases and indexing pipelines required to support Large Language Models (LLMs), Retrieval-Augmented Generation (RAG) patterns, and semantic search.
Platform Integration & API Management: Serve as the SME for how external applications interact with the data lakehouse. Design, build, and secure high-throughput APIs, data connectors, and reverse-ETL patterns to sync data back into business systems (e.g., CRMs, ERPs, marketing automation).
MLOps Collaboration & Automation: Partner closely with Data Scientists and MLOps teams to establish CI/CD automation for ML (MLOps). Transition experimental, unoptimized data science notebooks into resilient, production-grade automated workflows.
Compute Optimization for Data Science: Configure and optimize compute engines tailored for heavy mathematical and data science workloads (e.g., Ray, Spark/EMR GPU instances).

About You

Basic Qualifications

Experience: 5+ years of data engineering experience, with at least 2 years dedicated to supporting machine learning platforms, MLOps, or complex platform integrations.
ML Data Stack: Deep hands-on experience with AWS SageMaker, MLflow, or equivalent cloud-native ML platforms.
Feature Stores & Vector DBs: Proven experience implementing feature store frameworks (e.g., Feast, SageMaker Feature Store) and vector databases (e.g., Pinecone, Milvus, Qdrant, or pgvector).
Distributed Compute & ML Libraries: Strong experience using Apache Spark / AWS EMR, Ray, or Dask to process massive datasets for feature extraction and model preparation.
Integration Patterns: Expert knowledge of building REST APIs and webhooks, and of using streaming tools (e.g., AWS Kinesis, Kafka) for real-time integration.
Languages & CI/CD: Advanced proficiency in Python (including ML ecosystems like Pandas, NumPy, Scikit-Learn) and SQL. Extensive experience with GitHub Actions, GitLab CI, or Jenkins for data/ML pipelines.

Other Qualifications

Experience deploying and fine-tuning open-source LLMs or orchestrating AI agents using frameworks like LangChain or LlamaIndex.
Experience with reverse-ETL tools (e.g., Census, Hightouch) or enterprise integration platforms.

Workday Pay Transparency Statement

The annualized base salary ranges for the primary location and any additional locations are listed below. Workday pay ranges vary based on work location. As a part of the total compensation package, this role may be eligible for the Workday Bonus Plan or a role-specific commission/bonus, as well as annual refresh stock grants. Recruiters can share more detail during the hiring process. Each candidate’s compensation offer will be based on multiple factors including, but not limited to, geography, experience, skills, job duties, and business need, among other things.

Primary Location: USA.VA.Reston

Primary Location Base Pay Range: $159,600 USD - $239,400 USD

Additional US Location(s) Base Pay Range: $144,400 USD - $258,000 USD

Our Approach to Flexible Work

With Flex Work, we’re combining the best of both worlds: in-person time and remote. Our approach enables our teams to deepen connections, maintain a strong community, and do their best work. We know that flexibility can take shape in many ways, so rather than a number of required days in-office each week, we simply spend at least half (50%) of our time each quarter in the office or in the field with our customers, prospects, and partners (depending on role). This means you'll have the freedom to create a flexible schedule that caters to your business, team, and personal needs, while being intentional to make the most of time spent together. Those in our remote "home office" roles also have the opportunity to come together in our offices for important moments that matter.

Pursuant to applicable Fair Chance law, Workday will consider for employment qualified applicants with arrest and conviction records.

Workday is an Equal Opportunity Employer including individuals with disabilities and protected veterans.

At Workday, we are committed to providing an accessible and inclusive hiring experience where all candidates can fully demonstrate their skills. If you require assistance or an accommodation at any point, please email accommodations@workday.com.

Are you being referred to one of our roles? If so, ask your connection at Workday about our Employee Referral process!

At Workday, we value our candidates’ privacy and data security. Workday will never ask candidates to apply to jobs through websites that are not Workday Careers.

Please be aware of sites that may ask for you to input your data in connection with a job posting that appears to be from Workday but is not.

In addition, Workday will never ask candidates to pay a recruiting fee, or pay for consulting or coaching services, in order to apply for a job at Workday.