What you'd actually do

operate, harden, and continuously improve the production infrastructure that powers the Peakon Agent, multi-agent architectures, AI Features, and related ML workloads

manage the entire deployment lifecycle for the Peakon Agent and other AI Features, ensuring the reliability of long-running agentic loops, memory stores, and tool-use environments

build and maintain tooling to surface agent trajectory and evaluation data, supporting performance testing, latency benchmarking, and load simulations specific to LLM-driven applications

collaborate on the automation of essential security upgrades for ML dependencies

provide clear runbooks, robust observability, on-call support, and predictable incident response

Skills

Required

Python
LangChain
LlamaIndex
Docker
Kubernetes
GitOps
GitHub Actions
MLOps
LLM
Agentic systems
Model monitoring
Regression tracking
Automated evaluation
LangSmith
System Design
Architectural Governance
Threat modeling
Guardrails
Regulated enterprise environments
Data auditability
Compliance

Nice to have

advanced fine-tuning
alignment techniques
prompt engineering
simulations of agent behaviors
RAG
autonomous decision-making agents

What the JD emphasized

Proven track record as an MLOps or ML-savvy SRE/Platform Engineer supporting production-grade LLM and agentic systems

Deep understanding of the model development lifecycle, specifically regarding model monitoring, regression tracking, and automated evaluation using tools like LangSmith

Proven experience navigating highly regulated enterprise environments to ensure data auditability, clear ownership boundaries, and strict compliance

Your work days are brighter here.

We’re obsessed with making hard work pay off, for our people, our customers, and the world around us. As a Fortune 500 company and a leading AI platform for managing people, money, and agents, we’re shaping the future of work so teams can reach their potential and focus on what matters most. The minute you join, you’ll feel it. Not just in the products we build, but in how we show up for each other. Our culture is rooted in integrity, empathy, and shared enthusiasm. We’re in this together, tackling big challenges with bold ideas and genuine care. We look for curious minds and courageous collaborators who bring sun-drenched optimism and drive. Whether you're building smarter solutions, supporting customers, or creating a space where everyone belongs, you’ll do meaningful work with Workmates who’ve got your back. In return, we’ll give you the trust to take risks, the tools to grow, the skills to develop and the support of a company invested in you for the long haul. So, if you want to inspire a brighter work day for everyone, including yourself, you’ve found a match in Workday, and we hope to be a match for you too.

About the Team

Sitting at the critical intersection of ML engineering, platform engineering, and observability, the Peakon MLOps Engineer serves as the central operational link between development and production. You will work closely with ML engineers, backend engineers, and the central Agent Forge / ML Runtime platform teams to ensure our autonomous agents run reliably and execute complex workflows seamlessly for customers. That means providing clear runbooks, robust observability, on-call support, and predictable incident response. Your primary objective is to enable ML engineers to safely ship agent behavior changes and understand their impact by providing streamlined tooling, agent-specific monitoring, and operational support.

About the Role

The Peakon MLOps Engineer is responsible for operating, hardening, and continuously improving the production infrastructure that powers the Peakon Agent, multi-agent architectures, AI Features, and related ML workloads. This ownership spans deployments, monitoring, and on-call workflows. Key operational functions include managing the entire deployment lifecycle for the Peakon Agent and other AI Features and ensuring the reliability of long-running agentic loops, memory stores, and tool-use environments, while helping drive operational excellence across the ML platform. The role also involves building and maintaining tooling to surface agent trajectory and evaluation data, supporting performance testing, latency benchmarking, and load simulations specific to LLM-driven applications, and collaborating on the automation of essential security upgrades for ML dependencies.

About You

Basic Qualifications Summary

Minimum 8 years of relevant industry experience. Holds a Bachelor's, Master's, or PhD in Computer Science, Data Science, Statistics, Mathematics, or Engineering, or has equivalent practical experience.
Proven track record as an MLOps or ML-savvy SRE/Platform Engineer supporting production-grade LLM and agentic systems. Proficient in Python and frameworks like LangChain and LlamaIndex.
Hands-on experience operating containerized services (Docker, Kubernetes) using Git-based workflows (GitOps, GitHub Actions). Solid understanding of modern ML stacks (platforms, feature stores, registries, messaging layers) or a deep platform engineering background.
Demonstrated ability to own production infrastructure end-to-end, including monitoring, incident response, rollbacks, and continuous reliability and uptime improvements.
Deep understanding of the model development lifecycle, specifically regarding model monitoring, regression tracking, and automated evaluation using tools like LangSmith.
Strong communication and collaboration skills under pressure, acting as a bridge between ML engineers, backend teams, and central platform/security specialists.

Other Qualifications Summary

Solid knowledge of data science principles and ML algorithms applied directly to LLMs, RAG, and autonomous decision-making agents.
Experience leading model-building processes, including advanced fine-tuning, alignment techniques, prompt engineering, and simulations of agent behaviors.
Strong understanding of software development principles coupled with demonstrated proficiency in System Design and Architectural Governance to deploy, scale, and maintain high-availability ML models.
Expertise in threat modeling and security for ML/agent systems to enforce strict behavioral guardrails. Proven experience navigating highly regulated enterprise environments to ensure data auditability, clear ownership boundaries, and strict compliance.
Track record of strong technical decision quality and leadership, with the ability to coordinate cross-functional initiatives, translate complex business needs into resilient implementations, and mentor team members while bridging the gap between ML engineering and platform teams.

Workday Pay Transparency Statement (For EU Locations Only)

Listed below is the base salary range applicable to this position. Workday pay ranges (and the precise pay offered to the successful candidate) are based on a number of objective criteria, such as relevant experience and skills, educational qualifications, level of responsibility, demands of the role, work location, and business need. As part of the total compensation package, this role may be eligible for the Workday Bonus Plan or a role-specific commission/bonus, as well as annual refresh stock grants awarded by Workday Inc.

Primary Location Base Pay Range (Ireland): €96,000 EUR - €144,000 EUR

Our Approach to Flexible Work

With Flex Work, we’re combining the best of both worlds: in-person time and remote. Our approach enables our teams to deepen connections, maintain a strong community, and do their best work. We know that flexibility can take shape in many ways, so rather than a number of required days in-office each week, we simply spend at least half (50%) of our time each quarter in the office or in the field with our customers, prospects, and partners (depending on role). This means you'll have the freedom to create a flexible schedule that caters to your business, team, and personal needs, while being intentional about making the most of time spent together. Those in our remote "home office" roles also have the opportunity to come together in our offices for important moments that matter.

Pursuant to applicable Fair Chance law, Workday will consider for employment qualified applicants with arrest and conviction records.

Workday is an Equal Opportunity Employer including individuals with disabilities and protected veterans.

At Workday, we are committed to providing an accessible and inclusive hiring experience where all candidates can fully demonstrate their skills. If you require assistance or an accommodation at any point, please email accommodations@workday.com.

Are you being referred to one of our roles? If so, ask your connection at Workday about our Employee Referral process!

At Workday, we value our candidates’ privacy and data security. Workday will never ask candidates to apply to jobs through websites that are not Workday Careers.

Please be aware of sites that may ask you to input your data in connection with a job posting that appears to be from Workday but is not.

In addition, Workday will never ask candidates to pay a recruiting fee, or pay for consulting or coaching services, in order to apply for a job at Workday.