Principal, Performance and Capacity Engineer (Infrastructure & Platform)

Workday · Enterprise · IND.Chennai

Principal Technical Architect on the Capacity Engineering team responsible for designing and architecting cutting-edge frameworks for capacity engineering across Workday's core and shared services, ensuring systems can handle rapid growth. Focuses on hands-on technical leadership, system analysis, optimization, and innovation in performance and resilience for large-scale distributed systems.

What you'd actually do

  1. Architecting Scalable Frameworks: Design and implement architectural frameworks and tooling for proactive capacity planning, modeling, and management for critical Workday services.
  2. Technical Leadership & Guidance: Provide senior technical leadership, mentor engineers, and guide the team on best practices for building scalable, resilient, and performant distributed systems.
  3. System Analysis & Optimization: Conduct in-depth analysis of system performance, resource utilization, and growth patterns to identify optimization opportunities and predict future capacity needs.
  4. Cross-Functional Collaboration: Partner closely with engineering teams, product management, and SREs to integrate capacity insights into the development lifecycle and influence product architecture.
  5. Innovation & Research: Stay ahead of industry trends and emerging technologies, evaluating and recommending new approaches to continuously enhance Workday's scalability and operational efficiency.

Skills

Required

  • 12+ years of experience in software engineering
  • Significant hands-on experience in system-level architecture, performance, resiliency, and scalability for complex distributed systems
  • 5+ years in designing and implementing complex distributed system architectures
  • 8+ years of experience with at least two programming languages (e.g., Java, Python, Go)
  • Experience writing production-level code for distributed systems
  • Deep understanding of, and hands-on experience with, Kubernetes in production environments
  • Expertise in distributed computing principles, microservices architectures, and cloud-native patterns
  • Profound understanding of JVM internals (e.g., garbage collection, memory management, threading)
  • Proven ability to design and implement robust, scalable technical architectures
  • Ability to translate broad business requirements into clear, executable design specifications
  • Exceptional analytical and problem-solving skills

Nice to have

  • Master’s degree (e.g., MS in Computer Science, Distributed Systems, or related field) or equivalent practical experience strongly preferred
  • Experience with leading public cloud platforms (AWS and GCP)

What the JD emphasized

  • designing and architecting cutting-edge frameworks
  • hands-on technical leadership and innovation
  • designing, building, and scaling highly performant and resilient distributed systems
  • deep understanding of system & JVM internals, performance tuning, and capacity management in large-scale environments
  • Deep understanding, knowledge, and hands-on experience with Kubernetes in production environments
  • Expertise in distributed computing principles
  • Profound understanding of JVM internals