Federated Systems Field Engineer

This role focuses on three things: migrating data pipelines and training/inference workloads to a federated AI platform; researching, designing, and testing distributed AI workflows across cloud environments; and providing technical support for model execution. It also involves developing expertise in federated learning frameworks such as NVIDIA FLARE.

What you'd actually do

  1. Work directly with data scientists and AI model developers to migrate data pipelines and training/inference workloads onto a federated AI platform, ensuring scalability and performance.
  2. Research, design, develop, and test distributed AI workflows, including data preprocessing, model training, and inference across AWS, GCP, and Azure environments.
  3. Provide hands-on technical support to data science teams, including troubleshooting VM, OS-level, and cloud infrastructure issues impacting model execution.
  4. Develop and maintain expertise in federated learning frameworks, including NVIDIA FLARE, to enable secure and efficient distributed model development.

Skills

Required

  • Bachelor's degree
  • 2+ years of experience working in agile environments with hands-on involvement in testing and validating data, ML, or distributed workflows
  • 2+ years of experience implementing solutions in cloud environments (AWS, GCP, or Azure), including deploying data pipelines or AI/ML workloads
  • 2+ years of experience with cloud networking fundamentals, including configuring and troubleshooting network components supporting distributed systems
  • 2+ years of experience supporting or contributing to AI/ML or GenAI solutions, with exposure to model architecture, training, or inference workflows
  • 2+ years of experience developing in Python for data engineering, automation, or machine learning use cases

Nice to have

  • 2+ years of experience refactoring and optimizing code for performance, scalability, and maintainability in cloud or distributed environments
  • 2+ years of experience with identity and access management concepts in cloud or federated environments
  • 2+ years of experience working with modular cloud platforms, orchestration tools, or model control processes supporting ML pipelines

What the JD emphasized

  • hands-on technical support
  • federated learning frameworks
  • training/inference workloads
  • distributed AI workflows

Other signals

  • federated AI platform