Machine Learning Services Engineer - Firefly

Adobe · Enterprise · Bucharest, Romania

The ML Services Software Development Engineer will architect and implement performant, robust, and scalable GenAI services supporting deep learning models in large-scale, distributed environments. This role leads inference platform projects, collaborates with ML researchers to optimize GPU utilization, and monitors ML platform performance.

What you'd actually do

  1. Architect and implement performant, robust, and scalable GenAI services that support different deep learning models in large-scale and distributed environments
  2. Lead inference platform projects from scoping requirements to launch, ensuring ongoing support
  3. Identify and resolve usability, extensibility, and scalability issues specific to the platform
  4. Collaborate with machine learning researchers to identify and meet the requirements needed to improve the inference GPU utilization on the ML platform
  5. Monitor machine learning platform performance and modify infrastructure to fit fluid cloud resource needs

Skills

Required

  • Python
  • distributed systems
  • Linux
  • Docker
  • Kubernetes
  • AWS or similar cloud infrastructure
  • large and complex code bases
  • API design techniques
  • HW resource management for ML training and/or deployment

Nice to have

  • CUDA programming experience
  • Web application development experience
  • CI/CD systems
  • Agile development processes including Scrum

What the JD emphasized

  • large-scale and distributed environments
  • ongoing support
  • scalability issues
  • improve the inference GPU utilization
  • ML platform performance
  • fluid cloud resource needs

Other signals

  • GenAI services
  • deep learning models
  • inference platform
  • ML platform performance
  • GPU utilization