Solutions Architect, GenAI

NVIDIA · Semiconductors · Singapore, Singapore

NVIDIA is seeking a Solutions Architect with deep expertise in Generative AI (LLM) model building, focusing on training LLMs at scale and customizing them. This role involves designing solutions, leading workshops, and collaborating with internal teams to build full-stack GenAI solutions for enterprise use cases. The position requires significant experience in deep learning and generative AI, with an emphasis on large-scale LLM training.

What you'd actually do

  1. Serve as an expert advisor, helping customers customize GenAI (LLM) models and optimize training performance at scale.
  2. Lead workshops and trainings on NVIDIA's technologies.
  3. Closely partner with other Solutions Architects, engineering, product and business teams at NVIDIA to build GenAI full stack solutions for industry vertical and enterprise use cases.
  4. Work with business managers to craft the group's vision and actionable, effective strategies.
  5. Encourage industry leaders by articulating the business value from the state of the art in Generative AI.

Skills

Required

  • BS, MS, or PhD in Computer Science, Electrical/Computer Engineering, Physics, Mathematics, other Engineering or related fields (or equivalent experience)
  • 8+ years of overall work experience in deep learning, data science or software development, with knowledge of parallel computing with GPUs.
  • Specific focus on generative AI, with emphasis on training Large Language Models (LLMs) at scale.
  • Clear written and oral communication skills with the ability to collaborate with management and engineering.
  • Willingness to share knowledge with clients, partners and co-workers.
  • Experience leading workshops, training sessions, and presenting technical solutions to diverse audiences.
  • Professional or native language proficiency in English and Mandarin.
  • Ability to travel up to 30% of the time to support customers in Southeast Asia and beyond.

Nice to have

  • Hands-on experience with NVIDIA's NeMo SDKs or Megatron.
  • Demonstrable ability to customize LLM models with new capabilities and to optimize them for training speed, memory efficiency, and resource utilization.
  • Familiarity with containerization technologies (e.g., Docker or enroot/pyxis) and orchestration tools (e.g., Slurm or Kubernetes) for scalable and efficient model building.

What the JD emphasized

  • Generative AI (LLM) model building
  • training Large Language Models (LLMs) at scale
  • customizing LLM models