Solution Architect, Generative AI

NVIDIA · Semiconductors · Tokyo, Japan

NVIDIA is seeking a Solution Architect to promote adoption of, and provide technical support for, its GPU-accelerated computing solutions, focusing on generative AI, machine learning, and deep learning for enterprise clients in Japan. The role involves pre-sales activities, technical support for model training and deployment, and developing solutions for inference and agent-based systems.

What you'd actually do

  1. Develop and demonstrate solutions based on NVIDIA’s pioneering GenAI software and hardware technologies with a focus on inference. This may include advising customers on agent-based system design and on optimal models and infrastructure.
  2. Work directly with key customers to understand their challenges and provide the best solutions based on NVIDIA products.
  3. Drive sustainability by performing in-depth analysis and optimization to ensure the best performance and cost-effectiveness on the NVIDIA software platform. This includes transitioning pipelines to lower-precision compute.
  4. Drive pre-sales conversations, build architectures and demos to accelerate the customer AI journey based on NVIDIA products, and work closely with Sales Account Managers to secure design wins.
  5. Create or run proofs of concept and demos. This requires presentation skills, the ability to explain complex topics, and Python coding to execute data pipelines, train ML/DL models, and deploy them on container-based orchestrators.
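
As a rough illustration of what "transitioning pipelines to lower-precision compute" in item 3 can mean, the sketch below rounds a few example values to 16-bit floating point and compares storage size and rounding error against 32-bit storage. It uses only the Python standard library and is not NVIDIA tooling; the function names (to_fp16, quantize_weights, max_abs_error) and the sample values are assumptions made for this example.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    # The "e" struct format is the 2-byte half-precision float.
    return struct.unpack("<e", struct.pack("<e", x))[0]


def quantize_weights(weights):
    """Round every value in a list to half precision (hypothetical helper)."""
    return [to_fp16(w) for w in weights]


def max_abs_error(original, quantized):
    """Largest absolute difference introduced by the rounding."""
    return max(abs(a - b) for a, b in zip(original, quantized))


# Made-up sample values standing in for model weights.
weights = [0.1, -1.5, 3.14159, 0.000123, 42.0]
fp16_weights = quantize_weights(weights)

# Bytes needed to store the values as 32-bit vs. 16-bit floats.
fp32_bytes = len(weights) * struct.calcsize("<f")
fp16_bytes = len(weights) * struct.calcsize("<e")
error = max_abs_error(weights, fp16_weights)

print(f"fp32 storage: {fp32_bytes} bytes, fp16 storage: {fp16_bytes} bytes")
print(f"max absolute rounding error: {error:.6f}")
```

The trade-off shown here, half the storage in exchange for a small rounding error, is the kind of analysis the role describes. In practice this work would be done with GPU libraries rather than with the standard library.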

Skills

Required

  • Excellent verbal and written communication and technical presentation skills in Japanese
  • Business-level English communication
  • BS or MS in Computer Science, Engineering, Mathematics, or Physics (or equivalent experience)
  • 5+ years of industry or academic experience related to Generative AI or Deep Learning
  • Strong coding, development, and debugging skills
  • Python
  • C/C++
  • Bash
  • Linux
  • Demonstrated experience with cluster orchestration tools such as Docker, Kubernetes, or SLURM, across cloud service providers and on premises
  • Ability to multitask effectively in a dynamic environment
  • Strong analytical and problem-solving skills
  • Clear written and oral communication skills with the ability to effectively collaborate with management and engineering
  • Strong desire to share knowledge with clients, partners, and co-workers

Nice to have

  • Expertise in deploying large-scale training and inferencing pipelines
  • Experience with pre-training and post-training of transformer-based architectures for language or vision
  • A deep understanding of the latest generative AI or deep learning methods and algorithms
  • Experience using or operating Kubernetes, as well as experience writing or customizing Kubernetes configurations
  • Experience designing agent-based systems, including working as an architect on the overall design: optimal infrastructure, agent architecture, and authentication systems

What the JD emphasized

  • focus on inference
  • agent-based system design
  • model training and deployment
  • pre-sales
  • technical support
  • Python coding
  • large-scale training and inferencing pipeline
  • pre-training, post-training
  • agent-based systems

Other signals

  • customer-facing technical support
  • pre-sales
  • model training and deployment
  • inference