Customer Engineer, AI Infrastructure Modernization TPU, Google Cloud

Google · Big Tech · Mumbai, Maharashtra, India

Customer Engineer focused on AI infrastructure modernization using Google Cloud's TPU/GPU accelerators. The role involves guiding customers on architecture, deployment, and optimization of large-scale training and inference jobs, working with AI/ML accelerators, and supporting sales teams in piloting and deploying these solutions.

What you'd actually do

  1. Become a trusted advisor to top customers, helping them understand and incorporate AI accelerators into their overall cloud and IT strategy by designing training and inferencing platforms using the accelerators Google Cloud has to offer.
  2. Demonstrate how Google Cloud is differentiated, highlighting the power of accelerators by working with customers on POCs, demonstrating features, optimizing model performance, profiling, and benchmarking.
  3. Design and implement complex, multi-host AI training and inferencing solutions on Google Cloud TPUs, focusing on scalability and performance tuning.
  4. Conduct performance profiling and optimization of customer models and data pipelines for the TPU architecture, identifying and resolving bottlenecks.
  5. Advise customers on best practices for integrating their MLOps workflows with the Google Cloud AI Platform ecosystem for TPU utilization.

Skills

Required

  • cloud native architectures
  • modern cloud infrastructure
  • networking (switching/routing for Ethernet/RoCE/InfiniBand)
  • customer-facing or support roles
  • developing and deploying models using deep learning frameworks (TensorFlow, PyTorch, or JAX)

Nice to have

  • AI Infrastructure systems
  • DPU, RoCE, InfiniBand
  • cooling
  • accelerators, GPUs and TPUs
  • AI software stacks and platforms
  • AI infrastructure market knowledge

What the JD emphasized

  • AI accelerators
  • training and inferencing
  • TPU
  • performance tuning
  • profiling
  • optimization

Other signals

  • AI accelerators (TPU/GPU)
  • customer-facing technical expert
  • large-scale training and inference