Customer Engineer, AI Infrastructure Modernization (TPU), Google Cloud

Google · Big Tech · Sydney NSW, Australia

Customer Engineer for Google Cloud AI Infrastructure, focusing on TPU/GPU accelerators for training and inference. The role involves advising customers on AI infrastructure strategy, designing and implementing solutions, performance tuning, and MLOps integration.

What you'd actually do

  1. Become a trusted advisor to customers, helping them understand and incorporate AI accelerators into their overall cloud and IT strategy by designing training and inferencing platforms.
  2. Demonstrate how Google Cloud is differentiated, highlighting the power of accelerators by working with customers on proofs of concept (POCs), demonstrating features, optimizing model performance, profiling, and benchmarking.
  3. Design and implement multi-host AI training and inferencing solutions on Google Cloud TPUs, focusing on scalability and performance tuning.
  4. Conduct performance profiling and optimization of customer models and data pipelines for the TPU architecture, identifying and resolving issues.
  5. Advise customers on best practices for integrating their MLOps workflows with the Google Cloud AI Platform ecosystem for TPU utilization.
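The profiling work in item 4 typically starts with a step-time and throughput baseline before digging into accelerator-specific tools. A minimal, framework-agnostic sketch of that baseline harness (the `step_fn` stand-in and the warmup/iteration counts are illustrative assumptions, not part of the job description; a real step would execute the model on TPU/GPU):

```python
import statistics
import time

def profile_step(step_fn, batch_size, warmup=3, iters=10):
    """Time one training/inference step and derive throughput.

    step_fn:    hypothetical callable executing a single step on a batch.
    batch_size: examples processed per step, used for throughput.
    """
    for _ in range(warmup):  # discard warmup iterations (compilation, caches)
        step_fn()
    times = []
    for _ in range(iters):
        t0 = time.perf_counter()
        step_fn()
        times.append(time.perf_counter() - t0)
    median_s = statistics.median(times)  # median is robust to stragglers
    return {
        "median_step_ms": median_s * 1e3,
        "examples_per_sec": batch_size / median_s,
    }

# Stand-in workload so the sketch runs anywhere; swap in a real model step.
stats = profile_step(lambda: sum(i * i for i in range(100_000)), batch_size=256)
print(f"{stats['median_step_ms']:.2f} ms/step, "
      f"{stats['examples_per_sec']:.0f} examples/s")
```

Numbers from a harness like this give the before/after comparison when tuning model or data-pipeline changes for the TPU architecture.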

Skills

Required

  • cloud native architectures
  • modern cloud infrastructure
  • networking (e.g., switching/routing for Ethernet/RoCE/InfiniBand)
  • customer-facing or support roles
  • developing and deploying models using deep learning frameworks (e.g., TensorFlow, PyTorch, or JAX)

Nice to have

  • IT infrastructure consultant
  • enterprise architect
  • data center investment strategies and proposals
  • AI Infrastructure systems
  • DPU
  • RoCE
  • InfiniBand
  • cooling
  • GPUs
  • TPUs
  • AI compute clusters
  • AI infrastructure market knowledge

What the JD emphasized

  • AI accelerators
  • training and inferencing
  • TPU
  • performance tuning
  • customer-facing

Other signals

  • customer-facing technical expert
  • AI infrastructure
  • TPU/GPU accelerators
  • training and inference optimization