Customer Engineer, Public Sector

Google · Big Tech · Mexico City, CDMX, Mexico

Customer Engineer role focused on architecting and managing Large Language Model (LLM) deployments across on-premises and cloud environments. The role involves auditing multi-agent orchestration, agent construction, and vector databases; orchestrating scalable inference and training environments with Docker and Kubernetes; and securing the MLOps lifecycle against AI-specific threats. Requires significant experience in AI/ML development, infrastructure engineering, and container orchestration.

What you'd actually do

  1. Architect and manage Large Language Model (LLM) deployments across on-premises (NVIDIA/AMD) and cloud (cloud computing platform, Google Cloud Platform (GCP)) environments. Audit multi-agent orchestration, agent construction, and vector databases to map data flows and enforce privilege boundaries.
  2. Use Docker and Kubernetes to orchestrate scalable inference and training environments, optimizing Graphics Processing Unit (GPU) utilization and resource isolation.
  3. Protect model weights, secure data ingestion, and harden inference endpoints across the Machine Learning Operations (MLOps) lifecycle. Investigate and mitigate AI-specific threats (e.g., prompt injection, jailbreaking, data poisoning). Map testing findings to MITRE ATLAS, OWASP for LLMs, and STRIDE models.
  4. Bridge local high-compute clusters and cloud AI services while maintaining a consistent security posture.

Skills

Required

  • AI/ML development
  • AI infrastructure engineering
  • software development
  • containerization
  • Docker
  • Kubernetes
  • Python
  • PyTorch
  • TensorFlow
  • Hugging Face Transformers
  • LLM deployments
  • MLOps

Nice to have

  • AI/ML research
  • LLM deployment frameworks
  • vLLM
  • NVIDIA Triton
  • Ollama
  • agent development
  • OWASP for LLMs
  • cloud-native AI services
  • Google Vertex AI
  • air-gapped systems
  • on-premises HPC systems

What the JD emphasized

  • 10 years of experience in AI/ML development, AI infrastructure engineering, or software development
  • 5 years of experience with containerization (e.g., Docker) and orchestration (e.g., Kubernetes)
  • 5 years of experience with Python and with libraries like PyTorch, TensorFlow, or Hugging Face Transformers
  • AI/ML development
  • AI infrastructure engineering
  • MLOps lifecycle
  • AI-specific threats
  • prompt injection
  • jailbreaking
  • data poisoning
  • MITRE ATLAS
  • OWASP for LLMs
  • STRIDE models
  • on-premises high-performance computing (HPC) systems

Other signals

  • LLM deployments
  • AI infrastructure engineering
  • MLOps lifecycle
  • AI-specific threats