DevOps Engineer

NVIDIA · Semiconductors · Yokneam, Israel

NVIDIA's Manufacturing Information Systems team is seeking a DevOps Engineer to build and operate Kubernetes infrastructure across Azure AKS and on-prem clusters, extend CI/CD pipelines, maintain on-prem infrastructure, and automate provisioning. The role involves troubleshooting across the full stack and contributing to initiatives like GPU cluster enablement and secret management.

What you'd actually do

  1. Design, build, and operate Kubernetes infrastructure across Azure AKS and on-prem clusters, including ingress, autoscaling with KEDA, TLS management, and GPU-enabled workloads
  2. Extend and harden CI/CD pipelines in GitLab, manage runners across multiple environments, and evolve GitOps-based deployments through ArgoCD
  3. Maintain and improve the critical on-prem infrastructure — Linux servers, NGINX, container platforms, and networking — that several production workflows depend on
  4. Partner with development, data, and architecture teams to streamline delivery, improve observability through Datadog, and shorten time-to-recovery during incidents
  5. Contribute to flagship initiatives on the roadmap: per-site Kubernetes cluster rollouts, AKS upgrades and node pool reorganization, GPU cluster enablement, and secret management with Azure Key Vault and Sealed Secrets

Skills

Required

  • Kubernetes
  • Docker
  • CI/CD
  • GitLab
  • Linux administration
  • Bash
  • Azure
  • GitOps
  • ArgoCD

Nice to have

  • On-prem Kubernetes at scale
  • MetalLB
  • Secret management (HashiCorp Vault, Azure Key Vault, Sealed Secrets)
  • SQL (PostgreSQL, MySQL)
  • MongoDB
  • Datadog

What the JD emphasized

  • 3+ years in a DevOps, SRE, or infrastructure engineering role
  • Hands-on proficiency with Kubernetes and container tooling (for example, Docker) in production environments
  • Track record of building and maintaining CI/CD pipelines, ideally in GitLab, including runner management and pipeline-as-code
  • Solid Linux administration skills and fluency in Bash
  • Practical background with a major cloud platform, Azure preferred (or AWS or GCP)
  • Working knowledge of GitOps workflows and tooling such as ArgoCD or Flux