Senior Data Backend Engineer

NVIDIA · Semiconductors · Hillsboro, OR +1 · Remote

Senior Data Backend Engineer at NVIDIA focused on building high-performance AI data pipelines for the autonomous vehicle (AV) domain. The role involves designing and optimizing microservices and data pipelines that process large AV datasets for data mining and AI training, and developing Retrieval-Augmented Generation (RAG) workflows that support hybrid and agentic patterns.

What you'd actually do

  1. Scope and build tools, microservices, workflows, and distributed applications to accelerate data mining and AI training.
  2. Design and implement solutions for streaming, resilience, logging, security, authentication, workflow orchestration, and data management.
  3. Deploy AI models.
  4. Design and develop Retrieval-Augmented Generation (RAG) workflows enabling hybrid and agentic patterns.
  5. Analyze and operationalize complex distributed systems for speed-of-light performance.

Skills

Required

  • Python
  • Golang
  • Kubernetes
  • Helm
  • Hive
  • Parquet
  • SQL
  • vector databases
  • Milvus
  • ETL pipelines
  • big data engines
  • data mining
  • AI development
  • high-performance, scalable software systems
  • architectural skills
  • problem-solving mentality
  • collaboration skills

Nice to have

  • large-scale real-time streaming
  • augmented reality
  • data curation
  • Spark
  • Large Language Models
  • Vision-Language Models
  • Retrieval-Augmented Generation (RAG)
  • NVIDIA RAPIDS

What the JD emphasized

  • massive volumes of AV data
  • AI dataset management
  • data mining
  • AI training
  • video data curation
  • behavioral search
  • streaming
  • resilience
  • logging
  • security
  • authentication
  • workflow orchestration
  • data management
  • RAG
  • agentic patterns
  • distributed systems
  • Python
  • Golang
  • Kubernetes
  • Helm
  • Hive
  • Parquet
  • SQL
  • vector databases
  • Milvus
  • ETL pipelines
  • big data engines
  • NVIDIA RAPIDS
  • large-scale real-time streaming
  • augmented reality
  • data curation
  • Spark
  • LLMs
  • Vision-Language Models
  • Retrieval-Augmented Generation (RAG)

Other signals

  • AI training data pipelines
  • massive volumes of AV data
  • AI dataset management
  • data mining
  • AI training