Senior System Software Engineer - GitHub

NVIDIA · Semiconductors · Pune, India

Senior Software Engineer to build and maintain a large-scale private cloud system on GitHub and Kubernetes, supporting CI services for various NVIDIA teams. The role involves developing scalable cloud solutions, managing job scheduling and resources, creating metrics/alert/storage services, and applying ML/deep learning to improve system performance. Requires strong OOP (Java preferred), experience with cloud infrastructure, Kubernetes, message brokers, and databases.

What you'd actually do

  1. Build creative, scalable cloud solutions to handle millions of jobs and thousands of systems
  2. Tackle challenging problems in infrastructure such as job scheduling, resource management, and automated recovery
  3. Develop complete solutions including Metrics, Alert, and Storage Services
  4. Dig into data, analyze it extensively, and apply machine learning and deep learning algorithms to improve system performance and predictability
  5. Contribute to our GitHub-based CI workflow to streamline and optimize processes

Skills

Required

  • Strong object-oriented programming background
  • Java
  • Experience developing large-scale cloud infrastructure applications
  • Kubernetes
  • Message brokers
  • Relational databases such as MySQL
  • NoSQL databases such as Elasticsearch
  • BS/MS in Computer Science, Computer Engineering, or equivalent experience
  • 9+ years of proven experience

Nice to have

  • Real-world experience with distributed systems
  • Containers
  • Kubernetes API
  • Computer algorithms
  • Breaking down complex problems
  • Crafting, implementing, and deploying major infrastructure features
  • Machine Learning and Data Analytics
  • Building simple systems that operate efficiently with minimal support

What the JD emphasized

  • 9+ years of proven experience
  • Proven experience in developing large-scale cloud infrastructure applications