Engineering Manager - OpenBMC Platform

NVIDIA · Semiconductors · Santa Clara, CA

Engineering Manager for the OpenBMC Platform at NVIDIA, responsible for owning and delivering an end-to-end manageability stack for Data Center Systems. This role involves managing a team of software engineers, designing and building OpenBMC-based firmware, and ensuring the quality, reliability, and performance of the delivered firmware.

What you'd actually do

  1. Own and deliver the OpenBMC-based manageability stack for next-generation Data Center Compute Systems.
  2. Own the quality, reliability, and telemetry performance of firmware delivered to data centers.
  3. Manage and lead a distributed team of software engineers to deliver a high-quality firmware stack.
  4. Work with data center architects and cloud customers to establish correct requirements and scope, ensuring speed-of-light product development.
  5. Work closely with cross-functional teams to ensure a scalable manageability architecture for all data center products.

Skills

Required

  • BS, MS, or PhD in EE/CS or a related field, or equivalent experience.
  • 10+ years of overall relevant experience working on server firmware (BMC) and platform software development.
  • 5+ years of experience managing a software/firmware engineering team.
  • Hands-on experience with data center health management workflows.
  • Proven record of delivering server firmware for large data centers.
  • Strong knowledge of data center management, server architecture, and server manageability.
  • Strong and demonstrable skill in C/C++ and Python.
  • Experience programming and debugging server platforms.
  • Experience with SCM tools (e.g., Git, Perforce) and project management tools such as Jira.
  • Excellent written and oral communication skills, a strong work ethic, a strong sense of teamwork, and a love of producing quality work.
  • Self-starter who loves to find creative solutions to complicated problems.

Nice to have

  • Hands-on experience with a BMC firmware/software stack for data center health management and server manageability.
  • Proven record as an engineering manager driving large, complex problems with teams of 25+ engineers.

What the JD emphasized

  • server firmware (BMC)
  • server manageability