Network Development Snr Manager

Oracle · Enterprise · Seattle, WA +1

Senior Manager for the Network Reliability Engineering team, which is responsible for operational excellence in OCI's physical network supporting AI/ML and GPU workloads in the cloud. Duties include managing engineers, defining roadmaps, establishing service availability metrics, driving strategic initiatives for HPC and AI/ML capabilities, solving distributed systems problems, and improving network monitoring and automation.

What you'd actually do

  1. Attract, develop & manage a team of highly skilled network engineers
  2. Define and develop roadmaps to deliver operational efficiencies
  3. Establish & report on a body of metrics that define service availability
  4. Drive strategic technology initiatives to deliver and operate HPC and AI/ML capabilities for our customers
  5. Solve difficult problems in distributed systems, infrastructure, and highly available services

Skills

Required

  • Network Reliability Engineering
  • Cloud Networking
  • High Performance Computing (HPC)
  • GPU Systems
  • Distributed Systems
  • Infrastructure Management
  • Network Monitoring
  • Automation
  • Engineering Management
  • Hiring
  • Onboarding
  • Performance Management

Nice to have

  • Large scale physical network reliability
  • Technical leadership
  • Experience at large enterprises, ISPs, or cloud providers
  • Organizational skills
  • Verbal communication skills
  • Written communication skills
  • Judgment to influence product roadmap direction, features, and priorities

What the JD emphasized

  • AI/ML
  • GPU workloads