Software Development Manager

Oracle · Enterprise · Seattle, WA +1

Experienced engineering manager to lead an AI Data Plane team responsible for building and managing a high-performance GPU platform for AI/ML/HPC workloads on Oracle Cloud Infrastructure. The role involves hands-on technical management, people management, and process improvement, focusing on delivering scalable and highly available cloud services for hyper-scale AI customers.

What you'd actually do

  1. Engineering and project management. You'll work with customers and stakeholders to understand their feature requirements and pain points. You'll build roadmaps and make priority tradeoffs to deliver those features. You'll plan and drive engineering projects that deliver that value to our customers, managing risks and blockers along the way. You'll be expected to take large, technically complex projects, break them down into manageable pieces, develop actionable plans, and deliver them successfully.
  2. People management. You'll handle coaching and performance management for the engineers on your team. You'll forecast staffing needs, and then work with our recruiting teams to hire for those positions.
  3. Process improvement. You'll work across engineering teams to identify and resolve systemic problems and bottlenecks, to improve our overall engineering velocity and efficiency.
  4. Cross-functional collaboration. This role is inherently cross-functional and requires the ability to collaborate with the PM team and with the OCI Core Service teams (Compute, Storage, Network).

Skills

Required

  • 4+ years of engineering management and people management experience
  • 5+ years leading large cross-functional projects and operating large distributed services in cloud environments
  • 5+ years of owning/driving roadmap strategy and definition
  • Bachelor's degree in computer science or related engineering field
  • Strong technical knowledge of distributed systems, high-performance computing, and GPU systems
  • Proven industry expertise in high-scalability, high-availability, and low-latency domains
  • Excellent organizational, verbal, and written communication skills
  • Experience working in cloud platforms (AWS, OCI, GCP, Azure, etc.)

Nice to have

  • On-call experience

What the JD emphasized

  • high-performance GPU platform
  • AI/ML/HPC workloads
  • thousands of GPUs
  • AI Data Plane (DP) team
  • highly available, massive scale, integrated cloud service
  • hyper-scale AI customers
  • hands-on technical management
  • high-performance, mission-critical environments
  • distributed systems
  • high scalability, availability, low-latency domains