Principal Network Development Engineer

Oracle · Enterprise · Seattle, WA +1

This role focuses on the design, deployment, and operations of network fabric and systems for Oracle Cloud Infrastructure (OCI), specifically supporting AI/ML and HPC workloads. The engineer will work with RDMA cluster networking, network protocols, scripting, automation, and data center design, acting as a subject matter expert and providing technical leadership.

What you'd actually do

  1. Support the design, deployment, and operations of the large-scale, global Oracle Cloud Infrastructure (OCI).
  2. Focus primarily on developing and supporting network fabric and systems, combining a deep, protocol-level understanding of networking with programming skills.
  3. Develop solutions to enable front line support teams to act on network failure conditions.
  4. Mentor junior engineers.
  5. Participate in the network solution and architecture design process and contribute to the roadmap’s development.

Skills

Required

  • Bachelor’s degree in CS or related engineering field with 6+ years of Network Engineering experience or master’s with 5+ years of Network Engineering experience
  • Experience working in a large ISP or cloud provider environment
  • Experience working in a network operations role
  • Strong knowledge of protocols such as MPLS, BGP/OSPF/IS-IS, TCP, IPv4, IPv6, DNS, and DHCP
  • Extensive experience with scripting or automation and data center design
  • Python preferred, but must demonstrate expertise in a scripting or compiled language
  • Experience with networking protocols such as TCP/IP, VPN, DNS, DHCP, and SSL
  • Experience with network monitoring and telemetry solutions
  • Experience with network modeling and programming – YANG, OpenConfig, NETCONF
  • Ability to use professional concepts and company objectives to resolve complex issues in creative and effective ways
  • Capable of working under limited supervision
  • Excellent organizational, verbal, and written communication skills
  • Excellent judgment in influencing product roadmap direction, features, and priorities
  • Participate in an on-call rotation
  • Collaborate with program/project managers to develop milestones and deliverables
  • Primarily use existing procedures and tools to develop and safely execute network changes
  • Develop solutions to enable front line support teams to act on network failure conditions
  • Mentor junior engineers
  • Participate in the network solution and architecture design process and contribute to the roadmap’s development
  • Participate in operational rotations as either primary or secondary
  • Provide break-fix support for events
  • Serve as the escalation point for event remediation
  • Lead post-event root cause analysis
  • Frequently develop scripts to automate routine tasks for the team and business units
  • Coordinate with networking automation services for the development and integration of support tooling
  • Coordinate with network monitoring to gather telemetry and create alert rules from it
  • Build dashboards that represent data at various network layers and device roles to help identify network issues and anomalies
  • Serve as SME on software development projects for network automation and network monitoring
  • Collaborate with network vendor technical account teams and the internal Quality Assurance team to drive bug resolution and assist in the qualification of new firmware and/or operating systems
  • Tackle complex technical challenges with creative solutions, balancing both immediate and long-term considerations to drive business and product outcomes
  • Lead by example, providing technical leadership and mentorship to peers, while influencing cross-functional teams through collaboration and clear communication
  • Take end-to-end responsibility for key technical initiatives and projects, ensuring timely and high-quality delivery
  • Able to adapt to changing technology and find new approaches to existing challenges

Nice to have

  • Experience with RDMA networking is a plus
  • Experience with VXLAN and EVPN is an added advantage

What the JD emphasized

  • RDMA cluster networking domain
  • AI, ML, HPC workloads
  • network fabric and systems
  • protocol level
  • scripting or automation
  • network monitoring and telemetry solutions
  • network modeling and programming
  • network change
  • network failure conditions
  • network solution and architecture design process
  • operational rotations
  • event remediation
  • root cause analysis
  • automate routine tasks
  • networking automation services
  • network monitoring
  • network layers and device roles
  • software development projects for network automation and network monitoring
  • network vendor technical account team
  • new firmware and/or operating systems
  • technical initiatives and projects