What you'd actually do

Own end-to-end frontend and backend network design, deployment, and operations for AI and compute lab clusters

Serve as a primary networking point of contact for backend fabrics, including Arista- and internally developed network OS-based scale-out networks supporting AI workloads

Design, deploy, and support high-throughput, low-latency cluster networking, including congestion management (PFC/ECN), RDMA validation, and lossless transport

Perform hands-on troubleshooting and root-cause analysis across L1–L4 using packet captures, telemetry, and vendor tools to resolve complex lab issues

Support silicon, hardware, and software bring-ups, ensuring reliable connectivity and on-time validation

Skills

Required

6+ years of experience designing, deploying, and operating network infrastructure in production or lab environments
Experience working in multi-vendor environments, including Arista, FBOSS-based platforms, and lab networking hardware
Experience with configuration management, code repositories, and zero-touch provisioning (ZTP) for network infrastructure
Experience with IPv4/IPv6 and L2/L3 protocols, including STP, OSPF, BGP, TCP/IP, DHCP, DNS, VLANs, VRRP, LACP, MC-LAG, ACLs, MACsec, and EVPN/VXLAN
Working knowledge of scripting or programming languages (e.g., Python, shell) for automation and tooling
Demonstrated ability to operate consistently on your own initiative, seeking feedback and input where appropriate, in a global, time-critical environment with multiple priorities and mission-critical timelines
Understanding of physical infrastructure design, including structured cabling, space, power, and cooling systems
Networking L1 expertise in validating multi-vendor optics, with proficiency using the BCM shell and I2C utilities to troubleshoot hardware-level issues
Experience with network automation, CI/CD pipelines, audit frameworks, and validation tooling
Hands-on experience with backend cluster networking, including scale-out fabrics, RDMA networks, and congestion management
Experience supporting AI/ML or high-performance compute clusters in lab or pre-production environments
Hands-on experience with lab test equipment, optics qualification (e.g., 400G/800G), optical switches, and physical infrastructure
Experience adhering to and implementing responsible, ethical AI practices (e.g., risk assessment, bias mitigation, quality and accuracy reviews)
Demonstrated ability to integrate AI tools to optimize/redesign workflows and drive measurable impact (e.g., efficiency gains, quality improvements)
Hands-on experience with disaggregated networking products and software, such as Meta's open network OS (FBOSS), SONiC, Cumulus Linux, or equivalent open networking platforms

Nice to have

Bachelor's degree in Computer Science, Computer Engineering, a relevant technical field, or equivalent practical experience
Networking certifications such as CCIE, JNCIE, or equivalent
Demonstrated ongoing AI skill development (e.g., prompt/context engineering, agent orchestration) and staying current with emerging AI technologies

Meta's Lab Infrastructure, Network, Compliance, and Security (LINCS) team is seeking a network engineer to help build and scale the network infrastructure supporting Meta's global engineering labs. Our team is responsible for network design, deployment, and operations across these labs, where we support multiple engineering teams. As new technologies such as the Metaverse and Gen AI mature rapidly, there are significant opportunities to rethink traditional networking and iterate quickly in our environment. This role offers an opportunity to work directly with engineering teams that are maturing new hardware and software on the path to production.

Responsibilities

Own end-to-end frontend and backend network design, deployment, and operations for AI and compute lab clusters
Serve as a primary networking point of contact for backend fabrics, including Arista- and internally developed network OS-based scale-out networks supporting AI workloads
Design, deploy, and support high-throughput, low-latency cluster networking, including congestion management (PFC/ECN), RDMA validation, and lossless transport
Perform hands-on troubleshooting and root-cause analysis across L1–L4 using packet captures, telemetry, and vendor tools to resolve complex lab issues
Support silicon, hardware, and software bring-ups, ensuring reliable connectivity and on-time validation
Lead and execute lab network lifecycle activities, including upgrades, migrations, capacity expansions, and decommissioning across regions
Develop and maintain network automation, configuration templates, and zero-touch provisioning (ZTP) workflows
Create and maintain MOPs, runbooks, and readiness checklists for internal teams and vendor executions
Provide direct consultation and training to cross-functional partners, enabling teams to operate and troubleshoot lab networks
End-to-end ownership of projects from requirements definition through customer handoff
Collaborate closely with hardware, software, systems, and lab operations teams to validate new platforms, optics, and network designs
Support limited travel (about 10%) for critical lab builds, migrations, or escalations

Qualifications

Bachelor's degree in Computer Science, Computer Engineering, a relevant technical field, or equivalent practical experience
6+ years of experience designing, deploying, and operating network infrastructure in production or lab environments
Experience working in multi-vendor environments, including Arista, FBOSS-based platforms, and lab networking hardware
Experience with configuration management, code repositories, and zero-touch provisioning (ZTP) for network infrastructure
Experience with IPv4/IPv6 and L2/L3 protocols, including STP, OSPF, BGP, TCP/IP, DHCP, DNS, VLANs, VRRP, LACP, MC-LAG, ACLs, MACsec, and EVPN/VXLAN
Working knowledge of scripting or programming languages (e.g., Python, shell) for automation and tooling
Demonstrated ability to operate consistently on your own initiative, seeking feedback and input where appropriate, in a global, time-critical environment with multiple priorities and mission-critical timelines
Understanding of physical infrastructure design, including structured cabling, space, power, and cooling systems
Experience adhering to and implementing responsible, ethical AI practices (e.g., risk assessment, bias mitigation, quality and accuracy reviews)
Networking L1 expertise in validating multi-vendor optics, with proficiency using the BCM shell and I2C utilities to troubleshoot hardware-level issues
Experience with network automation, CI/CD pipelines, audit frameworks, and validation tooling
Hands-on experience with backend cluster networking, including scale-out fabrics, RDMA networks, and congestion management
Experience supporting AI/ML or high-performance compute clusters in lab or pre-production environments
Hands-on experience with lab test equipment, optics qualification (e.g., 400G/800G), optical switches, and physical infrastructure
Networking certifications such as CCIE, JNCIE, or equivalent
Demonstrated ongoing AI skill development (e.g., prompt/context engineering, agent orchestration) and staying current with emerging AI technologies
Demonstrated ability to integrate AI tools to optimize/redesign workflows and drive measurable impact (e.g., efficiency gains, quality improvements)
Hands-on experience with disaggregated networking products and software, such as Meta's open network OS (FBOSS), SONiC, Cumulus Linux, or equivalent open networking platforms

Network Engineer, Engineering R&D Environments
