Performance & Capacity Engineering - Capacity Planning Optimization

Meta · Big Tech · Bellevue, WA +2

Meta is seeking a Performance & Capacity Engineer to optimize site-wide performance and capacity planning for Meta's products and infrastructure. This role involves building software and mathematical optimization models to manage capacity, power, and cost, with a focus on strategic-level impact and cross-functional collaboration. The role requires experience with optimization algorithms, LP solvers, distributed systems, and integrating AI tools for workflow optimization.

What you'd actually do

  1. Own infrastructure capacity planning for Meta, including servers, data centers, and network
  2. Design, implement and launch software systems to improve capacity planning efficiency and quality, partnering with software engineers
  3. Contribute to end-to-end capacity planning processes, methodologies, and data to deliver executable and optimized plans
  4. Manage and resolve critical escalations and exceptions in all areas of capacity planning
  5. Build linear programming models to perform simulation and optimization studies of demand and supply projections, scenario planning, and feasibility analysis while balancing various constraints
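
For a sense of what item 5 means in practice, here is a minimal sketch of a capacity-planning linear program. Everything in it is an assumption made for illustration: the two regions, the demand, power, and rack-space figures, and the per-server costs are invented, not taken from the listing. It also swaps a commercial solver such as Xpress or Gurobi for brute-force vertex enumeration, which only works for toy two-variable problems but needs nothing beyond the Python standard library.

```python
from itertools import combinations

# Hypothetical toy problem: place servers in two regions, A and B.
# x = servers in region A, y = servers in region B.
# Each constraint is (a, b, c), meaning a*x + b*y <= c.
CONSTRAINTS = [
    (-1.0, -1.0, -100.0),  # x + y >= 100   (meet total demand)
    (1.0, 2.0, 150.0),     # x + 2y <= 150  (power budget, kW)
    (1.0, 0.0, 80.0),      # x <= 80        (rack space in A)
    (0.0, 1.0, 60.0),      # y <= 60        (rack space in B)
    (-1.0, 0.0, 0.0),      # x >= 0
    (0.0, -1.0, 0.0),      # y >= 0
]
COST = (3.0, 2.0)  # made-up cost per server in A and in B


def total_cost(point):
    """Objective: total cost of a placement (x, y)."""
    return COST[0] * point[0] + COST[1] * point[1]


def feasible(point, eps=1e-9):
    """True if the point satisfies every constraint (within tolerance)."""
    x, y = point
    return all(a * x + b * y <= c + eps for a, b, c in CONSTRAINTS)


def vertices():
    """Yield feasible intersections of every pair of constraint lines."""
    for (a1, b1, c1), (a2, b2, c2) in combinations(CONSTRAINTS, 2):
        det = a1 * b2 - a2 * b1
        if abs(det) < 1e-12:
            continue  # parallel lines never intersect
        # Cramer's rule for the 2x2 system.
        x = (c1 * b2 - c2 * b1) / det
        y = (a1 * c2 - a2 * c1) / det
        if feasible((x, y)):
            yield (x, y)


def solve():
    """Return (plan, cost) for the cheapest vertex, or None if infeasible."""
    best = min(vertices(), key=total_cost, default=None)
    if best is None:
        return None
    return best, total_cost(best)


if __name__ == "__main__":
    print(solve())
```

With these invented numbers, the cheapest feasible plan puts 50 servers in each region at a cost of 250: region B is cheaper per server, but the power budget caps it at 50. Tightening the power budget or raising demand until `solve()` returns `None` is a small-scale version of the feasibility analysis the item describes. A real model would have many more variables and would be handed to a proper LP solver.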

Skills

Required

  • performance or software engineering, and/or optimization-related data science
  • designing and implementing models and optimization algorithms
  • coding/scripting languages such as Python, R, Java, C, C++, PHP
  • LP solvers such as Xpress or Gurobi
  • distributed systems at scale
  • infrastructure operations and technical infrastructure knowledge
  • cross-functional teams
  • optimizing complex systems, working with large datasets, and driving business impact
  • integrating AI tools to optimize/redesign workflows
  • responsible, ethical AI practices
  • ongoing AI skill development

Nice to have

  • AI, Metaverse

What the JD emphasized

  • building software and mathematical optimization models, not manual planning
  • optimize capacity plans and scalably manage exceptions at the most strategic levels, with company-level impact
  • optimization algorithms
  • LP solvers
  • AI tools to optimize/redesign workflows
  • responsible, ethical AI practices
  • ongoing AI skill development