Systems Engineer II, Compute

Crusoe · Data AI · San Francisco, CA - US · Cloud Engineering

Crusoe is an AI infrastructure company building and operating compute platforms for AI workloads. This Systems Engineer II role focuses on designing, developing, and optimizing Crusoe's virtualized compute platform for AI workloads. Responsibilities include managing the virtualization stack across thousands of servers, integrating with AI hardware, optimizing performance for AI/ML workloads, and troubleshooting complex system issues. The role requires strong Linux systems knowledge, hardware integration experience, distributed systems design, and software development skills.

What you'd actually do

  1. Design highly reliable and performant Linux applications used to manage our virtualization stack across thousands of AI compute servers in multiple global datacenters.
  2. Integrate Crusoe applications with a wide variety of hardware and software AI chip-vendor stacks. Build solutions to optimize and monitor virtualized hardware (GPUs, InfiniBand/RoCE NICs, ephemeral storage, etc.) in cutting-edge AI/HPC environments.
  3. Work side by side with our Linux Kernel and Hypervisor teams to ensure our Crusoe applications are seamlessly integrated with a variety of kernels and hypervisors.
  4. Analyze and enhance the performance of the entire virtualization stack, from the hypervisor to the virtualized guest OS, with a specific focus on optimizing AI/ML workloads. This includes profiling, bottleneck identification, and implementing low-level optimizations.
  5. Diagnose and resolve complex system issues across our virtualization stack (drivers, kernel, hypervisor, guest OS, and Crusoe applications). Work closely with kernel and hypervisor teams to debug and resolve integration challenges.

Skills

Required

  • Linux kernel
  • virtualization
  • hardware tuning
  • distributed systems
  • object-oriented programming
  • low-level systems programming
  • Linux systems
  • device drivers
  • memory management
  • process scheduling
  • GPUs
  • CPUs
  • InfiniBand
  • Ethernet NICs
  • ephemeral disks
  • PCI Express
  • distributed applications
  • highly-scalable systems design
  • communications protocols (gRPC, REST, TCP/IP, etc.)
  • databases (Postgres, Redis)
  • messaging and streaming systems (Pub/Sub, Kafka)
  • application languages (Go, Java, Python)
  • systems languages (C, C++, Rust)
  • clean, maintainable code
  • unit-test driven mindset
  • excellent communication skills
  • rapid and agile learning
  • virtualization concepts
  • hypervisors
  • virtual machine lifecycles
  • Linux KVM tooling
  • CI/CD
  • GitLab or GitHub CI/CD pipelines

Nice to have

  • virtualization specifically for AI/ML workloads
  • GPU virtualization
  • debugging or contributing to kernel or hypervisor code
  • configuring thousands of live compute nodes in a bare-metal production environment

What the JD emphasized

  • critical to this role
  • must
  • highly reliable and performant
  • cutting-edge AI/HPC environments
  • seamlessly integrated
  • specific focus on optimizing AI/ML workloads
  • complex system issues
  • integration challenges
  • highest level of software quality, reliability, and security
  • cohesive and integrated product development
  • technical excellence
  • Linux Systems Familiarity
  • Solid understanding of hardware devices
  • Strong grasp of distributed applications and highly-scalable systems design
  • Strong experience building software applications
  • Keen eye for clean, maintainable code
  • unit-test driven mindset
  • Excellent Communication Skills
  • Rapid and Agile Learner
  • Virtualization Concepts
  • CI/CD and Validation

Other signals

  • AI compute servers
  • AI hardware platform integration
  • optimizing and monitoring virtualized hardware
  • optimizing AI/ML workloads
  • virtualization stack