Senior Cloud Engineer (r4222)

Shield AI · Defense · San Diego, CA +2 · Cloud & Infrastructure

This role supports applied AI development by engineering, deploying, provisioning, and managing critical cloud systems in multi-cloud environments (Azure, AWS). The Senior Cloud Engineer will focus on performance, scalability, reliability, and security compliance, collaborating with AI, DevOps, and Security teams. Responsibilities include infrastructure management, system monitoring, troubleshooting, and automation using IaC and scripting languages.

What you'd actually do

  1. Manage and optimize multi-cloud infrastructure (Azure, AWS) for performance, reliability, and scalability.
  2. Support and optimize cloud and virtual machine environments, assisting with capacity planning, performance monitoring, security compliance, and vulnerability remediation.
  3. Assist in implementing and maintaining infrastructure systems, including servers, storage, backup solutions, and disaster recovery processes, for both public and private clouds.
  4. Collaborate cross-functionally with AI, DevOps, and Security teams to ensure compliance, observability, and resilience in mission-critical environments.
  5. Provide escalated support for operational issues, possibly both during and after normal business hours, covering systems, workloads, and Kubernetes AI infrastructure.

Skills

Required

  • Bachelor’s degree in Computer Science or a related field, or equivalent experience (4+ years) plus an engineer-level certification (Azure/AWS Associate or another certification of a similar level).
  • 4 years’ experience supporting applications and systems in a production environment; high-availability, mission-critical, or defense-grade environments preferred.
  • Comfortable improving operational efficiency with Infrastructure as Code (IaC) tools (e.g., Terraform, Ansible).
  • Strong understanding of networking concepts (VPCs, VPNs, subnets, routing, firewalls).
  • Experience in automating repetitive tasks using scripting languages such as PowerShell, Python, or Bash.
  • Experience deploying and administering at least one Linux distribution (e.g., RHEL, Ubuntu).
  • Experience with Microsoft Windows Server administration concepts and with Azure and Active Directory environments.
  • Possesses organizational skills, with a process-oriented mindset, attention to detail, and effective verbal and written communication abilities.
  • Ability to work independently to accomplish assigned tasks.
  • Solution-oriented, constructive approach to problem-solving.

Nice to have

  • Experience deploying and maintaining workloads in Azure public cloud environments.
  • Hands-on experience with containerization and Kubernetes-based workloads.
  • Strong understanding of virtualization and private cloud platforms (e.g., VMware, Hyper-V, KVM).
  • Background in DevOps, Site Reliability Engineering (SRE), or cloud infrastructure roles.
  • Proficiency with configuration management and automation tools (e.g., Ansible, Chef, Puppet, Terraform).
  • Experience building and optimizing CI

What the JD emphasized

  • mission-critical environments
  • Kubernetes AI infrastructure