Staff Engineer – Vulnerability Management Automation (Platform and Tools - Vms)

GEICO · Insurance · Bethesda, MD +3

Staff Engineer focused on building and operating large-scale automation for vulnerability discovery, prioritization, and remediation, including safe OS patch orchestration. The role involves designing and implementing services, controllers, schedulers, and integrations for a VM lifecycle management platform on Kubernetes, with a focus on security engineering, platform engineering, and software development.

What you'd actually do

  1. Define the technical roadmap for vulnerability management and patch automation platforms.
  2. Establish standards, patterns, and paved roads for scanning, triage, remediation, and verification.
  3. Design and implement services for asset/CMDB enrichment, risk scoring, and intelligent targeting (by business criticality, exposure, blast radius).
  4. Build controllers/schedulers for maintenance windows, deployment rings/canaries, pre/post checks, automated backoff/rollback, and progressive delivery.
  5. Deliver self‑service CLIs/SDKs and internal UIs to request, schedule, and track remediation with clear SLAs and audit trails.

Skills

Required

  • Kubernetes
  • Platform Engineering
  • Security Engineering
  • Software Development
  • Automation
  • Vulnerability Management
  • Patch Orchestration
  • API Design
  • Event-driven pipelines
  • Controllers
  • Schedulers
  • Integrations
  • Infrastructure as Code
  • Configuration Management
  • Asset/CMDB enrichment
  • Risk Scoring
  • Policy-driven workflows
  • Windows and Linux patching
  • CIS hardening
  • Drift detection
  • CMDB
  • ITSM/ticketing systems
  • Change control
  • SLOs
  • Telemetry
  • CLIs/SDKs
  • Internal UIs

Nice to have

  • Tenable/Nessus, Qualys, Rapid7
  • CVSS v3.x, KEV, EPSS
  • WSUS/MECM/SCCM, Ansible/Puppet/Chef/Salt, dnf/yum/apt, Winget/MSU
  • Remedy, ServiceNow
  • Packer
  • SIEM
  • data lake

What the JD emphasized

  • zero-downtime
  • low downtime
  • safe, zero-to-low downtime OS patch orchestration
  • policy-driven
  • observable software
  • vulnerability discovery, prioritization, and remediation
  • safe guardrails
  • SLOs
  • standardization
  • reuse
  • operational toil
  • safe changes
  • risk
  • audit trails
  • auditability