Lead Systems Administrator

Braze · Enterprise · Chicago, IL · Engineering

The Lead Systems Administrator is responsible for the reliability, security, and operational excellence of IT services, including incident response, root cause analysis, system improvement through automation, and mentoring. The role involves advanced support for SaaS platforms and infrastructure, scripting for efficiency, and managing vendor relationships.

What you'd actually do

  1. Serve as the primary escalation point for the Service Desk to investigate and resolve complex technical issues
  2. Own the maintenance, configuration, availability, and business continuity of core IT services
  3. Act as Incident Manager or partner closely with Incident Management during service outages and security incidents, ensuring clear and timely communication to the business
  4. Identify recurring issues, define corrective actions, and implement long-term solutions
  5. Provide advanced support for Google Workspace, including email delivery, permissions, security issues, and service integrations

Skills

Required

  • Google Workspace
  • Slack
  • Okta
  • Zscaler
  • macOS
  • networking
  • AWS
  • VMware
  • Bash
  • Python
  • Ruby
  • Jira
  • Git
  • GAM
  • IT operations best practices
  • security
  • storage
  • data protection
  • disaster recovery
  • networking fundamentals
  • OSI model
  • software development lifecycle principles

Nice to have

  • ITIL Foundation
  • AWS
  • Azure
  • Google Cloud Platform

What the JD emphasized

  • technical escalation point
  • reliability, security, and operational excellence
  • lead incident response
  • deep root cause analysis
  • proactively improve systems through automation and process improvements
  • mentor and subject matter expert
  • escalation point
  • complex technical issues
  • maintenance, configuration, availability, and business continuity
  • Incident Manager
  • service outages and security incidents
  • recurring issues
  • long-term solutions
  • advanced support
  • escalated issues
  • Troubleshoot and maintain integrations
  • Tier 3 support
  • design, write, and maintain custom scripts or applications
  • improve system efficiency and reduce manual effort
  • Manage vendor relationships
  • Investigate and remediate security-related issues
  • Mentor team members
  • promote IT best practices
  • hands-on problem solver
  • takes ownership
  • strives for operational excellence
  • supporting SaaS platforms
  • API-based administration
  • virtualization and cloud environments
  • scripting and automation
  • designing, implementing, and improving IT services
  • Strong understanding of IT operations best practices
  • security, storage, data protection, and disaster recovery
  • Self-directed
  • detail-oriented
  • prioritizing work based on impact and urgency
  • Strong networking fundamentals
  • software development lifecycle principles