Slack Proactive Monitoring Engineer

Salesforce · Enterprise · Indianapolis, IN +3

Salesforce is seeking a Slack Proactive Monitoring Engineer to ensure the health and performance of Slack's enterprise deployments. This role involves continuous monitoring of metrics, detecting anomalies, triaging issues, and coordinating proactive customer outreach and remediation efforts. The engineer will also partner with Customer Success teams on technical reviews and perform root cause analysis.

What you'd actually do

  1. Continuously monitor dashboards, alerting systems, and telemetry data (error rates, latency spikes, API failures, deployment anomalies) for early signals of degradation.
  2. Triage and correlate alerts from multiple sources (Splunk, internal tools, etc.) to identify patterns before customers report issues.
  3. Actively monitor Slack platform health dashboards, network latency signals, message delivery queues, and database capacity for high-frequency workspaces.
  4. Identify customers potentially affected by degraded service conditions and coordinate proactive outreach with Customer Success and Support teams.
  5. Perform root cause analysis (RCA) on proactively detected issues, documenting findings in internal case and incident management systems.

Skills

Required

  • technical support
  • site reliability engineering
  • observability and monitoring tools
  • cloud-based SaaS architecture
  • APIs
  • logs, metrics, and traces analysis
  • communication skills

Nice to have

  • Slack platform experience
  • Salesforce Service Cloud / OrgCS case management
  • scripting or automation (Python, JavaScript, Bash)
  • customer-facing support engineering or reliability role at a SaaS company
  • ITIL, SRE, or similar certification

What the JD emphasized

  • 2+ years of experience in technical support, site reliability engineering, or a related operations role.
  • Hands-on experience with observability and monitoring tools (e.g., Grafana, Splunk, Datadog, PagerDuty, or equivalent).
  • Proficiency in reading and analyzing logs, metrics, and traces.