Intercom’s approach to a great on-call experience | Brian Scanlan (Intercom)
In this deep-dive episode, Brian Scanlan, Principal Systems Engineer at Intercom, describes how the company’s on-call process works. He explains how the process started and key changes they’ve made over the years, including a new volunteer model, changes to compensation, and more.
Discussion points:
Discussion points:
- (1:28) How on-call started at Intercom
- (10:11) Brian’s background and interest in being on-call
- (14:06) Getting engineers motivated to be on-call
- (16:37) Challenges Intercom saw with on-call as it grew
- (19:53) Having too many people on-call
- (23:20) Having alarms that aren’t useful
- (26:03) Recognizing uneven workload with compensation
- (27:22) Initiating changes to the on-call process
- (30:08) Creating a volunteer model
- (33:02) Addressing concerns that volunteers wouldn’t take action on alarms
- (34:40) Equitability in a volunteer model
- (36:36) Expectations of expertise for being on-call
- (40:56) How volunteers sign up
- (44:15) The Incident Commander role
- (46:19) Using code review for changes to alarms
- (50:02) On-call compensation
- (52:50) Other approaches to compensating on-call
- (55:08) Whether other companies should compensate on-call
- (57:32) How Intercom’s on-call process compares to other companies
- (1:00:46) Recent changes to the on-call process
- (1:04:13) Balancing responsiveness and burnout
- (1:07:12) Signals for evaluating the on-call process
Mentions and links:
- Follow Brian on LinkedIn or Twitter
- Brian’s article: How we fixed our on call process to avoid engineer burnout
- Gergely Orosz’s On-Call Compensation