How SiriusXM revamped their platform and developer experience | Jared Wolinsky
Abi Noda: Jared, it's exciting to finally have you on the show. Really excited to chat with you today. Thanks for your time.
Jared Wolinsky: No problem. Excited to be here. Been a medium time listener and a big fan of the product and the podcast, so happy to be here.
Abi Noda: Awesome. Well, we wanted to start our discussion off today with recapping the journey that your organization has been on this past year and a half, eventually arriving at this re-imagining of the developer platform, which we'll dive into soon. As I understand it, for the past year and a half, your organization was upstream of a pretty major undertaking at SiriusXM that involved the rebuilding of streaming apps. Share more with listeners about what this big business push was and what your org's role was in this effort.
Jared Wolinsky: Yeah, definitely. I joined the company at about the third quarter of 2022 amidst a major push to compete more in the streaming space. Everybody knows SiriusXM as the satellite radio company. We're in the car. We have a great listener base within the car. Fewer people know that we also have separate digital streaming apps where you can listen to the same content that you get in the car on your iPhone, Android, a number of consumer devices, your Fire TV, Apple TV, Samsung, all of that, and on the web. In the past, it just hasn't been a major part of our business and our push to get subscribers. That's changed. It's obvious to us and to the market, I think, that as options for streaming and personal devices for listening to all types of audio continue to expand and get better and even push their way into the car, this needs to be a major part of our strategy.
Again, around 2022, we decided that in order to innovate at the pace that we wanted to and do the kinds of things from a user experience standpoint that we wanted, we needed to modernize the tech stack. We needed to really put ourselves in a position to move fast in a way that a modern technology organization can and should. So, to your point earlier, we decided to rebuild the whole thing completely from scratch. Everything from the cloud accounts to the backend services to the client applications on all the consumer devices I mentioned.
Now you mentioned upstream. Of course, in order for anybody to actually get software development done and build any of this, there needs to be the developer platform there, and that's where my team came in. I came aboard and put together an organization to really, again, from scratch stand up a new cloud footprint on AWS, figure out the spectrum of guardrails to paved paths and permissions and all of that, and account structure that we needed to do. Then everything on top of that that a modern software development organization needs to do what they do day in and day out, delivery pipelines, local development experience, observability, all of that stuff is under my team, platform engineering. Now, in a perfect world, we would be actually upstream of all the efforts to build the software, but in reality we were doing it in parallel and that presented, I think, some of the challenges that you and I have talked about in the past.
Abi Noda: When I talk to other organizations who've undergone efforts similar to what we're discussing today, it's often happening not in parallel to a major product revamp. I'm curious for you to take us into the trenches, having to make a lot of these mission-critical decisions with long-term impact under the heightened pressure of such a big business shift. Take us into what that experience has been like.
Jared Wolinsky: Yeah, I guess, rewinding a little bit, we did actually get the new platform and the new apps out at the end of last year, end of 2023. That part of the effort took about a year and a half. But to your point, that experience of building the plane while we're flying it was challenging in a couple of ways. The first one is that I came into this role having the rest of my experience being on the product engineering side, either as an individual contributor, software engineer or leading teams. I've been actually building the product software that drives the business at whatever company I've been at. Coming into this role with that experience, I have been the kind of user my current team serves, so I had a lot of big ideas about how to improve the user experience, how to treat this whole effort with a platform-as-a-product mindset. It's really a big part of our org strategy.
Having those aggressive deadlines and having to unblock teams who are building production software on the platform as you're building it, presented certain boundaries on what you can and can't do in terms of designing an ideal user experience, doing a lot of user research, putting together anything beyond an MVP in certain areas. We found ourselves really focusing on blockers for the first year, I would say, or close to a year. It showed up in our OKRs. I made it very clear our first objective was to unblock development teams building toward the platform.
We did everything we could to get those table stakes features out there. Obviously, we need to get AWS accounts stood up and get people onboarded into them. It's non-negotiable, we need to get that done first. We need to get people GitHub access so that they can actually push code and we need to connect those things. You're building the foundation from the ground up, and you, again, maybe have to revisit certain more fun product and user experience additions later on. I'd say that was the biggest challenge we faced.
Abi Noda: You were working within the constraints of this time crunch and having to make expedited decisions. When you reflect on the past year and a half in hindsight, what are maybe some of the shortcuts you feel you've made where there are regrets or at least changes that you plan to make now with full view of the problems?
Jared Wolinsky: Yeah, one big one comes to mind, and that is in the way that we structured the cloud accounts. We originally went out with what we call workload-based AWS accounts, where workload is a purposely ambiguous term that could mean any interrelated software and infrastructure that has a similar security profile or risk profile, probably worked on by the same team or teams. Where it makes sense, we set these boundaries around each workload. It could be commerce, it could be content services, it could be search, and each one of those gets its own account. That allows us to be a little more granular than the sort of monolithic overall platform account structure that, I'm burying the lede, we ended up with going to production. We did that for a couple of reasons, but the biggest one was just speed to market. Anytime you set boundaries at the account level in the cloud, you're going to run into some friction in certain cases, especially on a platform that was as immature as ours as we were building it out.
We over-indexed on making sure that the developers had as few obstacles as we could get them while they were building the production software, and we ended up moving most of the teams that were directly involved in the backend platform and the client applications into one account and kept some of the other things that weren't directly related to the product software, like the data org, the science org, and our own platform engineering stuff still in separate accounts. What that did at the expense of probably better security posture in the short term is it, again, unblocked teams as soon as they could get unblocked and they were able to accelerate a little faster.
One of the things we hit a little sooner than we thought we would was account limits. We've run into things like API rate limiting on the AWS side, some hard quotas on the number of IAM roles that the account can have on the dev side. It's taken up time and it's your classic tech debt. I can't imagine a clearer example of purposely taking on something that you're going to have to work back in the future for a benefit in the short term and then having to pay some of that back, which we're starting to do now.
Abi Noda: You launched these apps at the end of last year and you've shared with me that following the launch of these apps, there's been a pause that's allowed you to reimagine the developer platform and developer experience at SiriusXM. I know one thing that your organization has done is define and then begin to work backwards from what you've referred to as an idealized user guide for developers. Would love for you to walk through what that is and what inspired that effort.
Jared Wolinsky: Sure. Yeah, I think in my description of what it was like to build out all those table stakes foundational features while we were on a launch schedule, I wasn't very subtle in presenting that I wouldn't prefer to work that way, and this has given us the opportunity to work more how I envision this org working, which is product first and user first. As you mentioned, after the launch at the end of last year, we have some time now where the business facing engineering teams are focused on follow-ups, things that didn't make it into the scope for launch, fixing bugs and planning for future releases. But those are things that, for the most part, we are not a direct dependency for, so for the first time since I've been here and since the platform engineering org in its current state has existed, we have a chance to actually take a step back, define our own roadmap and start thinking about the user experience in all of the different areas that we own.
You mentioned the user guide. This has been our vehicle for breaking down what has been a really ambitious idea in some of our heads, certainly my own and my engineering leads and product leads, but is too large to remain in our heads. We decided to work backwards and we put together a user guide which is going to serve or should serve in the future as the actual document that a new engineer on the SiriusXM platform can go to and walk through from beginning to end what it looks like to onboard onto the platform as a user, all the way through creating a service locally, modifying it, running it on your own machine, deploying it to AWS, testing it, troubleshooting it, observing it in Datadog and CloudWatch and any number of other points in the SDLC that we're responsible for that we think are important to the user journey.
We did that. We painstakingly put together working session after working session with folks who own the different areas of our platform. We shopped it around with users. We really got it to the point where we think this is a really great user experience as a developer that I would want to have. We're using that now to identify the deltas between where we are at the moment and that experience and spinning out projects and initiatives to actually get there. That's the theme for this year on our team, is finally getting to that point where we're upping the user experience on the platform.
Abi Noda: In a moment, I want to ask you to actually walk through that developer workflow in detail, in first person, with me and listeners. Before that, I want to ask you about the process for creating this. Share more about the type of research or the type of iteration you've done. Did you build this journey up by observing what people were doing or is this more of a future vision of an ideal state that doesn't really exist at all within the organization? Would love to better understand the process for how you and your team define this.
Jared Wolinsky: Sure. It's a little column A, a little column B. It started based on the way our organization is structured today, which itself is based on parts of the developer workflow that we want to support and help developers do better. Taking a step back, my organization is made up of five teams. We've got CloudFoundation, which is the AWS ownership and core infrastructure. On top of that, you've got three teams that are responsible for core SDLC sections. DevX is the local development and developer tool team. We have a delivery team that's responsible for deployment and build in the pipelines. And observability, which is pretty self-explanatory, responsible for actually getting visibility into your software once it's running. Then we have a fifth team, enablement, that's responsible for telling that cohesive story about how it all works and unblocking and making teams more successful using the platform.
A lot of research went into how the teams are structured around the different areas we want to support to begin with. We started there when we were looking at the user guide. What is our user experience in each of these areas right now? We went from there to building out what we call the technologist journey, which is just a more specific developer focused version of the classic user journey that product managers put together for any consumer facing product. We used that to really work in sequence through what a developer would do on the platform. Some of the things that are in the technologist journey are, again, the core software development life cycle steps, like creating new software for the first time, iterating on it. That word is definitely doing a lot of legwork there, but everything from local development to testing everything that goes into changing or enhancing your software.
Then, of course, deployment. After that is what we call operation. Everything that happens when you have production software already out there, whether it's taking actions like scaling up or down a service, failing over something, or just using your observability tools to get insight into it. Then we have some things that are outside the standard software development life cycle. Onboarding steps, as I mentioned earlier. The workload onboarding, there's a new team or a new domain area, a new service that needs to be stood up on the platform. Then there's user onboarding, new developer. Then retirement, which isn't one we need to focus too much on right now just because we're probably not retiring software that's been made in the last year anytime soon, but it is one we want to call out and think about in the future.
Underlying all of this is the platform upgrade leg, which is crosscutting and not really part of the technologist journey, but something we thought was so important that we needed to think about it separately. How do we release enhancements to all of the capabilities that we're going to build in service of these parts of the technologist journey without requiring a huge amount of headache for the users that are going to be accessing it. We went from... Anyway, I'll stop there. I've been talking a while.
Abi Noda: This developer workflow you've created is, in effect, this macro paved path as we've talked about before. I get questions a lot from listeners about how to approach paved paths and implement them within their organizations. One question I was asked recently was where do you start? I want to ask you this question because I know in defining this idealized path you had to think about the different parts of the developer journey that you've described. Then also think about which are the most important ones or in scope to be solved. Share your advice and experience on thinking about where does this paved path begin and what's in scope, what's out of scope?
Jared Wolinsky: Yeah. I'll start with something I mentioned earlier, which was how we were prioritizing our work during this crunch time of building the initial version of the platform. It's anything that's actually blocking users from doing what they need to do, that comes first. That's almost cheating though. We're not talking about paved paths there at all. We're talking about unblocking and building any path at all. After that, assuming that you're not blocking your users and they have a way to do the basic things that they need to do, it's best to, of course, talk to some users, find out where the pain points are, and combine that with where you know that users are spending most of their time. I'm a big proponent of actually using a data oriented approach to prioritizing and figuring out which paved paths to build. We might talk a little bit more about that later, but whatever you can do to get an idea of what your users think is causing them the most consternation, where they're wasting the most time and, again, where you just know that developers spend most of their time is a good place to start.
For example, we use the terminology inner loop and outer loop to define the local development experience, as we've started calling it the local-ish development experience, because it might be on your local machine. It might be on a remote dev environment. It might involve a highly privileged sandbox account that's actually in the cloud, but wherever the user is doing their day-to-day iteration on the software, that's generally a good place to start, I think, because that's where you're going to get the most bang for your buck. Again, assuming that you have a deployment path and you have some observability in there and everything else has a baseline.
The outer loop, in most companies, looks like somebody pushing up a pull request to GitHub or GitLab, getting that approved and then deploying to the real hosting environment. You can improve that, but in a healthy development platform that should be taking up much less of people's time than the actual process of writing software and iterating on it locally. Again, not knowing any more context about specific people asking that question, I would look there to start.
Abi Noda: I think that's great advice. A similar question to that around how to get started that I've been discussing with others is getting an IDP, and I know that's such a buzzword, some type of internal developer platform stood up. Is that a prerequisite to ideate and/or execute on this type of vision around paved paths? I know you mentioned that investing in IDPs, first, can potentially run the risk of putting a target on your back. Share more about what your view is on that.
Jared Wolinsky: Yeah, assuming we're talking about an IDP as a full-featured out of the box platform that a team can set up and get a very opinionated paved path through the whole developer experience, I do have some advice there. It can be a really good idea to jumpstart a smaller organization. Maybe an organization that is not investing very heavily in a team like mine with a lot of resources there, or just a smaller startup or smaller company, assuming that you can get the budget to pay for one of those things, or you have the resources to stand up an open source version of it. That can be an easy low cognitive load way to get started. I don't think it's a prerequisite though, without a doubt.
I would follow the path that I was mentioning earlier, which is identify the parts of the developer experience, developer workflow that you think need the most work and that you're going to be able to deliver the most value to your users in the shortest amount of time and start there and build from it. The one caveat there is if you don't take any time to think about the bigger picture and how this whole thing connects together and how maybe the service creation story needs to be related to the service iteration story or the deployment story, you may wind up with a Frankenstein's monster of well-intentioned features that you've built that present a really difficult challenge or high cognitive load in working from one step to the other.
It's definitely a balance. What I would say about having a target on your back, that's definitely true if you take the other approach and you try to figure out the entire big picture and do a waterfall release cycle of taking a whole year to release anything because you're trying to get the whole end-to-end process perfect, and then you wind up not delivering any value and your developer users and their leadership are wondering, "What is this team doing over there? They're not making our lives any easier. We're doing everything on our own." You may never get the chance to actually see that big vision through to the end for whatever reason.
Abi Noda: Defining this vision and beginning to move it toward approval and mobilization is, of course, a large effort. Who have been the key partners in this process? Or in other words, what has been your process for iterating and communicating this evolving vision with stakeholders and bringing people along for that journey as you move toward, hopefully, having it funded and approved?
Jared Wolinsky: Well, I am blessed here to be working with a lot of leadership that I know really well from past companies that we've worked at together. We have really strong alignment across the product and technology organization on the vision that we're building out here. I know that's one major hurdle that a lot of folks in my position have is getting leadership buy-in on investing in this stuff. That's not a problem I've necessarily had to deal with, and I consider myself fortunate. Some of the people that I partner with on a weekly, if not daily basis, are really my counterparts in the different development organizations, our VP of services engineering and our VP of client engineering, our operations leads. All the people that are either platform leaders or peers or partners of my team or represent our user base in that they lead product engineering teams.
Super important to keep them in the loop. It's a two-way street in terms of communication and feedback. Also, we would be neglectful in our job if we didn't take their feedback and their feature requests seriously. We all have the same sort of job function. They're the users. We're lucky enough that our users are within the same company. Not all people who are building out products of any sort have that advantage. We are constantly asking them where do they think we should be building using tools like DX to find out where people's pain points are and partnering with them to make that happen.
Abi Noda: As you work toward actualizing this vision, I know you've mentioned to me that you think of enablement and platform as two distinct investments or efforts or areas. Share more of your thoughts on that, if you could.
Jared Wolinsky: Yeah, I already mentioned one of my five teams is the enablement team. That should show you how important I think that side of the house is. I've been in organizations in the past where the team who's playing the role that my team has here indexes entirely on being a platform team and really makes it clear enablement is not part of what they offer. They'll offer product features and a platform and it's entirely self-service. There's advantages there, certainly for the team, and if they're getting the self-service part right, great. But it really, I think, hampers the ability to have that two-way conversation with your users and make the platform as successful as you can. Now, there's also the danger of over-indexing on enablement and finding your team spending 75 or 80% of their time answering questions and really holding hands of the users.
I think of it as an evolution depending on the maturity level and where you are as a company in adopting the platform. When we started, we were making some pretty transformational shifts in the company and the technology stack. We had acquired Pandora as a company at the end of 2019. It had been two or three years since a good portion of the engineering org had been part of SiriusXM, but they were still entirely on a separate technology stack on their side, which was internally hosted, private cloud, using a lot of HashiCorp technologies. A good portion of the engineers on the Pandora side didn't really have practical experience with AWS or maybe with any public cloud. Not only was there a learning curve for things like CDK and CloudFormation infrastructure as code, there was a learning curve for what does it look like to run on a public cloud?
And not to mention the folks coming from the SiriusXM side, some of them were working in AWS, some weren't. But one thing you could be sure of, everybody had a ramp-up period when they were building out and adopting our new platform. Enablement was key to that strategy to make sure that we moved along the development process and were able to unblock teams. Part of that strategy was a Slack-based support workflow that toes the line between not being overly onerous on the requester by having a really complex ticketing system in place while still having some structure. Not having just complete open-ended Slack conversations for support. But we made it clear from day one that enablement and support was not something we should be trying to rid ourselves of and look down upon as part of our job, but one of the highest value things we could be working on.
Now as time goes on, I hope that the organization matures. Education levels go up. We build better abstractions, provide better documentation, and we can ratchet down some of that support. But I can unequivocally say we would not have launched as successfully, at least from my team standpoint, if we had not been playing that sort of enablement role.
Abi Noda: Cool. Yeah, shifting gears a little bit. One of the things that's really impressed me about your organization is the process that you've put in place around prioritization of your platform roadmap, and it's also interesting to hear the way that you've iterated on your approach to arrive at what you do currently. I'd love to start with when you first came on the job. You've mentioned to me you didn't want to just make decisions based on your gut, so you put in a scoring system for prioritization. Share with listeners what that V1 method looked like.
Jared Wolinsky: Yeah, one of the first roles I opened, backing up to when I joined this organization, was for a product manager to lead product for platform engineering. When I eventually brought Eleanor on as our product lead, the first thing we partnered on together was this prioritization framework because to your point, we all start with gut feeling prioritization, but I really didn't want to be in that position to do it that way for very long. The whole reason for my org existing, or a big one, is to reduce cognitive load of other developers and having to gut feel prioritize things is a major cognitive load on me and my team. Putting a well-researched prioritization framework in place was a way to accomplish that. What we did was we started from the top down and said, "Well, what is our vision or our mission as an organization?"
I just touched on it, but the number one thing we identified is speed. Time to market, developer speed, whatever you want to call it. There are other things that are super important for us, cost optimization, security, runtime reliability, and quality. But most of those things have another team that that's their first priority. We've got the InfoSec team. We have SRE teams and operations on the reliability side. We have FinOps on the cost side. But the one thing again that we really own is time to market and developer speed. We focused on that. We started there and we started prioritizing the impact areas that we thought would make good criteria for projects that we were going to take on. We made a formula. We said, "Here are the five things we care about. It's speed, number one. Reliability, number two. Security, cost, and then our own efficiency as a platform engineering team because if we can improve the way that we work, we can provide more value in the other areas and trust or reputation."
Things that could help or hurt our standing with our user base, which in turn helps or hurts our ability to get these solutions adopted and help them. Then we just started tweaking it. We actually started with the gut feeling prioritized list of projects, and then we started running them all through the calculation that we put together. We weighted impact scores to each of those things. We said speed is worth 30% of the total prioritization score and reliability is 25%, and so on and so on to reflect our prioritized list of factors. Then we just started seeing where there were inconsistencies between where the formula ended up and where we feel it should be prioritized. That may sound a little counterintuitive because I'm talking about how I want to not use gut feeling to prioritize, but ultimately we want to come up with a formula that can replicate what we actually care about in terms of prioritization. We just kept tweaking and iterating on it.
Occasionally we would find that we were missing an impact factor altogether. One of them that we added later on is data oriented. Maybe something doesn't directly impact cost or reliability or speed, but it allows us to measure things which will in turn allow us to make more informed decisions, which will impact the other things. Anything related to our metrics program or anything that gets us more visibility into measurement and metrics falls into that category. It's a living, breathing thing and it evolves over time.
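The weighted scoring described above can be sketched as follows. This is a minimal illustration, not the team's actual formula: only the speed (30%) and reliability (25%) weights come from the conversation, while the remaining weights, the 0 to 3 impact scale, and the project names are assumptions made up for the example.

```python
from dataclasses import dataclass

# Weight of each impact factor in the total score. Only "speed" and
# "reliability" are stated in the conversation; all others are assumed.
WEIGHTS = {
    "speed": 0.30,
    "reliability": 0.25,
    "security": 0.15,       # assumed
    "cost": 0.10,           # assumed
    "efficiency": 0.10,     # assumed
    "trust": 0.05,          # assumed
    "data_oriented": 0.05,  # assumed; the factor added later
}


@dataclass
class Project:
    name: str
    impact: dict[str, int]  # factor -> score on an assumed 0..3 scale


def priority_score(project: Project, weights: dict[str, float] = WEIGHTS) -> float:
    """Weighted sum of the project's impact scores; missing factors count as 0."""
    return sum(w * project.impact.get(factor, 0) for factor, w in weights.items())


def build_roadmap(projects: list[Project]) -> list[Project]:
    """Sort projects descending by priority score."""
    return sorted(projects, key=priority_score, reverse=True)


def disagreements(gut_order: list[str], projects: list[Project]) -> list[str]:
    """Names whose position in the gut-feel list differs from the formula's."""
    formula_order = [p.name for p in build_roadmap(projects)]
    return [name for i, name in enumerate(gut_order) if formula_order[i] != name]


# Hypothetical projects, purely for illustration.
emulation = Project("local AWS emulation", {"speed": 2, "reliability": 1})
docs = Project("docs tooling", {"speed": 1, "trust": 1})
metrics = Project("metrics dashboard", {"data_oriented": 3, "efficiency": 1})

roadmap = build_roadmap([docs, metrics, emulation])
mismatches = disagreements(
    ["docs tooling", "local AWS emulation", "metrics dashboard"],
    [docs, metrics, emulation],
)
```

The `disagreements` helper mirrors the calibration loop described in the conversation: run the gut-feel list through the formula, look at where the two orderings differ, and adjust weights or add a missing factor until the formula reflects what the team actually cares about.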
Abi Noda: What were some of the challenges that came about with this approach?
Jared Wolinsky: One of the challenges is calibrating the impact of projects from different domains on the same impact factors. For an example, we have projects in the developer experience domain, which could relate to putting in place a LocalStack configuration or base pod, which allows our developers to more easily replicate what it's like to run software in our AWS environment, but locally. And that has some impact on developer speed. Meanwhile, putting better documentation tooling in place from the enablement team could also impact developer speed, but in a very different way. But ultimately, we need these things to come out in a prioritized list for the org, so we're assigning numbers regardless of what the domain is.
But it can be hard to say, apples to apples, it's a two for speed for that developer experience project, and it's a one for the documentation project. That's something we've just relied on practice and doing a lot of these and triaging and prioritizing to get the hang of. But the initial challenge outside of that sort of tactical one was just getting buy-in from the organization. People want to move forward on things that they think are important and they may not be happy with where that project ends up after running it through the prioritization formula. There's a constant fight to check each other and make sure that we're not biasing the formula to push up numbers on projects that we think are more important, but are, again, calibrating and having integrity across a number of different teams.
Abi Noda: More recently, you've defined a new way of thinking about project ideas where, as I understand it, everything flows down from these high level business goals you care about all the way down to specific projects. You've shown me the diagram of it. It'll be a little bit difficult for listeners to visualize, but we'll do our best attempt here. Try to explain what inspired this approach, what it looks like and how it works.
Jared Wolinsky: Yeah, it's funny that we're talking about the prioritization formula, while we're literally reworking the entire thing right now. But one of the key points that we were missing with our current prioritization framework that you pointed out was this clear connection to the goals of the higher levels of the organization. In our case, that's the goals and objectives of the technology org and the product org, and then SiriusXM as a whole. What we decided to sit down and do is make sure that we have a clear way to map all the way from the company level objectives to tech org, to our platform engineering level objectives and all the way down to really where the prioritization and the execution happens at the team level in our org. And to do that, instead of just having this one layer of trying to map a project to how it impacts those certain prioritization impact factors I mentioned earlier, we actually defined here are the business outcomes or the goals that we track to reflect our success as an organization.
And the highest level of those are things like, again, developer speed, ease of delivery, and those are two things that we actually use DX to measure on a quarter to quarter basis, and then we work down from there. We've got, I think, four top level goals. Again, speed, ease of delivery, cost and... Blanking on the fourth one. But they flow down into more and more granular levels that allow us to prioritize at a team level. Below that, we have platform capabilities which identify all of the different scopes of the five teams in the organization. It might be build and deployment or local development or sandbox development or observability, incident management, all the levers we have to pull in order to affect those higher level goals. Then when we prioritize projects, all we have to do is determine how much those things are impacting those capabilities and we can connect the dots all the way up to the goals and the prioritization happens much more easily than trying to tie things directly to those five impact factors.
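The roll-up described above, from goals to platform capabilities to projects, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the goal weights, the capability-to-goal strengths, the 0-3 scales, and the project names are all hypothetical and are not SiriusXM's actual numbers or formula. The fourth goal is left out because it is not named in the conversation.

```python
from dataclasses import dataclass

# Hypothetical weights for the top-level goals named in the conversation.
# The fourth goal is not stated, so it is omitted.
GOAL_WEIGHTS = {"speed": 3, "ease_of_delivery": 2, "cost": 1}

# Hypothetical mapping: how strongly each platform capability
# influences each goal, on an assumed 0-3 scale.
CAPABILITY_TO_GOALS = {
    "build_and_deployment": {"speed": 3, "ease_of_delivery": 2},
    "local_development": {"speed": 2, "ease_of_delivery": 3},
    "observability": {"ease_of_delivery": 1, "cost": 2},
}


@dataclass
class Project:
    name: str
    # capability name -> estimated impact on that capability (assumed 0-3 scale)
    capability_impact: dict[str, int]


def capability_value(capability: str) -> int:
    """Weighted contribution of one capability to the top-level goals."""
    strengths = CAPABILITY_TO_GOALS[capability]
    return sum(GOAL_WEIGHTS[goal] * s for goal, s in strengths.items())


def project_score(project: Project) -> int:
    """Connect a project to the goals through the capabilities it affects."""
    return sum(
        impact * capability_value(capability)
        for capability, impact in project.capability_impact.items()
    )


# Hypothetical projects, scored only against capabilities.
projects = [
    Project("docs refresh", {"local_development": 1}),
    Project("faster CI pipelines", {"build_and_deployment": 3}),
    Project("dashboard cleanup", {"observability": 2}),
]

for p in sorted(projects, key=project_score, reverse=True):
    print(f"{p.name}: {project_score(p)}")
```

The point of the extra layer is that a team only has to estimate a project's impact on capabilities it already owns; the link up to the goals is defined once and reused for every project.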
Abi Noda: With this mapping of goals and projects and business objectives, how does this framework really get utilized and put into action? Say you get a request or an idea from within your organization or from another leader within your company. How does that then map to this framework?
Jared Wolinsky: I can tell you how I would like it to work and then how it's actually working today. I want there to be a number of different intake processes and ideation processes. I don't want to be that platform team that just works on a backlog of requests from across the technology org without any product vision, but I also don't want to be the team that's only ideating from within and ignoring what's happening outside. Really there's two or three streams that I'd like to see. One is coming from leadership within my organization, either the product managers or the engineering leaders or myself. We have an idea. We have a biweekly roadmap review where we can get new projects into the intake process, and we're using a monday.com board to implement the prioritization formula. And then we go through as a team and we run it through the formula and it'll be something similar when we actually operationalize the new framework. Then when we're happy with it, we sort descending by priority and we have our roadmap.
The other place that I'd like to see ideation happen, and I want to see more of this, is bottom-up from the engineers within my org on each team. They're the ones who know best the challenges and the capabilities of their particular areas. I would love to lean on them more to get projects on the roadmap, and it's something that we haven't done a great job of so far. Then the third one is from outside the organization. We have requests from the security team. We have requests from the development teams themselves, and we have a feature request intake process for that. We triage those alongside things that are happening within the organization. But ultimately, unless we decide something is more of a one-off backlog task that's not big enough to be project level, everything goes through the same prioritization framework regardless of where it came from.
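The intake flow described in this answer can be sketched roughly as follows: ideas arrive from three streams, one-off tasks go to a backlog, and everything else is ranked by the same priority score regardless of source. The `Idea` type, the field names, the example titles, and the scores are hypothetical; the score is assumed to come from the prioritization formula, which is not reproduced here, and the monday.com board is not modeled.

```python
from dataclasses import dataclass


@dataclass
class Idea:
    title: str
    source: str             # "leadership", "engineers", or "external" (assumed labels)
    priority: float         # assumed to be precomputed by the prioritization formula
    one_off: bool = False   # too small to be a project; handled as a backlog task


def triage(ideas: list[Idea]) -> tuple[list[Idea], list[Idea]]:
    """Split ideas into a ranked roadmap and a backlog of one-off tasks."""
    backlog = [i for i in ideas if i.one_off]
    # Every project-sized idea is ranked the same way, whatever its source.
    roadmap = sorted(
        (i for i in ideas if not i.one_off),
        key=lambda i: i.priority,
        reverse=True,  # sort descending by priority
    )
    return roadmap, backlog


# Hypothetical intake from the three streams.
ideas = [
    Idea("sandbox environments", "engineers", 27.0),
    Idea("rotate a stale credential", "external", 5.0, one_off=True),
    Idea("build cache rollout", "leadership", 39.0),
    Idea("dependency scanning", "external", 31.0),
]

roadmap, backlog = triage(ideas)
for idea in roadmap:
    print(f"{idea.priority:>5.1f}  {idea.title}  ({idea.source})")
```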
Abi Noda: Really interesting to learn about this prioritization framework. I hope we're able to explore this further at some point, maybe through a webinar or panel discussion with other organizations and leaders as well. Jared, thanks so much for your time today and walking us through your developer experience journey and the work you're doing to enable developers at SiriusXM. Really enjoyed this conversation.
Jared Wolinsky: Likewise. Yeah, it's been fun talking about it, and I will take you up on that panel offer when it becomes real.
Abi Noda: Awesome.
Jared Wolinsky: Thanks, Abi.