Moving off Heroku, slowly
I was a huge fan of Heroku about ten years ago.
It was one of the reasons Podia was able to iterate so quickly and avoid spending too much time thinking about infrastructure. And, in the places that Heroku couldn’t support our use case, like providing custom domains for our users, we had to expend considerable effort and stress managing servers. It sucked and Heroku proved its worth.
For many years after the acquisition by Salesforce, Heroku was an outlier: an acquired company that kept iterating and didn’t get shut down or enshittified by the parent company. It was a miracle! I can’t remember the exact date but that all seemed to change once Heroku started requiring a Salesforce login and everything was stamped with the Salesforce branding (and the sales team became much, much worse). From that point onwards, Heroku seemed to calcify and my enthusiasm transitioned to “it’s fine”.
Our escape plan for Podia since Day 1 has always been “when our Heroku bill hits ~$20k/mo, that would fund two engineers to manage our infrastructure instead”. We’ve not hit that level in 9 years and so, even with the stagnated platform, it made sense to stay.
At that point we were making full use of the platform: our tests ran on Heroku CI, we used review apps to share our work and get feedback on the functionality, we deployed to a staging app, and in production we used Heroku Postgres and Redis databases, Heroku scheduler, and various services through their add-on marketplace. We even used Heroku for hosting our Metabase instance.
Then the month-long CI outage happened in 2021 and we moved our builds over to GitHub Actions. They never came back.
Then we negotiated rates with various add-on providers by going directly to them. Heroku takes a 30% cut of all add-on revenue so it’s a win-win situation to go directly to a provider and say “hey, we’re paying you $100 on Heroku, how about we pay you $85 directly?”. You save money; they make more money. That cut is also lazy revenue for Heroku and right now we need to be encouraging them to really strive and earn their money.
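To make the arithmetic behind that pitch concrete, here is a minimal sketch using the figures above ($100 via the marketplace, $85 direct, a 30% cut). The function name and dictionary keys are made up for illustration, and the sketch assumes the cut is a flat percentage of the list price.

```python
def direct_billing_split(marketplace_price, direct_price, marketplace_cut_pct=30):
    """Compare paying for an add-on via a marketplace with paying the provider directly.

    Assumes the marketplace keeps a flat percentage of the list price.
    """
    # What the provider actually receives when billed through the marketplace.
    provider_via_marketplace = marketplace_price * (100 - marketplace_cut_pct) / 100
    return {
        "customer_saves": marketplace_price - direct_price,
        "provider_gains": direct_price - provider_via_marketplace,
        "marketplace_loses": marketplace_price * marketplace_cut_pct / 100,
    }


split = direct_billing_split(marketplace_price=100, direct_price=85)
for party, amount in split.items():
    print(f"{party}: ${amount:.2f} per month")
```

With these numbers the customer and the provider each come out ahead by the same amount, which is why the conversation tends to be an easy one.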
We moved to a hosted Metabase instance that meant one less thing for us to manage (and didn’t suffer from the 30sec Heroku timeout on queries).
Around 2023, I started hearing a lot about CrunchyData and a few of my friends started moving their databases over there. That kicked off our big database migration: Postgres databases went to CrunchyData and Redis databases went to redis.com. This was easily the most impactful part of our slow move away from Heroku because, with a Postgres database and two Redis databases in production, we were suffering forced maintenance periods roughly every 2 weeks. Yes, every second Friday there was mild panic as all the errors and alarms went off when Heroku took the database offline and did whatever they needed to do, then restarted the app. This whole process took about 10 minutes but it was the stress which really got to me: I had to inform our support team, update our status page, coordinate a developer to watch the process and acknowledge the alerts, etc. TWICE A MONTH! It was a ridiculous situation that our database provider was the main cause of our downtime! If anyone is going to be the cause of downtime, it should be us, not our providers.
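For a sense of scale, here is a rough back-of-the-envelope sketch of what those windows add up to over a year. It assumes exactly 26 windows of 10 minutes each (the "every 2 weeks" and "about 10 minutes" above taken literally) and ignores every other source of downtime, so treat the result as a ceiling on availability rather than a measured figure.

```python
MINUTES_PER_YEAR = 365 * 24 * 60


def yearly_maintenance_downtime(windows_per_year=26, minutes_per_window=10):
    """Return (downtime in minutes, best-case availability) for scheduled windows only."""
    downtime = windows_per_year * minutes_per_window
    availability = 1 - downtime / MINUTES_PER_YEAR
    return downtime, availability


downtime, availability = yearly_maintenance_downtime()
print(f"{downtime} minutes/year, best-case availability {availability:.2%}")
```

That is 260 minutes a year, a little over four hours, spent on maintenance windows alone.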
That’s our situation now: we’ve moved our CI, add-ons, and databases off Heroku, so the only things left running over there are the dynos powering the web and background processes.
That’s a great place to be! We don’t need to go any further if it doesn’t make any sense right now; but we’re primed and ready to jump to a different infrastructure when the time is right.
And I’ve done the maths recently. We could definitely save money by moving to dedicated servers, even with over-provisioning to the max-autoscaling capacity we have on Heroku (which is more than double our typical usage). At the moment, we have a good discount on an enterprise contract with Heroku and that means the savings from moving our servers to another provider just don’t justify the risk and opportunity cost.
For the moment.
If those costs rise, it would be easy to justify a business case for moving away and a relatively straightforward technical migration.
So if you are still fully-in on Heroku, here are my recommendations:
- Move all your add-ons to direct billing. You’ll save money, they’ll make more money, and Heroku stops earning money for doing fuck-all. No brainer.
- Extract your databases to a hosted database provider. I thoroughly recommend CrunchyData for Postgres databases because they’re just such a no-drama provider. I think I’ve talked to them more often at conferences than I’ve needed to for support, which is exactly what I’m looking for.
- Your next step would be to move CI to another provider. GitHub Actions is fine and well-supported by the community but there are also specialised providers like Buildkite or CircleCI.
- The last step: moving your compute off Heroku to another PaaS, an orchestration layer on top of AWS, or dedicated servers deployed with Kamal. There are a gazillion options here, many many more than there were a decade ago. Just remember to consider the total cost and all the trade-offs involved. You get paid the big bucks to think about the entire system and not just to blindly chase down costs to the lowest possible level.
I think my original “when we hit $20k/mo” rule has served us well but today, with all the options available and the ever-widening gap between hardware advances and Heroku’s choices, I think a better rule is: once your Heroku bill hits $10k/mo (after negotiations), and you can save more than 50% by switching, then it’s definitely time to move.
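Written down as a check, the rule looks something like the sketch below. The function and parameter names are hypothetical, and the alternative cost is whatever you estimate for the replacement setup, which is the hard part and should include the trade-offs mentioned above, not just the hosting invoice.

```python
def time_to_move(monthly_bill, alternative_monthly_cost,
                 bill_threshold=10_000, min_savings=0.5):
    """Apply the rule of thumb: bill at or above the threshold AND savings above 50%.

    monthly_bill is the negotiated Heroku bill; alternative_monthly_cost is
    your own estimate of the total cost after switching.
    """
    if monthly_bill < bill_threshold:
        return False
    savings = (monthly_bill - alternative_monthly_cost) / monthly_bill
    return savings > min_savings


print(time_to_move(12_000, 5_000))  # bill over threshold, saves ~58%
print(time_to_move(12_000, 7_000))  # bill over threshold, saves ~42%
print(time_to_move(8_000, 2_000))   # big saving, but bill under threshold
```

Only the first case satisfies both conditions; the other two fail one each.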
And if you’re wondering: but what about the review apps? How can I replace them? Well, I have a story to tell there too (soon).