#33 Be Careful: Exploding Costs That Might Kill Your Product
Infrastructure costs can increase unexpectedly and harm your project's success. Here are several common issues that repeatedly cause budget problems in software projects.
All architectural decisions involve trade-offs. This is widely understood. However, there is one aspect that is often overlooked: the financial implications.
Several factors can lead to unexpected costs. These include inadequate preparation for denial of service attacks (DoS & DDoS), excessive logging without considering the need for it, massive use of expensive storage tiers, and implementing auto-scaling without limits.
I cover all of these topics at conferences in my talk "What Every Software Architect Should Know About Infrastructure". It is time to describe them here as well.
Being prepared for DoS & DDoS
This year, around May or June (I can't remember exactly), I saw a thread on Reddit where someone described their problematic situation. In short, they received a $104k bill for hosting their static website.
You might ask, "What? Over $100k for a static site? That is insane." Yes, it is. The reason for such an amount was a denial of service attack. In the starter plan, the hosting provider charges $55 per 100 GB of bandwidth above the first 100 GB, which is free.
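To get a feeling for the scale, here is a rough back-of-the-envelope sketch of that pricing model. The function names are mine, and I assume that every started 100 GB block is billed in full, which the provider may or may not do.

```python
import math


def bandwidth_overage_cost(total_gb: float,
                           free_gb: float = 100.0,
                           block_gb: float = 100.0,
                           price_per_block: float = 55.0) -> float:
    """Cost of traffic above the free allowance.

    Assumption: every started block is billed in full.
    """
    overage_gb = max(0.0, total_gb - free_gb)
    blocks = math.ceil(overage_gb / block_gb)
    return blocks * price_per_block


def traffic_for_bill(bill: float,
                     free_gb: float = 100.0,
                     block_gb: float = 100.0,
                     price_per_block: float = 55.0) -> float:
    """Approximate traffic in GB that would produce a given bill."""
    return free_gb + bill / price_per_block * block_gb


if __name__ == "__main__":
    print(bandwidth_overage_cost(300))   # 200 GB over the allowance
    print(traffic_for_bill(104_000))     # traffic behind a $104k bill, in GB
```

Under these assumptions, a $104k bill corresponds to roughly 189 TB of traffic, which is far beyond what a static site serves in normal operation.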
There were different opinions about this situation. Some people said that it was the author's fault and that they should have read and analyzed the pricing information first, which explicitly stated that the provider charges $55 per 100 GB. Others said that the provider should have built-in mechanisms to defend against DDoS attacks in such cases. I think both sides are right to a certain extent, but this is not the time to assign blame. Instead, let's look at what can be done to avoid such a situation.
Denial of service attacks usually occur in one of three layers: 7 (application), 4 (transport), and 3 (network).
The first step you can take is to minimize potential points of attack. The math is simple - the fewer points, the better.
Next, you can use out-of-the-box solutions from the cloud providers. In Azure, this is Azure Front Door, and in AWS, it is AWS Shield. Another option is to use a third party like Cloudflare. I really like Cloudflare because it integrates seamlessly with the aforementioned clouds and it is pretty cheap.
Mistakes that cost a lot
Another example of exploding costs was the Cara app. Here is the story (according to Gergely Orosz):
In short, the Cara app used serverless on Vercel and it worked perfectly. The problem was that at some point the app went viral and incurred extreme costs ($98k to run it for a few days).
I think this is a good lesson for anyone building a public-facing application that has the potential to go viral: be ready. When I say be ready, I mean at least one of the following.
Have enough money ready
Think of this as having a safety net:
Secure investment funding (venture capital, angel investors, or personal capital)
Be prepared for significant infrastructure costs
Scale infrastructure dynamically based on demand
Advantage: Guaranteed service availability for all users
Challenge: Requires money (sometimes a lot of money)
Controlled scaling with limits
With this approach, you need to implement scaling boundaries:
Set clear infrastructure limits (e.g., 2-3 server instances)
Accept that some users may experience service unavailability during peak times
Communicate limitations transparently to users
Advantage: Predictable costs and manageable infrastructure
Challenge: Potential user dissatisfaction during high-demand periods
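The idea of scaling boundaries can be sketched in a few lines. This is a simplified illustration, not how any particular cloud autoscaler works; the function names, the capacity figure, and the hourly price are hypothetical.

```python
import math


def desired_instances(requests_per_second: float,
                      capacity_per_instance: float,
                      min_instances: int = 1,
                      max_instances: int = 3) -> int:
    """Scale with demand, but never beyond the configured limit."""
    needed = math.ceil(requests_per_second / capacity_per_instance)
    return max(min_instances, min(needed, max_instances))


def worst_case_monthly_cost(max_instances: int,
                            hourly_price: float,
                            hours_per_month: int = 730) -> float:
    """Upper bound on the compute bill if the cap is hit all month."""
    return max_instances * hourly_price * hours_per_month


if __name__ == "__main__":
    # Hypothetical numbers: one instance handles 100 requests per second.
    for load in (50, 250, 10_000):
        print(load, desired_instances(load, capacity_per_instance=100))
    print(worst_case_monthly_cost(max_instances=3, hourly_price=0.10))
```

The point of the cap is that the worst-case bill becomes a number you can calculate in advance, at the price of turning some users away when the load exceeds what the maximum number of instances can serve.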
Cost-optimized infrastructure from day one
Here, you have to focus on optimization from the start of application development:
Optimize application performance and resource usage
Aim for significantly reduced operational costs (e.g., 10% of unoptimized costs)
Implement efficient caching strategies
Choose cost-effective service providers and technologies
Advantage: Balance between cost and performance
Challenge: Requires a lot of initial optimization efforts (but you will benefit a lot from it)
The hidden costs of storage
One of the most common problems I see when it comes to high infrastructure costs is data storage. I can't count the number of times I have seen files stored in the hot tier of the cloud while being accessed extremely infrequently.
At first glance, when a cloud provider charges $0.023 per GB per month for a hot tier, it might seem negligible. For an application storing about one gigabyte, the monthly cost is indeed minimal: around two cents. However, the real challenge emerges when your application's storage needs grow by several orders of magnitude.
Consider building a social media application. Let's break down this scenario:
User base: 10,000,000 users
Storage allowance: 100 images per user
Maximum image size: 5 MB
Doing the math:
10,000,000 users × 100 images × 5 MB = 5,000,000 GB (5 PB)
5,000,000 GB × $0.023 per GB = $115k per month
Shocking, isn't it? This shows why we can't be misled by looking at small-scale costs. What seems like pennies can quickly escalate to six figures when operating at scale.
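The same calculation as a small sketch, extended with a second tier to show why the tier choice matters. The hot price is the one used above; the cool-tier price and the share of frequently accessed data are made-up numbers for illustration, and I ignore retrieval and transaction fees, which real cooler tiers usually charge.

```python
HOT_PRICE_PER_GB = 0.023    # per GB per month, from the example above
COOL_PRICE_PER_GB = 0.010   # hypothetical price for a cooler tier


def total_storage_gb(users: int, images_per_user: int, image_mb: float) -> float:
    """Worst-case storage in GB, using decimal units (1 GB = 1000 MB)."""
    return users * images_per_user * image_mb / 1000


def monthly_cost(total_gb: float,
                 hot_fraction: float = 1.0,
                 hot_price: float = HOT_PRICE_PER_GB,
                 cool_price: float = COOL_PRICE_PER_GB) -> float:
    """Monthly bill when only `hot_fraction` of the data stays in the hot tier."""
    hot_gb = total_gb * hot_fraction
    cool_gb = total_gb - hot_gb
    return hot_gb * hot_price + cool_gb * cool_price


if __name__ == "__main__":
    gb = total_storage_gb(users=10_000_000, images_per_user=100, image_mb=5)
    print(monthly_cost(gb))                    # everything in the hot tier
    print(monthly_cost(gb, hot_fraction=0.1))  # assumed: 10% is accessed often
```

With everything in the hot tier, this reproduces the $115k figure. How much a cooler tier saves depends entirely on the real prices and access patterns, so treat the second number as an illustration only.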
High costs generated by logs
As a software developer, I know firsthand that we developers have a habit: we love to log everything. And when I say everything, I really mean everything!
Let me share an example that shows just how expensive this can get. In one project I was involved in, the forecasted logging cost was about $10k per month. But guess what? The real bill ended up being more than $100k per month! That's ten times more than was planned for.
What was the problem? The team was logging tons of information that nobody ever looked at or used.
The solution was simple but effective: the team cut down on unnecessary logs and, in the end, switched to a different logging platform. This meant keeping only the logs that actually helped monitor and fix issues in the system. Sometimes less really is more, especially when it comes to your cloud bill.
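One way to cut log volume without losing the important entries is to keep every warning and error but only a sample of the lower-severity records. Below is a minimal sketch using Python's standard logging module. It is not what the team in the story did; the class name and the sampling rate are my own choices.

```python
import logging


class SamplingFilter(logging.Filter):
    """Keep all WARNING+ records, but only every Nth lower-severity record."""

    def __init__(self, keep_every: int = 100):
        super().__init__()
        self.keep_every = keep_every
        self._seen = 0  # number of low-severity records seen so far

    def filter(self, record: logging.LogRecord) -> bool:
        if record.levelno >= logging.WARNING:
            return True
        self._seen += 1
        return self._seen % self.keep_every == 0


if __name__ == "__main__":
    handler = logging.StreamHandler()
    handler.addFilter(SamplingFilter(keep_every=100))

    logger = logging.getLogger("app")
    logger.setLevel(logging.INFO)  # DEBUG records are dropped entirely
    logger.addHandler(handler)

    for i in range(1000):
        logger.info("processed item %d", i)  # only every 100th is emitted
    logger.error("payment failed")           # always emitted
```

Sampling is a trade-off: you lose the ability to trace every single request, so it fits high-volume informational logs better than audit trails.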
TL;DR
Unexpected infrastructure costs can explode due to the following factors:
DDoS attacks. A static website got a $104k bill after an attack ran up uncapped bandwidth charges.
Viral success without preparation. The Cara team faced a $98k bill for running a viral application for several days.
Hot storage misuse. What seems cheap ($0.023/GB) can scale to $115k/month for a social media app with 10M users.
Excessive logging. One project's logging costs came in at more than $100k/month against a $10k forecast due to logging unnecessary data.
Solutions include:
Using DDoS protection (Cloudflare, AWS Shield, Azure Front Door)
Having funding ready or implementing controlled scaling
Optimizing infrastructure from day one
Using appropriate storage tiers
Logging only necessary information
What are your thoughts on exploding costs? Have you ever faced such a situation in your project/product?
One other thing that comes to my mind is simply setting a cloud budget, so that you won't get any surprises at the end of the billing period.
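The major cloud providers offer budget alerts out of the box, so you rarely need to build this yourself. Still, the logic behind such an alert is simple enough to sketch. The function names, the linear projection, and the 80% warning threshold below are my own assumptions.

```python
def projected_month_end_spend(spend_so_far: float,
                              day_of_month: int,
                              days_in_month: int = 30) -> float:
    """Naive linear projection of the bill at the end of the month."""
    return spend_so_far / day_of_month * days_in_month


def budget_status(spend_so_far: float,
                  day_of_month: int,
                  monthly_budget: float,
                  days_in_month: int = 30,
                  warn_ratio: float = 0.8) -> str:
    """Classify the current spend against the monthly budget."""
    projected = projected_month_end_spend(spend_so_far, day_of_month, days_in_month)
    if spend_so_far >= monthly_budget:
        return "over_budget"
    if projected >= monthly_budget:
        return "on_track_to_exceed"
    if projected >= warn_ratio * monthly_budget:
        return "warning"
    return "ok"


if __name__ == "__main__":
    # Hypothetical: $400 spent by day 10 against a $1,000 monthly budget.
    print(budget_status(spend_so_far=400, day_of_month=10, monthly_budget=1000))
```

A linear projection will not catch a sudden spike such as an attack, so it makes sense to combine a budget with alerts on daily spend as well.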