The day Gmail went down…and what we can learn

If you’re a Gmail user like a large number of people worldwide, you may have noticed this screen when you tried to check your email today.

[Screenshot: Google 500 error page]

Unfortunately, a 500 error code is a server’s generic way of telling us that something went wrong on its end. It doesn’t tell us more than that, and so far Google hasn’t released information about what caused the outage. We can, however, learn some important lessons from it.

There’s no such thing as “too big to crash” on the Internet

Google is one of the largest companies in the world and operates some of the most-used websites anywhere. They have an entire team, the Google Site Reliability Engineering team, devoted to monitoring their service health and keeping outages like this from happening. Ironically, the team was engaged in a Q&A on Reddit.com when the outage took place. I guess they do more than anyone realized to keep the Google services running smoothly.

Using multiple servers doesn’t prevent crashes

I’ve always been impressed by Google’s server infrastructure. They build their own servers that are about as bare bones as can be and then house them in shipping containers. If you’re a tech person like us here at Oso Studio, that’s amazing! Each shipping container holds almost 1,200 servers (possibly even more now, since many parts have been miniaturized even further). Even with that much redundant hardware, today’s outage shows that having more servers doesn’t guarantee uptime on its own.

Sometimes a small issue can cause a big problem

While Google hasn’t said what caused the outage, other companies have had high-profile outages caused by something as simple as a network switch going out. Today’s server environments are incredibly complex so that they can serve a rapidly growing amount of traffic and data, and more pieces in the puzzle mean more components that can fail.

So what can you, as a business owner, learn from today’s Google outage?

If a multi-billion-dollar company can have an outage, what can you, as the owner of a small or medium-sized business (SMB), do to prevent one on your website? Fortunately, most business websites are very different from Google’s. Instead of constantly updating content the way an email application does, they serve the same information to all users.

Content Delivery Network

A Content Delivery Network (CDN) is a good first step in speeding up your website and keeping it online even if your primary server fails. A CDN is a network of servers located around the world that store a frequently refreshed copy of your website. Because visitors are served by a server that is close to them, they notice a faster loading time. Plus, when a CDN that is set up for it notices that your server is reporting an error, it can fall back to serving the last working copy of your site, so your visitors may never know your site is down. There are several CDN services available to meet the needs of your website, including CloudFlare and Amazon CloudFront. If you would like to talk about getting your website set up on a CDN, get in touch with us, and we’ll work with you to identify the best solution for your needs.
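For the tech-minded, the fallback idea can be sketched in a few lines of Python. This is only a toy illustration, not how CloudFlare or any other CDN is actually built: the EdgeCache class and the fetch function passed into it are hypothetical names made up for this example. The sketch keeps the last good copy of each page and serves it whenever the origin server answers with a 5xx error.

```python
from typing import Callable, Dict, Tuple

# Hypothetical type: a function that asks the origin server for a page
# and returns (status_code, body).
Fetcher = Callable[[str], Tuple[int, str]]


class EdgeCache:
    """Toy edge server that remembers the last good copy of each page."""

    def __init__(self, fetch_from_origin: Fetcher) -> None:
        self._fetch = fetch_from_origin
        self._last_good: Dict[str, str] = {}

    def get(self, path: str) -> Tuple[int, str]:
        try:
            status, body = self._fetch(path)
        except OSError:
            # Treat a network failure the same way as a server error.
            status, body = 502, ""

        if status < 500:
            if status == 200:
                # Remember the most recent working copy of this page.
                self._last_good[path] = body
            return status, body

        # The origin is failing: serve the saved copy if we have one.
        if path in self._last_good:
            return 200, self._last_good[path]
        return status, body
```

A page that was fetched successfully at least once keeps loading while the origin is down. A page that was never cached still shows the error, which is one reason a CDN helps most on sites that serve the same content to everyone.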

Burst Scalability

While a CDN helps absorb traffic spikes as well, it is still good practice to have a hosting environment that gives you significant headroom for additional traffic. Depending on your typical traffic level, a scalable cloud solution, a virtual private server, or a dedicated server may be appropriate for your needs. One of our political action clients was involved in a major rally a few years ago in Washington, DC. Because of our inbound marketing efforts, they held the number one spot and several other page-one spots on Google for searches about their cause. Almost overnight, their website traffic spiked by over 1,000% because their rally was picked up by national TV news outlets and print publications. Because we had designed a hosting package for them with huge spikes in mind, their website was able to stay online and serve over 10,000 visitors per hour for several weeks.
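To put those numbers in perspective, here is a rough back-of-the-envelope calculation in Python. The function names and the pages_per_visit parameter are our own assumptions for illustration; real sizing also depends on page weight, caching, and how unevenly visitors arrive.

```python
def requests_per_second(visitors_per_hour: float, pages_per_visit: float = 1.0) -> float:
    """Average page requests per second for a given hourly visitor count.

    pages_per_visit is an assumed figure, not something measured.
    """
    return visitors_per_hour * pages_per_visit / 3600


def multiplier_from_percent_increase(percent: float) -> float:
    """Convert a percentage increase into a multiple of normal traffic."""
    return 1 + percent / 100


if __name__ == "__main__":
    # 10,000 visitors per hour averages out to a little under 3 per second.
    print(round(requests_per_second(10_000), 2))
    # A 1,000% increase means 11 times the usual traffic.
    print(multiplier_from_percent_increase(1000))
```

The averages look small, but visitors do not arrive evenly. A mention on national TV can send most of an hour’s traffic in a few minutes, which is why the hosting package needs headroom well above the average.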

Don’t rely on the web host that advertises $5/month unlimited hosting

Servers are expensive, bandwidth is expensive, and hosting facilities are more expensive than you could imagine. Web hosts that offer “unlimited” hosting for incredibly cheap prices are betting that you will never put their claim of unlimited to the test. The only way they can make money at such a low price is by cramming hundreds, if not thousands, of websites onto a single server. This can have several unfortunate side effects, the most common of which are slow performance and unplanned outages when any one site on the server gets a traffic spike.

So if you’re serious about keeping your site online 24/7 regardless of traffic spikes or server crashes, get in touch with us. We can work with you to identify your needs and design a hosting package that gives you confidence that your site is working for you and not showing users that dreaded 500 server error.