Outlier events force rapid adaptations.

Events like 9/11 and COVID-19 force rapid adaptations, both by individual sites and businesses and by the internet as a whole. Today, new circumstances are forcing new, urgent needs for adaptation whose impact will endure far beyond this crisis.

The recent tsunami of news and disruption brought on by the Coronavirus has led me to reflect on how much the internet has changed since the 9/11 terrorist attacks, the last news event of this scale and the only other event of this scale since the commercial internet blossomed in the mid-1990s.

Figure 6.7-1 from the Apollo 13 Mission Report.

Both events dominated public attention and news reporting for weeks or months and had nearly unprecedented impact on people’s daily lives. But the two events evolved differently and were covered by two vastly different versions of the web.

Confronting the unprecedented

The September 11, 2001 terrorist attacks unfolded over a couple of hours in the morning, between the crash of the first plane into the World Trade Center and the collapse of the two towers. At CNN.com, where I was working at the time, the top story going into the morning was Michael Jordan’s second return from retirement, and traffic to the site was at normal load. But after 9am, as word of the story started getting out, traffic surged at an unprecedented pace, doubling every seven minutes. The wall of traffic that came to the site that day built far more rapidly than at any point in the Coronavirus story, which has been more of a steadily rising tide than a sudden surge.

That wall of traffic, which we at CNN.com estimated at 20X average daily peak demand, exposed a weakness that all news sites (indeed all web sites) had at the time: limited capacity, dictated by the cost of bandwidth and servers, which were at least an order of magnitude more expensive in 2001 than today. CNN.com was built to sustain 2X daily peak demand with no changes and had reserve capacity in other Turner sites that could be repositioned behind CNN.com to provide another 2X in 30-45 minutes. CNN.com also had pre-designed templates in place to reduce the size of its home page and double capacity yet again, bringing total capacity to 8X daily peak. Beyond that, the home page could be reduced in size further, but this was a manual exercise that took longer to execute. Clearly, with traffic doubling every seven minutes, the internet arrived at the doorstep faster than we could scale. Although CNN was able to return to service before the other national sites recovered, it was only able to do so in highly reduced form. The images below show how the site’s home page evolved that day and the next as we dealt with the load. It is worth noting that traffic for most of the day on 9/12 was 90% of the peak on 9/11.
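
The gap between the doubling rate and the staged capacity plan can be made concrete with a little arithmetic. The sketch below is not from CNN's tooling. It assumes demand starts at 1X daily peak and grows as a clean exponential, doubling every seven minutes, and asks how long each capacity tier would last. The function name and the idealized growth model are assumptions made for illustration.

```python
import math

DOUBLING_MINUTES = 7.0  # doubling time reported for 9/11 traffic


def minutes_until_exceeded(capacity_multiple, doubling_minutes=DOUBLING_MINUTES):
    """Minutes for demand starting at 1X to reach capacity_multiple.

    Assumes idealized exponential growth: demand(t) = 2 ** (t / doubling_minutes).
    """
    return doubling_minutes * math.log2(capacity_multiple)


# Capacity tiers from the text, as multiples of average daily peak.
tiers = {
    "built-in headroom": 2,
    "plus repositioned Turner capacity": 4,
    "plus light home page template": 8,
    "estimated 9/11 demand": 20,
}

for label, multiple in tiers.items():
    print(f"{multiple:>2}X ({label}): {minutes_until_exceeded(multiple):.1f} min")
```

Under these assumptions the 4X tier is exceeded after about 14 minutes, well before the 30-45 minutes needed to reposition the reserve capacity, and demand reaches 20X in roughly half an hour.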

9/11 Traffic: Off the Chart

Capacity plan as of 9/11 versus actual arrival rate.

Lessons learned from 9/11

There are several lessons that CNN and the internet news industry learned that day that, along with secular trends in computing and web infrastructure, have helped allow the web to provide a much more robust experience during the current crisis.

First, we learned the true peak: somewhere on the order of 20X. And we learned the peak arrival rate: doubling every seven minutes. We had previously estimated the true peak to be 10X based on presidential election traffic, and we had previously estimated the peak doubling time to be 30 minutes based on our experience with commercial airline crashes and other major breaking news stories. Our experience on 9/11 fundamentally changed our calculations of how to survive outlier news events. No longer could we adjust on the fly; instead, we needed the capability in place all the time. And if we did not want to reduce the content load, that meant 20X capacity. This represented a significant increase in the cost structure, but here secular trends worked in our favor over time to eventually eliminate hosting costs as a constraint. The development of Linux as a host operating system running on low-cost Intel-based hardware reduced server costs 5-10X. And the development of CDNs, and later AWS and other cloud platforms, reduced bandwidth and hosting costs further, so that by the time of the Iraq war two years later it was feasible to have 20X capacity in place at all times.
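
To see why these revised numbers ruled out adjusting on the fly, it helps to compare how much reaction time each set of planning assumptions implies. The sketch below is a back-of-the-envelope illustration, not the calculation used at the time. It assumes demand starts at 1X and doubles at a steady rate until it reaches the peak, and the function name is hypothetical.

```python
import math


def minutes_to_peak(peak_multiple, doubling_minutes):
    """Time for demand to climb from 1X to peak_multiple under steady doubling."""
    return doubling_minutes * math.log2(peak_multiple)


# (peak multiple, doubling time in minutes), taken from the text.
pre_911_estimate = (10, 30)  # election traffic peak, airline-crash arrival rate
observed_on_911 = (20, 7)

before = minutes_to_peak(*pre_911_estimate)
after = minutes_to_peak(*observed_on_911)

print(f"Pre-9/11 planning estimate: {before:.0f} min to reach 10X")
print(f"Observed on 9/11:           {after:.0f} min to reach 20X")
```

Under the old assumptions there would have been roughly an hour and forty minutes to react before demand peaked. Under the observed ones there was about half an hour to absorb twice the load, which is less time than repositioning reserve capacity alone required.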

Evolution of CNN.com on 9/11

CNN.com on 9/10/2001. The standard home page at the time.
CNN.com at 9:30a. The webmaster had already switched to the light home page template, with nothing below the fold, which doubled capacity on the site. Note that the story was so fresh that the reporting in the right column (so-called T2 space) was on unrelated stories such as Michael Jordan’s second comeback.
CNN.com at 10a. Nothing but the bare essentials: logo, image, and some text HTML. At this point, the site failed for one hour before we were able to reposition capacity behind the site and coordinate a full restart, which required us to take down all our internet connections to allow traffic to flow evenly into our server pool.
CNN.com on September 12. The home page converted over to Special Report.

We also learned that the internet has a pretty steady level of traffic no matter what is happening. On 9/11 a number of us happened to be at AOL for a meeting of AOL Time Warner companies, and we were able to see what was happening to AOL’s network, which, like CNN.com, was a good proxy for the general internet at the time. What we saw was that while CNN’s traffic was surging, overall traffic on AOL increased only about 20%. But that traffic was concentrated on a very small section of the web, with CNN, MSNBC, and Yahoo at the center of it. What this indicated was that as far back as 2001, the core internet backbone had plenty of capacity for whatever the world could throw at it.

Lessons for the Modern Web

What strikes me most about following the news coverage of Coronavirus on the web now is how rich and reliable an experience it is across a wide range of sites. No news sites are collapsing under load (although the same cannot be said for government sites, unfortunately); sites are loaded with high quality images and graphics, and every site can carry multiple live streams for free at near TV quality resolution, all under peak loads. The depth of reporting and analysis is substantially more robust than anything we could have produced in 2001, and the best news sites on the web—nytimes.com and washingtonpost.com and to a lesser extent nbcnews.com—are rising to the occasion.

Traffic Impact of COVID-19

Traffic to select health and science sites with COVID-19 information.
Source: SEMRush

There are deeper lessons here for sites outside the news sector. First, there is no longer any excuse for collapsing under load, since having capacity in place is a trivial expense in the overall scheme of things, even if you aren’t CNN.com. This might seem obvious, but sites still crash under true peak loads, a classic case of penny-wise, pound-foolish thinking. If your site is not there when peak demand arrives, you have truly blown it; you won’t get another chance. Consider Zoom, which has been able to absorb extraordinary unplanned growth; had they stumbled, they might have permanently lost the new users who flocked to their service.

Second, organizations need to put as much thought into the user experience around their COVID-19 communications as they put into their pre-crisis web sites and communications. This is especially true for institutions such as hospitals and universities, which are attempting to sustain something as close as possible to normal operations. While the informational needs of their patients or students are probably as intense and varied as ever, many institutions, in their rush to deal with the onslaught of ever-changing policies and news, simply publish a single-page guide to all COVID-19 information. These pages often take the form of a blog-style, reverse-chronological list of posts without any overarching structure to help users navigate to the specific information they need. This is a real disservice to those users. The ease with which a user can find the information they need from you in this crisis not only conveys a message about your organization and its ability to handle a crisis and adapt in times of stress, it also provides real value to your users who, like you, are struggling to navigate a difficult situation under extreme stress.

Outlier events force rapid adaptations, both by individual sites and businesses and by the internet as a whole. In the early days of the internet, it was flash crowds, of which 9/11 was the most extreme example, that drove innovation in scaling. And from the beginning of the internet, going back to the Morris worm in 1988 and continuing today, large-scale security breaches and DoS attacks have forced rapid innovation and adaptation on businesses. Today, new circumstances are creating new, urgent needs for adaptation, such as online learning, large-scale remote working, and changes in how we conduct our elections. I am sure that the internet community will rise to these challenges, but there will be plenty of white-knuckle moments. Just remember to put yourself in your customers’ shoes when rushing to get out that COVID information page on your web site.