Emergency 911 systems were down for more than an hour on Monday in towns and cities across 14 U.S. states. The outages led many news outlets to speculate the problem was related to Microsoft‘s Azure web services platform, which also was struggling with a widespread outage at the time. However, multiple sources tell KrebsOnSecurity the 911 issues stemmed from some kind of technical snafu involving Intrado and Lumen, two companies that together handle 911 calls for a broad swath of the United States.
On the afternoon of Monday, Sept. 28, several states including Arizona, California, Colorado, Delaware, Florida, Illinois, Indiana, Minnesota, Nevada, North Carolina, North Dakota, Ohio, Pennsylvania and Washington reported 911 outages in various cities and localities.
Multiple news reports suggested the outages might have been related to an ongoing service disruption at Microsoft. But a spokesperson for the software giant told KrebsOnSecurity, “we’ve seen no indication that the multi-state 911 outage was a result of yesterday’s Azure service disruption.”
Inquiries made with emergency dispatch centers at several of the towns and cities hit by the 911 outage pointed to a different source: Omaha, Neb.-based Intrado — until last year known as West Safety Communications — a provider of 911 and emergency communications infrastructure, systems and services to telecommunications companies and public safety agencies throughout the country.
Intrado did not respond to multiple requests for comment. But according to officials in Henderson County, NC, which experienced its own 911 failures yesterday, Intrado said the outage was the result of a problem with an unspecified service provider.
“On September 28, 2020, at 4:30pm MT, our 911 Service Provider observed conditions internal to their network that resulted in impacts to 911 call delivery,” reads a statement Intrado provided to county officials. “The impact was mitigated, and service was restored and confirmed to be functional by 5:47PM MT. Our service provider is currently working to determine root cause.”
The service provider referenced in Intrado’s statement appears to be Lumen, a communications firm and 911 provider that until very recently was known as CenturyLink Inc. A look at the company’s status page indicates multiple Lumen systems experienced total or partial service disruptions on Monday, including its private and internal cloud networks and its control systems network.
In a statement provided to KrebsOnSecurity, Lumen blamed the issue on Intrado.
“At approximately 4:30 p.m. MT, some Lumen customers were affected by a vendor partner event that impacted 911 services in AZ, CO, NC, ND, MN, SD, and UT,” the statement reads. “Service was restored in less than an hour and all 911 traffic is routing properly at this time. The vendor partner is in the process of investigating the event.”
It may be no accident that both of these companies are now operating under new names, as this would hardly be the first time a problem between the two of them has disrupted 911 access for a large number of Americans.
In 2019, Intrado/West and CenturyLink agreed to pay $575,000 to settle an investigation by the Federal Communications Commission (FCC) into an Aug. 2018 outage that lasted 65 minutes. The FCC found that incident was the result of a West Safety technician bungling a configuration change to the company’s 911 routing network.
On April 6, 2014, some 11 million people across the United States were disconnected from 911 services for eight hours thanks to an “entirely preventable” software error tied to Intrado’s systems. The incident affected 81 call dispatch centers, rendering emergency services inoperable in all of Washington and parts of North Carolina, South Carolina, Pennsylvania, California, Minnesota and Florida.
According to a 2014 Washington Post story about a subsequent investigation and report released by the FCC, that issue involved a problem with the way Intrado’s automated system assigns a unique identifying code to each incoming call before passing it on to the appropriate “public safety answering point,” or PSAP.
“On April 9, the software responsible for assigning the codes maxed out at a pre-set limit,” The Post explained. “The counter literally stopped counting at 40 million calls. As a result, the routing system stopped accepting new calls, leading to a bottleneck and a series of cascading failures elsewhere in the 911 infrastructure.”
Compounding the length of the 2014 outage, the FCC found, was that the Intrado server responsible for categorizing and keeping track of service interruptions classified them as “low level” incidents that were never flagged for manual review by human beings.
The FCC ultimately fined Intrado and CenturyLink $17.4 million for the multi-state 2014 outage. An FCC spokesperson declined to comment on Monday’s outage, but said the agency was investigating the incident.
from Krebs on Security https://ift.tt/36h4NXC