For roughly an hour last week, Google customers were unable to access some of the company’s most popular products, such as Gmail, YouTube, Drive, Maps, Docs and more. Now, the company has provided more details about the outage and its cause.
According to Google, the outage was caused by a bug in its automated quota management system, which affected the Google User ID Service. As a result, users weren’t able to send or receive emails, and some were met with the following error message: “There was a problem with the server.”
"On Monday 14 December, 2020 from 03:46 to 04:33 US/Pacific, credential issuance and account metadata lookups for all Google user accounts failed," Google said.
"As a result, we could not verify that user requests were authenticated and served 5xx errors on virtually all authenticated traffic. The majority of authenticated services experienced similar control plane impact: elevated error rates across all Google Cloud Platform and Google Workspace APIs and Consoles."
In short, the bug reduced the capacity of Google’s central ID system. As a result, the system could not verify that user requests were authenticated.
The outage lasted for about an hour, but as soon as Google’s technicians fixed it, disaster struck yet again, this time taking down just Gmail. For approximately seven hours, some users were unable to send or receive emails.
"The error message indicated that the email address did not exist, and as a result, the impacted emails were never delivered," Google explained. "Affected senders may have received a bounce email generated by an intermediate SMTP service."
Google attributed the problem to an ongoing migration of the underlying configuration system for Gmail’s SMTP inbound service.
"[This change] shifted the formatting behavior of a service option so that it incorrectly provided an invalid domain name, instead of the intended 'gmail.com' domain name, to the Google SMTP inbound service."
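
To make the failure mode concrete, here is a minimal sketch of how a wrong domain value in a mail server's configuration could turn valid addresses into "does not exist" rejections. It is an illustration only, not Google's code: the function names, the `MISFORMATTED_DOMAIN` value and the reply strings are all assumptions made for this example.

```python
# Hypothetical illustration; none of these names come from Google's report.
INTENDED_DOMAIN = "gmail.com"
# Stand-in for whatever invalid value the changed formatting produced (assumed).
MISFORMATTED_DOMAIN = "invalid.example"


def accepts_recipient(address: str, local_domain: str) -> bool:
    """Return True if the address belongs to the domain this server hosts."""
    _, _, domain = address.rpartition("@")
    return domain.lower() == local_domain.lower()


def smtp_reply(address: str, local_domain: str) -> str:
    """Return an SMTP-style reply for a recipient, given the configured domain."""
    if accepts_recipient(address, local_domain):
        return "250 OK"
    # 550 is the SMTP "mailbox unavailable" class of permanent rejection.
    return "550 address does not exist"


# Same recipient, different configured domain, different outcome.
print(smtp_reply("someone@gmail.com", INTENDED_DOMAIN))
print(smtp_reply("someone@gmail.com", MISFORMATTED_DOMAIN))
```

In this sketch the recipient address never changes; only the configured domain does. That is why a single bad option can cause a server to reject mail for accounts that are perfectly valid, and why the sending side then produces a bounce.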