2022-07-28 17:05 UTC
2022-07-28 17:10 UTC
2022-07-28 21:24 UTC
Our SaaS management plane uses AWS us-east-2 as its primary region. AWS started having networking and power issues, which broke our EC2 instances across all AZs. This resulted in console.ves.volterra.io login issues and errors.
Our SaaS management plane uses AWS us-east-2 as its primary region. AWS started having networking and power issues, which broke some of our EC2 instances across all AZs. This resulted in console.ves.volterra.io login issues and API errors.
On Thursday, July 28, at 17:05 UTC, we received alerts about login and API errors. As soon as we were notified, we began looking for the fastest way to recover. At 17:10 UTC we discovered that many of our VMs in AWS were in an error state, and AWS disclosed that it was experiencing an outage in us-east-2. Because of this AWS outage, we were unable to stop, restart, or remove VMs. At 21:24 UTC all services were fully recovered.
The management plane outage was caused by an outage in the AWS us-east-2 region that affected all of its availability zones.
Operations and Engineering are working to improve failover to a backup region to prevent console and API errors during a future regional outage.