We recently addressed an issue affecting Public API, Downloads, Uploads, Login, and Sign. We would like to take this opportunity to explain the issue further and describe the steps we have taken to keep it from happening again.
On October 2, 2023, between 09:50 am PDT and 10:27 am PDT, some users may have had a degraded experience or difficulty accessing Public API, Downloads, Uploads, Login, and Sign. The issue was caused by a change to one of our backend systems, which created a bottleneck that impacted other internal services. We resolved the issue by deploying more instances of the service experiencing the bottleneck. In addition, we are implementing improvements that give us greater control over the flow of traffic within our backend systems. We are also establishing new alerts on the impacted processes to decrease the time to detect and mitigate similar issues if they occur in the future.
Analysis
Box services are underpinned by a common service communication layer called “Service Mesh,” which allows Box’s services to communicate securely and to scale automatically in response to traffic bursts. During this incident, the Service Mesh layer did not behave as expected. Specifically, the impacted service did not auto-scale as planned and, as a result, became overwhelmed, which led to the issue. A deeper analysis showed that the existing Service Mesh configuration, while optimal for on-premises deployments, was not optimized for the traffic profile experienced in the public cloud.
We have been working with the technology vendor for our Service Mesh implementation to tune the configuration for the public cloud environment and have been rolling out the changes incrementally. We have also increased capacity on critical services while these changes roll out, to minimize reliance on auto-scaling during that period.
Corrective Actions
The following corrective actions have been completed or are planned:
We are continuously working to improve Box and want to make sure we are delivering the best product and user experience we can. We hope we have provided some clarity here, and we would be happy to answer any questions you may still have regarding this matter.
Sincerely,
The Box Team