[Major] Issues with Box AI
Incident Report for Box
Postmortem

We recently addressed issues affecting Box AI. We would like to take the opportunity to further explain these issues and the steps we have taken to keep them from happening in the future.

Between March 7, 2024 5:00 PM PT and March 8, 2024 3:00 AM PT, some users may have experienced difficulties while using Box AI in Box Notes and Preview. During this time, queries made to Box AI returned an error. The issue was caused by a change to a load balancing configuration for one of the endpoints used by Box AI, which resulted in unexpected credential issues. We resolved the issue by fixing the configuration and improving the resiliency of our system. In addition, we have increased alerting and implemented more monitoring and logging to help prevent similar issues from occurring in the future.

Analysis

At Box, we are always striving to improve the efficiency and fault tolerance of our services. The AI team implemented a change to resolve latency issues in our services. A misconfiguration in this change temporarily caused AI query failures, which we resolved by identifying and correcting the problematic configuration settings. Additionally, the issue revealed gaps in our alerting and testing systems, as well as in our internal team process, which together increased remediation time.

Corrective Actions

The following corrective actions have been completed or are planned:

  • Add more alerting and automated processes to prevent future complications.
  • Improve observability by adding more logs, metrics, and tracing throughout the system.
  • Improve testing reliability by making testing environments mirror production environments.

We are continuously working to improve Box and want to make sure we are delivering the best product and user experience we can. We hope we have provided some clarity here and we would be happy to answer any questions you may still have regarding this matter. 

Sincerely,
The Box Team

Posted Apr 02, 2024 - 09:56 PDT

Resolved
After further monitoring, this incident is now considered resolved. Box AI Service has been restored to full functionality. If you continue to experience any issues, please contact Box Support at https://support.box.com.
Posted Mar 08, 2024 - 02:14 PST
Investigating
We are investigating an ongoing issue affecting Box AI. Users may see errors or slowness when using Box AI. We will provide more information as soon as it is available.
Posted Mar 08, 2024 - 01:27 PST
This incident affected: Box AI.