We've confirmed that all systems are back to normal with no customer impact as of 02/08, 00:00 UTC. Our logs show the incident started on 02/07, 21:30 UTC. During the 2.5 hours it took to resolve the issue, customers would have experienced alerting issues, i.e., notifications were not received for alerts configured on availability or on metrics.
- Root Cause: The failure was caused by communication failures between the two services responsible for alert rules and alert input.
- Lessons Learned: We understand the root cause completely, and work has been planned to prevent this issue from recurring.
- Incident Timeline: 2 hours and 30 minutes, from 02/07, 21:30 UTC through 02/08, 00:00 UTC
We understand that customers rely on Application Insights as a critical service and apologize for any impact this incident caused.
We continue to investigate issues within Application Insights. The root cause is not fully understood at this time. Customers continue to experience alerting-related issues, i.e., no email notifications will be received for configured alerts. We are working to establish the start time of the issue; initial findings indicate that the problem began at approximately 02/07, 21:30 UTC. We currently have no estimate for resolution.
- Next Update: Before 02/08 03:00 UTC
We are aware of issues within Application Insights and are actively investigating. Customers may not receive alerting emails based on availability tests. We will provide more information as we learn more.
- Workaround: Customers can use the Azure portal to view failures and successes in the availability charts.
- Next Update: Before 02/08 00:30 UTC
We are working hard to resolve this issue and apologize for any inconvenience.
This post first appeared on MSDN Blogs.