tag:status.achieveit.com,2005:/historyAchieveIt Status - Incident History2024-03-29T01:23:04-04:00AchieveIttag:status.achieveit.com,2005:Incident/198857482024-02-01T21:12:53-05:002024-02-01T21:12:53-05:00Slowness loading some plans and dashboards<p><small>Feb <var data-var='date'> 1</var>, <var data-var='time'>21:12</var> EST</small><br><strong>Resolved</strong> - We have published a fix for the performance issue related to loading certain plan information, and all system metrics are normal.</p><p><small>Feb <var data-var='date'> 1</var>, <var data-var='time'>11:48</var> EST</small><br><strong>Identified</strong> - We have identified the cause of the performance degradation and are working on a fix. To alleviate the performance bottleneck, we have increased resources available to the database in the Commercial hosting environment and are seeing that load has substantially improved. We will continue to monitor database performance and change scaling parameters as needed as we work on a permanent resolution.</p><p><small>Feb <var data-var='date'> 1</var>, <var data-var='time'>10:46</var> EST</small><br><strong>Investigating</strong> - We are investigating an issue that appears to be causing some instances of plan information to either load slowly or fail to load when viewing an individual plan, running cross-plan reports, or loading custom dashboards.</p>tag:status.achieveit.com,2005:Incident/176219992023-06-19T13:04:44-04:002023-06-19T13:04:44-04:00Application unavailable in Commercial production environment<p><small>Jun <var data-var='date'>19</var>, <var data-var='time'>13:04</var> EDT</small><br><strong>Resolved</strong> - All system functionality has returned to normal.</p><p><small>Jun <var data-var='date'>19</var>, <var data-var='time'>12:47</var> EDT</small><br><strong>Monitoring</strong> - A spike in database utilization related to a maintenance task appears to have caused the service interruption. 
We've halted the maintenance process and the application is now operating normally.</p><p><small>Jun <var data-var='date'>19</var>, <var data-var='time'>12:23</var> EDT</small><br><strong>Investigating</strong> - There is a service interruption preventing users from logging into the application in the Commercial environment (my.achieveit.com). We are investigating the cause of the failure.</p>tag:status.achieveit.com,2005:Incident/162398512023-02-23T21:21:44-05:002023-02-23T21:21:44-05:00Intermittent email delivery delays<p><small>Feb <var data-var='date'>23</var>, <var data-var='time'>21:21</var> EST</small><br><strong>Resolved</strong> - The Mailgun service appears to have stabilized. During the past 12 hours, all emails sent from our system through Mailgun have been delivered as expected. We are marking our issue as resolved, and will provide additional updates from Mailgun if there are any of importance to our customers.</p><p><small>Feb <var data-var='date'>23</var>, <var data-var='time'>10:56</var> EST</small><br><strong>Monitoring</strong> - Mailgun continues to be under an active DDoS attack. We have observed improvements in email delivery rates over the past 24 hours, which indicates that Mailgun's mitigation efforts are effective. We will continue to monitor delivery rates and Mailgun reports for the immediate future and provide updates when there is a substantial change.</p><p><small>Feb <var data-var='date'>22</var>, <var data-var='time'>16:34</var> EST</small><br><strong>Identified</strong> - The email service used by AchieveIt to send emails from our production applications is experiencing intermittent delays delivering certain emails to recipients. The service, Mailgun, has reported that they have been under a distributed denial of service attack since February 21, 2023. 
During that time, Mailgun has been working with its hosting and infrastructure providers to reduce the impact of the attack and restore normal delivery.<br /><br />We have observed that a small percentage of emails from AchieveIt in the past 36 hours have been either delayed or not yet delivered. Most emails are, however, being delivered as expected. We will continue to monitor Mailgun's response and provide updates here. We are also evaluating options for delivering the remainder of the delayed emails.</p>tag:status.achieveit.com,2005:Incident/102009072022-06-09T12:58:02-04:002022-06-09T12:58:02-04:00Email delivery is delayed<p><small>Jun <var data-var='date'> 9</var>, <var data-var='time'>12:58</var> EDT</small><br><strong>Resolved</strong> - This incident has been resolved.</p><p><small>Jun <var data-var='date'> 9</var>, <var data-var='time'>12:57</var> EDT</small><br><strong>Update</strong> - We have verified that email flow has been completely restored and all delayed emails have either been delivered or are in the process of being delivered.</p><p><small>Jun <var data-var='date'> 9</var>, <var data-var='time'>10:09</var> EDT</small><br><strong>Update</strong> - Mailgun has confirmed that they have restored all services but are still seeing delays in delivering emails. We are seeing many emails being successfully delivered, but it is likely that some emails are also being delayed. All delayed emails will be delivered once Mailgun resolves their system problem.</p><p><small>Jun <var data-var='date'> 9</var>, <var data-var='time'>09:46</var> EDT</small><br><strong>Monitoring</strong> - Mailgun has at least partially resolved their system outage and we are beginning to see emails flow again. 
We will continue to monitor to ensure that emails continue to be delivered as expected and post an update soon.</p><p><small>Jun <var data-var='date'> 9</var>, <var data-var='time'>09:08</var> EDT</small><br><strong>Identified</strong> - The service that AchieveIt uses to send emails from our platform, Mailgun, is experiencing a service interruption that is causing emails sent from our system to be delayed. We are monitoring Mailgun's updates as they work toward a fix and will publish updates here as we know more. Any email that should be sent will still be delivered once the email service has been restored.</p>tag:status.achieveit.com,2005:Incident/90658852022-01-12T21:38:35-05:002022-01-14T08:41:08-05:00Slowness loading some plan information<p><small>Jan <var data-var='date'>12</var>, <var data-var='time'>21:38</var> EST</small><br><strong>Resolved</strong> - We have confirmed that all plans and plan items are loading as expected. We will publish a post mortem with additional information about the incident in the next two days.</p><p><small>Jan <var data-var='date'>12</var>, <var data-var='time'>20:15</var> EST</small><br><strong>Monitoring</strong> - We have confirmed system performance has returned to normal. We will continue to monitor activity for the next hour to confirm that all performance issues have been resolved.</p><p><small>Jan <var data-var='date'>12</var>, <var data-var='time'>18:33</var> EST</small><br><strong>Update</strong> - We have finished applying the changes to the database to improve the performance. We are seeing improvements but are continuing to evaluate to identify if there are any scenarios where displaying list views of plans is not functioning as expected.</p><p><small>Jan <var data-var='date'>12</var>, <var data-var='time'>16:32</var> EST</small><br><strong>Update</strong> - We are in the process of applying changes to the database to alleviate the performance issues. 
In the process of making these changes, some areas of the application may have limited functionality. We will continue to update as we complete these changes.</p><p><small>Jan <var data-var='date'>12</var>, <var data-var='time'>15:44</var> EST</small><br><strong>Identified</strong> - We have identified that the performance degradation is related to a database configuration. We are working to confirm the root cause and also make configuration adjustments to alleviate the immediate performance issue.</p><p><small>Jan <var data-var='date'>12</var>, <var data-var='time'>13:52</var> EST</small><br><strong>Investigating</strong> - We are investigating an issue that appears to be causing information for some plans to load slowly, and in some instances fail to load.</p>tag:status.achieveit.com,2005:Incident/87164422021-11-29T14:07:58-05:002021-11-30T14:24:57-05:00SSL certificate error preventing access to web application in US Government environment<p><small>Nov <var data-var='date'>29</var>, <var data-var='time'>14:07</var> EST</small><br><strong>Resolved</strong> - We have verified that all system access has returned to normal.</p><p><small>Nov <var data-var='date'>29</var>, <var data-var='time'>11:40</var> EST</small><br><strong>Monitoring</strong> - We have updated the SSL certificate and confirmed that the errors preventing the app from functioning are cleared. We will continue to monitor the issue for the next hour.</p><p><small>Nov <var data-var='date'>29</var>, <var data-var='time'>11:20</var> EST</small><br><strong>Identified</strong> - We have identified that one of the SSL certificates in our US Government environment was not automatically renewed as it should have been. 
We're working to update that certificate and restore service.</p><p><small>Nov <var data-var='date'>29</var>, <var data-var='time'>10:56</var> EST</small><br><strong>Investigating</strong> - We are currently investigating this issue.</p>tag:status.achieveit.com,2005:Incident/79026102021-09-02T12:55:13-04:002021-09-10T15:14:08-04:00DNS lookup failures<p><small>Sep <var data-var='date'> 2</var>, <var data-var='time'>12:55</var> EDT</small><br><strong>Resolved</strong> - We have confirmed that DNS lookups are functioning as expected and all services are accessible again. We will perform additional investigation and post a post mortem as soon as we have identified the root cause of the interruption.</p><p><small>Sep <var data-var='date'> 2</var>, <var data-var='time'>12:35</var> EDT</small><br><strong>Monitoring</strong> - We have updated a configuration in our DNS records and we believe that has allowed lookups to succeed again. We will continue to monitor the issue and provide an update with additional resolution details shortly.</p><p><small>Sep <var data-var='date'> 2</var>, <var data-var='time'>12:13</var> EDT</small><br><strong>Identified</strong> - We have identified a problem with part of the security configuration in our DNS settings that is likely causing the lookup failures. We are working with our DNS provider to try to resolve the issue.</p><p><small>Sep <var data-var='date'> 2</var>, <var data-var='time'>11:00</var> EDT</small><br><strong>Investigating</strong> - We are receiving reports that some users are not able to access our production web applications. The failures appear to be related to DNS lookups failing in some public DNS resolvers. 
We are investigating the issue and will update with additional details.</p>tag:status.achieveit.com,2005:Incident/74872242021-07-12T19:52:43-04:002021-07-12T19:53:05-04:00Intermittent Connectivity Issues<p><small>Jul <var data-var='date'>12</var>, <var data-var='time'>19:52</var> EDT</small><br><strong>Resolved</strong> - As of 21:36 UTC (5:36 PM EDT), Microsoft has reported that it has mitigated the performance degradation on its CDN services. We have observed that AchieveIt application behavior has returned to normal. We will continue to monitor system performance, but we believe the issue is now resolved.</p><p><small>Jul <var data-var='date'>12</var>, <var data-var='time'>15:40</var> EDT</small><br><strong>Identified</strong> - Our hosting provider, Microsoft Azure, is experiencing a degradation of their content delivery network (CDN) service that is affecting some AchieveIt users. AchieveIt uses the Azure CDN to serve the static web content for our web application. A user who is affected by this incident may experience the AchieveIt application loading slowly or sometimes timing out. You may be able to temporarily resolve the connection problem by refreshing the web application in your browser or retrying the action that timed out.<br /><br />We will continue to monitor and provide updates as Microsoft works to resolve the problem.</p>tag:status.achieveit.com,2005:Incident/68006802021-04-20T11:38:00-04:002021-04-23T06:39:31-04:00SSO login failures<p><small>Apr <var data-var='date'>20</var>, <var data-var='time'>11:38</var> EDT</small><br><strong>Resolved</strong> - An issue was identified with our single sign-on provider, Auth0, which prevented customers who log in via SSO from accessing the AchieveIt web application hosted in our Commercial environment. The interruption spanned from approximately 3:38 PM UTC until 8:00 PM UTC. 
During that time, users who did not already have an active user session and were required to log in using their organization's credentials through SSO could not access the application. Other portions of the system, including direct login with username and password; email notifications and scheduled email reports; and providing progress updates through the Progress Update Landing Page still functioned normally.<br /><br />Auth0 is continuing to provide status updates on the root cause as they investigate at https://status.auth0.com/incidents/zvjzyc7912g5.</p>tag:status.achieveit.com,2005:Incident/68006222021-04-01T20:28:00-04:002021-04-20T20:38:43-04:00Azure DNS outage<p><small>Apr <var data-var='date'> 1</var>, <var data-var='time'>20:28</var> EDT</small><br><strong>Resolved</strong> - From approximately 9:40 PM UTC until 10:40 PM UTC, Azure's global DNS service experienced a service availability issue caused by an anomalous surge in DNS queries from across the globe. Azure's edge DNS servers failed to handle the increased volume due to a code defect, and the result was that during this time many users were unable to reach the AchieveIt web application in both our Commercial and US Government environments. Azure has since resolved the defect in its DNS service to prevent a recurrence of the same type of service interruption.</p>