Please check this page for the latest updates on the status of DiscoverLink Talent.
9:15 am CT, September 21, 2017
All systems are operating normally.
8:30 pm CT, September 20, 2017
The Microsoft Azure team has determined the root cause and applied a fix to our server. The server has stabilized over the past hour, and we will continue to monitor the situation to confirm that the fix holds. We thank you again for your patience and apologize for any inconvenience this may have caused you and your team. Please notify your Client Services manager right away if you see any further issues.
4:45 pm CT, September 20, 2017
We have determined that the DiscoverLink Talent application is repeatedly restarting itself, logging off active users each time it does. The application is behaving this way on only one of our five servers, and we do not yet know why that server is affected. The Microsoft Azure team continues to actively investigate the cause of the restarts, and we expect to identify a solution as soon as they pinpoint the reason.
As a backup measure, we are in the process of cloning the impacted server to create a new instance. Because we do not know whether this will resolve the issue, we are also building a new server from scratch and copying all of the content over to it. This is a lengthy process, so we hope the Microsoft Azure team will find a solution before these measures are complete.
Thank you for your continued patience as we employ multiple strategies to return our service to normal.
1:10 pm CT, September 20, 2017
We continue to work on the issues with one of our production servers. The Microsoft Azure team has identified a few potential root causes and has narrowed its investigation to the application server. We appreciate your patience and your encouraging email responses. Please know we are focused on resolving this issue and will continue to keep you updated as progress is made.
10:30 am CT, September 20, 2017
Yesterday, September 19, at 9:30 a.m. CT, we noticed system slowness followed by unexpected memory spikes and subsequent user logoffs, and we began troubleshooting immediately.
While we have not yet determined the root cause, we have taken a number of steps to isolate and address the issue; some have offered improvement, but none has completely resolved it. We have actively engaged the Microsoft Azure team, which supports our cloud environment, and they continue to troubleshoot the issue with us. Here’s what we know so far:
- This issue is only impacting one of our five production servers. We are currently investigating moving impacted clients to a new server, but need to assess if any downtime would be required.
- We have increased the memory of the impacted server, and while that provided a small improvement, it did not resolve the issue.
- We removed two analytics tracking tools that were on the impacted server, and at 1:00 a.m. this morning we thought this had resolved the issue, as we were able to hold sessions in a stable state for more than three hours. However, traffic on the server was very low at that point, and when normal traffic resumed this morning, sessions were again interrupted.
We are currently investigating a number of other options, and we will continue to update you throughout the day as we learn more. Please know that we are doing everything in our power to return our service to normal, and we have focused every available engineering, quality assurance, and client services resource on resolving this issue. We will continue this focus until the issue is completely resolved.
We are very sorry for the inconvenience caused by this issue, and we appreciate your continued patience as we work to resolve it.