Sporadic email failures on EU production server

Incident Report for Viedoc

Postmortem

Description

Between 9 AM CEST on 2025-09-02 and noon CEST on 2025-09-03, email alerts expected to be triggered from Viedoc on the EU production instance failed sporadically. No other instances were affected. Customers were first informed through the status page at 15:56 CEST on 2025-09-02.

Cause

At 9 AM CEST on Tuesday 2025-09-02, the system came under very high load. This caused the worker function apps, which handle various processes in the EDC, to scale out to the maximum of 20 parallel instances. The scale-out caused failures, and combined with the built-in retry policy for failing actions, this overloaded the system.
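As a rough illustration of why retries can amplify an overload, the sketch below estimates how many attempts a single action generates on average as its failure probability rises. The function name, the failure probabilities, and the retry count are hypothetical and are not taken from Viedoc's configuration.

```python
def expected_attempts(failure_prob: float, max_retries: int) -> float:
    """Average number of attempts per action when each attempt fails
    independently with probability failure_prob and is retried at most
    max_retries times (illustrative model only)."""
    # Attempt k (0-based) only happens if all k earlier attempts failed.
    return sum(failure_prob ** k for k in range(max_retries + 1))


# Hypothetical values: a retry limit of 3 and three failure rates.
for p in (0.0, 0.5, 1.0):
    print(f"failure rate {p:.0%}: {expected_attempts(p, 3):.3f} attempts per action")
```

In this simplified model, the more often actions fail, the more total work the same number of actions generates, which in turn can raise the failure rate further.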

While form save and form view continued to work as expected throughout the duration of the issue, there were sporadic errors in specific system actions in the post-processing step.

  • EDC email alerts, configured in the CRF design, were sporadically not sent during this time window. Not all emails were affected.
  • Terms defined to be coded in the medical coding tool sporadically failed to sync to Viedoc Coder. Not all terms were affected.

Corrective action

Once the root cause of the exceptions was identified, the worker function app was gradually scaled down to run on 4 instances, to limit the number of actions running at the same time. This work began around 11 AM CEST on 2025-09-03, and an immediate improvement could be seen. Around noon on 2025-09-03 the situation was completely resolved, and all emails could be sent as expected again.

Email alerts that were not sent during this issue could not be re-sent and will not appear in the communication log.

All terms that had not synced properly with Viedoc Coder during the issue were synced with Viedoc Coder on the afternoon of 2025-09-03 and 2025-09-04.

Preventive action

Scaling down the worker function app to run on 4 instances is not expected to have any noticeable impact on performance, but will ensure that the same overload issue does not occur again. We will evaluate whether it is necessary to scale up the worker function app again and, if so, how this can be done without causing similar issues.
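The general idea of capping parallel work can be sketched in a few lines. This is only an analogy: the actual change was made to the instance count of the worker function app, not in application code, and the names `run_bounded` and `fake_action` are hypothetical.

```python
import asyncio


async def run_bounded(actions, max_concurrent=4):
    """Run all actions, but never more than max_concurrent at once.
    Returns the results and the peak number of actions in flight."""
    semaphore = asyncio.Semaphore(max_concurrent)
    in_flight = 0
    peak = 0

    async def run_one(action):
        nonlocal in_flight, peak
        async with semaphore:  # waits here while the cap is reached
            in_flight += 1
            peak = max(peak, in_flight)
            try:
                return await action()
            finally:
                in_flight -= 1

    results = await asyncio.gather(*(run_one(a) for a in actions))
    return results, peak


async def fake_action():
    """Stand-in for a post-processing step such as sending an email."""
    await asyncio.sleep(0.01)
    return "sent"


if __name__ == "__main__":
    results, peak = asyncio.run(run_bounded([fake_action] * 20, max_concurrent=4))
    print(len(results), "actions completed; peak concurrency:", peak)
```

A lower cap trades some throughput for predictability: queued actions wait their turn instead of competing for the same downstream resources.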

Posted Sep 09, 2025 - 20:24 UTC

Resolved

Since approximately noon CEST, emails triggered from the EDC have been delivered as expected.
Posted Sep 03, 2025 - 14:27 UTC

Monitoring

A fix has been implemented and emails are now being sent as expected. We will continue to monitor the situation.
Posted Sep 03, 2025 - 11:25 UTC

Identified

The issue has been identified and a fix is being implemented.
Posted Sep 03, 2025 - 08:49 UTC

Investigating

Emails triggered from EDC actions in Viedoc have been failing sporadically on the EU production instance since approximately 9 AM CEST on 2025-09-02. No other instances are affected. Investigations are ongoing.
Posted Sep 02, 2025 - 13:56 UTC
This incident affected: Viedoc 4 - Europe (Main portal (clinic/admin/designer)).