Update - The mailstore05 machine is erroring with "failed to lock." We are in contact with our vendor so that it can be properly restarted. Customers on this mailstore will be unable to access their email during this time. Estimated timeframe is an hour or so. Incoming customer email will spool in the Edgewaves during this time and be delivered once the server is back up and running. All other mailstores are functioning normally.
Jul 3, 15:42 PDT
Update - Email copying to the new hardware was going well, but then threw a "general error" and the system failed. Mailstore05 was unavailable for about 20 minutes, but is coming back online now. This is still the existing system, not the new hardware. Performance issues are expected to remain the same while engineering works to re-do the mailstore copy.
Jul 3, 15:34 PDT
Update - Mailstore load continues to be high, which means continued performance issues for those with mailboxes on that server. Mail queues have been rising frequently during the day, but engineers have been manually managing these to help make sure that delivery times don't escalate too far. Transfer of the affected mailstore to new hardware is 80% complete; once finished, it is expected to provide additional performance benefits. Engineering continues to evaluate and implement additional actions to bring the system load down to expected normal levels.
Jul 3, 13:03 PDT
Update - The mail delivery queue was emptied again, so all mail has been delivered. Mailstore load remains at moderate to high levels. Engineers are working to keep mail delivered in a timely manner. The migration of the poorly performing mailstore to new hardware is making progress.
Jul 3, 10:32 PDT
Update - Mailstore load remains moderate to high. Users on this system continue to experience performance issues. Mail queues are slightly elevated at times; engineering is working to keep them clear for timely mail delivery.
Jul 3, 08:31 PDT
Update - Mail delivery queues were emptied earlier this morning but, for unknown reasons, are climbing back up. Engineering continues to monitor performance. The migration of the poorly performing mailstore to new hardware is making progress.
Jul 3, 08:27 PDT
Update - System load remained moderate throughout the evening and morning. Mail delivery queues began building for an as-yet-unknown reason in the early hours but are currently draining. Engineering continues to monitor performance. Migration of the poorly performing mailstore to new hardware is approximately 2/3 complete.
Jul 3, 06:30 PDT
Update - Mail queues remain empty, and average mailstore load is remaining moderate, with the occasional activity spike. Migration to new storage hardware is progressing much faster than anticipated.
Jul 2, 20:03 PDT
Update - Mail system performance continues to improve overall. Mail queue delays have remained minimal throughout the bulk of the day, aside from an elevation this morning. Average load on the problematic mailstore has stayed lower, at a level where customers can access and navigate their mail. However, fluctuations at various intervals are still manifesting to users as intermittent slowness or timeouts. Engineers continue to look for other services to migrate from this storage array to keep the system stable for customer use.
In tandem, for permanent performance correction, engineers have been installing new hardware and are now beginning to migrate the mailstore away from the TrueNAS network storage device that is performing poorly.
Jul 2, 16:58 PDT
Update - Mailstore load continues to fluctuate and is averaging high. This means that customers on the affected mailstore are still experiencing performance issues. Engineering is continuing to work to keep this load as low and stable as possible. Mail delivery queues have not increased since earlier, and mail delivery should be timely.
Jul 2, 14:06 PDT
Update - The storage array remains under heavy load after restoring POP/IMAP sessions; however, the delivery queue build-up was cleared quickly. The delivery queue is currently holding steady, and engineering is watching for increased build-up.
Jul 2, 11:54 PDT
Update - Mailstore load continues to fluctuate heavily. This means continued performance issues for customers on the affected mailstore and intermittent connectivity issues when talking to the mail server (timeouts). Mail queue size increased, then began reducing, but has started creeping up again. Engineering is briefly pausing IMAP/POP access to mailstore05 to allow the queue to drain. This will not affect webmail access. This will allow messages to get delivered to users. Queue depth should stay fairly low after this, as the system is keeping up in general.
Engineering continues migrating non-mail systems to give us more performance. This has greatly helped keep the delivery queues stable, but has not returned system performance to normal status.
Jul 2, 11:32 PDT
Update - Mailstore load is fluctuating heavily and averaging higher. This means continued performance issues for customers on the affected mailstore and intermittent connectivity issues when talking to the mail server (timeouts). This increased load is beginning to affect the incoming mail queues, and they are starting to rise slowly. This means moderate delays, at this time, for incoming messages to get delivered to mailboxes.
Jul 2, 09:37 PDT
Update - Mail queues have remained low into the morning, which means standard email delivery times. However, load on the storage system increased significantly in the last 15 minutes, though it is lowering again. This level remains higher than optimal, and continues to mean slowness and timeouts for customers whose mailboxes are housed on that mailstore. Engineering continues to monitor the load, adjusting settings and migrating services away from that hardware to increase performance.
Jul 2, 08:33 PDT
Update - Email delivery queues remained empty throughout the evening, as expected. Overall system load on the affected mailstore is still elevated, and webmail navigation and access will still present as slow/sluggish, but available. Additional migration of services away from the network storage device has continued throughout the evening. Engineering will be monitoring system load as we move into the morning hours.
Jul 2, 05:36 PDT
Update - Mail queue delays continue to get lower, and the queues are getting close to empty. There is no need to disable POP/IMAP again on the affected mailstore at this point to assist with the processing. System load is lower, but still elevated. Webmail navigation and access will likely still present as slow/sluggish. Additional services have been offloaded from the mailstore storage array, which is continuing to assist in the recovery.
Jul 1, 20:12 PDT
Update - Good progress on mail queue delays. The queues aren't empty yet, but they continue to drop. Briefly disabling POP/IMAP to the poorly performing mailstore helped bring them down quickly, and those services have since been re-enabled. (If the queue backlog doesn't continue to go down at the expected rate, this tactic may be used again later in the evening.)
Engineering continues to move services off the network storage device to free up more performance and get us back to stable conditions.
Load still remains high, and mailboxes on this mailstore continue to have performance issues. Message delays for other users are lessening.
Jul 1, 16:09 PDT
Update - Efforts to increase resources are going well. Loads and message queues remain high, causing customer problems. However, with the additional resources (only part of the ongoing freeing-up), the queue has been reducing over the past hour. In an effort to reduce queue backlog, we are briefly suspending IMAP/POP for mailstore05 users. Webmail access remains. This will be a short duration, to help reduce load at a faster rate.
Jul 1, 15:03 PDT
Update - No change in status.
Loads on the mail system continue to be high. Customers are continuing to experience delays or timeouts when logging into or navigating webmail on the problematic mailstore. Performance issues are leading to longer delays in email delivery for all customers, as mail queues build. Email is not being lost, but is being held in queue for delivery until that queue can be successfully processed by the system.
Engineers are continuing to offload services from the affected equipment. This is going well, but is taking time. Doing so does come with additional performance impact to the system, but this is temporary, and will help the overall load as a whole.
Jul 1, 12:51 PDT
Update - Loads on the mail system continue to rise. Customers are continuing to experience delays or timeouts when logging into or navigating webmail on the problematic mailstore. Performance issues are leading to longer delays in email delivery for all customers, as mail queues build. Email is not being lost, but is being held in queue for delivery until that queue can be successfully processed by the system.
Engineers are offloading services from the affected equipment, to give it extra resources to process mail. Doing so does come with additional performance impact to the system, but this is temporary, and will help the overall load as a whole.
Jul 1, 10:24 PDT
Update - Loads are rising and affecting performance. We're seeing delays or timeouts when logging into or navigating webmail for users on the problematic mailstore. Performance issues are leading to longer delays in email delivery, as mail queues build. Engineers are continuing to work to reduce load.
Jul 1, 08:33 PDT
Update - Engineering is monitoring the email systems. Load is rising on the system, as expected, as mailboxes start getting used more in the morning. Slow access times for the offending mailstores are still present. Email is flowing, and there remains minimal delay to delivery/sending. Engineering is continuing to work on reducing load for more stable performance.
Jul 1, 07:56 PDT
Update - At approximately 8pm, the email backlog was successfully cleared out by limiting POP/IMAP connections on the problematic mailstore (this reduced system load so that it could better process and catch up). These connections were then allowed again. Load on the system immediately rose, but has remained stable, though elevated, throughout the night. This has meant that customers have been able to connect successfully, and there has not been significant delay in email sending.
This is a lower-use time. Engineering continues to monitor the effects, since load will increase into the morning with normal customer usage.
Jul 1, 04:55 PDT
Update - Engineers continue to work to alleviate system load. Backlogged emails have stayed steady since the afternoon. System handling and database tweaks have been made, trying to help the system catch up through the deliveries. This process continues. Customers on the problematic mailstore continue to have intermittent connectivity issues and webmail slowness. All customers are experiencing delays due to the mail queue backlogs.
Jun 30, 16:56 PDT
Update - Some modifications made by engineering appear to have been effective. We continue to work through mail delivery backlogs, though that amount is still high.
Jun 30, 13:30 PDT
Update - Engineering continues to work to find a resolution for the affected system. The underlying issue is poor performance of the storage array that houses a significantly affected mailstore. Because of the performance issues on this machine, subsequent problems are appearing that are affecting customers who are not on that set of hardware. Primarily, this is manifesting as significant sending and receiving delays for customers using our email servers.
Engineering continues to work to determine the reason for the storage array performance issues, and is working both on its own to reduce load and with our vendor to come up with a permanent fix.
Jun 30, 10:57 PDT
Update - Status unchanged from last update. Engineering continues to work on mailstore performance issues. Customer email is accessible, but performance of some systems remains below expectations, resulting in continued slow access for those mailboxes affected.
Engineering is working to move mailboxes to other servers to help alleviate load on the problem server.
Jun 30, 07:28 PDT
Update - Engineering continues to work on mailstore performance issues. Customer email is accessible, but performance of some systems remains below expectations, resulting in continued slow access for those mailboxes affected.
Jun 30, 04:50 PDT
Update - Engineering is continuing to work on the issues. Heavy load continues to affect one of the mailstores and customer email boxes on that server.
Jun 29, 20:13 PDT
Update - The underlying issues are with our storage array. Attempts to mitigate the effect this has been having on customers have been unsuccessful. The device has been rebooted in its entirety and is back up. Engineering continues to review the performance of the systems following this change.
Jun 29, 17:42 PDT
Update - Engineers are doing an emergency reboot of some key processes to assist with the ongoing mailstore issues. Customers that have been experiencing slowness will be unable to connect to email during this time (some were already experiencing this intermittently). Downtime for the affected mailstores will be between 5 and 30 minutes, depending upon the reboot process.
Jun 29, 14:58 PDT
Update - Load is lowering, but affected mailstores are still under extremely high load as they work to process email and connection requests. This continues to manifest as slowness and timeouts for customers on the affected devices. Engineering continues to work on additional methods to assist performance. Email pre-harvest is being temporarily suspended as a whole to assist with performance while we deal with the unanticipated system performance hit.
Jun 29, 11:53 PDT
Update - Pre-migration harvesting has been backed off to improve overall system performance. Engineering continues to monitor the situation.
Jun 29, 09:57 PDT
Identified - Engineering has identified slow email processing. Some customers are experiencing slow loading times for webmail. There is heavy load on several of the mail servers due to email migration pre-harvest. Engineering is working to mitigate these issues at this time. This is affecting customer access to the mail server. No messages are being lost.
Jun 29, 08:07 PDT