We are currently experiencing issues with Printix Cloud and are working to resolve them. We will post updates to this announcement when we have news. Select Follow if you wish to be notified about these updates.
Incident Report:
Printix Sev-1 Incident 13th April 2023
Overview
The Printix service was disrupted for 45 minutes, from 15:15 UTC to 16:00 UTC on 13th April, due to a repeat failure of the Kafka cluster. Service was restored by reindexing and restarting the Kafka cluster, pending additional maintenance to increase capacity on Sunday 16th April.
What Happened
15:15 UTC 13th April:
Our platform monitoring generated alerts that Kafka messages were not being processed, and investigation confirmed that no messages were being processed. At the same time, alerts were also generated for printer locks.
15:30 – 16:00 UTC 13th April:
The Kafka cluster was restarted and reindexed. Message processing resumed and the alerts cleared.
Resolution
The Kafka cluster was restarted and reindexed to restore service, pending the addition of extra nodes to the cluster to increase processing capacity.
Root Causes
The logs showed index corruption errors, caused by high load on the Kafka cluster from the volume of events being received.
Impact
Degradation of service. Customers were unable to execute print jobs during the incident, which lasted approximately 45 minutes.
Action Items
- Add extra nodes to the Kafka cluster (16th April, CAM-4144)
- Monitor for a repeat of this problem over the weekend so we can respond quickly should it recur before the planned change
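As an illustration of the kind of check the monitoring item above implies, the following is a minimal sketch of stall detection for a message consumer. It is not Printix's actual monitoring. The names `OffsetSample` and `is_stalled` and the five-minute window are hypothetical, and the offset samples are assumed to be collected elsewhere (for example from Kafka's own tooling). A consumer is treated as stalled when its committed offset has not advanced over the window while unprocessed messages are waiting.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class OffsetSample:
    """One observation of a consumer's progress (hypothetical structure)."""
    timestamp: float        # seconds since some fixed epoch
    committed_offset: int   # last offset the consumer has processed
    log_end_offset: int     # latest offset written to the partition


def is_stalled(samples: Sequence[OffsetSample], window_seconds: float = 300.0) -> bool:
    """Return True if no progress was made in the window while a backlog exists.

    `samples` must be ordered by timestamp, oldest first.
    """
    if len(samples) < 2:
        return False  # not enough history to judge

    latest = samples[-1]
    # Keep only the samples that fall inside the look-back window.
    window = [s for s in samples if s.timestamp >= latest.timestamp - window_seconds]
    if len(window) < 2:
        return False

    made_progress = latest.committed_offset > window[0].committed_offset
    backlog = latest.log_end_offset - latest.committed_offset
    # An idle consumer (no backlog) is not considered stalled.
    return backlog > 0 and not made_progress
```

A check like this distinguishes a quiet period, where nothing arrives and nothing is processed, from a stall, where messages keep arriving but none are processed.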