We are currently experiencing issues with Printix Cloud. We are working to resolve the issues. We will post updates to this announcement when we have news. Select Follow if you wish to be notified about these updates.
During this period you may experience slow performance and be unable to sign in or release documents via the Printix App. Direct print to printers on the local network is still possible.
- 2022-05-16 09:14 (CET)
Announcement published.
- 2022-05-16 10:27 (CET)
System is up, but only two out of three Cassandra database cluster nodes are active, so you may still experience some slowness and sporadic errors. The team is still working on it.
- 2022-05-16 11:25 (CET)
Problem with sign in to Printix Partner Portal resolved.
- 2022-05-16 11:58 (CET)
The third Cassandra database cluster node is becoming active. Once it is fully active, performance should be back to normal.
- 2022-05-16 15:44 (CET)
It has been reported that users of Printix Go have issues signing in with ID code and registered cards. For ID code, users may get the message Wrong ID code. We are investigating.
- 2022-05-16 17:27 (CET)
We are still investigating why sign in with ID code and card does not work. The workaround is to Reset the ID code and use the new ID code. In the case of a card, you should Register card again.
- 2022-05-16 22:30 (CET)
Our developers have produced a fix for the issue where sign in with already created ID codes and cards stopped working. It will probably be around an hour before it is ready to be deployed to production.
- 2022-05-16 22:45 (CET)
It appears there are intermittent sign in issues. We are working to resolve those as well.
- 2022-05-17 00:05 (CET)
The fix that makes existing ID codes and cards work again has now been deployed. A partial fix for the intermittent sign in issues has also been deployed and has reduced this issue. Another deployment is on its way to hopefully resolve the intermittent sign in issues.
- 2022-05-17 14:32 (CET)
Through deep analysis of the Cassandra database setup we believe we have established that inconsistent data is causing the intermittent sign in issues and the Printix Client unintentionally requesting sign in. The work to change the code and setup has begun, and the changes will be deployed as soon as they are ready and tested, so the issues should gradually disappear as the changes are deployed. We anticipate three deployments, with little or no interruption to service. After each deployment we will closely monitor the system and performance before proceeding to the next deployment. We will post updates as the deployments are completed.
- 2022-05-17 15:47 (CET)
The first of the three deployments has now been completed. We will continue to monitor and prepare for the next deployment.
- 2022-05-17 21:35 (CET)
Additional updates have been deployed to fix the issue with intermittent sign in and the Printix Client unintentionally requesting sign in. There are still changes to other parts of the code that need to be completed, tested, and deployed before we can change the status of this announcement to resolved.
- 2022-05-18 14:02 (CET)
Another set of changes has been completed, tested, and deployed. Changes to one more service are still in progress, but once those are tested and deployed we will change the status of this announcement to resolved.
- 2022-05-19 14:12 (CET)
The last service has now been tested and deployed. Incident report published below. Status changed to resolved.
Incident report
The root cause of this incident was inconsistent data in the Cassandra databases. The inconsistencies began after the scheduled maintenance completed on 2022-05-14 (Saturday, 14 May 2022). Saturday's maintenance involved moving our third Cassandra database cluster node to the same rack as the other two to allow faster performance and speedier recovery/restart. The inconsistencies were introduced because the Cassandra nodes were taken offline during the maintenance.
Late Sunday evening (CET), we started to get support requests from our customers in the Asia Pacific region. They mostly came from partners who had issues signing in to the Printix Partner Portal and Printix Administrator, but some were related to printing. Our team made some changes during the night between Sunday and Monday, but at that time we had not realized that the root cause of the issues was data inconsistency.
Monday, as Europe came online, we got more support requests, and also a PagerDuty alert. We published the first version of this announcement at 09:14 (CET). While investigating and fixing (temporarily, as it turned out) the issue with some users of Printix Go getting the message Wrong ID code, we started to see a pattern that hinted at inconsistent data.
Tuesday, as we continued work and analysis, we established the root cause as inconsistent data. It was decided to change the method by which data is read from the Cassandra databases. Previously, data was read from one database. Going forward, we will read from a majority of databases, and if there are inconsistencies, the system will determine the right value and fix the inconsistency.
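The difference between the two read methods can be illustrated with a small simulation. This is only a sketch under stated assumptions, not Printix's actual code: the `Replica` class, the `read_one` and `read_majority` functions, and the `id_code` key are hypothetical, and it assumes that the newest write timestamp decides which value is correct.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Versioned:
    value: str
    timestamp: int  # write time; assumption: the newest write wins


class Replica:
    """Hypothetical in-memory stand-in for one database node."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.data: Dict[str, Versioned] = {}

    def read(self, key: str) -> Optional[Versioned]:
        return self.data.get(key)

    def write(self, key: str, record: Versioned) -> None:
        self.data[key] = record


def read_one(replicas: List[Replica], key: str) -> Optional[Versioned]:
    """Old method: trust whatever a single node returns."""
    return replicas[0].read(key)


def read_majority(replicas: List[Replica], key: str) -> Optional[Versioned]:
    """New method: ask a majority, keep the newest value, repair stale nodes."""
    quorum = len(replicas) // 2 + 1
    answers = [(replica, replica.read(key)) for replica in replicas[:quorum]]
    found = [record for _, record in answers if record is not None]
    if not found:
        return None
    newest = max(found, key=lambda record: record.timestamp)
    for replica, record in answers:
        if record != newest:
            replica.write(key, newest)  # fix the inconsistency on this node
    return newest


# Node "a" was offline and missed the latest write; "b" and "c" have it.
nodes = [Replica("a"), Replica("b"), Replica("c")]
nodes[0].write("id_code", Versioned("1234", timestamp=1))
for node in nodes[1:]:
    node.write("id_code", Versioned("5678", timestamp=2))

stale = read_one(nodes, "id_code")       # reads only node "a"
fresh = read_majority(nodes, "id_code")  # reads "a" and "b", repairs "a"
```

In this sketch the single-node read returns the outdated value, while the majority read returns the newer one and writes it back to the stale node. A majority read is only guaranteed to see the latest value when the write itself also reached a majority of nodes.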
We then started to change the relevant code in the services one by one, testing the changes on our test system, and then finally deploying to production. As we gradually deployed services to production, we closely monitored the performance of the system to see if the changed method of reading from multiple databases would result in any noticeable performance degradation. Fortunately, that did not turn out to be the case.
Wednesday, the work continued. We had already prioritized services according to which ones impacted the most user operations, so, for example, sign in issues had already been addressed during the previous day's work and deployments.
Thursday, we have now completed the code changes and deployments, and have published this incident report. We are very sorry about the inconvenience this has caused.
Actions
- Perform a review of the Cassandra setup and investigate a potential migration to an externally hosted Cassandra service for further automation and stability.
Friendly regards
The Printix Team