NetCloud Perimeter PKI infrastructure problems stopping some new devices from becoming active

Incident Report for NetCloud us0 Instance

Resolved

The protective load-shedding code changes have been deployed and appear to be effective. We are continuing to monitor the situation and will work on additional optimizations over the next few days.
Posted Nov 26, 2018 - 21:13 MST

Identified

We are working on a protective load-shedding mechanism to prevent clients from inadvertently saturating the NCP PKI database infrastructure. The NCP master controller changes are currently being tested and will be pushed to production if no problems are encountered.
Posted Nov 26, 2018 - 17:25 MST
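
For readers unfamiliar with the term, load shedding generally means rejecting excess requests early so that a shared backend is not saturated. The following is only an illustrative sketch of one common form (a cap on in-flight requests) and does not describe the actual NCP master controller change. The class name `LoadShedder` and the `max_in_flight` parameter are hypothetical.

```python
import threading


class LoadShedder:
    """Hypothetical admission gate: caps concurrent requests to a backend."""

    def __init__(self, max_in_flight):
        self._max = max_in_flight
        self._in_flight = 0
        self._lock = threading.Lock()

    def try_acquire(self):
        # Admit the request only if capacity remains; otherwise shed it.
        with self._lock:
            if self._in_flight >= self._max:
                return False
            self._in_flight += 1
            return True

    def release(self):
        # Called when an admitted request finishes, freeing one slot.
        with self._lock:
            if self._in_flight > 0:
                self._in_flight -= 1
```

In a design like this, a client whose request is shed would be expected to retry later, ideally with a delay, rather than immediately.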

Update

We are continuing to investigate this issue.
Posted Nov 26, 2018 - 13:57 MST

Update

NCP PKI certificates are issued for one year. Existing NCP devices whose certificates expire within the next 30 days may, like new NCP devices, experience connectivity problems if they have been marked for PKI certificate renewal.
Posted Nov 26, 2018 - 13:50 MST
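
As a rough way to judge whether a given device might fall into the group described above, the sketch below checks whether a certificate is within 30 days of expiry. It assumes a lifetime of exactly 365 days from the issue date, which is a simplification of "issued for a year". The function name `in_renewal_window` is hypothetical, and the check says nothing about whether the service has actually marked the device for renewal.

```python
from datetime import datetime, timedelta

CERT_LIFETIME = timedelta(days=365)   # assumption: "a year" taken as 365 days
RENEWAL_WINDOW = timedelta(days=30)   # certificates expiring in the next 30 days


def in_renewal_window(issued_at, now):
    """Return True if the certificate has not expired yet but will within 30 days."""
    expires_at = issued_at + CERT_LIFETIME
    remaining = expires_at - now
    return timedelta(0) <= remaining <= RENEWAL_WINDOW
```

For example, under these assumptions a certificate issued on Dec 10, 2017 would fall inside the window on Nov 26, 2018, while one issued on Jun 1, 2018 would not.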

Investigating

Since 12:12 PM MST, we have been investigating an issue with our PKI certificate infrastructure that is consuming all database CPU resources. The issue appears to be preventing some new devices from obtaining a PKI certificate from the NetCloud Perimeter (NCP) service, leaving them unable to go active.
Posted Nov 26, 2018 - 13:21 MST