Around 1 am yesterday, I got a report that the forum was inaccessible and showing a database error page. A certificate had expired and the database had run into a memory issue, but I did not have the access needed to fix either one. When I was added to the DigitalOcean account, I was given access to only one server, and that became a problem yesterday. We had to get in touch with the "old" server guy to get this fixed. He ended up setting everything up as a managed cluster. He had to balance this with his job, which is why it took so long. Regardless, this makes administering everything a fair bit easier in the long run, and I now have access to everything, as I should. It goes without saying that I am terribly sorry all this happened, but we should be better off if something like this happens again. More detailed info below:
We use Kubernetes as the orchestration engine, the thing that manages how all the containers work together.
It uses TLS certificates so that the different pieces of the puzzle can authenticate to each other when they talk.
The master cert expired, and in the version of Kubernetes we were running there wasn't an easy way to renew those certs (that has since been fixed in newer versions). The expired certificate made the control utility (kubectl) useless, so we couldn't restart the failed MySQL database pod.
So, the cluster cert had to be fixed before mysql could be fixed.
This basically meant redoing the Kubernetes cluster.
It was decided to use the new managed Kubernetes cluster option instead of rolling our own again as was done before, to make it easier to administer going forward.
You won't have to worry about expiring certs and such. It's something DO takes care of.
However, this meant copying various data from one place to another, exporting and reimporting the MySQL database, and recreating the nginx, MySQL, PHP, and memcached deployments.
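
For anyone curious about the expired-certificate part, here is a minimal Python sketch of the kind of check that flags a cert before it runs out. This is not what was actually run on our cluster. The function names and the 30-day warning window are made up for illustration, and it assumes you already have the certificate's "notAfter" date as text in the usual format (for example `Jan 31 00:00:00 2024 GMT`).

```python
import ssl
import time


def days_until_expiry(not_after, now=None):
    """Return whole days until a certificate's notAfter time.

    not_after is a string such as "Jan 31 00:00:00 2024 GMT".
    The result is negative if the certificate has already expired.
    now is seconds since the epoch; it defaults to the current time.
    """
    expires = ssl.cert_time_to_seconds(not_after)
    if now is None:
        now = time.time()
    return int((expires - now) // 86400)


def needs_renewal(not_after, warn_days=30, now=None):
    """True if the certificate expires in fewer than warn_days days.

    warn_days=30 is an arbitrary threshold chosen for this example.
    """
    return days_until_expiry(not_after, now) < warn_days
```

Run on a schedule against each cert's expiry date, something like `needs_renewal(...)` gives a warning well ahead of time. With the managed cluster this should no longer be our job, but it shows why an unnoticed expiry date can take everything down at once.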