Migrating from Single Machine to Clustered BugSnag On-premise
To get started with your migration from Single Machine to Clustered BugSnag On-premise, you’ll need the following things prepared in advance:
A Kubernetes cluster running at least version 1.23 and at most version 1.28
Your BugSnag On-premise license file (license.yaml)
The migration process has the following steps:
Migrating the configuration - Converts the configuration on the Single Machine instance to the Clustered version.
Configuring the instances to run in migration mode - Configures Single Machine and Clustered instances to connect to each other to migrate the data.
Migrating the databases - Moves data from the Single Machine instance to the Clustered instance. Any new events processed in the Single Machine instance are migrated over. The Clustered instance will not process any events sent to it.
Configuring the instances to disable migration mode - Configures Single Machine and Clustered instances to disable migration mode. Once done, the instances are no longer connected to each other and will process events separately. After this point you cannot go back to migration mode.
Running the Post-Migration Script - Executes necessary database upgrades and reconfigurations on the clustered instance post-migration.
It’s highly recommended to run Clustered On-premise in non-migration mode first before attempting a migration to ensure that the cluster is set up correctly. You can then clear the data prior to a migration by running kubectl delete namespace bugsnag.
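For example, a minimal check before clearing the test installation might look like this; the bugsnag namespace matches the install command used later in this guide:
# Confirm all pods in the test installation are healthy before clearing it
kubectl -n bugsnag get pods
# Remove the test installation and its data so the migration starts from a clean cluster
kubectl delete namespace bugsnag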
When configuring the Clustered instance, make sure the cluster is using appropriately sized nodes and that the storage sizes for the databases are set correctly, as these cannot easily be changed after the migration is complete. Please contact us for recommendations based on your usage.
Once the migration has been completed and both instances are running in non-migration mode, you can safely change the DNS to route traffic to the new instance or reconfigure the notify/session endpoints in your applications.
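As a sketch, assuming your on-premise hostnames are notify.example.com and sessions.example.com (placeholders), you can verify the DNS change from a client machine before relying on it:
# Check that the notify and sessions hostnames now resolve to the new Clustered instance
dig +short notify.example.com
dig +short sessions.example.com
# Confirm the notify endpoint responds (prints the HTTP status code)
curl -s -o /dev/null -w "%{http_code}\n" https://notify.example.com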
The migration can take some time, depending on the number of events you have; however, you can continue to use your current Single Machine instance whilst the data is migrating. We estimate that it takes around 1 hour to migrate 8 million events of data. There will be some downtime after the data has been migrated and Clustered BugSnag On-premise starts; this may take around 10-15 minutes.
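As a rough worked example based on that estimate (the event count below is an assumption; substitute your own):
# Approximate migration time at ~8 million events per hour
EVENT_COUNT=40000000
echo "Estimated migration time: $((EVENT_COUNT / 8000000)) hours, plus around 10-15 minutes of downtime at cut-over"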
Download migrate-settings.rb on the Single Machine instance and run the following command:
ruby migrate-settings.rb
This will generate the config file required for the migration: config.yaml.
Copy config.yaml to where you will be starting the migration.
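For example, if the installation will be run from a separate workstation with kubectl access to the cluster, the copy might look like this (the hostname is a placeholder):
# Copy the generated config from the Single Machine instance to the machine running the install
scp config.yaml user@workstation.example.com:~/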
Run the install script to download and install the Replicated KOTS CLI. The KOTS CLI is a kubectl plugin that runs locally on any computer.
curl https://kots.io/install/1.103.2 | bash
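You can confirm the plugin installed correctly before continuing, for example:
# Verify the KOTS plugin is available and reports its version
kubectl kots version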
The Replicated KOTS admin console provides a user interface for installing and managing BugSnag On-premise in your Kubernetes cluster. Install the admin console by running the following command:
kubectl kots install bugsnag-clustered-kots/stable \
--name Bugsnag \
--license-file ./license.yaml \
--config-values ./config.yaml \
--namespace bugsnag
The admin console will be available at http://localhost:8800
On the clustered instance, in the configuration tool available at http://localhost:8800:
On the Single Machine instance:
You can check the progress of the uploads migration by running the following on the Single Machine BugSnag instance:
docker logs bugsnag_migrate-uploads
Once this states “Migration of uploads completed.”, this migration is complete.
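If you prefer to follow the log rather than poll it, something like this works (the container name is the one shown above):
# Stream the uploads migration log and exit once the completion message appears
docker logs -f bugsnag_migrate-uploads 2>&1 | grep -m1 "Migration of uploads completed."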
You can check the progress of the MongoDB migration by checking the status of the replica set:
kubectl -n <NAMESPACE> exec -it mongo-0 -- mongo-local --eval "rs.status()"
If the MongoDB instance in Kubernetes is in STARTUP2 mode, you can check the progress of the collection copying using:
kubectl -n <NAMESPACE> logs -f mongo-0 | grep "repl writer worker"
Once the instance is in SECONDARY mode, you can use the following to check the replication lag:
kubectl -n <NAMESPACE> exec -it mongo-0 -- mongo-local --eval "rs.printSlaveReplicationInfo()"
Once this states that the MongoDB instance is 0 secs behind master, this migration is complete.
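To avoid re-running the check by hand, you can poll it, for example:
# Re-check the replication lag every 30 seconds until it reports 0 secs behind the master
watch -n 30 "kubectl -n <NAMESPACE> exec mongo-0 -- mongo-local --eval 'rs.printSlaveReplicationInfo()'"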
Once MongoDB has finished migrating, run migrate-redis.sh on the Single Machine BugSnag instance and select the “Backup redis data” option to begin backing up the redis data from the instance. Once that has completed, you can use the “Restore redis data to kubernetes” option to begin the restore. If you do not have access to the Kubernetes cluster from that instance, copy the output archive and the script to an instance which does, then run migrate-redis.sh using the restore option and specify the archive location to begin migrating the redis data to the Clustered instance.
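For the case where the Single Machine instance cannot reach the Kubernetes cluster, the hand-off might look roughly like the following; the hostname and archive name are placeholders and the exact options are presented by the script itself:
# On the Single Machine instance: choose "Backup redis data" when prompted
./migrate-redis.sh
# Copy the script and the resulting archive to a machine with kubectl access to the cluster
scp migrate-redis.sh redis-backup.tar.gz user@workstation.example.com:~/
# On that machine: choose "Restore redis data to kubernetes" and point it at the copied archive
./migrate-redis.sh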
Note that once migration mode has been disabled on both instances, any new events sent to the Single Machine instance will not be migrated, and there will be a period of downtime until the Clustered installation is fully running and the error traffic has been routed over to the new instance. Also, migration mode cannot be re-enabled once it has been disabled; you will have to start the migration from scratch.
Once all the migrations are complete, to disable migration mode and run the two instances separately:
On the Single Machine instance, configure BugSnag via the Replicated settings to disable migration mode and restart:
Run the following to allow access to the Replicated KOTS admin console for the clustered installation:
kubectl kots admin-console -n bugsnag
Disable migration mode and deploy the config change.
After deploying the config change, download and run the post-migration-mongo-upgrade.sh script to upgrade and reconfigure MongoDB instances on the clustered instance.
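As a sketch, assuming the script has been downloaded into the current directory and takes no arguments (check the script itself for any required options):
# Make the post-migration script executable and run it against the clustered installation
chmod +x post-migration-mongo-upgrade.sh
./post-migration-mongo-upgrade.sh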
Once BugSnag is running, Elasticsearch will require a reindex, which you can monitor using the “Events in Elasticsearch/Mongo” graph under the “Support” dashboard in Grafana. There will be a period where historical events will not be available in the dashboard until the reindex has completed; any new events will show up in the dashboard immediately. We estimate that it takes around 1 hour to reindex 1 million events of data.