Activity
Just to confirm: I'm running into the same issue and would be very much interested in a fix :). Identical system information, other than the URLs.
Edit: I'm trying to upgrade from 14.6.3-ce.0.
Edited by Michael B.

Managed to fix this on my end by running the following commands:
gitlab-rake gitlab:background_migrations:finalize[ProjectNamespaces::BackfillProjectNamespaces,projects,id,'[null\,"up"]']
gitlab-rake db:migrate
gitlab-ctl reconfigure
apt dist-upgrade
gitlab-ctl restart
That's based on the steps outlined/linked by @AHBrook in #360377 (comment 926678321) below – thank you! Also cf. omnibus-gitlab#6795 (closed), and omnibus-gitlab#6797 (closed).
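To see which batched background migrations are still unfinished before running the finalize step, something like this should work (a sketch; the query follows the 14.x schema, and `status <> 3` meaning "not yet finished" is an assumption worth checking against the docs for your version):

```shell
# List batched background migrations that have not finished yet
# (status 3 = finished in the 14.x schema -- verify for your release).
gitlab-psql -c "SELECT id, job_class_name, table_name, column_name, status
                FROM batched_background_migrations
                WHERE status <> 3;"
```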
Edited by Michael B.

Thanks, Michael! The steps you listed have solved the issue in both of our installations.
These are the steps:
gitlab-rake gitlab:background_migrations:finalize[ProjectNamespaces::BackfillProjectNamespaces,projects,id,'[null\,"up"]']
gitlab-rake db:migrate
gitlab-ctl reconfigure
apt dist-upgrade
gitlab-ctl restart
Hi Brad,
I did those after the migration failure. gitlab-ctl reconfigure could not finish, so the service was down. It can be started manually if needed, but no changes will be allowed in that state, for example renewing the SSL cert with Let's Encrypt. With those steps, I could resolve the issue and fix the db migration.
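If reconfigure aborts partway through, the bundled services can be brought up by hand while you sort out the migration (plain Omnibus commands; configuration changes still need a successful reconfigure afterwards):

```shell
# Start the bundled services manually and check their state.
gitlab-ctl start
gitlab-ctl status
```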
Ran into the same issue doing staged upgrades from 14.8.2 to 14.10.0. (Did each intermediate upgrade: 14.8.2 -> 14.8.3 -> 14.8.6 -> 14.9.0 -> 14.9.5 -> 14.10.0). Still hit the issue.
Fixed it thanks to the steps above (skipping the apt dist-upgrade one, which seems like overkill).

Note also that the first step:
gitlab-rake gitlab:background_migrations:finalize[ProjectNamespaces::BackfillProjectNamespaces,projects,id,'[null\,"up"]']
will be parsed incorrectly if you use zsh as your shell. Run it in bash or escape it correctly.
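For example (zsh expands the square brackets as a glob pattern, so either disable globbing for that one command or hand the unmodified command to bash; both variants below are a sketch):

```shell
# Option 1: tell zsh not to glob this command line
noglob gitlab-rake gitlab:background_migrations:finalize[ProjectNamespaces::BackfillProjectNamespaces,projects,id,'[null\,"up"]']

# Option 2: run it under bash (note the re-quoted last argument)
bash -c 'gitlab-rake gitlab:background_migrations:finalize[ProjectNamespaces::BackfillProjectNamespaces,projects,id,"[null\,\"up\"]"]'
```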
We just encountered the same thing, upgrading from 14.8.2 to 14.10.0. We checked and didn't see any warnings regarding this upgrade path.
Bit more info: we restored our VM wholesale back to 14.8.2, then did a staged update to 14.9.3 and then to 14.10.0. All went fine. I suspect there is some migration in the 14.9.x upgrades that 14.10 expects to already be done.
I spoke too soon! We ran into the exact same issue in production, despite going 14.8.2 -> 14.9.3 -> 14.10.0. The only difference between our test and prod boxes was that we had to restore Test from a VM backup.
We attempted to restart the server and run sudo gitlab-ctl reconfigure again, and we got the same error. So we are going to roll back production and start over. Interestingly, despite the errors, all the services are up and running properly and we are able to log in and see our systems. It still makes me uneasy, though.

I did find a Stack Overflow post that seemed similar, indicating RAM issues: https://stackoverflow.com/questions/46907157/cannot-install-gitlab-using-omnibus-error-executing-action-run-on-resource-b
The following errors are in our PostgreSQL "current" log:
2022-04-27_09:12:00.64116 LOG: starting PostgreSQL 12.7 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.4.1 20200928 (Red Hat 8.4.1-1), 64-bit
2022-04-27_09:12:00.65239 LOG: listening on Unix socket "/var/opt/gitlab/postgresql/.s.PGSQL.5432"
2022-04-27_09:12:00.86046 LOG: database system was shut down at 2022-04-27 09:10:49 GMT
2022-04-27_09:12:00.90642 LOG: database system is ready to accept connections
2022-04-27_16:27:56.75786 ERROR: duplicate key value violates unique constraint "namespace_aggregation_schedules_pkey"
2022-04-27_16:27:56.75788 DETAIL: Key (namespace_id)=(106) already exists.
2022-04-27_16:27:56.75788 STATEMENT: /*application:sidekiq,correlation_id:71f72b733d02f72777dfa460f0b04b34,jid:aa777b2d954ff0ab295f8f09,endpoint_id:Namespaces::ScheduleAggregationWorker,db_config_name:main*/ INSERT INTO "namespace_aggregation_schedules" ("namespace_id") VALUES (106) RETURNING "namespace_id"
2022-04-27_18:16:35.62741 ERROR: duplicate key value violates unique constraint "namespace_aggregation_schedules_pkey"
2022-04-27_18:16:35.62743 DETAIL: Key (namespace_id)=(106) already exists.
2022-04-27_18:16:35.62744 STATEMENT: /*application:sidekiq,correlation_id:21f0bab666a93a76ef21d4099defc274,jid:531e0695389fa2420bf3a39c,endpoint_id:Namespaces::ScheduleAggregationWorker,db_config_name:main*/ INSERT INTO "namespace_aggregation_schedules" ("namespace_id") VALUES (106) RETURNING "namespace_id"
Edited by Tony Brook

After a bunch of hunting, a GitLab community post showed up in my search and pointed me in the right direction.
https://forum.gitlab.com/t/gitlab-ctl-reconfigure-doesnt-work-after-gitlab-omnibus-updated/68715
The command suggested for the finalize migrations didn't work for me, but running the one the output suggested did. Now everything looks to be working properly... but I'm still worried about the long-term health of the system. We still see errors in our SQL logs about column "on_hold_until" does not exist at character 316, but that doesn't appear to be hurting anything.
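If you want to check whether that error is just leftover noise, one option (a sketch; the assumption here is that on_hold_until belongs to the batched_background_migrations table in later 14.x schemas) is to confirm all schema migrations ran and that the column now exists:

```shell
# Any migration still listed as "down" means db:migrate has not fully completed.
gitlab-rake db:migrate:status | grep -v "^   up"

# Check whether the column the errors complain about exists yet.
gitlab-psql -c "SELECT column_name FROM information_schema.columns
                WHERE table_name = 'batched_background_migrations'
                  AND column_name = 'on_hold_until';"
```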
- Developer
@AHBrook Thanks, I've edited the issue description with possible fixes linking to your comment and the forum and Reddit posts.
- Developer
I believe the workaround for Docker/Docker Swarm deployments would be (a docker sketch follows the list):

- Disable auto-reconfigure by mounting the file /etc/gitlab/skip-auto-reconfigure
- Disable automatic migration at reconfigure, in case it ends up running despite the above: add the value gitlab_rails['auto_migrate'] = false to gitlab.rb
- Start the service container with this altered config mounted in it and it will not enter a crash loop this time 😌
- Open a shell 🐚 and run the commands you need:
  gitlab-rake gitlab:background_migrations:finalize[ProjectNamespaces::BackfillProjectNamespaces,projects,id,'[null\,"up"]']
  gitlab-rake db:migrate
  gitlab-ctl reconfigure
  gitlab-ctl restart
- Revert all the config changes and file creation performed above for future upgrades 🎉
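A minimal sketch of the first three bullets with plain docker commands, assuming the official gitlab/gitlab-ce image and the usual /srv/gitlab host directories (container name, paths, and image tag are placeholders for your own deployment):

```shell
# Create /etc/gitlab/skip-auto-reconfigure via the mounted config volume.
touch /srv/gitlab/config/skip-auto-reconfigure

# Disable automatic migrations at reconfigure time.
echo "gitlab_rails['auto_migrate'] = false" >> /srv/gitlab/config/gitlab.rb

# Start the container with the altered config mounted in.
docker run -d --name gitlab \
  -v /srv/gitlab/config:/etc/gitlab \
  -v /srv/gitlab/logs:/var/log/gitlab \
  -v /srv/gitlab/data:/var/opt/gitlab \
  gitlab/gitlab-ce:14.10.0-ce.0

# Open a shell and run the rake/reconfigure commands from the list above.
docker exec -it gitlab bash
```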
(thanks to @hchouraria for these directions 😄)

What would the workaround be for Kubernetes deployments?
Edited by Greg Myers

- Maintainer
What would the workaround be for Kubernetes deployments?
The gitlab-toolbox pods should already be running the new codebase once the migration jobs are failing with this. Exec'ing into the toolbox pod and running
gitlab-rake gitlab:background_migrations:finalize[ProjectNamespaces::BackfillProjectNamespaces,projects,id,'[null\,"up"]']
should get that in place. Then re-running the helm upgrade command for the chart should trigger a new rollout with a new migration job.
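Roughly (a sketch; the release name, namespace, toolbox deployment name, and values file are placeholders for your own install):

```shell
# Run the finalize task inside the toolbox pod (run this from bash locally -- see the zsh note above).
kubectl -n gitlab exec -it deploy/gitlab-toolbox -- \
  gitlab-rake gitlab:background_migrations:finalize[ProjectNamespaces::BackfillProjectNamespaces,projects,id,'[null\,"up"]']

# Re-run the chart upgrade to roll out a fresh migrations job.
helm -n gitlab upgrade gitlab gitlab/gitlab -f values.yaml
```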
I see from the forum post that they tried this, and the finalise job is failing in their case. It looks like the finalise failure needs to be investigated.
I'm trying to work out how to get around this issue for my K8s install.
The above command fails, as described in my forum post. Is there a way to find out why it's failing? The error given by the command, even with trace on, is not helpful:
`rake aborted!
Gitlab::Database::BackgroundMigration::BatchedMigrationRunner::FailedToFinalize: Gitlab::Database::BackgroundMigration::BatchedMigrationRunner::FailedToFinalize`
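One way to dig a bit deeper (a sketch only; the class and association names follow the 14.x codebase, so treat them as assumptions to verify for your version) is to inspect the batched migration and its recent jobs from the Rails side in the toolbox pod:

```shell
# Print the migration's status and its ten most recent jobs (id, status, attempts).
kubectl -n gitlab exec -it <gitlab-toolbox-pod-name> -- gitlab-rails runner "
  m = Gitlab::Database::BackgroundMigration::BatchedMigration
        .find_by(job_class_name: 'ProjectNamespaces::BackfillProjectNamespaces')
  p m&.status
  m&.batched_jobs&.order(id: :desc)&.limit(10)&.each { |j| p [j.id, j.status, j.attempts] }
"
```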
@greg My customer attempted these steps, unsuccessfully. Any recommendations?
Thought I'd come back to share what I've done to resolve it and upgrade to 14.10.4 (a rough Helm sketch of the same cycle follows the list):

- First, downgraded to 14.7.7 (where I was originally)
- Tried the upgrade again (crazy, I know); this failed. Reverted my install back to 14.7.7
- Ran kubectl exec <gitlab-toolbox-pod-name> -it -- bash
- Then executed gitlab-rake gitlab:background_migrations:finalize[ProjectNamespaces::BackfillProjectNamespaces,projects,id,'[null\,"up"]']
- Which came back with Done.
- Re-ran the upgrade to 14.10.4, which worked this time.
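The same cycle expressed as Helm commands (a sketch; release name, namespace, and chart versions are placeholders, and the GitLab-version-to-chart-version mapping has to be looked up for your target releases):

```shell
helm -n gitlab history gitlab                 # find the revision that was running 14.7.7
helm -n gitlab rollback gitlab <revision>     # roll back to it
kubectl -n gitlab exec -it <gitlab-toolbox-pod-name> -- \
  gitlab-rake gitlab:background_migrations:finalize[ProjectNamespaces::BackfillProjectNamespaces,projects,id,'[null\,"up"]']
helm -n gitlab upgrade gitlab gitlab/gitlab --version <chart-version-for-14.10.4> -f values.yaml
```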
Edited by Adam