Monday, December 30, 2019

Draft documentation of case study - Custom user model mid-project #2

Custom user documentation (Draft subject to improvements from others)

Based on Tobias McNulty's "How to Switch to a Custom Django User Model Mid-Project" [1] and also on Aymeric Augustin's approach documented in Django ticket #25313 [2]

Assumptions
- Existing project without a custom user model
- All migrations are up to date and deployed in production
- Existing auth_user table has data which must be kept
- Relationships with other models exist and must be kept

Case

This was documented after switching auth.user to common.user and company.userprofile to common.userprofile. UserProfile has a one-to-one key to User and a foreign key to Company. It was all proven on a Windows 10 dev machine before deploying to Ubuntu 18.04 staging on the local network and finally Ubuntu 18.04 production on a DigitalOcean VM.

Strategy

There are two strategies. One is to throw away history: delete all migrations, empty (truncate) the migrations table and start again.[2] This is very attractive if the project repo is young and its history is therefore disposable. It is the second approach documented here.

The other strategy is to use the migration system to make the switch, ensuring nothing breaks. That is the Tobias approach and the first one documented here.

Both strategies are genuine bottlenecks. All pending changes must be completed and fully deployed before starting and no planned changes are commenced until after the switch is fully deployed.

Objective

- Completely align development, staging and production systems
- Series of new migrations
- Series of sql commands to adjust content_type records
- Series of scripts to execute migrations and sql commands

Process

1. Ensure all references to User everywhere (including 3rd party apps) are indirect[3][4]. Ensure all code concerned with access control and relying on users or user authentication is covered by unit tests as far as possible and all tests are passing.
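
One way to look for direct references before starting is a small scan of the source tree. This is only a sketch: the function name find_direct_user_refs and the two regular expressions are my own assumptions, and it will not catch every form of direct reference, so treat its output as a starting list rather than proof that nothing remains.

```python
import re
from pathlib import Path

# Patterns that suggest a direct (non-swappable) reference to auth.User.
# These are illustrative assumptions, not an exhaustive list.
DIRECT_PATTERNS = [
    re.compile(r"from\s+django\.contrib\.auth\.models\s+import\s+.*\bUser\b"),
    re.compile(r"""['"]auth\.User['"]"""),
]


def find_direct_user_refs(root):
    """Return (path, line number, line) for each suspicious line under root."""
    hits = []
    for path in sorted(Path(root).rglob("*.py")):
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if any(p.search(line) for p in DIRECT_PATTERNS):
                hits.append((str(path), number, line.strip()))
    return hits
```

Calling find_direct_user_refs(".") from the project root (and again from the site-packages directory for third-party apps) returns the lines to review. Imports of AbstractUser or UserManager are deliberately not reported.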


2. Make migrations and apply them. Ensure development, staging and production systems are all synchronised and each database (structure) is identical. This starts the bottleneck.


3. Start a new app or use an existing one which has no models.py. There must initially be no models because the migration which creates the custom user must be '0001_initial.py' to persuade Django there are no dependency issues. In this documentation I call the app "common" but it can be anything, e.g. "proj_user", "accounts" etc.

 
4. Write a new common/models.py ...

    from django.db import models
    from django.contrib.auth.models import AbstractUser


    class User(AbstractUser):
        """ Retain the model name 'User' to avoid unnecessary refactoring during
        the switchover process. Make no other changes here until after complete
        deployment to production.
        """
        class Meta:
            # use the existing Django users table for the initial migration
            db_table = "auth_user"


5. Write a new common/admin.py

    from django.contrib import admin
    from django.contrib.auth import get_user_model
    from django.contrib.auth.admin import UserAdmin

    class CommonUserAdmin(UserAdmin):
        """ This can be named as desired """
        ...
        # balance of admin code - in due course include userprofile inline


    admin.site.register(get_user_model(), CommonUserAdmin)


6. Include the new app in settings.py among other local apps and adjust AUTH_USER_MODEL ...

    INSTALLED_APPS = [
        # ...
        'common',
    ]

    AUTH_USER_MODEL = 'common.User'


7. Make the initial migration to create the new User model ...

    python manage.py makemigrations  --> common/migrations/0001_initial.py


8. Write a script to deploy (rather than execute) the migration as follows ... [5]

Windows 10 - PostgreSQL 10 ...

    :: deploy_migration.bat
    :: defeat Django's sanity check by manually entering that migration in the database
    :: and for good measure update content_types to avoid further Django sanity checks

    set host=dev_laptop
    set dbowner=whoever

    psql --username=%dbowner% --port=5432 --dbname=ssds --host=%host% --command "INSERT INTO public.django_migrations (app, name, applied) VALUES ('common', '0001_initial', CURRENT_TIMESTAMP)";

    psql --username=%dbowner% --port=5432 --dbname=ssds --host=%host% --command "UPDATE public.django_content_type SET app_label = 'common' WHERE app_label = 'auth' and model = 'user'";


Linux (Ubuntu 18.04) - PostgreSQL 10 ...

    # fetch_ssds.py [6]
    # These next two psql command lines fake an initial migration to create
    # a custom-user in a pre-existing project and adjust content_types to
    # prevent Django from barfing if it automatically tried to add them
    #
    import subprocess

    host = "dev_laptop"
    dbowner = "whoever"

    # connection arguments shared by both psql commands
    psql = ["sudo", "psql", "--username=%s" % dbowner, "--port=5432",
            "--dbname=ssds", "--host=%s" % host]

    # check=True stops the script if psql exits with a non-zero status
    subprocess.run(psql + ["--command", "INSERT INTO public.django_migrations (app, name, applied) VALUES ('common', '0001_initial', CURRENT_TIMESTAMP);"], check=True)

    subprocess.run(psql + ["--command", "UPDATE public.django_content_type SET app_label = 'common' WHERE app_label = 'auth' and model = 'user';"], check=True)


9. After deploying with the above technique in development run all unit tests and correct any errors or failures both in project code and in the above scripts. Refresh the dev database (structure) from production (yet again) and repeat step 8 above and test again. All unit tests must pass. Important - repeat until perfect.


10. Deploy to staging using one of the above scripts from step 8, modified for the staging environment. When perfectly deployed on staging and all testing is done, ensure production is backed up then deploy to production in similar fashion. This ends the bottleneck.

The balance of Tobias's process is optional and starts in dev


11a. Edit common/models.py, then makemigrations to rename the table of existing users from auth_user to common_user. Finally, migrate to execute the rename.

    class User(AbstractUser):
        """ Retain the model name 'User' to avoid unnecessary refactoring during
        the switchover process. Make no other changes here until after complete
        deployment to production.

        Comment out Meta entirely to migrate to the default table name and
        add post save signal
        """
        pass

        #class Meta:
        #    # use the existing Django users table for the initial migration
        #    db_table = "auth_user"

    from django.db.models.signals import post_save

    def create_user_profile(sender, instance, created, **kwargs):
        """ Ensure every newly created user has a UserProfile """
        if created:
            # imported here so it is resolved at run time, not load time
            from common.models import UserProfile
            UserProfile.objects.get_or_create(user=instance)

    # User is defined above in this module so get_user_model() is not needed [4]
    post_save.connect(create_user_profile, sender=User)


11b. Also optional is renaming the sequence to match the table name. If step 11a above is done, the renamed "common_user" table still uses the original "auth_user_id_seq" name. The migration process doesn't appear to change the sequence name. This doesn't matter because everything still works. However it is a wrinkle worth removing. It requires a psql command. This does not need a migration because reversing to an earlier revision will still work no matter what the sequence is named.

    psql --command "ALTER SEQUENCE IF EXISTS auth_user_id_seq RENAME TO common_user_id_seq";

12. Refactor the code and unit tests to use the new table name if necessary, perform another makemigrations/migrate cycle and if all is well deploy to production using the deployment script to execute the 11b psql command.

That completes the Tobias approach.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Here begins Aymeric's approach ...

In overview, all changes are made identically with Tobias's up to the point before migrating the changed User model. That is step 7 above. This is understandable because Tobias based his approach on Aymeric's suggestion in ticket #25313.

Follow Tobias's approach from steps 1 to 6. Continue with step 2 below.

1. See steps 1 - 6 above.


2. Delete all migrations in all apps inside the project


3. Document all Django and third-party migrations in the django_migrations table. Ignore existing project migrations. An easy way to do this is to export the table to a .csv file.


4. Truncate the django_migrations table. Empty it.


5. Begin writing a script to fake-apply Django and third party migrations. Here are the dev machine (Windows 10) migrations (which are all INSERT INTO commands) from this case ... See the above scripts to notice psql arguments which have been omitted here for clarity.

    :: contenttypes initial
    psql --command "INSERT INTO public.django_migrations (app, name, applied) VALUES ('contenttypes', '0001_initial', CURRENT_TIMESTAMP)";

    :: contenttypes 0002_remove_content_type_name
    psql --command "INSERT INTO public.django_migrations (app, name, applied) VALUES ('contenttypes', '0002_remove_content_type_name', CURRENT_TIMESTAMP)";

    :: auth initial
    psql --command "INSERT INTO public.django_migrations (app, name, applied) VALUES ('auth', '0001_initial', CURRENT_TIMESTAMP)";

    :: auth 0002_alter_permission_name_max_length
    psql --command "INSERT INTO public.django_migrations (app, name, applied) VALUES ('auth', '0002_alter_permission_name_max_length', CURRENT_TIMESTAMP)";

    :: auth 0003_alter_user_email_max_length
    psql --command "INSERT INTO public.django_migrations (app, name, applied) VALUES ('auth', '0003_alter_user_email_max_length', CURRENT_TIMESTAMP)";

    :: auth 0004_alter_user_username_opts
    psql --command "INSERT INTO public.django_migrations (app, name, applied) VALUES ('auth', '0004_alter_user_username_opts', CURRENT_TIMESTAMP)";

    :: auth 0005_alter_user_last_login_null
    psql --command "INSERT INTO public.django_migrations (app, name, applied) VALUES ('auth', '0005_alter_user_last_login_null', CURRENT_TIMESTAMP)";

    :: auth 0006_require_contenttypes_0002
    psql --command "INSERT INTO public.django_migrations (app, name, applied) VALUES ('auth', '0006_require_contenttypes_0002', CURRENT_TIMESTAMP)";

    :: auth 0007_alter_validators_add_error_messages
    psql --command "INSERT INTO public.django_migrations (app, name, applied) VALUES ('auth', '0007_alter_validators_add_error_messages', CURRENT_TIMESTAMP)";

    :: auth 0008_alter_user_username_max_length
    psql --command "INSERT INTO public.django_migrations (app, name, applied) VALUES ('auth', '0008_alter_user_username_max_length', CURRENT_TIMESTAMP)";

    :: auth 0009_alter_user_last_name_max_length
    psql --command "INSERT INTO public.django_migrations (app, name, applied) VALUES ('auth', '0009_alter_user_last_name_max_length', CURRENT_TIMESTAMP)";

    :: auth 0010_alter_group_name_max_length
    psql --command "INSERT INTO public.django_migrations (app, name, applied) VALUES ('auth', '0010_alter_group_name_max_length', CURRENT_TIMESTAMP)";

    :: auth 0011_update_proxy_permissions
    psql --command "INSERT INTO public.django_migrations (app, name, applied) VALUES ('auth', '0011_update_proxy_permissions', CURRENT_TIMESTAMP)";

    :: admin initial
    psql --command "INSERT INTO public.django_migrations (app, name, applied) VALUES ('admin', '0001_initial', CURRENT_TIMESTAMP)";

    :: admin 0002_logentry_remove_auto_add
    psql --command "INSERT INTO public.django_migrations (app, name, applied) VALUES ('admin', '0002_logentry_remove_auto_add', CURRENT_TIMESTAMP)";

    :: sessions initial
    psql --command "INSERT INTO public.django_migrations (app, name, applied) VALUES ('sessions', '0001_initial', CURRENT_TIMESTAMP)";

    :: sites initial
    psql --command "INSERT INTO public.django_migrations (app, name, applied) VALUES ('sites', '0001_initial', CURRENT_TIMESTAMP)";

    :: sites 0002_alter_domain_unique
    psql --command "INSERT INTO public.django_migrations (app, name, applied) VALUES ('sites', '0002_alter_domain_unique', CURRENT_TIMESTAMP)";
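
If the django_migrations table was exported to a .csv file in step 3, the SQL for the commands above could be generated rather than typed. This is a minimal sketch, assuming the export has a header row with "app" and "name" columns and that the project apps are the four named in this case. The function name and the sample rows are hypothetical.

```python
import csv
import io

# Project apps whose old migrations are to be ignored (names from this case).
PROJECT_APPS = {"common", "company", "billing", "substance"}

# App and migration names come from our own database export, so plain
# string formatting is used here; do not do this with untrusted input.
INSERT_SQL = (
    "INSERT INTO public.django_migrations (app, name, applied) "
    "VALUES ('%s', '%s', CURRENT_TIMESTAMP);"
)


def inserts_from_csv(csv_file, project_apps=PROJECT_APPS):
    """Yield one INSERT statement per non-project migration, in file order."""
    for row in csv.DictReader(csv_file):
        if row["app"] not in project_apps:
            yield INSERT_SQL % (row["app"], row["name"])


# Example with an in-memory stand-in for the exported file
sample = io.StringIO(
    "id,app,name,applied\n"
    "1,contenttypes,0001_initial,2019-01-01 00:00:00\n"
    "2,auth,0001_initial,2019-01-01 00:00:01\n"
    "3,company,0001_initial,2019-01-01 00:00:02\n"
)
statements = list(inserts_from_csv(sample))
for statement in statements:
    print(statement)
```

Each generated statement can then be wrapped in a psql --command line for the deployment script. Keeping the file order preserves the order in which the migrations were originally applied.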


6. Recreate a fresh set of project migrations to capture the current state of project models. Be aware that the database (structure) freshly copied from production is already up-to-date with the current state of all project models EXCEPT the custom user. Django has too many sanity checks to permit a successful actual migration for the new custom user, so the above script needs to be extended with the new project migrations, including adjustments, some of which would otherwise be performed by the migrations system.


7. In this case there were in-project dependencies in the new migrations so the script sequence of commands needs to be adjusted until all dependencies are satisfied. Starting the script with non-project migrations (5 above) ought to satisfy external project dependencies. Here is the balance of the migration deployment script including database adjustments with :: comments ...

    :: INSERT common initial
    psql --command "INSERT INTO public.django_migrations (app, name, applied) VALUES ('common', '0001_initial', CURRENT_TIMESTAMP)";

    :: UPDATE common content_type follows common initial
    psql --command "UPDATE public.django_content_type SET app_label = 'common' WHERE app_label = 'auth' and model = 'user'";

    :: RENAME auth_user to common_user
    psql --command "ALTER TABLE IF EXISTS auth_user RENAME TO common_user";

    :: RENAME auth_user_id_seq to common_user_id_seq
    psql --command "ALTER SEQUENCE IF EXISTS auth_user_id_seq RENAME TO common_user_id_seq";

    :: RENAME auth_user_user_permissions to common_user_user_permissions
    psql --command "ALTER TABLE IF EXISTS auth_user_user_permissions RENAME TO common_user_user_permissions";

    :: RENAME auth_user_user_permissions_id_seq to common_user_user_permissions_id_seq
    psql --command "ALTER SEQUENCE IF EXISTS auth_user_user_permissions_id_seq RENAME TO common_user_user_permissions_id_seq";

    :: RENAME auth_user_groups to common_user_groups
    psql --command "ALTER TABLE IF EXISTS auth_user_groups RENAME TO common_user_groups";

    :: RENAME auth_user_groups_id_seq to common_user_groups_id_seq
    psql --command "ALTER SEQUENCE IF EXISTS auth_user_groups_id_seq RENAME TO common_user_groups_id_seq";

    :: RENAME company_userprofile to common_userprofile
    psql --command "ALTER TABLE IF EXISTS company_userprofile RENAME TO common_userprofile";

    :: INSERT company initial
    psql --command "INSERT INTO public.django_migrations (app, name, applied) VALUES ('company', '0001_initial', CURRENT_TIMESTAMP)";

    :: UPDATE company content_type follows company initial
    psql --command "UPDATE public.django_content_type SET app_label = 'common' WHERE app_label = 'company' and model = 'userprofile'";

    :: INSERT billing initial
    psql --command "INSERT INTO public.django_migrations (app, name, applied) VALUES ('billing', '0001_initial', CURRENT_TIMESTAMP)";

    :: INSERT billing 0002_auto_20191224_1613
    psql --command "INSERT INTO public.django_migrations (app, name, applied) VALUES ('billing', '0002_auto_20191224_1613', CURRENT_TIMESTAMP)";

    :: INSERT common 0002_auto_20191224_1613
    psql --command "INSERT INTO public.django_migrations (app, name, applied) VALUES ('common', '0002_auto_20191224_1613', CURRENT_TIMESTAMP)";

    :: INSERT substance initial
    psql --command "INSERT INTO public.django_migrations (app, name, applied) VALUES ('substance', '0001_initial', CURRENT_TIMESTAMP)";

    :: INSERT billing 0003_auto_20191224_1613
    psql --command "INSERT INTO public.django_migrations (app, name, applied) VALUES ('billing', '0003_auto_20191224_1613', CURRENT_TIMESTAMP)";

    :: INSERT company 0002_auto_20191224_1613
    psql --command "INSERT INTO public.django_migrations (app, name, applied) VALUES ('company', '0002_auto_20191224_1613', CURRENT_TIMESTAMP)";
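
Adjusting the sequence by hand until all dependencies are satisfied can also be checked mechanically. The sketch below orders migrations so that each one appears after everything it depends on. The dependency map shown is invented for illustration and is not the real dependency graph of this project. In practice it would be copied from the dependencies list in each newly generated migration file.

```python
def order_migrations(dependencies):
    """Return migrations ordered so each follows everything it depends on.

    dependencies maps (app, name) -> list of (app, name) it depends on.
    Raises ValueError on a circular dependency.
    """
    ordered = []
    done = set()
    in_progress = set()

    def visit(migration):
        if migration in done:
            return
        if migration in in_progress:
            raise ValueError("circular dependency at %s.%s" % migration)
        in_progress.add(migration)
        for parent in dependencies.get(migration, []):
            visit(parent)
        in_progress.discard(migration)
        done.add(migration)
        ordered.append(migration)

    for migration in sorted(dependencies):
        visit(migration)
    return ordered


# Hypothetical dependencies, for illustration only
deps = {
    ("common", "0001_initial"): [],
    ("company", "0001_initial"): [("common", "0001_initial")],
    ("billing", "0001_initial"): [("company", "0001_initial")],
    ("common", "0002_auto_20191224_1613"): [
        ("common", "0001_initial"),
        ("billing", "0001_initial"),
    ],
}
order = order_migrations(deps)
for app, name in order:
    print(app, name)
```

The resulting order only decides the sequence of the INSERT commands. The RENAME and UPDATE commands still need to be placed by hand next to the migration they belong with.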


8. Repeat the following process, editing script, code and/or unit tests until perfection is attained ...

8.1 Load a fresh copy of the production database (structure)

8.2 Execute the script on the dev machine

8.3 makemigrations and migrate to catch anything previously missed

8.4 Run up the dev server and perform a quick sanity check

8.5 Run all unit tests


9. With perfection attained, commit all code to the repo and deploy to staging. Assuming the staging database (structure) is identical with production perform steps 4 to 6. Edit the script so it works on the staging machine and continue with step 8 until staging perfection is attained.

Note that in this case the deployment script shown in (Tobias) 8 above was used for faking migrations and updating the database on staging and in production.


10. Deploy to production ...

10.1 Backup production ready for restoration if things go awry

10.2 Notify users of an outage and stop the web server

10.3 Deploy project code from the repo

10.4 Execute the script (perhaps call it from the deployment process)

10.5 makemigrations on production and if necessary migrate

10.6 Restart web server and run tests


11. Make the scripts safe in case they are accidentally re-run via the deployment system.
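
The RENAME commands above already use IF EXISTS and the UPDATE commands match no rows on a second run, so it is mainly the INSERT commands which would create duplicates. One way to make them safe is to insert only when the record is not already present. The sketch below demonstrates the idea against an in-memory SQLite database from the Python standard library, purely so it can be run anywhere. The table is a simplified stand-in for django_migrations and the helper name is hypothetical. The same INSERT ... SELECT ... WHERE NOT EXISTS form is also valid in PostgreSQL.

```python
import sqlite3

# Insert a migration record only if it is not already recorded.
SAFE_INSERT = """
INSERT INTO django_migrations (app, name, applied)
SELECT ?, ?, CURRENT_TIMESTAMP
WHERE NOT EXISTS (
    SELECT 1 FROM django_migrations WHERE app = ? AND name = ?
)
"""


def fake_migration(conn, app, name):
    """Record (app, name) as applied unless it is already there."""
    conn.execute(SAFE_INSERT, (app, name, app, name))
    conn.commit()


# Simplified stand-in for the real django_migrations table
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE django_migrations ("
    "id INTEGER PRIMARY KEY, app TEXT, name TEXT, applied TEXT)"
)

# Running the same command twice must not create a second row
fake_migration(conn, "common", "0001_initial")
fake_migration(conn, "common", "0001_initial")
count = conn.execute("SELECT COUNT(*) FROM django_migrations").fetchone()[0]
print(count)
```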


Conclusion

Both approaches were tried in development and staging. Aymeric's approach is almost entirely scriptable and therefore less subject to Django sanity checks preventing progress. It was used in this case and is now in production and appears to be working properly.

A side benefit of blowing away all project migrations is faster unit testing.

miked@dewhirst.com.au

27/12/2019



[1] https://www.caktusgroup.com/blog/2019/04/26/how-switch-custom-django-user-model-mid-project/ by Tobias McNulty as a variation of Django docs https://docs.djangoproject.com/en/2.2/topics/auth/customizing/#changing-to-a-custom-user-model-mid-project

[2] https://code.djangoproject.com/ticket/25313#comment:13 by Aymeric Augustin

[3] https://docs.djangoproject.com/en/2.2/topics/auth/customizing/#referencing-the-user-model

[4] Note that get_user_model() cannot be called at the module level in any models.py file (and by extension any file that a models.py imports), since you'll end up with a circular import. Generally, it's easier to keep calls to get_user_model() inside a method whenever possible (so it's called at run time rather than load time), and use settings.AUTH_USER_MODEL in all other cases. This isn't always possible (e.g., when creating a ModelForm), but the less you use it at the module level, the fewer circular imports you'll have to stumble your way through. (From Tobias [1])

[5] Tobias notes that Django won't permit 'migrate common --fake-initial' if there are other migrations which include settings.AUTH_USER_MODEL

[6] fetch_ssds.py is a comprehensive auto-deployment script. Only the relevant (and simplified) portion is shown.
