- Backup the database so you can restore if something happens
- In Django, Create an an empty copy of the Customer database e.g CustomerB
- Using a SQL query tool, select all duplicate copies from Customer and insert into CustomerB
- If needed, update the model for CustomerB so that you are able to associate or reference every duplicate in CustomerB with the right record in Customer
- Using a SQL query too delete all duplicate copies of Customers from Customer table
- If this breaks your application in any way, account for that by ensuring that every duplicate record in CustomerB has a primary record in Customer that suffices and returns results. This will ensure that all relations with other models and records are honored.
- Update or fix, both models to ensure that duplicates can no longer be inserted
- Perform any other schema updates
On Friday, January 21, 2022 at 12:43:03 PM UTC-7 yebo...@gmail.com wrote:
Okay bro try this script hope it worksOn Fri, 21 Jan 2022 at 10:05, jools <jul...@pinabausch.org> wrote:Dear Django users,here's an interesting case and I'm curious whether somebody can point me in the right direction.Here's the problem:My company has a database of customers. Over the years, many duplicates have been created. For cleaning up duplicates, I'd like to have a search-and-replace functionality.I.e.: Replace all references in the database to customer <old> with customer <new>.The database schema has quite a bit of complexity, so I'm aiming to find a very generic solution.Approach:Django has a bit of functionality to find all references to an existing object, namely django.db.models.deletion.Collector and I'm using it to find all references.Though, the "replace" logic seems quite hard to get right:
- It has to keep parent links intact
- It has to recognize references in parent models (Customer model is derived from the concrete model Actor)
- It has to recognize generic relations, built with Django's content types
My stub implementation comes below.
- Has anybody else implemented sth like this, e.g. in helper library?
- Do you think the approach is right?
- What would you differently?
- Any caveats that you know of?
Best regardsJoolsStub implementation:from django.db.models.deletion import Collector, ProtectedError, RestrictedError
def replace_object(old_obj, new_obj, using='default'):
"""
Replace an old object with a new object throughout the database.
"""
collector = Collector(using)
try:
collector.collect([old_obj])
except ProtectedError:
pass
except RestrictedError:
pass
for model, obj in collector.instances_with_model():
for field_opts in obj._meta.get_fields():
if __is_qualified(field_opts, new_obj):
setattr(obj, field_opts.name, new_obj)
obj.save()
def __is_qualified(field_opts, new_obj):
if not hasattr(field_opts, 'related_model'):
return False
# This check is wrong for multi-table inheritance, I suppose?
if field_opts.related_model != new_obj.__class__:
return False
if hasattr(field_opts, 'parent_link') and field_opts.parent_link is True:
return False
return True--
You received this message because you are subscribed to the Google Groups "Django users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to django-users...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/django-users/4cd485cd-cc20-481d-a7be-3179bcc5c98bn%40googlegroups.com.
You received this message because you are subscribed to the Google Groups "Django users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to django-users+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/django-users/32c497c1-0640-4cae-9fdd-16fb4fb62e70n%40googlegroups.com.
No comments:
Post a Comment