On Sep 8, 2017 12:12 PM, "Mark Phillips" <mark@phillipsmarketing.biz> wrote:
Thanks for all your replies and helpful suggestions. To answer some of your questions:

1. Where is the data coming from? It is textual data in a spec/spreadsheet, mostly metadata (a name and one or more values) describing attributes of scanned documents. There is what I would call a "base set" of data, meaning what we can think of now based on a review of a representative set of documents. However, as new documents are imported into the application, other types of metadata with multiple values may have to be created on the fly (hence the need for the admin forms to add metadata in the future). The import function I need is a one-off. We need it as we develop the models and test them against various documents. Sometimes it is easier to just delete the database and rebuild it while we are developing the app than to back out certain migrations. So we are looking for a simple way to populate the metadata for development purposes, and then one final time when we go into production. Currently we have 24 metadata names, and each one can have from one to 20 values.
I use a management command to do exactly that: drop the database, create the database, run migrations, and populate the database with a fair bit of data. I use FactoryBoy to generate hundreds of fake users and other objects upon reset.
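A minimal sketch of such a command (the command name, app layout, and UserFactory are illustrative, not from a real project). The step ordering is factored into a plain function so it can be exercised without a database:

```python
def reset_and_seed(flush, migrate, seed):
    """Run the reset steps in order: clear data, apply migrations, seed."""
    flush()
    migrate()
    return seed()

# In a real project this would live in myapp/management/commands/resetdb.py
# (names are assumptions); "flush" could equally be a drop/recreate step:
#
# from django.core.management import call_command
# from django.core.management.base import BaseCommand
#
# class Command(BaseCommand):
#     help = "Flush the database, run migrations, and load sample data"
#
#     def handle(self, *args, **options):
#         reset_and_seed(
#             flush=lambda: call_command("flush", interactive=False),
#             migrate=lambda: call_command("migrate"),
#             seed=lambda: UserFactory.create_batch(200),  # FactoryBoy
#         )
```

Keeping the sequence in `reset_and_seed` means the ordering can be unit-tested with stubs, while the Django wiring stays a thin shell.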
If the files are only being used for development, and should not normally be loaded into the database, then I would definitely recommend a separate management command rather than building the fixture/fake-data loading process directly into the migrations.
2. The manage.py loaddata command is an appealing option. However, it would take further effort to convert the spreadsheet data to any of the formats that option accepts. I think a CSV file reader is more suitable for our purposes.
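A CSV reader along those lines can stay very small. A hedged sketch, assuming a two-column name,value layout (the model names in the comment are hypothetical):

```python
import csv
import io

def read_metadata(fp):
    """Parse 'name,value' rows into {name: [values]}, skipping blank
    names and duplicate values for the same name."""
    meta = {}
    for row in csv.reader(fp):
        if len(row) < 2 or not row[0].strip():
            continue
        name, value = row[0].strip(), row[1].strip()
        values = meta.setdefault(name, [])
        if value and value not in values:
            values.append(value)
    return meta

# In a one-off management command this could then feed the models
# (MetadataName/MetadataValue are assumed names, not from the thread):
#     for name, values in read_metadata(open("metadata.csv")).items():
#         obj, _ = MetadataName.objects.get_or_create(name=name)
#         for v in values:
#             obj.values.get_or_create(value=v)

sample = "doc_type,invoice\ndoc_type,receipt\nlanguage,en\n"
print(read_metadata(io.StringIO(sample)))
# {'doc_type': ['invoice', 'receipt'], 'language': ['en']}
```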
If it is simple data that is being somewhat auto-generated, I'd consider writing a FactoryBoy definition so you can generate one or thousands of entries on demand.
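To make that concrete, a sketch (the MetadataValue model and its fields are assumptions; the dependency-free stand-in at the bottom shows the same idea without factory_boy installed):

```python
# With factory_boy against a Django model this would be roughly:
#
# import factory
#
# class MetadataValueFactory(factory.django.DjangoModelFactory):
#     class Meta:
#         model = MetadataValue  # hypothetical model
#     name = factory.Sequence(lambda n: f"field_{n}")
#     value = factory.Faker("word")
#
# MetadataValueFactory.create_batch(1000)

# A stdlib-only stand-in illustrating the same pattern:
import itertools

def metadata_factory():
    """Return a builder that yields unique name/value dicts on demand."""
    counter = itertools.count()
    def build():
        n = next(counter)
        return {"name": f"field_{n}", "value": f"value_{n}"}
    return build

build = metadata_factory()
batch = [build() for _ in range(3)]
```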
3. I will have to think about the validation concepts. These are simple name-value pairs, so it is not clear what I am validating against, unless it is detecting duplicates.
Forms can do all sorts of validation. Duplicate detection is one of them. Value formatting and type checking is another. You may have some integer values that should only fall within a certain range (e.g. a person can't be 29,000,000 years old, or maybe they can, who knows...). Forms are where input is filtered and massaged to match the business logic in your models.
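That range check is one line in a Django form (roughly `forms.IntegerField(min_value=0, max_value=150)`). The same guard written as a plain function, to show what the form field is doing for you (the bounds are illustrative):

```python
def clean_age(raw):
    """Coerce raw input to int and range-check it, the way a form
    field's clean() would. Bounds here are illustrative."""
    try:
        age = int(raw)
    except (TypeError, ValueError):
        raise ValueError(f"{raw!r} is not a whole number")
    if not 0 <= age <= 150:
        raise ValueError(f"{age} is outside the allowed range 0-150")
    return age

# The Django form equivalent, for comparison:
# from django import forms
# class PersonForm(forms.Form):
#     age = forms.IntegerField(min_value=0, max_value=150)
```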
If you dump data directly into models or the database, you risk loading data that has never been evaluated by your application, which can produce subtle bugs of the worst kind: things work in dev and break in prod because your input channel is different.
4. I am also looking into spreadsheet -> CSV file -> MySQL LOAD DATA as perhaps the easiest way to complete this project. The spreadsheet is the least-effort way to create and update the metadata, and from there it is pretty automatic from the spreadsheet to the database. I am open to other suggestions!
I'd recommend FactoryBoy and would eliminate Excel entirely if I could. Otherwise an Excel parser would be my next choice.
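If the Excel parser route wins out, openpyxl keeps it short. A sketch, assuming a two-column name/value sheet with a header row (the file name and layout are assumptions); the parsing step runs on plain row tuples so it works without a workbook:

```python
def rows_to_metadata(rows):
    """Turn (name, value) row tuples into {name: [values]}."""
    meta = {}
    for name, value in rows:
        if name is None:
            continue
        meta.setdefault(str(name), []).append(str(value))
    return meta

# With a real spreadsheet (requires openpyxl; file name is hypothetical):
# from openpyxl import load_workbook
# ws = load_workbook("metadata.xlsx").active
# meta = rows_to_metadata(ws.iter_rows(min_row=2, values_only=True))

rows = [("doc_type", "invoice"), ("doc_type", "receipt"), ("language", "en")]
print(rows_to_metadata(rows))
# {'doc_type': ['invoice', 'receipt'], 'language': ['en']}
```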
-James
You received this message because you are subscribed to the Google Groups "Django users" group.
To view this discussion on the web visit https://groups.google.com/d/msgid/django-users/CA%2Be%2BciU_Rz84C-rKMeQRjA85FyM_%2B2Tc8hLFT6Zxwf7cwntnCA%40mail.gmail.com.