Thursday, May 30, 2013

Re: Importing data using loaddata from exported data using dumpdata

Ah, a bit of digging turned up the solution. This link explains the issue: http://bit.ly/13mS8ZB. Specifically:

"cmd.exe" supports an ascii and an utf16 output mode: this can be selected by
    "cmd.exe"    -> ascii mode
    "cmd.exe /u" -> unicode mode
The only difference between the two modes is that in unicode mode all output from
the "cmd.exe" itself (and all included tools like dir) will be in utf16, while in
ascii mode all output of "cmd.exe" will be in ascii.

I had set the shell that Bitnami was calling to PowerShell, which was clearly set to UTF-16. Setting it back to regular cmd.exe was all I needed to do!

Once more, thanks for walking me through this.

On Thursday, May 30, 2013 11:19:07 PM UTC+3, ke1g wrote:
If all else fails, this untested tool (attached) might translate bad files into good ones. Run it like this:

  python utr16toascii.py bad_file name_for_new_good_file
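
For reference, here is a minimal sketch of what such a converter might look like (the actual attachment is not reproduced here, so this is an assumed implementation, not Bill's script; the ASCII encode will raise UnicodeEncodeError if the input contains non-ASCII characters):

    # a minimal sketch, not the actual attachment: read UTF-16, write ASCII
    import codecs
    import sys

    def convert(src, dst):
        # the utf-16 codec consumes the byte order mark and picks the byte order
        with codecs.open(src, 'r', encoding='utf-16') as infile:
            text = infile.read()
        with codecs.open(dst, 'w', encoding='ascii') as outfile:
            outfile.write(text)

    if __name__ == '__main__':
        convert(sys.argv[1], sys.argv[2])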


On Thu, May 30, 2013 at 4:10 PM, Bill Freeman <ke1...@gmail.com> wrote:
This file is encoded in UTF-16 with a byte order mark.  That is to say, other than starting with \xff\xfe (the two-byte byte order mark), every other byte is NUL (\x00).  There are actually 1449 useful characters in this 2900-byte file (2 BOM bytes plus 2 x 1449 character bytes = 2900).  A converted version is attached.  json.load() is happy with it.

I suspect that it was produced correctly, but the act of opening it in a Windows editor converted it to "wide" characters, which Windows has preferred for a while now.  I don't know how to tell Windows to give you the actual byte size of a file, rather than rounding up to a number of "k".  You could use the following Python incantation:

    >>> with open('the_file') as fp: print len(fp.read())

The length of my file, downloaded but not opened in an editor, should be 1449.  The length of the bad one should be 2900.  The question remains what length dumpdata produced, before any editor touched the file.  If it is already bad, it must be cmd.exe's ">" redirection that is performing the conversion, or possibly the default encoding in that Python.  If you are using the same Python for the loaddata it should have the same default encoding, though I'm not sure that applies to files read directly, rather than sent to stdout.
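
A more defensive variant of that incantation (my own sketch, not from the thread) opens the file in binary mode, so Windows newline translation cannot skew the byte count:

    with open('the_file', 'rb') as fp:
        print len(fp.read())  # expect 1449 for the good file, 2900 for the bad one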

If the editor is what's doing it, there are editors that won't.  IDLE, which comes with most Windows Python installs, has an editor that is a possibility.  Other Windows users may want to comment.


On Thu, May 30, 2013 at 3:27 PM, Gitonga Mbaya <git...@gmail.com> wrote:
I just did a fresh dump and I realise the difference is not that drastic. The extra stuff must come from trying to edit it. Here is a fresh file from the dump...


On Thursday, May 30, 2013 9:50:26 PM UTC+3, ke1g wrote:
Can you load the file using json.load()?  I.e., is that one of the things that you have already tried?
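
For example (a minimal check; the fixtures/data.json path is an assumption taken from the commands quoted below):

    import json

    with open('fixtures/data.json') as fp:
        data = json.load(fp)   # a UTF-16 file fails here with ValueError
    print len(data)            # number of fixture objects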


On Thu, May 30, 2013 at 2:32 PM, Gitonga Mbaya <git...@gmail.com> wrote:
Everything you suggest I had already tried. Without indent, same result. Dumping an XML file, same thing. The only thing I didn't try was loading it in a different project.

I am doing all this on Windows 7 on the same machine.

On Thursday, May 30, 2013 8:57:42 PM UTC+3, ke1g wrote:
Try again without the indent (just for grins).

Are the two systems on the same box, or did you have to transfer it over a network, or via a flash drive, or the like?

If two boxes, is one Windows and the other not?  (Line endings differ, though I would hope that the JSON tools would be proof against that.)

Are there non-ASCII characters in any of the strings?  (Encodings could differ.)

See if you can make it work for one application, e.g.:

  python manage.py dumpdata books > file.json

and in the other project:

  python manage.py loaddata fixture/file.json

(You should be able to leave off the fixture/ if that's where you have put it.)

Try again in the XML format:

  python manage.py dumpdata --format xml > file.xml

  python manage.py loaddata file.xml

(I'm pretty sure that loaddata figures out the format for itself; at least it doesn't document a format switch.  I've never tried this, so it's possible that loaddata only supports JSON.)

Bill


On Thu, May 30, 2013 at 1:38 PM, Gitonga Mbaya <git...@gmail.com> wrote:
Bill,

These are the exact steps I follow:

python manage.py dumpdata --indent=4 > fixtures/data.json

python manage.py loaddata fixtures/data.json

That is when I get:

DeserializationError: No JSON object could be decoded

I checked the JSON using http://jsonlint.com/ and it was reported as valid. (The JSON is reproduced at the end of this post for your information.)

I opened the file using Notepad++, copied it all into regular Notepad.exe, and then saved it as a new JSON file. When I do the loaddata command with that new file, it works just fine.

When I copy and paste the code from Notepad.exe back into a new file in Notepad++ and save that, the resulting file works just fine as well.

This link: http://stackoverflow.com/questions/8732799/django-fixtures-jsondecodeerror suggested that the Unicode text file needed to be converted to ASCII. It was also pointed out that, in a hex editor, the file should start with the byte 5B and not any other byte. Sure enough, in the hex editor, the file straight from the dump began with FF FE (the UTF-16 little-endian byte order mark), but the Notepad-saved JSON file began with 5B (ASCII for "[", the opening bracket of the JSON list). Could it be my setup that is at fault, producing the wrong JSON file dump?
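
The same check can be made from Python without a hex editor (a quick sketch, again assuming the dump is at fixtures/data.json):

    with open('fixtures/data.json', 'rb') as fp:
        print repr(fp.read(2))  # '\xff\xfe' is the UTF-16 LE BOM; a clean dump starts with '[' (0x5B)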

[
    {
        "pk": 1, 
        "model": "books.publisher", 
        "fields": {
            "state_province": "MA", 
            "city": "Cambdridge", 
            "name": "O'Reilly Media", 
            "country": "USA", 
            "website": "www.oreilly.com", 
            "address": "73 Prince Street"
        }
    }, 
    {
        "pk": 2, 
        "model": "books.publisher", 
        "fields": {
            "state_province": "CA", 
            "city": "Bakersfield", 
            "name": "Randomn House", 
            "country": "USA", 
            "website": "www.randomn.com", 
            "address": "234 Hollywood Boulevard"
        }
    }, 
    {
        "pk": 3, 
        "model": "books.publisher", 
        "fields": {
            "state_province": "NY", 
            "city": "New York", 
            "name": "Pearson Vue", 
            "country": "USA", 
            "website": "www.pearson.com", 
            "address": "1 Wall Street"
        }
    }, 
    {
        "pk": 1, 
        "model": "books.author", 
        "fields": {
            "first_name": "Eric", 
            "last_name": "Meyer", 
            "email": ""
        }
    }, 
    {
        "pk": 2, 
        "model": "books.author", 
        "fields": {
            "first_name": "Seth", 
            "last_name": "Meyer", 
            "email": ""
        }
    }, 
    {
        "pk": 3, 
        "model": "books.author", 
        "fields": {
            "first_name": "Vincent", 
            "last_name": "Meyer", 
            "email": ""
        }
    }, 
    {
        "pk": 1, 
        "model": "books.book", 
        "fields": {
            "publisher": 1, 
            "authors": [
                1
            ], 
            "isbn": 123456789, 
            "publication_date": null, 
            "title": "CSS: The Definitive Guide"
        }
    },
    {
        "pk": 2, 
        "model": "books.book", 
        "fields": {
            "publisher": 3, 
            "authors": [
                2
            ], 
            "isbn": 987654321, 
            "publication_date": null, 
            "title": "Primer on Banking"
        }
    },
    {
        "pk": 3, 
        "model": "books.book", 
        "fields": {
            "publisher": 2, 
            "authors": [
                1,
                2
            ], 
            "isbn": 543216789, 
            "publication_date": null, 
            "title": "Frolicking on the Beach"
        }
    }
]

On Sunday, March 4, 2012 12:04:08 AM UTC+3, Vincent Bastos wrote:
Hi,

I am having trouble importing data using loaddata from a .json file that I created with a dumpdata export. I have a production application running MySQL on one server and a development machine running SQLite. I simply executed ./manage.py dumpdata > file.json on the production machine, but when I execute ./manage.py loaddata file.json I get the error:

ValueError: No JSON object could be decoded

I would appreciate some sort of troubleshooting direction, as I could not find anything in the docs that would help me.

Cheers
