Monday, July 1, 2019

Re: Accessing data from a 30 GB file in json format

To traverse a JSON structure you'd normally need the entire structure in memory. For this reason you can't (easily) apply the usual suggestions for iterating over a file efficiently to a JSON file: you can read the file itself in chunks, but the parsed structure will still grow to fill memory. After a quick search I found these packages made for efficiently parsing large JSON files as a stream: https://github.com/ICRAR/ijson and https://github.com/kashifrazzaqui/json-streamer. https://stackoverflow.com/a/17326199/248891 shows a simple example using ijson.
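For example, if the top level of your file is a JSON array of flat objects, something like this sketch should keep memory usage flat regardless of file size (untested; the file paths and field names are placeholders you'd replace with your own):

import csv
import ijson

FIELDS = ["name", "email"]  # placeholder column names

with open("huge.json", "rb") as src, open("out.csv", "w", newline="") as dst:
    writer = csv.DictWriter(dst, fieldnames=FIELDS, extrasaction="ignore")
    writer.writeheader()
    # ijson.items() yields one decoded object at a time, so only the
    # current record is held in memory instead of the whole 30 GB file.
    for record in ijson.items(src, "item"):
        writer.writerow(record)

If your top-level structure is an object rather than an array you'd adjust the "item" prefix accordingly; the ijson README documents the prefix syntax.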



On Monday, 1 July 2019 12:07:39 UTC+2, Nibil Ashraf wrote:
Hey,

I have a file with a size of around 30 GB. The file is in JSON format. I have to read the data and write it to a CSV file. When I tried to do that on my laptop, which has 4 GB of RAM, I got an error. I tried to load the JSON file like this: json_parsed = json.loads(json_data)

Can someone help me with this? How should I do it? If I need to use a server instead, please let me know what specifications I should use.

