My best guess is that no line of the file is valid JSON on its own, so a ValueError is raised on every iteration and you never reach data.append(...): an exception has always been thrown by then.
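You can see the failure mode by handing the parser a single line of a pretty-printed file (a quick demonstration of the error, not your actual code):

import json

try:
    json.loads('{')  # one line of a pretty-printed object
except ValueError as e:
    print('invalid:', e)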
If the entire file is a JSON array like this:
[
    {
        "direction": "left",
        "time": 1
    },
    {
        "direction": "right",
        "time": 2
    }
]
Then you can simply use something like:
import json

with open('output.json', 'r') as f:
    data = json.load(f)
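data is then an ordinary Python list of dicts, so for the example above data[0]['direction'] is 'left'.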
If, however, it is a bunch of JSON items at the top level, not enclosed within a JSON object or array, like this:
{
    "direction": "left",
    "time": 1
}
{
    "direction": "right",
    "time": 2
}
then you'll have to take a different approach: decoding items one by one. Unfortunately, the json module can't stream data from a file, so we'll first have to read all of it in at once:
with open('output.json', 'r') as f:
    json_data = f.read()
To parse a single item, we use the raw_decode method, which means we need to make a JSONDecoder:

decoder = json.JSONDecoder()
Then we just loop: while there's still non-whitespace in the string, strip any whitespace off the left side and parse one item:

data = []
while json_data.strip():  # while there's still non-whitespace...
    # strip off whitespace on the left side of the string
    json_data = json_data.lstrip()
    # parse one item and find out where it ended
    item, index = decoder.raw_decode(json_data)
    # keep whatever comes after it as the remaining data
    json_data = json_data[index:]
    # ...and then append that item to our list
    data.append(item)
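Note that raw_decode returns the decoded object together with the index at which it stopped reading, rather than the leftover string; that's why the loop slices json_data itself. It will also raise a ValueError if it hits anything that isn't JSON, so a stray line of garbage in the file stops the loop with an error.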
If you're doing lots of data collection like this, it might be worthwhile to store it in a database. Something simple like SQLite will do just fine. A database makes it easy to compute aggregate statistics efficiently (that's what they're designed for!), and it would probably also be faster than re-parsing the JSON every time you want to look at the data.
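Here's a minimal sketch using the standard-library sqlite3 module (the filename, table name, and column names are made up for illustration):

import sqlite3

conn = sqlite3.connect('events.db')  # hypothetical filename
conn.execute('CREATE TABLE IF NOT EXISTS events (direction TEXT, time INTEGER)')
# insert the items parsed above
conn.executemany('INSERT INTO events VALUES (?, ?)',
                 [(item['direction'], item['time']) for item in data])
conn.commit()

# aggregate statistics become a single query
for direction, count in conn.execute(
        'SELECT direction, COUNT(*) FROM events GROUP BY direction'):
    print(direction, count)
conn.close()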