
I am stuck trying to parse a JSON response in which the "id" key occurs more than once. I am using the requests library to pull JSON from an API. I want to retrieve every value of "id", but so far I have only been able to pull the one whose position I specify. Example JSON:

{
"apps": [{
    "id": "app1",
    "id": "app2",
    "id": "new-app"
}]
}

So far I have turned the JSON response into a dictionary so that I can parse the first occurrence of "id". I have tried writing for loops, but I keep getting either a KeyError when looking up the string id, or TypeError: list indices must be integers or slices, not str. The only thing that works is specifying which id position to output:

(data['apps'][N]['id']) -> where N = 0, 1 or 2

This would work if there were only ever one id at a time, but there will always be several, and their positions will change from time to time.
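If `data['apps'][N]['id']` really does work for several values of N, then `apps` is presumably a list of objects that each carry their own `id` (an assumption; the example JSON above shows duplicate keys inside one object instead). Under that assumption, a minimal sketch that avoids hard-coding N is to iterate over the list. The `sample` string is a hypothetical payload, not a real API response:

```python
import json

# Hypothetical payload: assumes each app is its own object inside the "apps" list.
sample = '{"apps": [{"id": "app1"}, {"id": "app2"}, {"id": "new-app"}]}'
data = json.loads(sample)

# data["apps"] is a list, so data["apps"]["id"] would raise
# "TypeError: list indices must be integers or slices, not str".
# Iterate over the list and index each dict instead.
ids = [app["id"] for app in data["apps"]]
print(ids)
```

This works for any number of apps, in any order, because it never refers to a fixed position.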

So how do I return the values of every "id" from this single JSON output? Full code below:

import requests
url = "http://x.x.x.x:8080/v2/apps/"
response = requests.get(url)

# Error if not 200 and exit
if response.status_code != 200:
    print("Status:", response.status_code, "Check URL. Exiting")
    exit()

#Turn response into a dict and parse for ids 
data = response.json()
for n in data:
    print(data['apps'][0]['id'])

OUTPUT:
app1

UPDATE: I was able to resolve this thanks to Robᵩ. Here is what I ended up using:

def list_hook(pairs):
    result = {}
    for name, value in pairs:
        if name == 'id':
            result.setdefault(name, []).append(value)
        print(value)

data = response.json(object_pairs_hook = list_hook)

Also, the API that I posted as an example is not a real API. It was just meant to be a visual representation of what I was trying to achieve. I am actually using Mesosphere's Marathon API, trying to build a Python listener for port-mapping containers.

  • Multiple occurrences of an id? Is that valid JSON (JavaScript Object Notation)? A JavaScript object cannot have an attribute by the same name twice, I would say. Mar 18, 2016 at 14:21
  • You cannot have the same key twice in the same dictionary. I think you exposed the problem wrongly. You may want to change it to {"apps": [{"id": "app1"},{"id": "app2"},{"id": "new-app"}]}. That is also what I understand from your for n in data loop - you may want to print(data['apps'][n]['id'])
    – Hugo Sousa
    Mar 18, 2016 at 14:48
  • According to my reading of both ECMA 404 and json.org, that is valid JSON. At least according to the letter, if not the spirit, of the standard.
    – Robᵩ
    Mar 18, 2016 at 15:39
  • The API, while not technically broken, is useless. You can't access the various id fields from Python's JSON parser. I suspect you can't access them from any other language's parser, either. Can you complain to the author of the API?
    – Robᵩ
    Mar 18, 2016 at 15:41
  • Okay, there is a way if you pass a special hook function into response.json(). I'll create an example.
    – Robᵩ
    Mar 18, 2016 at 15:49
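The behaviour discussed in these comments can be checked with the standard library alone. The sketch below parses the example JSON from the question with `json.loads` and no hook; Python's `json` module keeps only the last of the repeated names, so the earlier values are lost. The `raw` string is the question's example, not a real API response:

```python
import json

# The example from the question: one object with the key "id" repeated.
raw = '{"apps": [{"id": "app1", "id": "app2", "id": "new-app"}]}'

# Without a hook, each JSON object becomes a dict, and a dict can hold
# only one value per key, so later pairs overwrite earlier ones.
data = json.loads(raw)
last_id = data["apps"][0]["id"]
print(last_id)
```

The parse succeeds without error, which is why the data looks usable even though two of the three values have been dropped.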

1 Answer


Your best choice is to contact the author of the API and let him know that his data format is silly.

Your next-best choice is to modify the behavior of the JSON parser by passing in a hook function. Something like this should work:

def list_hook(pairs):
    result = {}
    for name, value in pairs:
        if name == 'id':
            result.setdefault(name, []).append(value)
        else:
            result[name] = value
    return result

data = response.json(object_pairs_hook = list_hook)

for i in range(3):
    print(i, data['apps'][0]['id'][i])
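To try the hook without access to the API, here is a self-contained sketch that feeds the question's example JSON to `json.loads`, which accepts the same `object_pairs_hook` keyword that `response.json()` forwards to it. The `raw` string stands in for the real response and is an assumption for illustration. The sketch also iterates over the collected list rather than `range(3)`, so it does not depend on knowing how many ids there are:

```python
import json

def list_hook(pairs):
    """Build a dict from (name, value) pairs, collecting every 'id' into a list."""
    result = {}
    for name, value in pairs:
        if name == 'id':
            # Append instead of overwrite, so repeated ids are all kept.
            result.setdefault(name, []).append(value)
        else:
            result[name] = value
    return result

# Stand-in for the API response body (the example from the question).
raw = '{"apps": [{"id": "app1", "id": "app2", "id": "new-app"}]}'
data = json.loads(raw, object_pairs_hook=list_hook)

ids = data['apps'][0]['id']
for i, app_id in enumerate(ids):
    print(i, app_id)
```

Note that with this hook `id` is always a list, even when an object contains only one `id`.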
  • Thanks very much @Robᵩ. You led me in the right direction with the hook function. The for loop with the range became unneeded. Thanks again.
    – wbassler23
    Mar 18, 2016 at 19:31
