I wrote this JSON walker to help with moving objects around while updating legacy JSON documents to match a new schema. Please let me know how I could optimize it further or, more importantly, make it easier to understand and maintain.
    # Python 2 method; assumes these module-level imports:
    #   from collections import Mapping, Iterable
    #   from types import StringTypes
    def walk(self, obj=None):
        """Recursively walks JSON and returns a list of all
        key combinations.

        sample input JSON:
            {
                'key1' : 'value1',
                'key2':'value2',
                'key3':{'key3a':'value3a'},
                'key4':{'key4a': [
                                    {
                                        'key4aa':'value4aa'
                                    }
                                 ],
                                 [
                                    {
                                        'key4ab':'value4ab',
                                        'key4ac':'value4ac'
                                    }
                                 ]
                        'key4b':'value4b'
                       }
            }

        corresponding output:
            [
                ['key1' : 'value1'],
                ['key1', 'key2' : 'value2'],
                ['key1', 'key3', 'key3a'],
                ['key1', 'key4', key4a, 0, 'key4aa'],
                ['key1', 'key4', key4a, 1, 'key4ab'],
                ['key1', 'key4', key4a, 1, 'key4ac'],
                ['key1', 'key4', 'key4b']
            ]
        """
        # Python default values are evaluated when the function
        # is defined, not when it is called. This is a work-around.
        if obj is None:
            obj = self._json
        # Optimization: these are the JSON schema default values.
        # They cannot have child elements.
        if obj == "" or obj == -1:
            return []
        # walk a dictionary
        elif isinstance(obj, Mapping):
            flat_tree = []
            for key, val in obj.iteritems():
                # make a new branch for each child key
                children = self.walk(val)
                if children == []:
                    flat_tree.append([key])
                else:
                    flat_tree.extend([[key] + child for child in children])
            return flat_tree
        # walk lists
        elif isinstance(obj, Iterable) and not isinstance(obj, StringTypes):
            flat_tree = []
            for ii, val in enumerate(obj):
                # make a new branch for each child index
                children = self.walk(val)
                if children == []:
                    flat_tree.append([ii])
                else:
                    flat_tree.extend([[ii] + child for child in children])
            return flat_tree
        # All possible iterable datatypes have been ruled out,
        # so this element is a JSON value type: (bool, int, str, float, etc.)
        else:
            return []
Why should value1 and value2 appear in the output, but not value3a or value4b? Why should key1 appear in every item of the output, but not key2? Aren't they siblings of each other, and shouldn't they therefore be treated identically? – 200_success♦ Dec 22 '13 at 12:46
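To make the comment's point concrete, here is a minimal standalone sketch of the same traversal. It is not the original method. It assumes Python 3 (`collections.abc`, `dict.items()`), a free function instead of a method on the class that holds `self._json`, and a corrected, valid version of the sample input in which `key4a` holds one list with two objects. Run this way, the traversal appears to return key paths only, with siblings treated alike and no values in the output.

```python
from collections.abc import Iterable, Mapping


def walk(obj):
    """Return every root-to-leaf path in obj as a list of keys and indexes."""
    if isinstance(obj, Mapping):
        items = obj.items()
    elif isinstance(obj, Iterable) and not isinstance(obj, (str, bytes)):
        items = enumerate(obj)
    else:
        # Scalars (str, int, float, bool, None) have no children.
        return []

    paths = []
    for key, val in items:
        children = walk(val)
        if children:
            # Prefix each child path with the current key or index.
            paths.extend([key] + child for child in children)
        else:
            # Leaf value (or empty container): the path ends here.
            paths.append([key])
    return paths


# Assumed, corrected version of the docstring sample (valid JSON structure).
sample = {
    "key1": "value1",
    "key2": "value2",
    "key3": {"key3a": "value3a"},
    "key4": {
        "key4a": [
            {"key4aa": "value4aa"},
            {"key4ab": "value4ab", "key4ac": "value4ac"},
        ],
        "key4b": "value4b",
    },
}

paths = walk(sample)
for path in paths:
    print(path)
# ['key1']
# ['key2']
# ['key3', 'key3a']
# ['key4', 'key4a', 0, 'key4aa']
# ['key4', 'key4a', 1, 'key4ab']
# ['key4', 'key4a', 1, 'key4ac']
# ['key4', 'key4b']
```

Merging the dictionary and list branches into one loop is only one possible simplification. It works here because both branches in the posted code differ solely in how they produce `(key, value)` pairs.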