Previously I asked a question on how to group dictionaries in a list in a hierarchical structure.
Initially I wanted to group a list of dictionaries that looks like the following, using multiple keys:
[{'dept':1, 'age':10, 'name':'Sam'},
{'dept':1, 'age':12, 'name':'John'},
.
.
.
{'dept':2,'age':20, 'name':'Mary'},
{'dept':2,'age':11, 'name':'Mark'},
{'dept':2,'age':11, 'name':'Tom'}]
And the output would be:
[
    {
        2: {
            20: [
                {
                    'age': 20,
                    'dept': 2,
                    'name': 'Mary'
                }
            ]
        },
        {
            11: [
                {
                    'age': 11,
                    'dept': 2,
                    'name': 'Mark'
                },
                {
                    'age': 11,
                    'dept': 2,
                    'name': 'Tom'
                }
            ]
        }
    },
    {
        1: {
            10: [
                {
                    'age': 10,
                    'dept': 1,
                    'name': 'Sam'
                }
            ]
        },
        {
            12: [
                {
                    'age': 12,
                    'dept': 1,
                    'name': 'John'
                }
            ]
        }
    }
]
Using this code:
import itertools, operator

l = [{'dept':1, 'age':10, 'name':'Sam'},
     {'dept':1, 'age':12, 'name':'John'},
     {'dept':2,'age':20, 'name':'Mary'},
     {'dept':2,'age':11, 'name':'Mark'},
     {'dept':2,'age':11, 'name':'Tom'}]

groups = ['dept', 'age', 'name']
groups.reverse()  # the function takes the next grouping key from the end of the list

def hierachical_data(data, groups):
    g = groups[-1]
    g_list = []
    # one single-key dict per run of adjacent records that share the value of g
    for key, items in itertools.groupby(data, operator.itemgetter(g)):
        g_list.append({key: list(items)})
    groups = groups[0:-1]
    if len(groups) != 0:
        # regroup the records of each group by the remaining keys
        for e in g_list:
            for k, v in e.items():
                e[k] = hierachical_data(v, groups)
    return g_list

print(hierachical_data(l, groups))
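A note on the snippet above: itertools.groupby only merges adjacent items with equal keys, so the grouping is only complete when the input is already ordered by the grouping keys (the sample list happens to be). The following is a minimal sketch of that behaviour; the records list and the by_dept name are made up for illustration and are not part of the original code.

```python
import itertools
import operator

# Hypothetical sample data, deliberately NOT ordered by 'dept'.
records = [
    {'dept': 2, 'name': 'Mary'},
    {'dept': 1, 'name': 'Sam'},
    {'dept': 2, 'name': 'Mark'},
]
by_dept = operator.itemgetter('dept')

# Unsorted input: dept 2 is reported as two separate groups.
unsorted_keys = [key for key, _ in itertools.groupby(records, by_dept)]

# Sorted input: every dept is reported exactly once.
sorted_keys = [key for key, _ in
               itertools.groupby(sorted(records, key=by_dept), by_dept)]

print(unsorted_keys)  # [2, 1, 2]
print(sorted_keys)    # [1, 2]
```

If the records cannot be guaranteed to arrive in order, sorting once on all grouping keys before calling the function is one way to keep the result consistent.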
This method works fine, but now I need to optimize it. The dictionaries have a large memory overhead, and the grouping gets quite slow when dealing with a huge number of records.
I was wondering if there is any algorithm I could use to reduce the time needed to do the grouping.
P.S: I wouldn't mind changing the data-structure as long as it gives me the same hierarchical format, or any better suggestion of course.
Thanks.
select dept, age, name from something order by dept, age;
This is O(n), but if this is too slow, then yes - hack away. – Leonid Oct 17 '12 at 21:37
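Since the question allows a different data structure, one possible direction is to build nested dictionaries in a single pass instead of creating a list of one-key dictionaries per level. The sketch below is only an illustration under that assumption: group_records and people are hypothetical names, the result is a dict of dicts rather than the list-of-dicts format shown in the question, and keys is assumed to be non-empty. It has not been benchmarked against the original code.

```python
def group_records(records, keys):
    """Group records into nested dicts, one level per key, in a single pass."""
    root = {}
    for record in records:
        node = root
        # Walk (or create) one dict level for every key except the last.
        for key in keys[:-1]:
            node = node.setdefault(record[key], {})
        # The last key maps to the list of records sharing all key values.
        node.setdefault(record[keys[-1]], []).append(record)
    return root

people = [
    {'dept': 1, 'age': 10, 'name': 'Sam'},
    {'dept': 1, 'age': 12, 'name': 'John'},
    {'dept': 2, 'age': 20, 'name': 'Mary'},
    {'dept': 2, 'age': 11, 'name': 'Mark'},
    {'dept': 2, 'age': 11, 'name': 'Tom'},
]

grouped = group_records(people, ['dept', 'age'])
print([p['name'] for p in grouped[2][11]])  # ['Mark', 'Tom']
```

This version does not need sorted input, visits each record once per grouping key, and stores references to the original dictionaries rather than copies. Whether it is actually faster or lighter for a given data set would need to be measured.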