Associative array in python with keys as a fragment of array

Question

How to better implement such thing:

Loop over the numbers. Each number in diapason and belong to one part of associative array.

E.g.

 di = {}
 di[ xrange(0,10) ] = "A"
 di[ xrange(11,20) ] = "B"
 di[ xrange(21,30) ] = "C"

 direction = di[ 24 ]
 # direction should be "C"

 direction = di[ 6 ]
 # direction should be "A"

Edit: Populate whole di with discreet numbers is not my way, because I'm talking about IP addresses, really big data, such as netmasks 255.255.255.255. Such di will overflow my RAM.

What are you trying to do with the array? For IP addresses, it's unlikely you want any explicit mapping; rather, you'll use bitwise operations involving the integer representations of the addresses and masks. — chepner, Commented Jul 28, 2014 at 20:27
if you want zero storage just use a function (in my answer) that maps numbers to letters. for more complicated things you probably want a specialized module — Gabriel, Commented Jul 28, 2014 at 20:30
what's the data space of keys to values? i.e. you want di[range] = value, how many ranges are there, and how many values? You most likely want some kind of binary tree-based approach (actually list + bisect) or a graph model of some kind. — Corley Brigman, Commented Jul 28, 2014 at 20:35

njzk2 · Accepted Answer · 2014-07-28 21:05:25Z

I would create a custom dict that takes xranges as keys :

class DictRange(dict):
    def __getitem__(self, val):
        for key in self.keys():
            if val in key:
                return super(DictRange,self).__getitem__(key)

The drawback is that you have to go through all the keys to find your element. Use like that :

di = DictRange()
di[ xrange(0,10) ] = "A"
di[ xrange(11,20) ] = "B"
di[ xrange(21,30) ] = "C"

print di[24]
# C
print di[6]
# A

See http://repl.it/WAJ

Update

Using bisect and assuming that you can spare some time during initialization to speed up access, you can do something like:

import bisect
class DictRange(dict):

    def __setitem__(self, key, val):
        super(DictRange,self).__setitem__(key, val)
        self.ks = [key[0] for key in self.keys()]

    def __getitem__(self, val):
        return super(DictRange,self).__getitem__(self.keys()[bisect.bisect(self.ks, val) - 1])

the get will be O(logn) where n is the number of keys, but the set will become O(n). As for the next solution, it relies on the ranges being contiguous.

Other solution

An other solution, which could be faster, depending on your range sizes, would be to use as key the first item of the range, and a bisect to find the proper key:

import bisect
ranges = [0, 11, 21, 31]
di = {}
di[0] = 'A'
di[11] = 'B'
di[21] = 'C'
di[31] = 'unknown' # special value to indicate the end of the range

print di[ranges[bisect.bisect(ranges, 24) - 1]]
# C
print di[ranges[bisect.bisect(ranges, 6) - 1]]
# A
print di[ranges[bisect.bisect(ranges, 31) - 1]]
# Unknown

This will be much faster. bisect is O(logn) where n is the number of ranges, the rest is O(1)

Could you tell me, why it is so slow? How to improve the speed? — Watch0nMe, Commented Jul 28, 2014 at 20:46
it is very slow because each search needs to browse through all keys until the correct one is found. I suppose the in implementation of xrange is probably O(1), but overall, it is still O(n) where n is the number of keys in the dict, compared to O(1) for a proper hash search as used in a regular dict. — njzk2, Commented Jul 28, 2014 at 20:52
the way a dict (or any hash set) works is that the hash of the key is pseudo unique and easily mappable. in your case it is not possible, as the key is not identified with the search value. — njzk2, Commented Jul 28, 2014 at 20:54

unutbu · Accepted Answer · 2014-07-28 20:16:10Z

0

di = dict.fromkeys(xrange(0, 10), 'A')
di.update(dict.fromkeys(xrange(10, 20), 'B'))
di.update(dict.fromkeys(xrange(20, 30), 'C'))
print(di)

yields

{0: 'A', 1: 'A', 2: 'A', 3: 'A', 4: 'A', 5: 'A', 6: 'A', 7: 'A', 8: 'A', 9: 'A', 10: 'B', 11: 'B', 12: 'B', 13: 'B', 14: 'B', 15: 'B', 16: 'B', 17: 'B', 18: 'B', 19: 'B', 20: 'C', 21: 'C', 22: 'C', 23: 'C', 24: 'C', 25: 'C', 26: 'C', 27: 'C', 28: 'C', 29: 'C'}

Note that xrange(N, M) generates ints starting at N and ending at M-1 (inclusive). So if you want to include 10, 20 in the keys, then the xranges should be xrange(10, 20) and xrange(20, 30).

answered Jul 28, 2014 at 20:16

unutbu

885k197 gold badges1.8k silver badges1.7k bronze badges

this is great, but I'm talking about really big data, such as netmask 255.255.255.255. In your way this di will overflow my RAM.
– Watch0nMe
Commented Jul 28, 2014 at 20:19
Then a dict is the wrong data structure. Perhaps then you should be using the ipaddress module.
– unutbu
Commented Jul 28, 2014 at 20:23
What about map, or something else? I'm interesting in this way: di[5] == "A". or map[11] == "B"
– Watch0nMe
Commented Jul 28, 2014 at 20:26

Add a comment |

bruno desthuilliers · Accepted Answer · 2014-07-28 20:20:40Z

0

You could use tuples instead of lists to solve the TypeError (which your code doesn't raise FWIW) but the lookup still wouldn't work. The simplest thing that could possibly work would be to populate you dict with every discrete key for a given value, ie:

map = (
   (0, 10, "A"),
   (11, 20, "B"),
   # etc
   )

d = {}
for start, stop, val in map:
   for k in xrange(start, stop):
       d[k] = val

answered Jul 28, 2014 at 20:20

bruno desthuilliers

78.2k6 gold badges101 silver badges129 bronze badges

See Edit, I can't use such dict, because it is too big.
– Watch0nMe
Commented Jul 28, 2014 at 20:27
Typical XY problem. Why didn't you post your real question instead ?
– bruno desthuilliers
Commented Jul 28, 2014 at 20:33

Add a comment |

Gabriel · Accepted Answer · 2014-07-28 20:29:36Z

If you don't want to populate a dictionary just use a function,

from __future__ import division
import string
def return_ltr(i):
    n = i//10 
    return string.ascii_uppercase[n]

Otherwise, if you want to populate a dictionary do the following. Your current code is generating xrange objects and using them as keys. However, that is not the same as using the numbers for keys. Try the following,

import string
d = {}
i = 0
for ltr in string.ascii_uppercase:
    for j in xrange(10):
        d[i] = ltr
        i += 1

If you only want a subset of the letters you can slice string.ascii_uppercase. For example using string.ascii_uppercase[0:3] will just give you A, B, and C.

Collectives™ on Stack Overflow

Associative array in python with keys as a fragment of array

4 Answers 4

Update

Other solution

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

4 Answers 4

Update

Other solution

Your Answer

Sign up or log in

Post as a guest

Related