
Motivation: I have many large read-only dicts mapping long string keys to tuples of two floats, and they are taking up a large amount of heap space.

Proposed solution: an n×3 numpy array whose first column holds the hash of each key, sorted by that column. Only `__contains__` and `__getitem__` are required for my needs.

I realize that this solution is O(log n) rather than O(1), but I assume that isn't a big problem for my application: there aren't many lookups per request, and my primary constraint is memory.

I'm not sure how well `searchsorted` performs here, though, or whether I should adjust the strides so that the first column is contiguous. Perhaps I should be using a record array instead.
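One way to sketch the contiguity idea (illustrative names and sample data, not a drop-in replacement): keep the hashes in their own contiguous int64 array, parallel to an (n, 2) float array of values. Note that a record-array field view would actually be *strided*, not contiguous, whereas two parallel arrays give a truly contiguous key column, and storing hashes as int64 avoids casting them to float64.

```python
import numpy as np

# Sample stand-in data; the real dicts map long strings to float pairs.
d = {'foo': (1.0, 2.0), 'bar': (3.0, 4.0), 'baz': (5.0, 6.0)}

# Sort (hash, v0, v1) rows by hash, then split into two parallel arrays:
# a contiguous int64 key column and an (n, 2) float64 value table.
rows = sorted((hash(k), v[0], v[1]) for k, v in d.items())
keys = np.array([r[0] for r in rows], dtype=np.int64)     # contiguous
vals = np.array([r[1:] for r in rows], dtype=np.float64)  # shape (n, 2)

# Lookup: binary search the key column, then check for an exact match.
ix = np.searchsorted(keys, hash('foo'))
found = ix < len(keys) and keys[ix] == hash('foo')
```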

import numpy

class FCD(object):
  """Read-only mapping backed by an array sorted on the key-hash column."""

  def __init__(self, d):
    data = []
    for key, value in d.items():
      data.append((hash(key), value[0], value[1]))
    data.sort()
    # numpy infers float64 here, so hashes are stored as floats;
    # hash magnitudes above 2**53 would lose precision.
    self.data = numpy.array(data)

  def __getitem__(self, key):
    hkey = hash(key)
    # Binary search the sorted hash column for the insertion point.
    ix = numpy.searchsorted(self.data[:, 0], hkey)
    if ix < len(self.data) and self.data[ix][0] == hkey:
      return self.data[ix][1:]
    raise KeyError(key)

  def __contains__(self, key):
    hkey = hash(key)
    ix = numpy.searchsorted(self.data[:, 0], hkey)
    return ix < len(self.data) and self.data[ix][0] == hkey
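To make the memory claim concrete, here is a rough, self-contained sketch of the array's footprint versus the dict's (the value of `n` and the key format are arbitrary, and exact sizes are CPython/NumPy implementation details that vary by version and platform):

```python
import sys
import numpy as np

n = 100_000
d = {'key%06d' % i: (float(i), float(i)) for i in range(n)}

# The proposed layout: one (n, 3) float64 array, sorted by the hash column.
table = np.array(sorted((hash(k), a, b) for k, (a, b) in d.items()))

array_bytes = table.nbytes        # exactly 3 * 8 * n for the payload
dict_bytes = sys.getsizeof(d)     # the hash table alone; excludes the
                                  # string keys and value tuples, which
                                  # the dict also keeps alive
```

The array side is exact; the dict side is an underestimate, since each string key and each 2-tuple is a separate heap object on top of `sys.getsizeof(d)`.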
I am a little confused as to why you do not just use a dict here. Can you prove that your solution is more memory efficient? The lookups will certainly be slower. –  ebarr Apr 13 at 12:27
