
I'm experiencing a very weird problem when using a large NumPy array. Here's the basic context. I have about 15 lists of paired objects, and I am constructing an adjacency matrix for each one. Each adjacency matrix is about 4000 x 4000 (a square matrix in which a diagonal entry means the object is paired with itself), so it's big but not too big. Here's the basic setup of my code:

import numpy as np

def createAdjacencyMatrix(pairedObjectList, objectIndexList):
   N = len(objectIndexList)
   adj = np.zeros((N, N))  # dtype defaults to float64 (8 bytes per element)
   for i in range(len(pairedObjectList)):
      # put a 1 in the correct row/column position for the pair etc.
      pass

   return adj

In my script I call this function about 15 times, once for each paired object list. However, every time I run it I get this error:

    adj = np.zeros((N,N))
MemoryError

I really don't understand where the memory error is coming from. Even though I'm making this big matrix, it only exists within the scope of that function, so shouldn't it be cleared from memory every time the function finishes? Not to mention, if the same variable is hanging around in memory, shouldn't it just overwrite those memory positions?

Any help understanding this would be much appreciated.

EDIT: Here's the full traceback:

Traceback (most recent call last):
  File "create_biogrid_adjacencies.py", line 119, in <module>
    adjMat = dsu.createAdjacencyMatrix(proteinList,idxRefDict)
  File "E:\Matt\Documents\Research\NU\networks_project\data_setup_utils.py", line 18, in createAdjacencyMatrix
    adj = np.zeros((N,N))
MemoryError
    
Why do you say it only exists in the scope of that function, when in fact you return adj? Or does "that function" refer to something other than createAdjacencyMatrix? – Warren Weckesser Dec 11 '14 at 6:03
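To illustrate the point about returning adj: a returned array stays alive for as long as the caller holds a reference to it, and it is released only when that last reference goes away. The sketch below is a minimal demonstration, assuming CPython's reference counting; make_matrix is a hypothetical stand-in for createAdjacencyMatrix, and the small size is chosen only to keep the demo cheap.

```python
import weakref
import numpy as np

def make_matrix(n):
    # Hypothetical stand-in for createAdjacencyMatrix: builds and returns an array.
    return np.zeros((n, n))

adj = make_matrix(10)
first = weakref.ref(adj)  # observe the first array without keeping it alive

# The function has finished, but the array is still referenced by `adj`.
alive_after_return = first() is not None

# Rebinding the name: the new array is created first, and only then is the
# last reference to the old array dropped.
adj = make_matrix(10)
alive_after_rebind = first() is not None

print(alive_after_return, alive_after_rebind)
```

Note that during the rebinding both arrays exist briefly at the same time, so a loop that reuses one variable name can still need room for two matrices at its peak.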
    
Can you upload the full traceback, so that one can better figure out the problem? – Irshad Bhat Dec 11 '14 at 6:12
    
If the values in adj are always either 0 or 1, you could save some memory by using adj = numpy.zeros((N, N), dtype=numpy.uint8). Then each element uses 1 byte instead of 8. – Warren Weckesser Dec 11 '14 at 6:12
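The saving from the dtype suggestion can be worked out without allocating anything. This is a rough sketch; matrix_bytes is a hypothetical helper, and N = 4000 is the approximate size quoted in the question.

```python
import numpy as np

def matrix_bytes(n, dtype=np.float64):
    """Bytes needed for a dense n x n array of the given dtype."""
    return n * n * np.dtype(dtype).itemsize

N = 4000  # approximate side length quoted in the question

float64_mb = matrix_bytes(N) / 1e6           # numpy.zeros defaults to float64
uint8_mb = matrix_bytes(N, np.uint8) / 1e6   # one byte per element

print(float64_mb, "MB as float64")
print(uint8_mb, "MB as uint8")
```

For a 4000 x 4000 matrix this comes to about 128 MB as float64 versus 16 MB as uint8. If all 15 matrices were somehow kept alive at once, that would be roughly 1.9 GB as float64.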
    
@WarrenWeckesser I thought about that and it may work for this case, but I'm trying to figure out the underlying mechanism. Also, even when I return it, it's just one variable in a for loop in a script, which then gets saved to a file before the next one comes up. And why would the memory error come from that line and not from an overflow of the variable in the script's for loop? – Matt Dec 11 '14 at 7:02
    
@Matt You might want to put a print statement in that outer for-loop you are talking about. It would be useful to know after how many iterations your program hits a memory error - right off the bat, with the first of the 15 matrices, or further down. Also, does each object belong to a unique pair? Or can it be a part of many pairs? If it's the former, then adjacency matrices are horribly space-inefficient. You are using O(N^2) amount of space instead of just an O(N) amount. – Praveen Dec 12 '14 at 7:40
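If the pairs are sparse, one way to avoid the O(N^2) storage is to keep only the positions that would hold a 1. The following is a minimal sketch, not the asker's actual code: it assumes each pair is a 2-tuple, that index_of is a dict mapping an object to its row/column index, and that pairing is symmetric. The name build_pair_indices and the sample data are hypothetical.

```python
def build_pair_indices(paired_objects, index_of):
    """Return the set of (row, col) positions that a dense matrix would set to 1."""
    edges = set()
    for a, b in paired_objects:
        i, j = index_of[a], index_of[b]
        edges.add((i, j))
        edges.add((j, i))  # assumption: pairing is symmetric
    return edges

# Hypothetical example data: three objects, one cross pair and one self pair.
index_of = {"A": 0, "B": 1, "C": 2}
pairs = [("A", "B"), ("C", "C")]

edges = build_pair_indices(pairs, index_of)
print(sorted(edges))
```

Memory use then grows with the number of pairs rather than with N squared, and a dense matrix can still be built from the set later if one is needed for a particular computation.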
