I'm experiencing a very weird problem when using a large numpy array. Here's the basic context. I have about 15 lists of paired objects which I am constructing adjacency matrices for. Each adjacency matrix is about 4000 x 4000 (square matrix where the diagonal means the object is paired with itself) so it's big but not too big. Here's the basic setup of my code:

import numpy as np

def createAdjacencyMatrix(pairedObjectList, objectIndexList):
   N = len(objectIndexList)
   adj = np.zeros((N, N))
   for i in range(len(pairedObjectList)):
      # put a 1 in the correct row/column position for the pair, etc.
      pass

   return adj

In my script I call this function about 15 times, once for each paired object list. However, every time I run it I get this error:

    adj = np.zeros((N,N))
MemoryError

I really don't understand where the memory error is coming from. Even though I'm making this big matrix, it only exists within the scope of that function, so shouldn't it be cleared from memory every time the function is finished? Not to mention, if the same variable is hanging around in memory, then shouldn't it just overwrite those memory positions?

Any help understanding this would be much appreciated.

EDIT: Here's the full traceback:

Traceback (most recent call last):
  File "create_biogrid_adjacencies.py", line 119, in <module>
    adjMat = dsu.createAdjacencyMatrix(proteinList,idxRefDict)
  File "E:\Matt\Documents\Research\NU\networks_project\data_setup_utils.py", line 18, in createAdjacencyMatrix
    adj = np.zeros((N,N))
MemoryError
  • Why do you say it only exists in the scope of that function, when in fact you return adj? Or does "that function" refer to something other than createAdjacencyMatrix? Commented Dec 11, 2014 at 6:03
  • Can you upload the full traceback so that one can better figure out the problem? Commented Dec 11, 2014 at 6:12
  • If the values in adj are always either 0 or 1, you could save some memory by using adj = numpy.zeros((N, N), dtype=numpy.uint8). Then each element uses 1 byte instead of 8. Commented Dec 11, 2014 at 6:12
  • @WarrenWeckesser I thought about that and it may work for this case, but I'm trying to figure out the underlying mechanism. Also, even when I return it, it's just one variable in a for loop in a script, which then gets saved to a file before the next one comes up. And why would the memory error come from that line rather than from the variable in the script's for loop? Commented Dec 11, 2014 at 7:02
  • @Matt You might want to put a print statement in that outer for-loop you are talking about. It would be useful to know after how many iterations your program hits a memory error: right off the bat, with the first of the 15 matrices, or further down. Also, does each object belong to a unique pair, or can it be part of many pairs? If it's the former, then adjacency matrices are horribly space-inefficient: you are using O(N^2) space instead of just O(N). Commented Dec 12, 2014 at 7:40
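
To put rough numbers on the dtype suggestion in the comments above, here is a minimal sketch of the arithmetic. It assumes N is exactly 4000 (the question only says "about 4000"), and the helper name dense_matrix_bytes is made up for illustration. It computes the size from the dtype's item size, so nothing large is actually allocated.

```python
import numpy as np

def dense_matrix_bytes(n, dtype):
    """Bytes needed to hold an n x n dense array of the given dtype."""
    return n * n * np.dtype(dtype).itemsize

N = 4000  # assumed side length; the question says "about 4000"

# numpy.zeros defaults to float64, i.e. 8 bytes per element.
float64_bytes = dense_matrix_bytes(N, np.float64)
# uint8 is enough for 0/1 entries and uses 1 byte per element.
uint8_bytes = dense_matrix_bytes(N, np.uint8)

print("float64: %.1f MiB" % (float64_bytes / 2**20))
print("uint8:   %.1f MiB" % (uint8_bytes / 2**20))

# Sanity check against a real (small) array's reported size.
small = np.zeros((10, 10), dtype=np.uint8)
assert small.nbytes == dense_matrix_bytes(10, np.uint8)
```

Under that assumption, one float64 matrix is about 122 MiB and one uint8 matrix about 15 MiB. That alone does not explain the MemoryError. It only shows the scale involved: if all 15 float64 matrices were somehow kept alive at once, the total would be roughly 1.8 GiB. Whether that is what happens here cannot be determined from the code shown.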
