I'm running into a very weird problem when using a large numpy array. Here's the basic context: I have about 15 lists of paired objects, and I'm constructing an adjacency matrix for each one. Each adjacency matrix is about 4000 x 4000 (a square matrix where the diagonal means an object is paired with itself), so it's big but not too big. Here's the basic setup of my code:
def createAdjacencyMatrix(pairedObjectList, objectIndexList):
    N = len(objectIndexList)
    adj = numpy.zeros((N, N))
    for i in range(len(pairedObjectList)):
        # put a 1 in the correct row/column position for the pair etc.
    return adj
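For reference, the loop body does something like this (a simplified sketch; in my real code objectIndexList is a dict mapping each object to its matrix index, and each entry in pairedObjectList is a pair of objects):

for objA, objB in pairedObjectList:
    # mark the pair in both directions, since the matrix is symmetric
    row = objectIndexList[objA]
    col = objectIndexList[objB]
    adj[row, col] = 1
    adj[col, row] = 1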
In my script I call this function about 15 times, once for each paired object list. However, every time I run it I get this error:
adj = np.zeros((N,N))
MemoryError
I really don't understand where the memory error is coming from. Even though I'm making this big matrix, it only exists within the scope of that function, so shouldn't it be cleared from memory each time the function finishes? And even if the same variable were hanging around in memory, shouldn't the new array just overwrite those memory positions?
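For scale, here's the quick arithmetic I did on the sizes involved (assuming numpy's default float64 dtype):

import numpy

N = 4000
adj = numpy.zeros((N, N))       # default dtype is float64: 8 bytes per element
print(adj.nbytes)               # 128000000 bytes, i.e. ~122 MiB per matrix
print(15 * adj.nbytes / 2**30)  # ~1.8 GiB even if all 15 matrices stayed alive

So a single matrix is nowhere near the size I'd expect to exhaust memory on its own.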
Any help understanding this much appreciated.
EDIT: Here's the full output of the traceback:
Traceback (most recent call last):
File "create_biogrid_adjacencies.py", line 119, in <module>
adjMat = dsu.createAdjacencyMatrix(proteinList,idxRefDict)
File "E:\Matt\Documents\Research\NU\networks_project\data_setup_utils.py", line 18, in createAdjacencyMatrix
adj = np.zeros((N,N))
MemoryError
Comment: Does your function really end with return adj? Or does "that function" refer to something other than createAdjacencyMatrix?

Comment: If the values in adj are always either 0 or 1, you could save some memory by using adj = numpy.zeros((N, N), dtype=numpy.uint8). Then each element uses 1 byte instead of 8.
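To put numbers on that suggestion, here's a quick comparison of the two dtypes using the array's nbytes attribute:

import numpy

N = 4000
dense = numpy.zeros((N, N))                       # float64: 8 bytes per element
compact = numpy.zeros((N, N), dtype=numpy.uint8)  # uint8: 1 byte per element
print(dense.nbytes)    # 128000000 bytes (~122 MiB)
print(compact.nbytes)  # 16000000 bytes (~15 MiB)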