
Good evening,

I am currently working on a first-year university project to simulate continuum percolation. This involves randomly distributing some discs/spheres/hyperspheres across a square/cube/hypercube in n-dimensional space and finding a cluster of connected particles that spans the boundaries.

In order to speed up what is essentially collision detection between all these particles, so that they can be grouped into connected clusters, I have decided to use spatial partitioning so my program scales nicely with the number of particles. This requires me to divide the n-dimensional space into evenly sized boxes/cubes/hypercubes and place particles inside the relevant boxes, so that an optimised collision check can be done with fewer comparisons: only particles lying in the boxes/cubes/hypercubes adjacent to the one in which the new particle lies need to be checked. All the detail has been worked out algorithmically.
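For reference, the cell-list scheme described above can be sketched in plain Python with a dictionary keyed by integer cell coordinates (the names `cell_of`, `insert_particle` and `neighbour_candidates` are illustrative, not from any library). With a cell size of at least one particle diameter, only the 3^n cells around a new particle ever need checking:

```python
import itertools
from collections import defaultdict

def cell_of(point, cell_size):
    """Map an n-dimensional point to the integer index of its grid cell."""
    return tuple(int(coord // cell_size) for coord in point)

def insert_particle(grid, point, cell_size):
    """Store a particle (here just its coordinates) in its cell's list."""
    grid[cell_of(point, cell_size)].append(point)

def neighbour_candidates(grid, point, cell_size):
    """Collect particles in the cell containing `point` and all adjacent
    cells -- 3**n cells in n dimensions; everything else is skipped."""
    centre = cell_of(point, cell_size)
    candidates = []
    for offset in itertools.product((-1, 0, 1), repeat=len(centre)):
        cell = tuple(c + o for c, o in zip(centre, offset))
        candidates.extend(grid.get(cell, []))
    return candidates

# Usage: 2-D discs of radius 0.5, so a cell size of 1.0 (one diameter).
grid = defaultdict(list)
insert_particle(grid, (0.2, 0.3), 1.0)
insert_particle(grid, (1.1, 0.9), 1.0)   # lives in an adjacent cell
insert_particle(grid, (5.0, 5.0), 1.0)   # far away, never compared
near = neighbour_candidates(grid, (0.5, 0.5), 1.0)
```

Only the two nearby particles end up in `near`; the distant one is never even looked at, which is exactly the saving the partitioning buys.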

However, it seemed like a good idea to use an ndarray whose dimension equals that of the space being studied. Each "point" in the ndarray would itself contain an array of particle objects. It would then be easy to look at the cells of the ndarray with coordinates around that of the new particle and cycle through the arrays they contain, which would hold the other particles against which the check must be done. I then found out that an ndarray can only hold elements of a fixed size, which these arrays of particles are not, since they grow as particles are randomly added to the system.

Would a normal numpy array of arrays of arrays (etc.) be the only solution, or do structures similar to ndarray but able to accommodate objects of variable size exist? Ndarray seemed great because it is part of numpy, which is implemented in compiled C, so it would be fast. Furthermore, an ndarray would not require any loops to construct, as I believe an array of arrays of arrays (etc.) would (NB: the dimensionality of the space and the increments of spatial division are not constant, as particles of different radii can be added, meaning a change in the size of the spatial division squares/cubes/hypercubes).

Speed is very important in this program and it would be a shame to see the algorithmically good optimisations I have found be ruined by bad implementation!

You can make an ndarray of 'object' - and then you can populate it with, e.g. lists or any other object. You lose some of the speed advantages of using ndarray, but still have the multi-dimensional slicing and so forth. The .nonzero() method in particular could be useful. – greggo Jan 26 at 20:38
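To illustrate that comment, here is a minimal sketch of an object-dtype ndarray whose cells are growable Python lists. One caveat: each cell must be assigned its own list in a loop; filling the array from a single `[]` would make every cell alias the same list.

```python
import numpy as np

# A 3x3 grid of cells; each cell holds a growable Python list of particles.
grid = np.empty((3, 3), dtype=object)
for idx in np.ndindex(grid.shape):
    grid[idx] = []          # fresh list per cell, not one shared list

grid[1, 1].append("particle_a")
grid[1, 2].append("particle_b")

# Multi-dimensional slicing still works: the neighbourhood of cell (1, 1).
neighbourhood = grid[0:3, 0:3]
occupied = [cell for cell in neighbourhood.flat if cell]
```

As the later comments note, element access goes through Python objects, so this keeps the convenient indexing but not the numeric speed of numpy.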
    
It sounds like you want something more along the lines of a K-d tree. Nearest neighbour searches on K-d trees are O(log(N)) – ali_m Jan 26 at 21:00
    
numpy arrays with dtype=object lose most of the speed advantages of conventional numeric numpy arrays. They are little better than glorified Python lists of lists. – hpaulj Jan 26 at 22:48

1 Answer 1

3 votes · accepted

Have you considered using a kd-tree instead? kd-trees support fast enumeration of the neighbours of a point by splitting the space (much like you suggested with the multidimensional arrays).

As a nice bonus, there's already a decent kd-tree implementation in SciPy, the companion project to NumPy: scipy.spatial.KDTree.

