Fast Python spring network solver

Question

I wanted a very simple spring system written in Python. The system would be defined as a simple network of knots, linked by links using the following rules:

A knot is a massless connection between links. Each knot is only affected by the push/pull forces it receives from the links it is connected to (no gravity, viscosity etc.)... Its only attribute is to be anchored or not, where an anchored knot affects the system by its movement. An unanchored knot can also affect the system if being moved, but it will be pulled back by the resulting push/pull forces.
A link is a connection between 2 knots. It has no mass, and it applies force on the knots connected to each end derived from the difference between its current length and its initial length.
The system takes for input the initial position of each knot, each knot's anchored state, a list of links (presented as arrays of knot indices), and each link's initial lengths. The system then begins iterating over the network by adding up all the forces affecting each knot, adjusting the knots to a their new position (dampened for stability), and keep iterating until an iteration count limit is reached, or the highest force applied at any given iteration is below a given threshold. I don't care to solve over time, I don't need velocity, all I want is the final "relaxed" position of each knot.

Leveraging NumPy's vectorization, I came up with this code:

import numpy as np
from numpy.core.umath_tests import inner1d

def solver(kPos, kAnchor, link0, link1, w0, cycles=1000, precision=0.001, dampening=0.1, debug=False):
    """
    kPos       : vector array - knot position
    kAnchor    : float array  - knot's anchor state, 0 = moves freely, 1 = anchored (not moving)
    link0      : int array    - array of links connecting each knot. each index corresponds to a knot
    link1      : int array    - array of links connecting each knot. each index corresponds to a knot
    w0         : float array  - initial link length
    cycles     : int          - eval stops when n cycles reached
    precision  : float        - eval stops when highest applied force is below this value
    dampening  : float        - keeps system stable during each iteration
    """

    kPos        = np.asarray(kPos)
    pos         = np.array(kPos) # copy of kPos
    kAnchor     = 1-np.clip(np.asarray(kAnchor).astype(float),0,1)[:,None]
    link0       = np.asarray(link0).astype(int)
    link1       = np.asarray(link1).astype(int)
    w0          = np.asarray(w0).astype(float)

    F = np.zeros(pos.shape)
    i = 0
    knots, ndim = kPos.shape

    for i in xrange(cycles):

        # Init force applied per knot
        F = np.zeros(pos.shape)

        # Calculate forces
        AB = pos[link1] - pos[link0] # get link vectors between knots
        w1 = inner1d(AB,AB) ** 0.5 # get link current lengths
        AB/=w1[:,None] # normalize link vectors
        f = (w1 - w0) # calculate force
        f = f[:,None] * AB # calculate force vector

        # Apply force vectors on each knot
        #np.add.at(F, link0, f)      # F[link0] += f*AB
        #np.subtract.at(F, link1, f) # F[link1] -= f*AB
        #
        # Now using @Jaime's suggestion, this is a huge improvement!
        for dim in xrange(ndim):
            F[:,dim] = (np.bincount(link0, weights=f[:, dim], minlength=knots) - np.bincount(link1, weights=f[:, dim], minlength=knots))


        # Update point positions       
        pos += F * dampening * kAnchor

        # If the maximum force applied is below our precision criteria, exit
        if np.amax(F) < precision:
            break

    # Debug info
    if debug:
        print 'Iterations: %s'%i
        print 'Max Force:  %s'%np.amax(F)

    return pos

Here's some test data to show how it works. In this case I'm using a grid, but in reality the network can be of any shape or form:

import cProfile

# Create a 5x5 3D knot grid
z = np.linspace(-0.5, 0.5, 5)
x = np.linspace(-0.5, 0.5, 5)[::-1]
x,z = np.meshgrid(x,z)
kPos = np.array([np.array(thing) for thing in zip(x.flatten(), z.flatten())])
kPos = np.insert(kPos, 1, 0, axis=1)
'''
array([[-0.5 ,  0.  ,  0.5 ],
       [-0.25,  0.  ,  0.5 ],
       [ 0.  ,  0.  ,  0.5 ],
       ..., 
       [ 0.  ,  0.  , -0.5 ],
       [ 0.25,  0.  , -0.5 ],
       [ 0.5 ,  0.  , -0.5 ]])
'''


# Define the links connecting each knots
link0 = [0,1,2,3,5,6,7,8,10,11,12,13,15,16,17,18,20,21,22,23,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19]
link1 = [1,2,3,4,6,7,8,9,11,12,13,14,16,17,18,19,21,22,23,24,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24]
AB    = kPos[link0]-kPos[link1]
w0    = np.sqrt(inner1d(AB,AB)) # this is a square grid, each link's initial length will be 0.25

# Set the anchor states
kAnchor = np.zeros(len(kPos)) # All knots will be free floating
kAnchor[12] = 1 # Middle knot will be anchored

This is what the grid looks like:

If we run my code using this data, nothing will happen since the links aren't pushing or pulling:

print np.allclose(kPos,solver(kPos, kAnchor, link0, link1, w0, debug=True))
# Returns True
# Iterations: 0
# Max Force:  0.0

Now let's move that middle anchored knot up a bit and solve the system:

# Move the center knot up a little
kPos[12] = np.array([0,0.3,0])

# eval the system
new = solver(kPos, kAnchor, link0, link1, w0, debug=True) # positions will have moved
#Iterations: 102
#Max Force:  0.000976603249133

# Rerun with cProfile to see how fast it runs
cProfile.run('solver(kPos, kAnchor, link0, link1, w0)')
# 520 function calls in 0.008 seconds

And here's what the grid looks like after being pulled by that single anchored knot:

Question

This grid example solves plenty fast, but my actual use cases are a little more complex than this example (~100-200 knots, ~300-500 links) and solve too slow. I'm looking for ways to make this faster. I did try to get rid of the square root calls at every iteration, which didn't make much of a difference.

I did try to cythonize my program (see below) in hopes to improve performance, but with my very limited C knowledge the final performance was far worse.

If anyone can show me ways to speed up its current pure Python implementation, or could show me ways to properly cythonize my function, I would be very grateful.

Here's my very naive Cython version:

cdef extern from "math.h":
    double sqrt(double m)


import numpy as np
cimport numpy as np
cimport cython

ctypedef np.float64_t DTYPE_f
ctypedef np.int64_t DTYPE_i

@cython.boundscheck(False)
@cython.wraparound(False)
@cython.nonecheck(False)


def eval(np.ndarray[DTYPE_f, ndim=2] kPos,
                np.ndarray[DTYPE_f, ndim=1] kAnchor,
                np.ndarray[DTYPE_i, ndim=1] link0,
                np.ndarray[DTYPE_i, ndim=1] link1,
                np.ndarray[DTYPE_f, ndim=1] w0,
                int cycles=100,
                precision=0.001,
                dampening=0.1,
                debug=False):


    cdef Py_ssize_t i, j, k, nKnots, nLinks
    cdef double w1
    cdef double f
    cdef double maxF

    nKnots = len(kPos)
    nLinks = len(link0)
    cdef np.ndarray[DTYPE_f, ndim=2] pos = np.array(kPos)
    cdef np.ndarray[DTYPE_f, ndim=2] F = np.zeros((nKnots,3))
    cdef np.ndarray[DTYPE_f, ndim=1] AB

    for i in range(cycles):
        F = np.zeros((nKnots,3))


        # Calculate forces
        for j in range(nLinks):
            AB = pos[link1[j]] - pos[link0[j]]
            w1 = sqrt(AB[0]**2 + AB[1]**2 + AB[2]**2)

            if w1 > 0:
                AB/=w1
                f = w1 - w0[j]
                F[link0[j]] += f * AB
                F[link1[j]] -= f * AB

        # Update point positions
        for j in range(nKnots):
            if kAnchor[j] < 1:
                w1 = sqrt(F[j][0]**2 + F[j][1]**2 + F[j][2]**2)

                if maxF < w1:
                    maxF = w1

                pos[j] = pos[j] + F[j] * dampening * (1-kAnchor[j])


        if maxF < precision:
            break


    # Debug info
    if debug:
        print 'Iterations: %s'%i
        print 'Max Force:  %s'%maxF


    return pos

Something is wrong, possibly with the test. Since only one knot is anchored I expect the whole grid to move 0.3 up. Why does it deform at all? — vnp, Feb 26 at 5:21
Not really a speed up but you should probably use __debug__ instead of debug. As in there is a predefined constant __debug__ made specifically for the purpose that you use debug in... — Dair, Feb 26 at 5:51
@vnp if you walk through the algorithm it makes sense. At the very first iteration, only the links connected to the anchored knot have changed (they're longer than initial length), this causes a force pulling the connected knots, the anchored one doesn't change so it's immediate neighbors get pulled in. 2nd iteration the same knots get pulled again towards the anchor but this time they get resistance from the links to their neighbors who didn't move last cycle. Those neighbors also get a force pulling them inward... so on and so forth until one of the end conditions are met. — Fnord, Feb 26 at 7:52
Before we dive into an algorithm, let's check the physics. In the final configuration I see the force pulling a (0,1) knot up, and I don't see the force which would counteract it. — vnp, Feb 26 at 19:31
@vpn re: first comment: Imagine this grid example is square cloth. If you pinch the center with your fingers and move up, the whole cloth won't remain square. In the code example, the system's 1st iteration sees the links (11,12),(12,13),(7,12),(12,17) as being longer than initial, this will generate forces on knots (7,11,13,17) to move towards the center. 2nd iteration all links connected these knots will differ from their initial lengths, and will in turn generate forces on all connected knots. Each iteration produces forces until all links are (almost) back at their original lengths. — Fnord, Feb 27 at 18:23

Jaime · Accepted Answer · 2016-02-26 08:40:06Z

up vote 3 down vote accepted

You should profile your code, to figure out what exactly is it that is slowing your code down. It's hard to tell without some actual measurements, but my bet is on your calls to np.add.at and np.minus.at, as the .at() method is notoriously (very) slow. For some operations there is really no alternative, but for addition/subtraction you can use np.bincount. The transformed code would look something like:

knots, ndim = kPos.shape
...
for dim in range(ndim):
    # Apply force vectors on each dimension of each knot
    F[:, dim] = (np.bincount(link0, weights=f[:, dim], minlength=knots) -
                 np.bincount(link1, weights=f[:, dim], minlength=knots))

As I said, you'll need to test it, but I wouldn't be surprised if this little change makes your code 5-10x faster.

edited Feb 26 at 8:40

answered Feb 26 at 6:24

Jaime

4,517618

Wow, this is amazing! The speed difference is difficult to quantize but its definitely significant. To test i set the max iteration count to a large number, so the only end condition is that the highest applied force be below my threshold. On a small simple example (10x10 grid, 100 knots, 180 links) the difference is 0.17s vs 0.59 = ~2.88 times faster. But on a larger grid (50x50 grid, 2500 knots, 4900 links) it goes to 23.89s vs 3.29 = >7x faster. This is definitely a big boost! Do you have any recommendations for profiling (i'm using cProfile)... also fyi F[dim] should be F[:,dim]. – Fnord Feb 26 at 8:06

Oops! Fixed the assignment index mistake, thanks. And unfortunately no, I don't know of any good profiling tools, sorry. – Jaime Feb 26 at 8:43

add a comment |

oliverpool · Answer 2 · 2016-02-26 10:38:36Z

up vote 0 down vote

To compute the link current lengths, maybe numpy.linalg.norm can be useful.

w1 = np.sqrt(inner1d(AB,AB)) # get link current lengths

Becomes (no sure about the axis value)

w1 = np.linalg.norm(AB, axis=0)

answered Feb 26 at 10:38

oliverpool

1,042114

1

Thanks for the input. Unfortunately np.linalg.norm is a bit slower than the square root of inner1d. – Fnord Feb 26 at 17:41

add a comment |

asked	2 months ago
viewed	91 times
active	2 months ago

current community

your communities

more stack exchange communities

Fast Python spring network solver

Question

2 Answers 2

Your Answer

Not the answer you're looking for? Browse other questions tagged python numpy cython or ask your own question.

Hot Network Questions

current community

your communities

more stack exchange communities

Fast Python spring network solver

Question

2 Answers 2

Your Answer

Sign up or log in

Post as a guest

Not the answer you're looking for? Browse other questions tagged python numpy cython or ask your own question.

Related

Hot Network Questions