I have a binary file of alternating uint64 time stamps and uint16 channel numbers. I read it in with the following line:

    clicks = np.fromfile(filename, dtype=[('time','u8'),('channel','u2')])

This works well and is fast enough. Now I want to go through the array and replace each time value with the time difference to the last 'click' seen on channel 7 (the so-called gate clicks). The array is sorted by time. In C I would do this with a simple for loop over the array (and this works extremely fast). When I implement this in Python I get a data rate of only 2 MB/s. The best solution I came up with looks like this:

    import numpy as np

    # Indices of the channel-7 (gate) clicks.
    gate_ind = np.flatnonzero(clicks['channel'] == 7)

    # For each pair of consecutive gate clicks, make the time stamps from the
    # first gate up to (not including) the next gate relative to the first gate.
    # Clicks before the first gate and from the last gate onward are untouched.
    for start, end in zip(gate_ind[:-1], gate_ind[1:]):
        start_time = clicks['time'][start]
        # clicks['time'] is a view, so this updates clicks in place.
        clicks['time'][start:end] -= start_time

This gives a data rate of about 4 MB/s.
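
To make the snippets above runnable, a small test file with the same record layout can be generated first. This is only a sketch: the file name `clicks.bin`, the temporary directory and the ten sample records (the same values as the demo in the answer) are assumptions for illustration, not real measurement data.

```python
import os
import tempfile

import numpy as np

# Record layout from the question: uint64 time stamp followed by a uint16 channel.
click_dtype = np.dtype([('time', 'u8'), ('channel', 'u2')])

# Hypothetical sample data: gate clicks (channel 7) at times 0, 4 and 9.
sample = np.array(
    [(0, 7), (1, 0), (2, 0), (3, 0), (4, 7),
     (5, 0), (6, 0), (7, 0), (8, 0), (9, 7)],
    dtype=click_dtype,
)

# Write the records as raw bytes to a file in a temporary directory (assumed name).
filename = os.path.join(tempfile.mkdtemp(), 'clicks.bin')
sample.tofile(filename)

# Read the file back exactly as in the question.
clicks = np.fromfile(filename, dtype=click_dtype)
```

Each record occupies 10 bytes on disk because the structured dtype is packed without padding, so `np.fromfile` can read the records back with the same dtype.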

Can you provide enough context so that we can run your code? That is, add import statements etc. so that the extract becomes a runnable problem, and provide us with example data to test it on. – Gareth Rees Mar 31 at 16:06

1 Answer

You could use numpy.digitize to group the data and vectorize the loop. Demo:

>>> clicks
array([(0L, 7), (1L, 0), (2L, 0), (3L, 0), (4L, 7), (5L, 0), (6L, 0),
       (7L, 0), (8L, 0), (9L, 7)],
      dtype=[('time', '<u8'), ('channel', '<u2')])
>>> bins = clicks['time'][clicks['channel']==7]
>>> bins
array([0, 4, 9], dtype=uint64)
>>> ind = np.digitize(clicks['time'], bins) - 1
>>> ind
array([0, 0, 0, 0, 1, 1, 1, 1, 1, 2])
>>> bins[ind]
array([0, 0, 0, 0, 4, 4, 4, 4, 4, 9], dtype=uint64)
>>> clicks['time'] - bins[ind]
array([0, 1, 2, 3, 0, 1, 2, 3, 4, 0], dtype=uint64)
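
The demo above can be wrapped into a function, roughly as sketched below. The names `time_since_last_gate` and `gate_channel` are made up for this illustration. The handling of clicks that occur before the first gate click is also an assumption: they are left unchanged here, because `ind` is -1 for them and `bins[-1]` would otherwise silently pick the last gate.

```python
import numpy as np


def time_since_last_gate(clicks, gate_channel=7):
    """Return time stamps relative to the most recent gate click.

    Hypothetical helper: `clicks` is a structured array with 'time' and
    'channel' fields, sorted by time.
    """
    times = clicks['time']
    gate_times = times[clicks['channel'] == gate_channel]

    # np.digitize returns i with gate_times[i-1] <= t < gate_times[i], so
    # i - 1 is the index of the most recent gate (-1 if there is none yet).
    ind = np.digitize(times, gate_times) - 1

    # Assumption: clicks before the first gate keep their absolute time.
    out = times.copy()
    has_gate = ind >= 0
    out[has_gate] -= gate_times[ind[has_gate]]
    return out


# Same data as in the demo above.
demo = np.array(
    [(0, 7), (1, 0), (2, 0), (3, 0), (4, 7),
     (5, 0), (6, 0), (7, 0), (8, 0), (9, 7)],
    dtype=[('time', 'u8'), ('channel', 'u2')],
)
relative = time_since_last_gate(demo)
```

Unlike the loop in the question, this also rebases the clicks from the last gate click onward, and it returns a new array instead of modifying `clicks` in place; assign the result back to `clicks['time']` if in-place behaviour is wanted.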