Encapsulating checking the distance
You are checking whether dst < max
and whether dst < min
twice with the same consequences. By encapsulating this in a short function, you can avoid redundancies and make your code more clear. Assuming that min<max
, this function would could look like this:
def check_dst(i):
dst = d_f['Dst'][i]
if dst < min:
print 'out of range'
return dst < max
As the above function only takes an index as an argument, it would have to be defined inside of find_storms_dst
. This has more advantages in the following steps. Even if the following steps will reduce the need for using this function twice, it still makes your code more readable (and facilitates list filtering).
Moreover, it seems likely to me that you want to do something else if dst < min
, e.g., raising an error or at least a warning and check_dst
should not return True
.
Filtering first
Your code seems to do two things:
By unnecessarily intertwining those aspects, you are complicating your code. So, let’s first find the indices that comply with your criteria and call them protostorms
. With the above definition, this is pretty straightforward:
protostorms = filter(check_dst, range(len(d_f.index)))
Compiling the proto-storms
Now, all that remains to do is to compile the list of storms from the proto-storms. An straightforward way to do this would be popping items from your protostorms list successively like this:
storms = []
while protostorms:
i = protostorms.pop(0)
s = Storm(i, d_f.index[i], d_f['Dst'][i])
while protostorms and protostorms[0]==i+1:
i = protostorms.pop(0)
s.log(i, d_f.index[i], dst)
storms.append(s)
return storms
Result
Taking all of the above together, your new code would look like this:
def find_storms_dst(d_f, max, min):
def check_dst(i):
dst = d_f['Dst'][i]
if dst < min:
print 'out of range'
return dst < max
protostorms = filter(check_dst, range(len(d_f.index)))
storms = []
while protostorms:
i = protostorms.pop(0)
s = Storm(i, d_f.index[i], d_f['Dst'][i])
while protostorms and protostorms[0]==i+1:
i = protostorms.pop(0)
s.log(i, d_f.index[i], dst)
storms.append(s)
return storms
You could encapsulate the last loop, but I will leave it like this for now.
Restructuring d_f
and Storm
Another issue I noticed is that your d_f
object seems to be structured in a cumbersome manner. To get a distance, you first have to get an index from d_f.index
(by the way: Shouldn’t this be named d_f.indices
?), which you then have to feed into d_f['Dst']
. Also Storm
and log
take multiple properties from some element of d_f
, which could be streamlined.
If you can, I recommend changing d_f
such that it is an iterable of objects – let’s call them events –, which are accepted as an input by Storm
and log
and have an index and a distance as properties. Your code could then look like something along the line of the following:
def find_storms_dst(d_f, max, min):
def check_dst(event):
if event.dst < min:
print 'out of range'
return event.dst < max
def adjacent(storm, protostorm):
return storm.last_index() == protostorm.index
protostorms = filter(check_dst, d_f)
storms = []
while protostorms:
s = Storm(protostorms.pop(0))
while protostorms and adjacent(s, protostorms[0]):
s.log(protostorms.pop(0))
storms.append(s)
return storms