So, it was a part of my coding challenge and the last one, but I failed because it failed to yield a result in two seconds for some of sample inputs (out of six samples, three passed but three failed for not meeting the time constraint).
Basically, you are supposed to find the maximum deviation from given number of consecutive integers. There are two inputs; the first argument is an array of integers and the second argument is the number of consecutive integers. And you print the maximum deviation. For example, if you have [10, 1, 5, 2, 6, 3] and 3 as arguments, the output should be 9 since [10, 1, 5] would yield the maximum deviation of 9 (10-1).
My solution is the following. Can this be refactored to be faster?
def find_deviation(integers, range)
max = 0
(0..integers.size-range).each do |n|
array = integers.slice(n, range)
diff = array.max - array.min
max = diff if diff > max
end
puts max
end
EDIT
So, I did some research, and Ruby's sort method is actually fast (using quicksort and thus would be O(n log n)). For overall complexity using sort, it should be O(n/m * n log n). So I implemented a couple of different implementations using sort.
And it was still running > 2 seconds. So, I wanted to see which part is taking the longest. And it turns out it's each_cons
. For science, I also compared it with using slice
method as well, and each_cons
was a little faster.
require 'benchmark'
array1 = (1..100000).map { rand(2**23-1) + 1 }
def d_max(integers, range)
max = 0
integers.each_cons(range).each do |array|
sorted = array.sort
new_max = sorted[-1] - sorted[1]
max = new_max if new_max > max
end
end
def d_max2(integers, range)
max = 0
(0..integers.size-range).each do |n|
sorted = integers.slice(n, range).sort
diff = sorted[-1] - sorted[1]
max = diff if diff > max
end
end
def bm_each_cons(integers, range)
integers.each_cons(range).each do |array|
end
end
def bm_slice(integers, range)
(0..integers.size-range).each do |n|
integers.slice(n, range)
end
end
Benchmark.bmbm do |x|
[10, 100, 1000, 10000].each do |d|
x.report("d_max, d = #{d}") { d_max(array1, d) }
x.report("d_max2, d = #{d}") { d_max(array1, d) }
x.report("each_cons, d = #{d}") { bm_each_cons(array1, d) }
x.report("slice, d = #{d}") { bm_each_cons(array1, d) }
x.report("sort, d = #{d}") { array = (1..d).map { rand(2**23-1) + 1 }; array.sort }
end
end
And the results are (d is range),
user system total real
d_max, d = 10 0.130000 0.000000 0.130000 ( 0.124788)
d_max2, d = 10 0.120000 0.000000 0.120000 ( 0.123573)
each_cons, d = 10 0.040000 0.000000 0.040000 ( 0.034299)
slice, d = 10 0.030000 0.000000 0.030000 ( 0.033859)
sort, d = 10 0.000000 0.000000 0.000000 ( 0.000020)
d_max, d = 100 1.020000 0.020000 1.040000 ( 1.046397)
d_max2, d = 100 1.010000 0.020000 1.030000 ( 1.028441)
each_cons, d = 100 0.090000 0.000000 0.090000 ( 0.099764)
slice, d = 100 0.090000 0.010000 0.100000 ( 0.098405)
sort, d = 100 0.010000 0.000000 0.010000 ( 0.000073)
d_max, d = 1000 12.960000 0.010000 12.970000 ( 12.973879)
d_max2, d = 1000 12.880000 0.000000 12.880000 ( 12.892298)
each_cons, d = 1000 0.300000 0.000000 0.300000 ( 0.298419)
slice, d = 1000 0.310000 0.000000 0.310000 ( 0.304827)
sort, d = 1000 0.000000 0.000000 0.000000 ( 0.000573)
d_max, d = 10000 156.320000 0.080000 156.400000 (156.427606)
d_max2, d = 10000 155.530000 0.080000 155.610000 (155.614334)
each_cons, d = 10000 2.690000 0.000000 2.690000 ( 2.694524)
slice, d = 10000 2.890000 0.000000 2.890000 ( 2.889468)
sort, d = 10000 0.000000 0.000000 0.000000 ( 0.006876)
So, the real question is how to operate on subsets of array fast. Even in @tokland's suggestion, you would still need to operate on all subsets of the given array, and it would be the slowest....
v
necessarily sorted? If not, could you update your question with an unsorted example? – 200_success♦ Feb 16 at 8:35