Faster way to find maximum deviation?

Question

So, it was a part of my coding challenge and the last one, but I failed because it failed to yield a result in two seconds for some of sample inputs (out of six samples, three passed but three failed for not meeting the time constraint).

Basically, you are supposed to find the maximum deviation from given number of consecutive integers. There are two inputs; the first argument is an array of integers and the second argument is the number of consecutive integers. And you print the maximum deviation. For example, if you have [10, 1, 5, 2, 6, 3] and 3 as arguments, the output should be 9 since [10, 1, 5] would yield the maximum deviation of 9 (10-1).

My solution is the following. Can this be refactored to be faster?

def find_deviation(integers, range)
  max = 0
  (0..integers.size-range).each do |n|
    array = integers.slice(n, range)
    diff = array.max - array.min
    max = diff if diff > max
  end
  puts max
end

EDIT

So, I did some research, and Ruby's sort method is actually fast (using quicksort and thus would be O(n log n)). For overall complexity using sort, it should be O(n/m * n log n). So I implemented a couple of different implementations using sort.

And it was still running > 2 seconds. So, I wanted to see which part is taking the longest. And it turns out it's each_cons. For science, I also compared it with using slice method as well, and each_cons was a little faster.

require 'benchmark'

array1 = (1..100000).map { rand(2**23-1) + 1 }

def d_max(integers, range)
  max = 0
  integers.each_cons(range).each do |array|
    sorted = array.sort
    new_max = sorted[-1] - sorted[1]
    max = new_max if new_max > max
  end
end

def d_max2(integers, range)
  max = 0
  (0..integers.size-range).each do |n|
    sorted = integers.slice(n, range).sort
    diff = sorted[-1] - sorted[1]
    max = diff if diff > max
  end
end

def bm_each_cons(integers, range)
  integers.each_cons(range).each do |array|
  end
end

def bm_slice(integers, range)
  (0..integers.size-range).each do |n|
    integers.slice(n, range)
  end
end

Benchmark.bmbm do |x|
  [10, 100, 1000, 10000].each do |d|
    x.report("d_max, d = #{d}") { d_max(array1, d) }
    x.report("d_max2, d = #{d}") { d_max(array1, d) }
    x.report("each_cons, d = #{d}") { bm_each_cons(array1, d) }
    x.report("slice, d = #{d}") { bm_each_cons(array1, d) }
    x.report("sort, d = #{d}") { array = (1..d).map { rand(2**23-1) + 1 }; array.sort }
  end
end

And the results are (d is range),

                           user     system      total        real
d_max, d = 10          0.130000   0.000000   0.130000 (  0.124788)
d_max2, d = 10         0.120000   0.000000   0.120000 (  0.123573)
each_cons, d = 10      0.040000   0.000000   0.040000 (  0.034299)
slice, d = 10          0.030000   0.000000   0.030000 (  0.033859)
sort, d = 10           0.000000   0.000000   0.000000 (  0.000020)
d_max, d = 100         1.020000   0.020000   1.040000 (  1.046397)
d_max2, d = 100        1.010000   0.020000   1.030000 (  1.028441)
each_cons, d = 100     0.090000   0.000000   0.090000 (  0.099764)
slice, d = 100         0.090000   0.010000   0.100000 (  0.098405)
sort, d = 100          0.010000   0.000000   0.010000 (  0.000073)
d_max, d = 1000       12.960000   0.010000  12.970000 ( 12.973879)
d_max2, d = 1000      12.880000   0.000000  12.880000 ( 12.892298)
each_cons, d = 1000    0.300000   0.000000   0.300000 (  0.298419)
slice, d = 1000        0.310000   0.000000   0.310000 (  0.304827)
sort, d = 1000         0.000000   0.000000   0.000000 (  0.000573)
d_max, d = 10000     156.320000   0.080000 156.400000 (156.427606)
d_max2, d = 10000    155.530000   0.080000 155.610000 (155.614334)
each_cons, d = 10000   2.690000   0.000000   2.690000 (  2.694524)
slice, d = 10000       2.890000   0.000000   2.890000 (  2.889468)
sort, d = 10000        0.000000   0.000000   0.000000 (  0.006876)

So, the real question is how to operate on subsets of array fast. Even in @tokland's suggestion, you would still need to operate on all subsets of the given array, and it would be the slowest....

Is v necessarily sorted? If not, could you update your question with an unsorted example? — 200_success♦, Feb 16 at 8:35
@Yang, why haven't you answered the two questions above? They were asked more than 16 hours ago. Please do so with an edit, not a comment. — Cary Swoveland, Feb 17 at 3:35
Sorry! Somehow, I wasn't getting emails for questions even though I marked to track this question. @200_success, no, it's not sorted. I will update the example. — yangtheman, Feb 17 at 8:30
There's a discrepancy between your explanation and your code. You wrote that the maximum deviation for [10,1,5] is 5, because 10 - 5 = 5. However you compute the difference between the max and the min, which would be 10 - 1 = 9, and I think that's what you're really looking for. — user846250, Feb 19 at 9:44

14 revs · Answer 1 · 2014-02-23 23:11:14Z

up vote 10 down vote

Some notes:

find_deviation(v, d): Try to write more meaningful names for variables. Specially, I'd give a plurar name to v, since it's a collection.
max = 0, each, inline if: All of this denote you think in imperative terms. Some notes on functional programming with Ruby.

I'd write, with a functional approach:

def find_deviation(values, m)
  values.each_cons(m).map { |xs| xs.max - xs.min }.max
end

Now, this function has exactly the same time complexity than yours (though it may be faster or slower depending on how Ruby works). The complexity is: len(values) = n -> O(n*m). Note that you can use Enumerable#minmax to avoid one traversal, but it's still the same complexity.

To make it faster here's an idea, even though it's not trivial to implement: use a structure with O(log n) times for insertion/deletion/max/min (a binary search tree is ideal for this) to hold values of the sliding window, this way the overall complexity is O(n*log m). I guess some of the tests have big values of m so the trivial approach O(n*m) is too slow.

[edit] Just for fun, I wrote a O(n log m) implementation, it's pretty fast (sorry, in Haskell, I found no BST gem for Ruby):

deviation :: (Num a, Ord a) => [a] -> Int -> a 
deviation [] _ = 0
deviation xs m = maximum $ [Tree.maximum w - Tree.minimum w | w <- windows]
  where
    windows = scanl shift (Tree.fromList $ take m xs) (zip xs (drop m xs))
    shift window (old, new) = Tree.insert (Tree.delete window old) new

edited Feb 23 at 23:11

community wiki

14 revs
tokland

Yeah, I don't think you can get away from having to examine all elements in the array. I don't know if implementing my own way of finding max and min would be faster than using Ruby's min and max method. I guess the key is not creating subset of arrays, but to operate on the given array (thus using binary search on sliding window). Let me see if I can test this hypothesis. – yangtheman Feb 17 at 8:56

I did some research of my own and updated my question. What do you think? – yangtheman Mar 3 at 20:16

"you would still need to operate on all subsets of the given array, and it would be the slowest" -> no, in my second suggestion you'd use trees, not arrays, so sorting/inclusion/removal operations would be logarithmic (nearly linear for all practical purposes). – tokland Mar 3 at 21:30

I see. Would it work even if given argument is an array? I don't know Haskell, but if given argument is an array, there is no other way around having to work with subsets of the given array? – yangtheman Mar 3 at 22:06

No, arrays may have some operations O(1), but most definitely not all three you need here. Check for example for Python: wiki.python.org/moin/TimeComplexity#list – tokland Mar 3 at 22:23

add comment

asked	4 months ago
viewed	155 times
active	4 months ago

current community

your communities

more stack exchange communities

Faster way to find maximum deviation?

1 Answer

Your Answer

Not the answer you're looking for? Browse other questions tagged performance ruby array or ask your own question.

Hot Network Questions

current community

your communities

more stack exchange communities

Faster way to find maximum deviation?

1 Answer

Your Answer

Sign up or log in

Post as a guest

Not the answer you're looking for? Browse other questions tagged performance ruby array or ask your own question.

Related

Hot Network Questions