Computer Science Stack Exchange is a question and answer site for students, researchers and practitioners of computer science. It's 100% free, no registration required.

Sign up
Here's how it works:
  1. Anybody can ask a question
  2. Anybody can answer
  3. The best answers are voted up and rise to the top

While doing the second code kata (which asks you to implement a binary search algorithm five times, each time with a different method), I've come up with a slightly different solution which works as follows:

If i have a sorted array of lenght 100 and I see its starting field contains the number 200 and its ending field contains the number 400, me, as a maths studying human, would be likely to start searching around field 35 if I was searching the number 270, and not the field 50 like in a normal binary search algorithm.

Then, if the number on field 35 of the array is 270, 35 is the index I was searching for.

If that isn't the case I can compare the number I got (say 280) and repeat the operation taking the lower part of the array (so I have 35 fields with the starting field containing 200 and the ending field containing 280) if the number I found is greater than what I'm searching for, or the upper part of the array (say I got 260: now I have 65 indexes, the first one containing 260 and the final one containing 400. Orientatively, I would head torward index 4 of this sub array, which is index 39 of the entire array) if the number I got is smaller than the number I'm searching for.

The question is: can this algorithm be considered a binary search algorithm? If not, has it got its own name?

share|cite|improve this question
1  
Whether it's binary search or not seems to be purely a matter of opinion. Essentially, the only answer you can give is either "Yeah, it's close enough to binary search to call it binary search" or "No, it's not." Argument ensues. – David Richerby yesterday
up vote 17 down vote accepted

I would not call this a binary search.

It is clearly similar to binary search and it's natural to see it as a refinement of binary search. However it has significantly different algorithm complexity characteristics, Interpolation Search has expected run time of O(log(log(n)) assuming the data is uniformly distributed, however it pays for this by having O(n) worst case run time.

I prefer to say "The worst case run time of binary search is O(log(n))" rather than "Depending on the choice of bounding elements the worst case run time of binary search is O(log(n))". This means that I can not classify Interpolation search as a binary search algorithm.

share|cite|improve this answer
    
Presumably if you break out of interpolation search when it's going badly, you can retain O(log n) worst case and O(log log n) on sufficiently linear data. My guess is that something like "if I haven't found the target after log n attempts then switch to binary search" will work, but I'm too lazy to prove it. There will of course be a class of killer inputs on which this takes basically twice as long as a binary search. – Steve Jessop 8 hours ago
    
That killer-input idea is interesting. What if instead of allowing killer inputs to negatively affect the search (i.e. by splitting near the end of an array) we limit/trim the "splittable range" to the 2nd third of the array or similar. That would have a worst case log3(n) but still enjoy a log(log) best case. – Andy 6 hours ago
    
@SteveJessop Remember that asymthotic complexity is not the complete picture. O(log n) is very fast. In addition the binary search does very little work in each loop. So already the problem for Interpolation search is that you need very long input to make up for the fact that you do more work on each loop. You suggestion adds more work to that. If I was unable to accept O(n) for data that was not uniform I suspect the best solution is to go for a pure Binary search, rather than some hybrid approach. – Taemyr 5 hours ago

Yes, this is known as Interpolation Search. With some caveats (depending on your computational model and the distribution of the data) its expected running time is $O(\log \log n)$, better than binary search.

share|cite|improve this answer
    
Cool. Now the question is if I can use it for the code kata, but it's my problem lol. I'm finding it more complicated than the binary search though so why not. – user6245072 yesterday
    
I discovered this one time when writing code to index a log file a few year's back. I also discovered that for my data alternating steps between interpolation and binary slice was better than either option on its own. I'm not sure if that has a name, or is a known effect. – Neil Slater 11 hours ago
    
@NeilSlater hedged interpolation search perhaps? – Steve Cox 11 hours ago
    
@SteveCox: I just searched that term and found nothing. Decided to ask that as a new question: cs.stackexchange.com/questions/59750/… – Neil Slater 11 hours ago

Your Answer

 
discard

By posting your answer, you agree to the privacy policy and terms of service.

Not the answer you're looking for? Browse other questions tagged or ask your own question.