Tell me more ×
Mathematics Stack Exchange is a question and answer site for people studying math at any level and professionals in related fields. It's 100% free, no registration required.

Given some data, and a fixed number of bins (k)-- How can I design a Dynamic Programming algorithm that minimizes the largest difference between bin sizes?

In other words, with a set number of bins (k), I am looking to adjust the width to make them all as equal as possible.

I have read this paper, however, it doesn't really handle what I am trying to accomplish. http://arxiv.org/pdf/1207.5578v3.pdf

Can I somehow modify their algorithm for my purposes or is there another approach I should look at?

share|improve this question

1 Answer

EDIT: It looks like I misunderstood the question, so after the discussion settles down I'll delete this answer.


Hint: The key to designing this algorithm is observing where the bin boundaries lie. Start with the smallest possible bins that will cover the data with $k$ bins (i.e., the width of the data divided by k).

  • By increasing the width, can you move a point from a bin to a lighter bin?
  • By shifting the first bin's starting point lower, can you do the same?

These are the recursion choices you can make, and then use the techniques for other dynamic programs to write down an algorithm.

share|improve this answer
I don't follow. I thought for dynamic programming solutions you build an optimal solution at n and use that for n+1? Also, what do you mean by:"shifting the first bin lower" how can I shift a bin lower that starts at my first element? – quannabe Mar 9 at 9:41
All k of your bins are the same width, correct? – John Moeller Mar 9 at 9:56
Fixed k. No, they do not need to be equal width – quannabe Mar 9 at 10:01
Ah, ok, I misunderstood. It sounds as though you want close to, if not the same, number of points in each bin? If that's the case, you'll want to ignore my answer and look up "quantiles." If k = 4 that sounds like quartiles. I don't know that you'd need dynamic programming for that though. – John Moeller Mar 9 at 10:05
In fact, I think you can do this in $O(n \log k)$ by recursively running $k$-select. – John Moeller Mar 9 at 10:12

Your Answer

 
discard

By posting your answer, you agree to the privacy policy and terms of service.

Not the answer you're looking for? Browse other questions tagged or ask your own question.