Unit testing to prove balanced tree

Question

I've just built a self-balancing tree (red-black) in Java (language should be irrelevant for this question though), and I'm trying to come up with a good means of testing that it's properly balanced.

I've tested all the basic tree operations, but I can't think of a way to test that it is indeed well and truly balanced. I've tried inserting a large dictionary of words, both pre-sorted and un-sorted. With a balanced tree, those should take roughly the same amount of time, but an unbalanced tree would take significantly longer on the already-sorted list. But I don't know how to go about testing for that in any reasonable, reproducible way. (I've tried doing millisecond tests on these, but there's no noticeable difference - probably because my source data is too small.)

Is there a better way to be sure that the tree is really balanced? Say, by looking at the tree after it's created and seeing how deep it goes? (That is, without modifying the tree itself by adding a depth field to each node, which is just wasteful if you don't need it for anything other than testing.)

Your source data is too small? Try inserting all 32-bit integers, sorted and unsorted. — U2EF1, Nov 11 at 2:26

Greg Hewgill · Accepted Answer · 2013-11-11 00:27:32Z

up vote 9 down vote accepted

One way to do this would be to create a method on your tree that measures the depth of the tree at a given node. You don't have to store the value, and if you use such a getDepth() method only for testing, then there's no extra overhead for normal tree operations. The getDepth() method would recursively traverse its child nodes and return the maximum depth found.

Once you have that, you can then check that your whole tree is balanced by recursing over each node of the tree, and verifying a condition something like:

Math.abs(getDepth(left) - getDepth(right)) <= 1

answered Nov 11 at 0:27

Greg Hewgill
5,51612330

1

As an addition to this answer: if you prevent direct access to the implementation (eg, by using a factory and an interface), you should feel free to add any useful method to the implementation class. Including methods designed solely for testing. – kdgregory Nov 11 at 1:36

Is the depth-difference of no greater than 1 at every node guaranteed for red-black? I know this would work for AVL trees, but I remember reading somewhere that other balanced trees don't necessarily have this property despite being relatively balanced. I'd think something like comparing the maximum depth to the log(n) would a more general solution, but I think there's some wiggle-room there as well... – Darrel Hoffman Nov 11 at 2:38

Well if you have a different criteria for what you consider "balanced", then use that. My answer is showing another way to test that your tree meets your requirements, other than measuring performance like you suggest (measuring performance only determines the "balanced-ness" indirectly and is not definitive). – Greg Hewgill Nov 11 at 2:39

Yeah, this probably is the way to go. I just wasn't sure if it would work for red-black. (I've done AVL trees a couple times in various languages, but this is my first time experimenting with other types of balanced trees.) – Darrel Hoffman Nov 11 at 2:45

1

The appropriate invariant of red-black tree is "Every path from a given node to any of its descendant leaves contains the same number of black nodes." And only that many red nodes, because red node may only have black children. – Jan Hudec Nov 11 at 8:36

show 1 more comment

Alexander Torstling · Answer 2 · 2013-12-10 17:31:33Z

up vote 0 down vote

If you want to measure time complexity for reads one idea is that you could have the tree issuing callbacks for traversals. This way you could measure the number of operations for a read, which would be a more reliable measurement of time complexity.

answered Dec 10 at 17:31

Alexander Torstling
70915

And also add complexity to every step of the operation. If it were just being used to test (and compare to something unbalanced, say), that might be okay, but you'd have to then disable that code before putting into production. – Darrel Hoffman Dec 10 at 21:58

Yes, it would be costly. A compromise is to use an internal counter, that should be cheaper. Direct statistics are as I said IMO the only really reliable way of measuring time complexity. But if you want to get fancy you can try ThreadMXBean.getThreadCPUTime() to measure it. – Alexander Torstling Dec 11 at 6:41

Well, as long as you can prove the tree is balanced at any given point, you know that any basic operations will operate in log time. So a solution which analyzes the current state of the tree and proves it's balanced should be good enough, and doesn't needlessly add time or space complexity when it's not being used for testing. (And is in fact what I've already done - this was a month ago.) – Darrel Hoffman Dec 11 at 16:10

Yes well I just wanted to add some pointers as to how you could go ahead with your original approach if you wanted to. That's why I started with "If you want to measure time complexity". For reference. – Alexander Torstling Dec 11 at 19:40

add comment

asked	1 month ago
viewed	273 times
active	7 days ago

Explore our sites

Unit testing to prove balanced tree

2 Answers

Your Answer

Not the answer you're looking for? Browse other questions tagged java unit-testing junit binary-tree or ask your own question.

Hot Network Questions

Explore our sites

Unit testing to prove balanced tree

2 Answers

Your Answer

Sign up or login

Post as a guest

Not the answer you're looking for? Browse other questions tagged java unit-testing junit binary-tree or ask your own question.

Related

Hot Network Questions