Stack Overflow Data Visualization Contest

We all know everyone loves pretty pictures, chock full of graph-y goodness.

You probably also know that about two months ago we started the Stack Overflow Machine Learning Contest, and that it’s now winding down. All models have been (or will shortly be) committed, and we’re starting to gather data for the final judgement.

What you may not have known about is the subsidiary Visualization Contest, which is looking for an interesting and informative way of making sense of the mountains of data in our data sets. You’re free to pull in any additional publicly available information from sources like the Data Explorer or API, but the data set put together for the machine learning contest is a good place to start.

Entries will be accepted through October 26th with voting ending November 1st. We’ll choose the most awesome of the top-voted entries based on how interesting and informative the visualization is, with bonus points for focusing on the subject of the machine learning contest.

So go out there, find a set of interesting statistics, gin up a cool picture and submit it to the…

Stack Overflow Visualization Contest


13 Comments

F’x Oct 9 2012

Where are the rules? Also, what are you looking for? This entry is pretty thin…

Ben Swayne Oct 9 2012

@F’x: I think they want you to “thicken” out the concepts!

These contests are a subtle way of saying “give us hundreds of hours of free community R&D, and we’ll reward the best of you with a moment in the spotlight”.

If you like the spotlight and have some free time, that’s pretty cool.

If they knew exactly what they wanted, they’d be coding it already. ;-)

Tomasz N. Oct 9 2012

Don’t forget: http://hewgill.com/~greg/stackoverflow/stack_overflow/stats.html

Kevin Montrose author Oct 9 2012

@F’x

There’s very little guidance because there’s very little guiding to do. We’re looking for a neat visualization of our data, so the rules are basically:

1. It must be visual
2. It must use our data
3. There is no rule 3

Unlike the Machine Learning Contest itself, you’re not restricted to a certain subset of our data as inputs. You’re free to use all the data from the API, Data Explorer, data dumps, etc.

We’d *like* to see something related to question closure (visualizing common features of closed questions, closure rates, and so on), but we don’t require it. If you’ve got some awesome idea in a different area, we’re more than happy to look at it.

Aurelio De Rosa Oct 9 2012

@Ben Swayne The statement “give us hundreds of hours of free community R&D, and we’ll reward the best of you with a moment in the spotlight” is a little bit evil, isn’t it?

Federico Oct 9 2012

This is so cool. We’ll definitely join the fun :)

Just learned about this today. 16 days to go, so gotta hurry to catch up.

MarkJ Oct 10 2012

@Ben Swayne @Aurelio There are cash prizes for these contests. Anyway, donating hundreds of hours of free R&D, with the reward being a moment in the spotlight, is the founding principle of Stack Overflow & Stack Exchange. It’s voluntary.

Pekka Oct 10 2012

So this is just about creating static visualizations? Or does there have to be a script/program that can create the visualization on demand based on new data?

Andrew Thompson Oct 10 2012

> Entries will be accepted through October 26th with voting ending November 1st.

The [Kaggle](http://www.kaggle.com/c/predict-closed-questions-on-stack-overflow/prospector) page actually states:
> Submissions will be enabled through October 26th, and voting will remain open until **October 1st.**

I am guessing this blog has the correct date?

Creotiv Oct 10 2012

A few things I don’t understand.
1) Why not use one of the open libs for visualization?
2) What is the need for this contest? It would be better to start an open project so that developers can work hand in hand.

Mifune Oct 10 2012

1) I didn’t see any rule against using something like Apache Flex or whatever “open” lib.
2) Would many bother?

ernie Oct 10 2012

The issue with visualizations generally isn’t plotting the data – the real challenge when you have lots of data is finding interesting things to plot (ideally in interesting/pretty ways).