123 votes, 20 answers, 33k views

Python as a statistics workbench

Lots of people use a main tool like Excel or another spreadsheet, SPSS, STATA, or R for their statistics needs. They might turn to some specific package for very special needs, but a lot of things can ...
101 votes, 49 answers, 28k views

What is your favorite “data analysis” cartoon?

This is one of my favorites: One entry per answer. This is in the vein of the Stack Overflow question What’s your favorite “programmer” cartoon?. P.S. Do not hotlink the cartoon without the site's ...
99 votes, 128 answers, 20k views

Famous statistician quotes

What is your favorite statistician quote? This is community wiki, so please one quote per answer.
97 votes, 12 answers, 12k views

The Two Cultures: statistics vs. machine learning?

Last year, I read a blog post from Brendan O'Connor entitled "Statistics vs. Machine Learning, fight!" that discussed some of the differences between the two fields. Andrew Gelman responded to ...
95 votes, 18 answers, 31k views

Making sense of principal component analysis, eigenvectors & eigenvalues

In today's pattern recognition class my professor talked about PCA, eigenvectors & eigenvalues. I got the mathematics of it. If I'm asked to find eigenvalues etc. I'll do it correctly like a ...
83 votes, 8 answers, 13k views

Detecting a given face in a database of facial images

I'm working on a little project involving the faces of Twitter users via their profile pictures. A problem I've encountered is that after I filter out all but the images that are clear portrait ...
80 votes, 41 answers, 5k views

What are common statistical sins?

I'm a grad student in psychology, and as I pursue more and more independent studies in statistics, I am increasingly amazed by the inadequacy of my formal training. Both personal and second hand ...
77 votes, 16 answers, 19k views

Why square the difference instead of taking the absolute value in standard deviation?

In the definition of standard deviation, why do we have to square the difference from the mean to get the mean (E) and take the square root back at the end? Can't we just simply take the absolute ...
70 votes, 13 answers, 7k views

Bayesian and frequentist reasoning in plain English

How would you describe in plain English the characteristics that distinguish Bayesian from Frequentist reasoning?
66 votes, 10 answers, 13k views

Is there any reason to prefer the AIC or BIC over the other?

The AIC and BIC are both methods of assessing model fit penalized for the number of estimated parameters. As I understand it, BIC penalizes models more for free parameters than does AIC. Beyond a ...
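As a rough illustration of the penalty difference mentioned in this excerpt, here is a minimal sketch using the standard textbook definitions AIC = 2k - 2 ln L and BIC = k ln(n) - 2 ln L. The function names and the example numbers are hypothetical, chosen only for illustration; they are not taken from the question or its answers.

```python
import math


def aic(log_likelihood: float, k: int) -> float:
    """Akaike information criterion: each parameter costs 2."""
    return 2 * k - 2 * log_likelihood


def bic(log_likelihood: float, k: int, n: int) -> float:
    """Bayesian information criterion: each parameter costs ln(n)."""
    return k * math.log(n) - 2 * log_likelihood


if __name__ == "__main__":
    # Hypothetical fit: maximized log-likelihood of -100 with 3 parameters.
    log_lik, k = -100.0, 3
    for n in (5, 8, 100):
        print(n, aic(log_lik, k), round(bic(log_lik, k, n), 3))
```

Under these definitions the per-parameter penalty of BIC, ln(n), exceeds AIC's penalty of 2 once the sample size n is 8 or more, which is consistent with the excerpt's statement for all but very small samples.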
61 votes, 7 answers, 17k views

What is the difference between “likelihood” and “probability”?

The Wikipedia page claims that likelihood and probability are distinct concepts. In non-technical parlance, "likelihood" is usually a synonym for "probability," but in statistical usage there is a ...
59 votes, 16 answers, 4k views

How to annoy a statistical referee?

I recently asked a question regarding general principles around reviewing statistics in papers. What I would now like to ask, is what particularly irritates you when reviewing a paper, i.e. what's the ...
57 votes, 6 answers, 4k views

Is $R^2$ useful or dangerous?

I was skimming through some lecture notes by Cosma Shalizi (in particular, section 2.1.1 of the second lecture), and was reminded that you can get very low $R^2$ even when you have a completely linear ...
56 votes, 21 answers, 3k views

Locating freely available data samples

I've been working on a new method for analyzing and parsing datasets to identify and isolate subgroups of a population without foreknowledge of any subgroup's characteristics. While the method works ...
53 votes, 6 answers, 2k views

Explaining to laypeople why bootstrapping works

I recently used bootstrapping to estimate confidence intervals for a project. Someone who doesn't know much about statistics recently asked me to explain why bootstrapping works, i.e., why is it that ...
