Your function is too large to digest in one sitting, and without running on my machine. So I'll start with a few quick observations.
The idea of putting a plot function (axscatter
) in an inner most loop just feels wrong. It makes more sense to build arrays of values, and call the plot just once per subplot. But if the number of values that you plot per call is large, and the number of iterations is small, that isn't so big of a deal. It's hard to know just how many iterations this code performs.
The use of vectorize
on functions like lambda x: (x - mean) / std
is unnecessary. vectorize
does not speed up code; it just makes it easier to apply scalar operations to arrays. x-mean
is already a numpy array operation, as is ()/std
.
The use of list comprehension y = [normppf(i/length) for i in range(length)]
is, I suspect, unnecessary. But I don't know exactly what normppf
does. Does it accept an array, or just scalar values?
Organizing this as one big function makes it hard to test and optimize pieces. I'd put each level of iteration in a separate function. I should be able to test the data manipulation functions without plotting, and be able to plot without generating a new set of data.
I like to see, at a glance, where the code is massaging the data, and where it is plotting. And with plotting it is nice to separate out the actions that actualactually plot from those that tweak the appearance. I also like to clearly identify when it's calling imported code. You do that with np....
but not other functions.