Skip to main content

Your function is too large to digest in one sitting, and without running on my machine. So I'll start with a few quick observations.

The idea of putting a plot function (axscatter) in an inner most loop just feels wrong. It makes more sense to build arrays of values, and call the plot just once per subplot. But if the number of values that you plot per call is large, and the number of iterations is small, that isn't so big of a deal. It's hard to know just how many iterations this code performs.

The use of vectorize on functions like lambda x: (x - mean) / std is unnecessary. vectorize does not speed up code; it just makes it easier to apply scalar operations to arrays. x-mean is already a numpy array operation, as is ()/std.

The use of list comprehension y = [normppf(i/length) for i in range(length)] is, I suspect, unnecessary. But I don't know exactly what normppf does. Does it accept an array, or just scalar values?

Organizing this as one big function makes it hard to test and optimize pieces. I'd put each level of iteration in a separate function. I should be able to test the data manipulation functions without plotting, and be able to plot without generating a new set of data.

I like to see, at a glance, where the code is massaging the data, and where it is plotting. And with plotting it is nice to separate out the actions that actualactually plot from those that tweak the appearance. I also like to clearly identify when it's calling imported code. You do that with np.... but not other functions.

Your function is too large to digest in one sitting, and without running on my machine. So I'll start with a few quick observations.

The idea of putting a plot function (axscatter) in an inner most loop just feels wrong. It makes more sense to build arrays of values, and call the plot just once per subplot. But if the number of values that you plot per call is large, and the number of iterations is small, that isn't so big of a deal. It's hard to just how many iterations this code performs.

The use of vectorize on functions like lambda x: (x - mean) / std is unnecessary. vectorize does not speed up code; it just makes it easier to apply scalar operations to arrays. x-mean is already a numpy array operation, as is ()/std.

The use of list comprehension y = [normppf(i/length) for i in range(length)] is, I suspect, unnecessary. But I don't know exactly what normppf does. Does it accept an array, or just scalar values?

Organizing this as one big function makes it hard to test and optimize pieces. I'd put each level of iteration in a separate function. I should be able to test the data manipulation functions without plotting, and be able to plot without generating a new set of data.

I like to see, at a glance, where the code is massaging the data, and where it is plotting. And with plotting it is nice to separate out the actions that actual plot from those that tweak the appearance. I also like to clearly when it's calling imported code. You do that with np.... but not other functions.

Your function is too large to digest in one sitting, and without running on my machine. So I'll start with a few quick observations.

The idea of putting a plot function (axscatter) in an inner most loop just feels wrong. It makes more sense to build arrays of values, and call the plot just once per subplot. But if the number of values that you plot per call is large, and the number of iterations is small, that isn't so big of a deal. It's hard to know just how many iterations this code performs.

The use of vectorize on functions like lambda x: (x - mean) / std is unnecessary. vectorize does not speed up code; it just makes it easier to apply scalar operations to arrays. x-mean is already a numpy array operation, as is ()/std.

The use of list comprehension y = [normppf(i/length) for i in range(length)] is, I suspect, unnecessary. But I don't know exactly what normppf does. Does it accept an array, or just scalar values?

Organizing this as one big function makes it hard to test and optimize pieces. I'd put each level of iteration in a separate function. I should be able to test the data manipulation functions without plotting, and be able to plot without generating a new set of data.

I like to see, at a glance, where the code is massaging the data, and where it is plotting. And with plotting it is nice to separate out the actions that actually plot from those that tweak the appearance. I also like to clearly identify when it's calling imported code. You do that with np.... but not other functions.

Source Link
hpaulj
  • 1.6k
  • 1
  • 9
  • 16

Your function is too large to digest in one sitting, and without running on my machine. So I'll start with a few quick observations.

The idea of putting a plot function (axscatter) in an inner most loop just feels wrong. It makes more sense to build arrays of values, and call the plot just once per subplot. But if the number of values that you plot per call is large, and the number of iterations is small, that isn't so big of a deal. It's hard to just how many iterations this code performs.

The use of vectorize on functions like lambda x: (x - mean) / std is unnecessary. vectorize does not speed up code; it just makes it easier to apply scalar operations to arrays. x-mean is already a numpy array operation, as is ()/std.

The use of list comprehension y = [normppf(i/length) for i in range(length)] is, I suspect, unnecessary. But I don't know exactly what normppf does. Does it accept an array, or just scalar values?

Organizing this as one big function makes it hard to test and optimize pieces. I'd put each level of iteration in a separate function. I should be able to test the data manipulation functions without plotting, and be able to plot without generating a new set of data.

I like to see, at a glance, where the code is massaging the data, and where it is plotting. And with plotting it is nice to separate out the actions that actual plot from those that tweak the appearance. I also like to clearly when it's calling imported code. You do that with np.... but not other functions.