A way of re-expressing data to make their values lie between 0 and 1 (or 0% and 100%).
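
One common way to do this is linear min-max rescaling. The snippet below is only a minimal sketch of that idea; the function name `min_max_normalize` and the convention of returning zeros for constant input are assumptions made for illustration, not part of the tag description.

```python
def min_max_normalize(values):
    """Rescale a sequence of numbers linearly onto the interval [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so the range is zero.
        # Returning zeros here is an arbitrary convention (assumption).
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# The smallest value maps to 0 and the largest to 1.
scaled = min_max_normalize([2, 4, 6, 10])
```

Multiplying the result by 100 gives the same values on a 0% to 100% scale.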


18 votes, 1 answer, 567 views

Random matrices with constraints on row and column length

I need to generate random non-square matrices with $R$ rows and $C$ columns, elements randomly distributed with mean = 0, and constrained such that the length (L2 norm) of each row is $1$ and the ...
15 votes, 3 answers, 12k views

What's the difference between Normalization and Standardization?

At work we were discussing this as my boss has never heard of normalization. In Linear Algebra, Normalization seems to refer to the dividing of a vector by its length. And in statistics, ...
10 votes, 2 answers, 2k views

“Normalizing” variables for SVD / PCA

Suppose we have $N$ measurable variables, $(a_1, a_2, \ldots, a_N)$, we do a number $M > N$ of measurements, and then wish to perform singular value decomposition on the results to find the axes of ...
8 votes, 3 answers, 1k views

Reason to normalize in euclidean distance measures in hierarchical clustering

Apparently, in hierarchical clustering in which the distance measure is Euclidean distance, the data must be first normalized or standardized to prevent the covariate with the highest variance from ...
7 votes, 7 answers, 3k views

How to represent an unbounded variable as number between 0 and 1

I want to represent a variable as a number between 0 and 1. The variable is a non-negative integer with no inherent bound. I map 0 to 0 but what can I map to 1 or numbers between 0 and 1? I could use ...
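
For illustration only (this is not taken from the question or its answers): two monotone maps that send 0 to 0 and approach 1 as the input grows are `x / (1 + x)` and `1 - exp(-x / scale)`. The function names and the `scale` parameter below are hypothetical.

```python
import math


def squash_ratio(x):
    """Map a non-negative number to [0, 1) via x / (1 + x)."""
    return x / (1.0 + x)


def squash_exp(x, scale=1.0):
    """Map a non-negative number to [0, 1) via 1 - exp(-x / scale).

    `scale` (an assumed tuning parameter) controls how quickly the
    output approaches 1.
    """
    return 1.0 - math.exp(-x / scale)


# Both maps send 0 to 0 and are increasing in x.
ratio_values = [squash_ratio(x) for x in (0, 1, 3)]
```

Neither map ever reaches 1 exactly in exact arithmetic, so whether this is acceptable depends on whether some finite value is required to map to 1.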
7 votes, 3 answers, 170 views

Sparsity-inducing regularization for stochastic matrices

It is well-known (e.g. in the field of compressive sensing) that the $L_1$ norm is "sparsity-inducing," in the sense that if we minimize the functional (for fixed matrix $A$ and vector $\vec{b}$) ...
6 votes, 2 answers, 3k views

Column-wise matrix normalization in R

I would like to perform column-wise normalization of a matrix in R. Given a matrix m, I want to normalize each column by dividing each element by the sum of the ...
6 votes, 1 answer, 134 views

Why is my replication of Silver & Dunlap 1987 not working out?

I'm trying to replicate Silver & Dunlap (1987). I'm just comparing averaging correlations or averaging z transform correlations and back transforming. I seem to not be replicating the asymmetry ...
6 votes, 3 answers, 384 views

How and why do normalization and feature scaling work?

I see that lots of machine learning algorithms work better with mean cancellation and covariance equalization. For example, Neural Networks tend to converge faster, and K-Means generally gives better ...
5 votes, 4 answers, 2k views

What are the primary differences between z-scores and t-scores, and are they both considered standard scores?

We are currently converting student test scores in this manner: ( ( ScaledScore - ScaledScore Mean ) / StdDeviation ) * 15 + 100. I was referring to this ...
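
The conversion described in this excerpt (standardize, then rescale to mean 100 and standard deviation 15) could be sketched roughly as follows. The function name `rescale_scores` is hypothetical, and the use of the population standard deviation (`pstdev`) rather than the sample one is an assumption, since the excerpt does not say which is used.

```python
from statistics import mean, pstdev


def rescale_scores(scores, target_mean=100.0, target_sd=15.0):
    """Convert raw scores to standard scores, then rescale them.

    Assumption: the population standard deviation is used; swap in
    statistics.stdev if the sample version is intended.
    """
    m = mean(scores)
    sd = pstdev(scores)
    if sd == 0:
        raise ValueError("scores have zero spread; z-scores are undefined")
    # (score - mean) / sd is the z-score; multiply and shift to the target scale.
    return [(s - m) / sd * target_sd + target_mean for s in scores]


rescaled = rescale_scores([40, 50, 60])
```

By construction, the rescaled values have mean 100 and (population) standard deviation 15, and a score equal to the original mean maps to exactly 100.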
5 votes, 2 answers, 387 views

Normalization vs. scaling

What is the difference between data 'Normalization' and data 'Scaling'? Till now I thought both terms refer to the same process, but now I realize there is something more that I don't know/understand. ...
5 votes, 1 answer, 805 views

Normalizing or detrending groups of samples

How do I detrend or normalize multiple series of data so that I can inter-compare between the series? Specifics below may not be appropriate for this forum. Please let me know and I can remove or ...
5 votes, 1 answer, 120 views

How can I devise a scoring system for a competition that is more fair than straight percentages?

I am trying to come up with a method for deciding the winner from among eight student groups competing for a prize. The raw data and corresponding percentages measure participation per group in a ...
5 votes, 1 answer, 175 views

When doing quadrat counting, how do you construct the quadrats?

I want to perform quadrat count analysis on several point processes (or one marked point process), to then apply some dimensionality reduction techniques. The marks are not identically distributed, ...
4 votes, 4 answers, 432 views

Ratio of Range to IQR vs. Coefficient of Variation — which is the more useful robust measure?

For a given set of data, spread is often calculated either as the standard deviation or as the IQR (inter-quartile range). Whereas a standard deviation is ...
