I have tried supervised classification in ArcGIS.
Firstly I would say that it is not the best software for classification.
The way I did it, you create the training sites as points. Just create a point shapefile (or a geodatabase feature class), add an Integer field, click points over your image and assign each point its class number in that field. (I think you can also use a polygon shapefile.)

For signatures, go to ArcToolbox > Spatial Analyst Tools > Multivariate > Create Signatures. There, just enter your bands as the input rasters and your training points as the sample data, with the Integer field as the class field.
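
To make clear what a signature actually contains, here is a minimal Python sketch that computes a mean vector and a covariance matrix from the band values under the training points of one class. The function name and the pixel values are made up for illustration, and I am assuming the usual sample covariance (n - 1 in the denominator); I have not checked that this matches the tool's internals exactly.

```python
def make_signature(samples):
    """Return (mean vector, covariance matrix) for one class.

    samples: list of training pixels, each a list of band values.
    Assumption: sample covariance with an (n - 1) denominator.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two training pixels per class")
    bands = len(samples[0])
    # Per-band mean of the training pixels.
    mean = [sum(px[b] for px in samples) / n for b in range(bands)]
    # Band-by-band covariance matrix.
    cov = [
        [
            sum((px[i] - mean[i]) * (px[j] - mean[j]) for px in samples) / (n - 1)
            for j in range(bands)
        ]
        for i in range(bands)
    ]
    return mean, cov


# Hypothetical values of two bands under five training points of one class.
training_pixels = [[10, 20], [12, 24], [14, 22], [16, 28], [18, 26]]
mean, cov = make_signature(training_pixels)
print(mean)
print(cov)
```

With only a handful of points per class the covariance estimate is very rough, so it is worth clicking more training points than the bare minimum.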

ArcGIS doesn't show you the resulting signature file, but it is a plain ASCII file and you can look inside it, for example with Notepad++. For each class you will see something like this:
# Class ID Number of Cells Class Name
1 5
# Layers 1 2 3
# Means
74.20000 98.80000 69.60000
# Covariance
1 67.70000 95.30000 211.60000
2 95.30000 149.20000 354.40000
3 211.60000 354.40000 903.80000
# -------------------------------------------------------------------
If you wish, you can plot the Means manually, for example in Excel, to see whether the classes are well separated.
You can also build a dendrogram of your classes with Spatial Analyst Tools > Multivariate > Dendrogram. The output is a text file like this:
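
If you would rather not copy the means by hand, a small script can pull them out of the signature file. This is only a sketch: the layout is assumed from the excerpt above (a "# Class ID" header followed by a line starting with the class number, then a "# Means" header followed by one line of values), and the function name is my own. Real files may contain more than this.

```python
def parse_signature_means(text):
    """Return {class_id: [mean per band]} from signature file text.

    Assumes the layout seen in the excerpt: the line after a
    '# Class ID ...' header starts with the class number, and the
    line after '# Means' holds the per-band means.
    """
    means = {}
    class_id = None
    expect = None  # what the next data line is expected to contain
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#"):
            header = line.lstrip("#").strip().lower()
            if header.startswith("class id"):
                expect = "class"
            elif header.startswith("means"):
                expect = "means"
            else:
                expect = None
            continue
        if expect == "class":
            class_id = int(line.split()[0])
        elif expect == "means" and class_id is not None:
            means[class_id] = [float(v) for v in line.split()]
        expect = None
    return means


# The excerpt from this answer, pasted as a string.
SAMPLE = """\
# Class ID Number of Cells Class Name
1 5
# Layers 1 2 3
# Means
74.20000 98.80000 69.60000
# Covariance
1 67.70000 95.30000 211.60000
2 95.30000 149.20000 354.40000
3 211.60000 354.40000 903.80000
"""

print(parse_signature_means(SAMPLE))
```

To use it on a real file, read the file into a string first (for example with `open(path).read()`) and pass that in; the resulting dictionary is easy to dump to CSV for plotting.
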
Distances between Pairs of Combined Classes
(in the sequence of merging)

Remaining   Merged    Between-Class
Class       Class     Distance
-----------------------------------------
1           3          1.882214
1           4          2.706293
1           2         12.502329
-----------------------------------------

Dendrogram of c:\2tmp\creates_shallow.gsg

 C                                 DISTANCE
 L
 A
 S   0      1.4     2.8     4.2     5.6     6.9     8.3     9.7    11.1    12.5
 S   |-------|-------|-------|-------|-------|-------|-------|-------|-------|
   3 ---------|
              |----|
   1 ---------|    |--------------------------------------------------------|
                   |                                                        |
   4 --------------|                                                        |-
                                                                            |
   2 -----------------------------------------------------------------------|
     |-------|-------|-------|-------|-------|-------|-------|-------|-------|
     0      1.4     2.8     4.2     5.6     6.9     8.3     9.7    11.1    12.5
Then you run the actual classification with Multivariate > Maximum Likelihood Classification. There you have only two tunable options: the reject fraction (how many of the most uncertain pixels will remain unclassified) and the a priori probability weighting for the classes. You can also request a confidence raster as an additional output.
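
For intuition about what Maximum Likelihood Classification does with the signatures, here is a sketch of the textbook Gaussian maximum likelihood rule in plain Python: each class gets a score from its mean, covariance and prior, and the pixel goes to the class with the highest score. This is the general technique, not necessarily ArcGIS's exact implementation; the function names are mine, and the reject fraction and confidence raster are not modelled.

```python
import math


def solve_with_det(a, b):
    """Solve a*x = b by Gaussian elimination; return (x, det(a))."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented copy
    det = 1.0
    for col in range(n):
        # Partial pivoting: bring the largest entry of the column to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0.0:
            raise ValueError("singular covariance matrix")
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x, det


def discriminant(pixel, mean, cov, prior=1.0):
    """ln(prior) - 0.5*ln|cov| - 0.5*(squared Mahalanobis distance)."""
    d = [p - mu for p, mu in zip(pixel, mean)]
    y, det = solve_with_det(cov, d)  # y = inverse(cov) * d
    maha = sum(di * yi for di, yi in zip(d, y))
    return math.log(prior) - 0.5 * math.log(det) - 0.5 * maha


def classify(pixel, signatures, priors=None):
    """Return the class ID whose discriminant is highest for this pixel."""
    priors = priors or {}
    return max(
        signatures,
        key=lambda cid: discriminant(
            pixel, *signatures[cid], prior=priors.get(cid, 1.0)
        ),
    )


# Class 1 from the signature file excerpt in this answer.
MEAN_1 = [74.2, 98.8, 69.6]
COV_1 = [
    [67.7, 95.3, 211.6],
    [95.3, 149.2, 354.4],
    [211.6, 354.4, 903.8],
]
print(discriminant(MEAN_1, MEAN_1, COV_1))
```

`signatures` is a dictionary of `{class_id: (mean, cov)}`. Raising the prior of a class pushes borderline pixels toward it, which is what the probability weighting option controls.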

And the result looks like this:

For theoretical background, check the ArcGIS help pages on classification and on signature files.
Actually, after several attempts I switched to dedicated remote sensing software.