The following function estimates the class-conditional probability densities with a Parzen window; I use it to classify the test data into several classes:
function pdfmx = pdfparzen(train, test, winwidth)
% computes probability density for all classes
% using Parzen window approximation
% train - train set; the first column contains label (assumed to be 1..classnb)
%         the training samples of each class serve as kernel centres
% test - test set (without labels)
% winwidth - width of the Parzen window (std. deviation of the Gaussian kernel)
% pdfmx - matrix of probability density for all classes
%         class with label idx is stored in pdfmx(:,idx)
    classnb = rows(unique(train(:,1)));
    pdfmx = ones(rows(test), classnb);
    for samp = 1:rows(test)
        for cl = 1:classnb
            % logical index of the training samples that belong to class cl
            clidx = train(:,1) == cl;
            indiv = zeros(sum(clidx), columns(test));
            for feat = 1:columns(test)
                % one Gaussian kernel per training sample, evaluated per feature
                indiv(:,feat) = normpdf(test(samp,feat), ...
                                        train(clidx, feat + 1), winwidth);
            end
            % product over features, then average over the kernels of the class
            pdfmx(samp,cl) = mean(prod(indiv,2));
        end
    end
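For reference, the density that pdfparzen appears to compute for one test sample and one class can be written as below. The symbols are my own notation, not taken from the code: N_c is the number of training samples in class c, d the number of features, h the window width (winwidth), x_j the j-th feature of the test sample, and t_ij the j-th feature of the i-th training sample of class c.

```latex
\hat{p}(\mathbf{x} \mid c)
  = \frac{1}{N_c} \sum_{i=1}^{N_c} \prod_{j=1}^{d}
    \frac{1}{h\sqrt{2\pi}}
    \exp\!\left( -\frac{\bigl(x_j - t^{(c)}_{ij}\bigr)^{2}}{2h^{2}} \right)
```

Because a single h is shared by every feature, the estimate depends on how the individual features are scaled.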
Matlab command line usage:
ercf_parzen = bayescls(train, test, @pdfparzen, 0.25 * ones(1,4), 0.1);
Parzen classification error:
0.35581
How can I reduce the classification error of this classifier?
Driver program:
function errcf = bayescls(train, test, hpdf, apriori, winwidth)
% Bayes classifier
% train - training set; the first column contains label
% test - test set; the first column contains label
% hpdf - handle to function used to compute probability density
% apriori - row vector of a priori probabilities for all classes
% winwidth - window width (just for Parzen window hpdf function)
    % class-conditional densities, one column per class
    clpdf = hpdf(train, test(:,2:end), winwidth);
    % weight each density by the a priori probability of its class
    clpr = clpdf .* repmat(apriori, rows(test), 1);
    % predicted label = column index of the largest weighted density
    [val, lab] = max(clpr, [], 2);
    % error coefficient = fraction of misclassified test samples
    errcf = mean(test(:,1) ~= lab);
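Written out, the decision rule and the error coefficient computed by bayescls look as follows. Again the notation is mine: P(c) is the entry of apriori for class c, M is the number of test samples, and y_m is the true label of test sample x_m.

```latex
\hat{c}(\mathbf{x}) = \arg\max_{c} \; \hat{p}(\mathbf{x} \mid c)\, P(c),
\qquad
\mathrm{errcf} = \frac{1}{M} \sum_{m=1}^{M}
  \mathbf{1}\bigl[\, \hat{c}(\mathbf{x}_m) \neq y_m \,\bigr]
```

With apriori = 0.25 * ones(1,4), as in the call above, all four classes have the same prior, so the prior does not affect which class wins the maximum.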