
I have a simple non-linear function y = x.^2, where x and y are n-dimensional vectors and the square is taken component-wise. I want to approximate y with a low-dimensional vector using an autoencoder in MATLAB. The problem is that the reconstructed y is distorted even when the low-dimensional space is set to n-1. My training data looks like this and here is a typical result reconstructed from the low-dimensional space. My MATLAB code is given below.

%% Training data
inputSize=100;
hiddenSize1 = 80;

epo=1000;
dataNum=6000;
rng(123);
y=rand(2,dataNum);
xTrain=zeros(inputSize,dataNum);
for i=1:dataNum
    xTrain(:,i)=linspace(y(1,i),y(2,i),inputSize).^2;
end

% Scale each feature to approximately [-1, 1] (fixed centre 0.5, per-feature range)
for i=1:inputSize
    meanX=0.5; %mean(xTrain(i,:));
    sd=max(xTrain(i,:))-min(xTrain(i,:));
    xTrain(i,:) = (xTrain(i,:)- meanX)./sd;
end

%% Training the first Autoencoder

% Create the network. 
autoenc1 = feedforwardnet(hiddenSize1);
autoenc1.trainFcn = 'trainscg';
autoenc1.trainParam.epochs = epo;

% Do not use process functions at the input or output
autoenc1.inputs{1}.processFcns = {};
autoenc1.outputs{2}.processFcns = {};

% Set the transfer function for both layers to tansig (hyperbolic tangent)
autoenc1.layers{1}.transferFcn = 'tansig';
autoenc1.layers{2}.transferFcn = 'tansig';

% Use all of the data for training
autoenc1.divideFcn = 'dividetrain';
autoenc1.performFcn = 'mae';
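% Aside (an assumption, not part of the original post): with the data scaled
% as above, a tansig output layer saturates near +/-1, and 'mae' gives weaker
% gradients on large errors than 'mse'. If reconstructions stay distorted,
% a linear output layer with squared error is worth trying:

```matlab
autoenc1.layers{2}.transferFcn = 'purelin';  % linear output, no saturation
autoenc1.performFcn = 'mse';                 % squared-error criterion
```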
%% Train the autoencoder
autoenc1 = train(autoenc1,xTrain,xTrain);
%%
% Create an empty network
autoEncoder = network;

% Set the number of inputs and layers
autoEncoder.numInputs = 1;
autoEncoder.numLayers = 1;

% Connect the 1st (and only) layer to the 1st input, and also connect the
% 1st layer to the output
autoEncoder.inputConnect(1,1) = 1;
autoEncoder.outputConnect = 1;

% Add a connection for a bias term to the first layer
autoEncoder.biasConnect = 1;

% Set the size of the input and the 1st layer
autoEncoder.inputs{1}.size = inputSize;
autoEncoder.layers{1}.size = hiddenSize1;

% Use the tansig (hyperbolic tangent) transfer function for the first layer
autoEncoder.layers{1}.transferFcn = 'tansig';

% Copy the weights and biases from the first layer of the trained
% autoencoder to this network
autoEncoder.IW{1,1} = autoenc1.IW{1,1};
autoEncoder.b{1,1} = autoenc1.b{1,1};


%%
% generate the features
feat1 = autoEncoder(xTrain);

%%
% Create an empty network
autoDecoder = network;

% Set the number of inputs and layers
autoDecoder.numInputs = 1;
autoDecoder.numLayers = 1;

% Connect the 1st (and only) layer to the 1st input, and also connect the
% 1st layer to the output
autoDecoder.inputConnect(1,1) = 1;
autoDecoder.outputConnect(1) = 1;

% Add a connection for a bias term to the first layer
autoDecoder.biasConnect(1) = 1;

% Set the size of the input and the 1st layer
autoDecoder.inputs{1}.size = hiddenSize1;
autoDecoder.layers{1}.size = inputSize;

% Use the tansig (hyperbolic tangent) transfer function for the first layer
autoDecoder.layers{1}.transferFcn = 'tansig';

% Copy the weights and biases from the first layer of the trained
% autoencoder to this network

autoDecoder.IW{1,1} = autoenc1.LW{2,1};
autoDecoder.b{1,1} = autoenc1.b{2,1};

%% Reconstruction
desired=xTrain(:,50);
input=feat1(:,50);
output = autoDecoder(input);

figure
plot(output)
hold on
plot(desired,'r')
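For comparison (an aside, assuming the Neural Network Toolbox R2015b or later is available), the built-in `trainAutoencoder`, `encode`, and `decode` functions perform the same encode-decode round trip without hand-assembling the encoder and decoder networks:

```matlab
% Train an autoencoder with the same hidden size and epoch budget,
% then encode to the low-dimensional features and decode back.
autoenc = trainAutoencoder(xTrain, hiddenSize1, 'MaxEpochs', epo);
feat = encode(autoenc, xTrain);    % hiddenSize1-by-dataNum features
xRec = decode(autoenc, feat);      % reconstruction, inputSize-by-dataNum
```

Note that `trainAutoencoder` applies its own input scaling, so results are not directly comparable to the hand-built version above without accounting for that.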

I'm not a MATLAB user, but your code suggests you have a standard shallow autoencoder. You can't really approximate a nonlinearity with a single autoencoder, because it won't do much better than a purely linear PCA reconstruction (I can provide more elaborate mathematical reasoning if you need it, though this is not math.stackexchange). You need to build a deep network that approximates your nonlinearity with several stacked layers. Beyond that, a plain autoencoder is a poor model choice (hardly anyone uses one in practice today) when denoising autoencoders are available, which tend to learn more useful representations by trying to reconstruct the clean input from a corrupted version of it. Try building a deep denoising autoencoder. This video introduces the concept of denoising autoencoders. That course has a video about deep denoising autoencoders as well.
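The denoising idea above can be sketched in the question's own style (a minimal sketch, assuming Gaussian input noise with standard deviation 0.05; the noise level is a hypothetical choice, not from the thread):

```matlab
% Denoising setup: corrupt the input, but keep the clean data as the target,
% so the network must learn to undo the corruption.
denNet = feedforwardnet(hiddenSize1);
denNet.trainFcn = 'trainscg';
xNoisy = xTrain + 0.05*randn(size(xTrain));  % corrupted input
denNet = train(denNet, xNoisy, xTrain);      % target is the clean signal
```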

Please provide the elaborate math answer, because by the universal approximation theorem a single hidden layer with a reasonable activation (such as a sigmoid) can approximate any continuous function on the hypercube to a given precision (although it might require many hidden units). Many layers are not about the ability to capture nonlinearity; they are about problems with the volume of solutions in the whole space of models – lejlot Jan 10 '16 at 12:46
@lejlot this will likely take some time, so I'll make the edit later. – Eli Korvigo Jan 10 '16 at 13:09
@EliKorvigo Making the network deep will help reduce the dimensionality of the encoded features, and using denoising autoencoders will give better generalization, but my problem is that I am not getting good results on the training data even with high-dimensional hidden layers. – Najeeb Jan 11 '16 at 4:08
