Methods and principles of building "computer systems that automatically improve with experience."
1
vote
1answer
17 views
Order of Support Vectors, and how to reduce them
I am working in an extremely memory-constrained environment, and the number of support vectors my MATLAB design is generating is just not something that scales. That led me to move to finding a way to ...
2
votes
1answer
41 views
What exactly is the equation for SVM classification of a new example?
I understand that in the case of logistic regression, we simply multiply our weights by the input example for classification. But what exactly is the equation that we calculate in the case of SVM to ...
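For reference, the usual kernel SVM decision rule is f(x) = sum_i (alpha_i * y_i) * K(s_i, x) + b, with the predicted class given by the sign of f(x). A minimal sketch in plain Python follows; the support vectors, dual coefficients, bias, and the RBF kernel with gamma = 0.5 are made-up illustrative values, not taken from the question.

```python
import math

def rbf_kernel(u, v, gamma=0.5):
    """RBF kernel exp(-gamma * ||u - v||^2); gamma is an assumed value."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * squared_distance)

def svm_decision(x, support_vectors, dual_coefs, bias, kernel=rbf_kernel):
    """Return f(x) = sum_i dual_coefs[i] * K(s_i, x) + bias.

    dual_coefs[i] stands for alpha_i * y_i (signed dual coefficient).
    """
    return sum(c * kernel(sv, x) for sv, c in zip(support_vectors, dual_coefs)) + bias

def svm_predict(x, support_vectors, dual_coefs, bias, kernel=rbf_kernel):
    """Classify x as +1 or -1 from the sign of the decision value."""
    score = svm_decision(x, support_vectors, dual_coefs, bias, kernel)
    return 1 if score >= 0 else -1

# Hypothetical trained model: one support vector per class.
support_vectors = [(0.0, 0.0), (2.0, 2.0)]
dual_coefs = [-1.0, 1.0]   # alpha_i * y_i
bias = 0.0

print(svm_predict((1.9, 2.1), support_vectors, dual_coefs, bias))
print(svm_predict((0.1, -0.2), support_vectors, dual_coefs, bias))
```

With a linear kernel the sum collapses to w·x + b, which is why the linear case looks like the logistic regression rule.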
1
vote
0answers
22 views
Why is the logistic regression cost function scaled by the number of examples?
I sometimes see that the cost function, along with the regularizer, is multiplied by 1/(2m), where m is the number of examples. When we are trying to find the minimum of the cost, why does scaling by this ...
1
vote
0answers
17 views
Learning parameters of non-parametric Bayesian models
I have a sample from a Chinese restaurant process that I want to model as a Pitman–Yor process. How do I determine the parameters of the Pitman–Yor model from the given sample?
For a Dirichlet process I would just use ...
1
vote
1answer
43 views
Calculating whether a disease is probable using Bayes rule?
I want to compute whether it is more probable that a patient has a disease or the contrary. If I am given the following information:
P(disease) = 0.008
P(+|disease) = 0.98
P(-|¬disease) = 0.97
To ...
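A worked sketch of the Bayes rule calculation with the three probabilities listed above. It assumes the patient has received a positive test result, which the truncated excerpt does not confirm; the variable names are illustrative.

```python
# Probabilities given in the question.
p_disease = 0.008
p_pos_given_disease = 0.98
p_neg_given_healthy = 0.97

# Complements that follow directly from the givens.
p_healthy = 1 - p_disease                      # P(no disease)
p_pos_given_healthy = 1 - p_neg_given_healthy  # false positive rate

# Unnormalized posteriors for a positive test (assumed observation).
joint_disease = p_pos_given_disease * p_disease
joint_healthy = p_pos_given_healthy * p_healthy

# Normalize to obtain P(disease | +).
posterior_disease = joint_disease / (joint_disease + joint_healthy)

print(f"P(+, disease)    = {joint_disease:.5f}")
print(f"P(+, no disease) = {joint_healthy:.5f}")
print(f"P(disease | +)   = {posterior_disease:.3f}")
```

Under that assumption the posterior is roughly 0.21, so "no disease" remains the more probable hypothesis even after a positive test, because the prior is so small.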
0
votes
0answers
28 views
Overfitting in K-NN and Decision Trees?
To avoid overfitting in K-NN, could you increase the value of K to reduce anomalous results? However, if the value of K is very large relative to the sample, would this also lead to over ...
2
votes
2answers
109 views
Why is SVM not as good as a decision tree on the same data?
I am new to machine learning and am trying to use scikit-learn (sklearn) to deal with a classification problem. Both DecisionTree and SVM can train a classifier for this problem.
I use ...
0
votes
0answers
14 views
How does R{MASS} lda function use MLEs to improve its result?
I am using the LDA function in the MASS package of R, which has the following specification:
...
1
vote
0answers
26 views
Activation value at output neuron equals 1, and the network doesn't learn anything
I'm implementing a typical neural network with 1 hidden layer. The network does well with the logic XOR and other simple problems, but fails miserably when encountering a (16-input, 20~30 hidden, 3 ...
1
vote
2answers
45 views
Highly unbalanced test data set and balanced training data in classification
I have a training set with about 3000 positive instances and 3000 negative instances. But my test data set is highly unbalanced. The positive set has only 50 instances and the negative set has 1500 ...
2
votes
0answers
24 views
What is Recurrent Reinforcement Learning?
I recently came across the term "Recurrent Reinforcement Learning". I understand what a "Recurrent Neural Network" is and what "Reinforcement Learning" is, but couldn't find much information about ...
5
votes
3answers
126 views
What does “degree of freedom” mean in neural networks?
Bishop's book "Pattern Recognition and Machine Learning" describes a technique for regularization in the context of neural networks. However, I don't understand a paragraph describing that ...
2
votes
2answers
44 views
Is it essential to do normalization for SVM and Random Forest?
Each dimension of my features has a different range of values. I want to know whether it is essential to normalize this dataset. Thanks
2
votes
0answers
23 views
Integrating prior estimates in the SimRank model
I am reading the SimRank paper by Jeh and Widom, which tries to find the similarity between objects based on the relationships between them. Effectively, SimRank is a measure that says "two objects are ...
0
votes
0answers
39 views
Energy estimation through machine learning
Greetings to everybody.
I have a dataset, which you can find here, containing many different characteristics of different houses, including their types of heating, or the number of adults and ...
1
vote
2answers
36 views
Neural network with skip-layer connections
I am interested in regression with neural networks.
Neural networks with zero hidden nodes + skip-layer connections are linear models.
What about the same neural nets but with hidden nodes?
I am ...
0
votes
0answers
19 views
Relationship between vector dimension and number of training samples for a binary classifier
I have some general questions about binary classifiers.
Is there any relationship between the sample vector dimension and the number of training samples for a classifier?
Is it good or bad to provide samples ...
0
votes
1answer
38 views
Clustering a dataset to get the most abnormal data [duplicate]
I have several datasets in R+, each containing two training and test sets. For example the following dataset. I want to train a classifier by using training data such that by applying the test data, I ...
1
vote
0answers
42 views
k-fold cross validation vs k times hold-out validation
I am facing the evaluation of a genetic programming algorithm. I am using the Proben1 cancer1 dataset to evaluate the models created by this algorithm. This dataset contains 699 samples, which is ...
5
votes
2answers
92 views
Is there overfitting in this modelling approach?
I recently was told that the process I followed (a component of an MS thesis) could be seen as overfitting. I am looking to get a better understanding of this and see if others agree.
The objective of ...
0
votes
1answer
13 views
In nonlinear binary classification problems, what is the optimal dimension to make the data linearly separable?
My question pertains to linear separability with hyperplanes in a support vector machine.
Is it possible to determine the optimal dimension into which I have to transform a training data set to make it ...
4
votes
1answer
82 views
Timeline of machine learning and data mining breakthroughs
Is there any timeline or historical overview of the most important breakthroughs in machine learning and data mining?
2
votes
0answers
25 views
Maximum number of classes for RandomForest multiclass estimation
I have searched the internet and the literature a lot on multiclass prediction to find out what is a realistic limit for the number of classes that can successfully be used for estimation when using a ...
1
vote
2answers
43 views
How can I use Bayes rule for this question given additional data?
I am required to use the Naive Bayes classifier to classify example 8, to see whether it is poisonous or not.
I obtained the following results:
p(x|Poisonous=Y) = 0.0267857 and
p(x|Poisonous=N) = ...
0
votes
0answers
19 views
Intersection kernel and distances between two histograms
The intersection kernel can be given as $\sum_i \min(x_i, y_i)$, where $x$ and $y$ are histograms.
If two histograms are completely different, the distance will be low.
If two histograms are similar, what ...
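A small sketch of the intersection kernel formula from the excerpt, in plain Python. The three histograms are made-up, normalized to sum to 1, and the function name is illustrative.

```python
def histogram_intersection(x, y):
    """Intersection kernel: sum over bins of min(x_i, y_i)."""
    if len(x) != len(y):
        raise ValueError("histograms must have the same number of bins")
    return sum(min(a, b) for a, b in zip(x, y))

# Hypothetical normalized histograms (each sums to 1).
h1 = [0.5, 0.5, 0.0, 0.0]
h2 = [0.0, 0.0, 0.5, 0.5]   # no overlap with h1
h3 = [0.4, 0.6, 0.0, 0.0]   # close to h1

print(histogram_intersection(h1, h2))  # no shared mass
print(histogram_intersection(h1, h3))  # most mass shared
print(histogram_intersection(h1, h1))  # identical histograms
```

The kernel value is a similarity rather than a distance: it is 0 for histograms with no overlapping mass and reaches the total mass (1 here) for identical normalized histograms. For normalized histograms, one common way to turn it into a distance is 1 minus the kernel value.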
0
votes
1answer
43 views
How do you Interpret RMSLE (Root Mean Squared Logarithmic Error)?
I've been doing a machine learning competition where they use RMSLE (Root Mean Squared Logarithmic Error) to evaluate the performance predicting the sale price of a category of equipment. The problem ...
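As a point of reference, RMSLE is commonly defined as the square root of the mean of (log(1 + predicted) - log(1 + actual))^2. A minimal sketch under that definition, with made-up prices; the competition's exact formula is not shown in the excerpt, so treat this as an assumption.

```python
import math

def rmsle(actual, predicted):
    """Root mean squared logarithmic error, using log(1 + value)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_log_errors = [
        (math.log1p(p) - math.log1p(a)) ** 2
        for a, p in zip(actual, predicted)
    ]
    return math.sqrt(sum(squared_log_errors) / len(squared_log_errors))

# Doubling a price costs about the same regardless of its magnitude.
print(rmsle([100], [200]))
print(rmsle([1000], [2000]))

# Same absolute miss of 50, but under-prediction is penalized more.
print(rmsle([100], [50]))
print(rmsle([100], [150]))
```

Because the error is measured on log values, it behaves like a relative (ratio) error: an RMSLE of e means predictions are typically off by a factor of about exp(e).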
-1
votes
0answers
19 views
Kernels, distances, Gram matrix, classification
Could you please explain something about kernels? As I understand it, it is a technique to map the feature space into a high-dimensional feature space where we could separate two classes by a linear ...
3
votes
2answers
54 views
In general how do you set K in K-NN?
As the title suggests, how should you set K in K-Nearest Neighbours?
Is it just that lower values of K are more susceptible to overfitting and larger values of K are likely to give a more ...
0
votes
0answers
23 views
Confusion related to L2 and L1 SVM
I am confused about L1 and L2 SVMs. I was reading this paper.
I am attaching a screenshot of the part I didn't understand.
This is the part whose derivation I didn't understand.
I ...