Tagged Questions
In machine learning and statistical classification, a random forest is an ensemble classifier that consists of many decision trees. It outputs the class that is the mode of the classes output by the individual trees, in other words, the class predicted most often.
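As a rough illustration of the "mode of the classes" idea in the tag description, here is a minimal sketch of the voting step only. The function name `majority_vote` and the sample labels are made up for illustration; real libraries such as randomForest or scikit-learn handle voting internally and may break ties or weight votes differently.

```python
from collections import Counter


def majority_vote(tree_predictions):
    """Return the most frequent class label among per-tree predictions."""
    if not tree_predictions:
        raise ValueError("need at least one tree prediction")
    # most_common(1) yields [(label, count)] for the top label; labels with
    # equal counts keep the order in which they were first seen.
    label, _count = Counter(tree_predictions).most_common(1)[0]
    return label


# Hypothetical predictions from five trees for a single sample.
votes = ["spam", "ham", "spam", "spam", "ham"]
print(majority_vote(votes))  # spam
```

The forest's prediction for a sample is simply whichever label the largest number of trees agree on.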
4
votes
0answers
147 views
R: how to use long vectors with randomForest?
One of the new features of R 3.0.0 was the introduction of long vectors. However, .C() and .Fortran() do not accept long vector inputs. On R-bloggers I find:
This is a precaution as it is very ...
3
votes
0answers
74 views
What is the correct order of the prior vector in fitensemble?
When using MATLAB's fitensemble to train a classifier, I can specify the parameter prior as well as the parameter classnames.
Does the order of the elements in both vectors have to be the same? And what is the ...
3
votes
0answers
1k views
How to weight classes in a RandomForest implementation
I am working on 3D point identification using the RandomForest method from scikit. One of the issues I keep running into is that certain classes are present more often than other classes. This means ...
3
votes
0answers
333 views
Issues when using randomForest in caret with ROC as optimization metric
I'm having an issue when constructing random forest models using caret. I have a dataset of about 46k rows and 10 columns (one of which is the optimization target). From this dataset, I'm trying to ...
3
votes
0answers
168 views
Random Forest: mismatch between %IncMSE and %NodePurity
I have performed a random forest analysis of 100,000 classification trees on a rather small dataset (i.e. 28 obs. of 11 variables).
I then made a plot of the variable importance
In the resulting ...
2
votes
0answers
44 views
r random forest error - type of predictors in new data do not match
I am trying to use the quantile regression forest function in R (quantregForest), which is built on the randomForest package. I am getting a type mismatch error that I can't quite figure out.
I train the ...
2
votes
0answers
56 views
NAs in rasters and randomForest::predict()
New here, please let me know if you need more info.
My goal: I am using Rehfeldt climate data and eBird presence/absence data to produce niche models using Random Forest models.
My problem: I want ...
2
votes
0answers
61 views
R Temporary File being created
So here's what's happening. We currently run an R program in the cloud and import a package (randomForest) specifically. When we invoke the command (through php), I can see this temporary file ...
2
votes
0answers
637 views
randomforest.r predict() function flags up missing data
Hi I am encountering an error using the predict() function in the randomForest.r package.
Here is my error:
> m<-predict(mdl,QdataTestX)
Error in predict.randomForest(mdl, QdataTestX) :
...
1
vote
0answers
41 views
Highly imbalanced data on C5.0 tree model
I have an imbalanced dataset with only 87 target events "F" out of 496,978 observations. Since I would like to see a rule/tree, I chose to use tree models. I have been following the code in "Applied ...
1
vote
0answers
14 views
explicit and implicit model specification in randomForest leads to different results
I am using a simple data set extracted from Cars93 in the MASS package in R. I am running a randomForest on this simple data set predicting origin (usa or non-usa) from four other predictors. If I ...
1
vote
0answers
126 views
Using OpenCV Random Forest for Regression
I have previously used Random Forest for a classification task, setting the params using the example here as a guide. It works perfectly. However, now I want to solve a regression problem.
I kind of have ...
1
vote
0answers
52 views
Scikit learn + Random forest - features of single trees
I have a very specific question regarding random forests and their implementation in scikit.
I constructed a forest, and prediction works just fine so far. However, I need to know which particular ...
1
vote
0answers
70 views
Random forests: weighting individual observations when resampling
I'm currently using a random forest on a nationally representative dataset with probability weights incorporated for each observation, with the hope that I can use these weights in the bootstrapping ...
1
vote
0answers
107 views
Applying genetic algorithm in Random Forest Model in R
I am new to R and I want to apply a genetic algorithm to a random forest model. I am using the GA package in R. My code is
library(GA)
library(randomForest)
library(Hmisc)
...
1
vote
0answers
95 views
Weight response with sampsize for unbalanced data in randomForest
I am new to machine learning and R.
I tried to fit some models including trees, boosted trees, random forests, ada boosting, svm, and logistic regression with R.
In my case, probability that the ...
1
vote
0answers
115 views
Shark Random Forest vs Weka - slow and low accuracy issue
I wanted to get a much faster random forest classifier than the one from Weka, so I just tried Shark (I can't use a commercial one like wiseRF). I know there is an alternative RF classifier on Weka ...
1
vote
0answers
703 views
Scikit Learn - ValueError: Array contains NaN or infinity
There are no NaNs in my dataset, I have checked thoroughly. Any reason why I keep getting this error when trying to fit my classifier? Some of the numbers in the data set are rather large and some ...
1
vote
0answers
400 views
Why does Weka RandomForest prediction differ from validation?
I just started to use Weka several weeks ago for a land cover classification of remotely sensed data. I'm no Data Mining expert, but until now, everything worked properly. By the way, I'm using Weka ...
1
vote
0answers
319 views
ROC curve using random forest data doesn't look right
I am trying to plot an ROC curve using random forest data with:
mdl <- randomForest(QdataTrainX, QdataTrainY)
m<-predict(mdl,QdataTestX)
OOB.x <- predict (mdl,QdataTrainX,type="prob");
...
1
vote
0answers
210 views
parallel randomForest with different results using doSNOW
I thought I found a way to make a reproducible foreach loop with doSNOW with the following code
library(foreach)
library(doSNOW)
library(parallel)
ncores <- 2
cl <- makeCluster(ncores)
...
1
vote
0answers
211 views
how to calculate the confidence level for random forest regression model in R
I'm using the Random Forest (RF) package in R to predict the distances between proteins (regression model of RF) for homology modeling purposes, and I obtained quite good results. ...
1
vote
0answers
124 views
A Neverending cforest
How can I decouple the time cforest/ctree takes to construct a tree from the number of columns in the data?
I thought the option mtry could be used to do just that, i.e. the help says
number of ...
1
vote
0answers
165 views
R's randomForest() function error - any way I can get more info?
I'm getting the error message that "Type of predictors in new data do not match that of the training data."
This confuses me, since I am able to get the same data sets working under rpart and ctree. ...
1
vote
0answers
222 views
report random forest results
This is a question with respect to the output of Random Forest in R.
I understand what the gini, impurity, and mean accuracy plots represent. I have a large number of different response ...
1
vote
0answers
238 views
Generate data for clustering
I want to test my Random Forest clustering with some artificial data. I wanted to generate a dataset with strong dependencies and some noise.
I have 2 attributes, A1 and A2 (both binary). The class is ...
1
vote
0answers
389 views
Matlab TreeBagger Cost argument not working as it works with similar function fitensemble
The cost matrix of my TreeBagger class and fitensemble (Bag method) are both [0 8;1 0] for binary classification. The confusion matrix on fitensemble shows that the classification tends to turn in the ...
1
vote
0answers
491 views
Gini Impurity, growing random trees in opencv
Goal: To add offset-impurity to the split decision of growing trees in openCV.
Currently in OpenCV random trees, the split is made as follows:
if( !priors )
{
int L = 0, R = n1;
for( i = ...
0
votes
0answers
7 views
Visualizing OpenCV decision trees in C++?
I know this is possible in Python with scikit-learn, but I am trying to figure out how to do this in C++ using OpenCV. I'm using random forests specifically.
0
votes
0answers
11 views
Understanding the Random Forest output of a mahout program
I constructed a Random Forest using the BuildForest utility in Mahout. But I seem to be at a loss to get stats on individual attributes (weightage | entropy | whatever). How do I make sense of the .seq ...
0
votes
0answers
29 views
trying to use package bootstrap to run a jackknife on my Random Forest model
I'm having trouble trying to figure out the following: I am running Random Forest for classification of habitat use and have GPS data from 17 animals. My data frame depicts different habitat ...
0
votes
0answers
25 views
“Non-conformable arguments” error with Random Forest in R
I am trying to make a simple estimate of the error of my Random Forest model in R (using package party). However, I get the error Error in w %*% response@predict_trafo : non-conformable arguments when ...
0
votes
0answers
36 views
sklearn random forest: oob score too low?
I was searching for applications for random forests, and I found the following knowledge competition on Kaggle:
https://www.kaggle.com/c/forest-cover-type-prediction.
Following the advice at
...
0
votes
0answers
16 views
RDotnet randomForest
I am trying to use the randomForest library with RDotNet, and this code throws a parsing error, although it works fine in RGui. Any help?
engine.Evaluate("library(randomForest");
engine.Evaluate("df1 <- ...
0
votes
0answers
20 views
How to use or translate a random forest model built using bigRF package in randomForest package?
I have a random forest model built using the bigrfc() function of the bigrf package in R. I would like to use that model with the prediction function of randomForest package (the ...
0
votes
0answers
17 views
Maximizing clusters for aggregated data with attributes
I have some measures and some attributes from a business database
I want to see if the data has some well defined clusters but the challenge is that the data is stored in an aggregated fashion in a ...
0
votes
0answers
10 views
Distribution used in gbm
My question is a little more generic and not specific to a technique per se.
First: what is the difference between GBM and Random Forest, and which one is better?
Second: when I try to run GBM using ...
0
votes
0answers
59 views
R Random Forest prediction not working
I'm new to Random Forests in R, and I'm trying to make a prediction. I have built a Random Forest model using the following code, which works fine
library(randomForest)
RF_model = ...
0
votes
0answers
45 views
Get variable importance in cforest (party package)
I am using cforest from the party package in order to get variable importance plots like the ones found here on pg 4.
I am coming across the error:
## Error in ...
0
votes
0answers
13 views
Python: Exporting a trained Random Forest classifier (.pkl) to an Android device
I have a trained Random Forest classifier in Python. I want to export it to an Android device so that I can classify incoming data streams. I have saved it to a .pkl file but cannot find anything ...
0
votes
0answers
16 views
How to read a .seq file in Ubuntu 12.04 through command line/program?
I need to read a .seq file from the terminal in Ubuntu and split it into multiple files depending on its contents. How can I do this?
E.g.: a file abc.seq is to be read, then depending on its contents I ...
0
votes
0answers
57 views
Issue with randomForest & long vectors
I am running random forest on a data set with 8 numeric columns (the predictors), and 1 factor (the outcome). There are 1.2M rows in the dataset. When I do:
randomForest(outcome.f ~ a + b + c + d + ...
0
votes
0answers
32 views
Obtaining out-of-bag errors with scikit-learn's RandomForestClassifier
I'm trying to implement out-of-bag samples so that I won't have to partition my data into a training set and test set for random forest. Looking around, it seems that RandomForestClassifier takes in a ...
0
votes
0answers
49 views
Improving the speed of predicting new data using a Random Forest Model
I am generating species distribution models using Random Forest. These models attempt to predict the probability of occurrence by a species, conditioned on various environmental attributes. For most ...
0
votes
0answers
12 views
error.cv in rfcv function
In the help file, an example using the iris dataset is given.
Can anyone please explain what the sapply function does in the error.cv step below?
`result <- replicate(5, rfcv(myiris, iris$Species), ...
0
votes
0answers
70 views
randomForest Error: NA not permitted in predictors (but no NAs in data)
So I am attempting to run the 'genie3' algorithm (ref: http://homepages.inf.ed.ac.uk/vhuynht/software.html) in R which uses the 'randomForest' method.
I am running into the following Error:
> ...
0
votes
0answers
130 views
Random Forest with party package cannot handle categorical predictors with more than 4 levels
I am trying to run a random forest model using the party package. My response variable (10 levels) is a classification value for different lake types (interested what factors influence clustering of ...
0
votes
0answers
29 views
Default predictions different than predictions using training data for randomForest
All,
Here's a simple example of a random forest grown in R:
Y <- iris[, 5]
X <- iris[, 1:4]
fit <- randomForest(X, Y)
pred0 <- predict(fit)
pred1 <- predict(fit, newdata = X)
...
0
votes
0answers
28 views
'getTree': binary expansion of factor levels
I am using the randomForest package and am trying to analyze the trees in the forest with getTree().
"Details" for help(getTree) reads:
For categorical predictors, the splitting point is ...
0
votes
0answers
55 views
Random forest in R - other error measures in OOB sample
I am preparing a predictive model using the randomForest package in R. However, I would like the function to report an OOB error measure other than accuracy. In fact, I want to use the Gini coefficient (some ...