2
votes
1 answer
64 views

Strategy for scraping web pages, maximizing information gathered

Here's the problem: Users register for a site and can pick one of 8 job categories, or choose to skip this step. I want to classify the users who've skipped that step into job categories, based on ...
1
vote
1 answer
38 views

ML on house areas. two-component mixtures. SVM?

I am trying to self-learn ML and came across this problem. Help from more experienced people in the field would be much appreciated! Suppose I have three vectors with areas for house compartments such ...
1
vote
0 answers
34 views

Python NLTK: supervised learning for classifying unlabelled data, no labelled data available

I'm trying to extract time-based information from text, for which, as far as I know, labelled data doesn't exist. The goal is to take sentences and extract information on when, for example, a task is ...
0
votes
1 answer
32 views

(Python Scipy) How to flatten a csr_matrix and append it to another csr_matrix?

I am representing each XML document as a feature matrix in csr_matrix format. Now that I have around 3000 XML documents, I have a list of csr_matrices. I want to flatten each of these matrices to ...
3
votes
3 answers
82 views

Which algorithms/concepts should I dig into for author prediction?

I have been working on something that will try to figure out the author of a column by using my own data set. I'm planning to use the mlpy Python library. It has good documentation (about 100 pages ...
0
votes
1 answer
35 views

How to use NLTK BigramAssocMeasures.chi_sq

I have a list of words, and I want to calculate the relatedness of two words by considering their co-occurrences. From a paper I found that it can be calculated using Pearson's chi-square test. Also I found ...
1
vote
0 answers
51 views

Orange information score maximum value, in the context of Bayesian and tree classifiers

I am working with the Orange package and have written the following code based on the tutorials available: import orange,orngTest,orngStat,orngTree,Orange bayes = orange.BayesLearner() ...
0
votes
0 answers
51 views

Only integer arrays with one element can be converted to an index error - using facerec in Python

I get an error when using facerec by bytefish. Here is my code: # -*- coding:Latin-1 -*- from facerec.feature import Fisherfaces from facerec.distance import EuclideanDistance from ...
0
votes
0 answers
44 views

LIBLINEAR vs. LinearSVM from Orange

I used LinearSVM - which is a wrapper around LIBLINEAR - and noticed big differences between the results of the wrapper and the pure implementation. The results are up to 10% higher for LinearSVM. ...
1
vote
3 answers
119 views

Alternative to support vector machine classifier in Python?

I have to compare 155 image feature vectors. Every feature vector has 5 features. My images are divided into 10 classes. Unfortunately I need at least 100 images per class to use ...
2
votes
2 answers
111 views

How to predict a continuous value (time) from text documents? [closed]

I have about 3000 text documents, each associated with the duration of time for which the document was "interesting". So let's say document 1 has 300 lines of text with content, which led to a duration of ...
0
votes
0 answers
57 views

Learning and using augmented Bayes classifiers in Python

I'm trying to use a forest (or tree) augmented Bayes classifier in Python, first learning it and then using it for classification. (I'd love to use incremental learning from incomplete data, but I ...
0
votes
1 answer
91 views

sklearn logistic regression with unbalanced classes

I'm solving a classification problem with sklearn's logistic regression in Python. My problem is a general/generic one. I have a dataset with two classes/results (positive/negative or 1/0), but the ...
2
votes
3 answers
206 views

Machine Learning Email Prioritization - Python

I have been working on a Python-coded priority email inbox, with the ultimate aim of using a machine learning algorithm to label (or classify) a selection of emails as either important or ...
1
vote
1 answer
148 views

Implement K Neighbors Classifier in scikit-learn with 3 features per object

I would like to implement a KNeighborsClassifier with the scikit-learn module (http://scikit-learn.org/dev/modules/generated/sklearn.neighbors.KNeighborsClassifier.html). I retrieve from my image ...
