Newest 'data-mining' Questions

0

votes

1 answer

64 views

Code for finding repeated entries with different data

project is the data frame. For the purpose of the code, HOUSE.NO is a column of the type character, and ...

beginner conditions r data-mining

asked Apr 14 at 8:13

Satwik Pasani

13317

3

votes

1 answer

100 views

Mining association rules in Java

Let \$I = \{ i_1, i_2, \dots, i_d \}\$ be the set of all possible items, and \$T = \{ t_1, t_2, \dots, t_N \}\$ be the (multi)set of all given transactions, where \$t_i \subseteq I\$ for all \$i \in ...

java algorithm data-mining

asked Apr 11 at 11:42

coderodde

6,4262740

2

votes

1 answer

48 views

Nested loops - Random Forest, multiple parameters

I'm writing code whose task is to grow Random Forest trees based on multiple parameters. In short: first, I declare a data frame in which model parameters and some stats will be saved. Second, ...

performance loop r machine-learning data-mining

asked Mar 14 at 13:32

kaksat

111

5

votes

2 answers

95 views

Document term matrix in Clojure

This is my very first foray into Clojure (I'm normally a Python-pushing data-type). I'm trying to create a simple term-document matrix as a vector of vectors, out of a vector of strings. For those ...

performance beginner functional-programming clojure data-mining

asked Mar 5 at 4:16

Paul Gowder

1485

4

votes

1 answer

81 views

Data analytics on static file of 50,000+ tweets

I'm trying to optimize the main loop portion of this code, as well as learn any "best practices" insights I can for all of the code. This script currently reads in one large file full of tweets (50MB ...

python performance pandas data-visualization data-mining

asked Jan 27 at 17:43

Daniel Brown

1234

1

vote

1 answer

4k views

Apriori algorithm for frequent itemset generation in Java

I have this algorithm for mining frequent itemsets from a database. In that problem, a person may acquire a list of products bought in a grocery store, and he/she wishes to find out which product ...

java algorithm data-mining

asked Sep 14 '15 at 13:39

coderodde

6,4262740

4

votes

3 answers

111 views

Analyze very large sets of engineering data from Excel files

I am an electrical power engineer with some programming skills. My boss asked me to make a program which could analyze very large data, make some calculations and give the result. The task looks like ...

python python-2.7 file-system excel data-mining

asked Sep 1 '15 at 6:15

Irakli Darchiashvili

213

2

votes

1 answer

139 views

File parser to extract data from text file

I am trying to extract the data from an input file and store it for plotting. I have tested this code on a few files of the same format. I am not sure if the code works correctly with a small change in ...

python parsing data-mining

asked Jul 22 '15 at 11:00

suhastheju

134

2

votes

0 answers

141 views

C# port of data mining algorithm much slower than reference implementation

I was trying to implement the algorithm specified in this research paper (please ignore the math, since it's irrelevant to the question). This algorithm is very basic in formal concept analysis. The ...

c# performance matrix clustering data-mining

asked Jul 22 '15 at 4:59

sisck vabrigas

162

3

votes

0 answers

88 views

Frequent subgraph mining program

I'm trying to make a programme that reads graphs from a .txt file, puts them in a vector, and finally puts the frequent closed graphs in another resulting file. ...

c++ performance graph boost data-mining

asked Jul 16 '15 at 22:39

Mohsenuss91

232

4

votes

1 answer

1k views

Implementation of KNN in R

I have implemented the K-Nearest Neighbor algorithm with Euclidean distance in R. It works fine but takes far more time than the library function (get.knn). Please point out the possibility ...

performance matrix r clustering data-mining

asked Jun 23 '15 at 4:01

user3224114

234

2

votes

1 answer

158 views

CSMR for large-scale text-processing

I'm working on a project for large-scale text-processing, which is a first implementation of the basic idea of CSMR. CSMR is an algorithm that measures the similarity between documents by calculating ...

java hadoop mapreduce data-mining

asked Oct 30 '14 at 21:38

IrishDog

162

5

votes

2 answers

3k views

AutoComplete program using the n-gram model

For my Advanced Data Mining class (undergrad) we were to design a program that would predict the next word a user is likely to type via automatic text classification using the n-gram model. The ...

java optimization data-mining

asked Apr 18 '14 at 10:08

user40915

2814

3

votes

1 answer

6k views

Apriori algorithm using Pandas

I want to optimize my Apriori algorithm for speed: ...

python algorithm numpy pandas data-mining

asked Dec 26 '13 at 7:02

user3084006

11614

5

votes

1 answer

4k views

Alternative to Python's Naive Bayes Classifier for Twitter Sentiment Mining

I am doing sentiment analysis on tweets. I have code that I developed from following an online tutorial (found here) and adding in some parts myself, which looks like this: ...

python performance beginner machine-learning data-mining

asked Aug 9 '13 at 23:28

Andrew Martin

279415
