Machine learning provides computer algorithms that automatically discover patterns in data and make intelligent decisions from them.
0
votes
1answer
26 views
csv loader and kNN algorithm in Java
I have applied the kNN algorithm for classifying handwritten digits. The digits are initially in 8*8 format and are stretched to form a 1*64 vector.
As it stands my code applies the kNN ...
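A minimal sketch of the idea in this excerpt, written in Python rather than the asker's Java and using only the standard library. The names `flatten` and `knn_predict`, the Euclidean distance, and the default `k=3` are assumptions for illustration, not the asker's code.

```python
import math
from collections import Counter


def flatten(image):
    """Stretch an 8x8 grid (a list of 8 rows) into one 64-element vector."""
    return [pixel for row in image for pixel in row]


def knn_predict(train_vectors, train_labels, query, k=3):
    """Return the majority label among the k training vectors nearest to query."""
    # Pair each training vector's Euclidean distance to the query with its label.
    scored = [
        (math.dist(vector, query), label)
        for vector, label in zip(train_vectors, train_labels)
    ]
    # Sort by distance only, so labels never need to be comparable.
    scored.sort(key=lambda pair: pair[0])
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]
```

Sorting every distance is the simplest approach; for large training sets a partial selection (for example `heapq.nsmallest`) avoids the full sort.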
1
vote
1answer
39 views
SKlearn automate data pre treatment
I want to make a simple wrapper for sklearn models. The idea is that the wrapper automatically takes care of factors (columns of type "object") replacing them with ...
1
vote
1answer
36 views
Simple chat bot written in Python
I'd like to know if I can improve the performance of my recent bot, and maybe apply some design patterns too.
So far, it warns a user who uses bad words (which are parsed from ...
0
votes
0answers
31 views
Naive-Bayes classifier, to be packaged in a function
I need to create a Naive-Bayes classifier. I have eight labels (S) stored in tumors object, and 20531 attributes (A), I have stored the P(S,A) in objects of name ...
4
votes
2answers
131 views
Tic-Tac-Toe machine learning
I recently started getting into machine learning and I wanted to write a "beginner program" which would learn to play Tic Tac Toe. This code was inspired by a different program I saw, meaning some ...
1
vote
0answers
41 views
Increase performance of Spark-job Collaborative Recommendation.
This is my first Spark Application. I am using "ALS.train" for training the model - Model Factorization. The total time that the Application takes is approx 45 mins.
Note: I think takeOrdered is the ...
1
vote
0answers
53 views
Predicting a win/loss given prior game stats
The project: create a model that can (somewhat) accurately predict a win/loss given prior game stats. Wanted a review of code in general, in particular my use of the ...
4
votes
0answers
71 views
RandomForest multi-class classification
Below is the code I have for a RandomForest multiclass-classification model. I am reading from a CSV file and doing various transformations as seen in the code. I ...
3
votes
0answers
55 views
Logistic regression with eigen
I am new to Eigen, and I implemented a logistic regression model with it. It works, but I don't know whether it is implemented efficiently.
...
0
votes
0answers
38 views
Perceptron with 2 output neurons and binary input
To the best of my knowledge I've implemented a functional version of the perceptron algorithm, but as my knowledge is not so developed as of yet I'm wondering if I've done it correctly or not.
What ...
2
votes
1answer
112 views
Cross validation of gradient boosting machines
I am fairly new to Python. I implemented a short cross-validation tool for gradient boosting methods.
...
6
votes
1answer
150 views
Calculate conditional probabilities and perform naive Bayes classification on a given data set
I wrote a class that I'm using to calculate conditional probabilities of a given distribution as well as perform naive Bayes classification. I'd like to get a code review done to tell me if there is ...
3
votes
1answer
144 views
1
vote
1answer
62 views
Latent Dirichlet Allocation in Python
I've recently finished writing a "simple-as-possible" LDA code in Python.
The theory from which I've developed my code can be found in the book Computer Vision by Simon Prince, free (courtesy of ...
3
votes
0answers
1k views
ID3 Decision Tree in python
I've been working my way through Pedro Domingos' machine learning course videos (although the course is not currently active). His first homework assignment starts with coding up a decision tree ...
3
votes
2answers
39 views
Asynchronous model fitting that allows termination in Python
The problem
When you work with Python interactively (e.g. in an IPython shell or notebook) and run a computationally intensive operation like fitting a machine-learning model that is implemented in a ...
4
votes
1answer
57 views
Random Forest Code Optimization
I am new to Python. I have built a random forest model in Python, but I think my code is not optimized. Please look at my code and tell me where I have deviated from best practices.
Overview about ...
2
votes
2answers
43 views
ML Retraining project
Tear me to shreds.
The class RandomForestRetrainer will be used to retrain a machine learning algorithm. It has functionality for taking in a directory containing malware or benignware files and ...
2
votes
1answer
47 views
Randomly learning a neuron to act as a signal counter
I have this small program for training an artificial neuron to act as a simple signal counter: my cell has four input wires (also called dendrites) and a single output wire (also called an axon). If at ...
3
votes
1answer
62 views
Batch Gradient Descent running too slowly
Following Data Science from Scratch by Joel Grus, I wrote a simple batch gradient descent solver in Python 2.7. I know this isn't the most efficient way to solve this problem, but this code should be ...
10
votes
3answers
1k views
Simple chat bot
I made a chat bot that learns to respond as you talk to it. But the way it speaks is strange, so if you have any ideas on how to make its responses more human, please say so.
Anyway, ...
6
votes
2answers
265 views
Simple Java Neural Network
I've written a toy neural network in Java. I ran it several million times with the same outputs with only the randomized weights changing from run to run. The average of all of the outputs is not 0.5, ...
5
votes
1answer
68 views
Designing a circuit of gates in Clojure and doing forward and backpropagation
I am reading Hacker's guide to Neural Networks. Since I am also learning Clojure, I tried to implement them in Clojure.
I would like feedback on what could be more idiomatic and better in the ...
2
votes
0answers
216 views
Discretization of continuous attributes for automatic classification [closed]
Background
In machine learning, it's common to encounter the problem of making a decision as to which discrete category an object belongs to based on a set of continuous attributes. For example, we ...
5
votes
0answers
347 views
Modified Taylor diagrams
There is a type of diagram summarizing how well predictions from numerical models fit expectations; one obvious use case is comparing machine-learning regression models. Modified Taylor diagrams are ...
3
votes
2answers
40 views
loopification of highly procedural, though fully functional, multiclass perceptron
I've implemented the multiclass perceptron in the one vs. all style.
I just thought about it and tried to implement it in the most basic way. I think it's correct though my f_measure is a bit low. ...
3
votes
1answer
104 views
Perceptron algorithm
This is the Perceptron algorithm; I wrote this implementation with my friend. It gets the job done, but it's quite dirty. Perhaps one of you stylish hackers might help me beautify this beast.
This ...
4
votes
1answer
66 views
Implementation of Logistic Regression
Are vectorized operations like these the most efficient way to do this in MATLAB? Any criticism of my code? Am I doing something wrong (I tested several times, and I think it works)? Notice that I use J ...
5
votes
1answer
291 views
ANFIS network based on Sugeno model I
I've been learning Common Lisp lately and I've implemented an ANFIS network based on Sugeno model I.
Network layout and details can be read in these slides by Adriano Oliveira Cruz.
I use sigmoid as the ...
4
votes
1answer
175 views
Generate and store hypernyms for all words in a hashmap
I have a system which reads in a clause in the form of a prolog "fact", i.e. 'is'('a sentence', 'this').. I want to generalize this up into higher-order classes and ...
4
votes
2answers
557 views
Inefficient hash map operation provokes OutOfMemory: Java heap space error
I know I can increase the size of the heap but that seems like a poor solution.
This program runs correctly on small files but when run on large data sets it crashes with the OutOfMemory: Java heap ...
0
votes
1answer
293 views
Simple k-means implementation using Python3 and Pandas
Is there anything I can improve? The distance function is Pearson correlation.
...
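A small sketch of how Pearson correlation can serve as a clustering distance. Using `1 - r` as the distance is an assumption here (the excerpt does not say how the correlation is turned into a distance), and `pearson_distance` is a hypothetical name, not the asker's function.

```python
import math


def pearson_distance(x, y):
    """Return 1 - r, where r is the Pearson correlation of x and y.

    Assumed convention: 0 means perfectly correlated, 2 means perfectly
    anti-correlated.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x)
    spread_y = sum((b - mean_y) ** 2 for b in y)
    if spread_x == 0 or spread_y == 0:
        # The correlation is undefined when either vector is constant.
        raise ValueError("Pearson correlation is undefined for a constant vector")
    return 1.0 - covariance / math.sqrt(spread_x * spread_y)
```

Note that `1 - r` is not a true metric, and averaging points to get a centroid does not minimize it the way it minimizes squared Euclidean distance, so k-means convergence guarantees may not carry over.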
0
votes
1answer
106 views
Linear regression with visualization
I have created a small script that:
Creates a lot of random points.
Runs a small brute force search to find a rect that has a low error, that is a good fit for the data.
Runs a linear regression on ...
2
votes
3answers
201 views
Refactor jaccard similarity the “Scala way”
I'm trying to pick Scala up. This is a simple heuristic that checks a similarity value between two sets. I've done this a million times in Java or Python. The function works, but I'm certain I am not ...
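For reference, a minimal sketch of the Jaccard similarity between two sets, shown in Python rather than Scala. The name `jaccard_similarity` and the choice to return 1.0 for two empty sets are assumptions made for this illustration.

```python
def jaccard_similarity(a, b):
    """Return |a ∩ b| / |a ∪ b| for two collections treated as sets."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Assumed convention: two empty sets are treated as identical.
        return 1.0
    return len(a & b) / len(union)
```

For example, `jaccard_similarity({1, 2, 3}, {2, 3, 4})` shares two of four distinct elements and so evaluates to 0.5.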
6
votes
1answer
472 views
K-nearest neighbours in C# for large number of dimensions
I'm implementing the K-nearest neighbours classification algorithm in C# for a training and testing set of about 20,000 samples and 25 dimensions.
There are only two classes, represented by ...
4
votes
0answers
146 views
Implementation of a new algorithm for sklearn
The Python library sklearn implements the SparsePCA algorithm.
I have written the code for another version of this algorithm that is much faster in some situations. I have not enough ...
-1
votes
1answer
496 views
Stochastic gradient descent squared loss
I have implemented stochastic gradient descent in MATLAB and would like to compare my results with another source, but the error I am getting is higher (I am using squared error). I am worried I am ...
1
vote
1answer
83 views
Compute logistic regression on tweet objects
Is my approach to naming variables and exception handling good? I would like to make this code more robust and maintainable. I need advice on exception handling, variable naming and comments.
...
5
votes
2answers
231 views
Defensive programming type-checking
I have issues with dynamically typed languages, and I tend to worry about type a lot.
NumPy behaves differently depending on whether something is a matrix, a plain ndarray, or a list. I didn't ...
11
votes
1answer
2k views
Clojure Neural Network
After reading this article about Neural Networks I was inspired to write my own implementation that allows for more than one hidden layer.
I am interested in how to make this code more idiomatic - ...
5
votes
1answer
1k views
Why does the LR on spark run so slowly?
Because MLlib does not support sparse input, I ran the following code, which supports the sparse input format, on Spark clusters. The settings are:
5 nodes, each node with 8 cores (all the ...
7
votes
2answers
888 views
Python neural network: arbitrary number of hidden nodes
I'm trying to write a neural network that only requires the user to specify the dimensionality of the network. Concretely, the user might define a network like this:
...
4
votes
1answer
4k views
Alternative to Python's Naive Bayes Classifier for Twitter Sentiment Mining
I am doing sentiment analysis on tweets. I have code that I developed from following an online tutorial (found here) and adding in some parts myself, which looks like this:
...
5
votes
1answer
3k views
Simple Neural Network in Java
I had an assignment some weeks ago that consisted of making a simple McCulloch-Pitts neural network. I ended up coding it in a pretty OO style (or the OO style I've been taught), and I felt that my ...
3
votes
2answers
371 views
C++ and STL - Machine Learning Problem
I would like to get some general comments on style and use of STL in particular. This is some code I wrote to do machine learning classification (logistic regression). Any suggestions would be very ...
8
votes
2answers
959 views
Using Viterbi algorithm to analyze sentences
I've probably done some pretty horrendous things here, but I'm throwing it out for people to give me some feedback that I can start using to immediately improve my Clojure coding style.
Additional ...
5
votes
1answer
262 views
Performing machine learning
I've written the code below to do some work on machine learning in R. I'm not overly happy with some bits of it, and I suspect I could improve it quite a bit. Bits I'm specifically interested in ...