Cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters).

learn more… | top users | synonyms

6
votes
1answer
31 views

Calculating Euclidean norm for each vector in a sparse matrix

Below is a naive algorithm to find nearest neighbours for a point in some n-dimensional space. ...
4
votes
1answer
77 views

K-nearest neighbours in MATLAB

I implemented K-Nearest Neighbours algorithm, but my experience using MATLAB is lacking. I need you to check the small portion of code and tell me what can be improved or modified. I hope it is a ...
4
votes
1answer
58 views

KMeans in the shortest and most readable format

I'm learning Python (coming from Java) so I decided to write KMeans as a practice for the language. However I want to see how could one improve the code and making it shorter and yet readable. I ...
4
votes
1answer
70 views

Nearest pair of points

Given a set of 2 dimensional points, it returns the two nearest points. If more pairs have the same min-distance between them, then an arbitrary choice is made. This program expects the points to be ...
2
votes
1answer
529 views

String Matching and Clustering

I have a pretty simple problem. A large set of unique strings that are to various degrees not clean, and in reality they in fact represent the same underlying reality. They are not person names, but ...
6
votes
3answers
267 views

How to speed up this k-nearest neighbors code?

I have implemented kNN (k-nearest neighbors) as following, but it is very slow. I want to get an exact k-nearest-neighbor, not the approximate ones, so I didn't use the FLANN or ANN libraries. Could ...
4
votes
2answers
893 views

Is this the fastest way to find the closest point to a list of points using numpy?

I'm trying to find the closest point (Euclidean distance) from a user inputed point to a list of 50,000 points that I have. Note that the list of points changes all the time. and the closest distance ...
4
votes
3answers
329 views

Nearest Neighbour classification algorithm

The following code is from a university assignment of mine to write a classification algorithm (using nearest neighbour) to classify whether or not a given feature set (each feature is the frequency ...
5
votes
4answers
2k views

Grouping consecutive numbers into ranges in Python 3.2

The following is a function that I wrote to display page numbers as they appear in books. If you enter the list [1,2,3,6,7,10], for example, it would return: ...
10
votes
1answer
161 views

Is this a BSP tree? Am I missing something?

I've been practicing using BSP trees as I hear they are a good start to create procedural spaces such as houses. For ease-of-deployment purposes, I've tested this code in Lua and it seems to work. ...
1
vote
1answer
164 views

A sophisticated algorithm for geometric computation (distance between points) [closed]

I am confused by an algorithm for counting the number of pairs of N random points which has a distance closer than d = sqrt((p1.x-p2.x)^2 + (p1.y-p2.y)^2) The ...