All Questions
Tagged with pandas clustering
7 questions
2 votes, 0 answers, 116 views
Regression on Pandas DataFrame
I am working on the following assignment and I am a bit lost:
Build a regression model that will predict the rating score of each
product based on attributes which correspond to some very common ...
3 votes, 0 answers, 578 views
Compute distance matrix using DTW acceptable for scipy.cluster.hierarchy
I am new to both data science and Python. I have a dataset of time-dependent samples on which I want to run agglomerative hierarchical clustering. I have found that Dynamic Time Warping (DTW)...
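As a rough illustration of what this question is after, here is a minimal sketch of a pairwise DTW distance computation. It assumes one-dimensional numeric series and absolute difference as the local cost; the function names `dtw_distance` and `condensed_dtw` and the sample data are made up for this example and do not come from the question.

```python
from math import inf


def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) DTW; local cost is the absolute difference."""
    n, m = len(a), len(b)
    # cost[i][j] = cheapest alignment of a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # advance in a only
                                 cost[i][j - 1],      # advance in b only
                                 cost[i - 1][j - 1])  # advance in both
    return cost[n][m]


def condensed_dtw(series):
    """Upper-triangle distances in row-major order (the ordering pdist uses)."""
    k = len(series)
    return [dtw_distance(series[i], series[j])
            for i in range(k) for j in range(i + 1, k)]


# Hypothetical sample data: three series of different lengths.
series = [[0, 1, 2, 1, 0], [0, 0, 1, 2, 1, 0], [5, 5, 5, 5]]
condensed = condensed_dtw(series)
```

A flat list in this order is the condensed form that `scipy.cluster.hierarchy.linkage` accepts as its first argument, for example with `method="average"`. Methods such as `"ward"` assume Euclidean distances, so they may not be appropriate for DTW.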
7 votes, 1 answer, 1k views
PANDAS nearest site algorithm
I have CSVs full of property transactions in the UK from 1995 to 2017, separated by year, such as "RS2015.csv". I have a second CSV with a list of wind turbines in the UK. Both have coordinates in WGS ...
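A nearest-site lookup of this kind could be sketched as a brute-force great-circle search. This is only a sketch: it assumes the coordinates are latitude/longitude in degrees, and the turbine names, coordinates, and function names below are invented for illustration rather than taken from the question's files.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    h = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))


def nearest_site(point, sites):
    """Return (site_name, distance_km) of the site closest to point=(lat, lon)."""
    name = min(sites, key=lambda s: haversine_km(*point, *sites[s]))
    return name, haversine_km(*point, *sites[name])


# Hypothetical data: two turbine sites and one property location.
turbines = {"T1": (51.5, -0.1), "T2": (55.9, -3.2)}
site, dist_km = nearest_site((52.0, -0.5), turbines)
```

The brute-force scan is linear in the number of sites per query. For many properties against many turbines, a spatial index such as a k-d tree would be the usual next step.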
6 votes, 1 answer, 604 views
Clustering points on a sphere
I have written a short Python program which does the following: loads a large data file ($10^9+$ rows) where each row is a point on a sphere. The code then loads a pre-determined triangular grid on ...
7 votes, 1 answer, 687 views
Similarity research: K-Nearest Neighbour (KNN) using a linear regression to determine the weights
I have a set of houses with categorical and numerical data. Later I will have a new house and my goal will be to find the 20 closest houses.
The code is working fine, and the results are not so bad but ...
5 votes, 1 answer, 2k views
KNN pipeline w/ cross_validation_scores
Using the wine quality dataset, I'm attempting to perform a simple KNN classification (w/ a scaler, and the classifier in a pipeline). It works, but I've never used ...
5 votes, 1 answer, 2k views
PANDAS spatial clustering
I am writing a spatial clustering algorithm using pandas and scipy's kdtree. I profiled the code and the .loc part takes the most time for bigger datasets. I wonder ...