# text-clustering
Here are 69 public repositories matching this topic...
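Chinese text analysis toolkit (including text classification, text clustering, text similarity, keyword extraction, key phrase extraction, sentiment analysis, text error correction, text summarization, topic keywords, synonyms and near-synonyms, and event triple extraction)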
sentiment-analysis
text-classification
text-similarity
event-extraction
spell-corrector
text-clustering
text-ana
topic-keywords
key-words
text-summatizer
Updated Mar 12, 2022 - Python
Short text clustering preprocessing module
Updated Dec 28, 2019 - Python
Updated Jan 4, 2018 - Jupyter Notebook
Library of state-of-the-art models (PyTorch) for NLP tasks
nlp
natural-language-processing
text-classification
machine-translation
pytorch
style-transfer
speech-recognition
text-summarization
nlp-library
text-clustering
punctuation-restoration
Updated Oct 27, 2021 - Python
semantic-sh is a SimHash implementation that detects and groups similar texts using word vectors and transformer-based language models (BERT).
text-similarity
simhash
transformer
locality-sensitive-hashing
fasttext
bert
text-search
word-vectors
text-clustering
Updated Sep 19, 2020 - Python
Easy, fast clustering of texts
Updated Apr 14, 2017 - R
Cross-lingual Language Model (XLM) pretraining and Model-Agnostic Meta-Learning (MAML) for fast adaptation of deep networks
machine-translation
languages
mlm
tlm
text-processing
pretrained-models
african-languages
bert
denoising-autoencoders
meta-model
clm
maml
text-clustering
xlm
back-translation
parallel-training
bpe-codes
bleu-scores
Updated Mar 26, 2021 - Jupyter Notebook
Implementation of some algorithms for text clustering
Updated Sep 5, 2018 - Python
Sentence clustering and visualization. Created 25 Apr 2018.
Updated Jan 15, 2020 - Python
Code for the ACL conference paper "An Online Semantic-enhanced Dirichlet Model for Short Text Stream Clustering"
text-mining
data-stream
stochastic-process
non-parametric
dirichlet-process
dirichlet-process-mixtures
text-clustering
text-stream
data-stream-processing
data-stream-mining
Updated Apr 22, 2021 - Python
Graph clustering and Node embeddings with word2vec
nlp
crawler
clustering
word2vec
word-embeddings
bachelor-thesis
random-walk
graph-clustering
text-clustering
graph-embedding
Updated Mar 2, 2019 - Python
2020 Açık Seminer - Turkish NLP workshop
nlp
natural-language-processing
news
spacy
dataset
named-entity-recognition
ner
turkish-language
k-means-clustering
text-clustering
text-preprocessing
workshop-seminar
Updated May 8, 2020 - Jupyter Notebook
Using word embeddings, TF-IDF, and text hashing to cluster and visualise text documents
clustering
dimensionality-reduction
text-processing
d3js
document-clustering
umap
computational-social-science
text-clustering
text-features
Updated Nov 7, 2019 - Python
Understanding hateful subreddits through text clustering
Updated Nov 26, 2018 - Python
The Domain Discovery Operations API formalizes the human domain discovery process by defining a set of operations that capture the essential tasks leading to domain discovery on the Web, as identified through interaction with Subject Matter Experts (SMEs).
information-retrieval
text-mining
text-classification
domain-discovery
topic-discovery
text-clustering
Updated Aug 9, 2021 - Python
Chapter 3: Text and Speech Basics
Updated Jul 23, 2019 - Jupyter Notebook
Generate a custom, detailed survey paper with topic-clustered sections and proper citations from a single query in just under 30 minutes.
python
nlp
ocr
text-similarity
text-generation
pytorch
topic-modeling
summarization
research-tool
arxiv
research-data-management
scientific-publications
research-and-development
research-software-engineering
scientific-research
text-clustering
arxiv-api
pdf-document-processor
title-generation
Updated Jul 22, 2021 - Python
Heuristic matching of large databases by fuzzy criteria such as addresses
Updated Apr 7, 2022 - xBase
TFIDF is one of the most basic and simple topics in NLP, yet there's a lot that can be done using TFIDF alone. In this repo, I'll be adding the blog, TFIDF basics, and examples of what can be done using TFIDF.
python
nlp
text-similarity
tfidf
text-clustering
textclassification
tfidf-vectorizer
tfidfvectorizer
Updated Jun 15, 2020 - Jupyter Notebook
Python Program for Text Clustering using Bisecting k-means
Updated Dec 12, 2017 - Jupyter Notebook
Simple text clustering using the k-means algorithm
Updated Oct 30, 2018 - Python
DBSCAN algorithm from scratch in Python -- to cluster text records.
Updated May 18, 2018 - Python
This is an implementation of the TextClust algorithm in Python 3.
Updated Dec 2, 2021 - Python
Clustering related books and research papers.
Updated Nov 25, 2020
It is a very different task: here I am going to cluster 200 different texts related to games and sports into two or more clusters. A Zipf plot can also be used to determine how many useful clusters can be formed.
data-mining
data-visualization
data-analysis
elbow
pattern-recognition
kmeans
cluster-analysis
kmeans-clustering
zipf
text-clustering
Updated Jun 8, 2019 - Jupyter Notebook
News Articles Text Classification and Clustering using Machine Learning in Python. Also, KNN implementation from scratch using max heap.
python
machine-learning
text-mining
text-classification
wordcloud
classification
tf-idf
vectorization
svd
knn
news-articles
ica
text-clustering
notebook-jupyter
roc-curves
Updated Aug 27, 2020 - Jupyter Notebook
Topic Modeling and Text Cluster Analysis
Updated Dec 11, 2019 - Jupyter Notebook
Information Retrieval project implementation
Updated Oct 31, 2019 - Jupyter Notebook