Here are 62 public repositories matching the corpus-tools topic.
An Integrated Corpus Tool With Multilingual Support for the Study of Language, Literature, and Translation
Updated Oct 14, 2021 (Python)
Bitextor generates translation memories from multilingual websites
Updated Oct 15, 2021 (Python)
UA-GEC: Grammatical Error Correction and Fluency Corpus for the Ukrainian Language
Updated Jul 17, 2021 (Python)
An open source reimplementation of Benny Brodda's BETA in Python
Updated Oct 28, 2019 (Python)
An advanced, extensible web front-end for the Manatee-open corpus search engine
Updated Oct 14, 2021 (TypeScript)
Multi-Language Dataset Cleaner/Creator for Mozilla's DeepSpeech Framework
Updated Jul 25, 2019 (Python)
Updated Apr 12, 2021 (HTML)
A set of workflows for corpus building through OCR, post-correction and normalisation
Updated Jan 12, 2021 (Python)
Utilities for Processing the Switchboard Dialogue Act Corpus
Updated Jan 24, 2021 (Python)
Tools for filtering and cleaning parallel and monolingual corpora for machine translation and other natural language processing tasks.
Reading the data from OPIEC - an Open Information Extraction corpus
Updated Jun 12, 2019 (Java)
OpusFilter - Parallel corpus processing toolkit
Updated Oct 12, 2021 (Python)
Python library for extracting quantitative, reproducible metrics of multi-level alignment between two speakers in naturalistic language corpora.
Updated Sep 3, 2021 (Python)
Utilities for Processing the Meeting Recorder Dialogue Act Corpus
Updated Jan 24, 2021 (Python)
Praaline is an open-source system to manage, annotate, visualise and analyse spoken language corpora
A library of functions enabling complex corpus search in context (KWIC), search aggregation, bag-of-words building & keyphrase extraction.
Software for multi-level annotation of linguistic corpora
Updated Jan 15, 2020 (Java)
Collector and speech cutter for LibriVox audiobooks
An unofficial Python API that allows users to create a corpus of lyrical text from their favorite artists and billboard charts
Updated Jul 2, 2018 (Python)
Updated Oct 13, 2020 (Java)
An Interactive Tool for Annotating Discourse Structure and Text Improvement
Updated Sep 15, 2021 (JavaScript)
Corpus linguistics has never been so easy...
Script that sets up and configures an entire CQPweb server installation
Updated Dec 1, 2019 (Shell)
Searching in-memory corpus with Corpus Query Language (CQL)
Updated Oct 11, 2021 (Python)
Library for Python to use Korp API
Updated Sep 11, 2020 (Python)
Yet another search platform for linguistic corpora.
Updated Oct 6, 2021 (Python)
Updated Sep 26, 2021 (Python)
Scripts for data conversion
Updated Oct 4, 2021 (Python)
The URLs used in the code and documentation should be checked for availability (as was done for the README in #87).
Places to check: