
linguistics

Here are 560 public repositories matching this topic...

PyNLPl, pronounced as 'pineapple', is a Python library for Natural Language Processing. It contains various modules useful for common, and less common, NLP tasks. PyNLPl can be used for basic tasks such as the extraction of n-grams and frequency lists, and to build simple language models. There are also more complex data types and algorithms. Moreover, there are parsers for file formats common in NLP (e.g. FoLiA/Giza/Moses/ARPA/Timbl/CQL). There are also clients to interface with various NLP-specific servers. PyNLPl most notably features a very extensive library for working with FoLiA XML (Format for Linguistic Annotation).

  • Updated Mar 13, 2019
  • Python
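As a rough sketch of what n-gram extraction and frequency lists involve, here is a minimal plain-Python version (illustrative only; it does not use PyNLPl's own API, which provides richer helpers for the same tasks):

```python
from collections import Counter

def ngrams(tokens, n):
    """Yield successive n-grams (as tuples) from a list of tokens."""
    for i in range(len(tokens) - n + 1):
        yield tuple(tokens[i:i + n])

tokens = "the cat sat on the mat".split()
freq = Counter(ngrams(tokens, 2))  # bigram frequency list
# freq[("the", "cat")] == 1
```

Counting n-gram frequencies like this is the first step toward a simple language model: relative frequencies of n-grams give maximum-likelihood estimates of conditional word probabilities.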
DhanshreeA
DhanshreeA commented Dec 24, 2019

Documentation for case 3 and case 1 in the definition extraction methods is the same (ref: https://github.com/LexPredict/lexpredict-lexnlp/blob/f3920be16dac588b2f38e17811ea5482b417954d/lexnlp/extract/en/definition_parsing_methods.py#L134); however, case #3 seems to work only for title-case or upper-case words followed by something from the strong trigger list, and case #1 must necessarily have the word

Colibri core is an NLP tool as well as a C++ and Python library for working with basic linguistic constructions such as n-grams and skipgrams (i.e. patterns with one or more gaps, either of fixed or dynamic size) in a quick and memory-efficient way. At the core is the tool ``colibri-patternmodeller``, which allows you to build, view, manipulate and query pattern models.

  • Updated May 6, 2020
  • C++
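To make the notion of a skipgram concrete, here is a small illustrative sketch in plain Python (not Colibri core's own implementation, which is far more memory-efficient): a skipgram is an n-gram in which some interior positions are replaced by a gap marker.

```python
from itertools import combinations

def skipgrams(tokens, n, gaps=1):
    """Yield fixed-size skipgrams: n-grams in which `gaps` interior
    positions are replaced by the gap marker '{*}'."""
    for i in range(len(tokens) - n + 1):
        window = tokens[i:i + n]
        # Gaps may only occur at interior positions (never first/last).
        for gap_positions in combinations(range(1, n - 1), gaps):
            yield tuple("{*}" if j in gap_positions else window[j]
                        for j in range(n))

# skipgrams("a b c d".split(), 3) yields
# ("a", "{*}", "c") and ("b", "{*}", "d")
```

With `gaps=0` this degenerates to ordinary n-gram extraction, which is why tools like ``colibri-patternmodeller`` can treat both pattern types uniformly.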
ErkanBasar
ErkanBasar commented Oct 22, 2019

Summary

Currently, FLAT can be configured to limit the number of sentences/paragraphs loaded at once. If a document is longer than the limit, it is sliced into multiple pages, which can be navigated via the page dropdown at the top left of the editor view.

Problem

The page dropdown is too small, and the total number of pages is not displayed anywhere. So, after the first page is annota

bambooforest
bambooforest commented Jul 9, 2017

When merging all the new inventory resources, the process introduced duplicate bibtex keys (some from the original data providers themselves).

TODO:

  • remove duplicates
  • double-check that all inventory IDs have a bibtex entry (currently at least two from ER are without citations; others are URLs to online materials which need a reference added)
  • make all entries valid bibtex format (when using
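The first TODO item, finding duplicate BibTeX keys, can be sketched with a simple scan over the entry headers (a hypothetical helper, not part of the project's tooling; a real check should use a proper BibTeX parser to handle edge cases):

```python
import re
from collections import Counter

def duplicate_bibtex_keys(text):
    """Return BibTeX citation keys that occur more than once.

    Matches entry headers of the form '@type{key,' with a simple regex.
    """
    keys = re.findall(r"@\w+\{\s*([^,\s]+)\s*,", text)
    return [key for key, count in Counter(keys).items() if count > 1]

bib = """@book{smith2000, title={A}}
@article{jones1999, title={B}}
@misc{smith2000, title={C}}
"""
# duplicate_bibtex_keys(bib) == ["smith2000"]
```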
shuttle1987
shuttle1987 commented Jan 6, 2020

This is something that will greatly help your code quality, especially if you have to target multiple platforms. You also then get substantial type checking benefits from mypy and other tools since there's no type ambiguity between strings that are representing strings and strings that are representing file paths.
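The point about `pathlib.Path` versus plain strings can be illustrated with a short sketch (the function and directory names here are hypothetical, chosen only for illustration):

```python
from pathlib import Path

def transcript_path(data_dir: Path, utterance_id: str) -> Path:
    # The / operator joins path components portably, so the same code
    # works on Windows and POSIX. Because the annotation says Path,
    # mypy flags any call site that passes a raw string by mistake.
    return data_dir / "transcripts" / f"{utterance_id}.txt"

p = transcript_path(Path("/data"), "utt001")
# p.name == "utt001.txt", p.suffix == ".txt"
```

Keeping paths as `Path` objects end-to-end removes the ambiguity between strings that hold text and strings that hold file locations, which is exactly the type-checking benefit described above.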

xrotwang
xrotwang commented Aug 7, 2018

We should have our policy on Glottocode assignment and maintenance documented somewhere (also in the web app).

  1. Language-level Glottocodes are always valid - i.e. the way to mark obsolescence for these is moving the languoids to the Bookkeeping pseudo family.
  2. Sub-group-level Glottocodes may become obsolete and will then be removed from the current version. The codes will never be recycl
