tei

Here are 200 public repositories matching this topic...

adbar commented Jan 9, 2020

I have mostly tested trafilatura on a set of English, German, and French web pages that I came across while browsing or during web crawls. There are certainly other web pages, and cases in other languages, for which the extraction does not yet work.

Corresponding bug reports can either be filed as a list in an issue like this one or in the code as XPath expressions in [xpaths.py](https://github.com

good first issue, up for grabs
