Organizations

@deutschestextarchiv @zentrum-lexikographie

adbar/README.md

Hi there! 👋

Links

Web | Blog | 🐦 Twitter | 🎞 YouTube | Coffee

Activity

🔭  Currently working on gathering texts on the Web and detecting word trends

Programming experience

🖩  First programs written on a TI-83 Plus in TI-BASIC

Pinned

  1. trafilatura

    Python and command-line tool to gather text on the Web: web crawling/scraping, extraction of text, metadata, and comments

    Python · 618 stars · 81 forks

  2. htmldate

    Fast and robust date extraction from web pages, with Python or on the command line

    Python · 58 stars · 15 forks

  3. simplemma

    Simple multilingual lemmatizer for Python, especially useful for speed and efficiency

    Python · 64 stars · 3 forks

  4. py3langid

    Forked from saffsd/langid.py

    Faster, modernized fork of the language identification tool langid.py

    Python · 9 stars · 3 forks

  5. courlan

    Clean, filter, and sample URLs to optimize data collection; includes spam, content type, and language filters

    Python · 21 stars · 1 fork

  6. German-NLP

    Curated list of open-access, open-source, and off-the-shelf resources and tools developed with a particular focus on German

    253 stars · 43 forks

654 contributions in the last year

Contribution activity

October 2022

Created 5 commits in 2 repositories
Opened 1 issue in 1 repository
adbar/htmldate 1 closed