metadata-extraction

Here are 129 public repositories matching this topic...

OpenGraph
MirianBebee commented May 14, 2021

Good morning,

I ran into a bug when fetching the OpenGraph data for a URL: the image is not returned because its URL contains escaped characters and therefore does not pass validation.

This happens only for certain URLs. The fix would be to decode the image URL inside the verify_image_url function.

Page example:
https://www
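
The decoding step suggested above could look roughly like the sketch below. It is not the library's actual code. It assumes that "escaped characters" means HTML entities (such as `&amp;` or `&#58;`) left in the og:image value, and `clean_image_url` is a hypothetical helper name chosen for illustration. The validation rule (absolute http or https URL with a host) is likewise an assumption.

```python
from html import unescape
from typing import Optional
from urllib.parse import urlsplit


def clean_image_url(raw_url: str) -> Optional[str]:
    """Decode HTML entities in an image URL, then validate it.

    Hypothetical helper: returns the decoded URL, or None if the
    result is not an absolute http(s) URL with a host name.
    """
    # Undo entity escaping first, e.g. '&amp;' -> '&' and '&#58;' -> ':'.
    url = unescape(raw_url).strip()

    # Validate the decoded form, not the raw attribute value.
    parts = urlsplit(url)
    if parts.scheme in ("http", "https") and parts.netloc:
        return url
    return None


if __name__ == "__main__":
    print(clean_image_url("https://example.com/img.png?w=100&amp;h=50"))
    print(clean_image_url("https&#58;//example.com/a.png"))
```

Decoding before validating matters in the second call: the raw value contains no literal colon, so a check on the undecoded string would see no scheme and reject the image.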

adbar commented Jan 3, 2020

I have mostly tested htmldate on a set of English, German and French web pages that I came across while surfing or during web crawls. There are certainly other web pages, and cases in other languages, for which date extraction does not work yet.

Please install the dateparser library beforehand, as it significantly extends linguistic coverage: pip or pip3 install -U dateparser or `pi
