Here are 35 public repositories matching this topic...
An Awesome List for getting started with web archiving
Squidwarc is a high-fidelity, user-scriptable archival crawler that uses Chrome or Chromium, with or without a head
Updated May 19, 2020 - JavaScript
A list of things related to software, literature, and other content for 🕣 Memento
Parse and create Web ARChive (WARC) files with Node.js
Updated Dec 11, 2020 - JavaScript
A dockerized, queued, high-fidelity web archiver based on Squidwarc
Updated Jul 19, 2020 - Python
Wget-AT is a modern Wget with Lua hooks, Zstandard (+dictionary) WARC compression and URL-agnostic deduplication.
Decentralized web archiving
Updated Aug 7, 2018 - Python
Seeder - Czech webarchive curating tool and public site
Updated Dec 21, 2020 - Python
A social media open post web archiving tool
Updated Sep 15, 2020 - JavaScript
Digital Preservation of HTTP in documentary heritage.
pywb recorder over Tor that records the web anonymously (Docker image)
Record the currently active tab on webrecorder.io
Updated May 9, 2017 - JavaScript
Tika-based link extractor for httpreserve
Updated Jan 25, 2020 - HTML
An archival thumbnail visualization server
Updated Sep 2, 2020 - JavaScript
A helper package to tokenize textual content and retrieve hyperlinks
Client app for httpreserve pkg that generates CSV, JSON, HTTP, and BoltDB
Updated Mar 23, 2019 - JavaScript
Given four bytes, download a random file from web archives implementing the UKWA Shine interface
metawarc: a command-line tool for extracting metadata from files stored in WARC (Web ARChive) archives
Updated Dec 4, 2020 - Python
Class page for ODU CS 791 / 891 Web Archiving Seminar
A wrapper for PhantomJS commands for headless screenshots.
From WARC records to MongoDB documents
An archiving utility that is compatible with web servers.
Updated Dec 29, 2020 - Python
Updated Sep 20, 2017 - JavaScript
Link crawler for a phpBB forum
Updated Jul 17, 2017 - Java
A restricted API in Golang for the (semi-)exposed functions of the Internet Archive.
HTTPreserve Analysis of Million Dollar Web Page
An Awesome List for getting started with web archiving
Updated Jul 20, 2017 - JavaScript
Extracts links from DSpace repositories
Updated Oct 13, 2020 - Java
A tool for archiving web pages to the Wayback Machine
Updated Dec 30, 2018 - Kotlin