🗃 Open source self-hosted web archiving. Takes URLs/browser history/bookmarks/Pocket/Pinboard/etc., saves HTML, JS, PDFs, media, and more...
Updated Nov 9, 2023 - Python
Collect and revisit web pages.
Core Python Web Archiving Toolkit for replay and recording of web archives
A High-Fidelity Web Archiving Extension for Chrome and Chromium based browsers!
InterPlanetary Wayback: A distributed and persistent archive replay system using IPFS
Serverless Web Archive Replay directly in the browser
Run a high-fidelity browser-based crawler in a single Docker container
Webrecorder Player for Desktop (OSX/Windows/Linux). (Built with Electron + Webrecorder)
Archiveror will help you preserve the webpages you love. 💾
A Tool To Push Web Resources Into Web Archives
Wayback Machine API interface & a command-line tool
🐋 Web Archiving Integration Layer: One-Click User Instigated Preservation
Streaming WARC/ARC library for fast web archive IO
Chrome extension to "Create WARC files from any webpage"
Desktop Electron app for ArchiveBox internet archiver. (ALPHA: not ready for general use)
Social Feed Manager user interface application.
A toolkit for CDX indices such as Common Crawl and the Internet Archive's Wayback Machine
An Apache Spark framework for easy data processing, extraction, and derivation for web archives and archival collections, developed at the Internet Archive.