# web-archiving

Here are 99 public repositories matching this topic...

ArchiveBox / ArchiveBox

🗃 Open source self-hosted web archiving. Takes URLs/browser history/bookmarks/Pocket/Pinboard/etc., saves HTML, JS, PDFs, media, and more...

Updated Nov 9, 2023
Python

Rhizome-Conifer / conifer

Collect and revisit web pages.

python docker archives warc web-archiving wayback webrecorder pywb

Updated Nov 8, 2023
Python

webrecorder / pywb

Core Python Web Archiving Toolkit for replay and recording of web archives

python web-archiving wayback web-archives pywb

Updated Nov 7, 2023
JavaScript

webrecorder / archiveweb.page

A High-Fidelity Web Archiving Extension for Chrome and Chromium based browsers!

extension archiving chromium web-archiving webrecorder wacz

Updated Oct 7, 2023
JavaScript

oduwsdl / ipwb

InterPlanetary Wayback: A distributed and persistent archive replay system using IPFS

python docker service-worker ipfs memento warc web-archiving wayback memento-rfc

Updated Nov 9, 2023
Python

webrecorder / replayweb.page

Serverless Web Archive Replay directly in the browser

service-worker warc web-archiving wayback-machine web-archive replay-web-page web-replay wacz

Updated Nov 10, 2023
JavaScript

webrecorder / browsertrix-crawler

Run a high-fidelity browser-based crawler in a single Docker container

crawler web-crawler crawling warc web-archiving webrecorder wacz

Updated Nov 11, 2023
JavaScript

webrecorder / webrecorder-player

Webrecorder Player for Desktop (OSX/Windows/Linux). (Built with Electron + Webrecorder)

electron warc web-archiving webrecorder pywb

Updated Sep 17, 2020
JavaScript

Florents-Tselai / WarcDB

WarcDB: Web crawl data as SQLite databases.

cli database sqlite crawling warc web-archiving web-data

Updated Nov 8, 2023
Python

harvard-lil / perma

Indelible links

libraries web-archiving

Updated Nov 8, 2023
JavaScript

rahiel / archiveror

Archiveror will help you preserve the webpages you love. 💾

javascript chrome-extension bookmark archiving webextension firefox-extension browser-extension mhtml linkrot web-archiving

Updated Oct 18, 2019
JavaScript

oduwsdl / archivenow

A Tool To Push Web Resources Into Web Archives

internet-archive web-archiving

Updated Feb 14, 2021
Python

akamhy / waybackpy

Wayback Machine API interface & a command-line tool

osint internet-archive web-archiving wayback-machine webarchiving cdx-api internet-archiving savepagenow archive-webpage archive-webpages wayback-machine-api wayback-machine-python

Updated Nov 17, 2022
Python

machawk1 / wail

🐋 Web Archiving Integration Layer: One-Click User Instigated Preservation

python gui warc web-archiving pyinstaller wayback heritrix openwayback

Updated Nov 10, 2023
Roff

webrecorder / warcio

Streaming WARC/ARC library for fast web archive IO

python warc web-archiving web-archives pywb

Updated May 10, 2023
Python

machawk1 / warcreate

Chrome extension to "Create WARC files from any webpage"

chrome-extension warc web-archiving

Updated May 18, 2023
JavaScript

ArchiveBox / electron-archivebox

Desktop Electron app for ArchiveBox internet archiver. (ALPHA: not ready for general use)

electron windows macos linux docker gui desktop web-archiving digipres internet-archiving archivebox desktop-electron

Updated Feb 28, 2023
JavaScript

gwu-libraries / sfm-ui

Social Feed Manager user interface application.

social-media web-archiving code4lib social-feed-manager

Updated Aug 10, 2023
Python

cocrawler / cdx_toolkit

A toolkit for CDX indices such as Common Crawl and the Internet Archive's Wayback Machine

python warc web-archiving cdx web-archives commoncrawl cdx-api

Updated Sep 21, 2023
Python

helgeho / ArchiveSpark

An Apache Spark framework for easy data processing, extraction as well as derivation for web archives and archival collections, developed at Internet Archive.

spark internet-archive warc web-archiving webarchive archivespark spark-framework

Updated Oct 8, 2021
Scala
