Repositories
-
ArchiveAdministrator
Archive administration system. Handles archive creation and user authentication.
-
LookingGlass
Intuitive and configurable search interface for document archives.
-
OCRServer
OCR server for the hosted archiving service.
-
DocManager
Universal backend for indexing, storing, and querying documents.
-
DocUpload
Upload application for documents in the archiving service.
-
tt-ansible
Ansible roles for deployment. In development, expect problems.
-
IndexServer
Receives, decrypts, and verifies data, then indexes it with DocManager.
-
Catalyst
Text mining framework.
-
Test-Data
Test data for Transparency Toolkit development
-
UtilityScripts
Scripts for managing scrapers
-
catalyst_test_scripts
Test scripts for all Catalyst methods.
-
DocIntegrityCheck
Methods for encrypting and verifying documents. Utility gem for document processing pipeline.
-
NSA-Data
NSA documents in machine readable form
-
Harvester
Web crawling and document processing through a usable interface.
-
DocSuggestions
Backend for processing document suggestions from LookingGlass
-
Surveillance-Research-Data
Raw data and scripts for Surveillance Research Archive
-
ParseFile
OCRs a document and extracts its metadata.
-
DirCrawl
Runs a block of code on every file in a directory.
-
TransparencyToolkit
Main repository for Transparency Toolkit
-
CrawlerManager
API for calling crawlers
-
TSJobCrawler
Collects listings for jobs that require security clearance.
-
HarvesterReporter
Incremental crawler result reporting for Transparency Toolkit
-
dataspec-TsjobCrawl
Dataspec for cleared job listings
-
TwitterCrawler
A crawler for Twitter
-
DesignAssets
A collection of branding, interfaces, and other visual resources!
-
dataspec-TwitterCrawl
LookingGlass dataspec for tweets
-
LinkedinCrawler
Crawls public LinkedIn profiles
-
generalscraper
Scrapes all pages on any site you specify for keywords.
-
ICWATCH-Data
Resume data and scripts for managing it