Tool that performs layout analysis and/or text recognition using Tesseract and outputs the result in Page XML format
Updated Aug 15, 2022 - C++
Simple app for visual editing of Page XML files
Some bits of JavaScript to transcribe scanned pages using PageXML
C++ library and Python wrapper for working with Page XML files
LECTAUREP Pipeline demonstration to TEI Publisher
This module provides access to Transkribus PageXML files via XQuery functions. It is designed to be used in the context of a BaseX XML database, but should work with other XML databases as well.