Here are 11 public repositories matching this topic...
A Python Library for Document Layout Understanding
Updated Apr 13, 2021
Python
A repository of Document Layout Analysis resources for development with PdfPig.
A Large Dataset of Historical Japanese Documents with Complex Layouts
Updated Jun 8, 2020
Jupyter Notebook
Proof of concept of training a simple Region Classifier using PdfPig and ML.NET (LightGBM). The objective is to classify each text block in a PDF document page as one of title, text, list, table, or image.
This library builds a graph representation of the content of PDFs. The graph is then clustered, and the resulting page segments are classified and returned. Tables are returned formatted as CSV.
Updated Sep 11, 2020
Python
A Delphi implementation of the maximal empty rectangle problem.
Updated Mar 15, 2019
Pascal
A more complete example of programming with PDFMiner, which picks up where the default documentation stops.
Updated Jul 24, 2019
Python
JS practice exercises for building layout skills such as Flexbox and Grid
Updated Jul 13, 2017
Python
A natural language processing package in JavaScript
Updated Mar 30, 2021
JavaScript
Hello,
Is there a way to get the measurement properties, as described in chapter 8.8 in the PDF Reference document? Thanks.