repaper

Transformers LayoutLMv3 Hugging Face PyTorch EasyOCR PDF Google Forms

An open-source Python package that creates an editable PDF form from a handwritten form. I used a LayoutLMv3 model trained on a question-answer dataset to identify key-value pairs on the paper form, and EasyOCR to extract the bounding boxes and text.
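
To make the idea of key-value pairs concrete, here is a minimal sketch of how labeled text boxes could be grouped into questions and answers. It is not repaper's actual pairing logic: the `Token` structure, the `"question"`/`"answer"` label names, and the nearest-box rule are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


@dataclass
class Token:
    """One OCR text region plus a predicted label (hypothetical structure)."""
    text: str
    box: Box
    label: str  # assumed label names: "question" or "answer"


def center(box: Box) -> Tuple[float, float]:
    """Return the center point of a bounding box."""
    x0, y0, x1, y1 = box
    return ((x0 + x1) / 2, (y0 + y1) / 2)


def pair_key_values(tokens: List[Token]) -> Dict[str, List[str]]:
    """Attach each answer to the question whose box center is closest."""
    questions = [t for t in tokens if t.label == "question"]
    pairs: Dict[str, List[str]] = {q.text: [] for q in questions}
    for t in tokens:
        if t.label != "answer" or not questions:
            continue
        ax, ay = center(t.box)
        nearest = min(
            questions,
            key=lambda q: (center(q.box)[0] - ax) ** 2 + (center(q.box)[1] - ay) ** 2,
        )
        pairs[nearest.text].append(t.text)
    return pairs


# Made-up sample data, not output from the real model or OCR engine.
tokens = [
    Token("Name:", (10, 10, 60, 30), "question"),
    Token("Jane Doe", (70, 10, 160, 30), "answer"),
    Token("Date:", (10, 50, 60, 70), "question"),
    Token("2023-01-05", (70, 50, 170, 70), "answer"),
]
print(pair_key_values(tokens))
```

Each question found this way would become one editable field in the generated form.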

Installation

Stable release

To install repaper, run this command in your terminal:

$ pip install repaper

This is the preferred method to install repaper, as it will always install the most recent stable release.

Usage

To use repaper in a project:

from repaper import repaper

To generate a Google Form from a form image:

from repaper import repaper
 
re_paper = repaper('../samples/test.jpg')
 
form_id = re_paper.make_google_from('../secrets/credentials.json')
 
print(f'''Form created with form id: {form_id["formId"]} and is accessible at:
https://docs.google.com/forms/d/{form_id["formId"]}/viewform
 
Edit and publish the form to make it accessible to others.''')
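
If you need the links in more than one place, the URL building can be pulled into a small helper. This is a sketch, not part of repaper: `form_urls` is a hypothetical name, it assumes the returned value is a dict with a `formId` key as in the example above, and the `/edit` link follows the usual Google Forms URL pattern rather than anything repaper documents.

```python
def form_urls(form_info: dict) -> dict:
    """Build view and edit links from the dict returned when the form is created.

    Hypothetical helper: only the "formId" key shown in the usage example is assumed.
    """
    if "formId" not in form_info:
        raise KeyError("expected a 'formId' key in the form response")
    base = f"https://docs.google.com/forms/d/{form_info['formId']}"
    return {
        "view": f"{base}/viewform",
        # Assumption: the standard Google Forms editor URL.
        "edit": f"{base}/edit",
    }


# "abc123" is a placeholder id, not a real form.
print(form_urls({"formId": "abc123"})["view"])
```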

Command-line usage to generate a Google Form from an image:

repaper google-form --img_path ./samples/test.jpg --oauth_json ./secrets/credentials.json
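
To convert a whole folder of scanned forms, the same command can be assembled once per image. The sketch below only builds the argument lists, using the `google-form`, `--img_path`, and `--oauth_json` options shown above. `build_commands` is a hypothetical helper, and the `*.jpg` pattern is an assumption about how the scans are named.

```python
from pathlib import Path
from typing import List


def build_commands(image_dir: str, oauth_json: str, pattern: str = "*.jpg") -> List[List[str]]:
    """Return one repaper command (as an argument list) per matching image."""
    commands = []
    for img in sorted(Path(image_dir).glob(pattern)):
        commands.append([
            "repaper", "google-form",
            "--img_path", str(img),
            "--oauth_json", str(oauth_json),
        ])
    return commands


# To actually run the commands (requires repaper and valid credentials):
# import subprocess
# for cmd in build_commands("./samples", "./secrets/credentials.json"):
#     subprocess.run(cmd, check=True)
```

Passing argument lists to `subprocess.run` instead of a single shell string avoids quoting problems with paths that contain spaces.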