Data & Document Completeness Checker using Python + Streamlit

Project Overview

The Data & Document Completeness Checker is a Python + Streamlit tool for validating data and documents in a few clicks. You upload a CSV file, a rules DOCX file, and one or more PDF documents, and the tool automatically reports which required data fields are empty and which required documents are missing. It is aimed at compliance, finance, insurance, and data teams.

Key Features

Automatic CSV Validation: Instantly detect missing or incomplete fields in your dataset.
PDF Document Matching: Scan multiple PDFs and verify whether each required document is present.
Rule-Based Checking: Use a DOCX file to define required fields and documents per report type.
Per-Client Report View: Expand each client’s record to view missing fields and unmatched documents.
Summary Dashboard: Get an overview of total cases, complete vs. incomplete cases, and the most frequently missing fields.
Easy-to-Use Interface: Built on Streamlit for a clean and fast experience.
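
The numbers behind the Summary Dashboard can be sketched with the standard library alone. This is a minimal illustration, not the tool's actual code: the `summarize` function and the shape of each result (a dict with `client`, `missing_fields`, and `missing_documents` keys) are assumptions made for this example.

```python
from collections import Counter


def summarize(results):
    """Aggregate per-client results into dashboard-style totals.

    `results` is a hypothetical list of dicts, one per client, each with
    'client', 'missing_fields', and 'missing_documents' keys.
    """
    total = len(results)
    # A case counts as incomplete if anything at all is missing.
    incomplete = sum(
        1 for r in results if r["missing_fields"] or r["missing_documents"]
    )
    # Count how often each field is missing across all clients.
    field_counts = Counter(f for r in results for f in r["missing_fields"])
    return {
        "total_cases": total,
        "complete": total - incomplete,
        "incomplete": incomplete,
        "most_missing_fields": field_counts.most_common(3),
    }


# Made-up sample data for illustration only.
sample_results = [
    {"client": "C001", "missing_fields": [], "missing_documents": []},
    {"client": "C002", "missing_fields": ["email"], "missing_documents": ["id_proof"]},
    {"client": "C003", "missing_fields": ["email", "phone"], "missing_documents": []},
]
summary = summarize(sample_results)
```

In a Streamlit app, values like these would typically be shown as metrics or a small table; that display layer is left out here.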

How It Works

Upload a CSV file containing your client or record data.
Upload a Rules DOCX file that lists required fields and documents for each type of report.
Upload one or more PDFs with the supporting documents.
The system checks each record and the uploaded PDFs against the rules for its report type.
Instantly see which clients have missing fields or missing documents.
View an easy-to-read summary and download reports if needed.
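
The comparison step above could look roughly like the sketch below. Several things here are assumptions, not details of the tool: the rules are shown as an already-parsed dictionary (the layout of the rules DOCX is not described here), the CSV is assumed to have `client_id` and `report_type` columns, and a PDF is matched to a required document by checking whether its filename contains the client ID and the document name. The real tool may match on PDF content instead.

```python
import csv
import io

# Hypothetical rules, as they might look after parsing the rules DOCX.
RULES = {
    "Loan": {"fields": ["name", "email"], "documents": ["id_proof", "bank_statement"]},
    "Claim": {"fields": ["name", "policy_no"], "documents": ["claim_form"]},
}


def check_record(row, rules, pdf_names):
    """Return the missing fields and documents for one CSV row."""
    # Unknown report types are treated as having no requirements in this sketch.
    rule = rules.get(row.get("report_type") or "", {"fields": [], "documents": []})

    # A field is missing if it is absent, empty, or whitespace only.
    missing_fields = [f for f in rule["fields"] if not (row.get(f) or "").strip()]

    # Assumed matching scheme: filename contains the client ID and document name.
    client_id = (row.get("client_id") or "").strip().lower()
    client_pdfs = [n.lower() for n in pdf_names if client_id and client_id in n.lower()]
    missing_documents = [
        d for d in rule["documents"] if not any(d in n for n in client_pdfs)
    ]

    return {
        "client": row.get("client_id") or "",
        "missing_fields": missing_fields,
        "missing_documents": missing_documents,
    }


# Made-up inputs standing in for the uploaded CSV and PDF files.
csv_text = """client_id,report_type,name,email,policy_no
C001,Loan,Ana,ana@example.com,
C002,Loan,Ben,,
C003,Claim,,,P-77
"""
pdf_names = [
    "C001_id_proof.pdf",
    "C001_bank_statement.pdf",
    "C002_id_proof.pdf",
    "C003_claim_form.pdf",
]

results = [
    check_record(row, RULES, pdf_names)
    for row in csv.DictReader(io.StringIO(csv_text))
]
for r in results:
    print(r["client"], r["missing_fields"], r["missing_documents"])
```

Keeping the check as a plain function of one row, the rules, and the PDF names makes it easy to test without running the Streamlit interface.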