Suppose we have a large number of files and want to identify which of them contain the information we are looking for. Our system makes this easy. The project lets the user create a database and upload all files into it in a simple way. There is also a search panel where the user can find whatever data they want in the databases.

MongoDB must be installed on the system. Tesseract OCR must also be installed, with its exe file put into the folder.

The user can select Keyword search to find data that contains the exact word, matched case-insensitively. The user can select Phrase search to find data that contains a sentence or group of words occurring in sequence.

L2 Modules

 Master

Where the user can create databases and list all databases they have created. The user can also delete a database. We use modules to break large programs down into small, manageable, organized files. Modules also make code reusable: we can define our most used functions in a module and import it, instead of copying their definitions into different programs.
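The create, list, and delete operations described above can be sketched with Pymongo as follows. This is only an illustration, not the project's actual code: the function names, the `_meta` placeholder collection, and the connection URI are assumptions. MongoDB creates a database only when data is first written to it, so the sketch inserts one placeholder document to make a new database appear in the list.

```python
# Databases that MongoDB itself maintains; hidden from the user's list.
SYSTEM_DATABASES = {"admin", "config", "local"}


def connect(uri="mongodb://localhost:27017"):
    """Open a connection to MongoDB (the URI is an assumed default)."""
    from pymongo import MongoClient  # imported here so the helpers below work with any client
    return MongoClient(uri)


def create_database(client, name):
    """Create a database by writing one placeholder document into it."""
    # "_meta" is a hypothetical collection name chosen for this sketch.
    client[name]["_meta"].insert_one({"created": True})


def list_user_databases(client):
    """Return the names of all databases except MongoDB's system databases."""
    return [n for n in client.list_database_names() if n not in SYSTEM_DATABASES]


def delete_database(client, name):
    """Drop the database and everything stored in it."""
    client.drop_database(name)
```

With a running MongoDB server, `client = connect()` followed by `create_database(client, "reports")` would make `reports` appear in `list_user_databases(client)`.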

 Upload

Where the user selects a database and then a folder from their PC or laptop. Once a folder is selected, the system identifies all files in it and displays each file with its path, file size, file extension, and status. When the user clicks the Go button, the system starts uploading all files into the database as fast as possible. The Data Upload module helps researchers upload their data to the data repository. The data should be uploaded in the form of a submission package (XML) that has a unique identifier, a submission ticket (XML). If you don't have a submission package ready, use the Data Validation module to create it.
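The folder scan described above might look like the following sketch. It is a minimal illustration using only the standard library; the function name `scan_folder`, the dictionary keys, and the initial `"pending"` status value are assumptions, not names taken from the project.

```python
import os


def scan_folder(folder):
    """Walk a folder and collect path, size, extension, and status for each file."""
    rows = []
    for root, _dirs, files in os.walk(folder):
        for name in sorted(files):
            path = os.path.join(root, name)
            rows.append({
                "path": path,
                "size_bytes": os.path.getsize(path),
                # Lower-case the extension so ".PDF" and ".pdf" are treated alike.
                "extension": os.path.splitext(name)[1].lower(),
                # Assumed initial status; an uploader would update it later.
                "status": "pending",
            })
    return rows
```

Each returned row corresponds to one line of the file table shown to the user before they click Go.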

 Search

Where the user can search the data from all files they have uploaded to the database. Search modules provide functionality related to indexing and searching content on your site. First, the user selects the database in which they want to find the data. Second, we provide three types of query for the user's convenience:

   Keyword search

   Phrase search

   Wild search
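The three query types can be illustrated with the `re` library. This is a sketch under stated assumptions rather than the project's implementation: the function names are hypothetical, phrase search is assumed to be case-insensitive like keyword search, and the wildcard syntax (`*` for any run of word characters, `?` for a single one) is an assumption, since the text above only names the query type.

```python
import re


def keyword_pattern(word):
    """Match the exact word as a whole word, ignoring case."""
    return re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)


def phrase_pattern(phrase):
    """Match the words in sequence, allowing any whitespace between them."""
    words = [re.escape(w) for w in phrase.split()]
    return re.compile(r"\b" + r"\s+".join(words) + r"\b", re.IGNORECASE)


def wildcard_pattern(expr):
    """Assumed syntax: '*' matches any run of word characters, '?' exactly one."""
    parts = []
    for ch in expr:
        if ch == "*":
            parts.append(r"\w*")
        elif ch == "?":
            parts.append(r"\w")
        else:
            parts.append(re.escape(ch))
    return re.compile(r"\b" + "".join(parts) + r"\b", re.IGNORECASE)


def search(docs, pattern):
    """Return the names of the documents whose text matches the pattern."""
    return [name for name, text in docs.items() if pattern.search(text)]
```

Here `docs` is a plain dictionary mapping a file name to its extracted text. Against documents stored in MongoDB, the same compiled patterns could in principle be passed to a Pymongo query, but that step is left out of this sketch.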

 Index

Where the user selects the database they want to index; after that, the user sees only the list of collections with pending indexing. Once they click the Index button, indexing starts on the listed collections. Written in conjunction with the Post Build Index Writer Module, the IndexReader translates the fields in the index and re-inserts the information into a BuildResultSummary object, which has a specially designated customBuildData map for this purpose.

L2 Libraries

    OS

    Time

    Datetime

   re (regex)

   Chardet

   Pymongo

   File Operation

   Multiprocessing

   Threading

   Pandas

   NumPy

   Docx

   Docx2txt

   Shutil

   Pymupdf

   Tkinter

   Pytesseract

   Belfrywidgets