Data Parsing Application
The application parses large volumes of non-standardized information (articles) and structures the data according to a given model.
It extracts relevant information such as the author, contact information, and article details.
It also corrects user-supplied information: it removes duplicate data and adds regional details based on static reference information.
#linux
Technologies
Perl
POSIX
Highlights
- Web scraping functionality: removes HTML formatting and structures data according to a given pattern
- Gathers and generates article statistics: activity results and detailed inconsistency information
- Adapts information extraction to region- and country-specific details
- Auto-completes missing details based on existing information and a provided knowledge base
- Intuitive, user-friendly UI
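
The highlights above can be illustrated with a minimal sketch. It is not the application's actual code: the application is written in Perl, and Python is used here only for illustration. The field names, the `KNOWLEDGE_BASE` table, and the functions `strip_html` and `parse_article` are all hypothetical. The sketch removes HTML formatting with a naive regular expression, extracts fields that follow a simple "Name: value" pattern, skips duplicate entries, and fills in a missing country from a small static knowledge base.

```python
import re
from html import unescape

# Hypothetical static knowledge base mapping a city to its country.
KNOWLEDGE_BASE = {"Bucharest": "Romania", "Lyon": "France"}

# Hypothetical pattern: one "Name: value" pair per line.
FIELD_PATTERN = re.compile(
    r"^(Author|Contact|City|Country):[ \t]*(.*)$", re.MULTILINE
)


def strip_html(raw):
    """Replace each tag with a line break and decode HTML entities.

    A regex is enough for this sketch; real-world HTML usually
    calls for a proper parser.
    """
    return unescape(re.sub(r"<[^>]+>", "\n", raw))


def parse_article(raw):
    """Return a dict of extracted fields for one article."""
    record = {}
    for name, value in FIELD_PATTERN.findall(strip_html(raw)):
        key = name.lower()
        value = value.strip()
        # Keep the first non-empty value; later duplicates are ignored.
        if value and key not in record:
            record[key] = value
    # Auto-complete the country from the knowledge base when it is missing.
    if "country" not in record and record.get("city") in KNOWLEDGE_BASE:
        record["country"] = KNOWLEDGE_BASE[record["city"]]
    return record
```

For example, calling `parse_article` on `"<p>Author: Jane Doe</p><p>Author: Jane Doe</p><p>City: Lyon</p>"` yields a record with a single author entry, the city, and a country of `"France"` supplied by the knowledge base.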