D4.8 Documentation of Information Mining and Information Quality components


Greg Holdridge, Matthias Moi, Marc-André Kaufhold


This document describes the implementation of the Processing and Analysis Subsystem (PAS) of EmerGent, which contains the data enrichment, information mining (IM) and information quality (IQ) components. It expands on D4.6 [Mark15] by describing how the designs therein have been incorporated into a live system. It describes a processing pipeline in which components are connected through a simple, standard RESTful interface, together with a controller that coordinates the operation of the pipeline. The input to the system is a batch of messages collected from various social media channels. The output is that batch, filtered to improve relevance and augmented with information on the quality of the information supplied. The system is intended to be used alongside a user interface component that presents the results of processing and allows the parameters of the processing pipeline to be altered.

Purpose of the Document

This document records the operation of the Processing and Analysis Subsystem and the design choices made within it. It serves as an update to the specifications and design descriptions given in D4.6.