Oxford Computer Consultants Ltd (OCC) is a UK SME based in Oxford. Founded in 1989, OCC specialises in bespoke software development services for a wide variety of clients, focusing on desktop and web platforms. OCC has significant experience of EU projects, having originally been founded as a consequence of work on the TOPMUSS project. Since then, OCC has worked on a number of projects, most recently TERREGOV and REACT, and currently CuPID, in which OCC staff manage projects or parts of projects.
OCC is an ISO 27001 accredited software house handling social care and financial data for over 70 UK local governments. ISO 27001 concerns the management of information security, including security policies, governance of information security, the management of information assets, the acquisition, development and maintenance of secure applications, and conformance with information security policies, standards, laws and regulations.
Oxford Computer Consultants contributes knowledge based on the following specific research and work priorities:
- Information Gathering and Mining
- Mobile and web application development
- Implementation of complex algorithms and models in a commercial context
Role in the project
In EmerGent, OCC enhances data using information mining techniques, with these activities continuously monitored by the Ethics Advisory Committee, which is responsible for ensuring that all ethical issues are addressed. OCC takes the lead in obtaining data from social media sources, applying processing algorithms to it, and integrating the work done by the partners into a functioning and robust system. OCC applies the expertise and software developed during the REACT project, in which semantic textual analysis techniques and incident description methods were developed. OCC is leader of WP5 (Information Collection and Presentation) and of the following tasks:
T4.2 (Development of Information mining methods): OCC has built server applications to process data spidered from YouTube and social web sites for research purposes. In REACT we mined the data for temporal relations, co-referencing (multiple content items referring to the same subject) and linked content (separate clusters referencing different but related subjects). EmerGent will need to heuristically optimise the analysis for the content, solving a minimisation problem by tuning the parameters by which data is graded. Based on REACT, a preliminary list of these parameters could be Subject, Media, Geography and Time, but more criteria can be added through text-based analysis of the data. In the previous models, the problem with text-based analysis was that no single word carries enough significance to partition the data. There are often several words that describe a single entity (e.g. Male, Man), and there are also a great number of very common spelling mistakes and variations on abbreviations. The process carried out in REACT was to cluster on numerics, locations and names, perform spell checking, group word stems with common variations and build a custom dictionary. We were then able to cluster semantically using resources such as WordNet.
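As an illustration only, grouping several words under a single entity with a custom dictionary and word stems might be sketched as follows. This is not the REACT implementation: the dictionary entries, the crude suffix-stripping rule and all function names are assumptions made for the example.

```python
import re
from collections import defaultdict

# Hypothetical custom dictionary: maps observed variants to one canonical entity.
CUSTOM_DICTIONARY = {"man": "male", "men": "male", "woman": "female", "women": "female"}

# Very crude stand-in for a real stemmer; assumed for illustration only.
SUFFIXES = ("ing", "ed", "s")


def crude_stem(word):
    """Strip one common suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def canonical(token):
    """Map a token to its dictionary entry if present, otherwise to its stem."""
    token = token.lower()
    if token in CUSTOM_DICTIONARY:
        return CUSTOM_DICTIONARY[token]
    return crude_stem(token)


def group_by_canonical(messages):
    """Return a mapping from canonical term to the indices of messages using it."""
    groups = defaultdict(set)
    for index, message in enumerate(messages):
        for token in re.findall(r"[a-z]+", message.lower()):
            groups[canonical(token)].add(index)
    return dict(groups)


if __name__ == "__main__":
    sample = ["Man injured near bridge", "Male casualty reported", "Bridges closed"]
    for term, indices in sorted(group_by_canonical(sample).items()):
        print(term, sorted(indices))
```

With this grouping, messages mentioning "Man" and "Male" fall into the same group, which is the precondition for the semantic clustering step described above.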
T4.4 (Design of IM and IQ component) and T4.6 (Implementation and verification of IM and IQ components): These tasks build directly on real-world data, and the resulting learning is used to refine and adapt the IM and IQ techniques. Abbreviations and misspellings are generally very difficult to handle in text analysis. Social media lends itself to shorthand, and typing errors are common; for instance, OCC has analysed incident logs and found 25 different spellings of the word ‘yesterday’. The use of custom word sets allowed most of these common errors to be included and handled correctly, but these word sets need to be built empirically from the gathered content.
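One possible way to build such a word set empirically is sketched below, using string similarity from Python's standard `difflib` module to propose candidate misspellings from the gathered tokens. The similarity cutoff, the sample tokens and the function names are assumptions for illustration; shorthand that is too far from the canonical spelling would still have to be added by hand.

```python
import difflib


def candidate_variants(canonical, observed_tokens, cutoff=0.75):
    """Return observed tokens that look like misspellings of `canonical`."""
    vocabulary = sorted({token.lower() for token in observed_tokens})
    matches = difflib.get_close_matches(
        canonical, vocabulary, n=max(1, len(vocabulary)), cutoff=cutoff
    )
    return [match for match in matches if match != canonical]


def build_word_set(canonical, observed_tokens, manual_variants=()):
    """Map every accepted variant (automatic or manual) to the canonical word."""
    word_set = {v: canonical for v in candidate_variants(canonical, observed_tokens)}
    # Shorthand such as abbreviations is too dissimilar to be found automatically.
    word_set.update({v.lower(): canonical for v in manual_variants})
    return word_set


def normalise(token, word_set):
    """Replace a known variant with its canonical form; leave other tokens as-is."""
    token = token.lower()
    return word_set.get(token, token)


if __name__ == "__main__":
    # Hypothetical tokens, standing in for content gathered from incident logs.
    observed = ["yesterday", "yestarday", "yesturday", "yeserday", "today", "bridge"]
    word_set = build_word_set("yesterday", observed, manual_variants=["yday"])
    print(sorted(word_set))
    print(normalise("Yesturday", word_set))
```

In practice the proposed candidates would be reviewed before being accepted, since a similarity threshold alone can also pull in unrelated words.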
T5.4 (Information gathering from public and access-granted streams): This critical task for EmerGent requires concentrated effort from OCC and IES. OCC builds and supports server applications that run continuously, downloading and processing data from the web. OCC has experience in this and in implementing the relevant design patterns, including failover and recovery methods, bandwidth throttling and randomised access times, to collect data generated by diverse sources. The collected data is then made available to the EmerGent IM and IQ components. This is a resource-intensive process which involves the creation, management and maintenance of big data sets, securing them and controlling access.
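A minimal sketch of how these patterns could fit together is given below. It paces requests with a randomised delay (a simple form of throttling, not byte-level bandwidth control) and retries failed requests with exponential backoff as a basic recovery method. The `fetch` callable, the parameter names and the default values are assumptions for the example, not part of the EmerGent design.

```python
import random
import time


def jittered_delay(base_delay, jitter, rng=random.random):
    """Return the base delay plus a random extra of up to `jitter` seconds."""
    return base_delay + jitter * rng()


def collect(sources, fetch, base_delay=1.0, jitter=0.5, max_retries=3,
            sleep=time.sleep, rng=random.random):
    """Fetch each source politely; return (results, failed_sources).

    `fetch` is any callable taking a source and returning its data, and
    raising an exception on failure (hypothetical interface).
    """
    results = {}
    failed = []
    for source in sources:
        for attempt in range(max_retries + 1):
            # Throttle and randomise access times before every request.
            sleep(jittered_delay(base_delay, jitter, rng))
            try:
                results[source] = fetch(source)
                break
            except Exception:  # broad on purpose in this sketch
                if attempt < max_retries:
                    # Back off exponentially before the next attempt.
                    sleep(base_delay * 2 ** attempt)
        else:
            # All attempts failed; record the source so it can be retried later.
            failed.append(source)
    return results, failed


if __name__ == "__main__":
    def demo_fetch(source):
        # Stand-in for a real download; always succeeds.
        return "payload from " + source

    print(collect(["stream-a", "stream-b"], demo_fetch, base_delay=0.01, jitter=0.01))
```

Passing `sleep` and `rng` as parameters keeps the pacing logic testable without real waiting, and the list of failed sources gives a continuously running service something to reschedule.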
T6.3 (Integration of components into the system): In this task, based on D6.1, integration is considered an ongoing activity following an Agile approach to software development. Web-based services are developed online and continuously integrated and tested by the project team. Regular reports describe the progress at the end of each iterative release.
T7.4 (Exploitation): The project results in a set of tools validated in field trials and a set of guidelines for the future adoption of social media in emergency procedures. Both results are part of the project exploitation plan, which aims to identify possible commercial paths for the tools and the best strategies for improving emergency management systems and crisis management procedures. OCC has found different emergency organisations to be poorly integrated and to make little use of social media, which represents an opportunity for the project. Given the varying nature and size of emergency organisations and structures, OCC considers a variety of commercial channels, including licensing the tools, selling a managed service, and consultancy.
T7.5 (Data Protection, Privacy and Ethics): Through this task EmerGent monitors the evolution of European law on Data Protection and Privacy and the work of the Article 29 Working Party, reporting regularly through deliverable D7.5.2. OCC uses and adapts the procedures, tools and experience from its own work with personally identifiable data to guide this task.
Reynold Greenlaw, Reynold@oxfordcc.co.uk