A text mining framework for Big Data

paper.pdf - Published Version (194kB)

Pavlopoulou, N., Abushwashi, A., Stahl, F. (ORCID: https://orcid.org/0000-0002-4860-0203) and Scibetta, V. (2017) A text mining framework for Big Data. Expert Update, 17 (1). ISSN 1465-4091 (Special Issue on the 1st BCS SGAI Workshop on Data Stream Mining Techniques and Applications)

Abstract/Summary

Text Mining is the ability to generate knowledge (insight) from text. This is a challenging task, especially when the target text databases are very large. Big Data has attracted much attention lately, both from academia and industry. A number of distributed databases, search engines and frameworks have been developed to handle the memory and time constraints, which are required to process a large amount of data. However, there is no open-source end-to-end framework that can combine near real-time and batch processing of ingested big textual data along with user-defined options and provision of specific, reliable insight from the data. This is important as this way new unstructured information is made accessible in near real-time, more personalised customer products can be created and novel unusual patterns can be found and actioned on quickly. This work focuses on a proprietary complete near real-time automated classification framework for unstructured data with the use of Natural Language Processing and Machine Learning algorithms on Apache Spark. The evaluation of our framework shows that it achieves a comparable accuracy with respect to some of the best approaches presented in the literature.

Item Type Article
URI https://reading-clone.eprints-hosting.org/id/eprint/70108
Refereed Yes
Divisions Science > School of Mathematical, Physical and Computational Sciences > Department of Computer Science
Publisher BCS Specialist Group on Artificial Intelligence