Search from over 60,000 research works

Advanced Search

J-measure based hybrid pruning for complexity reduction in classification rules

[thumbnail of WSEAS_PAPER.pdf]
Preview
WSEAS_PAPER.pdf - Published Version (764kB) | Preview
Add to AnyAdd to TwitterAdd to FacebookAdd to LinkedinAdd to PinterestAdd to Email

Liu, H., Gegov, A. and Stahl, F. orcid id iconORCID: https://orcid.org/0000-0002-4860-0203 (2013) J-measure based hybrid pruning for complexity reduction in classification rules. WSEAS Transactions on Systems, 12 (9). pp. 433-446. ISSN 2224-2678

Abstract/Summary

Prism is a modular classification rule generation method based on the ‘separate and conquer’ approach that is alternative to the rule induction approach using decision trees also known as ‘divide and conquer’. Prism often achieves a similar level of classification accuracy compared with decision trees, but tends to produce a more compact noise tolerant set of classification rules. As with other classification rule generation methods, a principle problem arising with Prism is that of overfitting due to over-specialised rules. In addition, over-specialised rules increase the associated computational complexity. These problems can be solved by pruning methods. For the Prism method, two pruning algorithms have been introduced recently for reducing overfitting of classification rules - J-pruning and Jmax-pruning. Both algorithms are based on the J-measure, an information theoretic means for quantifying the theoretical information content of a rule. Jmax-pruning attempts to exploit the J-measure to its full potential because J-pruning does not actually achieve this and may even lead to underfitting. A series of experiments have proved that Jmax-pruning may outperform J-pruning in reducing overfitting. However, Jmax-pruning is computationally relatively expensive and may also lead to underfitting. This paper reviews the Prism method and the two existing pruning algorithms above. It also proposes a novel pruning algorithm called Jmid-pruning. The latter is based on the J-measure and it reduces overfitting to a similar level as the other two algorithms but is better in avoiding underfitting and unnecessary computational effort. The authors conduct an experimental study on the performance of the Jmid-pruning algorithm in terms of classification accuracy and computational efficiency. The algorithm is also evaluated comparatively with the J-pruning and Jmax-pruning algorithms.

Item Type Article
URI https://reading-clone.eprints-hosting.org/id/eprint/34529
Item Type Article
Refereed Yes
Divisions Science > School of Mathematical, Physical and Computational Sciences > Department of Computer Science
Uncontrolled Keywords Data Mining, Machine Learning, Classification Rules, J-pruning, Jmax-pruning, Jmid-pruning, if-then rules, overfitting, J-measure
Publisher WESAS
Download/View statistics View download statistics for this item

Downloads

Downloads per month over past year

University Staff: Request a correction | Centaur Editors: Update this record

Search Google Scholar