Analysis of outlier detection rules based on the ASHRAE global thermal comfort database

Text - Accepted Version
· Available under License Creative Commons Attribution Non-commercial No Derivatives.
· Please see our End User Agreement before downloading.

It is advisable to refer to the publisher's version if you intend to cite from this work. See Guidance on citing.


Zhang, S., Yao, R. (ORCID: https://orcid.org/0000-0003-4269-7224), Du, C., Essah, E. (ORCID: https://orcid.org/0000-0002-1349-5167) and Li, B. (2023) Analysis of outlier detection rules based on the ASHRAE global thermal comfort database. Building and Environment, 234. 110155. ISSN 1873-684X. doi: 10.1016/j.buildenv.2023.110155

Abstract/Summary

The ASHRAE Global Thermal Comfort Database has been used extensively for analyzing specific thermal comfort parameters or models, evaluating subjective metrics, and integrating with machine learning algorithms. Outlier detection is regarded as an essential step in data preprocessing, but publications based on this database have paid little attention to the influence of outliers in the raw datasets. This study investigates the filtering performance of different outlier detection methods. Three stochastic-based approaches (the 3-Sigma, Boxplot, and Hampel rules) are applied and analyzed in a case study that predicts thermal preference with the Support Vector Machine (SVM) algorithm, comparing the predictions before and after outlier removal. Results show that all three rules can filter some obvious outliers. The Boxplot rule produces the most moderate filter results, whereas the 3-Sigma rule sometimes fails to detect outliers and the Hampel rule may be so aggressive that it raises false alarms. The study also finds that a small reduction in the data used to establish machine learning models can result in less complicated and smoother decision boundaries, which has the potential to provide more energy-efficient and conflict-free solutions.
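
As a rough illustration of the three rules named in the abstract, the following minimal sketch applies each to a single numeric variable. It is not the authors' implementation: the thresholds (3 standard deviations, 1.5 times the interquartile range, 3 scaled median absolute deviations with the 1.4826 consistency constant) are common textbook conventions assumed here, the function names are hypothetical, and the temperature values are made up for the example rather than taken from the database.

```python
import numpy as np


def three_sigma_mask(x, k=3.0):
    """Flag points more than k sample standard deviations from the mean."""
    x = np.asarray(x, dtype=float)
    return np.abs(x - x.mean()) > k * x.std(ddof=1)


def boxplot_mask(x, k=1.5):
    """Flag points outside [Q1 - k*IQR, Q3 + k*IQR]."""
    x = np.asarray(x, dtype=float)
    q1, q3 = np.percentile(x, [25, 75])
    iqr = q3 - q1
    return (x < q1 - k * iqr) | (x > q3 + k * iqr)


def hampel_mask(x, k=3.0):
    """Flag points more than k scaled MADs from the median."""
    x = np.asarray(x, dtype=float)
    med = np.median(x)
    # 1.4826 makes the MAD comparable to a standard deviation
    # for normally distributed data (an assumed convention).
    mad = 1.4826 * np.median(np.abs(x - med))
    return np.abs(x - med) > k * mad


# Hypothetical air temperatures in degrees C with one implausible reading.
temps = np.array([22.0, 23.1, 22.5, 24.0, 23.4, 22.8, 23.0, 45.0])

for name, rule in [("3-Sigma", three_sigma_mask),
                   ("Boxplot", boxplot_mask),
                   ("Hampel", hampel_mask)]:
    print(name, temps[rule(temps)])
```

In this small sample the extreme reading inflates the mean and standard deviation enough that the 3-Sigma rule does not flag it, while the Boxplot and Hampel rules, which rely on quartiles and the median, do. This is consistent with the abstract's remark that the 3-Sigma rule sometimes fails to detect outliers, though a toy sample cannot reproduce the paper's comparison.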


Item Type Article
URI https://reading-clone.eprints-hosting.org/id/eprint/111009
Identification Number/DOI 10.1016/j.buildenv.2023.110155
Refereed Yes
Divisions Science > School of the Built Environment > Construction Management and Engineering
Science > School of the Built Environment > Energy and Environmental Engineering group
Publisher Elsevier