Copula-based synthetic data augmentation for machine-learning emulators

Text (Open access) - Published Version
· Available under License Creative Commons Attribution.

It is advisable to refer to the publisher's version if you intend to cite from this work. See Guidance on citing.

Meyer, D. (ORCID: https://orcid.org/0000-0002-7071-7547), Nagler, T. and Hogan, R. J. (ORCID: https://orcid.org/0000-0002-3180-5157) (2021) Copula-based synthetic data augmentation for machine-learning emulators. Geoscientific Model Development, 14 (8). pp. 5205-5215. ISSN 1991-9603. doi: 10.5194/gmd-14-5205-2021

Abstract/Summary

Can we improve machine-learning (ML) emulators with synthetic data? If data are scarce or expensive to source and a physical model is available, statistically generated data may be useful for augmenting training sets cheaply. Here we explore the use of copula-based models for generating synthetically augmented datasets in weather and climate by testing the method on a toy physical model of downwelling longwave radiation and corresponding neural network emulator. Results show that for copula-augmented datasets, predictions are improved by up to 62 % for the mean absolute error (from 1.17 to 0.44 W m−2).

Item Type: Article
URI: https://reading-clone.eprints-hosting.org/id/eprint/101309
Identification Number/DOI: 10.5194/gmd-14-5205-2021
Refereed: Yes
Divisions: Science > School of Mathematical, Physical and Computational Sciences > Department of Meteorology
Publisher: European Geosciences Union