Towards decoupling the selection of compression algorithms from quality constraints – an investigation of lossy compression efficiency

Text (Open access)
- Published Version
· Available under License Creative Commons Attribution Non-commercial.

Kunkel, J. M., Novikova, A. and Betke, E. (2017) Towards decoupling the selection of compression algorithms from quality constraints – an investigation of lossy compression efficiency. Supercomputing Frontiers and Innovations, 4 (4). pp. 17-33. ISSN 2313-8734 doi: 10.14529/jsfi170402

Abstract/Summary

Data-intensive scientific domains use data compression to reduce the storage space needed. Lossless data compression preserves information accurately, but lossy data compression can achieve much higher compression rates depending on the tolerable error margins. There are many ways to define precision and to exploit this knowledge; therefore, the field of lossy compression is subject to active research. From the perspective of a scientist, only the qualitative definition of the implied loss of data precision should matter. With the Scientific Compression Library (SCIL), we are developing a meta-compressor that allows users to define various quantities for acceptable error and expected performance behavior. The library then picks a suitable chain of algorithms that satisfies the user’s requirements; the ongoing work is a preliminary stage for the design of an adaptive selector. This approach is a crucial step towards a scientifically safe use of much-needed lossy data compression, because it disentangles the task of determining the scientific characteristics of tolerable noise from the task of determining an optimal compression strategy. Future algorithms can be used without changing application code. In this paper, we evaluate various lossy compression algorithms for compressing different scientific datasets (Isabel, ECHAM6), and focus on the analysis of synthetically created data that serves as a blueprint for many observed datasets. We also briefly describe the available quantities of SCIL to define data precision and introduce two efficient compression algorithms for individual data points. The evaluation shows that the best algorithm depends on user settings and data properties.
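The separation described in the abstract, where the user states a tolerable error and a selector chooses the algorithm, can be illustrated with a minimal sketch. This is not SCIL's API: the function names (`pack_doubles`, `lossless`, `quantize`, `select`), the single absolute-tolerance quantity, and the two candidate algorithms are all assumptions made for illustration, and the input is assumed to be a non-empty list of finite floats.

```python
import struct
import zlib


def pack_doubles(values):
    """Serialize floats as little-endian 64-bit doubles."""
    return struct.pack(f"<{len(values)}d", *values)


def lossless(values, abs_tol):
    """Hypothetical candidate: zlib on the raw doubles (error is always 0)."""
    return zlib.compress(pack_doubles(values)), list(values)


def quantize(values, abs_tol):
    """Hypothetical candidate: snap values to a grid, then zlib the grid indices.

    The grid width equals abs_tol, so the rounding error stays near
    abs_tol / 2, which leaves headroom for floating-point rounding.
    """
    base = min(values)
    codes = [round((v - base) / abs_tol) for v in values]
    payload = struct.pack(f"<{len(codes)}q", *codes)
    restored = [base + c * abs_tol for c in codes]
    return zlib.compress(payload), restored


def select(values, abs_tol):
    """Return (name, compressed_size) of the smallest candidate within tolerance."""
    # The lossy candidate is only meaningful for a positive tolerance.
    candidates = [lossless] + ([quantize] if abs_tol > 0 else [])
    best = None
    for algo in candidates:
        try:
            blob, restored = algo(values, abs_tol)
        except (OverflowError, struct.error):
            continue  # the candidate cannot encode this data; skip it
        # Verify the user's constraint instead of trusting the algorithm.
        max_err = max(abs(a - b) for a, b in zip(values, restored))
        if max_err <= abs_tol and (best is None or len(blob) < best[1]):
            best = (algo.__name__, len(blob))
    return best
```

The caller only supplies `abs_tol`; which algorithm wins depends on the data and on the tolerance, so a new candidate could be added to the list without changing calling code. A real selector would also weigh throughput and support further precision quantities, which this sketch omits.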

Item Type	Article
URI	https://reading-clone.eprints-hosting.org/id/eprint/77683
Identification Number/DOI	10.14529/jsfi170402
Refereed	Yes
Divisions	Science > School of Mathematical, Physical and Computational Sciences > Department of Computer Science
Publisher	South Urals University

Deposit Details

Date Deposited:	27 Jun 2018 10:22
Last Modified:	09 Jun 2024 05:37
