zstandard compression in GeoTIFF

I ran GDAL 2.4’s gdal_translate (GDAL 2.4.0dev-333b907 or GDAL 2.4.0dev-b19fd35e6f-dirty, I am not sure) on some GeoTIFFs to compare the new ZSTD compression support to DEFLATE in file sizes and time taken.

Hardware was a mostly idle Intel(R) Xeon(R) CPU E3-1245 V2 @ 3.40GHz with fairly old ST33000650NS (Seagate Constellation) harddisks and lots of RAM.

A small input file was DGM1_2x2KM_XYZ_HH_2016-01-04.tif with about 40,000 x 40,000 pixels at around 700 Megabytes.
A big input file was srtmgl1.003.tif with about 1,3000,000 x 400,000 pixels at 87 Gigabytes.
Both input files had been DEFLATE compressed at the default level 6 without using a predictor (that’s what the default DEFLATE level will make them smaller here).

gdal_translate -co NUM_THREADS=ALL_CPUS -co PREDICTOR=2 -co TILED=YES -co BIGTIFF=YES --config GDAL_CACHEMAX 6144 was used all the time.
For DEFLATE -co COMPRESS=DEFLATE -co ZLEVEL=${level} was used, for ZSTD -co COMPRESS=ZSTD -co ZSTD_LEVEL=${level}

Mind the axes, sometimes I used a logarithmic scale!

Small file

DEFLATE

ZSTD

Big file

DEFLATE

ZSTD

Findings

It has been some weeks since I really looked at the numbers, so I am making the following up spontaneously. Please correct me!
Those numbers in the findings below should be percentages (between the algorithms, to their default values, etc), but my curiosity was satisfied. At the bottom is the data, maybe you can take it to present a nicer evaluation? ;)

ZSTD is powerful and weird. Sometimes subsequent levels might lead to the same result, sometimes a higher level will be fast or bigger. On low levels it is just as fast as DEFLATE or faster with similar or smaller sizes.

A <700 Megabyte version of the small file was accomplished within a minute with DEFLATE (6) or half a minute with ZSTD (5). With ZSTD (17) it got down to <600 Megabyte in ~5 Minutes, while DEFLATE never got anywhere near that.
Similarly for the big file, ZSTD (17) takes it down to 60 Gigabytes but it took almost 14 hours. DEFLATE capped at 65 Gigabytes. The sweet spot for ZSTD was at 10 with 4 hours for 65 Gigabytes (DEFLATE took 11 hours for that).

In the end, it is hard to say what default level ZSTD should take. For the small file level 5 was amazing, being even smaller than and almost twice as fast as the default (9). But for the big file the gains are much more gradual, here level 3 or level 10 stand out. I/O might be to blame?

Yes, the machine was not stressed and I did reproduce those weird ones.

Raw numbers

Small file

Algorithm Level Time [s] Size [Bytes] Size [MB] Comment
ZSTD 1 18 825420196 787
ZSTD 2 19 783437560 747
ZSTD 3 21 769517199 734
ZSTD 4 25 768127094 733
ZSTD 5 31 714610868 682
ZSTD 6 34 720153450 687
ZSTD 7 40 729787784 696
ZSTD 8 42 729787784 696
ZSTD 9 51 719396825 686 default
ZSTD 10 63 719394955 686
ZSTD 11 80 719383624 686
ZSTD 12 84 712429763 679
ZSTD 13 133 708790567 676
ZSTD 14 158 707088444 674
ZSTD 15 265 706788234 674
ZSTD 16 199 632481860 603
ZSTD 17 287 621778612 593
ZSTD 18 362 614424373 586
ZSTD 19 549 617071281 588
ZSTD 20 834 617071281 588
ZSTD 21 1422 616979884 588
DEFLATE 1 25 852656871 813
DEFLATE 2 26 829210959 791
DEFLATE 3 32 784069125 748
DEFLATE 4 31 758474345 723
DEFLATE 5 39 752578464 718
DEFLATE 6 62 719159371 686 default
DEFLATE 7 87 710755144 678
DEFLATE 8 200 705440096 673
DEFLATE 9 262 703038321 670

Big file

Algorithm Level Time [m] Size [Bytes] Size [MB] Comment
ZSTD 1 70 76132312441 72605
ZSTD 2 58 75351154492 71860
ZSTD 3 63 73369706659 69971
ZSTD 4 75 73343346296 69946
ZSTD 5 73 72032185603 68695
ZSTD 6 91 72564406429 69203
ZSTD 7 100 71138034760 67843
ZSTD 8 142 71175109524 67878
ZSTD 9 175 71175109524 67878 default
ZSTD 10 235 69999288435 66757
ZSTD 11 406 69999282203 66757
ZSTD 12 410 69123601926 65921
ZSTD 13 484 69123601926 65921
ZSTD 14 502 68477183815 65305
ZSTD 15 557 67494752082 64368
ZSTD 16 700 67494752082 64368
ZSTD 17 820 64255634015 61279
ZSTD 18 869 63595433364 60649
ZSTD 19 1224 63210562485 60282
ZSTD 20 2996 63140602703 60216
ZSTD 21 lolno
DEFLATE 1 73 87035905568 83004
DEFLATE 2 76 85131650648 81188
DEFLATE 3 73 79499430225 75817
DEFLATE 4 77 75413492394 71920
DEFLATE 5 92 76248511117 72716
DEFLATE 6 129 73901542836 70478 default
DEFLATE 7 166 73120114047 69733
DEFLATE 8 407 70446588490 67183
DEFLATE 9 643 70012677124 66769

2 thoughts on “zstandard compression in GeoTIFF

  1. Pingback: Johannes Kröger to GeoHipster: “$existing_free_software can do that already.” | GeoHipster

  2. Pingback: Johannes Kröger to GeoHipster: “$existing_free_software can do that already.” – GeoHipster

Leave a Reply

Your email address will not be published.

This site uses Akismet to reduce spam. Learn how your comment data is processed.