Von Wegen Lisbeth – B*tch as a slit-scan

Crop of one of the resulting images

The song Bitch by Von Wegen Lisbeth has a crazy music video in which the camera seems to rotate around itself in a chaotic scene while someone in the background walks from left to right.

I did something silly with it: a slit-scan-style transformation of the video. Started in 2019, and now finally rendered properly.

Recipe

  • Extract the video frames: 5621 frames at a resolution of 1920×1080 pixels
  • Cut each frame into individual 1×1080 pixel strips: 5621 copies of the leftmost strip, 5621 copies of the second-leftmost strip, and so on.
  • Assemble one image per strip position: on the far left the leftmost strip of the first frame, right next to it the leftmost strip of the second frame, then the one from the third, etc.: 1920 images at a resolution of 5621×1080 pixels
This shows how one of the images is assembled from the individual pixel strips of the frames
  • Assemble the resulting images into a video: a video at 5621×1080 resolution with 1920 frames

I did it with ffmpeg, graphicsmagick and a Bash script. Super inefficient, but very effective ;)
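
For illustration, here is the core transposition sketched in Python (not the actual pipeline used here; it assumes the frames were already extracted with ffmpeg and it naively loads everything into RAM, which for this video means on the order of 35 GB, so for real use you would process it in chunks):

import glob
import os

import numpy as np
from PIL import Image

# Assumes the frames were extracted beforehand, e.g. with:
#   ffmpeg -i video.mp4 frames/%05d.png
frame_files = sorted(glob.glob("frames/*.png"))

# Stack everything into one array: (n_frames, height, width, channels).
# Careful: 5621 frames at 1920×1080 RGB are roughly 35 GB of RAM.
frames = np.stack([np.asarray(Image.open(f).convert("RGB")) for f in frame_files])
n_frames, height, width, channels = frames.shape

os.makedirs("slitscan", exist_ok=True)

# Swap the time axis and the horizontal axis:
# output image number x is made of column x of every input frame.
for x in range(width):
    strips = frames[:, :, x, :]                             # (n_frames, height, channels)
    out = np.ascontiguousarray(strips.transpose(1, 0, 2))   # (height, n_frames, channels)
    Image.fromarray(out).save(f"slitscan/{x:05d}.png")

# The output images can then be reassembled into a video with ffmpeg.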

Result

Here it is in small; the link below leads to the “full version”. Everything wobbles around a bit because the camera was not moved at a constant speed, or something like that.

You cannot easily host or properly view a video this large anywhere (or can you?). Conveniently, there is the Gigapan Time Machine project, which Google also uses for its Timelapse feature. With it you can take the images and make them viewable like an animated web map, with zoom, pan and reasonably efficient data access. Unfortunately the tool is outdated, no longer in development and makes it practically impossible to tweak the video parameters yourself.

Full resolution: https://hannes.enjoys.it/stuff/lisbeth-slitscan/view.html

Careful, potentially high data usage

PS: (I squashed the result vertically so the proportions fit better. And it runs backwards, because that way the timing of most movements works out. Quite the brainfuck.)

Huh?!

Yeah, it is hard to understand what is happening here. A few nice diagrams would help (which pixel strip goes where and what was assembled how), but that is too time-consuming for me. So just a few food-for-thought pointers, maybe they help:

The original video has 5621 frames at 1920×1080.
This video has 1920 frames at 5621×1080.

The lady with the cloth at the beginning is visible until about frame 750 in the original video.
In the transformed video she is never further than about 750 pixels from the left edge of the image.
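
Or, most compactly: the whole transformation is just a swap of the time axis and the horizontal axis. In (made-up) array notation:

# pixel (x, y) of transformed frame number t
# is pixel (t, y) of original frame number x
transformed_frames[t][y][x] == original_frames[x][y][t]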

Disclaimer

This is an art project and I hope it is covered by something like freedom of artistic expression. If not: a short mail (hannes ät enjoys dotter it) and I will remove it from the site.

❤️ Von Wegen Lisbeth!

The effect of lossy LERC compression, visually “explained”

LERC is a kick-ass approach to 2D raster data compression, supported in GDAL since version 3.3. You can use it for lossless compression but it is also able to throw away some bits of information for smaller data sizes. You tell it which level of Z error is acceptable for your values and it will use that freedom to change the values of neighboring cells to do its magic. Z here means the “data” axis of a single-band 2D raster; X and Y are the coordinates, or rather the locations of the data values in the raster, which are obviously not changed.

I thought it would be nice to show what it actually results in; you can read up on the details elsewhere if you want. Please look at the images in full resolution.

I used a global SRTM DEM with Z values in full meters (integers, not floating point) and applied LERC to it in three ways: lossless, with a maximum Z error of 1 meter, and with a maximum Z error of 10 meters. Zstandard compression was always used.
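
Creating such variants yourself boils down to the LERC creation options of GDAL’s GeoTIFF driver. A minimal sketch via the Python bindings (file names are placeholders, not the exact commands used here):

from osgeo import gdal

# Sketch: build the three LERC variants of a DEM (file names are placeholders).
# COMPRESS=LERC_ZSTD combines LERC with Zstandard, MAX_Z_ERROR sets the
# allowed error per value (0 means lossless).
for name, max_z_error in [("lossless", 0), ("maxzerror1", 1), ("maxzerror10", 10)]:
    gdal.Translate(
        f"srtm_lerc_{name}.tif",
        "srtm.tif",
        creationOptions=[
            "COMPRESS=LERC_ZSTD",
            f"MAX_Z_ERROR={max_z_error}",
            "TILED=YES",
            "NUM_THREADS=ALL_CPUS",
        ],
    )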

The original GeoTIFF file was already very well compressed with Zstandard level 15 and a horizontal predictor; at ~1296000 × ~417600 pixels it has a size of 86 gigabytes including overviews.

  • Original (ZSTD level 15): 86 GB
  • LERC_ZSTD (lossless): 105 GB
  • LERC_ZSTD (maximum Z error of 1 m): 81 GB
  • LERC_ZSTD (maximum Z error of 10 m): 21 GB

Cool, so if we don’t care about an error of 10 meters, we can have a global DEM (well, as global as SRTM is with its 60° cut-off) at ~30 meters pixel resolution in 21 gigabytes. But what does that actually look like then and how will this error appear? Well, check it out:

Here are some samples visualised with a greyscale color ramp (locally adjusted, so the lowest value in the image is black, the highest value is white). They are shown at 1:1 resolution, one pixel in the image (if you look at it at 100%) is one cell of the DEM data. The left image is lossless, the middle one was allowed a Z error of 1 meter, the right one 10 meters.

Mountainous, here the values range from 0 meters to about 2000 meters:

You can hardly see a difference, at least visually.

“Mediumish”, values between ~100 and ~500 meters:

At the 10 meter error level you can see a significant terracing effect.

Plains, values all around 100 meters:

You can see some structures collapsing into flat areas in the 1 meter version and oh wow, that 10 meter version looks like upscaled pixels.

Time to zoom in! I picked a less flat area again because it makes it easier to understand. Here the values are between ~100 and ~300 meters:

So what do we see here? Neighboring cells with the same values compress better, so LERC is shifting the values around (within the allowed error), creating terraces of same-valued cells. If you look closely you can also see a visible pattern of squarish structures. Those are the blocks or windows in which LERC looks at the data and does its adjustments; in this case they were 8×8 pixels. Note: What LERC does exactly is a bit more complex than “try to make neighboring values the same”, it actually looks at the bits required to store the values within a block and optimizes that within the error tolerance.
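
To get a feeling for why the terraces appear, here is a heavily simplified toy version of that idea in Python. This is not the actual LERC algorithm, just an illustration of “quantize within the error tolerance”:

import numpy as np

# Toy illustration (NOT real LERC): within a block, values are quantized
# relative to the block minimum with a step of 2 * max_z_error, so the
# reconstruction error stays <= max_z_error.
def toy_lerc_block(block, max_z_error):
    if max_z_error == 0:
        return block
    step = 2 * max_z_error
    offset = block.min()
    quantized = np.round((block - offset) / step)
    return offset + quantized * step

block = np.array([[101, 103, 104], [108, 110, 113], [118, 121, 125]])
print(toy_lerc_block(block, max_z_error=10))
# many neighboring cells collapse onto the same value -> terraces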

And now you know what LERC can do if you give it an error level to play with.

For reference, here is that same-ish area with the error tolerance at 1 meter:

You have to zoom in quite a bit more to be able to see the effects here due to the nature of the data in this extent in combination with the particular error tolerance:

The larger the differences between your Z values are, relative to the error tolerance, the less pronounced this effect will be. If there are steps of 100 meters between neighboring pixels, an extra error of 10 meters won’t make much of a difference. But in flatter areas it will have significant “terracing” effects, as you could see above. This is similar to “banding” in images with little color variation, e. g. a blue sky or an artificial color gradient, viewed at a bit depth and resolution where your eyes can distinguish the individual steps.

So if you want to use LERC with a lossy approach, think hard about what is going to happen with your data later. What kind of analysis will be performed, how will it be “looked” at, what will be calculated. Do it smart and you can have a predictable/controllable lossy compression with seriously small file sizes, do it without thinking and your data will lead to misinterpretation and apocalypse.

How to build your own aerial image Twitter bot

About a year ago I built a small Twitter bot that posts an aerial image of Hamburg, Germany every day.

As I will shut it down (no reason, just decluttering), here’s how it works so you can run your own:

  1. Have aerial images with a permissive license
  2. Register a Twitter account and set it up for posting via API access
  3. Write code to pick an image and post it

Done!

Okok, I am kidding.

I used https://suche.transparenz.hamburg.de/dataset/digitale-orthophotos-belaubt-hamburg3 as data source because they have a CC-BY-like license, allowing such a project without any legal complications. As a side benefit these images are already provided in tiles: There is not one big image of the whole region but 1 km² image tiles. Perfect for random selection, quick resizing and posting.

Registering a Twitter account and setting it up is a privacy nightmare and, being the good human being you are, you ought to hate every part of it. I followed the Twython tutorial for OAuth1 in a live Python interpreter session, which was fairly painless. If you do not want to link a phone number to your Twitter account (see above for how you should feel about that), this approach worked for me in the past, but I bet they use regional profiling or worse, so your luck might differ.
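
Roughly, that interactive OAuth1 dance looks like this (a sketch following the Twython docs, not necessarily letter-perfect):

from twython import Twython

# One-time OAuth1 setup in a Python interpreter, roughly as in the Twython docs.
APP_KEY = 'consumer key from the Twitter app settings'
APP_SECRET = 'consumer secret from the Twitter app settings'

twitter = Twython(APP_KEY, APP_SECRET)
auth = twitter.get_authentication_tokens()
print(auth['auth_url'])  # open in a browser, log in, note the PIN/verifier

oauth_verifier = input('PIN: ')
twitter = Twython(APP_KEY, APP_SECRET,
                  auth['oauth_token'], auth['oauth_token_secret'])
final = twitter.get_authorized_tokens(oauth_verifier)
# final['oauth_token'] and final['oauth_token_secret'] are the
# OAUTH_TOKEN and OAUTH_TOKEN_SECRET used in the script below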

Next you need some code to do the work for you automatically. Here is what I wrote with PIL for the image processing and Twython for posting:

import glob
import random
from io import BytesIO

from PIL import Image
from twython import Twython

## Initialise Twitter
# use a Python interpreter to follow the steps on 
# https://twython.readthedocs.io/en/latest/usage/starting_out.html#oauth-1-user-authentication 
# don't rush, there are some intermediate keys iirc...
APP_KEY = '1234567890ABCDEFGHIJKLMNO'
APP_SECRET = '12345678901234567890123456F8U0CAKATAWAIATATAEARAAA'
OAUTH_TOKEN = '1234567890123456789123456F8U0CAKATAWAIATATAEARAAAA'
OAUTH_TOKEN_SECRET = '123456789012345678901234567890FFSFFSFFSFFSFFS'
twitter = Twython(APP_KEY, APP_SECRET, OAUTH_TOKEN, OAUTH_TOKEN_SECRET)

## Prepare the image
# via https://twython.readthedocs.io/en/latest/usage/advanced_usage.html#posting-a-status-with-an-editing-image

# Pick the image
jpegs = glob.glob("DOP20_HH_sommerbefliegung_2019.zip/*.jpg")
todays_image = random.choice(jpegs)
print(f"Today we will post {todays_image}")
photo = Image.open(todays_image)

# Resize the image
basewidth = 1000
wpercent = (basewidth / float(photo.size[0]))
height = int((float(photo.size[1]) * float(wpercent)))
resized_photo = photo.resize((basewidth, height), Image.ANTIALIAS)

# "Save" the resulting image in temporary memory
stream = BytesIO()
resized_photo.save(stream, format='JPEG')
stream.seek(0)

## We've got what we need, let's tweet
license = "dl-de/by-2-0 (Freie und Hansestadt Hamburg, Landesbetrieb Geoinformation und Vermessung)"
tweet = f"Das ist #IrgendwoInHH, aber wo denn nur?\n\n#codeforhamburg\n\nBild: {license}"

response = twitter.upload_media(media=stream)
twitter.update_status(status=tweet, media_ids=[response['media_id']])

First we initialize that Twython thingie, then we pick a random image from a directory of .jpg files, then we resize it to a maximum width of 1000 pixels (you can simplify that if your images are square…) and finally we post it to Twitter.

Set up a cronjob or systemd timer or alarm clock to run the script as often as you like and that’s it.

Digital surface model (DOM) of Hamburg

Well, look at that, the LGV has published an image-based DOM (digital surface model) of Hamburg in the Transparenzportal… And not just one but two, one from 2018 and one from 2020. The grid spacing is 1 meter; I wonder whether it came that way or was filtered down for publication?

Whatever. This is great! It enables a whole lot of applications (sightlines! shading analysis! meshing! VR! AR!) and all kinds of actors will absolutely celebrate this data. Even though 1 meter resolution is really miserably coarse, the data is rasterized onto a 1 meter grid (not thinned out, i.e. it is partly more distorted and “off”) and it is “only” image-based (not laser-scanned), you can still do quite a bit with it.

Try it out! In the browser!

Careful, fiddly controls! Best use the WASD seagull mode with speed 1000. Or double-click somewhere to zoom in there.

https://hamburg.datenatlas.de/DOM1_XYZ_HH_2018_04_30-colored.html
https://hamburg.datenatlas.de/DOM1_XYZ_HH_2020_04_30-colored.html

Data preparation as LAZ

For 2018 the data is provided as 12768 individual XYZ tiles, i.e. as super inefficient text files. In total that is around 22 gigabytes. For 2020 it is 827 larger tiles instead, but also in XYZ and with a similar footprint.

Thankfully there are free tools like txt2las, which can convert them quickly and easily into the super efficient LAZ format:

txt2las -i DOM1_XYZ_HH_2018_04_30/*.xyz -epsg 25832 -odir /tmp/laz/ -olaz

That took about 5 minutes for me and left just 700 megabytes. The ZIP, by the way, was more than 3 gigabytes.

The individual files can be merged with lasmerge:

lasmerge -i /tmp/laz/*.laz -olaz -o DOM1_XYZ_HH_2018_04_30.laz

Interactive 3D web application

Then quickly thrown into the great PotreeConverter by Markus Schütz with

PotreeConverter DOM1_XYZ_HH_2018_04_30.laz -o web/ --encoding BROTLI \
  --generate-page DOM1_XYZ_HH_2018_04_30 --title DOM1_XYZ_HH_2018_04_30

and 4 minutes later the interactive 3D web application is done (see below), now at around 3 gigabytes because of the additional octree structure.

Coloring the point cloud with colors from the orthophotos

The provided surface models are as bare-bones as it gets: pure XYZ data without any further dimensions such as color.

Luckily there are also the orthophotos; maybe the very same imagery was even used? Someone would have to dig through the pile of data, the relevant metadata is not included with the DOPs…

So in theory you could just colorize it. Unfortunately lascolor is proprietary and comes with creepy, malicious options if you dare to use it “unlicensed” (“Please note that the unlicensed version will (…) slightly change the LAS point order, and randomly add a tiny bit of white noise to the points coordinates once you exceed a certain number of points in the input file.”), and it cannot read JPEG-in-GeoTIFF (which is how I had prepared the DOPs). An alternative is the brilliant PDAL. With a pipeline like

{
    "pipeline": [
        "DOM1_XYZ_HH_2020_04_30.laz",
        {
            "type": "filters.colorization",
            "raster": "DOP20_HH_fruehjahrsbefliegung_2020.tif"
        },
        {
            "type": "writers.las",
            "compression": "true",
            "minor_version": "2",
            "dataformat_id": "3",
            "filename":"DOM1_XYZ_HH_2020_04_30-colored.laz"
        }
    ]
}

and

pdal pipeline DOM1_XYZ_HH_2020_04_30.laz+DOP20_HH_fruehjahrsbefliegung_2020_90.cog.tif.json

the point cloud is colorized within minutes and can then be put into an interactive 3D viewer with PotreeConverter as before.

The result is better than expected, since it apparently really is the same imagery (for both years). On the other hand it is not exactly pretty either, because the DOPs are not true orthophotos, so taller buildings appear tilted in the images. You can see that nicely at the Planetarium here.

DOM as GeoTIFF

If you would rather have it as a GeoTIFF, it is a bit harder, because GDAL does not cope well with this kind of tiling (with gaps and in the provided ordering). My go-to tool for this is GMT.

gmt xyz2grd $(gmt gmtinfo -I- *.xyz) -Vl -I1 -G/tmp/gmt.tif=gd:GTiff *.xyz

The GeoTIFF created this way can then be optimized with GDAL (as of GDAL 3.2.3/3.3):

gdal_translate -of COG -co COMPRESS=DEFLATE -co PREDICTOR=2 \
  --config GDAL_NUM_THREADS ALL_CPUS --config GDAL_CACHEMAX 50%  \
  -a_srs EPSG:25832 /tmp/gmt.tif DOM1_XYZ_HH_2020_04_30.tif

Here in comparison with the DGM1, as hillshades:

Meshing into a 3D model

Unfortunately I have not found a good solution for meshing it into a 3D model. tin-terrain and dem2mesh cannot handle this much data at once, and I did not look any further. If you know something good, you can pick up cookies or beer from me at the next opportunity. ;)

Data behind the images and the viewers
Datenlizenz Deutschland Namensnennung 2.0, Freie und Hansestadt Hamburg, Landesbetrieb Geoinformation und Vermessung (LGV)

Tablets supported by Lineage OS in 2021

Looking for a 10 inch tablet with proper, official Lineage OS support, I ended up on the device database. Unfortunately they do not have a simple list of device types, so I wrote some code to extract the tablets supported by at least version 17 (a sketch of the approach is at the end of this post). The list is short…:

7 inch tablets supported by Lineage OS in 2021

Google Nexus 7 2013 (Wi-Fi, Repartitioned) [flox]

10 inch tablets supported by Lineage OS in 2021

Samsung Galaxy Tab S5e (LTE) [gts4lv]
Samsung Galaxy Tab S5e (Wi-Fi) [gts4lvwifi]
Samsung Galaxy Tab S6 Lite (Wi-Fi) [gta4xlwifi]

Comment if you need an update in the future.
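
The rough idea (a sketch, not the exact script used here): the LineageOS wiki is generated from per-device YAML files, so you can filter those. The repository layout and field names below are written down from memory, so double-check them:

# Sketch only: filter tablets from the LineageOS wiki device data.
# Assumes a checkout of https://github.com/LineageOS/lineage_wiki and that the
# device YAML files live in _data/devices/ with "type", "versions", "vendor",
# "name" and "codename" fields -- paths and field names from memory, verify first.
import glob
import yaml

for path in sorted(glob.glob("lineage_wiki/_data/devices/*.yml")):
    with open(path) as f:
        device = yaml.safe_load(f)
    if device.get("type") != "tablet":
        continue
    # "versions" holds the supported LineageOS versions, e.g. [16.0, 17.1]
    if max(float(v) for v in device.get("versions", [0])) < 17:
        continue
    print(f"{device['vendor']} {device['name']} [{device['codename']}]")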

Where was the Hamburg TV tower photographed from?

Inspired by the great @Fernsehturm_HH account, I warmed up a little project from 4 years ago: geotagged photos on Flickr that are tagged with “Hamburg” and “Fernsehturm”. Alternatively as lines from the (alleged) place the photo was taken to the tower.
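
Getting the photo locations is essentially a single Flickr API call. A minimal sketch (not the original script; you need your own API key, and pagination is left out):

import requests

# Minimal sketch: fetch geotagged photos tagged "hamburg" and "fernsehturm"
# from the Flickr API (bring your own API key).
params = {
    "method": "flickr.photos.search",
    "api_key": "YOUR_API_KEY",
    "tags": "hamburg,fernsehturm",
    "tag_mode": "all",      # require both tags
    "has_geo": 1,
    "extras": "geo",        # include latitude/longitude in the response
    "per_page": 250,
    "format": "json",
    "nojsoncallback": 1,
}
r = requests.get("https://api.flickr.com/services/rest/", params=params)
for photo in r.json()["photos"]["photo"]:
    print(photo["latitude"], photo["longitude"])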


Unfiltered and full of junk, but you can see nicely, for example, how inviting the bridges in front of the Kuhmühlenteich are for photos.

Free idea for raking in internet points: do the same with the Eiffel Tower, the London dildo and the like, wrap a concave hull around it, polish it up nicely, market it as “Only where you can see $landmark is it really $city”, €€€€.

Satellite composite of Earth 2020

A follow-up to Average Earth from Space 2018 with a how-to. For each day of 2020 I took one global true color image of the whole planet and merged them together by using the most typical color per pixel. You can see cloud patterns in astonishing detail, global wind, permafrost (careful, white can be ice and/or clouds here) and more. Scroll to the bottom for interactive full resolution viewers.

Basically we will want to overlay one satellite image per day into one image for the whole year. You need two things: The images and the GDAL suite of geospatial processing tools.

Imagery

You can get a daily satellite composite of (almost) the whole earth from NASA, for example from the Suomi NPP / VIIRS instrument. Check it out at WorldView.

You can download those images via Global Imagery Browse Services (GIBS).

As the API I used two years ago is gone, Joshua Stevens was so nice to share code he used previously. It was easy to adapt:

set -e
set -u

# run like: $ bash gibs_viirs.sh 2020-10-05
# you get: VIIRS_SNPP_CorrectedReflectance_TrueColor-2020-10-05.tif
# in ~15 minutes and at ~600 megabytes for 32768x16384 pixels

# based on https://gist.github.com/jscarto/6c0413f4820ed5141744e96e19f31205

# https://gibs.earthdata.nasa.gov/wmts/epsg4326/best/1.0.0/WMTSCapabilities.xml
# VIIRS_SNPP_CorrectedReflectance_TrueColor is not served as PNG by GIBS
# so this is using JPEG tiles as source

# -outsize 65536 32768 took ~50 minutes
# -outsize 32768 16384 took ~15 minutes
# you can run multiple instances at once without issues to reduce total time

# TODO probably should be using a less detailed tileset than 250m to put
# less stress on the server...!

layer=VIIRS_SNPP_CorrectedReflectance_TrueColor
caldate=$1  # 2020-09-09
tilelevel=8  # 8 is the highest for 250m, see Capa -> 163840 81920 would be the full outsize

# ready? let's go!
gdal_translate \
-outsize 32768 16384 \
-projwin -180 90 180 -90 \
-of GTIFF \
-co TILED=YES \
-co COMPRESS=DEFLATE \
-co PREDICTOR=2 \
-co NUM_THREADS=ALL_CPUS \
"<GDAL_WMS>
 <Service name=\"TMS\">
 <ServerUrl>https://gibs.earthdata.nasa.gov/wmts/epsg4326/best/"${layer}"/default/"${caldate}"/250m/\${z}/\${y}/\${x}.jpg</ServerUrl>
</Service>
<DataWindow>
 <UpperLeftX>-180.0</UpperLeftX><!-- makes sense -->
 <UpperLeftY>90</UpperLeftY><!-- makes sense -->
 <LowerRightX>396.0</LowerRightX><!-- wtf -->
 <LowerRightY>-198</LowerRightY><!-- wtf -->
 <TileLevel>"${tilelevel}"</TileLevel>
 <TileCountX>2</TileCountX>
 <TileCountY>1</TileCountY>
 <YOrigin>top</YOrigin>
</DataWindow>
<Projection>EPSG:4326</Projection>
<BlockSizeX>512</BlockSizeX><!-- correct for VIIRS_SNPP_CorrectedReflectance_TrueColor-->
<BlockSizeY>512</BlockSizeY><!-- correct for VIIRS_SNPP_CorrectedReflectance_TrueColor -->
<BandsCount>3</BandsCount>
</GDAL_WMS>" \
${layer}-${caldate}.tif

As this was not a scientific project, please note that I have spent no time checking, e. g.:

  • If one could reduce the (significant) compression artifacts of the imagery received this way (it is only provided as JPEG by this particular API),
  • if the temporal queries are actually getting the correct dates,
  • if there might be more imagery available
  • or even if the geographic referencing is correct.

As it takes a long time to fetch an image this way, I decided to go for a resolution of 32768×16384 pixels instead of 65536×32768 because the latter took about 50 minutes per image. A day of 32768×16384 pixels took me about 15 minutes to download.

Overlaying the images

There are lots of options to overlay images. imagemagick/graphicsmagick might be the obvious choice but they are unfit for imagery of these dimensions (exhausting RAM). VIPS/nip2 is awesome but might require some getting used to and/or manual processing. GDAL is the hot shit and very RAM friendly if you are careful. So I used GDAL for this.

Make sure to store the images somewhere sensible for lots of I/O.

Got all the images you want to process? Cool, build a VRT for them:

gdalbuildvrt \
VIIRS_SNPP_CorrectedReflectance_TrueColor-2020.vrt \
VIIRS_SNPP_CorrectedReflectance_TrueColor-2020-*.tif

This takes some seconds.

We want to overlay the images with some fancy, highly complex mathematical formula (or not ;) ) and since GDAL’s VRT driver supports custom Python functions to manipulate pixel values, we can use numpy for that. Put this in a file called functions.py and remember the path to that file:

import numpy as np

def median(in_ar, out_ar, xoff, yoff, xsize, ysize, raster_xsize,
      raster_ysize, buf_radius, gt,  **kwargs):
    out_ar[:] = np.median(in_ar, axis = 0)

def mean(in_ar, out_ar, xoff, yoff, xsize, ysize, raster_xsize,
      raster_ysize, buf_radius, gt,  **kwargs):
    out_ar[:] = np.mean(in_ar, axis = 0)

def max(in_ar, out_ar, xoff, yoff, xsize, ysize, raster_xsize,
      raster_ysize, buf_radius, gt,  **kwargs):
    out_ar[:] = np.amax(in_ar, axis = 0)

def min(in_ar, out_ar, xoff, yoff, xsize, ysize, raster_xsize,
      raster_ysize, buf_radius, gt,  **kwargs):
    out_ar[:] = np.amin(in_ar, axis = 0)

You can now use a median, mean, min or max function for aggregating the images per pixel. For that you have to modify the VRT to include the function you want it to use. I used sed for that:

sed -e 's#<VRTRasterBand#<VRTRasterBand subClass="VRTDerivedRasterBand"#' \
-e 's#</ColorInterp>#</ColorInterp>\n<PixelFunctionLanguage>Python</PixelFunctionLanguage>\n<PixelFunctionType>functions.median</PixelFunctionType>#' \
VIIRS_SNPP_CorrectedReflectance_TrueColor-2020.vrt \
> VIIRS_SNPP_CorrectedReflectance_TrueColor-2020_median.vrt

That’s it, we are ready to use GDAL to build an image that combines all the daily images into one median image. For this to work you have to set the PYTHONPATH environment variable to include the directory of the functions.py file. If it is in the same directory where you launch gdal, you can use $PWD, otherwise enter the full path to the directory. Adjust the rest of the options as you like, e. g. to choose a different output format. If you use COG, enabling ALL_CPUS is highly recommended or building overviews will take forever.

PYTHONPATH=$PWD gdal_translate \
--config CPL_DEBUG VRT --config GDAL_CACHEMAX 25% \
--config GDAL_VRT_ENABLE_PYTHON YES \
-of COG -co NUM_THREADS=ALL_CPUS \
-co COMPRESS=DEFLATE -co PREDICTOR=2 \
VIIRS_SNPP_CorrectedReflectance_TrueColor-2020_median.vrt \
VIIRS_SNPP_CorrectedReflectance_TrueColor-2020_median.vrt.tif

This will take many hours. 35 hours for me on a Ryzen 3600 with lots of RAM and the images on a cheap SSD. The resulting file is about the same size as the single images (makes sense, doesn’t it) at ~800 megabytes.

Alternative, faster approach

A small note while we are at it: GDAL calculates overviews from the source data. And since we are using a custom VRT function here, on a lot of raster images, that takes a long time. To save a lot of that time, you can build the file without overviews first and then calculate them in a second step. With this approach they will be calculated from the final raster instead of the initial input, which, whenever there is non-trivial processing involved, is way quicker:

PYTHONPATH=$PWD gdal_translate \
--config CPL_DEBUG VRT --config GDAL_CACHEMAX 25% \
--config GDAL_VRT_ENABLE_PYTHON YES \
-of GTiff -co NUM_THREADS=ALL_CPUS \
VIIRS_SNPP_CorrectedReflectance_TrueColor-2020_median.vrt \
VIIRS_SNPP_CorrectedReflectance_TrueColor-2020_median.vrt.noovr.tif

Followed by the conversion to a COG (which will automatically build the overviews, as mandatory for that awesome format):

gdal_translate \
--config GDAL_CACHEMAX 25% \
-of COG -co NUM_THREADS=ALL_CPUS \
-co COMPRESS=DEFLATE -co PREDICTOR=2 \
/tmp/VIIRS_SNPP_CorrectedReflectance_TrueColor-2020_median.vrt.noovr.tif \
VIIRS_SNPP_CorrectedReflectance_TrueColor-2020_median.vrt.cog.tif

This took “just” 10 hours for the initial raster and then an additional 11 seconds for the conversion to COG and the building of overviews. And the resulting file is bit-by-bit identical to the one from the direct-to-COG approach. So one third of the processing time for the same result. Nice!

Result

Check it out in full, zoomable resolution:

Closing remarks

Please do not consider this a true representation of the typical weather or cloud cover throughout the year. The satellite takes the day imagery at local noon if I recall correctly, so the rest of the day is not part of this “analysis”. I did zero plausibility or consistency checks. The data was probably reprojected multiple times throughout the full (sensor → composite) pipeline. The composite is based on color alone; anything bright will lead to a white-ish color, be it snow, ice, clouds, algae, sand, …

It’s just some neat imagery to love our planet.

Update 2020-01-05

Added compression to GeoTIFF creation where useful, not sure how I missed that here. Reduces filesizes to 1/2 or 1/3 even.

Your own little internet speed monitor

I wanted to monitor my ISP’s service over time and could not find any simple tool for that. The usual system monitoring tools display averages, not min/max values. So I used WD40 (speedtest-cli) and duct tape (cron) to make my own.

You need to have a cron daemon set up and speedtest-cli installed.

Then prepare an empty csv file with a header like this (don’t forget a trailing newline!) and store it in a path of your choice:

Server ID,Sponsor,Server Name,Timestamp,Distance,Ping,Download,Upload,Share,IP Address

Set up a cronjob at an interval of your choice (don’t be a dick) to run a speed test and log the results to the csv file:

@hourly speedtest --csv >> /home/user/path/to/speedtest.csv

If you have a fast connection you might spot slow test servers that would badly bias your results, so exclude them using the --exclude option if necessary.

That’s all, you get a nice log of internet ping, upload and download speeds, ready to be visualized in your software of choice (like the best spreadsheet software in existence). I will definitely have to complain to my ISP about that drop since mid-December:
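
If you prefer code over spreadsheets, a quick pandas sketch for daily min/max values could look like this (column names as in the CSV header above; speedtest-cli logs the speeds in bits per second, as far as I know):

import pandas as pd
import matplotlib.pyplot as plt

# Sketch: plot daily min/max download speeds from the speedtest CSV log.
# Column names follow the header above; Download/Upload should be in bit/s.
df = pd.read_csv("speedtest.csv", parse_dates=["Timestamp"])
df = df.set_index("Timestamp")
df["Download (Mbit/s)"] = df["Download"] / 1e6

daily = df["Download (Mbit/s)"].resample("D").agg(["min", "max"])
daily.plot()
plt.ylabel("Mbit/s")
plt.show()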

And now that I have written this, I realise that for plotting I could also just use a min/max function for a moving time window in Grafana I guess? The speedtests would still be triggered and provide nice bursts of usage. Anyone got pointers on how to do that?

Finding the most popular reaction in Slack

This can be run against a Slack export. It will count the reactions used and display them in an ordered list. Written for readability, not speed or efficiency. No guarantees that this isn’t terribly broken. Enjoy and use responsibly!

import json
import glob
import collections

# collect messages
messages = []
for filename in glob.glob('*/*.json'):
    with open(filename) as f:
        messages += json.load(f)

# extract reactions
reactions = []
for message in messages:
    if "reactions" in message:
        reactions += message["reactions"]

# count reactions
reaction_counter = collections.Counter()
for reaction in reactions:
    reaction_counter.update({reaction["name"]: reaction["count"]})

# done, print them
print(reaction_counter.most_common())