Category Archives: guide

Satellite composite of Earth 2020

A follow-up to Average Earth from Space 2018 with a how-to. For each day of 2020 I took one global true color image of the whole planet and merged them together by using the most typical color per pixel. You can see cloud patterns in astonishing detail, global wind, permafrost (careful, white can be ice and/or clouds here) and more. Scroll to the bottom for interactive full resolution viewers.

Basically we will want to overlay one satellite image per day into one image for the whole year. You need two things: The images and the GDAL suite of geospatial processing tools.

Imagery

You can get a daily satellite composite of (almost) the whole earth from NASA. For example of the Soumi NPP / VIIRS instrument. Check it out at WorldView.

You can download those images via Global Imagery Browse Services (GIBS).

As the API I used two years ago is gone, Joshua Stevens was so nice to share code he used previously. It was easy to adapt:

set -e
set -u

# run like: $ bash gibs_viirs.sh 2020-10-05
# you get: VIIRS_SNPP_CorrectedReflectance_TrueColor-2020-10-05.tif
# in ~15 minutes and at ~600 megabytes for 32768x16384 pixels

# based on https://gist.github.com/jscarto/6c0413f4820ed5141744e96e19f31205

# https://gibs.earthdata.nasa.gov/wmts/epsg4326/best/1.0.0/WMTSCapabilities.xml
# VIIRS_SNPP_CorrectedReflectance_TrueColor is not served as PNG by GIBS
# so this is using JPEG tiles as source

# -outsize 65536 32768 took ~50 minutes
# -outsize 32768 16384 took ~15 minutes
# you can run multiple instances at once without issues to reduce total time

# TODO probably should be using a less detailed tileset than 250m to put
# less stress on the server...!

layer=VIIRS_SNPP_CorrectedReflectance_TrueColor
caldate=$1  # 2020-09-09
tilelevel=8  # 8 is the highest for 250m, see Capa -> 163840 81920 would be the full outsize

# ready? let's go!
gdal_translate \
-outsize 32768 16384 \
-projwin -180 90 180 -90 \
-of GTIFF \
-co TILED=YES \
-co COMPRESS=DEFLATE \
-co PREDICTOR=2 \
-co NUM_THREADS=ALL_CPUS \
"<GDAL_WMS>
 <Service name=\"TMS\">
 <ServerUrl>https://gibs.earthdata.nasa.gov/wmts/epsg4326/best/"${layer}"/default/"${caldate}"/250m/\${z}/\${y}/\${x}.jpg</ServerUrl>
</Service>
<DataWindow>
 <UpperLeftX>-180.0</UpperLeftX><!-- makes sense -->
 <UpperLeftY>90</UpperLeftY><!-- makes sense -->
 <LowerRightX>396.0</LowerRightX><!-- wtf -->
 <LowerRightY>-198</LowerRightY><!-- wtf -->
 <TileLevel>"${tilelevel}"</TileLevel>
 <TileCountX>2</TileCountX>
 <TileCountY>1</TileCountY>
 <YOrigin>top</YOrigin>
</DataWindow>
<Projection>EPSG:4326</Projection>
<BlockSizeX>512</BlockSizeX><!-- correct for VIIRS_SNPP_CorrectedReflectance_TrueColor-->
<BlockSizeY>512</BlockSizeY><!-- correct for VIIRS_SNPP_CorrectedReflectance_TrueColor -->
<BandsCount>3</BandsCount>
</GDAL_WMS>" \
${layer}-${caldate}.tif

As this was no scientific project, please note that I have spent no time checking e. g.:

  • If one could reduce the (significantly) compression artifacts of imagery received through this (the imagery is only provided as JPEG using this particular API),
  • if the temporal queries are actually getting the correct dates,
  • if there might be more imagery available
  • or even if the geographic referencing is correct.

As it takes a long time to fetch an image this way, I decided to go for a resolution of 32768×16384 pixels instead of 65536×32768 because the latter took about 50 minutes per image. A day of 32768×16384 pixels took me about 15 minutes to download.

Overlaying the images

There are lots of options to overlay images. imagemagick/graphicsmagick might be the obvious choice but they are unfit for imagery of these dimensions (exhausting RAM). VIPS/nips2 is awesome but might require some getting used to and/or manual processing. GDAL is the hot shit and very RAM friendly if you are careful. So I used GDAL for this.

Make sure to store the images somewhere sensible for lots of I/O.

Got all the images you want to process? Cool, build a VRT for them:

gdalbuildvrt \
VIIRS_SNPP_CorrectedReflectance_TrueColor-2020.vrt \
VIIRS_SNPP_CorrectedReflectance_TrueColor-2020-*.tif

This takes some seconds.

We want to overlay the images with some fancy, highly complex mathematical formula (or not ;) ) and since GDAL’s VRT driver supports custom Python functions to manipulate pixel values, we can use numpy for that. Put this in a file called functions.py and remember the path to that file:

import numpy as np

def median(in_ar, out_ar, xoff, yoff, xsize, ysize, raster_xsize,
      raster_ysize, buf_radius, gt,  **kwargs):
    out_ar[:] = np.median(in_ar, axis = 0)

def mean(in_ar, out_ar, xoff, yoff, xsize, ysize, raster_xsize,
      raster_ysize, buf_radius, gt,  **kwargs):
    out_ar[:] = np.mean(in_ar, axis = 0)

def max(in_ar, out_ar, xoff, yoff, xsize, ysize, raster_xsize,
      raster_ysize, buf_radius, gt,  **kwargs):
    out_ar[:] = np.amax(in_ar, axis = 0)

def min(in_ar, out_ar, xoff, yoff, xsize, ysize, raster_xsize,
      raster_ysize, buf_radius, gt,  **kwargs):
    out_ar[:] = np.amin(in_ar, axis = 0)

You can now use a median, mean, min or max function for aggregating the images per pixel. For that you have to modify the VRT to include the function you want it to use. I used sed for that:

sed -e 's#<VRTRasterBand#<VRTRasterBand subClass="VRTDerivedRasterBand"#' \
-e 's#</ColorInterp>#</ColorInterp>\n<PixelFunctionLanguage>Python</PixelFunctionLanguage>\n<PixelFunctionType>functions.median</PixelFunctionType>#' \
VIIRS_SNPP_CorrectedReflectance_TrueColor-2020.vrt \
> VIIRS_SNPP_CorrectedReflectance_TrueColor-2020_median.vrt

That’s it, we are ready to use GDAL to build an image that combines all the daily images into one median image. For this to work you have to set the PYTHONPATH environment variable to include the directory of the functions.py file. If it is in the same directory where you launch gdal, you can use $PWD, otherwise enter the full path to the directory. Adjust the rest of the options as you like, e. g. to choose a different output format. If you use COG, enabling ALL_CPUS is highly recommended or building overviews will take forever.

PYTHONPATH=$PWD gdal_translate \
--config CPL_DEBUG VRT --config GDAL_CACHEMAX 25% \
--config GDAL_VRT_ENABLE_PYTHON YES \
-of COG -co NUM_THREADS=ALL_CPUS \
-co COMPRESS=DEFLATE -co PREDICTOR=2 \
VIIRS_SNPP_CorrectedReflectance_TrueColor-2020_median.vrt \
VIIRS_SNPP_CorrectedReflectance_TrueColor-2020_median.vrt.tif

This will take many hours. 35 hours for me on a Ryzen 3600 with lots of RAM and the images on a cheap SSD. The resulting file is about the same size as the single images (makes sense, doesn’t it) at ~800 megabytes.

Alternative, faster approach

A small note while we are at it: GDAL calculates overviews from the source data. And since we are using a custom VRT function here, on a lot of raster images, that takes a long time. To save a lot of that time, you can build the file without overviews first, then calculate them in a second step. With this approach they will be calculated from the final raster instead of the initial input which, when ever there is non-trivial processing involved, is way quicker:

PYTHONPATH=$PWD gdal_translate \
--config CPL_DEBUG VRT --config GDAL_CACHEMAX 25% \
--config GDAL_VRT_ENABLE_PYTHON YES \
-of GTiff -co NUM_THREADS=ALL_CPUS \
VIIRS_SNPP_CorrectedReflectance_TrueColor-2020_median.vrt \
VIIRS_SNPP_CorrectedReflectance_TrueColor-2020_median.vrt.noovr.tif

Followed by the conversion to a COG (which automatically will build the overviews as mandatory for that awesome format):

gdal_translate \
--config GDAL_CACHEMAX 25% \
-of COG -co NUM_THREADS=ALL_CPUS \
-co COMPRESS=DEFLATE -co PREDICTOR=2 \
/tmp/VIIRS_SNPP_CorrectedReflectance_TrueColor-2020_median.vrt.noovr.tif \
VIIRS_SNPP_CorrectedReflectance_TrueColor-2020_median.vrt.cog.tif

This took “just” 10 hours for the initial raster and then an additional 11 seconds for the conversion COG and the building of overviews. And the resulting file is bit-by-bit identical to the one from the direct-to-COG approach. So one third of processing time for the same result. Nice!

Result

Check it out in full, zoomable resolution:

Closing remarks

Please do not consider a true representation of the typical weather or cloud cover throughout the year. The satellite takes the day imagery at local noon if I recall correctly so the rest of the day is not part of this “analysis”. I did zero plausibility or consistency checks. The data was probably reprojected multiple times through out the full (sensor->composite) pipeline. The composite is based on color alone, anything bright will lead to a white-ish color, be it snow, ice, clouds, algae, sand, …

It’s just some neat imagery to love our planet.

Update 2020-01-05

Added compression to GeoTIFF creation where useful, not sure how I missed that here. Reduces filesizes to 1/2 or 1/3 even.

Your own little internet speed monitor

I wanted to monitor my ISP’s service over time and could not find any available simple tool for that. The usual system monitoring tools are usually displaying averages, not min/max values. So I used WD40 (speedtest-cli) and duct tape (cron) to make my own.

You need to have a cron daemon set up and speedtest-cli installed.

Then prepare an empty csv file with a header like this (don’t forget a trailing newline!) and store it in a path of your choice:

Server ID,Sponsor,Server Name,Timestamp,Distance,Ping,Download,Upload,Share,IP Address

Set up a cronjob at an interval of your choice (don’t be a dick) to run a speed test and log the results to the csv file:

@hourly speedtest --csv >> /home/user/path/to/speedtest.csv

If you have a fast connection you might spot slow test servers that would badly bias your results, so exclude them using the --exclude option if necessary.

That’s all, you get a nice log of internet ping, upload and download speeds, ready to be visualized in your software of choice (like the best spreadsheet software in existence). I will have to complain to my ISP for that drop since mid December for sure:

And now that I have written this, I realise that for plotting I could also just use a min/max function for a moving time window in Grafana I guess? The speedtests would still be triggered and provide nice bursts of usage. Anyone got pointers on how to do that?

Das eigene kleine Deutschlandradio Archiv

Mediatheken des Öffentlich-rechtlichen Rundfunks müssen wegen asozialen Arschlöchern ihre Inhalte depublizieren. Wegen anderer Arschlöcher sind die Inhalte nicht konsequent unter freien Lizenzen, aber das ist ein anderes Thema.

Ich hatte mir irgendwann mal angesehen, was es eigentlich für ein Aufwand wäre, die Inhalte verschiedener Mediatheken in ein privates Archiv zu spiegeln. Mit dem Deutschlandradio hatte ich angefangen und mit den üblichen Tools täglich die neuen Audiobeiträge in ein Google Drive geschoben. Dieses Setup läuft jetzt seit mehr als 2 Jahren ohne Probleme und vielleicht hat ja auch wer anders Spaß dran:

Also:

  • rclone einrichten oder mit eigener Infrastruktur arbeiten (dann die rclone-Zeile mit z.B. rsync ersetzen)
  • <20 GB Platz haben
  • Untenstehendes Skript als täglichen Cronjob einrichten (und sich den Output zu mailen lassen)
#!/bin/bash

# exit if anything fails
# not a good idea as downloads might 404 :D
set -e

cd /home/dradio/deutschlandradio

# get all available files
wget -nv -nc -x "http://srv.deutschlandradio.de/aodlistaudio.1706.de.rpc?drau:page="{0..100}"&drau:limit=1000"
grep -hEo 'http.*mp3' srv.deutschlandradio.de/* | sort | uniq > urls

# check which ones are new according to the list of done files
comm -13 urls_done urls > todo

numberofnewfiles=$(wc -l todo | awk '{print $1}')
echo "${numberofnewfiles} new files"

if (( numberofnewfiles < 1 )); then
        echo "exiting"
        exit
fi

# get the new ones
echo "getting new ones"
wget -i todo -nv -x -nc || echo "true so that set -e does not exit here :)"
echo "new ones downloaded"

# copy them to remote storage
rclone copy /home/dradio_scraper/deutschlandradio remote:deutschlandradio && echo "rclone done"

## clean up
# remove files
echo "cleaning up"
rm -r srv.deutschlandradio.de/
rm -rv ondemand-mp3.dradio.de/
rm urls

# update list of done files
cat urls_done todo | sort | uniq > /tmp/urls_done
mv /tmp/urls_done urls_done

# save todo of today
mv todo urls_$(date +%Y%m%d)

echo "done"

Pro Tag sind es so 2-3 Gigabyte neuer Beiträge.

In zwei Jahren sind rund 2,5 Terabyte zusammengekommen und ~300.000 Dateien, aber da sind eventuell auch die Seiten des Feeds mitgezählt worden und Beiträge, die schon älter waren.

Wer mehr will nimmt am besten direkt die Mediathekview-Datenbank als Grundlage.

Nächster Schritt wäre das eigentlich auch täglich nach archive.org zu schieben.

Highlight current timeslice in a QGIS Atlas layout

Did this for an ex-colleague some months ago and forgot to share the how-to publically. We needed a visual representation of the current time in a layout that showed both a raster map (different layer per timeslice) and a timeseries plot of an aspect of the data (this was created outside QGIS).

Have lots of raster layers you want to iterate through. I have:

./ECMWF_ERA_40_subset/2019-01-01.tif
./ECMWF_ERA_40_subset/2019-01-02.tif
./ECMWF_ERA_40_subset/2019-01-03.tif
./ECMWF_ERA_40_subset/2019-01-04.tif
./ECMWF_ERA_40_subset/2019-01-05.tif
./ECMWF_ERA_40_subset/2019-01-06.tif
...

Create a new layer for your map extent. Draw your extent as geometry. Duplicate that geometry as many times as you have days. Alternatively you could of course have different geometries per day. Whatever you do, you need a layer with one feature per timeslice for the Atlas to iterate though. I have 30 days to visualise so I duplicated my extent 30 times.

Open the Field Calculator. Add a new field called date as string type (not as date type until some bug is fixed (sorry, did not make a note here, maybe sorting is/was broken?)) with an expression that represents time and orders chronologically if sorted by QGIS. For example: '2019-01-' || lpad(@row_number,2,0) (assuming your records are in the correct order if you have different geometries…)

Have your raster layers named the same way as the date attribute values.

Make a new layout.

For your Layout map check “Lock layers” and use date as expression for the “Lock layers” override. This will now select the appropriate raster layer, based on the attribute value <-> layer name, to display for each Atlas page.

Cool, if you preview the Atlas now you got a nice animation through your raster layers. Let’s do part 2:

In your layout add your timeseries graph. Give it a unique ID, e. g. “plot box”. Set its width and height via new variables (until you can get those via an expression this is needed for calculations below).

Create a box to visualise the timeslice. Set its width to map_get(item_variables('plot box'), 'plot_width') / @atlas_totalfeatures. For the height and y use/adjust this expression: map_get(item_variables('plot box'), 'plot_height'). For x comes the magic:

with_variable(
'days_total',
day(to_date(maximum("date"))-to_date(minimum("date")))+1,
-- number of days in timespan
-- +1 because we need the number of days in total
-- not the inbetween, day() to just get the number of days
with_variable(
'mm_per_day',
map_get(item_variables('plot box'), 'plot_width') / @days_total,
with_variable(
'days',
day(to_date(attribute(@atlas_feature, 'date'))-to_date(minimum("date"))),
-- number of days the current feature is from the first day
-- to_date because BUG attribute() returns datetime for date field
@mm_per_day * @days + map_get(item_variables('plot box'), 'plot_x')
)
)
)

This will move the box along the x axis accordingly.

Have fun!

Fake chromatic aberration and dynamic label shadows in QGIS

Welcome to Part 43 of “Fun with Weird Hacks in QGIS”!

Fake Chromatic Aberration

Get some lines or polygons! I used German administrative polygons and a “Outline: Simple line” style.

Add a “Geometry Generator” to the same layer and set it to “LineString / MultiLineString”. If you used polygons, you will need to use “boundary($geometry)” to get the border lines of your polygons.

Translate (shift) the geometries radially away from the map center, with an amount of translation based on their distance to the map center. This will introduce ugly artifacts cool glitches for big geometries. Note the magic constant of 150 I divided the distances by, I cannot be arsed to turn this into percentages so you will have to figure out what works for you. Also, if your map is in a different CRS than the layer you do this with, you will need to transform the coordinates (I do that for the labels below).

 translate(
   boundary($geometry),
   (
    x(centroid($geometry))-x(@map_extent_center)
   )/150
   ,
   (
    y(centroid($geometry))-y(@map_extent_center)
   )/150
)

Color those lines in some fancy 80s color like #00FFFF.

We only want the color to appear in the edges of the map, so set the “Feature Blending” mode of the layer to “Lighten”. This will make sure the white lines do not get darker/colored.

I forgot to take an image for that and then it was lunch time. :x

Now do the same but for distances the other way around and color that in something else (like foofy #FF00FF).

 translate(
   boundary($geometry),
   (
    x(@map_extent_center)-x(centroid($geometry))
   )/150
   ,
   (
    y(@map_extent_center)-y(centroid($geometry))
   )/150
)

Oh, can you see it already? Move the map around! Ohhh!

For the final touch, use the layer’s “Draw Effects” to replace the “Source” with a “Blur”. Be aware that the “Blur type” can quite strongly influence the look and find a setting for the “Blur strength” that works for you. I used “Guassian blur (quality)” with a strength of 2.

Dynamic Label Shadows

Get some data to label! I used ne_10m_populated_places_simple. Label it with labels placed “Offset from Point” without any actual offset. This is just to make sure calculations on the geometry’s location make sense to affect the labeling later.

Pick a wicked cool font like Lazer 84!

Add a “Buffer” to the labels and pick an appropriate color (BIG BOLD #FF00FF works well again).

Time for magic! Add a “Shadow” to the labels and use an appropriate color (I used the other color from earlier again, #00FFFF).

We want to make the label shadows be further away from the label if the feature is further away from the map center. So OVERRIDE the offset with code that does exactly that. Note another magic constant (relating to meters in EPSG:25832) and that I needed to transform coordinates here (my map is in EPSG:25832 while this layer is EPSG:4326).

distance(  
	transform(
		$geometry, 'EPSG:4326', 'EPSG:25832'
	),
	@map_extent_center
)/10000

Cool. But that just looks weird. Time for another magical ingredient while OVERRIDING the angle at which the label is placed. You guessed it, radially away from the map center!

degrees(
	azimuth(  
		transform(
			$geometry, 'EPSG:4326', 'EPSG:25832'
		),  
		@map_extent_center)
	)
-180

By the way, it does look best if your text encoding is introducing bad character marks and missing umlauts!

That’s it, move the map around and be WOWed by the sweet effects and the time it takes to render :o)

And now it is your turn to apply this to some MURICA geodata and rake in meaningless internet points that make you feel good! Sad but true ;)

And also TODO: Make the constants percentages instead, that should make sure it works on any projection and any scale.

Making a Star Wars hologram in QGIS

I like doing silly things in QGIS.

So I wanted to make a Star Wars hologram (you know, that “You’re my only hope” Leia one) showing real geodata. What better excuse for abusing QGIS’ Inverted Polygons and Raster Fills. So, here is what I did:

  1. Find some star wars hologram leia image.
  2. Crudly remove the princess (GIMP’s Clone and Healing tools work nicely for this).
  3. In QGIS create an empty Polygon-geometry Scratchpad layer and set the renderer to Inverted Polygons to fill the whole canvas.
  4. Set the Fill to a Raster image Fill and load your image.
  5. Load some geodata, style it accordingly and rejoice.
    1. Get some geodata. I used Natural Earth’s countries, populated places and tiny countries (to have some stuff in the oceans), all in 110m.
    2. Select a nice projection, I used “+proj=ortho +lat_0=39 +lon_0=139 +x_0=0 +y_0=0 +a=6371000 +b=6371000 +units=m +no_defs”.
    3. I used a three layer style for the countries:
      • A Simple Line outline with color #4490f3, stroke width 0.3mm and the Dash Dot Line stroke style.
      • A Line Pattern Fill with a spacing of 0.8mm, color #46a8f3 and a stroke width of 0.4mm.
      • And on top of those, for some noisiness a black Line Pattern Fill rotated 45° with a spacing of 1mm and stroke width 0.1mm.
      • Then the Feature Blending Mode Dodge to the Layer Rendering and aha!
      • More special effects come from Draw Effects, I disabled Source and instead used a Blur (Gaussian, strength 2) to lose the crispness and also an Outer Glow (color #5da6ff, spread 3mm, blur radius 3) to, well, make it glow.





    4. I used a three layer style for the populated places:
      • I used a Simple Marker using the “cross” symbol and a size of 1.8mm. The Dash Line stroke style gives a nice depth effect when in Draw Effects the Source is replaced with a Drop Shadow (1 Pixel, Blur radius 2, color #d6daff).
      • A Blending Mode of Addition for the layer itself makes it blend nicely into the globe.
    5. I used a three layer style for the tiny countries:
      • I used a white Simple Marker with a size of 1.2mm and a stroke color #79c7ff, stroke width 0.4mm.
      • Feature Blending Mode Lighten makes sure that touching symbols blob nicely into each other.




  6. You can now export the image at your screen resolution (I guess) using Project -> Import/Export or just make a screenshot.
  7. Or add some more magic with random offsets and stroke widths in combination with refreshing the layers automatically at different intervals:

A bad map about gender differences in literacy

Wrote this in 2014, not sure why I did not publish it. It was a response to this bad map.

world bank 2011 female vs male literacy

No need for expensive software, you can use the free and open-source QGIS for this: https://qgis.org/

1. Install QGIS

2. Download and unzip the data http://databank.worldbank.org/data/download/WDI_excel.zip (not sure what license, they want attribution “World Development Indicators, The World Bank”)

3. Download and unzip country geometries http://www.naturalearthdata.com/http//www.naturalearthdata.com/download/50m/cultural/ne_50m_admin_0_countries.zip (public domain but be nice and add attribution “Geometries from Natural Earth”)

4. Open QGIS, Layer -> Add Vector Layer -> choose ne_50m_admin_0_countries.shp

5. Unfortunately the csv is not simple, it has more than one row per country as it includes time series. And it does not have the value we want to map precalculated.

Afghanistan,AFG,"Literacy rate, adult female (% of females ages 15 and above)",SE.ADT.LITR.FE.ZS,,,,,,,,,,,,,,,,,,,,4.98746100000000E+00,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
Afghanistan,AFG,"Literacy rate, adult male (% of males ages 15 and above)",SE.ADT.LITR.MA.ZS,,,,,,,,,,,,,,,,,,,,3.03077500000000E+01,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
Afghanistan,AFG,"Literacy rate, adult total (% of people ages 15 and above)",SE.ADT.LITR.ZS,,,,,,,,,,,,,,,,,,,,1.81576800000000E+01,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
Afghanistan,AFG,"Literacy rate, youth female (% of females ages 15-24)",SE.ADT.1524.LT.FE.ZS,,,,,,,,,,,,,,,,,,,,1.11428000000000E+01,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
Afghanistan,AFG,"Literacy rate, youth male (% of males ages 15-24)",SE.ADT.1524.LT.MA.ZS,,,,,,,,,,,,,,,,,,,,4.57960200000000E+01,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
Afghanistan,AFG,"Literacy rate, youth total (% of people ages 15-24)",SE.ADT.1524.LT.ZS,,,,,,,,,,,,,,,,,,,,3.00663500000000E+01,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,

This step is probably the hardest. I will use some Unix tools as I am used to them and they work well. Sorry! You can probably do this with a good texteditor or spreadsheet application as well.

We have “csvcut -c 2 WDI_Data.csv | uniq | wc -l” -> 253 -> 252 country codes (without the header). We have 6 lines per country for the literacy data. We should have at max 252 unique values per field then. TheZeitgeist only used data from 2011.

To make it more convenient to work, I first split off the Literacy data into a new file with

head -n 1 WDI_Data.csv > Literacy.csv; grep Literacy WDI_Data.csv >> Literacy.csv

No idea if TheZeitgeist mixed Adult and Youth, let’s just use the Adult data for now.

head -n 1 WDI_Data.csv > Literacy_adult.csv; grep -E "Literacy rate, adult (fe)*male" Literacy.csv >> Literacy_adult.csv

Next let’s isolate the data for 2011. csvcut seems to have a bug with numerically named columns so we have to use the field’s index (56) instead of its name “2011”.

csvcut -c 2,3,56 Literacy_adult.csv > Literacy_adult_2011.csv

We need to get the data into one line per country, I am lazy so:

grep "Literacy rate, adult female" Literacy_adult_2011.csv > Literacy_adult_2011_female.csv
grep "Literacy rate, adult male" Literacy_adult_2011.csv | sed 's/.*15 and above)",/,/' > Literacy_adult_2011_male.csv
echo "Country Name,Country Code, Literacy Female, Literacy Male" > Literacy_adult_2011_oneline.csv; paste -d "" Literacy_adult_2011_female.csv Literacy_adult_2011_male.csv >> Literacy_adult_2011_oneline.csv

Enough of that commandline mumbojumbo! QGIS time!

Natural Earth has a column named “wb_a3” which is the WB 3 letter country codes, yay!

toreal("Literacy_adult_2011_oneline_Literacy Female") - toreal("Literacy_adult_2011_oneline_Literacy Male")

Figure out the rest yourself. This is where I apparently lost interest in writing back then. ;)
—–

Now make the map better by choosing a projection that does not make Greenland as big as Africa. Also, I would try adding another “attribute” to the display, change the alpha value depending on the absolute literacy.

And finally realise that a map is not a good visualisation because you cannot see the values of tiny countries. Make a bar chart instead. ;)

world bank 2011 female vs male literacy plus bar

ffmpeg on raspbian / Raspberry Pi

Since http://www.jeffreythompson.org/blog/2014/11/13/installing-ffmpeg-for-raspberry-pi/ is a bit messy, here is how you can compile ffmpeg with x264 on raspbian. Changes are building in your home directory, getting just a shallow git clone and building with all CPU cores. Also no unnecessary sudo…

Read the comments below!

# In a directory of your choosing (I used ~/ffmpeg):

# build and install x264
git clone --depth 1 git://git.videolan.org/x264
cd x264
./configure --host=arm-unknown-linux-gnueabi --enable-static --disable-opencl
make -j 4
sudo make install
 
# build and make ffmpeg
git clone --depth=1 git://source.ffmpeg.org/ffmpeg.git
cd ffmpeg
./configure --arch=armel --target-os=linux --enable-gpl --enable-libx264 --enable-nonfree
make -j4
sudo make install

Read the comments below!

Hopefully someone, somewhere will provide a repository for this kind of stuff some day.

It takes just 25 minutes on a Raspberry Pi 3. Not hours or days like some old internet sources on old Raspis say.

In case you are wondering v4l2 should work with this.

Properly splitting a file at specific intervals with ffmpeg

Ever wanted to split a media file (video, audio, both) into segments of 10 minutes or something like that? The internet is full of terrible hacks and shitty Stack Overflow answers for this. So here is how you easily, properly split a file into same-length segments with ffmpeg.

-f segment -segment_time SECONDS fileprefix%04d.ext

Done. segment_time takes seconds as argument. For example:

ffmpeg -i recording.opus -c:a libvorbis -f segment -segment_time 3600 recording_%04d.ogg

or

ffmpeg -i huge.wav -c:a copy -f segment -segment_time 600 huge-%04d.flac

Not so hard, is it?

WFS zu Shapefiles

Hier mal als Beispiel mit wget, grep, sed und ogr2ogr (von GDAL/OGR) in der Bash. Diese Anleitung geht davon aus, dass die Titel der Layer keine Sonderzeichen enthalten, also einfach in der Shell und als Dateinamen verwendet werden können. Generell nicht elegant, aber funktioniert meistens.

WFS finden, zum Beispiel über http://hmdk.de/freitextsuche?action=doSearch&q=wfs. Hier nehme ich einfach mal http://gateway.hamburg.de/OGCFassade/DE_HH_WFS_INSPIRE_A1_4_Verwaltungseinheiten.aspx?REQUEST=GetCapabilities&SERVICE=WFS&VERSION=1.1.0 als Beispiel.

Capabilities abfragen:

wget -O DE_HH_WFS_INSPIRE_A1_4_Verwaltungseinheiten.GetCapabilities.xml http://gateway.hamburg.de/OGCFassade/DE_HH_WFS_INSPIRE_A1_4_Verwaltungseinheiten.aspx?REQUEST=GetCapabilities&SERVICE=WFS&VERSION=1.1.0

Verfügbare Layertitel extrahieren:

grep “<wfs:Title>” DE_HH_WFS_INSPIRE_A1_4_Verwaltungseinheiten.GetCapabilities.xml | grep -Eo ‘>.*<‘ | sed ‘s/[<>]//g’ > DE_HH_WFS_INSPIRE_A1_4_Verwaltungseinheiten.wfsTitle

Das sind dann zum Beispiel:

Hafengebietsgrenzen
Stadtteile
Bezirk
Ortsteile

Layer runterladen:

while read title; do wget -O DE_HH_WFS_INSPIRE_A1_4_Verwaltungseinheiten-${title}.gml "http://gateway.hamburg.de/OGCFassade/DE_HH_WFS_INSPIRE_A1_4_Verwaltungseinheiten.aspx?REQUEST=GetFeature&SERVICE=WFS&VERSION=1.1.0&typeName="${title}; done < DE_HH_WFS_INSPIRE_A1_4_Verwaltungseinheiten.wfsTitle

GML in Shapefile umwandeln:

while read title; do ogr2ogr -a_srs "EPSG:25832" -f "ESRI Shapefile" -fieldTypeToString IntegerList,StringList DE_HH_WFS_INSPIRE_A1_4_Verwaltungseinheiten-${title}.shp DE_HH_WFS_INSPIRE_A1_4_Verwaltungseinheiten-${title}.gml; done < DE_HH_WFS_INSPIRE_A1_4_Verwaltungseinheiten.wfsTitle