Category Archives: map

Replicating a media-hyped color by numbers Etsy map in 10 minutes

https://old.reddit.com/r/MapPorn/comments/ajaz42/the_arteries_of_the_world_map_shows_every_river/eeu3uj9/

Thats beautiful… how long it took?

Well, that looks like QGIS’ random colors applied to http://hydrosheds.org/

So I fired up QGIS, extracted the region from eu_riv_15s.zip, realised those rivers came without a corresponding basin, extracted the region from eu_bas_15s_beta.zip, set the map background to black, set the rivers to render in white, set the rivers’ line width to correspond to their UP_CELLS attribute, put the basins on top, colored them randomly by BASIN_ID, set the layer rendering mode to Darken and that was it.

I should open an Etsy store.

Yes, I realise that replicating things is easier than creating them. But seriously, this is just a map of features colored by category and all the credit should go to http://hydrosheds.org/

Update

But Hannes, that original has some gradients!

Ok, then set the rivers not to white but a grey and the basin layer rendering mode to Overlay instead of Darken.

This product incorporates data from the HydroSHEDS database which is © World Wildlife Fund, Inc. (2006-2013) and has been used herein under license. WWF has not evaluated the data as altered and incorporated within, and therefore gives no warranty regarding its accuracy, completeness, currency or suitability for any particular purpose. Portions of the HydroSHEDS database incorporate data which are the intellectual property rights of © USGS (2006-2008), NASA (2000-2005), ESRI (1992-1998), CIAT (2004-2006), UNEP-WCMC (1993), WWF (2004), Commonwealth of Australia (2007), and Her Royal Majesty and the British Crown and are used under license. The HydroSHEDS database and more information are available at http://www.hydrosheds.org.

Update: Someone asked me for more details so I made a video. Because I did not filter the data to a smaller region I did not use a categorical style in this example (300,000 categories QGIS no likey) but simply a random assignment.

Fake chromatic aberration and dynamic label shadows in QGIS

Welcome to Part 43 of “Fun with Weird Hacks in QGIS”!

Fake Chromatic Aberration

Get some lines or polygons! I used German administrative polygons and an “Outline: Simple line” style.

Add a “Geometry Generator” to the same layer and set it to “LineString / MultiLineString”. If you used polygons, you will need to use “boundary($geometry)” to get the border lines of your polygons.

Translate (shift) the geometries radially away from the map center, with the amount of translation based on their distance to the map center. This will introduce ugly artifacts, err, cool glitches for big geometries. Note the magic constant of 150 I divided the distances by, I cannot be arsed to turn this into percentages so you will have to figure out what works for you. Also, if your map is in a different CRS than the layer you do this with, you will need to transform the coordinates (I do that for the labels below).

 translate(
   boundary($geometry),
   (
    x(centroid($geometry))-x(@map_extent_center)
   )/150
   ,
   (
    y(centroid($geometry))-y(@map_extent_center)
   )/150
)

Color those lines in some fancy 80s color like #00FFFF.

We only want the color to appear at the edges of the map, so set the “Feature Blending” mode of the layer to “Lighten”. This will make sure the white lines do not get darker/colored.

I forgot to take an image for that and then it was lunch time. :x

Now do the same but with the offsets the other way around (shifting towards the map center instead) and color that in something else (like foofy #FF00FF).

 translate(
   boundary($geometry),
   (
    x(@map_extent_center)-x(centroid($geometry))
   )/150
   ,
   (
    y(@map_extent_center)-y(centroid($geometry))
   )/150
)

Oh, can you see it already? Move the map around! Ohhh!

For the final touch, use the layer’s “Draw Effects” to replace the “Source” with a “Blur”. Be aware that the “Blur type” can quite strongly influence the look; find a setting for the “Blur strength” that works for you. I used “Gaussian blur (quality)” with a strength of 2.

Dynamic Label Shadows

Get some data to label! I used ne_10m_populated_places_simple. Label it with labels placed “Offset from Point” without any actual offset. This is just to make sure that calculations based on the geometry’s location can affect the label placement later.

Pick a wicked cool font like Lazer 84!

Add a “Buffer” to the labels and pick an appropriate color (BIG BOLD #FF00FF works well again).

Time for magic! Add a “Shadow” to the labels and use an appropriate color (I used the other color from earlier again, #00FFFF).

We want the label shadows to be further away from the label the further the feature is from the map center. So OVERRIDE the offset with code that does exactly that. Note another magic constant (relating to meters in EPSG:25832) and that I needed to transform coordinates here (my map is in EPSG:25832 while this layer is in EPSG:4326).

distance(  
	transform(
		$geometry, 'EPSG:4326', 'EPSG:25832'
	),
	@map_extent_center
)/10000

Cool. But that just looks weird. Time for another magical ingredient: OVERRIDE the angle at which the shadow is offset. You guessed it, radially away from the map center!

degrees(
	azimuth(  
		transform(
			$geometry, 'EPSG:4326', 'EPSG:25832'
		),  
		@map_extent_center)
	)
-180

By the way, it does look best if your text encoding introduces bad character marks and missing umlauts!

That’s it, move the map around and be WOWed by the sweet effects and the time it takes to render :o)

And now it is your turn to apply this to some MURICA geodata and rake in meaningless internet points that make you feel good! Sad but true ;)

And also TODO: Make the constants percentages instead, that should make sure it works on any projection and any scale.

A bad map about gender differences in literacy

Wrote this in 2014, not sure why I did not publish it. It was a response to this bad map.

[Image: map of World Bank 2011 female vs. male literacy]

No need for expensive software, you can use the free and open-source QGIS for this: https://qgis.org/

1. Install QGIS

2. Download and unzip the data http://databank.worldbank.org/data/download/WDI_excel.zip (not sure what license, they want attribution “World Development Indicators, The World Bank”)

3. Download and unzip country geometries http://www.naturalearthdata.com/http//www.naturalearthdata.com/download/50m/cultural/ne_50m_admin_0_countries.zip (public domain but be nice and add attribution “Geometries from Natural Earth”)

4. Open QGIS, Layer -> Add Vector Layer -> choose ne_50m_admin_0_countries.shp

5. Unfortunately the CSV is not simple: it has more than one row per country because it includes time series, and it does not have the value we want to map precalculated.

Afghanistan,AFG,"Literacy rate, adult female (% of females ages 15 and above)",SE.ADT.LITR.FE.ZS,,,,,,,,,,,,,,,,,,,,4.98746100000000E+00,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
Afghanistan,AFG,"Literacy rate, adult male (% of males ages 15 and above)",SE.ADT.LITR.MA.ZS,,,,,,,,,,,,,,,,,,,,3.03077500000000E+01,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
Afghanistan,AFG,"Literacy rate, adult total (% of people ages 15 and above)",SE.ADT.LITR.ZS,,,,,,,,,,,,,,,,,,,,1.81576800000000E+01,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
Afghanistan,AFG,"Literacy rate, youth female (% of females ages 15-24)",SE.ADT.1524.LT.FE.ZS,,,,,,,,,,,,,,,,,,,,1.11428000000000E+01,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
Afghanistan,AFG,"Literacy rate, youth male (% of males ages 15-24)",SE.ADT.1524.LT.MA.ZS,,,,,,,,,,,,,,,,,,,,4.57960200000000E+01,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
Afghanistan,AFG,"Literacy rate, youth total (% of people ages 15-24)",SE.ADT.1524.LT.ZS,,,,,,,,,,,,,,,,,,,,3.00663500000000E+01,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,

This step is probably the hardest. I will use some Unix tools as I am used to them and they work well. Sorry! You can probably do this with a good text editor or spreadsheet application as well.

Running “csvcut -c 2 WDI_Data.csv | uniq | wc -l” gives 253 lines, i.e. 252 country codes once the header is subtracted. We have 6 lines per country for the literacy data, so we should end up with at most 252 unique values per field. TheZeitgeist only used data from 2011.

To make it more convenient to work with, I first split off the literacy data into a new file with

head -n 1 WDI_Data.csv > Literacy.csv; grep Literacy WDI_Data.csv >> Literacy.csv

No idea if TheZeitgeist mixed Adult and Youth, let’s just use the Adult data for now.

head -n 1 WDI_Data.csv > Literacy_adult.csv; grep -E "Literacy rate, adult (fe)*male" Literacy.csv >> Literacy_adult.csv

Next let’s isolate the data for 2011. csvcut seems to have a bug with numerically named columns so we have to use the field’s index (56) instead of its name “2011”.

csvcut -c 2,3,56 Literacy_adult.csv > Literacy_adult_2011.csv
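
If you want to double-check which field index a year ends up at, you can transpose the header and number the lines (a quick sketch; this assumes the header row of Literacy_adult.csv is the unchanged WDI header):

# the line number printed by grep -n is the field index (56 for 2011 here)
head -n 1 Literacy_adult.csv | tr ',' '\n' | grep -n 2011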

We need to get the data into one line per country. I am lazy, so:

grep "Literacy rate, adult female" Literacy_adult_2011.csv > Literacy_adult_2011_female.csv
grep "Literacy rate, adult male" Literacy_adult_2011.csv | sed 's/.*15 and above)",/,/' > Literacy_adult_2011_male.csv
echo "Country Name,Country Code, Literacy Female, Literacy Male" > Literacy_adult_2011_oneline.csv; paste -d "" Literacy_adult_2011_female.csv Literacy_adult_2011_male.csv >> Literacy_adult_2011_oneline.csv

Enough of that commandline mumbojumbo! QGIS time!

Natural Earth has a column named “wb_a3” which holds the World Bank 3-letter country codes, yay! So we can join the CSV to the countries on that code; the value to map, female minus male literacy, is then:

toreal("Literacy_adult_2011_oneline_Literacy Female") - toreal("Literacy_adult_2011_oneline_Literacy Male")

Figure out the rest yourself. This is where I apparently lost interest in writing back then. ;)
—–

Now make the map better by choosing a projection that does not make Greenland as big as Africa. Also, I would try adding another “attribute” to the display, e.g. changing the alpha value depending on the absolute literacy.

And finally realise that a map is not a good visualisation because you cannot see the values of tiny countries. Make a bar chart instead. ;)

[Image: map of World Bank 2011 female vs. male literacy, plus bar chart]

WFS to Shapefiles

Here is an example using wget, grep, sed and ogr2ogr (from GDAL/OGR) in Bash. This walkthrough assumes that the layer titles do not contain any special characters, i.e. that they can be used directly in the shell and as file names. Not elegant in general, but it works most of the time.

Find a WFS, for example via http://hmdk.de/freitextsuche?action=doSearch&q=wfs. Here I will simply use http://gateway.hamburg.de/OGCFassade/DE_HH_WFS_INSPIRE_A1_4_Verwaltungseinheiten.aspx?REQUEST=GetCapabilities&SERVICE=WFS&VERSION=1.1.0 as an example.

Query the capabilities:

wget -O DE_HH_WFS_INSPIRE_A1_4_Verwaltungseinheiten.GetCapabilities.xml "http://gateway.hamburg.de/OGCFassade/DE_HH_WFS_INSPIRE_A1_4_Verwaltungseinheiten.aspx?REQUEST=GetCapabilities&SERVICE=WFS&VERSION=1.1.0"

Extract the available layer titles:

grep "<wfs:Title>" DE_HH_WFS_INSPIRE_A1_4_Verwaltungseinheiten.GetCapabilities.xml | grep -Eo '>.*<' | sed 's/[<>]//g' > DE_HH_WFS_INSPIRE_A1_4_Verwaltungseinheiten.wfsTitle

In this case those are, for example:

Hafengebietsgrenzen
Stadtteile
Bezirk
Ortsteile

Download the layers:

while read title; do wget -O DE_HH_WFS_INSPIRE_A1_4_Verwaltungseinheiten-${title}.gml "http://gateway.hamburg.de/OGCFassade/DE_HH_WFS_INSPIRE_A1_4_Verwaltungseinheiten.aspx?REQUEST=GetFeature&SERVICE=WFS&VERSION=1.1.0&typeName="${title}; done < DE_HH_WFS_INSPIRE_A1_4_Verwaltungseinheiten.wfsTitle

Convert the GML files to Shapefiles:

while read title; do ogr2ogr -a_srs "EPSG:25832" -f "ESRI Shapefile" -fieldTypeToString IntegerList,StringList DE_HH_WFS_INSPIRE_A1_4_Verwaltungseinheiten-${title}.shp DE_HH_WFS_INSPIRE_A1_4_Verwaltungseinheiten-${title}.gml; done < DE_HH_WFS_INSPIRE_A1_4_Verwaltungseinheiten.wfsTitle
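
If you want to quickly check one of the results, ogrinfo should give you a summary (using the Stadtteile layer from the list above as an example):

ogrinfo -al -so DE_HH_WFS_INSPIRE_A1_4_Verwaltungseinheiten-Stadtteile.shp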

Open Data Hackathon 2014

On Saturday I took part in the Open Data Hackathon in Hamburg. It was great! Here is a little report on what I did.

In the event’s Etherpad I had posted some ideas beforehand and also linked to my (in my opinion only half-finished) map of building ages in Hamburg, which is based on a free building dataset from the Hamburg LGV. The response was very positive, so it was clear that we would do something more with the data.

Achim Tack had the idea of mapping the basement storeys, i.e. the “Hamburg underworld”. Unfortunately only about 10% of the buildings in the dataset have the corresponding attributes, so we mapped the storeys above ground instead. Together with Christina Elmer, Patrick Stotz and Serge (am I forgetting someone?) we decided on the class breaks and the colors. The rest was child’s play with Tilemill. Here is our map of storeys above ground in Hamburg.

[Image: map of storeys above ground in Hamburg (overview)]
[Image: map of storeys above ground in Hamburg (detail)]

I had already colored the buildings by their construction year (or rather the “first structural modification of the building”) last autumn, but never finished the last 5%. At the ODH I then tried a possibly better classification (googled building ages and took the classes from some valuation guideline or other). Today I actually like my old classes better, since they emphasise the older buildings. So I just re-rendered the map with the old style, this time without the highest zoom level: map of building ages in Hamburg. Unfortunately the data is very incomplete, partly wrong and on top of that misleading (the construction year apparently equals the oldest documented survey of a structural modification). Who at the LGV came up with the idea of abusing the number 1500 as “no idea when this building is from”…? So this map is really only usable as art, not for serious analysis. Unfortunately I did not manage to show the attributes on mouseover.

[Image: map of building construction years in Hamburg (overview)]
[Image: map of building construction years in Hamburg (detail)]

Both maps of course have the problem that small buildings are hard to see, so the overall impression can partly mislead the viewer. Classes are also only somewhat distinguishable. Whatever, pretty is pretty!

In between we thought about what else we could do with the building data. I was introduced to the great Urban Atlas data, and Christina Elmer had a wonderful idea that immediately set off the creative mental cinema. I will not say more about that for now, since we are tinkering with it on the side. :)

Conclusion: a great, inspiring and productive day with nice people, interesting projects and tasty food (thanks Marco!). I am looking forward to meeting and hacking again. Who feels like visualising the remaining building attributes? And why not highlight specific years (the post-war period) or typical building types individually?

Also have a look at the other projects; they are listed in the event’s Hackdash.

PS: The building data comes in a GML format. You can open it with QGIS if you have GDAL installed. With GDAL you can also simply convert it to Shapefiles, in which case the attribute names get truncated and duplicate attributes have to be merged, of course:

# loop over all GML files (here with an .xml extension) and convert each one
for file in *.xml; do
  ogr2ogr -a_srs "EPSG:25832" -f "ESRI Shapefile" \
    -fieldTypeToString IntegerList,StringList \
    "$(basename ${file} .xml)".shp "${file}" AX_Gebaeude
done

I uploaded them as Shapefiles: NAS_gebaeude.7z (no guarantees etc.)

Fuel prices in Germany: Collecting the data

Discontinued because I will be working on this data for my master thesis. I’ll try to post updates on that though.

Recently the German government required most fuel stations to report their prices to a central agency in order to curb price fixing and to enable the public to inform themselves. Right now the project is in a test phase. Unfortunately the data is not shared publicly but only with a select few companies. The Telekom Tankstellen website is one resulting frontend to the data. But since the display options and graphical representation did not satisfy me, I just had to make my own map(s). Most importantly I wanted to see the variety of prices across the whole country at once.

In a short series of posts (maybe just two or three) I will show you how to acquire the data from the website, how to get it into a GIS usable format, how to make a map with it and maybe how to automate all of it with the goal of turning it into an animation. Let’s start with the data acquisition.

Scraping data

Prerequisites

  • A web browser that lets you inspect network traffic. I use Opera 12; it has very nice developer tools. You could also use ngrep, tcpdump, Wireshark or something similar, but I find a web browser most convenient.
  • curl or wget
  • Bash or some other way of automating curl/wget
  • Some basic understanding of HTTP

Inspecting the website

[Image: the Telekom Tankstellen map with search results]

The Telekom Tankstellen website offers a search function where you can choose a type of fuel and a radius of up to 20 kilometers around a location in which you want it to return all fuel stations. There is an embedded map which shows the results with big brand labels while highlighting the nearest as well as the cheapest station in the area. When hovering the mouse pointer over a label, the price is shown in a pop-up. Below the map is a list of all the matching stations, their brand, address and price.

Since the map is just an embedded dynamic Google map with custom markers you might already suspect that the raw coordinates of the stations are available to your web browser. Also note how the page does not reload if you make a new search, a great indicator for Ajax.

Enable your browser’s network monitor and reload the website. Now look through the list of requests. (In Opera you can easily filter by resource type, in this case you could filter for XHR.) You should see a file called search.xml. Inspect the details of that request.

[Image: Opera network monitor showing the POST request to search.xml]

As you can see in the image above, the request is an HTTP POST to /api/v1/search.xml with the parameters lat, lng, radius and fuels.

The server’s response is an XML object showing you a multitude of information about the fuel stations:

<search>
  <gasStations type="Array">
    <gasStation
      ident="519b492fbed7139bf0d432c0"
      brand="HEM"
      name="Hem Berlin Holzmarktstraße"
      street="Holzmarktstraße 4"
      zip="10179"
      city="Berlin"
      lat="52.5147271"
      lng="13.4195317"
      highway="false"
      services="convenience_store,grocery_store"
      distance="0.431"
      is_open="true"
      >
      <openingTimes type="Array">
        (...)
      </openingTimes>
      <fuels type="Array">
        <fuel
          kind="Super"
          price="1.539"
          timestamp="1381593720">
        </fuel>
      </fuels>
    </gasStation>
    (...)

Excellent, we found the data source. If you were only interested in those selected few stations, you could just copy’n’paste the XML response but since our goal is to make a map of the whole country we need to be able to automate this data acquisition step for arbitrary coordinates.

With curl you can perform the request like this:

curl --data 'lat=52.5125&lng=13.4143&radius=5&fuels=super' http://tanken.t-online.de/api/v1/search.xml

By default curl will print the response to the terminal but you can tell it to save it to a file or alternatively just use wget instead.

I first tried whether you could use a bigger radius than the 20 kilometers available on the website and yes, the maximum seems to be 50 kilometers. I also tried to find a way to make it return prices for all the fuel types at once but did not succeed. I could imagine that it would work if you passed them in a properly formatted array, but poking around like that would take it too far in my opinion.
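
For reference, the same request with wget, saved to a file and using that maximum radius, could look something like this (the coordinates and output filename are just the example values from above):

wget --post-data 'lat=52.5125&lng=13.4143&radius=50&fuels=super' -O search_52.5125_13.4143.xml http://tanken.t-online.de/api/v1/search.xml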

Performing a mass-scrape

Mass-scraping can be taxing to a service and is ethically questionable, so I will not share a snippet to copy and paste. If you decide to do it yourself, you will have to implement it yourself.

We can now download fuel stations within a radius of 50 kilometers around an arbitrary coordinate. So of course we can do that for many coordinates until we have complete coverage of Germany.

Roughly rounding the numbers, the geographic extent of Germany is

     55.5 N
5.5 E    15.5 E
     47 N

[Image: covering an area with rows of circles]

To fully cover an area with fixed size circles you can just put rows of circles into it, shifting each row by the radius of the circles. This will lead to quite some overlap but definitely cover all of it.

The problem here is that we specify coordinates in WGS84 (a geographic coordinate system) while the radius is specified in meters. The ratio between degrees and meters depends on the latitude: at the southernmost corner of Germany (~47° latitude) one degree of longitude spans about 76 kilometers, while in the north (~56° latitude) it is just 64 kilometers (Source). One degree of latitude always spans roughly 111 kilometers (Source).

[Image: circles covering Germany, with overlap increasing towards the north]

The maximum radius we can use is 50 kilometers. At 47° latitude this corresponds to 50/((2*pi*6371*cos(47*pi/180))/360) = 0.65°. So we can step by 0.65° in the west/east direction and shift by 0.65°/2 = 0.32° sideways on subsequent (north/south) rows, which themselves are 0.22° (50/112/2) apart. You can see the resulting increase in overlap towards the north in the image on the right.
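
If you want to verify those numbers on the command line, bc can do it (just a quick check; pi is written as 4*a(1), i.e. 4*atan(1)):

# degrees of longitude corresponding to 50 km at 47° latitude (roughly the 0.65° used above)
echo '50 / (2 * 4*a(1) * 6371 * c(47 * 4*a(1)/180) / 360)' | bc -l
# latitude step between rows: half of 50 km, in degrees
echo '50 / 112 / 2' | bc -l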

You could load the resulting list of coordinates and remove the unnecessary ones with QGIS. Or alternatively download them all once, see which ones returned zero data and remove those coordinates. Either way, you can significantly reduce the number of requests to the website.
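
Assuming each response was saved to its own .xml file (for example named after its coordinates like in the wget line above), finding the ones that returned no station at all is a one-liner:

# list response files that do not contain a single gasStation element
grep -L '<gasStation' *.xml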

I used a bash script to perform all the resulting requests. If you do it yourself please consider adding a pause between requests and using a custom user-agent with your contact details in case the service administrators want to contact you. If you do this nicely and handle the resulting dataset responsibly, you should not have anything to worry about. But always keep in mind that you are accessing an API that was likely not published to be used like this.

Next: Turning the XML into a data format that you can easily load into QGIS.