TanDEM-X promo images in their full resolution at the bottom, SRTMGL1 at maximum resolution at the top.
If only TanDEM-X data were freely available instead of incredibly expensive.
Let’s take it to the next level: we want every dataset hosted on daten-hamburg.de, because that is where all the fancy geodata lives. So we need to build a query that returns all datasets with “^http://daten-hamburg.de/” in resources.url.
According to http://docs.ckan.org/en/latest/api/index.html#ckan.logic.action.get.package_search you can write complex queries using http://wiki.apache.org/solr/CommonQueryParameters . A bit of scrolling may lead you to resource_search, and googling “ckan resource_search” leads to https://github.com/ckan/ckan/issues/1494, whose query you can adapt until you arrive at http://suche.transparenz.hamburg.de/api/action/resource_search?query=url:http://daten-hamburg.de/ . Dead easy! … The query takes several seconds and seems to return ALL hits, great!
The download is running, as slowly as daten-hamburg.de unfortunately happens to be: https://www.datenatlas.de/geodata/public/sources/
In total it is about 104 gigabytes, though that includes some duplicates. By the way, the data also contains SHA256 hashes, which is handy for verifying the downloads.
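Checking a download against one of those SHA256 hashes could look roughly like this. It is only a sketch: the function names are made up for illustration, and where exactly the hash sits in the metadata is not shown here, so the published hash is simply passed in as a string.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the hex SHA256 digest of a file, read in chunks to save memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, published_hash):
    """Compare a local file with a published hash, ignoring case and whitespace."""
    return sha256_of_file(path) == published_hash.strip().lower()
```

Reading in chunks matters here because some of the source files are several gigabytes in size.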
Ugly-but-does-the-job way to pull out the URLs:
$ cat suche.transparenz.hamburg.de/api/action/resource_search@query\=url%3Ahttp%3A%2F%2Fdaten-hamburg.de%2F | json_pp | grep '"url"' | grep -Eo 'http.*"' | sed 's#"$##' > urls
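A slightly less ugly variant of the same extraction in Python might look like the sketch below. It walks the parsed JSON and collects every "url" value, so it does not depend on the exact layout of the response. The sample response is made up, and the prefix filter is an addition that the grep pipeline does not have.

```python
import json


def collect_urls(node, prefix="http://daten-hamburg.de/"):
    """Recursively collect every "url" value that starts with the given prefix."""
    urls = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "url" and isinstance(value, str) and value.startswith(prefix):
                urls.append(value)
            else:
                urls.extend(collect_urls(value, prefix))
    elif isinstance(node, list):
        for item in node:
            urls.extend(collect_urls(item, prefix))
    return urls


# Tiny made-up response, only roughly in the shape of an API reply.
sample = json.loads("""
{"success": true,
 "result": {"count": 2, "results": [
   {"id": "a", "url": "http://daten-hamburg.de/example/one.zip"},
   {"id": "b", "url": "http://example.org/elsewhere.zip"}
 ]}}
""")
print("\n".join(collect_urls(sample)))
```

For the real thing you would load the saved response file with json.load() instead of the inline sample.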
Basic links:
http://transparenz.hamburg.de/hinweise-zur-api/
http://transparenz.hamburg.de/contentblob/4354384/f19d09732a6ea80ae9808de157b5ba4c/data/mdm-schema1-6.pdf
http://docs.ckan.org/en/latest/api/index.html
The links to the data are in the resources of the packages, e.g.:
import urllib.request
import json

url = "http://suche.transparenz.hamburg.de/api/3/action/package_show?id=larmminderungsplanung-fluglarm-hamburg2"
with urllib.request.urlopen(url) as req:
    response = req.read()

response_dict = json.loads(response.decode('utf-8'))
assert response_dict['success']
result = response_dict['result']
resources = result['resources']
for resource in resources:
    print(resource['url'])
gives us
http://geodienste.hamburg.de/HH_WMS_Fluglaermschutzzonen?REQUEST=GetCapabilities&SERVICE=WMS
http://geodienste.hamburg.de/HH_WFS_Fluglaermschutzzonen?REQUEST=GetCapabilities&SERVICE=WFS
http://daten-hamburg.de/umwelt_klima/laermminderungsplanung_fluglaerm/Kurvenpunkte_Laermschutzbereich_Flughafen_Hamburg_EDDH.xlsx
http://suche.transparenz.hamburg.de/localresources/HMDK/335B680C-CA3E-4FE9-BC05-641BA565E366/Kurvenpunkte_Laermschutzbereich_Flughafen_Hamburg_EDDH.zip
http://daten-hamburg.de/umwelt_klima/laermminderungsplanung_fluglaerm/Laermminderungsplanung_Fluglaerm_HH_2015-07-27.zip
http://metaver.de/trefferanzeige?docuuid=335B680C-CA3E-4FE9-BC05-641BA565E366
That's a start.
While this is about data of the city of Hamburg, Germany, I decided to post in English as GeoPackage Propaganda should be accessible. ;)
I am casually working on converting open geodata released via the Transparenzportal Hamburg to more usable GeoPackages, GeoTIFFs and similar formats with free and open-source tools like GDAL and GMT where possible. This includes things like orthophotos, ALKIS, addresses, DEM, districts etc. You can get a list of most available source data here but there are some datasets “hidden” in other categories as well. The data is usually released in GML or as gridded files (e.g. JPEG or XYZ tiles/files). While this is pretty much perfect as source formats, working with them is cumbersome. My goal is to make this data more accessible for anyone in tools like QGIS.
For now you can find the 20cm orthophotos for 2013-2015 and the 1m DEM in https://www.datenatlas.de/geodata/public/hamburg/. Mind the licenses, see the readme file for a bit of info. More to come, I want to plan the pipeline a bit better first though. There should be a full script & log from source to GeoPackage for each file.
I will also provide mirrors of the source files. If you want to collaborate, please contact me. Apart from Hamburg I will also add free/open datasets for the whole of Germany, things related to (nearby) bathymetry and some global ones. If you want Shapefiles, ECW, MrSID or similar, you can pay me for converting.
WTF!
https://data.oaklandnet.com/browse?q=alpr via https://news.ycombinator.com/item?id=10642139
817159 timestamped car plate locations.
Pages:
https://data.oaklandnet.com/Public-Safety/All-License-Plate-Reader-Data-ALPR-September-23-20/6dab-n9nd
https://data.oaklandnet.com/Public-Safety/All-license-plate-reader-data-ALPR-04-01-2014-thru/k76g-27ne
https://data.oaklandnet.com/Public-Safety/All-license-plate-reader-data-ALPR-3-1-2013-4-1-20/m64r-jeei
https://data.oaklandnet.com/Public-Safety/All-license-plate-reader-data-ALPR-3-19-2014-3-31-/wy2w-ue82
https://data.oaklandnet.com/Public-Safety/All-license-plate-reader-data-ALPR-4-1-14-thru-5-3/7axi-hi5i
https://data.oaklandnet.com/Public-Safety/All-license-plate-reader-data-ALPR-5-13-2012-7-3-2/f28j-9q95
https://data.oaklandnet.com/Public-Safety/All-license-plate-reader-data-ALPR-7-6-2011-12-19-/7xz6-yzxz
https://data.oaklandnet.com/Public-Safety/All-license-plate-reader-data-ALPR-8-6-2012-8-12-2/gyu7-qpwz
https://data.oaklandnet.com/Public-Safety/All-license-plate-reader-data-ALPR-8-26-2012-9-19-/bd9f-4pn8
https://data.oaklandnet.com/Public-Safety/All-license-plate-reader-data-ALPR-9-15-2012/vwcs-329i
https://data.oaklandnet.com/Public-Safety/All-license-plate-reader-data-ALPR-10-20-2012-11-2/cyhz-jk8v
https://data.oaklandnet.com/Public-Safety/All-license-plate-reader-data-ALPR-11-20-2012-12-3/jxx3-67d2
https://data.oaklandnet.com/Public-Safety/All-license-plate-reader-data-ALPR-12-19-2013-1-31/h4qp-hsyy
https://data.oaklandnet.com/Public-Safety/All-license-plate-reader-data-ALPR-12-23-2010-thru/t7dd-x7dh
grab.sh:
ids="6dab-n9nd
7axi-hi5i
7xz6-yzxz
bd9f-4pn8
cyhz-jk8v
f28j-9q95
gyu7-qpwz
h4qp-hsyy
jxx3-67d2
k76g-27ne
m64r-jeei
t7dd-x7dh
vwcs-329i
wy2w-ue82"

for id in ${ids}
do
  echo "Grabbing ${id}"
  wget -a wget.log -x --content-disposition "https://data.oaklandnet.com/api/views/${id}/rows.csv?accessType=DOWNLOAD"
  # screw excel wget -a wget.log -x --content-disposition "https://data.oaklandnet.com/api/views/${id}/rows.csv?accessType=DOWNLOAD&bom=true"
  wget -a wget.log -x --content-disposition "https://data.oaklandnet.com/api/views/${id}/rows.json?accessType=DOWNLOAD"
  wget -a wget.log -x --content-disposition "https://data.oaklandnet.com/api/views/${id}/rows.pdf?accessType=DOWNLOAD"
  wget -a wget.log -x --content-disposition "https://data.oaklandnet.com/api/views/${id}/rows.rdf?accessType=DOWNLOAD"
  wget -a wget.log -x --content-disposition "https://data.oaklandnet.com/api/views/${id}/rows.rss?accessType=DOWNLOAD"
  wget -a wget.log -x --content-disposition "https://data.oaklandnet.com/api/views/${id}/rows.xls?accessType=DOWNLOAD"
  wget -a wget.log -x --content-disposition "https://data.oaklandnet.com/api/views/${id}/rows.xlsx?accessType=DOWNLOAD"
  wget -a wget.log -x --content-disposition "https://data.oaklandnet.com/api/views/${id}/rows.xml?accessType=DOWNLOAD"
done

exit
Be aware that only the CSV format is guaranteed to have all records. At least that’s what some files say.
$ sed 's#,.*##g' *.csv | sort | uniq | wc -l
817159
sed 's#,.*##g' *.csv | sort | uniq -c | sort -h | tail
grep -h '^PLATENUMBER,' *.csv | sed 's#,"(#,#g' | sed 's#)"##'
grep -hEo "\(37.*\)\"" *.csv | sed 's#[()" ]*##g'
Happy stalking :(
The coordinates are wrong: they should have been xmin and ymin to match the official grid. I will update the PDFs soonish, sorry!
The LGV offers its official UTM grid for Hamburg in the Transparenzportal. Since many datasets are indexed by those grid tiles, it can be handy to have a quick reference. Cue QGIS!
Load the layers “utm_raster1km any” and “utm_raster2km any” of the GML file. The CRS is EPSG:25832. Set their styles to have no fill.
Label with substr(x_min($geometry),0,4) || '\n' || substr(y_min($geometry),0,5)
to truncate the coordinate display to just the interesting bits, the three leading numbers of X and the four leading numbers of Y.
Print them. You now have nice maps of the grid that you can use as reference when browsing through files with names like dgm1_32552_5936_2_fhh.xyz, LoD1_571_5939_1_HH.xml or dop20c_32576_5953.jpg (ignore the leading 32…).
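If you would rather look a tile up by script than by paper map, the grid reference can be pulled out of such file names. This is a small sketch based only on the three example names above; the pattern and the function name are assumptions, and other datasets may well name their tiles differently.

```python
import re

# Optional leading zone number "32", then three digits of easting (km)
# and four digits of northing (km), as in the example file names.
TILE_PATTERN = re.compile(r"_(?:32)?(\d{3})_(\d{4})(?:_|\.)")


def tile_from_filename(name):
    """Return (easting_km, northing_km) parsed from a tile file name, or None."""
    match = TILE_PATTERN.search(name)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


for example in ("dgm1_32552_5936_2_fhh.xyz",
                "LoD1_571_5939_1_HH.xml",
                "dop20c_32576_5953.jpg"):
    print(example, tile_from_filename(example))
```

The two numbers are the same digits that the label expression prints on the map.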
I added the Stadtteile as background (and wished QGIS could style by the four color theorem).
Click to download PDFs (they should be DIN A4, ask the composer why they are not):
For example the buildings in Hamburg, Germany:
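# Merge one layer from all GML/XML files in the current directory into a single Spatialite file.
# Usage: bash mergexmltospatialite.sh LAYERNAME
layer=$1
for file in *.xml
do
  if [ -f "${layer}".spatialite ]
  then
    # Target exists already: append this file's features to it
    ogr2ogr -f "SQLite" -update -append "${layer}".spatialite "${file}" "${layer}" -dsco SPATIALITE=YES
  else
    # First file: create the target and assign the CRS
    ogr2ogr -a_srs EPSG:25832 -f "SQLite" "${layer}".spatialite "${file}" "${layer}" -dsco SPATIALITE=YES
  fi
done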
Remove the -dsco SPATIALITE=YES option and change the output filename if you want plain SQLite instead. QGIS can work with both.
$ bash mergexmltospatialite.sh AX_Gebaeude
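The create-then-append logic of the script can also be driven from Python, for example if you want to log or inspect each command first. This is a sketch under the assumption that ogr2ogr is on the PATH; the function names are made up, and by default nothing is executed (dry run), the commands are only built.

```python
import os
import subprocess


def build_ogr2ogr_command(layer, source, target_exists):
    """Build the ogr2ogr argument list for creating or appending to the target."""
    target = f"{layer}.spatialite"
    if target_exists:
        # Append to the existing Spatialite file
        return ["ogr2ogr", "-f", "SQLite", "-update", "-append",
                target, source, layer, "-dsco", "SPATIALITE=YES"]
    # Create the file from the first source and assign the CRS
    return ["ogr2ogr", "-a_srs", "EPSG:25832", "-f", "SQLite",
            target, source, layer, "-dsco", "SPATIALITE=YES"]


def merge_files(layer, sources, dry_run=True):
    """Return the commands for all sources; run them unless dry_run is set."""
    commands = []
    target_exists = os.path.exists(f"{layer}.spatialite")
    for source in sources:
        command = build_ogr2ogr_command(layer, source, target_exists)
        commands.append(command)
        if not dry_run:
            subprocess.run(command, check=True)
        target_exists = True  # after the first file the target is there
    return commands
```

With check=True the run stops at the first failing file instead of silently carrying on.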
Be aware that Spatialite is much more sensitive to geometric problems. You might get things like
ERROR 1: sqlite3_step() failed:
ax_gebaeude.GEOMETRY violates Geometry constraint [geom-type or SRID not allowed] (19)
ERROR 1: ROLLBACK transaction failed: cannot rollback - no transaction is active
ERROR 1: Unable to write feature 1712 from layer AX_Gebaeude.
ERROR 1: Terminating translation prematurely after failed translation of layer AX_Gebaeude (use -skipfailures to skip errors)
but on the other hand, you get spatial indexing which makes queries or high zoom interaction much quicker.
Be aware that if you try to merge files into a Shapefile and fields are getting truncated, those fields will only be filled with data for the first file you merge. On the later files OGR will try to match the input field names to the merged file’s field names, notice the difference and discard them. If you still want to convert to Shapefiles, check out the -fieldTypeToString IntegerList,StringList options.
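To see in advance which fields are at risk, you can check the names against the 10-character limit of the dBase format that Shapefiles use. The sketch below only cuts names off at 10 characters, which is a simplification of whatever OGR actually does when it shortens them, and the field names in the example are made up.

```python
from collections import defaultdict

DBF_NAME_LIMIT = 10  # dBase field names in Shapefiles hold at most 10 characters


def truncation_report(field_names, limit=DBF_NAME_LIMIT):
    """Return (names that get truncated, truncated names shared by several fields)."""
    groups = defaultdict(list)
    for name in field_names:
        groups[name[:limit]].append(name)
    truncated = [name for name in field_names if len(name) > limit]
    collisions = {short: names for short, names in groups.items() if len(names) > 1}
    return truncated, collisions


# Hypothetical field names, for illustration only
truncated, collisions = truncation_report(
    ["gml_id", "gebaeudefunktion", "gebaeudefunktion_zusatz"])
print(truncated)
print(collisions)
```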
tl;dr: GMT is documented for people who have been using it since the 80s.
update 2: You can use gmtinfo to calculate the extents, using -I- will make it output the -R parameter! xyz2grd $(gmtinfo -I- *.xyz) ...
update: xyz2grd supports GeoTIFF via its GDAL driver (now?)! For example -Gfile.tif=gd:GTiff . See e.g. http://gmt.soest.hawaii.edu/doc/5.2.1/grdconvert.html. So the way below is overly complicated. I do not know how to apply GDAL's advanced settings like compression and prediction, though.
The Statistische Ämter des Bundes und der Länder offer a 100 meter grid of Germany’s population density: csv_Bevoelkerung_100m_Gitter.zip (110MB). Datensatzbeschreibung_Bevoelkerung_100m_Gitter.xlsx provides additional information.
Let’s turn that dataset into a GeoTIFF so we can use it in our GIS. We will use free and open-source tools from GMT and GDAL. GDAL loves to interpolate values but our data is discrete/regular. We do not want any kind of interpolation. So xyz2grd from GMT is the best choice for turning the xyz data into a “continuous” GIS format (tell me if not).
Inside the zip is a 1.3GB file Zensus_Bevoelkerung_100m-Gitter.csv with about 36 million lines.
Gitter_ID_100m;x_mp_100m;y_mp_100m;Einwohner
100mN26840E43341;4334150;2684050;-1
100mN26840E43342;4334250;2684050;-1
…
100mN27407E44044;4404450;2740750;3
100mN27407E44045;4404550;2740750;31
100mN27407E44046;4404650;2740750;13
100mN27407E44047;4404750;2740750;14
100mN27407E44048;4404850;2740750;10
Datensatzbeschreibung_Bevoelkerung_100m_Gitter.xlsx says the coordinates are in ETRS89-LAEA Europe – EPSG:3035.
First we need to find out the geographic extents of the data. You could use your favourite CLI tools for that; I wrote a quick .vrt file and used ogrinfo on it:
$ cat Zensus_Bevoelkerung_100m-Gitter.csv.vrt
<OGRVRTDataSource>
  <OGRVRTLayer name="Zensus_Bevoelkerung_100m-Gitter">
    <LayerSRS>EPSG:3035</LayerSRS>
    <SrcDataSource>Zensus_Bevoelkerung_100m-Gitter.csv</SrcDataSource>
    <GeometryType>wkbPoint</GeometryType>
    <GeometryField encoding="PointFromColumns" x="x_mp_100m" y="y_mp_100m" />
  </OGRVRTLayer>
</OGRVRTDataSource>
$ ogrinfo -al Zensus_Bevoelkerung_100m-Gitter.csv.vrt
INFO: Open of `Zensus_Bevoelkerung_100m-Gitter.csv.vrt' using driver `VRT' successful.
Layer name: Zensus_Bevoelkerung_100m-Gitter
Geometry: Point
Feature Count: 35785840
Extent: (4031350.000000, 2684050.000000) - (4672550.000000, 3551450.000000)
…
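If you do not want to go through a .vrt file, the same extents can be found by streaming over the CSV once. This is a sketch using only the standard library; the function names are made up, and the sample rows are taken from the excerpt above rather than from the full file.

```python
import csv
import io


def extent_from_csv(lines, x_field="x_mp_100m", y_field="y_mp_100m"):
    """Return (xmin, xmax, ymin, ymax) of a semicolon-separated CSV with a header."""
    xmin = xmax = ymin = ymax = None
    for row in csv.DictReader(lines, delimiter=";"):
        x, y = int(row[x_field]), int(row[y_field])
        if xmin is None:
            xmin = xmax = x
            ymin = ymax = y
        else:
            xmin, xmax = min(xmin, x), max(xmax, x)
            ymin, ymax = min(ymin, y), max(ymax, y)
    if xmin is None:
        raise ValueError("no data rows found")
    return xmin, xmax, ymin, ymax


def region_option(extent):
    """Format an extent as a GMT -R option (xmin/xmax/ymin/ymax)."""
    return "-R" + "/".join(str(value) for value in extent)


sample = io.StringIO(
    "Gitter_ID_100m;x_mp_100m;y_mp_100m;Einwohner\n"
    "100mN26840E43341;4334150;2684050;-1\n"
    "100mN27407E44048;4404850;2740750;10\n"
)
print(region_option(extent_from_csv(sample)))
```

For the real file you would pass an open file handle instead of the StringIO sample.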
xyz2grd wants xyz, nothing else. The Gitter_ID_100m column is redundant in any case, you can calculate it yourself from the x and y fields if needed. So first let’s convert it to a “x y z” format with awk. The separator is ;
awk -F';' '{print $2" "$3" "$4}' Zensus_Bevoelkerung_100m-Gitter.csv > Zensus_Bevoelkerung_100m-Gitter.xyz
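As for recalculating the dropped Gitter_ID_100m column: judging by the sample rows above, the ID is the cell size followed by the northing and easting of the cell midpoint divided by 100. The sketch below is inferred from those rows only, not taken from the official description, so treat the pattern (and the lack of zero-padding) as an assumption.

```python
def grid_id(x_mp, y_mp, cell_size=100):
    """Rebuild the Gitter_ID from cell midpoint coordinates (inferred pattern)."""
    north = y_mp // cell_size
    east = x_mp // cell_size
    return f"{cell_size}mN{north}E{east}"


print(grid_id(4334150, 2684050))
```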
Now we can write our xyz2grd commandline.
We have our extents:
-R4031350/4672550/2684050/3551450
We know the spacing is 100 units (meters):
-I100
There is one header line:
-h1
And of course we know that we want a “classic” netcdf4 chunk size, whatever that means. We knew that right away, not after googling helplessly for an hour and eventually finding the hint on some mailing list. Not knowing this might have led to QGIS only seeing NaN values for z, R’s ncdf/raster saying “Error in substr(w, 1, 3) : invalid multibyte string at '<89>HDF'” and GDAL saying “0ERROR 1: nBlockYSize = 130, only 1 supported when reading bottom-up dataset”.
--IO_NC4_CHUNK_SIZE=c
The resulting commandline:
xyz2grd -Vl -R4031350/4672550/2684050/3551450 -I100 -h1 --IO_NC4_CHUNK_SIZE=c -GZensus_Bevoelkerung_100m-Gitter.cdf Zensus_Bevoelkerung_100m-Gitter.xyz
You can inspect the file with grdinfo and gdalinfo now if you want.
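When inspecting, it helps to know what grid size to expect. Assuming GMT's default gridline registration (no -r given), the node count per axis is the extent divided by the spacing, plus one. A quick sketch of that arithmetic with the -R and -I values used above (the function name is made up):

```python
def grid_dimensions(xmin, xmax, ymin, ymax, spacing):
    """Number of columns and rows for a gridline-registered grid."""
    columns = (xmax - xmin) // spacing + 1
    rows = (ymax - ymin) // spacing + 1
    return columns, rows


columns, rows = grid_dimensions(4031350, 4672550, 2684050, 3551450, 100)
print(columns, rows)
```

The grid has more cells than the CSV has lines, so some cells have no value in the source data.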
Let’s turn it into a GeoTIFF with gdal_translate. We will need a bunch of commandline parameters.
The spatial reference system is EPSG:3035, so:
-a_srs EPSG:3035
Values of -1 mean “no data”:
-a_nodata -1
TIFF is uncompressed by default, we want good lossless compression:
-co COMPRESS=DEFLATE
The resulting commandline:
gdal_translate -co COMPRESS=DEFLATE -a_srs EPSG:3035 -a_nodata -1 Zensus_Bevoelkerung_100m-Gitter.cdf Zensus_Bevoelkerung_100m-Gitter.tif
The resulting file is about 8 Megabytes and should work in any reasonable GIS. Have fun!
http://hannes.enjoys.it/opendata/Zensus_Bevoelkerung_100m-Gitter.tif
TODO: What license is this now?
On the great pages of the Hamburger Luftmessnetz you can see nice charts of the various measurements, for example of particulate matter: PM10 or PM2.5. Unfortunately you cannot link to charts of all stations at once; the user has to assemble them by hand instead. Because I wanted to link the curves on /r/dataisbeautiful and the new rules there are a bit peculiar, I simply rebuilt the charts with Gnumeric.
First the particles with an aerodynamic diameter of less than 10 micrometres (10 µm), PM10. For these, a daily mean of 50 µg/m³ may be exceeded 35 times per year.
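Checking a station against that rule is simple counting. A small sketch, with made-up daily means rather than real measurements and function names chosen for illustration:

```python
PM10_DAILY_LIMIT = 50      # µg/m³, daily mean
ALLOWED_EXCEEDANCES = 35   # days per year


def count_exceedances(daily_means, limit=PM10_DAILY_LIMIT):
    """Count the days whose mean lies strictly above the limit."""
    return sum(1 for value in daily_means if value > limit)


def within_allowance(daily_means, allowed=ALLOWED_EXCEEDANCES):
    """True if the limit was exceeded on no more than the allowed number of days."""
    return count_exceedances(daily_means) <= allowed


# Hypothetical daily means in µg/m³
example = [23.0, 51.2, 50.0, 87.5]
print(count_exceedances(example), within_allowance(example))
```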
And then the nastier PM2.5 (smaller than 2.5 µm), which are only measured at three stations:
In Berlin these values are unfortunately only available daily, but the Feinstaub-Monitor of the Berliner Morgenpost is worth a look (even though it unfortunately seems to have ended in 2014?).
English version: This image shows the amount of particulate matter with a diameter of 10 micrometres or less in the days before and during new year 2014/2015 in Hamburg, Germany. This image shows the amount of particulate matter with a diameter of 2.5 micrometres or less.
There are nice values nationwide as well, unfortunately only daily. The Umweltbundesamt provides interpolated maps; I combined the last four days:
Too bad the colour scale has a fixed maximum; the values were presumably well above 50 µg/m³.
Anyway, happy new year!
Dear transparency person, here is my wish list for the Hamburg Transparenzportal:
Anyway, thank you for the many hours I have already been able to spend playing with your data gifts this year. Next year, will you give us the obligation for the Anstalten öffentlichen Rechts (public-law institutions) to publish? They are still dodging it at the moment. And please tell the federal transparency person that the MTS-K data really should simply be released. That would also take an enormous workload off the staff there.