Category Archives: commandline

Converting coordinates with cs2cs

In the spirit of Blog Little Things and after reading WEBKID’s You Might Not Need QGIS earlier, here you go:

So you have thousands or millions of coordinates in “the wrong” or “that stupid” coordinate reference system and want to convert them to “the one you need”? Easily done with proj‘s cs2cs tool. You feed it a list of coordinates and tell it how you want them transformed and put out.

Let’s say you have these crazy accurate super coordinates in EPSG:4326 (WGS84) in a file called MyStupidCoordinates.txt (for example from copying columns out of Excel).


9.92510775592794303 53.63833616755876932
9.89715616602172332 53.61639299685715798
9.91541274879985579 53.63589289755918799
9.91922922611725966 53.63415202989637010
9.92072211107566382 53.63179675033637750
9.89998015840083490 53.62284810375095390
9.90427723676020832 53.60740624866674153
9.95012889485460583 53.64563499608360075
9.89590860878436196 53.62979225409544171
9.92944379841924452 53.60061248161031955

and you want them in the much superior and wonderfully metric EPSG:25832 (ETRS89 / UTM zone 32N). You simply use +init=sourceCRS +to +init=targetCRS. if you have no idea what CRS your coordinates are in, enjoy 5 minutes with http://projfinder.com/ and you will either know or you might want to take a walk (don’t forget to try them flipped though).

cs2cs +init=epsg:4326 +to +init=epsg:25832 MyStupidCoordinates.txt

and it will print


561164.00 5943681.64 0.00
559346.79 5941216.81 0.00
560526.52 5943401.54 0.00
560781.36 5943211.12 0.00
560883.46 5942950.37 0.00
559524.51 5941937.29 0.00
559830.54 5940223.00 0.00
562807.40 5944515.43 0.00
559245.50 5942706.43 0.00
561505.49 5939488.65 0.00

You probably want more decimal places though, since your input coordinates were accurate down to the attodegree. For this, you can specify the output format with -f

cs2cs +init=epsg:4326 +to +init=epsg:25832 -f "%.17f" MyStupidCoordinates.txt


561164.00100000295788050 5943681.64279999211430550 0.00000000000000000
559346.79200000269338489 5941216.80899999570101500 0.00000000000000000
560526.52400000276975334 5943401.53899999428540468 0.00000000000000000
560781.36300000245682895 5943211.12199999392032623 0.00000000000000000
560883.46400000224821270 5942950.37499999348074198 0.00000000000000000
559524.51200000231619924 5941937.29399999510496855 0.00000000000000000
559830.54200000269338489 5940223.00299999490380287 0.00000000000000000
562807.39800000295508653 5944515.43399999290704727 0.00000000000000000
559245.50000000221189111 5942706.42899999395012856 0.00000000000000000
561505.48800000257324427 5939488.65499999187886715 0.00000000000000000

Perfect!

You could direct these transformed coordinates into a new file called MyCoolCoordinates.txt by adding a redirection of its output:

cs2cs +init=epsg:4326 +to +init=epsg:25832 MyStupidCoordinates.txt > MyCoolCoordinates.txt

You can find out more about cs2cs by reading its manpage:

man cs2cs

PS: You can handle that Z coordinate, right?

Merge GML files into one SQLite or Spatialite file

For example the buildings in Hamburg, Germany:

layer=$1
for file in *.xml
do
 if [ -f "${layer}".spatialite ]
 then
  ogr2ogr -f "SQLite" -update -append "${layer}".spatialite "${file}" "${layer}" -dsco SPATIALITE=YES
 else
  ogr2ogr -a_srs EPSG:25832 -f "SQLite" "${layer}".spatialite "${file}" "${layer}" -dsco SPATIALITE=YES
 fi
done

Remove the -dsco SPATIALITE=YES and change the output filename for SQLite. QGIS can work with both.

$ bash mergexmltospatialite.sh AX_Gebaeude

Be aware that Spatialite is much more sensitive to geometric problems. You might get things like

ERROR 1: sqlite3_step() failed:
ax_gebaeude.GEOMETRY violates Geometry constraint [geom-type or SRID not allowed] (19)
ERROR 1: ROLLBACK transaction failed: cannot rollback - no transaction is active
ERROR 1: Unable to write feature 1712 from layer AX_Gebaeude.

ERROR 1: Terminating translation prematurely after failed translation of layer AX_Gebaeude (use -skipfailures to skip errors)

but on the other hand, you get spatial indexing which makes queries or high zoom interaction much quicker.

Be aware that if you try to merge files into a Shapefile and fields are getting truncated, those fields will only be filled with data for the first file you merge. On the later files OGR will try to match the input field names to the merged file’s fieldnames, notice the difference and discard them. If you still want to convert to Shapefiles, check out the -fieldTypeToString IntegerList,StringList options.

A GeoTIFF of the German 2011 census population density

tl;dr: GMT is documented for people who use it since the 80s.

update 2: You can use gmtinfo to calculate the extents, using -I- will make it output the -R parameter! xyz2grd $(gmtinfo -I- *.xyz) ...

update: xyz2grd supports GeoTIFF via its GDAL driver (now?)! For example -Gfile.tif=gd:GTiff. See eg http://gmt.soest.hawaii.edu/doc/5.2.1/grdconvert.html. So the way below is overly complicated. I do not know how to apply the advanced settings of GDAL though like compression and prediction etc.

zensus2011_berlinzensus2011_irgendwozensus2011_bremen

The Statistische Ämter des Bundes und der Länder offer a 100 meter grid of Germany’s population density: csv_Bevoelkerung_100m_Gitter.zip (110MB). Datensatzbeschreibung_Bevoelkerung_100m_Gitter.xlsx provides additional information.

Let’s turn that dataset into a GeoTIFF so we can use it in our GIS. We will use free and open-source tools from GMT and GDAL. GDAL loves to interpolate values but our data is discrete/regular. We do not want any kind of interpolation. So xyz2grd from GMT is the best choice for turning the xyz data into a “continuous” GIS format (tell me if not).

Getting to know the data

Inside the zip is a 1.3GB file Zensus_Bevoelkerung_100m-Gitter.csv with about 36 million lines.

Gitter_ID_100m;x_mp_100m;y_mp_100m;Einwohner
100mN26840E43341;4334150;2684050;-1
100mN26840E43342;4334250;2684050;-1

100mN27407E44044;4404450;2740750;3
100mN27407E44045;4404550;2740750;31
100mN27407E44046;4404650;2740750;13
100mN27407E44047;4404750;2740750;14
100mN27407E44048;4404850;2740750;10

Datensatzbeschreibung_Bevoelkerung_100m_Gitter.xlsx says the coordinates are in ETRS89-LAEA Europe – EPSG:3035.

First we need to find out the geographic extends of the data, you could use your favourite cli tools for that, I wrote a quick .vrt file and used ogrinfo on that:

$ cat Zensus_Bevoelkerung_100m-Gitter.csv.vrt

<OGRVRTDataSource>
 <OGRVRTLayer name="Zensus_Bevoelkerung_100m-Gitter">
  <LayerSRS>EPSG:3035</LayerSRS>
  <SrcDataSource>Zensus_Bevoelkerung_100m-Gitter.csv</SrcDataSource>
  <GeometryType>wkbPoint</GeometryType>
  <GeometryField encoding="PointFromColumns" x="x_mp_100m" y="y_mp_100m" />
 </OGRVRTLayer>
</OGRVRTDataSource>

$ ogrinfo -al Zensus_Bevoelkerung_100m-Gitter.csv.vrt

INFO: Open of `Zensus_Bevoelkerung_100m-Gitter.csv.vrt’ using driver `VRT’ successful.

Layer name: Zensus_Bevoelkerung_100m-Gitter
Geometry: Point
Feature Count: 35785840
Extent: (4031350.000000, 2684050.000000) – (4672550.000000, 3551450.000000)

Creating a netcdf/grd file from the xyz data with xyz2grd

xyz2grd wants xyz, nothing else. The Gitter_ID_100m column is redundant in any case, you can calculate it yourself from the x and y fields if needed. So first let’s convert it to a “x y z” format with awk. The separator is ;

awk 'FS=";" {print $2" "$3" "$4}' Zensus_Bevoelkerung_100m-Gitter.csv > Zensus_Bevoelkerung_100m-Gitter.xyz

Now we can write our xyz2grd commandline.

We have our extends:
-R4031350/4672550/2684050/3551450

We know the spacing is 100 units (meters):
-I100

There is one header line:
-h1

And of course we know that we want a “classic” netcdf4 chunk size, whatever that means. We knew that right away, not after googling helplessly for an hour and eventually finding the hint on some mailing list. Not knowing this might have lead to QGIS only seeing NaN values for z, R’s ncdf/raster saying “Error in substr(w, 1, 3) : invalid multibyte string at ‘<89>HDF” and GDAL “0ERROR 1: nBlockYSize = 130, only 1 supported when reading bottom-up dataset”.
--IO_NC4_CHUNK_SIZE=c

The resulting commandline:
xyz2grd -Vl -R4031350/4672550/2684050/3551450 -I100 -h1 --IO_NC4_CHUNK_SIZE=c -GZensus_Bevoelkerung_100m-Gitter.cdf Zensus_Bevoelkerung_100m-Gitter.xyz

You can inspect the file with grdinfo and gdalinfo now if you want.

Creating a GeoTIFF from the netcdf/grd file

Let’s turn it into a GeoTIFF with gdal_translate. We will need a bunch of commandline parameters.

The spatial reference system is EPSG:3035, so:
-a_srs EPSG:3035

Values of -1 mean “no data”:
-a_nodata -1

TIFF is uncompressed by default, we want good lossless compression:
-co COMPRESS=DEFLATE

The resulting commandline:
gdal_translate -co COMPRESS=DEFLATE -a_srs EPSG:3035 -a_nodata -1 Zensus_Bevoelkerung_100m-Gitter.cdf Zensus_Bevoelkerung_100m-Gitter.tif

The resulting file is about 8 Megabytes and should work in any reasonable GIS. Have fun!

Downloads

http://hannes.enjoys.it/opendata/Zensus_Bevoelkerung_100m-Gitter.tif
TODO: What license is this now?

zensus2011_hamburg
zensus2011_altersheimzensus2011_schrebergärten

“Commandline File Hosts”

These are the “terminal to URL” file hosts I know. Please tell me about others. The speed tests are not scientific at all, my servers might be uncapable of better speeds because of saturation or bad routing.

https://transfer.sh/

curl --upload-file "01_Name_Game_(Intro).mp3" https://transfer.sh/

Maximum Filesize: 5 Gigabytes
Expiration after: 2 weeks
Uses HTTPS!
Serves error 500 for non-existing files. Does not serve direct downloads anymore. Bleh.
Download URLs are modeled after your original filename (“special” characters are replaced): https://transfer.sh/SGNl/01-name-game-intro.mp3
1GB.zip upload: ~30-110 Megabit/s
1GB.zip download: ~240 Megabit/s

http://curl.io/

curl -F "file=@01_Name_Game_(Intro).mp3" http://curl.io/send/abcd1234

Maximum Filesize: 20 Gigabytes
Expiration after: 4 hours
Requires you to visit their website to get a key (URL) assigned.
Download URLs are random and but it serves original filename with content-disposition: http://curl.io/get/kegcxfza/85866a9bcfaf6f8eccab136238c07f659d580b25
1GB.zip upload: ~60-75 Megabit/s
1GB.zip download: 100 Megabit/s

http://chunk.io

curl --upload-file 01_Name_Game_\(Intro\).mp3 chunk.io

Maximum Filesize: 50 Megabytes
Expiration after: 6 months
They might increase filesize and expiration if there is enough interest, as of now you can not “sign-up” for those.
Download URLs are unreadable, filenames are lost: http://chunk.io/f/afb01d527ceb4d03b9bf9e62bb32263e
1GB.zip upload: ~8 Megabit/s (until it errored :) )

http://purrrl.link/
Requires sign-up.