One issue has always bugged me: why do I have to go through an intermediate format on disk when I want to import a GRASS raster layer into R? At the moment, when I use readRAST6(), the raster layer is exported from GRASS into an intermediate format (I don't recall which one) on the hard disk, then this file is imported into R, and the intermediate file is deleted. Now, this works reliably and reasonably fast, but somehow I don't like this intermediate file. So my idea is: why not use Rcpp to access the functions in GRASS that read the raster column-wise, and write a function in R which makes it possible to
- read the whole raster from the GRASS raster
- read single columns or column ranges from the GRASS raster
- read single cells from the GRASS raster
- read user specified blocks from the GRASS raster
Vice versa, there is a C function in GRASS which writes columns to a raster - so it would be possible to
- write a whole R raster to GRASS raster
- write single columns or column ranges to a GRASS raster
- write single cells to a GRASS raster
- write user specified blocks to the GRASS raster
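To make the two lists above a bit more concrete, here is a minimal C++ sketch of how cell and block access could be layered on top of a read-one-line / write-one-line primitive. Everything in it is an assumption for illustration: `RowRaster`, `Block`, `read_block`, `write_block` and `read_cell` are hypothetical names, and the raster is a plain in-memory buffer rather than a GRASS map. As far as I can tell from r.example, the GRASS raster library hands out one row at a time, so the sketch works in rows; the same reasoning would apply to columns.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for an open GRASS raster map. In a real binding,
// read_row() and write_row() would call into the GRASS raster library;
// here they only touch an in-memory buffer.
class RowRaster {
public:
    RowRaster(std::size_t rows, std::size_t cols, double fill = 0.0)
        : rows_(rows), cols_(cols), cells_(rows * cols, fill) {}

    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }

    // Primitive 1: read one complete row.
    std::vector<double> read_row(std::size_t r) const {
        check_row(r);
        auto first = cells_.begin() + r * cols_;
        return std::vector<double>(first, first + cols_);
    }

    // Primitive 2: overwrite one complete row.
    void write_row(std::size_t r, const std::vector<double>& values) {
        check_row(r);
        if (values.size() != cols_) throw std::invalid_argument("row length mismatch");
        std::copy(values.begin(), values.end(), cells_.begin() + r * cols_);
    }

private:
    void check_row(std::size_t r) const {
        if (r >= rows_) throw std::out_of_range("row index outside raster");
    }
    std::size_t rows_, cols_;
    std::vector<double> cells_;  // row-major storage
};

// A rectangular window of cells, stored row-major.
struct Block {
    std::size_t rows;
    std::size_t cols;
    std::vector<double> values;
    double at(std::size_t r, std::size_t c) const { return values.at(r * cols + c); }
};

// Reject windows that do not fit inside the raster.
void check_window(const RowRaster& m, std::size_t row0, std::size_t col0,
                  std::size_t nrows, std::size_t ncols) {
    if (row0 + nrows > m.rows() || col0 + ncols > m.cols())
        throw std::out_of_range("block outside raster");
}

// Read a user-specified block using only the row primitive.
Block read_block(const RowRaster& src, std::size_t row0, std::size_t col0,
                 std::size_t nrows, std::size_t ncols) {
    check_window(src, row0, col0, nrows, ncols);
    Block block{nrows, ncols, {}};
    block.values.reserve(nrows * ncols);
    for (std::size_t r = 0; r < nrows; ++r) {
        const std::vector<double> row = src.read_row(row0 + r);
        block.values.insert(block.values.end(),
                            row.begin() + col0, row.begin() + col0 + ncols);
    }
    assert(block.values.size() == nrows * ncols);
    return block;
}

// Write a block by reading each affected row, patching it, and writing it back.
void write_block(RowRaster& dst, std::size_t row0, std::size_t col0, const Block& block) {
    check_window(dst, row0, col0, block.rows, block.cols);
    if (block.values.size() != block.rows * block.cols)
        throw std::invalid_argument("block size does not match its dimensions");
    for (std::size_t r = 0; r < block.rows; ++r) {
        std::vector<double> row = dst.read_row(row0 + r);
        std::copy(block.values.begin() + r * block.cols,
                  block.values.begin() + (r + 1) * block.cols,
                  row.begin() + col0);
        dst.write_row(row0 + r, row);
    }
}

// A single cell is just a 1 x 1 block.
double read_cell(const RowRaster& src, std::size_t r, std::size_t c) {
    return read_block(src, r, c, 1, 1).values.front();
}
```

The point of the design is that the GRASS side would only need to expose the two row primitives; whole rasters, ranges, blocks and single cells all fall out of them. The read-patch-write approach in `write_block` is a simplification, and a real GRASS map may well need a copy step instead of an in-place update.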
An example GRASS module which reads a raster and writes it into a new raster is at http://svn.osgeo.org/grass/grass/trunk/doc/raster/r.example/main.c.
And now comes the intriguing part: there is the raster package, which uses a similar mechanism to avoid having to load a whole raster into R memory. If raster were linked to GRASS through these functions, there would be a brilliant backend for working with rasters in R.
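The idea of never holding the whole raster in memory can be sketched in a few lines of C++. This is only an illustration of chunked processing in general, not of how the raster package is implemented: `RowReader`, `ChunkStats` and `summarise_in_chunks` are hypothetical names, and the row reader is any callable that returns one row of cells, which in a real binding would be backed by GRASS.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <stdexcept>
#include <vector>

// Hypothetical row source: given a row index, return that row's cells.
using RowReader = std::function<std::vector<double>(std::size_t)>;

struct ChunkStats {
    double sum;                 // sum over all cells
    std::size_t count;          // number of cells visited
    std::size_t peak_cells;     // largest number of cells held at once
};

// Walk the raster a few rows at a time, so memory use is bounded by
// chunk_rows * columns instead of the full raster size.
ChunkStats summarise_in_chunks(const RowReader& read_row,
                               std::size_t nrows, std::size_t chunk_rows) {
    if (chunk_rows == 0) throw std::invalid_argument("chunk_rows must be positive");
    ChunkStats stats{0.0, 0, 0};
    for (std::size_t start = 0; start < nrows; start += chunk_rows) {
        const std::size_t end = std::min(start + chunk_rows, nrows);
        assert(end > start);

        // Only this chunk lives in memory; it is discarded at the end of the loop body.
        std::vector<double> chunk;
        for (std::size_t r = start; r < end; ++r) {
            const std::vector<double> row = read_row(r);
            chunk.insert(chunk.end(), row.begin(), row.end());
        }

        for (double v : chunk) {
            stats.sum += v;
            ++stats.count;
        }
        stats.peak_cells = std::max(stats.peak_cells, chunk.size());
    }
    return stats;
}
```

With a 10-row, 4-column raster and `chunk_rows = 3`, the function visits all 40 cells but never holds more than 12 of them at once, which is the property that would make GRASS a useful backend for large rasters.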
Now these are ideas, but I am planning on following them up. Some things which need to be considered and thought through:
- To compile the modules for GRASS, it might be easiest to write the C code in GRASS so that it gets compiled with GRASS, possibly even becoming a part of the binary distribution of GRASS. In this way, one would simply have to load the library in GRASS and call the function to read the raster, and it could be used by other programs as well. (In my view, GRASS is missing a simple API for this kind of thing, but this is a different story).
- One could put the C code into an R package and compile it from there, but this might be asking for trouble, as it would be tightly coupled to GRASS and dependent on changes in its internals. So the option of writing a C library as part of GRASS which provides functions to read and write whole rasters and blocks of rasters might be the better solution.
- The wrapper around the C library would be relatively straightforward using Rcpp.
- The R part should be GRASS version agnostic, i.e. the same code independent of the GRASS version. By specifying the path to the GRASS installation, a specific library would be loaded and used. It would even be possible to switch between different GRASS versions.
- It might make sense to split this into two packages: a frontend which defines the functions used by the R user, and a backend which supplies the functionality to link these functions to GRASS. This would be similar to the DBI package, which defines the database access functions, with separate backends linking these to different databases. It would enable a common interface to access spatial data in a GRASS database, PostgreSQL database, SpatiaLite database, directory containing the raster layers in a specific format, …
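The frontend/backend split from the last point might look roughly like this in C++. It is a sketch of the general pattern only, with every name (`RasterBackend`, `InMemoryBackend`, `BackendRegistry`, `read_raster`) made up for illustration; the in-memory backend stands in for what would be a GRASS, PostgreSQL or directory-based backend.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Frontend contract: the only things user-facing code may rely on.
class RasterBackend {
public:
    virtual ~RasterBackend() = default;
    virtual std::string name() const = 0;
    virtual std::size_t rows(const std::string& layer) const = 0;
    virtual std::vector<double> read_row(const std::string& layer, std::size_t row) const = 0;
};

// Hypothetical backend that keeps layers in memory. A GRASS backend would
// implement the same three functions by calling into GRASS instead.
class InMemoryBackend : public RasterBackend {
public:
    void add_layer(const std::string& layer, std::vector<std::vector<double>> rows) {
        layers_[layer] = std::move(rows);
    }
    std::string name() const override { return "memory"; }
    std::size_t rows(const std::string& layer) const override {
        return layers_.at(layer).size();
    }
    std::vector<double> read_row(const std::string& layer, std::size_t row) const override {
        return layers_.at(layer).at(row);
    }
private:
    std::map<std::string, std::vector<std::vector<double>>> layers_;
};

// Backends register a factory under a key; users pick one by name.
class BackendRegistry {
public:
    using Factory = std::function<std::unique_ptr<RasterBackend>()>;

    void register_backend(const std::string& key, Factory factory) {
        if (!factory) throw std::invalid_argument("empty backend factory");
        factories_[key] = std::move(factory);
    }
    std::unique_ptr<RasterBackend> connect(const std::string& key) const {
        auto it = factories_.find(key);
        if (it == factories_.end())
            throw std::out_of_range("no backend registered as '" + key + "'");
        return it->second();
    }
private:
    std::map<std::string, Factory> factories_;
};

// Frontend function: written once against the interface, works with any backend.
std::vector<std::vector<double>> read_raster(const RasterBackend& backend,
                                             const std::string& layer) {
    const std::size_t n = backend.rows(layer);
    std::vector<std::vector<double>> out;
    out.reserve(n);
    for (std::size_t r = 0; r < n; ++r) out.push_back(backend.read_row(layer, r));
    assert(out.size() == n);
    return out;
}
```

The frontend never mentions a concrete storage system, so adding a new one would mean writing one more class and registering it, without touching user code.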
OK - so what are the next steps:
- Setting up a GitHub repo where interested parties can contribute and comment: https://github.com/rkrug/grassRLink
- Getting input from the GRASS community on what they think about this
- Getting a structure for the package(s) set up, so that a framework is available in which one can do the coding to satisfy the requirements
I don't think this is something which can (or should!) be done in a rush, as this framework could possibly form a crucial backbone for spatial processing.
And: if this is there, one can do the same for vectors, spatio-temporal data, …
My feeling is that the time is ripe to give R an interface to the spatial GRASS database which can easily be extended to other spatial storage systems, in the same way that DBI does this for databases.
So: please give feedback, let me know what you think and whether you have suggestions, and tell me if you think this is not going to work.
Cheers and enjoy life.