Pronto Raster: A C++ library for Map Algebra Alex Hagen-Zanker [email protected] Map algebra Local operations 3 Focal operations Zonal operations 5 1 3 A A 2 + B

A 2 = 13 A 4 A 4 B 1 2 7 https://github.com/ahhz/raster 2 4

3 2 = = B B 2 2 3 B A 1 4 B

Zone A 17 B 13 That is basic! Dont we have tools? Geodata management: GDAL, Proj, GeoTIFF, etc. Image processing: OpenCV, GIL, etc. Linear algebra: Eigen, Blitz, UBLAS, etc. Scripting languages: ArcGIS Python, PC Raster, R - raster, Geotrellis GIS software: Raster Calculator and specific tools in QGIS, SAGA ArcGIS All of these facilitate Map Algebra operations, but none meets all (my) requirements https://github.com/ahhz/raster Expected user and

requirements Researcher / consultant working in geocomputational domain who does not like being constrained by built-in methods Requirement Customizable Key challenge Be able to apply custom local, focal and zonal functions on any number of layers Scalable Work with large raster data sets that exceed available memory Efficient Minimal creation of temporary datasets for intermediary results Flexible Work with wide range of data types with no need for data preprocessing Usable Compatible

Idiomatic interface, minimal use of boiler plate code Be able to use in conjunction with other libraries (e.g. for optimization, simulation control, etc.) https://github.com/ahhz/raster Design: Raster and Raster View concepts Range* Gives access to a sequence of values through iterator access. Unlike a Container does not have to store its own values Range View A Range that can be copied O(1) and is meant to be passed to functions by value Raster A Range of which we know the number of rows and columns. And, must also give access to sub-raster . Raster View A Raster that is also a View. *Official proposal to C++ standard, see also https://ericniebler.github.io/range-v3/ https://github.com/ahhz/raster Design: Raster and Raster

View concepts Concept Range Main requirements Sized size_type size(); View copy-constructor O(1) Raster sub_type sub_raster(start_row, start_col, rows, cols); size_type rows(); size_type cols(); iterator_type begin(); iterator_type end(); https://github.com/ahhz/raster Using the Raster concept (1)

#include namespace pr = pronto::raster; Use GDAL to open a raster dataset as a model of the Raster View concept int main() { auto ras = pr::open("my_file.tif") int sum = 0; for(auto&& v : ras) Use range-based for-loop to iterate over all values in the raster { } sum += v; } https://github.com/ahhz/raster Using the Raster concept (2) #include namespace pr = pronto::raster;

int main() { auto ras = pr::open("my_file.tif") int sum = 0; for(auto&& v : ras.sub_raster(2,3,10,20)) { } } Iterate over subset of the raster. First cell at (2,3) then 10 rows and 20 columns sum += v; https://github.com/ahhz/raster Design: Expression Templates (ET) for local operations Local operations take Raster Views as input and produce an ET as output that also is a Raster View. Cell values of the ET are only calculated once they are iterated over No need to allocate memory

Input and output conform to the same concept: highly composable A generic function, transform, is implemented, other local functions can be implemented in terms of transform https://github.com/ahhz/raster Using local operations(1) #include #include #include namespace pr = pronto::raster; Open two datasets int main() { auto a = pr::open("a.tif"); auto b = pr::open("b.tif"); Create a Raster View that applies std::plus to each cell of both input datasets auto sum = pr::transform(std::plus, a, b); for(auto&& v : sum.sub_raster(2,3,10,20)) { // do something Iterate over a sub-raster of the }

Raster View and calculate the } https://github.com/ahhz/raster required values only Using local operations(2) auto a = pr::open("a.tif"); auto b = pr::open("b.tif"); Create a RasterView of square of differences in two steps auto diff = pr::transform(std::minus, a, b); auto sqr_lambda = [](int x){return x*x;}; auto sqr_diff = pr::transform(sqr_lambda, diff); int sum_of_squared_diff = 0; for(auto&& v : square_diff) { } sum_of_squared_diff += v; https://github.com/ahhz/raster Iterate over values, without allocating extra memory, or creating temporary dataset.

Using local operations(3) #include #include #include namespace pr = pronto::raster; Wrap a and b in raster_algebra_wrapper int main() to allow use of overloaded operators { auto a = pr::raster_algebra_wrap(br::open("a.tif")); auto b = pr::raster_algebra_wrap(br::open("b.tif")); auto square_diff = (a b) * (a b); for(auto&& v : square_diff) { } } // do something Iterate over values. https://github.com/ahhz/raster Design: use optional for flagging and processing nodata values #include

#include #include namespace pr = pronto::raster; int main() { auto in = pr::open("demo.tif"); auto nodata = pr::nodata_to_optional(in, 6); auto un_nodata = pr::optional_to_nodata(nodata, -99); } Providing an idiomatic way for functions to skip over missing values or to leave cells unspecified Value type: int 0 3 6 1 4 0 2 5 1 3

6 2 2 3 4 5 5 6 0 1 Value type: std::optional 0 3 2 5 1 4 0 3 2 5 1 4 0 3

2 5 1 Value type: int 0 3 -99 1 4 0 2 5 1 3 -99 2 https://github.com/ahhz/raster 2 3 4 5 5 -99 0 1

Design: fundamental spatial operators avoid copying data Function pad Effect sub_raster offset as required by the Raster concept h_edge / v_edge iterate over all pairs of horizontally / vertically adjacent cell pairs add const-valued leading and trailing rows and columns with a unmutable value returns value found at a fixed spatial offset in the input RasterView https://github.com/ahhz/raster Design: moving windows as Expression Templates Moving window analysis

is a common type of Focal Operation Each cell obtains the value of a summary statistics calculated for cells in a surrounding window Efficient implementations exploit overlap in windows of adjacent cells (especially for large windows) https://github.com/ahhz/raster Moving window schemes Circular window of edges Circular window of cells O(r x n) r = radius n = raster size Square window of cells Square window of edges O(n)

n = raster size Hagen-Zanker, AH (2016) A computational framework for generalized moving windows and its application to landscape https://github.com/ahhz/raster pattern analysis International Journal of Applied Earth Observation and Geoinformation. Design: Indicator concept Implement statistics in terms of a few functions add(element, weight) subtract(element, weight) add(subtotal, weight) subtract(subtotal, weight) extract() Can be used with moving window methods Can be used for zonal statistics https://github.com/ahhz/raster Using moving window

indicators #include #include Specify circular window with radius of 2 cells. Specify to use the mean statistic. namespace pr = pronto::raster; int main() { auto in = pr::open("my_file.tif"); auto window = pr::circle(2); auto indicator = pr::mean_generator{}; auto out = pr::moving_window_indicator(in, window, indicator); for(auto&& v : out) { // do something } "out" is a Raster View, satisfying all the same requirements as "in". } https://github.com/ahhz/raster The assign function forces

evaluation of lazy Expression Templates #include #include #include namespace pr = pronto::raster; Cheap: Create ET for the sqrt of values in my_file.tif int main() { auto in = pr::open("my_file.tif"); auto sqrt = pr::transform([](int x){return std::sqrt(x);}, in); auto out = pr::create_from_model("sqrt.tif", in); pr::assign(b, out); } Expensive: Create sqrt.tif and write values https://github.com/ahhz/raster Design: Runtime polymorphism through type erasure Runtime polymorphism is sometimes required

operations specified by user (e.g. Raster Calculator) value type of raster dataset unknown at Type erased class Holds Is a Raster View? compile time any_raster any_blind_raster Raster View object with value type T any_raster, where T is a supported type Yes Has: rows(), cols(), sub_raster() Lacks: begin(), end() https://github.com/ahhz/raster Using any_raster pr::any_raster plus_or_minus(bool do_plus) { auto a = pr::open("a.tif");

Runtime decision auto b = pr::open("b.tif"); if(do_plus) { auto x = pr::transform(std::plus, a, b)); return pr::make_any_raster(x); } else { auto y = pr::transform(std::minus, a, b)); return pr::make_any_raster(y); } } x and y are different types; but both are Raster Views with int as value type https://github.com/ahhz/raster Using any_blind_raster pr::any_blind_raster a_in= pr::open_any("a.tif"); pr::any_blind_raster b_in= pr::open_any("b.tif"); auto a = pr::raster_algebra_wrap(a_in);Opened as the appropriate value type auto b = pr::raster_algebra_wrap(b_in); auto c = (b-a) * (b-a);

Raster algebra made to work pr::export_any("out.tif", c.unwrap()); with any_blind_raster Exported to the appropriate value type Unwrap from raster_algebra_wrapper https://github.com/ahhz/raster Future music: Parallel processing The sub_raster() requirement of Raster Views combined with the Expression Template design allows doing calculation for only a part of the raster For local operations and moving window indicators only the assign function needs to be parallelized Split the rasters into sub-raster blocks Assign all blocks asynchronously And rest of library needs to be thread safe https://github.com/ahhz/raster Applications

Scale dependent landscape metrics Cellular Automata land use model Fuzzy set map comparison https://github.com/ahhz/raster Conclusions Requirement Customizable Scalable Efficient Flexible Usable Compatible Key challenge Be able to apply custom local, focal and zonal functions on any number of layers Work with large raster data sets that exceed available memory Minimal creation of temporary datasets for intermediary results

Work with wide range of data types with no need for data preprocessing Idiomatic interface, minimal use of boiler plate code Be able to use in conjunction with other libraries (e.g. for optimization, simulation control, etc.) Solution Transform function Indicator concept Generic moving windows Uses GDAL buffering and LRU Using expression template Using non-copying spatial operations Potential for parallelization Uses GDAL for data access Range based for-loop std::optional C++ Simple concepts

Thank you! github.com/ahhz/raster [email protected] https://github.com/ahhz/raster