1 Pragmatic Functional and Parallel Programming in Dyalog APL Morten Kromberg, CXO Dyalog Ltd Chuo University, Tokyo July 6th, 2016 Functional Parallel Programming in Dyalog APL 2 Agenda 50-Year-Old Constructs for Parallel Programing: a Brief Introduction to APL Some recent additions of parallel and asynchronous extensions to the Dyalog APL dialect of APL A very short demo - if I don't talk too much Functional Parallel Programming in Dyalog APL 3

A Programming Language (for Math) ab Problems: Mat1 +. Mat2 *x f g x - Wide variety of syntactical forms - Strange and inconsistent precedence rules xy when you deal with matrices - Things get worse See http://www.jsoftware.com/papers/EvalOrder.htm ba (3x)*2 +/46 a*n /46

Functional Parallel Programming in Dyalog APL 4 Comments The lamp symbol () indicates the beginning of a comment: 23 two times three 6 Functional Parallel Programming in Dyalog APL 5 Scalar Functions 1 2 3 + 4 5 6 addition 10 20 30 > 25 greater than 1 2 3 3 2 1

maximum 5 7 9 0 0 1 3 2 3 ? 6 6 6 6 4 3 5 4 "roll" Below: 10 times a 2x3 matrix containing the integers from 1 to 6 10 2 3 6 10 20 30 40 50 60 Functional Parallel Programming in Dyalog APL 6 Order of Execution No "precedence", only one rule: as f g x in mathematics f g x f(g(x)) Each function takes as its right argument the result of the entire

expression to its right. Good APL can be read from left to right by an experienced programmer, but as a beginner you may need to break it down. 10 2 3 6 Let's experiment at https://tryapl.org: https://tryapl.org/?a=10%20%D7%202%203%20%u2374%20%u2373%206&run Functional Parallel Programming in Dyalog APL 7 Functional Parallel Programming in Dyalog APL 8 Reduction (f/ or f or f/[n]) mat 3 4 12 /mat mat 1 2 3

4 24 5 6 7 8 1680 9 10 11 12 11880

45 120 231 384 Functional Parallel Programming in Dyalog APL 9 Scan (f\ or f or f\[n]) mat 3 4 12 +\mat +mat 1 2 3 4

1 3 6 10 5 6 7 8 5 11 18 26

9 10 11 12 9 19 30 42 1 2 3 4 6

8 10 12 15 18 21 12 Functional Parallel Programming in Dyalog APL 10 Outer Product (.f) f for all combinations of items from left & right 1 2

10 vec1 .x vec2 Functional Parallel Programming in Dyalog APL 100 1000 11 Inner Product (f.g) f/ row g col for all combinations of rows left, cols right. m1 m2 1 2 0 1

1 1 0 3 0 2 1 0 m1 +. m2 1 5 2 1

+/ 1 2 0 1 2 0 (+. is the regular dot product) Functional Parallel Programming in Dyalog APL 12 Inner Product (f.g) m1 m2 1 2 0 1 1 1 0

3 0 2 1 0 m1 +.= m2 +/ 1 2 0 = 1 2 0 0 3 2 0 (+.= is count of matches) Functional Parallel Programming in Dyalog APL 13

Selected "Mixed" Functions 1/4 Rotate: and [n] mat 3 4 12 0 1 0 1mat 2 0 1mat 1 2 3 4 3 4 1 2 5

6 7 8 5 6 7 8 9 10 11 12 12

9 10 11 1 6 3 12 5 10 7 4 9 2

11 8 Functional Parallel Programming in Dyalog APL 14 Selected Mixed Functions 2/4 Take and Drop 2 2mat mat 3 4 12 1 2 3 4 5 6

7 8 5 6 9 10 11 12 9 10 5 6

7 8 9 10 11 12 1mat Functional Parallel Programming in Dyalog APL 15 Selected Mixed Functions 3/4 Transpose mat 3 4 12 1 1mat mat

1 2 3 4 1 5 9 5 6 7 8 2

6 10 9 10 11 12 3 7 11 4 8 12 1

6 11 Functional Parallel Programming in Dyalog APL 16 Selected Mixed Functions 4/4 IndexOf 3 1 4 1 5 9 1 2 3 2 7 1 Grade (Up) 3 1 4 1 5 9 2 4 1 3 5 6 {[]}3 1 4 1 5 9 index by grade 1 1 3 4 5 9 Membership 'HELLO WORLD''AEIOU' 0 1 0 0 1 0 0 1 0 0 0 Functional Parallel Programming in Dyalog APL

17 Look Ma, No Loops! The fact that map is implicit, and indexing can be done using arrays, encourages switch free logic. Your data structure acts as a control structure: Example data2 7 15 60 data 5 5 7 15 60 data + data 3 7 15 2 8 16 60 (x mask) + y ~mask ages'child' 'young' '20s' 'old' ages[14 data10] child child young old Comments if data[i]>5 then data[i] else 5 Increment where data[i] is in [3,7,15]. If mask[i] then x else y bucketing

NB: This stuff is *really* easy for a compiler to parallelise Functional Parallel Programming in Dyalog APL 18 Summary: Basic [Parallel] Forms The following forms have existed since the beginning: Scalar functions: +-| * <=> ~ ! ?(roll) Reduction: / and Scan: \ Outer Product: .f Inner Product (rows-left vs cols-right): f.g and Mixed: , ?(deal)

Functional Parallel Programming in Dyalog APL 19 User-defined Functions Lambdas, aka dfns provide a lexically scoped, functional form of functions in APL. Alternative to procedural style of coding from 1966 APL. plusdouble{+2} left arg + two times right fibonacci{ Tail-recursive Fibonacci. 0 1 Default left argument =0: If =0, return 1st item of (1,+/) -1 Tail recursion } Functional Parallel Programming in Dyalog APL 20 Summary: Syntax of APL Syntactical Form

Example array 1 3.1415 1.2E18 function argument 6 1 2 3 4 5 6 left-arg function right-arg 1 2 3 1 10 100 1 plusdouble 2 3 1 20 300 5 7 operand

operator / 1 2 3 4 5 6 2 +/ 1 2 3 4 5 6 plusdouble/1 2 3 720 3 5 7 9 11 17 left-opnd operator right-opnd 1 0 2 +. 1 2 3 7 array[indices] 'ABCDEF'[2 5 5 6] BEEF Functional Parallel Programming in Dyalog APL Result

21 1982: Nested Arrays APL2: Any item of an array can be another array Scalar functions pervade: (1 2 3) 10 (4 5 6) (7 (8 9)) 4 10 18 7080 90 Functional Parallel Programming in Dyalog APL 22 Each Operator () The Each operator maps non-scalar functions. For example, +/ reduces vectors so:

+/ (4 5 6) (7 8 9) (4 5 6)+(7 8 9) 11 13 15 +/ (plus reduce each) will sum each top-level item: +/ (4 5 6) (7 8 9) (4+5+6) (7+8+9) 15 24 Functional Parallel Programming in Dyalog APL 23 Recent Extensions Modern APL has introduced additional high-order functions and combinators: Function trains (f + g) Power () applies a function repeatedly Rank () breaks arguments into sub-arrays of specified ranks Key () uses values found in one argument to partition another Stencil () applies a function to neighborhoods of each item of an array Function application using such operators can be highly optimised and parallelised by compilers (AND interpreters).

Functional Parallel Programming in Dyalog APL 24 Function Trains Atop: (f g) (f g) f ( g ) 10000 (? ) 6 Roll 10000 dice w/out 10000 element temp array Fork: (f g h) (f g h) (f ) g (h ) (f + h) (f ) + (h ) (+ ) (+) ( ) mean is sum count (f g h) (f) g (h) 1 (+ , -) 0.1 (1+0.1),(1-0.1) 1.1 0.9 or 1 0.1 Tines of a fork can be computed in parallel, and Trains allow creation of more efficiently parallelisable units Functional Parallel Programming in Dyalog APL

25 Power Operator Apply a function a fixed number of times, or until the right operand function tells you to stop. 2 (+ 3) 3 Add 2 three times 9 twice2 2 +twice 3 Bind one operand; twice is a monadic opr 7 ({1+})1 until fn fn-1 1.618033989 Functional Parallel Programming in Dyalog APL 26 Rank Operator (1 of 2) mat (x1 0) vec Multiplication with left rank 1 (vectors), right rank 0 (scalars)

1 2 3 4 5 6 10 10 20 30 400 500

600 Functional Parallel Programming in Dyalog APL 100 27 Rank Operator (2 of 2) mat 2 2 4 16 1 2 3 4 5 6 7 8

9 10 11 12 13 14 15 16 ((+ )2) mat avg cols 3 11 4 5 6 12 13

14 (+ ) mat average along 1 st dimension 5 9 6 7 8 10 11 12 ((+ )1) mat avg rows 2.5 6.5 10.5 14.5 Functional Parallel Programming in Dyalog APL

28 Key () Apply operand function to items corresponding to unique keys. keys 'red' 'blue' 'red' 'red' 'blue' values 10 20 30 40 50 keys { } values Group by key red 10 30 40 blue20 50 keys {,+/} values Sum items by key red 80 blue70 Functional Parallel Programming in Dyalog APL

29 Stencil () Stencil applies function operand to each item of an array and selected neighbours: 1 2 0 1 1 2 2 3 1 2 3 is same: identity function 3 ( 3) 1 2 3 Window Size = 3 2 3 0 Neighbors Items A classical blur stencil can be expressed as: ({ .25 .5 .25 +. } 3) 0 10 0 0 10 0 2.5 5 2.5 2.5 5 2.5

Functional Parallel Programming in Dyalog APL 30 John Conways Game of Life Computing the next generation: Any live cell with fewer than two live neighbours dies, as if caused by underpopulation. Any live cell with two or three live neighbours lives on to the next generation. Any live cell with more than three live neighbours dies, as if by over-population. Any dead cell with exactly three live neighbours becomes a live cell, as if by reproduction. life{1 .3 4=+/,1 0 1.1 0 1.} life{1 .3 4={+/,} 3 3} Functional Parallel Programming in Dyalog APL 31 Life is a Convolution life{1 .3 4={+/,} 3 3} Stencil Life edges.2 1 2 Neighbors weighted 2, centre = 1 2 2 2 2 1 2

2 2 2 If there are two or three neighbours for a live cell, the edge-weighted sum of the 3x3 tile around the cell will be (1+2+2) or (1+2+2+2). If there are three neighbours for an empty cell (birth), the sum will be (2+2+2). So: good5 6 718 Index Origin Zero 0 0 0 0 0 1 1 1 0 0 0 0 0 0 0 0 0 0 And now life is a super-optimisable parallel stencil calculation: life{good[{+/,edges} 3 3]} Functional Parallel Programming in Dyalog APL 32 Neural Networks APL is a direct notation for expressing the math at the core of artificial neural networks. For example, the following function applies a sigmoid function to the output of a collection of sigmoid neurons: sn{1+*-+.} contains neuron weights (rows are neurons, one col per input) is an array of input stimuli (row are inputs, one col per case). This applies the computation as a stencil of size 7 7, to an input matrix:

img{1+*-weights+.,} 7 7} Functional Parallel Programming in Dyalog APL 33 Parallel Functional Forms in APL Parallel Forms Sometimes Parallelizable ... f (map often implicit) f (each: explicit map) f.g (inner product) fn (rank: map sub-arrays) f (key: group by) fsz (stencil: map to neighborhoods) f/ (reduction) f\ (scan) fn (apply n times or until fixpoint)

Functional Parallel Programming in Dyalog APL 34 Objects in APL: Dot is Parallel APL Namespaces are dynamic objects: lifeNS '' Create an empty namespace life.meaning42 Create variable within it (altlifeNS '').meaning321 Another one universeslife,altlife Array of namespaces universes How many items? 2 universes.meaning Parallel universes 42 321 Functional Parallel Programming in Dyalog APL 35 Class-based Objects FNEW'Form' (('Caption' 'Hello World') ('Coord' 'Pixel')('Size'(100 400)))

F.(B1 B2 B3)F.NEW{'Button' (('Caption'('Button ',))('Posn'(5+500 )))}1 2 3 F.(B1 B2 B3).(Caption Posn) Button 15 55Button 25 105Button 35 155 F.(B1 B2 B3).(Posn[1]50) Functional Parallel Programming in Dyalog APL 36 More Arrays of Objects XLNEW 'OleClient' ('ClassName' 'Excel.Application') citiesXL.Workbooks.Open 'c:\...\cities.xls' sheetscities.Sheets Sheets collection as an array sheets.Name DDKUK sheets.UsedRange.Value2

Germany 82.4 Denmark 5.4 United Kingdom60.2 Berlin 3.4 Copenhagen2 London 7 Mnchen 1.2 Helsingr 0.03 Birmingham 1 Stuttgart0.5 Basingstoke 0.1

Functional Parallel Programming in Dyalog APL 37 Parallel Execution of APL Use vector instructions (SSE etc) Multithread primitives Done some scalar dyadic functions Need to look at operators and trains Compile to Parallel Hardware Aaron Hsu at Indiana University: co-dfns compiler HyperFit / University of Copenhagen: APL to Futhark Bernecky / Scholz @ Herriot Watt: Single Assignment C Introduce Asynchronous Features into Language Futures and Isolates Functional Parallel Programming in Dyalog APL 38 Compilers Compiler projects are very exciting and will make APL competitive in new areas

Image manipulation Fluid dynamics-style applications But not all applications are data parallel And compilers sometimes put unpleasant constraints on the users Functional Parallel Programming in Dyalog APL 39 Interpreter Needs Help... APL is SIMD at the core But parallelising sequences of SIMD instructions is hard We have parallel cores, but they dont have parallel access to memory The user has the knowledge which is hard to deduce throught static analysis of APL: Array size vs number of parallel threads Side effects (or lack thereof) (S)he can help make decisions on when to fork and collect Conclusion: We need new language features to allow the user to express optionally asynchronous parts of algorithms Functional Parallel Programming in Dyalog APL

40 Futures and Isolates Goal: Allow the APL user to explicitly express parallelism / asynchronicty in a natural way In the interpreter, futures and isolates enable coarsegrained task parallelism Tasks with a duration of at least 100ms In a compiler, futures can be used to express finegrained data parallelism Functional Parallel Programming in Dyalog APL 41 Isolates An Isolate tastes, smells, looks like a Dyalog namespace, except that... Expressions executed in the isolate run in a separate process from the main interpreter thread (in parallel) Functional Parallel Programming in Dyalog APL 42

Isolate Example I3isolate.New3'' I3.X(1 2 3)(4 5)6 I3.({(+)}X) 2 4.5 6 X6 X1 2 3 X4 5 Functional Parallel Programming in Dyalog APL 43 Futures The result of an expression executed in an Isolate is a Future Futures can be passed as arguments to functions without blocking Structural functions can work on arrays containing futures without blocking Primitives which need to reference the value will block Functional Parallel Programming in Dyalog APL

44 The Parallel Operator Monadic operator parallel () derives a function which: - creates an empty isolate - executes the operand inside the isolate - returns a future (and discards the isolate when no longer needed) sums{+/}100 returns 100 futures - IMMEDIATELY sums structural functions do not realize futures 100 partitions(100251)sums Partitioned Enclose 4 partitions 4 groups, each containing 25 futures 25 25 25 25 +/ +/partitions 4 parallel +/es 171700 (We used 1+4+100 parallel threads to compute the end result) Functional Parallel Programming in Dyalog APL 45 Deterministic Parallelism Inserting or removing Parallel operators does not change the meaning of the code. Thus,

parallelism does not interfere with the notation. sums{+/}100 sums{+/} 100 partitions(100251)sums +/+/partitions +/+/ partitions 171700 (as long as your functions have no side effects) ( and there are no errors) Functional Parallel Programming in Dyalog APL 46 The Model Implementation Futures are fully implemented in the Dyalog interpreters from v14.0 (2014) onwards The creation and management of isolates is still modelled using APL code, most importantly: Proposed Name in Primitive Construct Current Model

Comment New Isolate II Parallel I Parallel Each Functional Parallel Programming in Dyalog APL 47 Demo Functional Parallel Programming in Dyalog APL

48 Fun with Isolates ! Example: Start an isolate server on each of two Raspberry Picontrolled robots, then under Windows/Linux/Mac: {isolate.AddServer '192.168.0.',}100 101 bots bot bot clone bot driver API 500 bots.Drive (45 0)(0 45) This means: call the Drive function on each bot in parallel, - With a left argument of 500ms - With right arg of (45 0) for first and (0 45) for 2nd bot (power settings for right and left wheels) Functional Parallel Programming in Dyalog APL 49 Dancing Robots Functional Parallel Programming in Dyalog APL 50 Selected Customers

KCI Corp (US) Budgeting and Planning Carlisle Group (US) Collateral and Securitization for Global Capital Markets CompuGroupMedical (Sweden) Worlds Largest Patient Record system: 40,000 users and 2.5 million patients records at largest hospital in Scandinavia ExxonMobil (US) Optimizes the Cracking of Petroluem Products using APL SimCorp (DK), APL Italiana (I), Fiserv Investment Services (US), Infostroy Ltd (Russia) Leaders in various markets for Asset Management Systems A Finnish game company Functional Parallel Programming in Dyalog APL 51

Infostroy Functional Parallel Programming in Dyalog APL 52 SimCorp Functional Parallel Programming in Dyalog APL 53 APLIT Functional Parallel Programming in Dyalog APL 54 CGM TakeCare Patient Journal System Functional Parallel Programming in Dyalog APL 55

Any Questions Prefix: Roll (scalar) - Integer in range 1 to : ? 6 6 6 6 4 3 4 2 ? Infix: Deal deal items from range 1 to : 5 ? 6 2 5 1 4 6 Selfie: Permutation: ? 6 3 1 4 2 6 5 Functional Parallel Programming in Dyalog APL 56 Resources Supporting documentation and materials online: Interactive APL Session on line: http://tryapl.org (see the "resources" tab) Online Help and Manuals

http://help.dyalog.com http://docs.dyalog.com Introduction to Dyalog APL by Bernard Legrand http://http://www.dyalog.com/mastering-dyalog-apl.htm Google: Try for example https://www.google.com/?q=dyalog+apl+power+operator Functional Parallel Programming in Dyalog APL