Concurrency Analysis for Parallel Programs with Textually Aligned

Concurrency Analysis for Parallel Programs with Textually Aligned

Concurrency Analysis for Parallel Programs with Textually Aligned Barriers Amir Kamil and Katherine Yelick Titanium Group http://titanium.cs.berkeley.edu U.C. Berkeley October 20, 2005 LCPC 05: Concurrency Analysis 1 Amir Kamil Motivation/Goals Many program analyses and optimizations benefit from knowing which expressions can run concurrently Develop basic concurrency analysis for Titanium programs Refine analysis to ignore infeasible paths Evaluate analysis using two applications Race detection Memory model enforcement

LCPC 05: Concurrency Analysis 2 Amir Kamil Barrier Alignment Many parallel languages make no attempt to ensure that barriers line up Example code that is legal but will deadlock: if (Ti.thisProc() % 2 == 0) Ti.barrier(); // even ID threads else ; // odd ID threads LCPC 05: Concurrency Analysis 3 Amir Kamil Structural Correctness Aiken and Gay introduced structural correctness (POPL98) Ensures that every thread executes the same number of barriers

Example of structurally correct code: if (Ti.thisProc() % 2 == 0) Ti.barrier(); // even ID threads else Ti.barrier(); // odd ID threads LCPC 05: Concurrency Analysis 4 Amir Kamil Textual Barrier Alignment Titanium has textual barriers: all threads must execute the same textual sequence of barriers Stronger guarantee than structural correctness this example is illegal: if (Ti.thisProc() % 2 == 0) Ti.barrier(); // even ID threads else Ti.barrier(); // odd ID threads Single-valued expressions used to enforce textual barriers LCPC 05: Concurrency Analysis

5 Amir Kamil Single-Valued Expressions A single-valued expression has the same value on all threads when evaluated Example: Ti.numProcs() > 1 All threads guaranteed to take the same branch of a conditional guarded by a single-valued expression Only single-valued conditionals may have barriers Example of legal barrier use: if (Ti.numProcs() > 1) Ti.barrier(); // multiple threads else ; // only one thread LCPC 05: Concurrency Analysis 6 Amir Kamil Concurrency Graph

Represents concurrency as a graph Nodes are program expressions If a path exists between two nodes, they can run concurrently Generated from control flow graph x++; x++ b call foo Ti.barrier(); z-- if (b) z--; z++ method foo else z++; c

if (c [single]) w--; else y++; w-- y++ ret foo foo(); LCPC 05: Concurrency Analysis 7 Amir Kamil Barriers Barriers prevent code before and after from running concurrently Nodes for barrier expressions removed from concurrency graph x++ x++; barrier

Ti.barrier(); ... ... LCPC 05: Concurrency Analysis 8 Amir Kamil Non-Single Conditionals Branches of a non-single conditional can run concurrently Different threads can take different branches Edge added in the concurrency graph between branches b if (b) z--; z-- else

z++ z++; ... LCPC 05: Concurrency Analysis ... 9 Amir Kamil Single Conditionals Branches of a single conditional cannot run concurrently All threads take the same branch No edge added in the concurrency graph between branches c if (c [single]) w--; w--

else y++ y++; ... LCPC 05: Concurrency Analysis ... 10 Amir Kamil Method Calls Method call nodes split into a call and return node Edges added from call node to target methods subgraph, and from target method to return node ... call foo ... foo();

method foo ret foo LCPC 05: Concurrency Analysis 11 Amir Kamil Concurrency Algorithm Two accesses can run concurrently if at least one is reachable from the other Concurrent accesses computed by doing N depth first searches x++; x++ b call foo Ti.barrier(); z-- if (b) z--;

z++ method foo else z++; c if (c [single]) w--; else y++; w-- y++ ret foo foo(); LCPC 05: Concurrency Analysis 12 Amir Kamil Infeasible Paths (I) Handling of method calls allows infeasible

control flow paths Path exists into one call site and out of another call foo method bar call foo method foo ret foo LCPC 05: Concurrency Analysis method baz ret foo 13 Amir Kamil Infeasible Paths (II) Solution label call and return edges with matching parentheses Only follow paths that correspond to balanced parentheses

call foo method bar call foo method foo ret foo LCPC 05: Concurrency Analysis method baz ret foo 14 Amir Kamil Bypass Edges (I) Reachability now depends on context Inefficient to revisit method in every context Solution add edges to bypass method calls

call foo call foo ret foo ret foo method foo call foo call foo ret foo ret foo LCPC 05: Concurrency Analysis 15 method foo Amir Kamil

Bypass Edges (II) Can only bypass method calls that can actually complete (without executing a barrier) Iteratively compute set of methods that can complete CanComplete Do (until a fixed point is reached): CanComplete CanComplete all methods that can complete by only calling methods in CanComplete LCPC 05: Concurrency Analysis 16 Amir Kamil Static Race Detection Two heap accesses compose a data race if they can concurrently access the same location, and at least one is a write Initially, x = 0 T1 set x = x + 1 T2

set x = x - 1 Possible final values of x: -1, 0, 1 Alias and concurrency analysis used to statically compute set of possible data races Analyses are sound, so all real races are detected Goal is to minimize number of false races detected LCPC 05: Concurrency Analysis 17 Amir Kamil Sequential Consistency Definition: A parallel execution must behave as if it were an interleaving of the serial executions by individual threads, with each individual execution sequence preserving the program order [Lamport79]. Initially, flag = data = 0 T1

T2 a [set data = 1] y [read flag] x [set flag = 1] b [read data] Legal execution: a x y b Illegal execution: x y b a Critical cycle Titanium and most other languages do not provide sequential consistency due to the (perceived) cost of enforcing it. LCPC 05: Concurrency Analysis 18 Amir Kamil Enforcing Sequential Consistency Compiler and architecture must not reorder memory accesses that are part of a critical cycle

Fences inserted into program to enforce order Potentially costly can prevent optimizations such as code motion and communication aggregation At runtime, can cost an RTT on a distributed machine Goal is to minimize number of inserted fences LCPC 05: Concurrency Analysis 19 Amir Kamil Benchmarks Benchmark lu-fact gsrb Lines1 Description 420 Dense linear algebra 1090 Computational fluid dynamics

kernel spmv pps gas 1493 3673 8841 Sparse matrix-vector multiply Parallel Poisson equation solver Hyperbolic solver for gas dynamics Line counts do not include the reachable portion of the 1 37,000 line Titanium/Java 1.0 libraries 1 LCPC 05: Concurrency Analysis 20 Amir Kamil

Analysis Levels We tested analyses of varying levels of precision Analysis Description base All expressions assumed concurrent concur Basic concurrency analysis feasible Feasible paths concurrency analysis All levels use alias analysis to distinguish memory locations LCPC 05: Concurrency Analysis 21

Amir Kamil Race Detection Results Number of Data Races Detected Fraction Compared tobase 1.2 1 0.8 0.6 0.4 0.2 0 gas gsrb base LCPC 05: Concurrency Analysis lu-fact Benchmark concur 22

pps spmv feasible Amir Kamil Sequential Consistency Results Number of Static Memory Barriers Fraction Compared tobase 1.2 1 0.8 0.6 0.4 0.2 0 gas gsrb base LCPC 05: Concurrency Analysis

lu-fact Benchmark concur 23 pps spmv feasible Amir Kamil Conclusion Textual barriers and single-valued expressions allow for simple but precise concurrency analysis Concurrency analysis is useful both for detecting races and for enforcing sequential consistency Not sufficient for race detection too many false positives Good enough for sequential consistency to be provided at low cost (SC|05) Ignoring infeasible paths significantly improves analysis results

LCPC 05: Concurrency Analysis 24 Amir Kamil

Recently Viewed Presentations

  • Climate Change - Union College

    Climate Change - Union College

    Global change and inequality. Prof. Jeff Corbin. Dept of Biology. Union College. Global change. Humans have altered many aspects of Earth's biology, geology, chemistry on a global scale. Climate change. Land use change. Species extinctions.
  • First Things First - West Virginia Department of Education

    First Things First - West Virginia Department of Education

    CHANGE Change is: A PROCESS, not an event Made by INDIVIDUALS first, then by the organization A highly PERSONAL experience DEVELOPMENTAL growth in both feelings and skills Hord & Loucks Taking Charge of Change Nurtured Optimism Every day, we have...
  • mod_perl における C10K Problem - Perl Mongers

    mod_perl における C10K Problem - Perl Mongers

    → これらの問題を一挙に解決してくれるのが Esehttpd と Pound です エンジニアは、時として相反する2つの命題を 同時に解決しなければならない事がある mod_perl アプリケーション users File Server RDBMS AP server 1 The Internet or intranet 従来の構成では AP Server 1上で、mod ...
  • Languages and Finite Automata - University Portal

    Languages and Finite Automata - University Portal

    * The + Operation : the set of all possible strings from alphabet except , also known as Kleene plus operator. Note : are infinite Courtesy Costas Busch - RPI * Languages A language is a set of strings OR...
  • 2017 Legal & Risk Management Briefing for Church

    2017 Legal & Risk Management Briefing for Church

    2 other University officials sentenced to prison terms; one told the court during sentencing: "It sickens me to think I might have played a part in children being hurt. I am sorry that I did not do more, and I...
  • Ångest "det är trångt" - samordning.org

    Ångest "det är trångt" - samordning.org

    ACT: Ett radikalt synsätt? Det mesta av hur vi mår är NORMALT. Problemet är att vi inte tycker om att må dåligt. Allt mående som aktiverar sympaticusreaktionen ger impuls till flykt/undvikande eller kamp - man vill komma bort ifrån obehaget.
  • Course Selection for 2011-12 - Peel District School Board

    Course Selection for 2011-12 - Peel District School Board

    If the staff have changed the student's email at the session, they login to myblueprint/peel with student#@pdsb.org ([email protected]) If the kids are complete new users they find their school in the NEW USER drop down menu. The kids who have...
  • Figure 4. (a) Distribution of unsigned errors (degrees

    Figure 4. (a) Distribution of unsigned errors (degrees

    Figure 4. (a) Distribution of unsigned errors (degrees from gravitational vertical) recorded during computerized rod and frame (CRAF) test for subjective visual vertical for patients with Parkinson disease (PD) and controls. Errors in circled area represent individuals aligning rod with...