Parallel Programming in C with the Message Passing Interface

Parallel Programming in C with the Message Passing Interface

Copyright The McGraw-Hill Companies, Inc. Permission required for reproduction or display. Parallel Programming in C with MPI and OpenMP Michael J. Quinn Copyright The McGraw-Hill Companies, Inc. Permission required for reproduction or display. Chapter 15 The Fast Fourier Transform Copyright The McGraw-Hill Companies, Inc. Permission required for reproduction or display. Outline Fourier analysis Discrete Fourier transform

Fast Fourier transform Parallel implementation Copyright The McGraw-Hill Companies, Inc. Permission required for reproduction or display. Discrete Fourier Transform Many applications in science, engineering Examples Voice recognition Image processing Straightforward implementation: (n2) Fast Fourier transform: (n log n) Copyright The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Fourier Analysis Fourier analysis: Represent continuous functions by potentially infinite series of sine and cosine functions Discrete Fourier transform: Map a sequence over time to another sequence over frequency Signal strength as a function of time Fourier coefficients as a function of frequency Copyright The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

DFT Example (1/4) 16 data points representing signal strength over time Copyright The McGraw-Hill Companies, Inc. Permission required for reproduction or display. DFT Example (2/4) DFT yields amplitudes and frequencies of sine/cosine functions Copyright The McGraw-Hill Companies, Inc. Permission required for reproduction or display. DFT Example (3/4) Plot of four constituent sine/cosine functions and their sum Copyright The McGraw-Hill Companies, Inc. Permission required for reproduction or display. DFT Example (4/4)

Continuous function and original 16 samples. Copyright The McGraw-Hill Companies, Inc. Permission required for reproduction or display. DFT of Speech Sample An gorra cats are furrier... Signal

Frequency and amplitude Figure courtesy Ron Cole and Yeshwant Muthusamy of the Oregon Graduate Institute Copyright The McGraw-Hill Companies, Inc. Permission required for reproduction or display. Computing DFT Matrix-vector product Fn x x is input vector (signal samples) f = ij for 0 i, j < n and is i,j n

n primitive nth root of unity Copyright The McGraw-Hill Companies, Inc. Permission required for reproduction or display. Example 1 Compute DFT of vector (2, 3) 2, the primitive square root of unity, is -1 200 201 x0 1 1 2 5 10

11 x 1 1 3 1 2 1 2 Copyright The McGraw-Hill Companies, Inc. Permission required for reproduction or display. Example 2

Compute DFT of vector (1, 2, 4, 3) The primitive 4th root of unity is i 40 40 0 4 41 0 2 4 4 0 3 4 4 40 42

44 46 40 x0 1 1 1 1 1 10 3 4 x1 1 i 1 i 2 3 i 6

0 4 x2 1 1 1 1 4 9 4 x3 1 i 1 i 3 3 i Copyright The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Fast Fourier Transform An (n log n) algorithm to perform DFT Based on divide-and-conquer strategy Suppose we want to compute f(x) f ( x)a0 a1 x a2 x 2 ... an 1 x n 1 We define two new functions, f[0] and f[1] f [ 0 ] a0 a2 x a4 x 2 ... an 2 x n / 2 1 f [1] a1 a3 x a5 x 2 ... an 1 x n / 2 1 Copyright The McGraw-Hill Companies, Inc. Permission required for reproduction or display. FFT (continued)

Note: f(x) = f [0](x2) + x f [1](x2) Problem of evaluating f (x) at n values of reduces to Evaluating f [0](x) and f [1](x) at n/2 values of Performing f [0](x2) + x f [1](x2) Leads to recursive algorithm with time complexity (n log n) Copyright The McGraw-Hill Companies, Inc. Permission required for reproduction or display. Iterative Implementation Preferable Well-written iterative version performs fewer index computations than recursive

version Iterative version evaluates key common sub-expression only once Easier to derive parallel FFT algorithm when sequential algorithm in iterative form Copyright The McGraw-Hill Companies, Inc. Permission required for reproduction or display. Recursive Iterative (1/3) Recursive implementation of FFT Copyright The McGraw-Hill Companies, Inc. Permission required for reproduction or display. Recursive Iterative (2/3) Determining which computations are performed

for each function invocation Copyright The McGraw-Hill Companies, Inc. Permission required for reproduction or display. Recursive Iterative (3/3) Tracking the flow of data values (input vector at bottom) Copyright The McGraw-Hill Companies, Inc. Permission required for reproduction or display. Parallel Program Design Domain decomposition Associate primitive task with each element of input vector a and corresponding element of output vector y Add channels to handle communications between tasks

Copyright The McGraw-Hill Companies, Inc. Permission required for reproduction or display. FFT Task/Channel Graph a[0 ] y [0 ] a[1 ] y [1 ] a[2 ] y [2 ]

a[3 ] y [3 ] a[4 ] y [4 ] a[5 ] y [5 ] a[6 ] y [6 ]

a[7 ] y [7 ] Copyright The McGraw-Hill Companies, Inc. Permission required for reproduction or display. Agglomeration and Mapping Agglomerate primitive tasks associated with contiguous elements of vector Map one agglomerated task to each process Copyright The McGraw-Hill Companies, Inc. Permission required for reproduction or display. After Agglomeration, Mapping

a [0 ] a[1 ] a[2 ] a[3 ] a[4 ] a[5 ] a[6 ] a[7 ]

a[8 ] a[9 ] a[1 0 ] a[1 1 ] a[1 2 ] a[1 3 ] a[1 4 ] a[1 5 ] y [0 ] y [1 ] y [2 ] y [3 ]

y [4 ] y [5 ] y [6 ] y [7 ] y [8 ] y [9 ] y [1 0 ] y [1 1 ] y [1 2 ] y [1 3 ] y [1 4 ] y [1 5 ] Input Output

Copyright The McGraw-Hill Companies, Inc. Permission required for reproduction or display. Phases of Parallel FFT Algorithm Phase 1: Processes permute as (all-to-all communication) Phase 2: First log n log p iterations of FFT No message passing is required Phase 3: Final log p iterations Processes organized as logical hypercube

In each iteration every process swaps values with partner across a hypercube dimension Copyright The McGraw-Hill Companies, Inc. Permission required for reproduction or display. Complexity Analysis Each process performs equal share of computation: (n log n / p) All-to-all communication: (n log p / p) Sub-vector swaps during last log p iterations: (n log p / p) Copyright The McGraw-Hill Companies, Inc. Permission required for reproduction or display. Isoefficiency Analysis

Sequential time complexity: (n log n) Parallel overhead: (n log p) Isoefficiency relation: n log n C n log p log n C log p n pC C C M ( p )/ p p / p p C 1

Scalability depends C, a function of the ratio between computing speed and communication speed. Copyright The McGraw-Hill Companies, Inc. Permission required for reproduction or display. Summary Discrete Fourier transform used in many scientific and engineering applications Fast Fourier transform important because it

implements DFT in time (n log n) Developed parallel implementation of FFT Why isnt scalability better? (n log n) sequential algorithm Parallel version requires all-to-all data exchange

Recently Viewed Presentations

  • Human-Computer Interaction - SFU.ca

    Human-Computer Interaction - SFU.ca

    Goals of HCI (Shneiderman & Plaisant Chap1) Allow users to carry out tasks. Safely. Effectively. Efficiently. Enjoyably. Jan 6, 2011. IAT 334. Goals of System Engineering. Functionality. Tasks and sub-tasks to be carried out. ... Human-Computer Interaction Last modified by:
  • Literary Terms - Abbotsford Christian School

    Literary Terms - Abbotsford Christian School

    Example: the sun is a lemon drop in the sky. When using a . metaphor, you can also replace one object for another. E. xamples: the lemon drop in the sky burned my skin. the booming thunder followed up the...
  • SEC331 -  Windows Rights Management Services   RMS  RMS

    SEC331 - Windows Rights Management Services RMS RMS

    This is a schematic of how Office 2003 packages rights-protected information, policy, keys and licensing information into one protected package. Office 2003 rights-protected files that enable HTML clients to view will also have an HTML (.rmh) copy of the file...
  • Entrepreneurship Development for the Creative Industries

    Entrepreneurship Development for the Creative Industries

    Gaynor Richards Department for Culture, Media and Sport Objectives of the session Discuss entrepreneurship development for the creative industries Address creative enterprise as an emerging area of scholarship Reflect upon current situation Set up an action plan (have a strategy...
  • Atmospheric constituents forecast and analysis using an ...

    Atmospheric constituents forecast and analysis using an ...

    In this case study, considering dust direct effect alone may not be enough to represent SAL thermal property, and its subsequence impact on hurricane development. Future Works (1) Considering dust indirect effect by employing different chemistry schemes such as MOSAIC,...
  • VANCOUVER STYLE - unito.it

    VANCOUVER STYLE - unito.it

    La citazione bibliografica. La citazione bibliografica serve ad indicare i documenti a cui si fa riferimento in un altro documento, come fonti di informazioni o come termini di confronto.
  • Highway Safety Team Staff Meeting SMART Portal HSIP

    Highway Safety Team Staff Meeting SMART Portal HSIP

    CMF's Clearinghouse. Star Quality Rating based on a scale 1 to 5, where 5 indicates the best rating. We recommend highest star rating and use engineering judgement to select the suitable CMF value. We are working on developing VA specific...
  • Kematian Pada Tumor Sinus Endodermal Pasca Operasi Debulking

    Kematian Pada Tumor Sinus Endodermal Pasca Operasi Debulking

    Upaya untuk mengidentifikasi masalah kegawatan akut utama yang dijumpai pada ginekologi : * Kegawatan Ginekologi Syok Abortus Mola Kista terpuntir / pecah * Syok Berkurangnya aliran darah dalam " sirkulasi mikro " MENGANCAM JIWA MEMERLUKAN PENGOBATAN YANG SEGERA DAN INTENSIF...