EEL4930/5934 Reconfigurable Computing

Reminder
  • Lab 0: Xilinx ISE tutorial

Research
  • Send me an email if interested
  • Looking for those interested in RC with skills in compilers/languages/synthesis, networking, and/or memory structures
  • Undergraduates are also encouraged to participate

What is Reconfigurable Computing?
  • Reconfigurable computing (RC) is the study of architectures that can adapt (after fabrication) to a specific application or application domain
  • Involves architecture, design strategies, tool flows, CAD, languages, and algorithms

What is Reconfigurable Computing? (continued)
  • Alternatively, RC is a way of implementing circuits without fabricating a device
  • Essentially allows circuits to be implemented as software
      • Circuits are no longer the same thing as hardware
  • RC devices are programmable by downloading bits, just like software

[Figure: a microprocessor binary (001010010...) is loaded as bits into the processor's program memory; an FPGA binary (bitfile, 001010010...) is loaded as bits into the FPGA's CLBs, SMs, etc.]

Why is RC important?
  • Tremendous performance advantages
      • Implements applications as custom circuits
      • In some cases, > 100x faster than a microprocessor
      • Alternatively, similar performance to a large cluster, but much smaller
  • Example:

        for (i = 0; i < 16; i++)
            y += c[i] * x[i];

      • Software executes sequentially
      • RC executes all multiplications in parallel
      • Additions become a tree of adders
      • Even with a slower clock, RC is much faster
  • Performance difference is even greater for larger input sizes
      • SW time increases linearly
      • RC time is basically O(log2(n)), if enough area is available

Implementation Possibilities
  • Microprocessor
  • RC (FPGA, CPLD, etc.)
  • ASIC
[Figure: the three options placed along a Performance axis]
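The 16-element example can be modeled in plain C to make the step counts concrete. The sketch below is only a software model, not a hardware description. The names `dot_sequential`, `dot_adder_tree`, `steps`, and `levels` are illustrative and do not come from the slides. Each "level" stands for one stage of logic, assuming enough area is available for all multipliers and adders to exist side by side.

```c
#include <stddef.h>

#define N 16  /* matches the 16-element loop in the example; must be a power of two here */

/* Software model: one multiply-accumulate per iteration, so N sequential steps. */
int dot_sequential(const int c[N], const int x[N], int *steps)
{
    int y = 0;
    *steps = 0;
    for (size_t i = 0; i < N; i++) {
        y += c[i] * x[i];
        (*steps)++;
    }
    return y;
}

/* Circuit model: all N products are formed in a single level, then an adder
 * tree halves the number of partial sums at every following level. */
int dot_adder_tree(const int c[N], const int x[N], int *levels)
{
    int partial[N];

    for (size_t i = 0; i < N; i++)
        partial[i] = c[i] * x[i];   /* in hardware: N multipliers working at once */
    *levels = 1;                    /* the multiplier level */

    for (size_t width = N; width > 1; width /= 2) {
        for (size_t i = 0; i < width / 2; i++)
            partial[i] = partial[2 * i] + partial[2 * i + 1];
        (*levels)++;                /* one level of the adder tree */
    }
    return partial[0];
}
```

Both functions return the same sum. For N = 16 the sequential model counts 16 steps, while the tree model counts 1 multiplier level plus log2(16) = 4 adder levels, which is where the O(log2(n)) behavior comes from.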

Why not use an ASIC for everything?

Moore's Law
  • Moore's Law is the empirical observation made in 1965 that the number of transistors on an integrated circuit doubles every 24 months [Wikipedia]
      • Some sources say 18 months
  • 1993: 1 million transistors
  • 2007: > 1 BILLION transistors!
  • Becoming extremely difficult to design chips of this size
  • ASICs are expensive!

Moore's Law (continued)
  • Solution: make billions of transistors into a reconfigurable fabric
      • Fabricate one big chip and use it for many things
  • Area overhead: a circuit in an FPGA can require 20x more transistors
      • But that's still equivalent to a > 50 million transistor ASIC
      • Pentium IV: ~42 million transistors
  • Modern FPGAs reportedly support millions of logic gates!
  • 2007: > 1 BILLION transistors! Solution: make this reconfigurable

When should RC be used?
  • When it provides the cheapest solution
      • Generally depends on the volume of devices

      • RC is typically more cost-effective for low-volume devices
      • RC: low NRE (non-recurring engineering) cost, high unit cost
      • ASIC: very high NRE cost, low unit cost

When should RC be used? (continued)
  • When the circuit may have to be modified
      • Can't change an ASIC: it is hardware
      • Can change the circuit implemented in an FPGA
  • Uses
      • When standards change
          • Codec changes after devices are fabricated
      • Allows addition of new features to existing devices
      • Partial reconfiguration allows a virtual fabric size, analogous to virtual memory
  • Without RC
      • Anything that may have to be reconfigured is implemented in software
      • Performance loss

What about microprocessors?
  • Similar cost issues
      • uPs: low NRE cost (coding is cheap)

      • Unit cost varies from several dollars to several thousand
  • Wouldn't the cheapest microprocessor always be the cheapest solution?
      • Yes, but ...

What about microprocessors? (continued)
  • Often, microprocessors cannot meet performance constraints
      • e.g., a video decoder must achieve a minimum frame rate
  • Common reason for using a custom circuit implementation

Design Space Exploration
  1. Determine architectures that meet performance requirements
      • Not trivial; requires performance analysis/estimation (an important problem)
      • Will study later in the semester
      • And other constraints: power, size, etc.
  2. Estimate the volume of the device
  3. Determine the cheapest solution
  • The best architecture for an application is typically the cheapest one that meets all design constraints.

RC Markets
  • Embedded systems
      • RC achieves performance close to ASIC, sometimes at much lower cost
      • Many embedded systems still use ASICs due to high volume
      • Reconfigurability!
          • If standards change, the architecture is not fixed
          • Can add new features after production

RC Markets (continued)
  • High-performance computing (HPC)
      • Cray XD-1: 12 AMD Opterons, FPGAs
      • SGI Altix: 64 Itaniums, FPGAs
      • IBM Chameleon: Cell processor, FPGAs
      • Low volume, so an ASIC is rarely feasible

RC Markets (continued)
  • General-purpose computing???
      • Ideal situation: desktop machine/OS uses RC to speed up all applications
      • Problems
          • RC can be very fast, but not for all applications

          • Generally requires parallel algorithms
          • Coding constructs used in many applications are not appropriate for hardware
      • Subject of a tremendous amount of past and likely future research
  • How to use extra transistors?
      • More cache
      • More microprocessors
      • FPGA
      • Something else?

Limitations of RC
  • Not all applications can be improved
    [Figure: two bar charts with a Speedup axis from 0 to 15. Embedded applications: large speedups. Desktop applications: no speedup.]
  • Tools need serious improvement!
  • Design strategies are often ad hoc
  • Floating point?
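Looking back at the design space exploration steps (find the architectures that meet the performance requirement, estimate the volume, pick the cheapest), the selection can be sketched in a few lines of C. This is a minimal sketch under stated assumptions: the `candidate` struct, its field names, and the helper functions `total_cost` and `cheapest_feasible` are hypothetical. The cost model is simply NRE plus unit cost times volume, and a single frame-rate figure stands in for all performance constraints. A real exploration would also weigh power, size, and other constraints.

```c
#include <stddef.h>

/* One candidate implementation. All fields are illustrative placeholders. */
struct candidate {
    const char *name;
    double nre_cost;      /* one-time non-recurring engineering cost */
    double unit_cost;     /* cost of each manufactured device */
    double frames_per_s;  /* estimated performance of this implementation */
};

/* Total cost of producing `volume` devices: NRE plus per-unit cost. */
double total_cost(const struct candidate *c, double volume)
{
    return c->nre_cost + c->unit_cost * volume;
}

/* Returns the cheapest candidate that meets the frame-rate constraint,
 * or NULL if no candidate is fast enough. */
const struct candidate *cheapest_feasible(const struct candidate *cands,
                                          size_t n,
                                          double min_fps,
                                          double volume)
{
    const struct candidate *best = NULL;

    for (size_t i = 0; i < n; i++) {
        if (cands[i].frames_per_s < min_fps)
            continue;  /* fails the performance constraint */
        if (best == NULL ||
            total_cost(&cands[i], volume) < total_cost(best, volume))
            best = &cands[i];
    }
    return best;
}
```

Called with an array of microprocessor, FPGA, and ASIC candidates, the function reflects the trade-off from the cost slides: at low volume the low-NRE option tends to win, at high volume the low-unit-cost option does, and a candidate that misses the performance constraint is never chosen regardless of price.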
