Data Intensive Clouds: Tools and Applications
May 2, 2013
Judy Qiu ([email protected]), http://SALSAhpc.indiana.edu
School of Informatics and Computing, Indiana University
SALSA
Important Trends
- Data Deluge: in all fields of science and throughout life (e.g., the web); impacts preservation, access/use, and the programming model.
- Multicore/Parallel Computing: parallel computing is important again, since performance now comes from extra cores rather than extra clock speed.
- Cloud Technologies: a new commercially supported data center model building on compute grids.
- eScience: a spectrum of eScience or eResearch applications (biology, chemistry, physics, social science, and humanities) requiring data analysis and machine learning.

SALSA
Challenges for CS Research
"Science faces a data deluge. How to manage and analyze information? Recommend CSTB foster tools for data capture, data curation, and data analysis." (Jim Gray's talk to the Computer Science and Telecommunication Board (CSTB), Jan 11, 2007)
There are several challenges to realizing this vision for data intensive systems and building generic tools (workflow, databases, algorithms, visualization): cluster-management software, distributed-execution engines, language constructs, parallel compilers, program development tools, ...
SALSA
Data Explosion and Challenges
Data Deluge

Multicore/Parallel Computing, Cloud Technologies, eScience
SALSA
Data We're Looking At
- Biology DNA sequence alignments (Medical School & CGB): several million sequences, at least 300 to 400 base pairs each
- Particle physics, LHC (Caltech): 1 terabyte of data placed in the IU Data Capacitor
- PageRank (ClueWeb09 data from CMU): 1 billion URLs, 1 TB of data
- Image clustering (David Crandall): 7 million data points with dimension in the range 512 to 2048, 1 million clusters; 20 TB intermediate data in shuffling
- Search of Twitter tweets (Filippo Menczer): 1 terabyte of data at 40 million tweets a day; 40 TB decompressed

High volume and high dimension require new efficient computing approaches!
SALSA
Data Explosion and Challenges
SALSA
Cloud Services and MapReduce
Data Deluge, Multicore/Parallel Computing, Cloud Technologies, eScience
SALSA

Clouds as Cost Effective Data Centers
Vendors build giant data centers with 100,000s of computers, roughly 200-1000 to a shipping container with Internet access. "Microsoft will cram between 150 and 220 shipping containers filled with data center gear into a new 500,000 square foot Chicago facility. This move marks the most significant, public use of the shipping container systems popularized by the likes of Sun Microsystems and Rackable Systems to date." (news release from the web)
SALSA
Clouds Hide Complexity
Cyberinfrastructure is "Research as a Service".
- SaaS: Software as a Service (e.g., clustering is a service)
- PaaS: Platform as a Service; IaaS plus core software capabilities on which you build SaaS (e.g., Azure is a PaaS; MapReduce is a platform)
- IaaS (HaaS): Infrastructure as a Service (get computer time with a credit card and a Web interface, as with EC2)
SALSA
What is Cloud Computing?
1. Historical roots in today's web-scale problems
2. Large data centers
3. Different models of computing
4. Highly interactive Web applications
A model of computation and data storage based on pay-as-you-go access to unlimited remote data center capabilities.
Case Study 1: large data centers; Case Study 2: highly interactive Web applications. Examples: YouTube; CERN.

SALSA
Parallel Computing and Software
Data Deluge, Cloud Technologies, Parallel Computing, eScience
SALSA
MapReduce Programming Model & Architecture
Google MapReduce, Apache Hadoop, Dryad/DryadLINQ (DAG based, and now no longer available)
[Architecture diagram: master node, worker nodes, data partitions.]

Record readers read records from the data partitions in the distributed file system.
map(Key, Value): the intermediate space is partitioned using a key partition function; input pairs are sorted into groups on local disks and the master is informed.
reduce(Key, List): the master schedules reducers, which download the intermediate data, and the output goes to the distributed file system.
Map(), Reduce(), and the intermediate key partitioning strategy determine the algorithm.
- Input and output => distributed file system
- Intermediate data => disk -> network -> disk
- Scheduling => dynamic
- Fault tolerance (assumption: master failures are rare)
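As a concrete illustration of the map(Key, Value) / reduce(Key, List) model above, here is a minimal, standard Hadoop word-count job. It is a generic sketch using the stock Hadoop API rather than the SALSA group's code, and the class names and input/output paths are illustrative.

```java
// Minimal Hadoop MapReduce sketch (standard word count) illustrating the
// map(Key, Value) / reduce(Key, List) model described above.
import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {
  public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
    private final static IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();
    @Override
    public void map(Object key, Text value, Context context) throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {          // emit (word, 1) for every token in the record
        word.set(itr.nextToken());
        context.write(word, ONE);
      }
    }
  }

  public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable v : values) sum += v.get();   // all values for one key arrive grouped together
      context.write(key, new IntWritable(sum));
    }
  }

  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);   // local merge before the shuffle
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));    // input read from the distributed file system
    FileOutputFormat.setOutputPath(job, new Path(args[1]));  // output written back to the distributed file system
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```

The combiner performs the local merge before the shuffle, which corresponds to the disk -> network -> disk intermediate stage noted above.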

SALSA
Twister (MapReduce++)
[Architecture diagram: a pub/sub broker network connects the MR driver (user program) with worker nodes; each worker runs map (M) and reduce (R) daemons over data splits (D).]

- Streaming-based communication: intermediate results are transferred directly from the map tasks to the reduce tasks, eliminating local files.
- Cacheable map/reduce tasks: static data remains in memory.
- Combine phase to combine reductions.
- The user program is the composer of MapReduce computations.
- Extends the MapReduce model to iterative computations.
[Diagram labels: MRDaemon, Map Worker (M), Reduce Worker, MR Driver, User Program, Data Split (D), File System, Data Read/Write, Communication.]

Iterative flow: Configure() loads the static data once; the user program then iterates Map(Key, Value) -> Reduce(Key, List) -> Combine(Key, List), and finally calls Close(). Different synchronization and intercommunication mechanisms are used by the parallel runtimes.
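The configure/iterate/close flow above can be pictured as a small driver loop. The sketch below is illustrative plain Java with hypothetical interface names; it mimics the pattern rather than Twister's actual API. Static data is configured and cached once, and only the small dynamic data (for example, cluster centers) moves between iterations.

```java
// Illustrative sketch of the iterative MapReduce flow described above.
// Interface and class names are hypothetical, not Twister's real API.
import java.util.HashMap;
import java.util.List;
import java.util.Map;

interface IterativeJob<S, D> {
  void configure(S staticData);                          // cache static data in memory once
  Map<Integer, List<double[]>> map(D dynamicData);       // long-running map tasks reuse the cached data
  double[] reduce(int key, List<double[]> values);       // per-key reduction
  D combine(Map<Integer, double[]> reduced);             // combine reductions into new dynamic data
  void close();
}

class IterativeDriver<S, D> {
  private final IterativeJob<S, D> job;
  IterativeDriver(IterativeJob<S, D> job) { this.job = job; }

  D run(S staticData, D initial, int maxIterations) {
    job.configure(staticData);                 // static data stays resident across iterations
    D current = initial;
    for (int i = 0; i < maxIterations; i++) {
      Map<Integer, List<double[]>> intermediate = job.map(current);   // dynamic data handed to the maps
      Map<Integer, double[]> reduced = new HashMap<>();
      for (Map.Entry<Integer, List<double[]>> e : intermediate.entrySet()) {
        reduced.put(e.getKey(), job.reduce(e.getKey(), e.getValue()));
      }
      current = job.combine(reduced);          // e.g., new cluster centers for the next iteration
    }
    job.close();
    return current;
  }
}
```

In Twister itself the map tasks are long running and the dynamic data is distributed through the broker network each iteration, as described on the previous slide.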

SALSA
Twister New Release
SALSA
Iterative Computations
[Charts: performance of K-means; parallel overhead of matrix multiplication.]
SALSA
Data Intensive Applications
Data Deluge, Cloud Technologies, Multicore, eScience

SALSA
Applications & Different Interconnection Patterns
- Map Only (embarrassingly parallel): input -> map -> output. Examples: CAP3 gene analysis, document conversion (PDF -> HTML), brute force searches in cryptography, parametric sweeps, PolarGrid MATLAB data analysis.
- Classic MapReduce: input -> map -> reduce -> output. Examples: High Energy Physics (HEP) histograms, distributed search, distributed sorting, information retrieval, calculation of pairwise distances (Pij) for genes.
- Iterative Reductions (the domain of MapReduce and iterative extensions): iterations of input -> map -> reduce. Examples: expectation maximization algorithms, clustering (K-means, deterministic annealing clustering), multidimensional scaling (MDS), linear algebra.
- Loosely Synchronous (MPI): many MPI scientific applications utilizing a wide variety of communication constructs, including local interactions. Examples: solving differential equations, particle dynamics with short range forces.
SALSA
Bioinformatics Pipeline

[Pipeline diagram: gene sequences (N = 1 million) -> select a reference sequence set (M = 100K) and an N-M sequence set (900K). The reference set goes through pairwise alignment & distance calculation (distance matrix, O(N^2) in general) and multidimensional scaling (MDS) to produce reference coordinates (x, y, z). The N-M set is placed by interpolative MDS with pairwise distance calculation, giving x, y, z coordinates for all sequences, which are visualized as a 3D plot.]
SALSA
Pairwise Sequence Comparison
Using 744 CPU cores in Cluster-I

Compares a collection of sequences with each other using Smith-Waterman-Gotoh. Any pairwise computation can be implemented using the same approach (cf. All-Pairs by Christopher Moretti et al.). DryadLINQ's lower efficiency is due to a scheduling error in the first release (now fixed); Twister performs the best.
SALSA
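As a sketch of the "same approach" for any pairwise computation, the snippet below splits the upper triangle of a symmetric all-pairs matrix into blocks that independent map tasks could own. It is illustrative only, not the SALSA implementation, and distance() is a stand-in for Smith-Waterman-Gotoh.

```java
// Illustrative block decomposition for an all-pairs computation, as described above.
// Each (rowBlock, colBlock) pair on or above the diagonal is an independent unit of work.
import java.util.List;

public class AllPairsBlocks {
  static double distance(String a, String b) {
    // placeholder metric; a real implementation would run Smith-Waterman-Gotoh here
    return Math.abs(a.length() - b.length());
  }

  /** Compute one rectangular block of the (symmetric) distance matrix. */
  static double[][] computeBlock(List<String> seqs, int rowStart, int colStart, int blockSize) {
    int rows = Math.min(blockSize, seqs.size() - rowStart);
    int cols = Math.min(blockSize, seqs.size() - colStart);
    double[][] block = new double[rows][cols];
    for (int i = 0; i < rows; i++)
      for (int j = 0; j < cols; j++)
        block[i][j] = distance(seqs.get(rowStart + i), seqs.get(colStart + j));
    return block;
  }

  public static void main(String[] args) {
    List<String> seqs = List.of("ACGT", "ACGTT", "AGGT", "ACT", "TTTT");
    int blockSize = 2;
    // enumerate only blocks on or above the diagonal; the lower triangle is the transpose
    for (int r = 0; r < seqs.size(); r += blockSize)
      for (int c = r; c < seqs.size(); c += blockSize) {
        double[][] block = computeBlock(seqs, r, c, blockSize);
        System.out.printf("block (%d,%d): %d x %d computed%n", r, c, block.length, block[0].length);
      }
  }
}
```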

High Energy Physics Data Analysis
256 CPU cores of Cluster-III (Hadoop and Twister) and Cluster-IV (DryadLINQ).
Workflow: HEP data (binary) -> map tasks run a ROOT [1] interpreted function -> histograms (binary) -> reduce/combine merges histograms -> final merge operation.
- Histogramming of events from large HEP data sets, as in the discovery of the Higgs boson.
- Data analysis requires the ROOT framework (ROOT interpreted scripts); performance mainly depends on the I/O bandwidth.
- The Hadoop implementation uses a shared parallel file system (Lustre) because ROOT scripts cannot access data from HDFS (a block-based file system); the on-demand data movement has significant overhead.
- DryadLINQ and Twister access data from local disks, giving better performance.
[1] ROOT Analysis Framework, http://root.cern.ch/drupal/

SALSA
PageRank
[Diagram: each iteration, map tasks (M) combine a partial adjacency matrix with the current page ranks (compressed) to produce partial updates; reduce (R) and combine (C) steps merge them into partially merged updates.]

Well-known PageRank algorithm [1], using the ClueWeb09 data set [2] (1 TB in size) from CMU. Hadoop loads the web graph in every iteration, while Twister keeps the graph in memory. A Pregel-style approach seems more natural for graph-based problems.
[1] PageRank Algorithm, http://en.wikipedia.org/wiki/PageRank
[2] ClueWeb09 Data Set, http://boston.lti.cs.cmu.edu/Data/clueweb09/
SALSA
Iterative MapReduce Frameworks
- Twister [1]: Map -> Reduce -> Combine -> Broadcast; long-running map tasks (data in memory); centralized driver based, statically scheduled.
- Daytona [3]: iterative MapReduce on Azure using cloud services; architecture similar to Twister.
- HaLoop [4]: on-disk caching; map/reduce input caching; reduce output caching.
- Spark [5]: iterative MapReduce using Resilient Distributed Datasets to ensure fault tolerance.
- Mahout [6]: Apache open source data mining library with iterative MapReduce algorithms based on Hadoop.
- DistBelief [7]: Google's framework for large-scale distributed training of deep networks (Dean et al., NIPS 2012).
SALSA
Parallel Computing and Algorithms
Data Deluge, Cloud Technologies, Parallel Computing, eScience
SALSA

Parallel Data Analysis Algorithms on Multicore
Developing a suite of parallel data-analysis capabilities:
- Clustering using image data
- Parallel inverted indexing for HBase
- Matrix algebra as needed: matrix multiplication, equation solving, eigenvector/eigenvalue calculation
SALSA
Intel's Application Stack
SALSA
NIPS 2012: Neural Information Processing Systems, December 2012 (Jeffrey Dean, Andrew Ng)
What are the Challenges of the Big Data Problem?
Traditional MapReduce and classical parallel runtimes cannot solve iterative algorithms efficiently:

- Hadoop: repeated data access to HDFS; no optimization of data caching and data transfers.
- MPI: no natural support for fault tolerance, and the programming interface is complicated.
We identify that collective communication is missing in current MapReduce frameworks and is essential in many iterative computations. We explore operations such as broadcasting and shuffling and add them to the Twister iterative MapReduce framework. We generalize the MapReduce concept to Map-Collective, noting that large collectives are a distinguishing feature of data intensive and data mining applications.
SALSA
Case Study 1: Data Intensive K-means Clustering
Image classification: 7 million images; 512 features per image; 1 million clusters; 10K map tasks; 64 GB broadcast data (1 GB data transfer per map task node); 20 TB intermediate data in shuffling.
SALSA

Workflow of the Image Clustering Application
SALSA
High Dimensional Image Data
The K-means clustering algorithm is used to cluster images with similar features. In the image clustering application, each image is characterized as a data point (vector) with dimension in the range 512 to 2048, and each value (feature) ranges from 0 to 255. The full problem has around 180 million vectors. Currently we are able to run K-means clustering with up to 1 million clusters and 7 million data points on 125 compute nodes: 10K map tasks; 64 GB broadcast data (1 GB data transfer per map task node); 20 TB intermediate data in shuffling.
SALSA
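One iteration of the K-means computation described above maps naturally onto map and reduce steps: map tasks keep the image vectors cached, receive the broadcast centers, and emit per-center partial sums; reduce tasks average those sums into new centers. The sketch below is an illustrative, single-process outline of that decomposition, not the SALSA/Twister implementation.

```java
// Hedged sketch of one K-means iteration in MapReduce form (illustrative only).
// Map: assign each cached point to its nearest broadcast center and emit partial sums.
// Reduce: merge the partial sums per center to produce the new centers.
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class KMeansIteration {
  /** Map side: points are the cached static data; centers are broadcast each iteration. */
  static Map<Integer, double[]> mapPartition(List<double[]> points, double[][] centers) {
    Map<Integer, double[]> partial = new HashMap<>();   // center index -> [sum..., count]
    int dim = centers[0].length;
    for (double[] p : points) {
      int best = 0;
      double bestDist = Double.MAX_VALUE;
      for (int c = 0; c < centers.length; c++) {
        double s = 0;
        for (int d = 0; d < dim; d++) { double diff = p[d] - centers[c][d]; s += diff * diff; }
        if (s < bestDist) { bestDist = s; best = c; }
      }
      double[] acc = partial.computeIfAbsent(best, k -> new double[dim + 1]);
      for (int d = 0; d < dim; d++) acc[d] += p[d];
      acc[dim] += 1;                                     // point count kept in the last slot
    }
    return partial;                                      // only center-sized data crosses the network
  }

  /** Reduce side: merge the partial sums for one center and compute its new position. */
  static double[] reduceCenter(List<double[]> partials) {
    int dim = partials.get(0).length - 1;
    double[] sum = new double[dim + 1];
    for (double[] p : partials)
      for (int d = 0; d <= dim; d++) sum[d] += p[d];
    double[] center = new double[dim];
    for (int d = 0; d < dim; d++) center[d] = sum[d] / Math.max(1.0, sum[dim]);
    return center;
  }
}
```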

Twister Collective Communications
- Broadcast: the broadcast data can be large. Broadcast methods: chain and MST (minimum spanning tree).
[Diagram: map tasks feed a Map Collective stage and reduce tasks feed a Reduce Collective stage, repeated across workers.]

- Map Collectives: local merge.
- Reduce Collectives: collect but no merge.
- Combine: direct download or gather.
SALSA

[Chart: Twister broadcast comparison, sequential vs. parallel implementations; per-iteration cost before and after, broken into broadcast, map, shuffle & reduce, and combine; time in seconds.]
SALSA
[Chart: Twister broadcast comparison, Ethernet vs. InfiniBand; 1 GB broadcast data on a 16-node cluster at ORNL; time in seconds.]

SALSA
Serialization, Broadcasting and De-serialization
SALSA
Topology-aware Broadcasting Chain
[Diagram: a core switch with 10 Gbps connections to rack switches; each rack switch has 1 Gbps connections to its compute nodes (pg1-pg42, pg43-pg84, ..., pg295-pg312); the broadcast chain follows the rack topology.]
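A chain broadcast like the one above can be pipelined so that each node forwards a chunk downstream while it is still receiving the next one, keeping all links busy for large broadcast data. The sketch below is a hedged illustration using plain Java sockets; the chunk size, port handling, and host configuration are assumptions, and this is not Twister's broadcast code.

```java
// Minimal sketch of a pipelined chain broadcast, assuming each node knows the
// address of its successor in the chain (hypothetical configuration).
import java.io.*;
import java.net.*;

public class ChainBroadcast {
  static final int CHUNK = 1 << 20;   // 1 MB pipeline chunks

  /** Root: push the payload to the first node in the chain. */
  static void send(byte[] data, String nextHost, int port) throws IOException {
    try (Socket s = new Socket(nextHost, port);
         DataOutputStream out = new DataOutputStream(new BufferedOutputStream(s.getOutputStream()))) {
      out.writeInt(data.length);
      for (int off = 0; off < data.length; off += CHUNK) {
        out.write(data, off, Math.min(CHUNK, data.length - off));
      }
    }
  }

  /** Non-root: receive chunks and forward each one immediately so transfers overlap. */
  static byte[] receiveAndForward(int port, String nextHost) throws IOException {
    try (ServerSocket server = new ServerSocket(port);
         Socket in = server.accept();
         DataInputStream din = new DataInputStream(new BufferedInputStream(in.getInputStream()))) {
      int len = din.readInt();
      byte[] data = new byte[len];
      Socket next = (nextHost == null) ? null : new Socket(nextHost, port);  // last node has no successor
      DataOutputStream dout = (next == null) ? null
          : new DataOutputStream(new BufferedOutputStream(next.getOutputStream()));
      if (dout != null) dout.writeInt(len);
      int off = 0;
      while (off < len) {
        int n = din.read(data, off, Math.min(CHUNK, len - off));
        if (n < 0) throw new EOFException("chain broadcast ended early");
        if (dout != null) dout.write(data, off, n);   // forward while still receiving
        off += n;
      }
      if (dout != null) { dout.flush(); next.close(); }
      return data;
    }
  }
}
```

The point of ordering the chain by rack, as in the diagram above, is that the slower inter-rack links are crossed as few times as possible.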

SALSA
[Chart: broadcast of a byte array on PolarGrid with 1 Gbps Ethernet; broadcast time in seconds vs. number of nodes (1 to 150) for Twister Bcast and MPI Bcast at 500 MB, 1 GB, and 2 GB.]
SALSA
Triangle Inequality and K-means

The dominant part of the K-means algorithm is finding the nearest center to each point: O(#points * #clusters * vector dimension). The simple algorithm finds the minimum over centers c of d(x, c) = distance(point x, center c), but most of the d(x, c) calculations are wasted because they are much larger than the minimum value. Elkan (2003) showed how to use the triangle inequality to speed this up, using relations like d(x, c) >= d(x, c_last) - d(c, c_last), where c_last is the position of center c at the last iteration. So one compares the lower bound d(x, c_last) - d(c, c_last) with d(x, c_best), where c_best is the nearest cluster from the last iteration, and skips the exact distance when the bound is already larger. The complexity is reduced by a factor of roughly the vector dimension, so this is important when clustering high dimensional spaces such as social imagery with 512 or more features per image.
SALSA
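A minimal sketch of this pruning is shown below (illustrative, not the SALSA implementation): one lower bound is kept per point-center pair, loosened by how far each center moved since the last iteration, and the exact distance is computed only when the bound cannot rule the center out.

```java
// Illustrative sketch of triangle-inequality pruning for the K-means assignment
// step (Elkan 2003 style), as described above.
public class FastKMeansStep {
  static double dist(double[] a, double[] b) {
    double s = 0;
    for (int i = 0; i < a.length; i++) { double d = a[i] - b[i]; s += d * d; }
    return Math.sqrt(s);
  }

  /**
   * Loosen each stored lower bound by the distance its center moved:
   * d(x, c_new) >= d(x, c_last) - d(c_new, c_last) >= lb - move[c].
   */
  static void relaxBounds(double[][] lb, double[] centerMove) {
    for (double[] row : lb)
      for (int c = 0; c < row.length; c++)
        row[c] = Math.max(0.0, row[c] - centerMove[c]);
  }

  /** One assignment pass; returns how many exact distance computations were skipped. */
  static long assign(double[][] points, double[][] centers, double[][] lb, int[] assignment) {
    long skipped = 0;
    for (int i = 0; i < points.length; i++) {
      int best = assignment[i];
      double dBest = dist(points[i], centers[best]);     // refresh distance to the current best center
      lb[i][best] = dBest;
      for (int c = 0; c < centers.length; c++) {
        if (c == best) continue;
        if (lb[i][c] >= dBest) { skipped++; continue; }  // bound already larger: c cannot be nearer
        double d = dist(points[i], centers[c]);
        lb[i][c] = d;                                    // the exact distance is also a valid bound
        if (d < dBest) { dBest = d; best = c; }
      }
      assignment[i] = best;
    }
    return skipped;
  }
}
```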

Fast K-means Algorithm
[Chart: fraction of distances d(x, c) calculated each iteration for a test data set of 200K points, 124 centers, and vector dimension 74.]
Results on the Fast K-means Algorithm: fraction of point-center distances.
Case Study 1: HBase Architecture
- Tables are split into regions and served by region servers.
- Reliable data storage and efficient access to TBs or PBs of data; successful applications at Facebook and Twitter.
- Good for real-time data operations and for batch analysis using Hadoop MapReduce.
- Problem: no inherent mechanism for field-value searching, especially for full-text values.
SALSA
IndexedHBase System Design

[System diagram: dynamic HBase deployment; data loading (MapReduce) into CW09DataTable; index building (MapReduce) producing CW09FreqTable and CW09PosVecTable; PageRankTable; term-pair frequency counting (MapReduce) producing CW09PairFreqTable; a web search interface; performance evaluation (MapReduce).]

LC-IR synonym mining analysis (MapReduce).
SALSA
Parallel Index Build Time using MapReduce
We have tested the system on the ClueWeb09 data set. Data size: ~50 million web pages, 232 GB compressed, 1.5 TB after decompression. Explored different search strategies.
SALSA
Architecture for the Search Engine
[Architecture diagram: presentation layer with a Web UI (PHP script) on the Apache server of the SALSA portal; business logic layer with MapReduce and Hive/Pig scripts, an HBase Thrift client, and a ranking system (Pig script); data layer with the crawler, ClueWeb09 data, the Apache Lucene inverted indexing system, and HBase tables (1. inverted index table, 2. page rank table) served through a Thrift server on a Hadoop cluster on FutureGrid. SESSS YouTube demo.]
SALSA
Applications of IndexedHBase
Combine a scalable NoSQL data system with fast inverted index lookup: the best of SQL and NoSQL.
Text analysis: search engine.
Truthy project: visualize and analyze the diffusion of information on Twitter.
- About 40 million tweets a day. The daily data size was ~13 GB compressed (~80 GB decompressed) a year ago (May 2012) and is 30 GB compressed now (April 2013). The total compressed size is about 6-7 TB, around 40 TB after decompression.
- Identify new and emerging bursts of activity around memes (Internet concepts) of various flavors.
- Investigate competition models of memes on the social network.
- Detect political smears, astroturfing, misinformation, and other social pollution.
Medical records: identify patients of interest (from indexed Electronic Health Record (EHR) entries), then perform sophisticated HBase searches on the identified data sample.
SALSA
Traditional Way of Query Evaluation: get_tweets_with_meme([memes], time_window)
[Example meme index: #usa -> 1234, 2346 (tweet ids); #love -> 9987, 4432 (tweet ids).]

[Example time index: 2012-05-10 -> 7890, 3345 (tweet ids); 2012-05-11 -> 9987, 1077 (tweet ids). To evaluate get_tweets_with_meme([memes], time_window), the meme index gives the IDs of tweets containing [memes], the time index gives the IDs of tweets within the time window, and the two sets are combined into the results.]

Challenges: tens of millions of tweets per day, with time windows normally measured in months, lead to large index data sizes and low query evaluation performance.
SALSA
Customizable Index Structures Stored in HBase Tables
[Example text index table: row "Beautiful" -> tweet ids 13496, 12393 with creation times 2011-04-05, 2011-05-05. Example meme index table: row "#Euro2012" -> tweet ids 13496, 12393 with creation times 2011-04-05, 2011-05-05.]

Embed the tweet creation time in the indices. Queries like get_tweets_with_meme([memes], time_window) can then be evaluated by visiting only one index. For queries like user_post_count([memes], time_window), embed more information, such as the tweets' user IDs, for efficient evaluation.
SALSA
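As a sketch of how such a query could be evaluated against a meme index table like the one above, the snippet below uses the standard HBase client API. The table name, column family, and cell layout (qualifier = tweet ID, cell value = creation time in epoch milliseconds) are assumptions for illustration, not the IndexedHBase schema.

```java
// Hedged sketch: evaluate get_tweets_with_meme(meme, start, end) against a meme
// index table. Assumed layout (illustrative only): row key = meme, one column per
// tweet ID in family "t", cell value = tweet creation time (epoch millis).
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class MemeIndexQuery {
  public static List<String> tweetsWithMeme(String meme, long startMillis, long endMillis)
      throws IOException {
    Configuration conf = HBaseConfiguration.create();
    List<String> tweetIds = new ArrayList<>();
    try (Connection conn = ConnectionFactory.createConnection(conf);
         Table index = conn.getTable(TableName.valueOf("memeIndexTable"))) {  // assumed table name
      Get get = new Get(Bytes.toBytes(meme));            // one row per meme
      get.addFamily(Bytes.toBytes("t"));                 // assumed column family
      Result row = index.get(get);
      for (Cell cell : row.rawCells()) {
        long created = Bytes.toLong(CellUtil.cloneValue(cell));  // creation time stored in the cell
        if (created >= startMillis && created <= endMillis) {
          tweetIds.add(Bytes.toString(CellUtil.cloneQualifier(cell)));  // qualifier holds the tweet ID
        }
      }
    }
    return tweetIds;   // only the index row was read; no scan of the raw tweet table
  }
}
```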

Distributed Range Query: get_retweet_edges([memes], time_window)
[Diagram: the customized meme index yields subsets of tweet IDs, and a MapReduce job counts retweet edges (i.e., user ID -> retweeted user ID) to produce the results.]
For queries like get_retweet_edges([memes], time_window), MapReduce accesses the meme index table instead of the raw data table.
SALSA
Convergence is Happening
Data intensive applications with basic activities: capture, curation, preservation, and analysis (visualization).
Data Intensive Paradigms

Cloud infrastructure and runtime; parallel threading and processes.
SALSA
Dynamic Virtual Clusters
[Dynamic cluster architecture diagram: SW-G running on Hadoop (Linux bare-system), Hadoop (Linux on Xen), and DryadLINQ (Windows Server 2008 bare-system), all over XCAT infrastructure on iDataplex bare-metal nodes (32 nodes). Monitoring & control infrastructure: monitoring interface, pub/sub broker network, virtual/physical clusters, summarizer, switcher.]
- Switchable clusters on the same hardware (~5 minutes to switch between different OS environments, such as Linux+Xen and Windows+HPCS).
- Support for virtual clusters.
- SW-G: Smith-Waterman-Gotoh dissimilarity computation, a pleasingly parallel problem suitable for MapReduce-style applications.
SALSA
SALSA HPC Dynamic Virtual Clusters Demo
At top, three clusters are switching applications on a fixed environment (takes ~30 seconds). At bottom, one cluster is switching between environments: Linux; Linux+Xen; Windows+HPCS (takes about ~7 minutes). It demonstrates the concept of Science on Clouds using a FutureGrid cluster.
SALSA
Summary of Plans
- Support scientific simulations (data mining and data analysis).
- Application kernels: genomics, proteomics, information retrieval, polar science, scientific simulation data analysis and management, dissimilarity computation,

clustering, multidimensional scaling, generative topographic mapping.
- Security, provenance, portal: services and workflow.
[Architecture stack diagram: high level language (cross platform); programming model: iterative MapReduce (collectives, fault tolerance, scheduling); runtime: distributed file systems, object store, data parallel file system; storage; infrastructure: Windows Server and Linux HPC bare-systems, virtualization, Amazon Cloud, Azure Cloud, Grid Appliance; hardware: CPU nodes and GPU nodes.]
SALSA
Big Data Challenge
[Diagram: data scales from mega (10^6) and giga (10^9) to tera (10^12) and peta (10^15); Pig Latin.]
SALSA
Acknowledgement

SALSA HPC Group, http://salsahpc.indiana.edu
School of Informatics and Computing, Indiana University
SALSA
References
1. M. Isard, M. Budiu, Y. Yu, A. Birrell, D. Fetterly, Dryad: Distributed data-parallel programs from sequential building blocks, in: ACM SIGOPS Operating Systems Review, ACM Press, 2007, pp. 59-72.
2. J. Ekanayake, H. Li, B. Zhang, T. Gunarathne, S. Bae, J. Qiu, G. Fox, Twister: A Runtime for Iterative MapReduce, in: Proceedings of the First International Workshop on MapReduce and its Applications of the ACM HPDC 2010 conference, June 20-25, 2010, ACM, Chicago, Illinois, 2010.
3. Daytona iterative map-reduce framework. http://research.microsoft.com/en-us/projects/daytona/.
4. Y. Bu, B. Howe, M. Balazinska, M.D. Ernst, HaLoop: Efficient Iterative Data Processing on Large Clusters, in: The 36th International Conference on Very Large Data Bases, VLDB Endowment, Singapore, 2010.
5. M. Zaharia, M. Chowdhury, M.J. Franklin, S. Shenker, I. Stoica, Spark: Cluster Computing with Working Sets, in: HotCloud'10, Proceedings of the 2nd USENIX Conference on Hot Topics in Cloud Computing, USENIX Association, Berkeley, CA, 2010.
6. Y. Zhang, Q. Gao, L. Gao, C. Wang, iMapReduce: A Distributed Computing Framework for Iterative Computation, in: Proceedings of the 2011 IEEE International Symposium on Parallel and Distributed Processing Workshops and PhD Forum, pp. 1112-1121, May 16-20, 2011.
7. T. Bicer, D. Chiu, G. Agrawal, MATE-EC2: A Middleware for Processing Data with AWS, in: Proceedings of the 2011 ACM International Workshop on Many Task Computing on Grids and Supercomputers (MTAGS '11), ACM, New York, NY, USA, pp. 59-68, 2011.
8. Y. Wang, X. Que, W. Yu, D. Goldenberg, D. Sehgal, Hadoop Acceleration through Network Levitated Merge, in: Proceedings of the 2011 International Conference for High Performance Computing, Networking, Storage and Analysis (SC '11), ACM, New York, NY, USA, Article 57, 10 pages, 2011.
9. K. Kambatla, N. Rapolu, S. Jagannathan, A. Grama, Asynchronous Algorithms in MapReduce, in: IEEE International Conference on Cluster Computing (CLUSTER), 2010.
10. T. Condie, N. Conway, P. Alvaro, J. M. Hellerstein, K. Elmleegy, R. Sears, MapReduce Online, in: NSDI, 2010.
11. M. Chowdhury, M. Zaharia, J. Ma, M.I. Jordan, I. Stoica, Managing Data Transfers in Computer Clusters with Orchestra, SIGCOMM 2011, August 2011.
12. M. Zaharia, M. Chowdhury, M.J. Franklin, S. Shenker, I. Stoica, Spark: Cluster Computing with Working Sets, HotCloud 2010, June 2010.
13. H. Liu, D. Orban, Cloud MapReduce: A MapReduce Implementation on top of a Cloud Operating System, in: 11th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, pp. 464-474, 2011.
14. AppEngine MapReduce, July 25th 2011; http://code.google.com/p/appengine-mapreduce.
15. J. Dean, S. Ghemawat, MapReduce: Simplified Data Processing on Large Clusters, Commun. ACM, 51 (2008) 107-113.
SALSA

Comparison of Runtime Models
- Twister: language Java; environment clusters, HPC, cloud; job control via iterative MapReduce; fault tolerance at the iteration level; communication via broker and TCP; work unit is a thread; static scheduling.
- Hadoop: language Java; environment clusters, cloud; job control via MapReduce; fault tolerance at the task level; communication via RPC and TCP; work unit is a process; dynamic, speculative scheduling.
- MPI: language C; environment HPC and supercomputers; job control via parallel processes; fault tolerance must be added; communication via TCP, shared memory, and InfiniBand; work unit is a process; static scheduling.
SALSA
Comparison of Data Models

- Twister: application data category scientific data (vectors, matrices); data source local disk or DFS; data format text/binary; partition-based data loading; data cached in memory; data processing unit is Key-Value objects; collective communication via broadcasting and shuffling.
- Hadoop: application data category records and logs; data source local disk or HDFS; data format text/binary; data loading via InputSplit/InputFormat; data cached in local files; data processing unit is Key-Value objects; collective communication via broadcasting and shuffling.
- MPI: application data category scientific data (vectors, matrices); data source DFS; data format text/binary/HDF5/NetCDF; customized data loading; data cached in memory; data processing units are basic types and vectors; multiple kinds of collective communication.

SALSA
Problem Analysis: Entities and Relationships in the Truthy Data Set
[Diagram: users follow and mention other users; users post tweets; tweets contain memes; tweets can retweet other tweets.]
SALSA
Problem Analysis: Example Piece of the Truthy Data Set
SALSA

Problem Analysis: Examples of Time-Related Queries and Measurements
- get_tweets_with_meme([memes], time_window)
- get_tweets_with_text(keyword, time_window)
- timestamp_count([memes], time_window) -> {2010-09-31: 30, 2010-10-01: 50, 2010-10-02: 150, ...}
- user_post_count([memes], time_window) -> {"MittRomney": 23,000, "RonPaul": 54,000, ...}
- get_retweet_edges([memes], time_window)
- measure the distribution of meme lifetime (time between the first and last tweet about a meme)
SALSA
Chef Study: What is SalsaDPI? (Cont.)
SalsaDPI provides a configurable interface (API later) and automates Hadoop/Twister/other binary execution. (*Chef official website: http://www.opscode.com/chef/)
Motivation

- Background knowledge, environment setup, different cloud infrastructure tools, and software dependencies make for a long learning path. Can these complicated steps be automated?
- Solution: the Salsa Dynamic Provisioning Interface (SalsaDPI), a one-click deployment.
Chef: an open source, traditional client-server system for provisioning, configuration management, and system integration, with a contributor programming interface.

(Graph source: http://wiki.opscode.com/display/chef/Home)
Chef Study
[Diagram: 1. Fog cloud API (start VMs); 2. Knife bootstrap installation; 3. compute node registration. Components: Chef client (Knife-Euca), Chef server, bootstrap templates, FOG, NET::SSH, compute nodes.]
[SalsaDPI architecture diagram: SalsaDPI configs (JobInfo, DPIConf) drive the SalsaDPI driver, which uses Hadoop, Twister, or other modules, a system call module, an SSH module, software recipes, and a Chef/Knife client talking to the Chef server and the compute nodes.]

Summary of Plans
- Intend to implement a range of biology applications with Dryad/Hadoop/Twister; FutureGrid allows easy Windows vs. Linux comparison, with and without VMs.
- Initially we will make key capabilities available as services, which we eventually implement on virtual clusters (clouds) to address very large problems:
  - Basic pairwise dissimilarity calculations
  - Capabilities already in R (done already by us and others)
  - MDS in various forms
  - GTM (Generative Topographic Mapping)
  - Vector and pairwise deterministic annealing clustering
- Point viewer (PlotViz), either as a download (to Windows!) or as a Web service, gives browsing.
- Should enable much larger problems than existing systems.
- Will look at Twister as a universal solution.
SALSA

Building Virtual Clusters: Towards Reproducible eScience in the Cloud
Separation of concerns between two layers: the infrastructure layer (interactions with the cloud API) and the software layer (interactions with the running VM).
SALSA
Separation Leads to Reuse
By separating the layers (infrastructure layer = *, software layer = #), one can reuse software layer artifacts in separate clouds.
SALSA
Design and Implementation
Equivalent machine images (MI) built in separate clouds provide a common underpinning in separate clouds for software

installations and configurations. Extend to Azure. Configuration management is used for software automation.
SALSA
Cloud Image Proliferation
[Chart: FutureGrid Eucalyptus images per bucket (N = 120).]
SALSA

Changes of Hadoop Versions
SALSA
Implementation - Hadoop Cluster
Hadoop cluster commands:
  knife hadoop launch {name} {slave count}
  knife hadoop terminate {name}
SALSA
Running CloudBurst on Hadoop
Running CloudBurst on a 10-node Hadoop cluster:
  knife hadoop launch cloudburst 9
  echo '{"run_list": "recipe[cloudburst]"}' > cloudburst.json
  chef-client -j cloudburst.json

[Chart: CloudBurst sample data run-time results on 10-, 20-, and 50-node Hadoop clusters; run time in seconds vs. cluster size (node count), broken into FilterAlignments and CloudBurst phases.]

SALSA
Implementation - Condor Pool
Condor pool commands:
  knife cluster launch {name} {exec. host count}
  knife cluster terminate {name}
  knife cluster node add {name} {node count}
SALSA
Implementation - Condor Pool
[Ganglia screenshot of a Condor pool in Amazon EC2: 80 nodes (320 cores) at this point in time.]
SALSA

Big Data Challenge
[Diagram: data scales from mega (10^6) and giga (10^9) to tera (10^12) and peta (10^15); Pig Latin.]
SALSA
Collective Communication Primitives for Iterative MapReduce
Generalize MapReduce to MapCollective, implemented optimally on each CPU-network configuration.
[Diagram: in the nth iteration, map tasks Map1..MapN feed system or user routing collectives, whose output starts the (n+1)th iteration; iterate from the initial to the final state.]
SALSA
Fraction of Point-Center Distances
[Chart: fraction of point-center distances calculated for three versions of the algorithm, for 76,800 points and 3,200 centers in a 2048-dimensional space, for three choices of lower bounds (LB) kept per point.]

SALSA
One-click Deployment on Clouds: What is SalsaDPI? (High-Level)
[Diagram: the SalsaDPI jar, Chef client, Chef server, OS, and user configuration interact as follows: 1. bootstrap VMs with a configuration file; 2. retrieve configuration info and request authentication and authorization; 3. authenticated and authorized to execute the software run-list; 4. VM(s) information; 5. submit application commands; 6. obtain result. Each VM runs the OS, Chef, the software stack (S/W), and the applications. (*Chef architecture: http://wiki.opscode.com/display/chef/Architecture+Introduction)]
Web interface: http://salsahpc.indiana.edu/salsaDPI/ (one-click solution)
Futures
- Extend to OpenStack and commercial clouds.
- Support storage such as Walrus (Eucalyptus) and Swift (OpenStack).

- Test scalability.
- Compare Engage (Germany), Cloud-init (Ubuntu), Phantom (Nimbus), Horizon (OpenStack).
Acknowledgement
Bingjing Zhang, Thilina Gunarathne, Fei Teng, Xiaoming Gao, Stephen Wu
Prof. David Crandall (Computer Vision), Prof. Filippo Menczer (Complex Networks and Systems), Prof. Geoffrey Fox (Parallel and Distributed Computing)
SALSA
Others
- MATE-EC2 [8]: local reduction object.
- Network Levitated Merge [9]: RDMA/InfiniBand based shuffle & merge.
- Asynchronous Algorithms in MapReduce [10]: local and global reduce.
- MapReduce Online [11]: online aggregation and continuous queries; pushes data from map to reduce.
- Orchestra [12]: data transfer improvements for MapReduce.
- iMapReduce [13]: asynchronous iterations; one-to-one map-reduce mapping; automatically joins loop-variant and invariant data.
- CloudMapReduce [14] & Google AppEngine MapReduce [15]: MapReduce frameworks utilizing cloud infrastructure services.
SALSA
Summary of Initial Results
- Cloud technologies (Dryad/Hadoop/Azure/EC2) are promising for biology computations.
- Dynamic virtual clusters allow one to switch between different modes.
- The overhead of VMs on Hadoop (~15%) is acceptable.
- Inhomogeneous problems currently favor Hadoop over Dryad.
- Twister allows iterative problems (classic linear algebra/data mining) to use the MapReduce model efficiently; a prototype of Twister has been released.
SALSA
Future Work
The support for handling large data sets, the concept of moving computation to data, and the better quality of service provided by cloud technologies make data analysis feasible on an unprecedented scale for assisting new scientific discovery. Combine "computational thinking" with the fourth paradigm (Jim Gray on data intensive computing): research from advances in Computer Science and Applications (scientific discovery).
SALSA
