Arijit Khan - Nanyang Technological University

Arijit Khan - Nanyang Technological University

On Smart Query Routing: For Distributed Graph Querying with Decoupled Storage Arijit Khan Gustavo Segovia Donald Kossmann Nanyang Technological University (NTU), Singapore ETH Zurich, Switzerland Microsoft Research, Redmond, USA Big Graphs Facebook: > 800 million active

users Google: > 1 trillion indexed pages Web Graph 31 billion RDF 31 billion triples in RDF 2011 triples in 2011 Information Network Social Network De Bruijn: 4k nodes (k = 20, , 40) Biological Network

On Smart Query Routing: For Distributed Graph Querying with Decoupled Storage: Arijit Khan (NTU Singapore), G. Segovia, and D. Kossmann 100M Ratings, 480K Users, 17K Movies Graphs in Machine Learning 1/ 32 Background: Distributed Graph Querying Systems First, partition the graph, and then place each partition on a separate server, where query answering over that partition takes place. State-of-the-art distributed graph querying systems (e.g., SEDGE [SIGMOD12], Trinity [SIGMOD13], Horton [PVLDB13]) On Smart Query Routing: For Distributed Graph Querying with Decoupled Storage: Arijit Khan (NTU Singapore), G. Segovia, and D. Kossmann 2/ 32

Background: Distributed Graph Querying Systems Disadvantages Fixed Routing (less flexible) Balanced Graph Partitioning and Re-Partitioning State-of-the-art distributed graph querying systems On Smart Query Routing: For Distributed Graph Querying with Decoupled Storage: Arijit Khan (NTU Singapore), G. Segovia, and D. Kossmann 3/ 32 Background: Distributed Graph Querying Systems Disadvantages Fixed Routing (less flexible) The server which contains the query node can only handle that request the router maintains a fixed routing table (or, a fixed routing strategy, e.g., modulo hashing). Less flexible with respect to query

routing and fault tolerance, e.g., adding more machines will require updating the data partition and/or the routing table. Balanced Graph Partitioning and Re-Partitioning On Smart Query Routing: For Distributed Graph Querying with Decoupled Storage: Arijit Khan (NTU Singapore), G. Segovia, and D. Kossmann State-of-the-art distributed graph querying systems 4/ 32 Background: Distributed Graph Querying Systems Disadvantages Fixed Routing (less flexible) Balanced Graph Partitioning Partitioning and

Re- (1) workload balancing to maximize parallelism, (2) locality of data access to minimize network communication NPhard, difficult in power-law graphs. later updates to graph structure or variations in query workloads graph re-partitioning/ replication online monitoring of workload changes, repartitioning of the graph topology, and migration of graph data across servers are expensive. On Smart Query Routing: For Distributed Graph Querying with Decoupled Storage: Arijit Khan (NTU Singapore), G. Segovia, and D. Kossmann State-of-the-art distributed graph querying systems 5/ 32 Roadmap Distributed graph querying and graph partitioning Decoupled graph querying system Related work Smart graph query routing Experimental results

Conclusions On Smart Query Routing: For Distributed Graph Querying with Decoupled Storage: Arijit Khan (NTU Singapore), G. Segovia, and D. Kossmann 6/ 32 Decoupled Graph Querying System we decouple query processing and graph storage into two separate tiers. This decoupling happens at a logical level. Decoupled architecture for graph querying On Smart Query Routing: For Distributed Graph Querying with Decoupled Storage: Arijit Khan (NTU Singapore), G. Segovia, and D. Kossmann 7/ 32 Decoupled Graph Querying System Benefits Flexible routing

Less reliant on good partitioning across storage servers [Due to our smart query routing strategy will be discussed soon!] Decoupled architecture for graph querying On Smart Query Routing: For Distributed Graph Querying with Decoupled Storage: Arijit Khan (NTU Singapore), G. Segovia, and D. Kossmann 8/ 32 Decoupled Graph Querying System Benefits Flexible routing A query processor no longer assigned a fixed part of the graph equally capable of handling any request facilitating load balancing and fault tolerance. The query router can send a request to any of the query processors more flexible query routing, e.g., more query processors can be added (or, a query processor that is

down can be replaced) without affecting the routing strategy. Decoupled architecture for graph querying On Smart Query Routing: For Distributed Graph Querying with Decoupled Storage: Arijit Khan (NTU Singapore), G. Segovia, and D. Kossmann 9/ 32 Decoupled Graph Querying System Benefits Flexible routing Each tier can be scaled-up independently. A certain workload is processing intensive allocate more servers to the processing tier. Graph size increases over time add more servers in the storage tier. Decoupled architecture, being generic, can be employed in many existing graph querying systems.

On Smart Query Routing: For Distributed Graph Querying with Decoupled Storage: Arijit Khan (NTU Singapore), G. Segovia, and D. Kossmann Decoupled architecture for graph querying 10/ 32 Roadmap Distributed graph querying and graph partitioning Decoupled graph querying system Related work Smart graph query routing Experimental results Conclusions On Smart Query Routing: For Distributed Graph Querying with Decoupled Storage: Arijit Khan (NTU Singapore), G. Segovia, and D. Kossmann 11/ 32 Related Work: Decoupling Storage and Query Processors Facebooks Memcached [NSDI13] Googles F1 [PVLDB13] ScaleDB [http://scaledb.com/pdfs/TechnicalOverview.pdf]

Loesing et. al. (On the Design and Scalability of Distributed SharedData Databases) [SIGMOD15] Binnig et. al. (The End of Slow Networks: Its Time for a Redesign) [PVLDB16] Shalita et. al. (Social Hash: An Assignment Framework for Optimizing Distributed Systems Operations on Social Networks) [NSDI16] On Smart Query Routing: For Distributed Graph Querying with Decoupled Storage: Arijit Khan (NTU Singapore), G. Segovia, and D. Kossmann 12/ 32 Decoupled Graph Querying System Disadvantages Query processors may need to communicate with the storage tier via the network additional penalty to the response time for answering a query. May cause high contention rates on either the network, storage tier, or both. Decoupled architecture for

graph querying On Smart Query Routing: For Distributed Graph Querying with Decoupled Storage: Arijit Khan (NTU Singapore), G. Segovia, and D. Kossmann 13/ 32 Our Contribution: Smart Query Routing We We design design aa smart smart query query routing routing logic logic to to utilize utilize the the cache cache of of query query processors processors over over such

such decoupled decoupled architecture. architecture. More More cache cache hits hits reduce reduce communication communication among among query query processors processors and and storage storage servers. servers. More More cache cache hits hits less less reliant

reliant on on good good partitioning partitioning across across storage storage servers. servers. Decoupled architecture for graph querying On Smart Query Routing: For Distributed Graph Querying with Decoupled Storage: Arijit Khan (NTU Singapore), G. Segovia, and D. Kossmann 14/ 32 Roadmap Distributed graph querying and graph partitioning Decoupled graph querying system Related work Smart graph query routing Experimental results Conclusions

On Smart Query Routing: For Distributed Graph Querying with Decoupled Storage: Arijit Khan (NTU Singapore), G. Segovia, and D. Kossmann 15/ 32 h-Hop Traversal Queries Online, h-hop queries: explore a small region of the entire graph, and require fast response time. Start with a query node, and traverse its neighboring nodes up to a certain number of hops (i.e., h = 2, 3). Example Example ss h-hop neighbor aggregation h-step random walk with restart h-hop reachability More complex queries, e.g., node labeling and classification, expert finding, ranking, discovering functional modules, complexes, and pathways On Smart Query Routing: For Distributed Graph Querying with Decoupled Storage: Arijit Khan (NTU Singapore), G. Segovia, and D. Kossmann 16/ 32

Objectives for Smart Query Routing Leverage each cached data processors Balance workload even if skewed or contains hotspot Make fast routing decisions [a small constant time, or << O(n) ] Have low storage overhead in the router [a small fraction of the input graph size] Decoupled architecture for graph querying On Smart Query Routing: For Distributed Graph Querying with Decoupled Storage: Arijit Khan (NTU Singapore), G. Segovia, and D. Kossmann 17/ 32 Challenges in Smart Query

Routing Objectives are conflicting For maximum cache locality, router can send all queries to the same processor (assuming no cache eviction) imbalanced workload in processors lower throughput. router could inspect the cache of each processor before making a good routing decision network delay. Hence, router must infer what is likely to be in each processors cache. On Smart Query Routing: For Distributed Graph Querying with Decoupled Storage: Arijit Khan (NTU Singapore), G. Segovia, and D. Kossmann Smart Smart Routing Routing Objectives Objectives Leverage each processors cached data Balance workload even if skewed or contains hotspot

Make fast routing decisions Have low storage overhead in the router 18/ 32 Challenges in Smart Query Routing Smart Smart Routing Routing Objectives Objectives are are conflicting! conflicting! Topology-Aware Locality successive queries on nearby nodes must be sent to the same processor. It is likely that h-hop neighborhoods of these nodes significantly overlap. How the router knows about nearby nodes without storing the entire graph topology?

2-hop neighborhoods of u and v overlap significantly - use landmark, graph embedding On Smart Query Routing: For Distributed Graph Querying with Decoupled Storage: Arijit Khan (NTU Singapore), G. Segovia, and D. Kossmann 19/ 32 Challenges in Smart Query Routing Smart Smart Routing Routing Objectives Objectives are are conflicting! conflicting! Query Stealing Always Routing queries to processors that have the most useful cache data workload imbalance if skew/ query hotspot lower throughput. We perform query stealing at router Whenever a processor is idle and is

ready to handle a new query, if it does not have any other requests assigned to it, the router may steal a request and send to it which was intended for another processor. Query stilling by maintaining topology-aware locality (as much as possible). On Smart Query Routing: For Distributed Graph Querying with Decoupled Storage: Arijit Khan (NTU Singapore), G. Segovia, and D. Kossmann 20/ 32 Smart Routing-1: Landmark If two nodes are close to a given landmark, they are likely to be close themselves. On Smart Query Routing: For Distributed Graph Querying with Decoupled Storage: Arijit Khan (NTU Singapore), G. Segovia, and D. Kossmann 21/ 32 Smart Routing-1: Landmark PrePreprocessing processing

Select a small set of L nodes as landmarks. Compute distance landmarks. of every node to Assign landmarks to query processors: Every processor is assigned a pivot landmark with the intent that pivot landmarks are as far from each other as possible. Each remaining landmark is assigned to the processor which contains its closest pivot landmark. If two nodes are close to a given landmark, they are likely to be close themselves. The distance of a node u to a processor p is defined as the minimum distance of u to any landmark that is assigned to processor p.

This distance information is stored in the router, which requires O(nP) space and O(nL) time to compute. On Smart Query Routing: For Distributed Graph Querying with Decoupled Storage: Arijit Khan (NTU Singapore), G. Segovia, and D. Kossmann 22/ 32 Smart Routing-1: Landmark Online Online Routing Routing a query on node u the router verifies the pre-computed distance d(u, p) for every processor p selects the one with the smallest d(u, p) value. Routing decision time: O(p) Load-balancing Load-balancing via via QueryQuerystealing stealing If two nodes are close to a given landmark, they are

likely to be close themselves. Route to smallest load-balanced distance. Nearby nodes are routed in similar way, maintaining topology-aware locality. On Smart Query Routing: For Distributed Graph Querying with Decoupled Storage: Arijit Khan (NTU Singapore), G. Segovia, and D. Kossmann 23/ 32 Smart Routing-2: Embed PrePreprocessing processing Embed a graph in a lower D-dimensional Euclidean plane. The hop-count distance between graph nodes are approximately preserved via their Euclidean distance. Storage = O(nD), time = O(|L|2D + n|L|D) Graph embedding in 2D Euclidean plane A A benefit

benefit of of embed embed routing routing is is that that the the pre-processing pre-processing is is independent independent of of the the system system topology, topology, allowing allowing more more processors processors to to be be easily easily added added at at aa later later time. time.

On Smart Query Routing: For Distributed Graph Querying with Decoupled Storage: Arijit Khan (NTU Singapore), G. Segovia, and D. Kossmann 24/ 32 Smart Routing-2: Embed Online Online Routing Routing Exponential moving average to compute the mean of the processors cache contents. Router finds the distance between a query node u and a processor p, denoted as d(u, p), and defined as the distance of the query nodes co-ordinates to the historical mean of the processors cache contents. Graph embedding in 2D Euclidean plane Route query on u to processor p with

minimum d(u, p). Routing decision time: O(PD) On Smart Query Routing: For Distributed Graph Querying with Decoupled Storage: Arijit Khan (NTU Singapore), G. Segovia, and D. Kossmann 25/ 32 Smart Routing-2: Embed Online Online Routing Routing Exponential moving average to compute the mean of the processors cache contents. Router finds the distance between a query node u and a processor p, denoted as d(u, p), and defined as the distance of the query nodes co-ordinates to the historical mean of the processors cache contents. Route query on u to processor p with minimum d(u, p). Graph embedding in 2D

Euclidean plane Load-balancing Load-balancing via via Query-stealing Query-stealing Routing decision time: O(PD) On Smart Query Routing: For Distributed Graph Querying with Decoupled Storage: Arijit Khan (NTU Singapore), G. Segovia, and D. Kossmann 25/ 32 Roadmap Distributed graph querying and graph partitioning Decoupled graph querying system Related work Smart graph query routing Experimental results Conclusions On Smart Query Routing: For Distributed Graph Querying with Decoupled Storage: Arijit Khan (NTU Singapore), G. Segovia, and D. Kossmann 26/ 32

Experimental Setup Graph Graph Datasets Datasets Cluster Cluster Configuration Configuration 12 servers each having 2.4 GHz Intel Xeon processors, 0 4GB cache. interconnected by 40 Gbps Infiniband, and also by 10 Gbps Ethernet. Use a single core of each server with the following configuration: 1 server as router, 7 servers in the processing tier, 4 servers in the storage tier; and communication over Infiniband with remote direct memory access (RDMA). RAMCloud as storage tier. Graph is stored as adjacency list every node-id is key, and the corresponding value is an array of its 1-hop neighbors. The graph is partitioned across storage servers via RAMClouds default and inexpensive hash partitioning scheme, MurmurHash3 over graph nodes. On Smart Query Routing: For Distributed Graph Querying with Decoupled Storage: Arijit Khan (NTU Singapore), G. Segovia, and D. Kossmann 27/ 32

List of Experiments Comparison with distributed graph systems (SEDGE [SIGMOD12] with Giraph [SIGMOD10], GraphLab [VLDB12]) that use smart graph partitioning and reportioning - Our method achieves up to an order of magnitude higher throughput even with inexpensive hash partitioning of the graph! Scalability with number of processors and storage servers Impact of cache size Impact of graph updates Sensitivity w.r.t. different parameters: query locality and hotspot, h-hop queries, load factor, smoothing parameter, embedding dimensionality, landmark numbers, minimum distance between a pair of landmarks Performance Performance Metrics Metrics Query efficiency, Query throughput, Cache hit rates Baseline Baseline Routing Routing Methods Methods Next ready, No cache, Modular hash with query stealing

On Smart Query Routing: For Distributed Graph Querying with Decoupled Storage: Arijit Khan (NTU Singapore), G. Segovia, and D. Kossmann 28/ 32 Performance with Varying Number of Query Processors Embed Embed routing routing is is able able to to sustain sustain almost almost same same cache cache hit hit rate rate with with many many query query processors. processors. Hence,

Hence, its its throughput throughput scales scales linearly linearly with with query query processors. processors. On Smart Query Routing: For Distributed Graph Querying with Decoupled Storage: Arijit Khan (NTU Singapore), G. Segovia, and D. Kossmann 29/ 32 Impact of Cache Sizes Both Both smart smart routings routings Embed Embed and and Landmark Landmark utilize utilize the

the cache cache well; well; and and for for the the same same amount amount of of cache, cache, they they achieve achieve lower lower response response time time compared compared to to baseline baseline routings. routings. On Smart Query Routing: For Distributed Graph Querying with Decoupled Storage: Arijit Khan (NTU Singapore), G. Segovia, and D. Kossmann

30/ 32 Roadmap Distributed graph querying and graph partitioning Decoupled graph querying system Related work Smart graph query routing Experimental results Conclusions On Smart Query Routing: For Distributed Graph Querying with Decoupled Storage: Arijit Khan (NTU Singapore), G. Segovia, and D. Kossmann 31/ 32 Conclusions Decoupled graph querying system Smart query routing to achieve higher cache hits for h-hop traversal queries emphasize less on expensive graph partitioning and re-partitioning across storage tiers provide linear scalability in throughput

with more number of query processors works well in the presence of query hotspots adaptive to workload changes and graph updates. On Smart Query Routing: For Distributed Graph Querying with Decoupled Storage: Arijit Khan (NTU Singapore), G. Segovia, and D. Kossmann Decoupled architecture for graph querying 32/ 32

Recently Viewed Presentations

  • Zmysły chemiczne

    Zmysły chemiczne

    Kodowanie tekstury w receptorach i korze czuciowej Wzorzec impulsów w mechanoreceptorach koduje teksturę obiektów będących w kontakcie ze skórą. Kodowanie tekstury obiektów w receptorach dotyku i obszarach kory czuciowej Spatial event plot. Każda kropka odpowiada potencjałowi ...
  • Goddard Sailing Association - NASA

    Goddard Sailing Association - NASA

    Goddard Sailing Association Laguna 26 ... electrical fire, less effective for Class 'A' - general combustibles, and 'B' - oil / gasoline Check satisfactory charge before sailing Drill Call FIRE! FIRE! FIRE! Stop the engine, if running, and select OFF...
  • HOW TO GENERATE YOUR ACADEMIC LETTER OF ACCOMMODATION

    HOW TO GENERATE YOUR ACADEMIC LETTER OF ACCOMMODATION

    HOW TO GENERATE YOUR ACADEMIC LETTER OF ACCOMMODATION UBC CENTRE FOR ACCESSIBILITY INFORMATION FOR STUDENTS How to generate the Academic Accommodation Letter You need ...
  • Bangladesh Climate-Resilient Ecosystem Curriculum (BACUM) Module 2: REDD+

    Bangladesh Climate-Resilient Ecosystem Curriculum (BACUM) Module 2: REDD+

    Prof. (Dr.) Md. Danesh Miah. REDD+, Forest Carbon. ... (referred to as "power with" or exchange power) What is Power? KEY MESSAGE: Power is simply the ability to get what one wants. Think of your husband or wife…. Sources of...
  • The New Frontier and The Great Society

    The New Frontier and The Great Society

    The Act prohibited discrimination based on race, color, religion or national origin, and granted the federal government new powers to enforce the law by creating agencies such as the E.O.E.C. [Equal Opportunity Employment Commission] that would monitor hiring practices of...
  • Unit 1 Introduction to Chemistry - FREE Chemistry Materials ...

    Unit 1 Introduction to Chemistry - FREE Chemistry Materials ...

    For example, acetone has the following LD 50. s:ORL-RAT LD. 50 ... Chemical B: LD. 50 = 48 mg/kg. Science. The Functions of Science. pure science applied science. the search for . knowledge; facts. using knowledge. in a practical way...
  • Title

    Title

    Anomalous Monism:The Way Out? [Stanford Encyclopedia] "Anomalous Monism is a theory about the scientific status of psychology, the physical status of mental events, and the relation between these issues developed by Donald Davidson. It claims that psychology cannot be a...
  • IEEE 802.11 based WLANs

    IEEE 802.11 based WLANs

    The last link with the users is wireless, to give a network connection to all users in a building or campus. The backbone network usually uses cables Common Topologies The wireless LAN connects to a wired LAN There is a...