Optimizing Mixing in Pervasive Networks: A Graph-Theoretic Perspective Murtuza Jadliwala, Igor Bilogrevic and Jean-Pierre Hubaux ESORICS, 2011 Wireless Trends Smart Phones Vehicles Always on Background apps Watches Cameras

Passports 2 Peer-to-Peer Wireless Networks 1 Identifier 2 Message 3 Examples VANETs

Social networks Nokia Instant Community Urban Sensing networks Delay tolerant networks Peer-to-peer file exchange 4 Location Privacy Problem Monitor identifiers used in peer-to-peer communications a b c

5 Location Privacy Attacks Identifier Pseudonym Message Pseudonymous location traces Home/work location pairs are unique [1] Re-identification of traces through data analysis [2,4,3,5] Attack: Spatio-Temporal correlation of traces [1] P. Golle and K. Partridge. On the Anonymity of Home/Work Location Pairs. Pervasive Computing, 2009 [2] A. Beresford and F. Stajano. Location Privacy in Pervasive Computing. IEEE Pervasive Computing, 2003 [3] B. Hoh et al. Enhancing Security & Privacy in Traffic Monitoring Systems. Pervasive Computing, 2006

[4] B. Hoh and M. Gruteser. Protecting location privacy through path confusion. SECURECOMM, 2005 [5] J. Krumm. Inference Attacks on Location Tracks. Pervasive Computing, 2007 6 Location Privacy with Mix Zones Prevent long term tracking 1 2 b ?a a Mix zone

Change identifier in mix zones [6,7] Key used to sign messages is changed MAC address is changed [6] A. Beresford and F. Stajano. Mix Zones: User Privacy in Location-aware Services. Pervasive Computing and Communications Workshop, 2004 [7] M. Gruteser and D. Grunwald. Enhancing location privacy in wireless LAN through disposable interface identifiers: a quantitative analysis. 7 Mobile Networks and Applications, 2005 Mix-zone Placement in Road Networks Mix zone placement most effective at intersections [8] Enables mixing (covers) at roads leading in and out of the intersection Mix-zones incur cost Communication loss Routing delays Cost vary from intersection to intersection

How to place mix-zones? All roads are covered Overall cost is minimized Mix Cover problem [8] L. Buttyan, T. Holczer, and I. Vajda. On the effectiveness of changing pseudonyms to provide location privacy in 8 VANETs. ESAS 2007 Previous Work on Mix zone Placement Optimization Approach [9] Mixing effectiveness using a flow-based metric Given upper bound on mix zones, max. distance between them and cost, where to place mix zones that maximizes mixing effectiveness Do not address the coverage problem Game-theoretic Approach [10,11] Game-theoretic model of optimal attack and defense strategies

Only consider local, and not network-wide, intersection characteristics [9] J. Freudiger, R. Shokri, and J-P. Hubaux. On the optimal placement of mix zones. PETS 2009 [10] M. Humbert, M. H. Manshaei, J. Freudiger, and J-P. Hubaux. Tracking games in mobile networks. GameSec 2010 [11] T. Alpcan and S. Buchegger. Security games for vehicular networks. IEEE Transactions on Mobile Computing, 2011 9 Outline 1. Mix Cover (MC) Problem 2. Algorithms 3. Evaluation and Results 10 Graph-Theoretic Model ( , , , )

Intersections Vertices (V) Roads Edges (E) Mixing cost at intersection Vertex weight (w) Node intensity on road or demand Edge weight (d) One for each direction, for 3 7 2 6 2 6

9 4 3 2 4 8 8 2 7 2 2 2 2 2

2 5 8 10 7 6 6 3 9 12

2 5 8 9 1 2 4 4 1 1 3

9 4 1 11 Mix Cover (MC) Problem ( , , , ) Determine a subset and a capacity s.t. at least one of or , for all covered by 6 6

2 6 4 3 2 4 8 2 2 7 2 2

2 10 9 8 (capacity indicates the largest demand the intersection can handle) Total weighted cost is minimized 3 7 2

2 2 5 2 7 8 10 7 6 6 3

9 12 2 5 8 9 1 2 4 4

4 1 1 9 3 9 4 1 6x6 + 2x5 + 7x12+ 10x8 + 4x1 + 9x9 = 295 12

Why Mix Cover? Mix zone deployment that provides two guarantees: 1. Privacy guarantee All roads are covered at least at one end Nodes go without mixing over at most one intersection 2. Cost guarantee Minimum network-wide mixing cost A mix cover provides both these! 13 Combinatorial Properties Generalization of Weighted Vertex Cover (WVC) problem Different from the Facility Terminal Cover (FTC) [13] generalization of WVC In FTC, each edge has only a single demand

Result 1: Mix Cover problem is NP-hard No efficient algorithm for finding optimal solution, even finding a good approximation seems hard Proof by polynomial-time reduction from WVC [13] G. Xu, Y. Yang, and J. Xu. Linear Time Algorithms for Approximating the Facility Terminal Cover Problem. Networks 2007 14 Outline 1. Mix Cover (MC) Problem 2. Algorithms 3. Evaluation and Results 15 Three Algorithms

Optimization using Linear Programming 1 Divide and Conquer approach Largest Demand First Smallest Demand First 2 3 16 Integer Program Formulation Cost guarantee Privacy guarantee Capacity requirement

where decision variable for vertex covering edge mixing cost at vertex decision variable indicating selected capacity of vertex Result 2: LP relaxation of the above IP can guarantee a polynomial-time 2approximation for the Mix Cover problem 17 Largest Demand First (LDF) 1. For each edge, replace smaller demand with larger demand (

(,, . . . ) ) 2. Round off the demands to the closest power of 2 3. Divide into subgraphs based on the rounded edge demands 4. Obtain for each 5. For all , , where 6. Output 18 LDF Combinatorial Results A solution to MC problem on is also a solution for Result 3: , where is the optimal solution and

Result 4: LDF is a linear time -approximation algorithm for mix cover where is approximation ratio of Proofs in the paper! 19 Smallest Demand First (SDF) LDF highly sub-optimal chosen capacity depends on larger edge demand value SDF similar to LDF, except In step 1, replace larger edge demand value by smaller value Additional step: For each vertex, remember the largest edge demand incident on it In , choose capacity Result 5: SDF is a time -approximation algorithm for mix cover where is approximation ratio of 20

Outline 1. Mix Cover (MC) Problem 2. Algorithms 3. Evaluation and Results 21 Experimental Setup Input graph constructed using real vehicular traffic data 2 US states, Florida and Virginia 3 sizes of road network, 25%, 65% and 100% of total state municipalities 3 different distributions of vertex weight, constant (1), uniform (between 1 and 100) and Gaussian (mean=50, sd=10) Edge demands chosen from real traffic intensities Algorithms implemented in MATLAB, executed on multi-core computer Results average over 100 runs

22 Solution Quality Ratio of LDF/SDF solution cost to nave strategy cost Nave solution: Select all vertices in final solution v/e v/e LDF SDF Florida LDF SDF Virginia SDF outperforms LDF in both cases for all graph sizes SDF achieves as low as 34% of the cost of the nave solution Performance best for uniform vertex weight distribution and worst for constant distribution 23

Execution Efficiency Duration (in seconds) of algorithm execution LDF SDF Florida LDF Virginia SDF SDF runs slower compared to LDF in both cases for all graph sizes Algorithms fastest when vertex weight constant and worst when selected from a Gaussian distribution 24 Results for LP-based Algorithm Too slow for large graphs

Executed on reduced Florida graph of 515 and 1024 vertices For 515 vertices, ratio of solution cost compared to nave strategy improves to 0.24 (better than LDF and SDF) Execution time is twice compared to LDF and four times that of SDF For 1024 vertices, execution time increased by a factor of 20 25 Conclusion Mix Cover: cost-efficient mix zone placement that guarantees mixing coverage Modeled as a generalization of weighted vertex cover problem Never been studied Model general enough and applicable to other scenarios Approximation algorithms using Linear programming LDF and SDF based on Divide and Conquer approach

Results Proposed algorithms provide solution quality and execution time guarantees Experimentation using real data and standard computation resources show feasibility [email protected] 26 BACKUP SLIDES 27 How to obtain mix zones? Silent mix zones Turn off transceiver Passive mix zones Where adversary is absent

Before connecting to Wireless Access Points Encrypt communications With help of infrastructure Distributed 28 bluetoothtracking.org 29 Pleaserobme.com 30 Mix networks vs Mix zones Alice home

Mix node Mix node Alice work Bob Alice Mix node Mix network Mix Zones

31 Assumption Central authority periodically computes optimal mix cover offline Knows the (dynamic) node or traffic intensity on roads Knows mixing cost at each intersection Nodes or vehicles access the latest mix cover computation from the central authority 32 Solution Size Number of vertices in the final solution v/e

v/e LDF Florida SDF LDF SDF Virginia SDF performs better than LDF in Florida LDF performs better than SDF in Virginia Algorithms do not optimize solution size; depends on road network topology Solution size between 46% and 58% of the total number of vertices 33