Finding a team of Experts in Social Networks Theodoros Lappas UC Riverside Joint work with: Evimaria Terzi (IBM Almaden), Kun Liu (IBM Almaden) 02/13/20 Motivation How can I find a team of experts that can collaborate effectively in order to complete a given task? 2

02/13/20 Problem Given a task and a set of experts organized in a network, find a subset of experts that can effectively perform the task Task: set of required skills Expert: an individual with a specific skill-set Network: represents strength of relationships 3 02/13/20 Expertise networks Collaboration networks (e.g., DBLP graph,

coauthor networks) Organizational structure of companies LinkedIn Geographical (map) of experts 4 02/13/20 What makes a team effective for a task? T = {algorithms, java, graphics, python} Alice {algorithms} Coverage: Bob {python}

Cynthia {graphics, java} David {graphics} {graphics,java,python} every required skill in T is included in the skill-set of at least one team member 5 Eleanor 02/13/20

Is coverage enough? Alice {algorithms} Bob Cynthia {python} David {graphics, java} {graphics} A

A,B,C form an effective group that can communicate B Communication: T={algorithms,java,graphics,python} D C Eleanor {graphics,java,python} A,E could perform the task if they could

communicate E the members of the team must be able to efficiently communicate and work together 6 02/13/20 Problem definition Given a task and a social network of individuals G, find the subset (team) of G that can effectively perform the given task. Thesis: Good teams are teams that have the necessary skills and can also communicate

effectively 7 02/13/20 How to measure effective communication? The longest shortest path between any two nodes in the subgraph Diameter of the subgraph defined by the group members A B D

C E diameter diameter = infty =1 8 02/13/20 How to measure effective communication? The total weight of the edges of a tree that spans all the team nodes MST (Minimum spanning tree) of the subgraph defined by the group members

A B D C E MST MST= =infty 2 9 02/13/20 Problem definition v.1.1

Given a task and a social network G of individuals, find the subset (team) of individuals that can perform the given task and define a subgraph in G with the minimum diameter. Problem is NP-hard 10 02/13/20 The RarestFirst algorithm T={algorithms,java,graphics,python} {graphics,python,java} {algorithms,graphics} A

B Skills: algorithms E {algorithms,graphics,java} graphics java C {python,java} D

python {python} rare= algorithms Diameter = 2 Srare={Bob, Eleanor} 11 02/13/20 The RarestFirst algorithm T={algorithms,java,graphics,python} {graphics,python,java} {algorithms,graphics}

A B Skills: algorithms E {algorithms,graphics,java} graphics java C {python,java}

D python {python} rare=time: algorithms Diameter = 1 Running Quadratic to the number of nodes Srare={Bob, Efactor: leanor} Approximation 2xOPT 12 02/13/20

Problem definition v.1.2 Given a task and a social network G of individuals, find the subset (team) of individuals that can perform the given task and define a subgraph in G with the minimum MST cost. Problem is NP-hard Best known Approximation factor: O(log3n log k) 13 02/13/20 The SteinerTree problem Graph G(V,E) Required vertices

Set of Required Vertices R Find G subgraph of G such that G contains all the required vertices (R) and MST(G) is minimized 14 02/13/20 The EnhancedSteiner algorithm T={algorithms,java,graphics,python} graphics {graphics,python,java} java {algorithms,graphics}

A B E python C {python,java} {algorithms,graphics,java} algorithms D {python}

MST Cost = 1 15 02/13/20 Experiments 16 02/13/20 Dataset DBLP Dataset ( DM, AI, DB, T ) ~6000 authors Skills: keywords appearing in paper titles ~2000 features Social Network: Co-Authorship Graph

Tasks: Subsets of keywords with different cardinality 17 02/13/20 Cardinality of teams 18 02/13/20 Example teams (I) S. Brin, L. Page: The anatomy of a large-scale hypertextual Web search engine Paolo Ferragina, Patrick Valduriez, H. V. Jagadish, Alon Y. Levy, Daniela Florescu, Divesh Srivastava, S. Muthukrishnan

P. Ferragina ,J. Han, H. V.Jagadish, Kevin Chen-Chuan Chang, A. Gulli, S. Muthukrishnan, Laks V. S. Lakshmanan 19 02/13/20 Example teams (II) J. Han, J. Pei, Y. Yin: Mining frequent patterns without candidate generation F. Bonchi A. Gionis, H. Mannila, R. Motwani 20 02/13/20 Extensions

Other measures of effective communication Other practical restrictions Incorporate ability levels 21 02/13/20 Thanks for your attention! 22 02/13/20