COMMUNITIES AND BALANCE IN SIGNED NETWORKS : SPECTRAL APPROACH -Pranay Anchuri*, Malik Magdon Ismail Rensselaer Polytechnic Institute, NY. OUTLINE Introduction Structural Balance Heuristic Spectral Methods Results Conclusion Pranay Anchuri, Malik Magdon Ismail, Rensselaer Polytechnic Institute SIGNED SOCIAL NETWORKS

Pranay Anchuri, Malik Magdon Ismail, Rensselaer Polytechnic Institute Pranay Anchuri, Malik Magdon Ismail, Rensselaer Polytechnic Institute STRUCTURAL BALANCE Stable Notation : Unstable Negative Edge

Network is strongly balanced if all triads are stable. Pranay Anchuri, Malik Magdon Ismail, Rensselaer Polytechnic Institute Positive Edge WEAK STRUCTURAL BALANCE Stable Unstable

Network is weakly balanced if all triads are stable. Pranay Anchuri, Malik Magdon Ismail, Rensselaer Polytechnic Institute COMMUNITIES IN BALANCED NETWORK Pranay Anchuri, Malik Magdon Ismail, Rensselaer Polytechnic Institute Balanced network can be divided so that positive edges lie within communities negative edges between communities.

Real world networks are rarely structurally balanced. Frustration : of edges that disturb the balance. Positive edges between communities + Negative edges within communities. Pranay Anchuri, Malik Magdon Ismail, Rensselaer Polytechnic Institute Number Real world networks are rarely structurally balanced. Frustration : of edges that disturb the balance. Positive edges between communities + Negative

edges within communities. Frustration = 1 Pranay Anchuri, Malik Magdon Ismail, Rensselaer Polytechnic Institute Number Real world networks are rarely structurally balanced. Frustration : of edges that disturb the balance. Positive edges between communities + Negative edges within communities. Frustration = 1 Pranay Anchuri, Malik Magdon Ismail,

Rensselaer Polytechnic Institute Number Real world networks are rarely structurally balanced. Frustration : of edges that disturb the balance. Positive edges between communities + Negative edges within communities. Pranay Anchuri, Malik Magdon Ismail, Rensselaer Polytechnic Institute Number Pranay Anchuri, Malik Magdon Ismail, Rensselaer Polytechnic Institute

Community Detection HEURISTIC Pranay Anchuri, Malik Magdon Ismail, Rensselaer Polytechnic Institute Ignore the negative edges and cluster the remaining nodes. HEURISTIC Pranay Anchuri, Malik Magdon Ismail, Rensselaer Polytechnic Institute HEURISTIC Pranay Anchuri, Malik Magdon Ismail, Rensselaer Polytechnic Institute

Isolated nodes are added in such a way that minimizes the frustration. HEURISTIC Pranay Anchuri, Malik Magdon Ismail, Rensselaer Polytechnic Institute Pranay Anchuri, Malik Magdon Ismail, Rensselaer Polytechnic Institute Spectral Methods MINIMIZING FRUSTRATION Community C divided into C1,C2 Positive

Pranay Anchuri, Malik Magdon Ismail, Rensselaer Polytechnic Institute edges between C1 and C2 increase frustration. Negative edges between C1 and C2 decrease frustration. MINIMIZING FRUSTRATION Community C divided into C1,C2 Positive C2 C1 Frustration = 2 Frustration = 2

Pranay Anchuri, Malik Magdon Ismail, Rensselaer Polytechnic Institute edges between C1 and C2 increase frustration. Negative edges between C1 and C2 decrease frustration. MINIMIZING FRUSTRATION Community C divided into C1,C2 Positive C1 C2 Frustration = 2 Frustration = 1 Pranay Anchuri, Malik Magdon Ismail,

Rensselaer Polytechnic Institute edges between C1 and C2 increase frustration. Negative edges between C1 and C2 decrease frustration. MINIMIZING FRUSTRATION Community C divided into C1,C2 Positive C1 C2 Pranay Anchuri, Malik Magdon Ismail, Rensselaer Polytechnic Institute edges between C1 and C2 increase frustration.

Negative edges between C1 and C2 decrease frustration. MODULARITY Unsigned Modularity : Number Pranay Anchuri, Malik Magdon Ismail, Rensselaer Polytechnic Institute of edges within communities expected number if edges were randomly permuted. Measure of the surprise factor. Higher modularity is better. SIGNED MODULARITY Signed Modularity

Surprise Pranay Anchuri, Malik Magdon Ismail, Rensselaer Polytechnic Institute factor due to positive edges within communities and negative edges between communities. Minimizing Frustration Maximizing Modularity Both objectives reduce to maximizing ST M S Pranay Anchuri, Malik Magdon Ismail, Rensselaer Polytechnic Institute

COMPUTING THE MAXIMUM Maximizing f(M,S) = ST M S Optimum Requires sparse matrix vector multiplication which is efficient. S RRn Rbut Rwe Rneed RS R R{-1,+1}n R!! Pranay Anchuri, Malik Magdon Ismail, Rensselaer Polytechnic Institute S : Eigen vector corresponding to maximum Eigen value of M. Eigen vector can be computed by Power Iteration.

BOOLEAN SOLUTION Rounding : Based Rounding w/ Improvement : Start with an initial Boolean solution and move the nodes one at a time. If there is a sequence of flips such that solution is closer optimum then retain the changes. Complexity : O(N^2). Rounding w/ Partial Improvement: Consider zero.

nodes whose magnitude is close to Pranay Anchuri, Malik Magdon Ismail, Rensselaer Polytechnic Institute on sign of si, si >= 0 1 and -1 o/w. 5 7 4 3 8 0 2

1 Val in Eigen Vector 0 0.55 1 0.27 2 0.25 3 -0.44

4 -0.45 5 -0.33 6 0.10 7 0.13 8 -0.09 Pranay Anchuri, Malik Magdon Ismail,

Rensselaer Polytechnic Institute 6 Nod e 5 7 4 3 8 0 2 1 Val in Eigen

Vector 0 0.55 1 0.27 2 0.25 3 -0.44 4 -0.45

5 -0.33 6 0.10 7 0.13 8 -0.09 Pranay Anchuri, Malik Magdon Ismail, Rensselaer Polytechnic Institute 6

Nod e 5 7 4 3 8 0 2 1 Val in Eigen Vector 0

0.55 1 0.27 2 0.25 3 -0.44 4 -0.45 5

-0.33 6 0.10 7 0.13 8 -0.09 Pranay Anchuri, Malik Magdon Ismail, Rensselaer Polytechnic Institute 6 Nod e

MULTIPLE COMMUNITIES Communities can be further divided Until Change in the objective can be reduced to ST MS Also requires sparse matrix vector multiplication. Pranay Anchuri, Malik Magdon Ismail, Rensselaer Polytechnic Institute frustration cannot be reduced. Modularity cannot be increased.

Pranay Anchuri, Malik Magdon Ismail, Rensselaer Polytechnic Institute Results MODULARITY MAXIMIZATION Algorithm # Communities Largest Frustration (% of ve edges) Epinions.com 15 69802 175.07

Clustering (40 means) 40 59022 195.06 Modularity 15 56754 211.62 Modularity w/ partial improvement

4 117072 100 Slashdot.com Clustering ( 15 means) 15 24460 259.93 Clustering (40 means) 40

21666 288.69 Modularity 14 40378 176.85 Modularity w/ partial improvement 3 60031

141.72 Datasets obtained from http://snap.stanford.edu/ Pranay Anchuri, Malik Magdon Ismail, Rensselaer Polytechnic Institute Clustering ( 15 means) FRUSTRATION MINIMIZATION Algorithm # Communities Largest Frustration ( % of ve edges) Epinions.com

2 74100 58.97 Two Division w/ Partial Improvement 2 73861 54.99 Multiple Division 20 69004

47.65 Multiple Division w/ Partial Improvement 23 68990 43.92 Slashdot.com Two Division 2 57824 66.34

Two Division w/ Partial Improvement 2 57853 64.71 Multiple Division 8 55479 62.52 Multiple Division w/ Partial Improvement

10 57853 60.67 Pranay Anchuri, Malik Magdon Ismail, Rensselaer Polytechnic Institute Two Division STRONG VS WEAK BALANCE Minimum Frustration: = 1 when max # communities =2 = 0 when # communities = 3 ( each node in its own community)

Minimum frustration with multiple communities implies weak balance. Pranay Anchuri, Malik Magdon Ismail, Rensselaer Polytechnic Institute NEGATIVE INCIDENT RATIO Pranay Anchuri, Malik Magdon Ismail, Rensselaer Polytechnic Institute NIR = 3/2 CONCLUSION Spectral algorithm to detect communities in signed communities. Functions : Minimizing frustration,

Maximizing frustration. Careful assignment of nodes leads to better communities. Structural balance (strong and weak) affects the communities detected. Pranay Anchuri, Malik Magdon Ismail, Rensselaer Polytechnic Institute Objective Questions ? Pranay Anchuri, Malik Magdon Ismail, Rensselaer Polytechnic Institute Thank You