Non-hermitian matrices for graph clustering?

Simon Coste - ENS

GDR ISIS, March 2022


Clustering algorithms in a nutshell

Step | Spectral algorithms | DL
1. network representation | Laplacian, adjacency, non-backtracking, Bethe-Hessian… | tensorized representations of the graph
2. low-dim node embedding | eigenvectors, singular vectors | equivariant GNNs
3. clustering on the embedding | K-means, EM | linear classifier
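
A minimal Python sketch of the three steps in the spectral column, assuming an undirected graph given as a SciPy sparse adjacency matrix; the normalized Laplacian, the embedding dimension and K-means are illustrative choices, not a prescription:

import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla
from sklearn.cluster import KMeans

def spectral_clustering(A, k):
    # Step 1: network representation (here: the normalized Laplacian).
    deg = np.asarray(A.sum(axis=1)).ravel()
    d_inv_sqrt = sp.diags(1.0 / np.sqrt(np.maximum(deg, 1)))
    L = sp.eye(A.shape[0]) - d_inv_sqrt @ A @ d_inv_sqrt

    # Step 2: low-dim node embedding (bottom k eigenvectors of L).
    _, vecs = spla.eigsh(L, k=k, which="SM")

    # Step 3: clustering on the embedding.
    return KMeans(n_clusters=k, n_init=10).fit_predict(vecs)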

[Figure: spectra of three representations: adjacency (adj.), Laplacian (lap.), and an asymmetric matrix (asy.)]

First challenge: sparsity

A graph is sparse when n_\mathrm{edges} \approx n_\mathrm{nodes}.

  • classical spectral clustering works when graphs are not too sparse

  • classical spectral clustering doesn’t work when graphs are sparse

Potential reasons

The extreme eigenvalues of normal matrices (and the extreme singular values used by SVD methods) are driven by the largest degrees, and the corresponding eigenvectors localize on high-degree nodes.

Sparse graphs often have very heterogeneous degrees.
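
A small illustration (mine, not from the slides) of this degree effect: plant one hub in a sparse Erdős–Rényi graph and watch the leading adjacency eigenpair latch onto it.

import numpy as np

rng = np.random.default_rng(0)
n, c, hub_degree = 2000, 2.0, 100   # arbitrary demo values

# Sparse symmetric Erdos-Renyi adjacency matrix with average degree ~ c.
mask = rng.random((n, n)) < c / n
A = np.triu(mask, 1).astype(float)
A = A + A.T

# Turn node 0 into a hub of degree ~ hub_degree.
targets = rng.choice(np.arange(1, n), size=hub_degree, replace=False)
A[0, targets] = A[targets, 0] = 1.0

vals, vecs = np.linalg.eigh(A)
v = vecs[:, -1]                        # leading eigenvector
print("top eigenvalue:", vals[-1])     # close to sqrt(hub_degree) = 10
print("mass on the hub:", v[0] ** 2)   # a macroscopic fraction, not ~ 1/n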

Second challenge: directedness

  • many (most?) real-world networks are directed

  • most clustering methods represent digraphs with normal matrices (including SVD-based methods)

Takeaway messages: Asymmetry helps

  1. non-Hermitian matrices have eigenvalues, too

  2. non-Hermitian matrices should not frighten you

[Chen, Cheng, Fan, 18] [Bordenave, C, Nadakuditi, 20] [C, Stephan, 21]

Some spectral theory of sparse directed graphs

A model of directed inhomogeneous random graphs

Parameters: size n very large, connectivity matrix P=(P_{i,j})

A_{i,j} = \mathbf{1}\{\text{there is an edge from node } i \text{ to node } j\}

A_{i,j} = \mathrm{Bernoulli}(P_{i,j}) ~~~~~~~~~~~~~~~ (\mathrm{independent})

  • Sparsity \iff \Vert P \Vert_{\infty} = O(1/n)

  • \mathbf{E}[A] = P

  • If P has a block structure, as in an SBM, this structure is visible in the eigenvectors of P

  • \lambda_i = eigenvalues of P with left/right eigenvectors u_i, v_i.

  • \mu_i = eigenvalues of A with left/right eigenvectors \varphi_i, \psi_i (random).
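
A direct sampler for this model (a sketch; dense P for clarity, real code would exploit sparsity):

import numpy as np

def sample_digraph(P, seed=None):
    # One independent Bernoulli(P_ij) coin flip per ordered pair (i, j).
    rng = np.random.default_rng(seed)
    return (rng.random(P.shape) < P).astype(np.int8)

# Sanity check in the sparse regime ||P||_inf = O(1/n): E[A] = P.
n, d = 300, 5.0
P = np.full((n, n), d / n)
A = sample_digraph(P, seed=0)
print("mean out-degree:", A.sum(axis=1).mean())  # concentrates around d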

Theorem (C, Stephan, 2021+)

t = \sqrt{\Vert P \Vert} ~~~~~~~~~~~~ \text{(the crucial threshold)}

  • if |\lambda_i| > t then:

\mu_i = \lambda_i + o(1) ~~~~~ \text{and} ~~~~~|\langle u_i, \varphi_i\rangle| \to \mathrm{something}_i

  • other eigenvalues of A are smaller than t + o(1).

Usage: estimate u_i from \varphi_i, then recover the block structure.
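
A hedged end-to-end sketch of that usage on a two-block directed SBM; the parameters are mine, chosen so that |\lambda_2| > t, and this is not the slides' experiment:

import numpy as np
import scipy.sparse.linalg as spla
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
n, k, d, eta = 1200, 2, 16.0, 0.1
labels = np.repeat(np.arange(k), n // k)

# Two-block asymmetric connectivity (the k = 2 version of the DSBM below).
B = (d / n) * np.array([[1.0, 1 - eta], [eta, 1.0]])
P = B[labels][:, labels]
A = (rng.random((n, n)) < P).astype(float)

# Left eigenvectors phi_i of A = right eigenvectors of A.T,
# for the k eigenvalues of largest modulus.
mu, phi = spla.eigs(A.T, k=k, which="LM")
emb = np.column_stack([phi.real, phi.imag])   # real embedding
pred = KMeans(n_clusters=k, n_init=10).fit_predict(emb)
overlap = max(np.mean(pred == labels), np.mean(pred != labels))
print("overlap:", overlap)  # well above 1/2 when |lambda_2| > t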

Example

Here \mathbf{E}[A] has 3 eigenvalues 1, 2, 3 and t=\sqrt{\Vert P \Vert} \approx 1.37: only the eigenvalues 2 and 3 exceed the threshold, so exactly two outliers emerge in the spectrum of A.

[Figure: spectrum of A]

Directed Stochastic Block Models

SBM with a block-wise path structure:

P = \frac{d}{n}\begin{bmatrix} 1 & 1-\eta & & & \\ \eta & 1 & 1-\eta & & \\ & \eta & 1 & \ddots & \\ & & \ddots & \ddots & 1-\eta \\ & & & \eta & 1 \end{bmatrix}

Eigenvalues of P and the threshold t=\sqrt{\Vert P \Vert} are explicit in terms of k (number of blocks), d (density) and \eta (asymmetry).
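
For the record, a hedged reconstruction of that formula, assuming k equal-sized blocks of size n/k (the block sizes are not stated above): the nonzero eigenvalues of P are those of \frac{d}{k} T_k, where T_k is the k \times k tridiagonal pattern above, and the classical tridiagonal-Toeplitz formula gives

\lambda_j = \frac{d}{k}\Big(1 + 2\sqrt{\eta(1-\eta)}\,\cos\frac{j\pi}{k+1}\Big), \qquad j = 1,\dots,k.

In particular all eigenvalues are real, and the spread around d/k is largest in the symmetric case \eta = 1/2.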

Green: eigenvalues of an ad hoc skew-Hermitian matrix ([Laenen & Sun 20]).

[Figure: three panels, d = 2, d = 5, d = 10; performance metric = mean adjusted overlap]

Conclusion

Question: best graph representations for GNNs?

  • [Dasoulas, Lutzeyer, Vazirgiannis 21] parametrize their graph matrix representation using a few basic building blocks:
    \alpha_1 D^{\gamma_1} + \alpha_2 D^{\gamma_2}A D^{\gamma_3} + \alpha_3 I

They optimize over the \alpha_i and \gamma_j for a fixed GNN architecture.
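
A minimal sketch of that parametrized representation, assuming a dense NumPy adjacency matrix; the function name and the zero-degree guard are mine, not from the paper's code:

import numpy as np

def parametrised_shift(A, alphas, gammas):
    # alpha_1 D^gamma_1 + alpha_2 D^gamma_2 A D^gamma_3 + alpha_3 I
    a1, a2, a3 = alphas
    g1, g2, g3 = gammas
    deg = np.maximum(A.sum(axis=1), 1e-12)   # avoid 0 to a negative power
    D = lambda g: np.diag(deg ** g)
    return a1 * D(g1) + a2 * D(g2) @ A @ D(g3) + a3 * np.eye(len(A))

# Example: alphas = (0, 1, 0), gammas = (0, -1/2, -1/2) recovers the
# symmetrically normalized adjacency D^{-1/2} A D^{-1/2}.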

What happens if you allow truly non-normal representations, even if the graph is undirected?

Collaborators: C. Bordenave, L. Massoulié, L. Stephan, L. Dall'Amico…