admmDensestSubmatrix: ADMM for the densest submtarix problem

Brendan Ames, Polina Bombina

2019-10-28

Introduction

This is the R-package accompanying the paper (Convex optimization for the densest subgraph and densest submatrix problems.

The problem of identifying a dense submatrix is a fundamental problem in the analysis of matrix structure and complex networks. This package provides tools for identifying the densest submatrix of a given graph using first-order optimization methods.

See the tutorials below to get started.

The densest submatrix problem

Let \([M] = \{1,2,\dots, M\}\) for each positive integer \(M\). Given a matrix \(\mathbf{A} \in R^{M\times N}\), the densest \(m\times n\)-submatrix problem seeks subsets \(\bar U \subseteq {[M]}\) and \(\bar V \subseteq {[N]}\) of cardinality \(|\bar U|=m\) and \(|\bar V| = n\), respectively, such that the submatrix \(\mathbf{A}{[\bar U, \bar V]}\) with rows index by \(\bar U\) and columns indexed by \(\bar V\) contains the maximum number of nonzero entries. That is, the densest \(m\times n\)-submatrix problem seeks the densest \(m\times n\)-submatrix of \(\mathbf{A}\).

The densest \(m\times n\)-submatrix problem can be formulated as:

\[\begin{equation} \min_{\mathbf{X}, \mathbf{Y} \in { \{0,1\}}^{M\times N} } {\operatorname{Tr}(\mathbf{Y} \mathbf{e} \mathbf{e}^T): \mathrm{P}_{\Omega}(\mathbf{X}-\mathbf{Y}) = \mathbf{0}, \operatorname{Tr}(\mathbf{X} \mathbf{e} \mathbf{e}^T) = mn, \operatorname{rank}(\mathbf{X}) = 1 }, \end{equation}\]

where

Unfortunately, optimization problems involving rank and binary constraints are intractable in general.

Relaxing the rank constraint with a nuclear norm penalty term, \(\|\mathbf{X} \|_* = \sum_{i=1}^N \sigma_i(\mathbf{X})\), i.e., the sum of the singular values of matrix, and the binary constraints with box constraints yields the convex problem:

\[\begin{align} \min \; & \|\mathbf{X} \|_* + \gamma \operatorname{Tr}(\mathbf{Y} \mathbf{e} \mathbf{e}^T) \\ \operatorname{s.t.}\; & \operatorname{Tr}(\mathbf{X} \mathbf{e} \mathbf{e}^T) = mn, \\ & \mathrm{P}_\Omega(\mathbf{X} - \mathbf{Y}) = \mathbf{0}, \\ & \mathbf{Y} \ge \mathbf{0}, \\ & \mathbf{0} \le \mathbf{X} \le \mathbf{e} \mathbf{e}^T, \end{align}\] where \(\gamma >0\) is a regularization parameter chosen to tune between the two objectives.

It can be shown that the relaxation is exact when binary matrix \(\mathbf{A}\) contains a single, relatively large dense \(m\times n\) block. For more information, see (Convex optimization for the densest subgraph and densest submatrix problems

Alternating Direction Method of multipliers for densest submatrix problem

The alternating direction method of multipliers (ADMM) has been succesfully used in a broad spectrum of applications. The ADMM solves convex optimization problems with composite objective functions subject to equality constraints.

We direct the reader to Prof. Stephen Boyd’s website (ADMM) for a more thorough discussion of the ADMM.

To apply ADMM to our problem, we introduce artificial variables \(\mathbf{Q}\), \(\mathbf{W}\) and \(\mathbf{Z}\) to obtain the equivalent optimization problem:

\[\begin{align} \min \; & \|\mathbf{X} \|_* + \gamma \|\mathbf{Y} \|_1 +{1}_{\Omega_Q}(\mathbf{Q})+{1}_{\Omega_W}(\mathbf{W})+{1}_{\Omega_Z}(\mathbf{Z})\\ \operatorname{s.t.}\; & \mathbf{X}-\mathbf{Y}=\mathbf{Q},\mathbf{X}=\mathbf{W}, \mathbf{X}=\mathbf{Z} \end{align}\]

where

Here \({1}_{S}: R^{M\times M} \rightarrow \left \{0,+\infty \right \}\) is the indicator function of the set \(S \subseteq R^{M\times N}\), such that \({1}_S(\mathbf{X})=0\) if \(\mathbf{X}\in S\), and \(+\infty\) otherwise.

Since our objective function is separable, we iteratively solve this optimization program using the ADMM. The basic idea is to rotate through \(3\) steps:

  1. minimize the augmented Lagrangian over primal variables,
  2. update dual variables usng the updated primal variables,
  3. calculate primal and dual residuals.

Interested readers are referred to (Convex optimization for the densest subgraph and densest submatrix problems. We include a summary of the algorithm below.

Examples

We test this package on two different types of data: first, using random matrices sampled from the planted dense \(m \times n\) submtarix model and, second, real-world collaboration and communication networks.

Random matrices

We first generate a random matrix with noise obscuring the planted submatrix using the function plantedsubmatrix. and then call the function densub to recover the planted submatrix.

After generating the structure random containing the random matrix with desired planted structure, we can visually represent the matrix and planted submatrix as two-tone images, where dark pixels correspond to nonzero entries, and light pixels correspond to zero entries, using the following commands.

Tne vizualization of the randomly generated matrix \(\mathbf{A}\) helps us to understand its structure. It is clear that \(\mathbf{A}\) contains a dense \(50 \times 40\) block (in the bottom left corner).

Visual representation of randomly generated \mathbf{A}

Visual representation of randomly generated \(\mathbf{A}\)

We can remove all noise and isolate an image of a rank-one matrix \(\mathbf{X0}\) with \(mn\) nonzero entries.

Visual representation of dense submatrix

Visual representation of dense submatrix

Then we vizualize matrix \(\mathbf{Y0}\) to see the number of disagreements between original matrix \(\mathbf{A}\) and \(\mathbf{X0}\).

Disagreement between \mathbf{A} and \mathbf{X_0}

Disagreement between \(\mathbf{A}\) and \(\mathbf{X_0}\)

We call the ADMM solver and visualize the output using the following commands.

The ADMM solver returns the optimal solutions \(\mathbf{X}\) and \(\mathbf{Y}\). It must be noted that matrices \(\mathbf X\) and \(\mathbf Y\) are identical to the actual structures of \(\mathbf{X_0}\) and \(\mathbf{Y_0}\). The planted submatrix is recovered.

Optimal solution

Optimal solution

Optimal Solution

Optimal Solution

Collaboration Network

The following is a simple example on how one could use the package to analyze the collaboration network found in the JAZZ dataset. It is known that this network contains a cluster of \(100\) musicians which performed together.

JAZZ Network

JAZZ Network

We have already prepared dataset to work with. More details can be found in the provided file JAZZ_IN_R.R.

Our algorithm converges to the dense submatrix representing the community of \(100\) musicians after \(50\) iterations.