Contents - Index


PURPOSE Partitions network data by splitting blocks based upon the CONvergence of iterated CORrelations (CONCOR).

DESCRIPTION Given an adjacency matrix, or a set of adjacency matrices for different relations, a correlation matrix can be formed by the following procedure.  Form a profile vector for a vertex i by concatenating the ith row in every adjacency matrix;  the i,jth element of the correlation matrix is the Pearson correlation coefficient of the profile vectors of i and j.  This (square, symmetric) matrix is called the first correlation matrix. 

The procedure can be performed iteratively on the correlation matrix until convergence.  Each entry is now 1 or -1.  This matrix is used to split the data into two blocks such that members of the same block are positively correlated, members of different blocks are negatively correlated.

CONCOR uses the above technique to split the initial data into two blocks.  Successive splits are then applied to the separate blocks.  At each iteration all blocks are submitted for analysis, however blocks containing two vertices are not split.  Consequently n-partitions of the binary tree can produce up to 2n blocks.

Note that any similarity matrix can be used as input.

Input dataset:
Name of file containing network to be analyzed. Data type: Multirelational.
Include transpose in calculations?:  (Default = YES).
For non-symmetric data each vertices profile would depend on its out ties only (since we only consider rows).  The in-ties can be considered by adding the transpose of the data matrices as additional relations.

Method of handling diagonal values: (Default = RECIPROCAL)
Choices are:

Reciprocal - In considering adjacency matrix X and  comparing profile of actor i with actor j we replace the comparison of elements xii with xji and xij with xjj by the comparisons xii with xjj and xij with xji respectively.

Ignore - Diagonals are treated as missing values so that the comparisons of xii with xji and xij with xjj are dropped.

Retain - Profile vectors are compared directly element by element, including the xii and xjj elements.
Max depth of splits (not blocks):  (Default =2).
How far down the binary tree splits are to be taken.  A value of n can produce up to 2n blocks.

Convergence criteria: (Default = 0.2).
In practice iterations are not taken to convergence but taken to within a tolerance TOL.  Convergence is accepted on values of 1.0 - TOL and -1.0 + TOL.  Smaller values of TOL increase computation time but create more robust solutions.

Maximum iterations: (Default = 25).
The maximum number of iterations performed on the correlation matrix before terminating through lack of convergence.
Input is corr mat: (Default ='No')
If the input dataset is a correlation matrix already then set to 'yes'.

(Output) Partition dataset:  (Default = 'ConcorCPart').
Name of file which contains partition by actor indicator matrix.The indicator matrix has the same number of rows as specified by the 'Max # of partitions' the number of columns equals the size of the network. The value k in row i column label j means that vertex labeled j is in block k at level i (that is the ith partition).  All other members of block k can be found by simply locating all column labels which correspond to an entry of k in the matrix. This matrix is not displayed in the LOG FILE.

(Output) Permuted dataset:  (Default = 'ConcorCCPerm').
Name of file which contains permuted vertex vector.  Permuted vector is such that vertices in the same block are grouped together. This vector is not displayed in the LOG FILE.

(Output) First correlation matrix:  (Default = 'Concor1stCorr').
Name of file which contains the correlation matrix constructed after the first iteration.

If ticked the use can select to split either block, both blocks or just one block at each level. More comprehensive interaction is available in CONCOR INTERACTIVE.

LOG FILE The correlation matrix constructed during the first iteration.

Blocks represented in terms of a clustering dendrogram. The blocks are given for each level specified in 'Max # of partitions'. The level at which any pair of actors are aggregated is the point at which both can be reached by tracing from the start to the actors from right to left. Hence to find all members of vertex i's block at level k simply locate the value of k on the line connected to i then all actors that can be reached from this point by tracing to the left are in i's block. The diagram can be printed or saved. Parts of the diagram can be viewed by moving the mouse to the split point in a tree diagram or the beginning of a line in the dendrogram and clicking. The first click will highlight a portion of the diagram and the second click will display just the highlighted portion. To return to the original right click on the mouse. There is also a simple zoom facility simply change the values and then press enter. If the labels need to be edited (particularly the scale labels) then you should take the partition indicator matrix into the spreadsheet editor remove or reduce the labels and then submit the edited data to Tools>Dendrogram>Draw. 

Behind the dendrogram is the correlation matrix constructed during the first iteration. Followed by an alternative cluster diagram. Members of the same block are connected by row of X's.  Hence to find all members of vertex i's block at level k simply locate the X in column label i at level k and trace along in both directions until a space is encountered.  All column labels corresponding to the Xs found are members of i's block.  A '' indicates a singleton block.

A blocked adjacency matrix.  The rows and columns of the original adjacency matrix are permuted into blocks.  The adjacency matrix is displayed in terms of the matrix blocks it contains.

The correlation coefficient R-squared of the partitioned data matrix and an ideal structure matrix.  The structure matrix has the same dimension as the data matrix but each cell in a block is set to the average value of the corresponding block in the data matrix.

TIMING Each iteration is O(N^3).

COMMENTS The algorithm splits every non-trivial block at every level.  The user may wish to reject a split at some level - since the history of all splits are given it is a simple matter to recombine clusters if the user so wishes. If the user wishes to control the splits at each level then the user should use the interactive version Networks -> Roles -> Structural Equivalence -> Concor interactive

REFERENCES Breiger R, Boorman S and Arabie P (1975).  An algorithm for clustering relational data, with applications to social network analysis and comparison with multi-dimensional scaling.  Journal of Mathematical Psychology, 12, 328-383.