NETWORK > ROLES & POSITIONS > STRUCTURAL EQUIVALENCE > CONCOR INTERACTIVE

PURPOSE Partitions network data by splitting blocks based upon the CONvergence of iterated CORrelations (CONCOR) with user control of the splits.

DESCRIPTION Given an adjacency matrix, or a set of adjacency matrices for different relations, a correlation matrix can be formed by the following procedure. Form a profile vector for a vertex i by concatenating the ith row of every adjacency matrix; the (i,j)th element of the correlation matrix is the Pearson correlation coefficient of the profile vectors of i and j. This (square, symmetric) matrix is called the first correlation matrix.
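As an illustration only, the construction of the first correlation matrix can be sketched in Python with NumPy. The function name and the example data are hypothetical, and the sketch ignores the diagonal-handling and transpose options described under PARAMETERS.

```python
import numpy as np

def first_correlation_matrix(relations):
    """Pearson correlations between vertex profile vectors.

    relations: list of square (n x n) adjacency matrices, one per relation.
    """
    # Profile of vertex i = ith row of every matrix, concatenated.
    profiles = np.hstack([np.asarray(x, dtype=float) for x in relations])
    # np.corrcoef treats each row as one variable, giving an n x n matrix.
    return np.corrcoef(profiles)

# Hypothetical 4-vertex network with a single relation.
x = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [0, 0, 0, 1],
              [0, 0, 1, 0]])
c = first_correlation_matrix([x])
```

A vertex whose profile is constant (for example an isolate with no ties at all) has zero variance, so its correlations are undefined and appear as NaN in this sketch.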

The procedure can then be repeated on the correlation matrix itself, iteratively, until convergence, at which point each entry is 1 or -1. This matrix is used to split the data into two blocks such that members of the same block are positively correlated and members of different blocks are negatively correlated.
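A minimal sketch of one such split is given here, assuming the iteration stops once every entry exceeds a tolerance in absolute value (the values 0.8 and 25 are the defaults listed under PARAMETERS). The function name and the example similarity matrix are hypothetical, and this is not the program's own code.

```python
import numpy as np

def concor_split(corr, tol=0.8, max_iter=25):
    """Iterate correlations of a similarity matrix and split into two blocks."""
    m = np.asarray(corr, dtype=float)
    for _ in range(max_iter):
        if np.all(np.abs(m) > tol):
            break                      # every entry is close to +1 or -1
        m = np.corrcoef(m)             # correlate the rows of the previous matrix
    # Vertices positively correlated with the first vertex form one block,
    # the rest form the other.
    same_as_first = m[0] > 0
    return np.flatnonzero(same_as_first), np.flatnonzero(~same_as_first)

# Hypothetical similarity matrix with two clear groups: {0, 1} and {2, 3}.
s = np.array([[ 1.0,  0.8, -0.5, -0.6],
              [ 0.8,  1.0, -0.4, -0.7],
              [-0.5, -0.4,  1.0,  0.9],
              [-0.6, -0.7,  0.9,  1.0]])
block_a, block_b = concor_split(s)
```

Splitting on the sign of one row works because, at convergence, every row has the same pattern of signs up to an overall change of sign.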

CONCOR uses the above technique to split the initial data into two blocks.  Successive splits are then applied to the separate blocks and are controlled by the user.

Note that any similarity matrix can be used as input and that the data must be loaded before any of the parameters can be changed or any options selected.

On selecting the interactive option a window opens, and the parameters open, split, unsplit, densities, save and options are listed along the top of the window. These are described in the parameter section below.

Once an input file has been selected, click on the load button; in the "select a block of nodes" window the term "whole network" will be highlighted in blue. The blue highlight indicates which block is active and can be split; initially this can only be the whole network. The "log" window will state that the network is loaded and will give the number of nodes and the number of relations the loaded network contains. The user is now in a position to start splitting the data. Clicking split will partition the network into two groups, which will be labelled 0.1 and 0.2. Clicking on either will make it active, and it will again be highlighted in blue. The log window will give the name of the highlighted block, the number of nodes it contains and a list of the nodes in the block.

Active blocks can be split further into two blocks, with each subsplit labelled with a 1 or 2 and a dot separating the splits. Each block is therefore given a unique number which describes how it was arrived at. For example, a block labelled 0.1.2.2 was obtained by splitting the original data three times: the first 1 shows that the first block was taken from the first split; this block was split again and the second block was taken; that block was split a final time, and the label refers to the second of the resulting blocks. The history of the splitting tree is given in the "select a block of nodes" window, and it can be compressed or expanded by clicking on the + or - in the same way as selecting folders in Windows. The number of blocks the data has been split into is displayed in the centre.
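The labelling scheme can be read mechanically. The following sketch decodes a label into its split history; the function name is hypothetical, and it assumes that the leading 0 stands for the whole network.

```python
def describe_block_label(label):
    """Decode a block label such as '0.1.2.2' into its split history."""
    parts = label.split(".")
    # Assumption: labels start with 0 (the whole network) followed by 1s and 2s.
    if parts[0] != "0" or any(p not in ("1", "2") for p in parts[1:]):
        raise ValueError(f"not a valid block label: {label!r}")
    choices = [int(p) for p in parts[1:]]
    return {
        "depth": len(choices),      # number of splits applied to the data
        "path": choices,            # which half was taken at each split
        "parent": ".".join(parts[:-1]) if choices else None,
    }

info = describe_block_label("0.1.2.2")
# info["depth"] is 3, info["path"] is [1, 2, 2], info["parent"] is "0.1.2"
```

The parent label is the block to highlight in the tree in order to undo the last split that produced the block.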

Blocks that have been split can be rejoined by highlighting an antecedent block in the tree and clicking unsplit.

At any point the user can click the density button to give an image matrix of the blocking that contains either the densities of the blocks or a frequency count (actually the sum of all the block entries). This is reported together with R-squared, the squared correlation coefficient between the partitioned data matrix and an ideal structure matrix. The structure matrix has the same dimension as the data matrix, but each cell in a block is set to the average value of the corresponding block in the data matrix. This image matrix can be saved.
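The density output can be sketched as follows for a single relation. The function name and the data are hypothetical, and the sketch includes the diagonal cells when averaging a block, which is a simplifying assumption rather than a statement of how the program treats the diagonal.

```python
import numpy as np

def block_image(x, partition):
    """Block densities, block sums and R-squared against the structure matrix."""
    x = np.asarray(x, dtype=float)
    partition = np.asarray(partition)
    labels = np.unique(partition)
    k = len(labels)
    densities = np.zeros((k, k))
    sums = np.zeros((k, k))
    structure = np.zeros_like(x)
    for a, row_label in enumerate(labels):
        rows = partition == row_label
        for b, col_label in enumerate(labels):
            cols = partition == col_label
            block = x[np.ix_(rows, cols)]
            sums[a, b] = block.sum()
            densities[a, b] = block.mean()
            # Every cell of the block is set to the block average.
            structure[np.ix_(rows, cols)] = block.mean()
    r = np.corrcoef(x.ravel(), structure.ravel())[0, 1]
    return densities, sums, r ** 2

# Hypothetical network: two reciprocated pairs, blocked as (1 1 2 2)'.
x = [[0, 1, 0, 0],
     [1, 0, 0, 0],
     [0, 0, 0, 1],
     [0, 0, 1, 0]]
densities, sums, r_squared = block_image(x, [1, 1, 2, 2])
```

In this example both diagonal blocks have density 0.5 and both off-diagonal blocks have density 0, and R-squared works out to one third.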

PARAMETERS 
  Open, split, unsplit and densities have the same function as the buttons described above.

Save: Clicking save opens a window that asks for the output dataset in which the results will be stored. This is the same as the output partition.

Input multirelational network:
Name of file containing network to be analyzed. Data type: Multirelational.

Output Partition: (Default = '<inputfilename>-ccpart')
Name of dataset which contains a partition indicator vector. This vector has the form (k1,k2,...ki,...)' where ki assigns vertex i to faction ki, so that (1 1 2 1 2)' assigns vertices 1, 2 and 4 to faction 1 and vertices 3 and 5 to faction 2. This vector is not displayed in the LOG FILE.
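For illustration, a partition indicator vector can be turned back into lists of vertices as follows (the function name is hypothetical; vertices are numbered from 1, as in the example above).

```python
def blocks_from_partition(partition):
    """Group vertex numbers (starting at 1) by their partition indicator."""
    blocks = {}
    for vertex, k in enumerate(partition, start=1):
        blocks.setdefault(k, []).append(vertex)
    return blocks

members = blocks_from_partition([1, 1, 2, 1, 2])
# members == {1: [1, 2, 4], 2: [3, 5]}
```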
  Options 
Handling ties to equivalents (Default = Reciprocal single count)
 
Choices are:
  Ignore - Diagonals are treated as missing values so that the comparisons of xii with xji and xij with xjj are dropped.
Retain single count - Profile vectors are compared directly element by element, including the xii and xjj elements. If the transposes are included, these comparisons would occur twice, so the second occurrence is dropped.
Retain double count - Profile vectors are compared directly element by element, including the xii and xjj elements. If the transposes are included, these comparisons are counted twice.
Reciprocal single count - When comparing the profile of actor i with that of actor j in adjacency matrix X, the comparisons of xii with xji and xij with xjj are replaced by the comparisons of xii with xjj and xij with xji respectively. If the transposes are included, these comparisons would occur twice, so the second occurrence is dropped.
Reciprocal double count - When comparing the profile of actor i with that of actor j in adjacency matrix X, the comparisons of xii with xji and xij with xjj are replaced by the comparisons of xii with xjj and xij with xji respectively. If the transposes are included, these comparisons are counted twice.
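The difference between the Ignore, Retain and Reciprocal choices can be made concrete by listing the pairs of elements that are compared for two actors i and j. The sketch below covers a single adjacency matrix and rows only, so the single versus double count distinction (which arises only when transposes are included) is not modelled. The function name, the mode names and the data are hypothetical.

```python
def comparison_pairs(x, i, j, mode="reciprocal"):
    """Pairs (element for actor i, element for actor j) entering the correlation."""
    n = len(x)
    # Ties to third parties k are always compared directly: xik with xjk.
    pairs = [(x[i][k], x[j][k]) for k in range(n) if k not in (i, j)]
    if mode == "retain":
        # Direct element-by-element comparison: xii with xji, xij with xjj.
        pairs += [(x[i][i], x[j][i]), (x[i][j], x[j][j])]
    elif mode == "reciprocal":
        # Swapped comparison: xii with xjj, xij with xji.
        pairs += [(x[i][i], x[j][j]), (x[i][j], x[j][i])]
    elif mode != "ignore":
        raise ValueError(f"unknown mode: {mode!r}")
    return pairs

# Hypothetical valued matrix with distinctive diagonal entries.
x = [[9, 1, 0],
     [1, 7, 0],
     [0, 0, 5]]
ignore = comparison_pairs(x, 0, 1, "ignore")          # [(0, 0)]
retain = comparison_pairs(x, 0, 1, "retain")          # [(0, 0), (9, 1), (1, 7)]
reciprocal = comparison_pairs(x, 0, 1, "reciprocal")  # [(0, 0), (9, 7), (1, 1)]
```

Under the reciprocal choice the tie from i to j is compared with the tie from j to i, so two actors with a mutual tie are not penalised for it.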

  Include transposes:  (Default = YES).
For non-symmetric data each vertex's profile would depend on its out-ties only (since we only consider rows). The in-ties can be considered by adding the transposes of the data matrices as additional relations.
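A minimal sketch of this option, using NumPy and a hypothetical function name, appends the transpose of each relation before the rows are concatenated into profiles:

```python
import numpy as np

def profiles_with_transposes(relations, include_transposes=True):
    """Row profiles built from each relation and, optionally, its transpose."""
    mats = [np.asarray(x, dtype=float) for x in relations]
    if include_transposes:
        # Row i of the transpose holds the in-ties of vertex i.
        mats = mats + [m.T for m in mats]
    return np.hstack(mats)

# Hypothetical directed chain 1 -> 2 -> 3.
x = [[0, 1, 0],
     [0, 0, 1],
     [0, 0, 0]]
profiles = profiles_with_transposes([x])
# Vertex 2 (row index 1): out-ties (0 0 1) followed by in-ties (1 0 0).
```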

Use geodesic distance?:  (Default = YES).
For binary matrices, convert to a geodesic distance matrix before calculating correlations.
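Geodesic distances in a binary network can be found by a breadth-first search from each vertex, as in the sketch below. The function name is hypothetical, and unreachable pairs are left as None because the value the program substitutes for them is not stated here.

```python
from collections import deque

def geodesic_distances(adj):
    """Shortest path lengths in a binary adjacency matrix (None if unreachable)."""
    n = len(adj)
    dist = [[None] * n for _ in range(n)]
    for source in range(n):
        dist[source][source] = 0
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in range(n):
                # Follow each tie u -> v the first time v is reached.
                if adj[u][v] and dist[source][v] is None:
                    dist[source][v] = dist[source][u] + 1
                    queue.append(v)
    return dist

# Hypothetical directed chain 1 -> 2 -> 3.
adj = [[0, 1, 0],
       [0, 0, 1],
       [0, 0, 0]]
d = geodesic_distances(adj)
# d[0] == [0, 1, 2]; the last vertex reaches nobody, so d[2] == [None, None, 0]
```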

Convergence criteria: (Default = 0.8).
In practice iterations are not taken to convergence but to within a tolerance TOL, which is greater than zero and less than one. Convergence is accepted when all values are either less than -TOL or greater than +TOL. Larger values of TOL increase computation time but create more robust solutions.

Max iterations: (Default = 25).
The maximum number of iterations performed on the correlation matrix before terminating through lack of convergence.
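The interplay of the tolerance and the iteration limit can be sketched as follows. The function name and the example matrix are hypothetical; the function counts how many times the correlation step is applied before every entry lies beyond -TOL or +TOL, and returns None if the limit is reached first.

```python
import numpy as np

def iterations_to_converge(corr, tol=0.8, max_iter=25):
    """Number of correlation steps until all |entries| exceed tol, else None."""
    m = np.asarray(corr, dtype=float)
    iterations = 0
    while not np.all(np.abs(m) > tol):
        if iterations == max_iter:
            return None            # terminated through lack of convergence
        m = np.corrcoef(m)
        iterations += 1
    return iterations

# Hypothetical similarity matrix with two clear groups.
s = np.array([[ 1.0,  0.8, -0.5, -0.6],
              [ 0.8,  1.0, -0.4, -0.7],
              [-0.5, -0.4,  1.0,  0.9],
              [-0.6, -0.7,  0.9,  1.0]])
loose = iterations_to_converge(s, tol=0.5)
strict = iterations_to_converge(s, tol=0.99)
```

A stricter tolerance can never need fewer steps than a looser one on the same data, which is the source of the extra computation time.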
 
Output Partition (suffix):  (Default = '-CCpart').
Suffix which is added to the inputfilename to create the output partition matrix.

Output Density (suffix):  (Default = '-CCden').
Suffix which is added to the inputfilename to create the output density matrix.

Output 1st corr Matrix (suffix):  (Default = '-Ccorr').
Suffix which is added to the inputfilename to create the output correlation matrix constructed after the first iteration.


LOG FILE No logfile is created.

TIMING Each iteration is O(N^3).

COMMENTS None

REFERENCES Breiger R, Boorman S and Arabie P (1975).  An algorithm for clustering relational data, with applications to social network analysis and comparison with multi-dimensional scaling.  Journal of Mathematical Psychology, 12, 328-383.