NETWORK > ROLES & POSITIONS > STRUCTURAL > PROFILE

PURPOSE Computes measures of structural equivalence based on comparisons of the rows and columns of data matrices, and forms clusters from the results.

DESCRIPTION The profile of an actor is the row vector corresponding to the actor in the adjacency matrix. Multiple relations are permitted, in which case the profile vector is the concatenation of the actor's profile vectors from each individual relation. The matrix can be real-valued or binary.
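As an illustration of the concatenation described above, here is a minimal Python sketch. The function name `profile` and the two example relations are hypothetical and are not part of the program; matrices are plain lists of rows.

```python
def profile(relations, actor):
    """Concatenate the actor's row from each relation into one profile vector."""
    vec = []
    for matrix in relations:
        vec.extend(matrix[actor])  # append this relation's row for the actor
    return vec


# Two hypothetical relations on the same three actors.
friendship = [[0, 1, 1],
              [1, 0, 0],
              [1, 0, 0]]
advice = [[0, 0, 2],
          [3, 0, 0],
          [0, 1, 0]]

print(profile([friendship, advice], 0))  # [0, 1, 1, 0, 0, 2]
```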

Structurally equivalent actors have the same profile except for the diagonal entries of the adjacency matrix. This routine compares the profile vectors of all pairs of actors and hence computes a measure of profile similarity. Similarity can be measured using Euclidean distance, Pearson correlation, exact matches, or matches of positive entries only. Euclidean distance produces a distance matrix; all the other options produce a similarity matrix. This matrix is then analyzed by single-link hierarchical clustering.
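The pairwise comparison can be sketched as follows, assuming a single relation, Euclidean distance, transposes included, and diagonal-related entries ignored. The function name `euclidean_se_matrix` is hypothetical, and this is an illustration of the idea rather than the routine's actual implementation.

```python
import math


def euclidean_se_matrix(x):
    """Actor-by-actor Euclidean distance between row-and-column profiles."""
    n = len(x)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            total = 0.0
            for k in range(n):
                if k == i or k == j:
                    continue  # skip entries involving the diagonal
                total += (x[i][k] - x[j][k]) ** 2  # compare rows i and j
                total += (x[k][i] - x[k][j]) ** 2  # compare columns i and j
            d[i][j] = d[j][i] = math.sqrt(total)
    return d


# Actors 0 and 1 have identical ties to actors 2 and 3, and vice versa.
x = [[0, 0, 1, 1],
     [0, 0, 1, 1],
     [1, 1, 0, 0],
     [1, 1, 0, 0]]
d = euclidean_se_matrix(x)
print(d[0][1], d[0][2])  # 0.0 2.0
```

A distance of zero between two actors indicates that they are structurally equivalent under these settings.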

PARAMETERS
Input dataset:
Name of file containing network to be analyzed. Data type: Multirelational.

Measure of profile similarity/distance: (Default = EUCLIDEAN DISTANCE).
Choices are:

Euclidean Distance - The distance between the vectors in n-dimensional space, i.e. the root of the sum of squared differences.

Correlation - Pearson product-moment correlation coefficient for every pair of profiles.

Matches - Proportion of exact matches between all pairs of profiles.

Positive Matches - Proportion of exact matches in which at least one element is positive, between all pairs of profiles.
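The four measures can be sketched for a single pair of profile vectors as below. The function names are hypothetical. The reading of Positive Matches used here (only positions where at least one of the two elements is positive are counted) is an assumption based on the wording above, as is returning NaN when a measure is undefined.

```python
import math


def euclidean(a, b):
    """Root of the sum of squared differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def correlation(a, b):
    """Pearson product-moment correlation; NaN if either vector is constant."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0 or vb == 0:
        return float("nan")
    return cov / math.sqrt(va * vb)


def matches(a, b):
    """Proportion of positions where the two profiles are identical."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def positive_matches(a, b):
    """Proportion of matches among positions where either element is positive
    (assumed interpretation)."""
    considered = [(x, y) for x, y in zip(a, b) if x > 0 or y > 0]
    if not considered:
        return float("nan")
    return sum(x == y for x, y in considered) / len(considered)


a = [1, 0, 1, 0]
b = [1, 1, 0, 0]
print(euclidean(a, b), correlation(a, b), matches(a, b), positive_matches(a, b))
```

For these two vectors the profiles agree in two of four positions, but in only one of the three positions where at least one element is positive.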

Method of handling diagonal values: (Default = IGNORE)
Choices are:

Ignore - Diagonals are treated as missing values so that the comparisons of xii with xji and xij with xjj are dropped.

Retain (single count) - Profile vectors are compared directly, element by element, including the xii and xjj elements. When transposes are included, the second occurrence of the same pairing is ignored.

Retain (double count) - Profile vectors are compared directly, element by element, including the xii and xjj elements. If transposes are included, the second occurrence of these pairings is still included.

Reciprocal (single count) - For adjacency matrix X, when comparing the profile of actor i with that of actor j, the comparisons of xii with xji and xij with xjj are replaced by the comparisons of xii with xjj and xij with xji respectively. If transposes are included, the second occurrence of these pairings is ignored.

Reciprocal (double count) - For adjacency matrix X, when comparing the profile of actor i with that of actor j, the comparisons of xii with xji and xij with xjj are replaced by the comparisons of xii with xjj and xij with xji respectively. If transposes are included, the second occurrence of these pairings is still included.
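The difference between the Ignore, Retain and Reciprocal options can be sketched by listing which element pairs are compared for the rows of actors i and j. The function `row_pairs` and its `diagonal` argument are hypothetical; the sketch covers rows only, so it does not model the single count versus double count distinction, which arises only when transposes are included.

```python
def row_pairs(x, i, j, diagonal="ignore"):
    """Return the (x[i][k], x[j][k]) pairs compared for rows i and j."""
    pairs = []
    for k in range(len(x)):
        if k == i or k == j:
            if diagonal == "ignore":
                continue  # treat diagonal-related entries as missing
            if diagonal == "reciprocal":
                if k == i:
                    pairs.append((x[i][i], x[j][j]))  # xii with xjj
                else:
                    pairs.append((x[i][j], x[j][i]))  # xij with xji
                continue
        pairs.append((x[i][k], x[j][k]))  # direct element-by-element pairing
    return pairs


# Actors 0 and 1 choose each other and both choose actor 2.
x = [[0, 1, 1],
     [1, 0, 1],
     [0, 0, 0]]
print(row_pairs(x, 0, 1, "ignore"))      # [(1, 1)]
print(row_pairs(x, 0, 1, "retain"))      # [(0, 1), (1, 0), (1, 1)]
print(row_pairs(x, 0, 1, "reciprocal"))  # [(0, 0), (1, 1), (1, 1)]
```

In this example the mutual tie between actors 0 and 1 counts as a mismatch under Retain but as a match under Reciprocal.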

Include transpose in calculations?: (Default = YES).
Including transposes means that each profile consists of both the actor's row and the actor's column. This is not necessary for symmetric data, where rows and columns are identical.
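A small sketch of why transposes matter for asymmetric data: two actors can send identical ties yet receive different ones. The function name `row_and_column_profile` is hypothetical, and the ordering (row first, then column) is an assumption made for illustration.

```python
def row_and_column_profile(x, actor):
    """Profile made of the actor's row (ties sent) then column (ties received)."""
    row = list(x[actor])
    column = [x[k][actor] for k in range(len(x))]
    return row + column


# Actors 0 and 1 both choose actor 2, but only actor 0 is chosen by actor 2.
x = [[0, 0, 1],
     [0, 0, 1],
     [1, 0, 0]]
print(x[0] == x[1])                  # True: rows alone are identical
print(row_and_column_profile(x, 0))  # [0, 0, 1, 0, 0, 1]
print(row_and_column_profile(x, 1))  # [0, 0, 1, 0, 0, 0]
```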

For binary data: convert to geodesic distances: (Default = NO).
Converts binary data to geodesic distances before performing the analysis.
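A geodesic distance is the length of the shortest path between two actors. The conversion can be sketched with a breadth-first search from each actor, as below. The function name `geodesic_matrix` is hypothetical, and the value used for unreachable pairs (`None` by default here) is an assumption, since the text above does not say how the program encodes them.

```python
from collections import deque


def geodesic_matrix(adj, unreachable=None):
    """Shortest directed path lengths between all pairs in a binary matrix."""
    n = len(adj)
    result = []
    for source in range(n):
        dist = [unreachable] * n
        dist[source] = 0
        seen = {source}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in range(n):
                if adj[u][v] and v not in seen:
                    seen.add(v)
                    dist[v] = dist[u] + 1  # one step further than u
                    queue.append(v)
        result.append(dist)
    return result


# Directed path 0 -> 1 -> 2.
adj = [[0, 1, 0],
       [0, 0, 1],
       [0, 0, 0]]
print(geodesic_matrix(adj))
# [[0, 1, 2], [None, 0, 1], [None, None, 0]]
```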

Diagram Type: (Default = 'Dendrogram')
The clustering diagram can either be a Tree Diagram or a Dendrogram.

(Output) Equivalence matrix: (Default = 'SE').
Name of data file containing actor by actor equivalence matrix.

(Output) Partition dataset: (Default = 'SEPart').
Name of data file containing partition indicator matrices derived from single link hierarchical clustering. A value of k in row labeled x and column j means that actor j is in partition k at level x. Actor k is always a member of partition k, and is a representative label for the group. This matrix is not displayed in the LOG FILE.
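The layout of the partition indicator matrix can be sketched as follows for a distance matrix. At each level, single-link clusters are the connected components formed by pairs whose distance does not exceed that level. The function names are hypothetical, and labeling each partition by its lowest-numbered member (1-based) is an assumption; the text above only states that actor k is a member of partition k. For a similarity matrix the comparison would be reversed.

```python
def find(parent, a):
    """Return the representative of a's cluster, compressing the path."""
    while parent[a] != a:
        parent[a] = parent[parent[a]]
        a = parent[a]
    return a


def single_link_partitions(dist):
    """Map each level to a row of partition labels, one label per actor."""
    n = len(dist)
    levels = sorted({dist[i][j] for i in range(n) for j in range(i + 1, n)})
    rows = {}
    for level in levels:
        parent = list(range(n))
        for i in range(n):
            for j in range(i + 1, n):
                if dist[i][j] <= level:
                    ri, rj = find(parent, i), find(parent, j)
                    if ri != rj:
                        # Keep the lowest-numbered actor as representative.
                        parent[max(ri, rj)] = min(ri, rj)
        row = [find(parent, a) + 1 for a in range(n)]
        rows[level] = row
        if len(set(row)) == 1:
            break  # everyone is in one cluster; higher levels add nothing
    return rows


dist = [[0, 1, 4, 5],
        [1, 0, 3, 5],
        [4, 3, 0, 2],
        [5, 5, 2, 0]]
for level, row in single_link_partitions(dist).items():
    print(level, row)
# 1 [1, 1, 3, 4]
# 2 [1, 1, 3, 3]
# 3 [1, 1, 1, 1]
```

Reading the row for level 2, actors 1 and 2 are in partition 1 and actors 3 and 4 are in partition 3.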

LOG FILE Single-link hierarchical clustering dendrogram (or tree diagram) of the structural equivalence matrix. The level at which any pair of actors is aggregated is the point at which both can be reached by tracing from the start to the actors from right to left. The diagram can be printed or saved. Parts of the diagram can be viewed by moving the mouse to the split point in a tree diagram, or to the beginning of a line in the dendrogram, and clicking. The first click highlights a portion of the diagram and the second click displays just the highlighted portion. To return to the original diagram, right-click the mouse. There is also a simple zoom facility: change the values and then press Enter. If the labels need to be edited (particularly the scale labels), take the partition indicator matrix into the spreadsheet editor, remove or shorten the labels, and then submit the edited data to Tools>Dendrogram>Draw.

Behind the plot is the actor by actor structural equivalence matrix. This is followed by an alternative clustering diagram representing the same information as above. The columns are rearranged and labeled. A '·' in the column labeled j at level x means that actor j is not in any cluster at level x. An 'x' indicates that actor j is in a cluster at this level, together with those actors that can be reached by tracing across that row without encountering a space.

TIMING O(N^2).

COMMENTS None.

REFERENCES Burt R (1976). Positions in Networks. Social Forces, 55, 93-122.