Author/Authors :
Loet Leydesdorff (author), Liwen Vaughan (author)
Abstract :
Co-occurrence matrices, such as cocitation, coword,
and colink matrices, have been used widely in the information
sciences. However, confusion and controversy
have hindered the proper statistical analysis of these
data. The underlying problem, in our opinion, involved
understanding the nature of various types of matrices.
This article discusses the difference between a symmetrical
cocitation matrix and an asymmetrical citation matrix
as well as the statistical techniques that are appropriate
to each of these matrices.
Similarity measures (such as the Pearson correlation
coefficient or the cosine) should not be applied to the
symmetrical cocitation matrix but can be applied to the
asymmetrical citation matrix to derive the proximity
matrix. The argument is illustrated with examples. The
study then extends the application of co-occurrence
matrices to the Web environment, in which the nature of
the available data and thus data collection methods are
different from those of traditional databases such as
the Science Citation Index. A set of data collected with
the Google Scholar search engine is analyzed by using
both the traditional methods of multivariate analysis and
the new visualization software Pajek, which is based on
social network analysis and graph theory.
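
The distinction the abstract draws between the two matrix types can be made concrete with a small sketch. It is not taken from the article: the toy data, the function names `cocitation` and `cosine_between_columns`, and the choice to keep total citation counts on the diagonal are all assumptions made for illustration. The asymmetrical citation matrix has citing documents as rows and cited authors as columns; the symmetrical cocitation matrix is derived from it by counting joint occurrences, while the cosine is computed on the columns of the asymmetrical matrix to obtain a proximity matrix.

```python
import math

# Hypothetical toy data (not from the article):
# rows are citing documents, columns are cited authors A, B, C.
citation = [
    [1, 1, 0],
    [1, 1, 0],
    [0, 1, 1],
    [0, 0, 1],
]


def cocitation(matrix):
    """Symmetrical co-occurrence matrix derived from the asymmetrical one.

    Entry (i, j) counts the documents that cite both author i and author j.
    Assumption: the diagonal holds each author's total citation count.
    """
    n = len(matrix[0])
    return [
        [sum(row[i] * row[j] for row in matrix) for j in range(n)]
        for i in range(n)
    ]


def cosine_between_columns(matrix):
    """Proximity matrix: cosine between column vectors of the asymmetrical matrix."""
    n = len(matrix[0])
    cols = [[row[j] for row in matrix] for j in range(n)]

    def cos(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
        return dot / norm if norm else 0.0

    return [[cos(cols[i], cols[j]) for j in range(n)] for i in range(n)]


cooc = cocitation(citation)              # symmetrical cocitation counts
prox = cosine_between_columns(citation)  # similarity from the asymmetrical data

for name, m in (("cocitation", cooc), ("cosine proximity", prox)):
    print(name)
    for row in m:
        print("  ", [round(x, 3) for x in row])
```

In this sketch the similarity measure is applied once, to the asymmetrical data. The cocitation counts are already a proximity-type matrix, which is the abstract's reason for not applying the Pearson correlation or the cosine to them a second time.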