Lande D.V., Balagura I.V., Andrushchenko V.B.
Building co-authorship networks from Google Scholar Citations data
// Open Semantic Technologies for Intelligent Systems (OSTIS-2016): Proceedings of the VI International Scientific and Technical Conference (Minsk, 18-20 February 2016). Minsk: BSUIR, 2016. pp. 233-237.
The paper presents an algorithm for building a co-authorship network of scientists that is governed by their research interests.
CREATION OF NETWORKS OF THE COAUTHORSHIP ACCORDING TO THE GOOGLE SCHOLAR CITATIONS SERVICE
Lande D.V., Balagura I.V., Andrushchenko V.B.
Institute for Information Recording NAS of Ukraine, Kiev, Ukraine
The paper presents an algorithm for building a co-authorship network of scientists that is governed by their research interests. The co-authorship network is formed by sounding the Google Scholar Citations service. It is shown that the descriptors defining the subject area influence both the size of the resulting network and the dynamics of its growth. It is also shown that clusters in co-authorship networks can serve as a basis for identifying scientific schools.
The objective of the work is to describe the theoretical principles and methods for automatically forming co-authorship networks, in particular in the fields of Complex Networks and Text Mining, by sounding a large information network. To this end, a specific algorithm for scanning the Google Scholar Citations service is used to obtain a representative set of co-authors as the base nodes of the future network. By sounding we mean retrieving a small sample of the most important content from large networks that cannot be scanned in full for processing reasons [Lande, 2015].
It is evident that a co-authorship network can become very large if it is not restricted to a defined theme, given by the tags of the first author, who is the origin of the network formation.
Such growth considerably complicates the perception of the formed network and leads to "theme drift". Identical spellings of last names and initials can also occur. To cope with these effects, thematic filtering is used: descriptors are assigned to the authors of the scientometric network and define their thematic direction. The match between these descriptors ultimately determines the size of the co-authorship network being formed and the dynamics of its growth. In addition, the identification of clusters in such networks can serve as a basis for extracting scientific schools, expert groups, etc. [Lande et al., 2013].
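The thematic filter described above can be illustrated with a minimal sketch. It assumes that each author is represented by a list of free-text tags and that an author is "on theme" when at least one tag matches a theme descriptor after simple normalization. The function names `normalize` and `is_on_theme` and the matching rule are illustrative assumptions, not the procedure used in the paper.

```python
def normalize(tag: str) -> str:
    """Lower-case a tag and unify separators, so 'Text-Mining' equals 'text mining'."""
    return " ".join(tag.lower().replace("-", " ").replace("_", " ").split())


def is_on_theme(author_tags, theme_descriptors) -> bool:
    """Return True if the author shares at least one descriptor with the theme."""
    theme = {normalize(t) for t in theme_descriptors}
    return any(normalize(t) in theme for t in author_tags)


# Hypothetical example data: the two fields named in the paper as the theme.
theme = ["Complex Networks", "Text Mining"]
print(is_on_theme(["text-mining", "Scientometrics"], theme))
print(is_on_theme(["Organic Chemistry"], theme))
```

A stricter rule (for example, requiring two shared descriptors) would produce a smaller network, which is consistent with the observation that the descriptors determine the network size.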
It is appropriate to use models that have been proven on peer-to-peer (P2P) networks, which are based on the equality of participants. A peer-to-peer network consists of nodes, each of which interacts only with a certain subset of the other nodes, which corresponds to the structure of a co-authorship network.
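As a rough illustration of such a peer-to-peer style model, the sketch below builds an undirected network in which every node links to a small random subset of the other nodes. The generator `build_peer_network`, its parameters, and the uniform random choice of peers are assumptions made for illustration only; the paper's actual reference model is not specified in this excerpt.

```python
import random


def build_peer_network(n_nodes: int, links_per_node: int, seed: int = 0):
    """Build an undirected network where each node links to a few random peers."""
    rng = random.Random(seed)
    neighbors = {node: set() for node in range(n_nodes)}
    for node in range(n_nodes):
        others = [m for m in range(n_nodes) if m != node]
        for peer in rng.sample(others, links_per_node):
            neighbors[node].add(peer)
            neighbors[peer].add(node)  # co-authorship is mutual
    return neighbors


# Hypothetical sizes, chosen only to keep the example small.
network = build_peer_network(n_nodes=20, links_per_node=2, seed=1)
print(len(network), "nodes,", sum(len(v) for v in network.values()) // 2, "links")
```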
The sounding of the reference model network is performed according to the following algorithm [Lande,
It was the results of the qualitative modeling that led to the conclusion that small branches of connected co-authors can be formed according to the tags in which users of the Google Scholar Citations service are interested.
The described algorithm was adapted to the real co-authorship network of Google Scholar Citations in such
According to the described algorithm, sounding of the network from a given (root) node stops when a cycle occurs, that is, when the algorithm passes to a node that has already been traced, and also when the remaining nodes deviate from the main themes (this is determined from the lexical composition of their tags). Such a "cycle" is the signal to pass to another root author or to end the sounding process.
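The stopping rule can be sketched as a breadth-first traversal with a visited set and a thematic filter. This is a simplified sketch under stated assumptions, not the authors' implementation: author profiles are held in an in-memory dictionary (no access to Google Scholar Citations is attempted), a single root author is used, the root is assumed to be on theme, and tags are compared by exact case-insensitive match. The name `sound_network` and the data layout are hypothetical.

```python
from collections import deque


def sound_network(profiles, root, theme):
    """Sound a co-authorship network breadth-first, starting from `root`.

    `profiles` maps author -> {"tags": [...], "coauthors": [...]}.
    Returns (visited authors, sorted list of co-authorship edges).
    """
    theme_set = {t.lower() for t in theme}

    def on_theme(author):
        tags = profiles.get(author, {}).get("tags", [])
        return any(t.lower() in theme_set for t in tags)

    visited = {root}
    queue = deque([root])
    edges = set()
    while queue:
        author = queue.popleft()
        for coauthor in profiles.get(author, {}).get("coauthors", []):
            if not on_theme(coauthor):
                continue  # thematic filter: off-theme authors are not followed
            edges.add(tuple(sorted((author, coauthor))))
            if coauthor in visited:
                continue  # "cycle": node already traced, do not expand it again
            visited.add(coauthor)
            queue.append(coauthor)
    return visited, sorted(edges)


# Hypothetical profiles: D is off theme, so E is never reached through D.
profiles = {
    "A": {"tags": ["Complex Networks"], "coauthors": ["B", "C"]},
    "B": {"tags": ["Text Mining"], "coauthors": ["A", "C"]},
    "C": {"tags": ["Complex Networks"], "coauthors": ["A", "B", "D"]},
    "D": {"tags": ["Organic Chemistry"], "coauthors": ["C", "E"]},
    "E": {"tags": ["Text Mining"], "coauthors": ["D"]},
}
visited, edges = sound_network(profiles, "A", ["Complex Networks", "Text Mining"])
print(sorted(visited), edges)
```

When the queue empties, the traversal from this root is finished; in the scheme described above, that is the point at which sounding would either move on to another root author or end.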
The suggested approach aims to form co-authorship networks within a knowledge domain delimited by a few tags previously specified by the scientists who are members of the Google Scholar Citations project. It should be noted that the basic difference between the suggested model and existing ones is that the network is formed automatically, whereas existing models rely on the direct participation of experts in choosing individual nodes and links. Here the researcher uses only small pieces of knowledge contributed by the co-authors themselves: the tags they have marked as their main ones. Thus the expert environment is widened considerably. The model is applied to the fields of Complex Networks and Text Mining within Google Scholar Citations, but the suggested approach can be used for other knowledge domains and scientometric arrays. The modeling results obtained with the procedure proposed in chapter 3 can also be applied to create a model of the subject field.