Link prediction plays an important role in scientific collaboration networks, and can favourably affect the organization of international scientific projects. In this paper, a meta-path computed prediction (MPCP) algorithm for link prediction among scientists and publications is presented. The MPCP algorithm is based on a heterogeneous information network model composed of authors and keywords from articles retrieved from the Web of Science database. Two kinds of meta-paths are defined: Author to Author to Author (A-A-A) and Author to Direction to Author (A-D-A). By computing the A-A-A and A-D-A meta-paths in the heterogeneous information network model, the predictive strength of each link can be obtained. The overlap of the meta-paths is also taken into account. By restoring links and counting the restored links at different threshold values, similar results are obtained for the two topics studied (quantum communication and link prediction). The number of restored links decreases as the threshold value increases. The experimental studies show that, for any threshold value up to 1, at least 50% of links are restored. The results presented in this paper verify that the algorithm is a feasible means of predicting collaboration among scientists.
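The meta-path idea summarized above can be illustrated with a minimal Python sketch. It is not the MPCP algorithm itself: the abstract does not give the scoring formula, so the Jaccard normalisation, the equal weights `w_aaa` and `w_ada`, the function names, and the toy `papers` data are all assumptions made for illustration, and the overlap correction between meta-paths is not modelled.

```python
from collections import defaultdict
from itertools import combinations


def build_network(papers):
    """Build author-author and author-direction adjacency sets.

    `papers` is a list of (authors, keywords) pairs; keywords stand in
    for research directions (the "D" in A-D-A).
    """
    coauthors = defaultdict(set)   # author -> authors they have published with
    directions = defaultdict(set)  # author -> keywords of their papers
    for authors, keywords in papers:
        for author in authors:
            directions[author].update(keywords)
            coauthors[author].update(a for a in authors if a != author)
    return coauthors, directions


def jaccard(left, right):
    """Shared elements divided by all elements; 0.0 when both sets are empty."""
    union = left | right
    return len(left & right) / len(union) if union else 0.0


def link_strength(a, b, coauthors, directions, w_aaa=0.5, w_ada=0.5):
    """Assumed score in [0, 1]: weighted A-A-A and A-D-A overlap."""
    # A-A-A: intermediate authors who have published with both a and b.
    aaa = jaccard(coauthors[a] - {b}, coauthors[b] - {a})
    # A-D-A: directions (keywords) that a and b have in common.
    ada = jaccard(directions[a], directions[b])
    return w_aaa * aaa + w_ada * ada


def predict_links(papers, threshold):
    """Return unlinked author pairs whose score reaches the threshold."""
    coauthors, directions = build_network(papers)
    predicted = []
    for a, b in combinations(sorted(directions), 2):
        if b in coauthors[a]:
            continue  # already collaborators, nothing to predict
        if link_strength(a, b, coauthors, directions) >= threshold:
            predicted.append((a, b))
    return predicted


# Hypothetical toy data, not taken from the Web of Science.
papers = [
    (["Ann", "Bob"], ["quantum communication"]),
    (["Bob", "Cid"], ["quantum communication", "link prediction"]),
    (["Dee"], ["link prediction"]),
]

for threshold in (0.2, 0.5, 0.9):
    print(threshold, predict_links(papers, threshold))
```

In this toy network, raising the threshold can only shrink the set of predicted pairs, which mirrors the reported decrease in restored links as the threshold grows.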
• An algorithm for link prediction among scientists and publications is proposed.
• The importance of using heterogeneous information networks is discussed.
• Advantages of the proposed algorithm are confirmed via examples.
• Prospects for using the proposed algorithm are discussed.
Cite this article
Lande, D., Fu, M., Guo, W. et al. Link prediction of scientific collaboration networks based on information retrieval. World Wide Web: Internet and Web Information Systems 23, 2239-2257 (2020). https://doi.org/10.1007/s11280-019-00768-9