An Algorithm Based on Correlation Coefficient to Find Scientific Communities Archive - IT Research Paper

Title An Algorithm Based on Correlation Coefficient to Find Scientific Communities
Abstract URL:http://www.it-paper.com/an-algorithm-based-on-correlation-coefficient-to-find-scientific-communities.html

Because the World Wide Web (WWW) offers fast, immediate, and wide dissemination, more and more scientists publish their papers there to be studied, cited, and extended by other scientists. However, because the network is huge and unordered, finding groups of related papers, called scientific communities, has become a difficult problem. A closely connected subgraph of a citation graph is often regarded as a community of related papers, and citation analysis has become an important branch of link analysis. This paper analyzes the citation graph model and compares three algorithms: one based on citation counts, PageRank, and a similarity-based algorithm. The latter two both outperform the citation-count algorithm, and the similarity-based algorithm is the most accurate. Nevertheless, the search results of all three often contain unrelated papers, which lowers the precision of the communities found. Panupong et al. introduced a scientific community finding algorithm that uses the random walk model and the similarity of two adjacent papers, but that algorithm lacks a scientific explanation. Through a close study of citation relationships, this paper analyzes the concepts and problems of the RWGC algorithm. The formula used by RWGC lacks a scientific basis, which affects its results. Building on RWGC, this paper proposes an improved algorithm based on the correlation coefficient, RWGC-CC (the Improved Random Walk Graph Clustering Algorithm Based on Correlation Coefficient). It takes three layers of citation relations into consideration, modeling documents as variables and the similarity of adjacent documents as the correlation coefficient from probability theory. This correlation coefficient reflects the degree of similarity, so it gives similarity a mathematical explanation.
This paper also analyzes in depth, through experiments, the relationship between the similarity threshold and the size of a community. The experimental results show that RWGC-CC improves precision by 15% over RWGC. RWGC-CC also removes the iteration process when calculating similarity, which saves considerable time and improves efficiency.
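The abstract does not give the RWGC-CC formula, so the following Python sketch is only an illustration of the general idea under stated assumptions, not the paper's method. It assumes each paper is represented by a 0/1 indicator vector over its direct citation neighborhood (one layer, not the three layers the abstract mentions), uses the Pearson correlation of those vectors as the similarity of two adjacent papers, and grows a community from a seed paper by following links whose similarity meets a threshold. The names `neighborhoods`, `correlation`, `community`, and the `example` graph are hypothetical.

```python
from collections import deque
from math import sqrt


def neighborhoods(citations):
    """Map each paper to itself plus every paper it cites or is cited by."""
    papers = set(citations)
    for refs in citations.values():
        papers.update(refs)
    nbrs = {p: {p} for p in papers}
    for src, refs in citations.items():
        for dst in refs:
            nbrs[src].add(dst)
            nbrs[dst].add(src)
    return nbrs


def correlation(a, b, n):
    """Pearson correlation of the 0/1 indicator vectors of sets a and b,
    taken over a universe of n papers (assumed similarity measure)."""
    pa, pb = len(a) / n, len(b) / n
    var = pa * (1 - pa) * pb * (1 - pb)
    if var == 0:
        return 0.0  # a constant vector has no defined correlation
    return (len(a & b) / n - pa * pb) / sqrt(var)


def community(citations, seed, threshold):
    """Collect papers reachable from seed through adjacent pairs whose
    correlation is at least threshold."""
    nbrs = neighborhoods(citations)
    n = len(nbrs)
    found, queue = {seed}, deque([seed])
    while queue:
        cur = queue.popleft()
        for other in nbrs[cur] - found:
            if correlation(nbrs[cur], nbrs[other], n) >= threshold:
                found.add(other)
                queue.append(other)
    return found


# Hypothetical citation graph: two triangles joined by the link C -> D.
example = {
    "A": {"B", "C"},
    "B": {"C"},
    "C": {"D"},
    "D": {"E", "F"},
    "E": {"F"},
}

print(sorted(community(example, "A", 0.5)))  # ['A', 'B', 'C']
```

In this toy graph, lowering the threshold to -1.0 admits the bridge from C to D and the community grows to all six papers, which illustrates the kind of threshold-versus-size relationship the abstract says the experiments examine.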

Category Internet
Keywords Citation Graph, Correlation Coefficient, Random Walk, Scientific Community
FileType PDF
Pages 179
Version zh-cn
© IT Research Paper