Research on the Categorization Methods of EBM Web Documents and Its Application Archive - IT Research Paper

Abstract

Evidence-based medicine (EBM) has become the prevailing medical model in international medicine, where the key task is to obtain and select the best evidence and to compile EBM systematic reviews. With the rapid growth of the Internet, manual classification and management cannot cope with the vast number of medical documents. Effective webpage categorization methods and other intelligent information processing techniques are therefore urgently needed.

This thesis summarizes current research on webpage classification, explores several key technologies, and presents some new viewpoints and solutions. The main work of the thesis includes:

Firstly, webpage preprocessing and categorization methods are studied. For English webpage preprocessing in particular, some effective methods have been found, including content extraction from webpage titles and stemming of English words.

Secondly, a Weighted Naive Bayesian (WNB) method based on webpage summarization is presented. Critical information is extracted by the LUHN and LSA methods, word weights are adjusted by evaluation functions, and a WNB classifier is then constructed. Our experiments confirm that precision can be increased by 6-9 percentage points.

Thirdly, an approach that uses Boosting algorithms to improve an SVM classifier based on webpage summaries is proposed. The experimental results show that this method improves classification accuracy over a single SVM classifier.

Fourthly, a webpage classification system is designed and implemented, including webpage preprocessing, some general webpage classification methods, and some improved classification algorithms.
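
The summary-based weighted Naive Bayes idea described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the abstract does not give the weighting scheme, so the `summary_weight` factor, the simplified Luhn-style sentence score, the stopword list, and all function and class names below are assumptions made for illustration only, and the LSA step and the evaluation functions are omitted.

```python
import math
import re
from collections import Counter, defaultdict

# Tiny illustrative stopword list (an assumption, not taken from the thesis).
STOPWORDS = {"the", "a", "an", "of", "and", "is", "in", "to", "for", "with", "on", "are"}


def tokenize(text):
    """Return lowercase alphabetic tokens with stopwords removed."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def luhn_summary(text, n_sentences=1, top_k=5):
    """Simplified Luhn-style summary: keep the sentences densest in frequent words."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    significant = {w for w, _ in Counter(tokenize(text)).most_common(top_k)}

    def score(sentence):
        tokens = tokenize(sentence)
        if not tokens:
            return 0.0
        hits = sum(1 for t in tokens if t in significant)
        # Whole sentence treated as one cluster: (significant words)^2 / length.
        return hits * hits / len(tokens)

    return sorted(sentences, key=score, reverse=True)[:n_sentences]


class WeightedNaiveBayes:
    """Multinomial Naive Bayes in which summary words count more (hypothetical scheme)."""

    def __init__(self, summary_weight=2.0):
        self.summary_weight = summary_weight
        self.class_docs = Counter()
        self.word_counts = defaultdict(Counter)
        self.vocab = set()

    def _weighted_counts(self, text):
        summary_words = set(tokenize(" ".join(luhn_summary(text))))
        counts = Counter()
        for tok in tokenize(text):
            counts[tok] += self.summary_weight if tok in summary_words else 1.0
        return counts

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            self.class_docs[label] += 1
            for word, weight in self._weighted_counts(text).items():
                self.word_counts[label][word] += weight
                self.vocab.add(word)
        return self

    def predict(self, text):
        counts = self._weighted_counts(text)
        total_docs = sum(self.class_docs.values())
        best_label, best_score = None, -math.inf
        for label in self.class_docs:
            total = sum(self.word_counts[label].values())
            score = math.log(self.class_docs[label] / total_docs)
            for word, weight in counts.items():
                # Laplace smoothing over the training vocabulary.
                p = (self.word_counts[label][word] + 1.0) / (total + len(self.vocab))
                score += weight * math.log(p)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data, invented for this sketch.
TRAIN_TEXTS = [
    "Randomized trial of aspirin therapy. The trial measured patient outcomes after therapy.",
    "Diagnostic accuracy of ultrasound imaging. Imaging sensitivity was compared with biopsy.",
]
TRAIN_LABELS = ["therapy", "diagnosis"]

if __name__ == "__main__":
    model = WeightedNaiveBayes().fit(TRAIN_TEXTS, TRAIN_LABELS)
    print(model.predict("A trial of statin therapy in patients."))
```

Raising `summary_weight` makes the words of the extracted summary dominate the class likelihoods, while setting it to 1.0 reduces the sketch to plain multinomial Naive Bayes, which gives a simple baseline for comparison.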

Category IT Paper
Keywords Boosting, Text Categorization, Webpage Categorization, Webpage Summary, Weight Adjustment
FileType PDF
Pages 113