Web Data Mining: Exploring Hyperlinks, Contents, and Usage Data

Springer Science & Business Media, 2007 - 532 pages
The rapid growth of the Web in the last decade makes it the largest publicly accessible data source in the world. Web mining aims to discover useful information or knowledge from Web hyperlinks, page contents, and usage logs. Based on the primary kinds of data used in the mining process, Web mining tasks can be categorized into three main types: Web structure mining, Web content mining and Web usage mining. Web structure mining discovers knowledge from hyperlinks, which represent the structure of the Web. Web content mining extracts useful information/knowledge from Web page contents. Web usage mining mines user access patterns from usage logs, which record clicks made by every user. The goal of this book is to present these tasks, and their core mining algorithms. The book is intended to be a text with a comprehensive coverage, and yet, for each topic, sufficient details are given so that readers can gain a reasonably complete knowledge of its algorithms or techniques without referring to any external materials. Four of the chapters, structured data extraction, information integration, opinion mining, and Web usage mining, make this book unique. These topics are not covered by existing books, but yet they are essential to Web data mining. Traditional Web mining topics such as search, crawling and resource discovery, and link analysis are also covered in detail in this book.

Contents

Introduction
Supervised Learning
Bibliographic Notes
Discussion
Bibliographic Notes
Text Documents
Bibliographic Notes
Web Mining
Bibliographic Notes
Merge Algorithm
10.9.2 Lexical Appropriateness
10.9.3 Instance Appropriateness
Bibliographic Notes
Bibliographic Notes

About the author (2007)

Bing Liu is an associate professor in Computer Science at the University of Illinois at Chicago (UIC). He received his PhD degree in Artificial Intelligence from the University of Edinburgh. Before joining UIC in 2002, he was with the National University of Singapore. His research interests include data mining, Web mining, text mining, and machine learning. He has published extensively in these areas in leading conferences and journals. He served (or serves) as a vice chair, deputy vice chair or program committee member of many conferences, including WWW, KDD, ICML, VLDB, ICDE, AAAI and ICDM.
