2009 ModelingAndDataMiningInBlogosphere
- (Agarwal & Liu, 2009) ⇒ Nitin Agarwal, Huan Liu. (2009). “Modeling and Data Mining in Blogosphere.” Morgan & Claypool. doi:10.2200/S00213ED1V01Y200907DMK001
Subject Areas: Statistical Modeling Task, Data Mining Task, Blogosphere.
Notes
Quotes
Abstract
- This book offers a comprehensive overview of the various concepts and research issues about blogs or weblogs. It introduces techniques and approaches, tools and applications, and evaluation methodologies with examples and case studies. Blogs allow people to express their thoughts, voice their opinions, and share their experiences and ideas. Blogs also facilitate interactions among individuals creating a network with unique characteristics. Through the interactions individuals experience a sense of community. We elaborate on approaches that extract communities and cluster blogs based on information of the bloggers. Open standards and low barrier to publication in Blogosphere have transformed information consumers to producers, generating an overwhelming amount of ever-increasing knowledge about the members, their environment and symbiosis. We elaborate on approaches that sift through humongous blog data sources to identify influential and trustworthy bloggers leveraging content and network information. Spam blogs or “splogs” are an increasing concern in Blogosphere and are discussed in detail with the approaches leveraging supervised machine learning algorithms and interaction patterns. We elaborate on data collection procedures, provide resources for blog data repositories, mention various visualization and analysis tools in Blogosphere, and explain conventional and novel evaluation methodologies, to help perform research in the Blogosphere.
- The book is supported by additional material, including lecture slides as well as the complete set of figures used in the book, and the reader is encouraged to visit the book website for the latest information: http://tinyurl.com/mcp-agarwal
1. Modeling Blogosphere
- … Blogosphere provides a conducive platform to build the virtual communities of special interests. It reshapes business models [3], assists viral marketing [4], provides trend analysis and sales prediction [5,6], aids counter-terrorism efforts [7], and acts as grassroot information sources [8].
- Past few years have observed a phenomenal growth in the blogosphere. Technorati (http://technorati.com/blogging/state-of-the-blogosphere/) published a report on the growth of the blogosphere. The report mentioned that the blogosphere is consistently doubling every 5 months for the last 4 years and the size was estimated to be approximately 133 million blogs by December 2008. Furthermore, 2 new blogs or roughly 18.6 new blog posts are added to the blogosphere every second. …
1.1 Modeling Essentials
- The blogosphere consists of two main graph structures - a blog network and a post network. A post network is formed by considering the links between blog posts, and ignoring the blogs to which they belong. In a post network, the nodes represent individual blog posts, and edges represent the links between them. A post network gives a microscopic view of the blogosphere and helps in discerning “high-resolution” details like blog post level interactions, communication patterns in blog post interactions, authoritative blog post based on links, etc. A blog network is formed by collapsing those individual nodes in the post network that belong to a single blog, to a single node. By doing so links between the blog posts that belong to a single blog disappear and links between blog posts of different blogs are agglomerated and weighted accordingly. A blog network gives a macroscopic view of the blogosphere and helps in observing “low-resolution” details like blog level interactions, communication patterns in blog-blog interactions, authoritative blogs based on links, etc. Both post and blog networks are directed graphs.
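The collapse from a post network to a blog network described in the quote above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function name `collapse_to_blog_network`, the toy post and blog identifiers, and the choice to weight each blog-to-blog edge by the number of underlying post links are all assumptions, since the quote says only that links are "agglomerated and weighted accordingly".

```python
from collections import Counter


def collapse_to_blog_network(post_links, post_to_blog):
    """Collapse a directed post network into a weighted, directed blog network.

    post_links   -- iterable of (source_post, target_post) directed links
    post_to_blog -- dict mapping each post id to the blog it belongs to

    Assumption: an edge weight is the count of post-level links between
    two distinct blogs; the source does not fix a weighting scheme.
    """
    blog_edges = Counter()
    for src_post, dst_post in post_links:
        src_blog = post_to_blog[src_post]
        dst_blog = post_to_blog[dst_post]
        if src_blog == dst_blog:
            # Links between posts of the same blog disappear.
            continue
        # Links between posts of different blogs are aggregated.
        blog_edges[(src_blog, dst_blog)] += 1
    return dict(blog_edges)


# Hypothetical toy data: five posts spread over three blogs.
post_to_blog = {"a1": "A", "a2": "A", "b1": "B", "b2": "B", "c1": "C"}
post_links = [
    ("a1", "a2"),  # intra-blog link, dropped in the blog network
    ("a1", "b1"),
    ("a2", "b2"),
    ("b1", "c1"),
    ("c1", "a1"),
]

blog_network = collapse_to_blog_network(post_links, post_to_blog)
print(blog_network)
```

Keeping the edge keys as ordered `(source, target)` pairs preserves direction, so a link from blog A to blog B is counted separately from one going from B to A.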
Blog Clustering and Community Discovery
Influence and Trust
Spam Filtering in Blogosphere
Data Collection and Evaluation
| | Author | volume | Date Value | title | type | journal | titleUrl | doi | note | year |
|---|---|---|---|---|---|---|---|---|---|---|
| 2009 ModelingAndDataMiningInBlogosphere | Huan Liu, Nitin Agarwal | | | Modeling and Data Mining in Blogosphere | | | | 10.2200/S00213ED1V01Y200907DMK001 | | 2009 |