- Author
- Title
- Mining social media: tracking content and predicting behavior
- Supervisors
- Award date
- 5 December 2012
- Number of pages
- 207
- ISBN
- 9789461821980
- Document type
- PhD thesis
- Faculty
- Faculty of Science (FNWI)
- Institute
- Informatics Institute (IVI)
- Abstract
The advent of social media has established a symbiotic relationship between social media and online news. This relationship can be leveraged for tracking news content and predicting behavior, with tangible real-world applications such as online reputation management, ad pricing, news ranking, and media analysis. In this thesis we focus on tracking news content in social media and on predicting user behavior.
In the first part, we develop methods for tracking content that build upon and extend practices in Information Retrieval. We begin by discovering social media posts that discuss a news article but do not provide a hyperlink to it. Our methods model news articles using several channels of information, either endogenous or exogenous to the article. These models are then used to query an index of social media posts. During this process we found that the query models are close in size to the documents to be retrieved, violating a standard assumption of language modeling. We correct for this discrepancy by introducing two hypergeometric language models for modeling both the queries and the documents to be retrieved.
In the second part, we focus on predicting behavior. First we look at predicting listeners' preference in spoken user-generated content, namely podcasts. Then, we predict the popularity of news articles from several news agents in terms of the volume of comments they receive. We develop models for predicting the popularity of an article both before and after it is published. Finally, we look at a different aspect of news impact: how reading a news article affects future user browsing behavior. In each setting, we find patterns that characterize the underlying behavior and extract features that we then use to build models for predicting online behavior.
- Note
- SIKS dissertation series no. 2012-47
Research conducted at: Universiteit van Amsterdam
- Persistent Identifier
- https://hdl.handle.net/11245/1.377411
- Downloads
Thesis
Front cover
Title pages
Contents
1: Introduction
2: Background
PART I: Tracking content: introduction
3: Linking online news and social media
4: Hypergeometric language models
Conclusion to part I
PART II: Predicting behavior: introduction
5: Podcast preference
6: Commenting behavior on online news
7: Context discovery in online news
Conclusion to part II
8: Conclusions
Bibliography
Samenvatting
SIKS dissertation series
Back cover