Second Chance: A Hybrid Approach for Dynamic Result Caching and Prefetching in Search Engines

Download
2013-12-01
Ozcan, Rifat
Altıngövde, İsmail Sengör
Barla Cambazoglu, B.
ULUSOY, ÖZGÜR
Web search engines are known to cache the results of previously issued queries. The stored results typically contain the document summaries and some data that is used to construct the final search result page returned to the user. An alternative strategy is to store in the cache only the result document IDs, which take much less space, allowing results of more queries to be cached. These two strategies lead to an interesting trade-off between the hit rate and the average query response latency. In this work, in order to exploit this trade-off, we propose a hybrid result caching strategy where a dynamic result cache is split into two sections: an HTML cache and a docID cache. Moreover, using a realistic cost model, we evaluate the performance of different result prefetching strategies for the proposed hybrid cache and the baseline HTML-only cache. Finally, we propose a machine learning approach to predict singleton queries, which occur only once in the query stream. We show that when the proposed hybrid result caching strategy is coupled with the singleton query predictor, the hit rate is further improved.
ACM TRANSACTIONS ON THE WEB

Suggestions

Cost-Aware Strategies for Query Result Caching in Web Search Engines
Ozcan, Rifat; Altıngövde, İsmail Sengör; Ulusoy, Ozgor (Association for Computing Machinery (ACM), 2011-05-01)
Search engines and large-scale IR systems need to cache query results for efficiency and scalability purposes. Static and dynamic caching techniques (as well as their combinations) are employed to effectively cache query results. In this study, we propose cost-aware strategies for static and dynamic caching setups. Our research is motivated by two key observations: (i) query processing costs may significantly vary among different queries, and (ii) the processing cost of a query is not proportional to its po...
Cache-Based Query Processing for Search Engines
Cambazoglu, B. Barla; Altıngövde, İsmail Sengör; Ozcan, Rifat; Ulusoy, Ozgur (Association for Computing Machinery (ACM), 2012-11-01)
In practice, a search engine may fail to serve a query due to various reasons such as hardware/network failures, excessive query load, lack of matching documents, or service contract limitations (e.g., the query rate limits for third-party users of a search service). In this kind of scenarios, where the backend search system is unable to generate answers to queries, approximate answers can be generated by exploiting the previously computed query results available in the result cache of the search engine. In...
Exploiting Navigational Queries for Result Presentation and Caching in Web Search Engines
Ozcan, Rifat; Altıngövde, İsmail Sengör; Ulusoy, Ozgur (Wiley, 2011-04-01)
Caching of query results is an important mechanism for efficiency and scalability of web search engines. Query results are cached and presented in terms of pages, which typically include 10 results each. In navigational queries, users seek a particular website, which would be typically listed at the top ranks (maybe, first or second) by the search engine, if found. For this type of query, caching and presenting results in the 10-per-page manner may waste cache space and network bandwidth. In this article, w...
Second Chance: A Hybrid Approach for Dynamic Result Caching in Search Engines
Altıngövde, İsmail Sengör; Barla Cambazoglu, B.; Ulusoy, Ozgur (2011-01-01)
Result caches are vital for efficiency of search engines. In this work, we propose a novel caching strategy in which a dynamic result cache is split into two layers: an HTML cache and a docID cache. The HTML cache in the first layer stores the result pages computed for queries. The docID cache in the second layer stores ids of documents in search results. Experiments under various scenarios show that, in terms of average query processing time, this hybrid caching approach outperforms the traditional approac...
Exploiting interclass rules for focused crawling
Altıngövde, İsmail Sengör (Institute of Electrical and Electronics Engineers (IEEE), 2004-11-01)
A focused crawler gathers relevant Web pages on a particular topic. This rule-based Web-crawling approach uses linkage statistics among topics to improve. a baseline focused crawler's harvest rate and coverage.
Citation Formats
R. Ozcan, İ. S. Altıngövde, B. Barla Cambazoglu, and Ö. ULUSOY, “Second Chance: A Hybrid Approach for Dynamic Result Caching and Prefetching in Search Engines,” ACM TRANSACTIONS ON THE WEB, pp. 0–0, 2013, Accessed: 00, 2020. [Online]. Available: https://hdl.handle.net/11511/38291.