
Implicit Feedback, Pseudo Feedback, Relevance Feedback and Active Feedback

Implicit feedback is a popular way to do personalized search, but a general audience may confuse it with pseudo feedback and relevance feedback. So it is worth clarifying the differences here.

Relevance feedback was proposed in information retrieval research in the 1970s by Gerard Salton and his co-workers as a way to improve retrieval accuracy. It works in the following way. After the user submits a query, the retrieval system does a first run to rank documents and presents a few top-ranked documents for the user to explicitly judge for relevance. After collecting the user’s relevance judgments on these documents, the retrieval system combines the judged documents with the original query through query expansion, does a second run, and presents the newly ranked documents to the user. Many empirical evaluations show that relevance feedback is an effective way to improve retrieval accuracy. The Rocchio feedback formula is the most popular way to do relevance feedback in the vector space model. Model-based feedback, proposed by ChengXiang Zhai in his CIKM 2001 paper, is a popular way to do relevance feedback with statistical language models.
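To make the query expansion step concrete, here is a minimal sketch of Rocchio-style feedback over term-weight dictionaries. The function name, the dict-based vectors, and the default values of alpha, beta and gamma are assumptions chosen for illustration; a real system would tune them and would use proper term weighting such as TF-IDF.

```python
from collections import defaultdict


def rocchio(query, relevant, non_relevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Return an expanded query as a term -> weight dict.

    new_query = alpha * query
                + beta  * centroid(relevant documents)
                - gamma * centroid(non-relevant documents)
    """
    new_query = defaultdict(float)
    for term, weight in query.items():
        new_query[term] += alpha * weight
    # Move the query towards the documents the user judged relevant.
    for doc in relevant:
        for term, weight in doc.items():
            new_query[term] += beta * weight / len(relevant)
    # Move the query away from the documents the user judged non-relevant.
    for doc in non_relevant:
        for term, weight in doc.items():
            new_query[term] -= gamma * weight / len(non_relevant)
    # Terms that end up with a non-positive weight are dropped here
    # (an assumption of this sketch; clipping is a common choice).
    return {term: w for term, w in new_query.items() if w > 0}


# Hypothetical toy data: the user meant the car, not the animal.
query = {"jaguar": 1.0}
judged_relevant = [{"jaguar": 1.0, "car": 2.0}]
judged_non_relevant = [{"jaguar": 1.0, "cat": 2.0}]
expanded = rocchio(query, judged_relevant, judged_non_relevant)
```

In this toy case the expanded query gains the term "car" from the relevant document, while "cat" ends up with a negative weight and is dropped.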

However, in many retrieval tasks such as web search, the user is not willing to provide relevance feedback to the retrieval system. So pseudo feedback was later proposed. Pseudo feedback works in the following way. After the user submits a query, the retrieval system does a first run to rank documents and picks a few top-ranked documents. The retrieval system assumes these top-ranked documents are relevant and combines them with the original query through query expansion to do a second run. It then presents the newly ranked documents to the user. Here we can clearly see that relevance feedback needs user involvement in the relevance judgment process while pseudo feedback does not. Many empirical evaluations show that pseudo feedback generally, but not always, outperforms the baseline retrieval. However, pseudo feedback is not as effective as relevance feedback.
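The two-run procedure can be sketched as follows. This is only an illustration: the dot-product scoring, the function names, and the parameters k and beta are assumptions of the sketch, not part of any particular system.

```python
def score(query, doc):
    """Dot product between a query and a document (term -> weight dicts)."""
    return sum(w * doc.get(term, 0.0) for term, w in query.items())


def rank(query, docs):
    """Return document names sorted by descending score."""
    return sorted(docs, key=lambda name: score(query, docs[name]), reverse=True)


def pseudo_feedback(query, docs, k=2, beta=0.5):
    """Run retrieval twice, treating the top k results as relevant."""
    first_run = rank(query, docs)
    assumed_relevant = [docs[name] for name in first_run[:k]]
    # Expand the query with the centroid of the assumed-relevant documents.
    expanded = dict(query)
    for doc in assumed_relevant:
        for term, weight in doc.items():
            expanded[term] = expanded.get(term, 0.0) + beta * weight / len(assumed_relevant)
    return rank(expanded, docs), expanded


# Hypothetical toy collection.
docs = {
    "d1": {"apple": 1.0, "pie": 1.0},
    "d2": {"apple": 1.0, "fruit": 1.0},
    "d3": {"fruit": 1.0, "salad": 1.0},
    "d4": {"car": 1.0},
}
query = {"apple": 1.0}
second_run, expanded = pseudo_feedback(query, docs)
```

Note that no user judgment appears anywhere in the loop. The document d3 does not contain the query term at all, yet it gets a non-zero score in the second run because "fruit" was pulled into the query from d2. The same mechanism hurts retrieval when the top-ranked documents are not actually relevant.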

Relevance feedback is not applicable in many search activities, while pseudo feedback totally excludes the user from the feedback process. So both relevance feedback and pseudo feedback have limitations. In interactive information retrieval such as web search, the user generally has many interactions with the retrieval system. During these interactions, the user gives many hints to the retrieval system, which can help it better infer the user’s information need. Thus implicit feedback was proposed. Implicit feedback works in the following way. The retrieval system stores user interaction data such as query and clickthrough history, uses this data to better infer the user’s information need, composes a new query to rank documents, and presents the ranked documents to the user. We can see that implicit feedback neither asks for the user’s explicit relevance judgments nor categorically assumes that the top-ranked documents of the baseline retrieval are relevant. Instead, implicit feedback infers the user’s information need from the hints the user implicitly provides. However, there is a caveat. We need to analyze those hints carefully so that we do not incorporate noise into the new query, which could even hurt retrieval performance. Read the paper Context-Sensitive Information Retrieval Using Implicit Feedback for more discussion and references.
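As a rough illustration of composing a new query from interaction data, the sketch below mixes the current query with terms from earlier queries and from the summaries of clicked results. It is not the method of the paper mentioned above: the function name, the weights, and the threshold used to filter weak (possibly noisy) terms are all assumptions made for this example.

```python
from collections import Counter


def implicit_feedback_query(current_query, past_queries, clicked_summaries,
                            history_weight=0.3, click_weight=0.5, min_weight=0.4):
    """Build a weighted query from the current query plus implicit hints."""
    weights = Counter()
    # The current query is the strongest evidence of the information need.
    for term in current_query.lower().split():
        weights[term] += 1.0
    # Earlier queries in the session are weaker evidence.
    for past in past_queries:
        for term in past.lower().split():
            weights[term] += history_weight / len(past_queries)
    # Summaries of results the user clicked on are treated as further evidence.
    for summary in clicked_summaries:
        for term in summary.lower().split():
            weights[term] += click_weight / len(clicked_summaries)
    # Drop weakly supported terms so that noise does not enter the new query.
    return {term: w for term, w in weights.items() if w >= min_weight}


# Hypothetical session: the user is interested in the island, not the language.
new_query = implicit_feedback_query(
    current_query="java",
    past_queries=["java island travel"],
    clicked_summaries=["bali java indonesia travel guide"],
)
```

Here "travel" is supported by both the query history and a click, so it gets a higher weight than terms seen only once, while "island" appears only in the history and falls below the threshold.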

To summarize the differences among these three feedback techniques: relevance feedback asks the user for explicit relevance judgments; pseudo feedback assumes that the top-ranked documents of the baseline retrieval are relevant; implicit feedback tries to better infer the user’s information need from the data the user implicitly provides.

Active feedback was proposed in the paper Active Feedback in Ad-hoc Information Retrieval. Active feedback can be considered a kind of relevance feedback. But traditional relevance feedback focuses on how to incorporate the judged documents into the new query (e.g., query term addition and query term reweighting), while active feedback studies which documents should be presented to the user for relevance judgment in order to maximize what the retrieval system learns from the user’s judgments. The paper proposes a general framework and derives several specific algorithms from it.
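To see why the choice of documents matters, compare two simple selection strategies: presenting the top k documents, and presenting k documents spaced out along the ranking so that the judged set is more diverse. The sketch below is an illustration under assumed names and parameters, not a reproduction of the paper's framework or algorithms.

```python
def top_k(ranking, k):
    """Present the k highest-ranked documents for judgment."""
    return ranking[:k]


def gapped_top_k(ranking, k, gap):
    """Present k documents, skipping `gap` documents between each pick.

    Neighbouring documents in a ranking are often similar to each other,
    so spacing the picks out is a cheap way to get judgments on a more
    varied set of documents.
    """
    return ranking[::gap + 1][:k]


# Hypothetical first-run ranking.
ranking = ["d1", "d2", "d3", "d4", "d5", "d6", "d7", "d8", "d9", "d10"]
plain = top_k(ranking, 3)
spread = gapped_top_k(ranking, 3, gap=2)
```

With this ranking, top_k asks the user about d1, d2 and d3, while gapped_top_k asks about d1, d4 and d7.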
