Archive for September, 2005

Vivisimo teams with MSN for FirstGov.gov

Vivisimo has teamed with MSN to provide the search technology for the U.S. government's FirstGov.gov portal, as reported in a 09/26/2005 Search Engine Watch article. Google's activities always attract media attention, even when the news is about the new business of a former Google chef (see Google to Noodles: A Chef Strikes Out on His Own from the New York Times) or the hiring of new chefs (see Wanted at Google: A few good chefs from News.com). By comparison, this event has received minimal coverage.

Vivisimo has interesting technology for clustering search engine results. Raul Valdes-Perez, CEO of Vivisimo, thinks that personalization is a dead end and has written an article about it, with which I generally disagree. The problems he mentions in the article have been addressed or are being addressed in personalization research.

A New Version of UCAIR Toolbar

A new version of the UCAIR toolbar is available for download from the UCAIR project website. Bin rewrote this version nearly from scratch. We redesigned the software architecture of the toolbar to make it extensible and robust.

A Seminar Course about Search Engines in SIMS, Berkeley

A seminar course (Search Engines: Technology, Society, and Business) is being offered at SIMS, Berkeley in fall 2005. The course website says, “A set of top-notch experts have agreed to give lectures for fall 2005.” Among them, Dr. Susan Dumais from Microsoft Research and Dr. Sepandar Kamvar (co-founder of Kaltix) from Google will give lectures. Both of them work on personalized search, so their talks will probably be related to it. The slides and videos for some talks are available on the website.

Personalized Search Papers at ACM CIKM 2005

CIKM 2005, one of the top information retrieval research conferences, will be held in Bremen, Germany from October 31st to November 5th. The last session of the conference is about context and personalization. There will be three paper presentations in this session:

Context Modeling and Discovery Using Vector Space Bases by Massimo Melucci (University of Padua)

Y!Q: Contextual Search at the Point of Inspiration by Reiner Kraft, Farzin Maghoul, Chi Chao Chang (Yahoo! Inc.)

Implicit User Modeling for Personalized Search by Xuehua Shen, Bin Tan, Chengxiang Zhai (CS, UIUC)


For the Y!Q paper, the blog of the first author, Reiner Kraft, explains the new feature of Y!Q. When you read a web page and are interested in some phrases or a sentence, you can mark them and trigger a search. In fact, this functionality already appeared in the now-defunct IntelliZap system (see the WWW 2001 paper).

Search Engine Web APIs

The Google Web API lets programmers build interesting search-related applications on top of the Google search engine. Currently, however, it has limitations that make it hard to develop a large-scale application. I have noticed at least two: an account can submit at most 1000 requests per day, and each query returns at most 10 search results. With these two limits, client-side programs cannot retrieve many results frequently from Google through the API, and thus cannot do interesting processing such as result reranking at a large scale.

The Yahoo Web API permits 5000 queries per IP address per day and 50 search results per query, so it is friendlier to developers. Meanwhile, MSN is also preparing to release its Web APIs (see the news from News.com). I hope the competition will push all search engines to improve their Web APIs in the near future, which will benefit developers and eventually end users.
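To make the difference concrete, here is a small Python sketch that works out the daily ceilings implied by the limits quoted above. The `ApiQuota` class and its method names are my own illustration, not part of either API, and the sketch ignores the fact that Google counts per account while Yahoo counts per IP address.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ApiQuota:
    """Hypothetical container for the per-day limits quoted in this post."""
    name: str
    requests_per_day: int
    results_per_request: int

    def max_results_per_day(self) -> int:
        # Upper bound on results a single client can fetch in one day.
        return self.requests_per_day * self.results_per_request

    def requests_needed(self, wanted_results: int) -> int:
        # Ceiling division: requests needed to page through the wanted results.
        return -(-wanted_results // self.results_per_request)

    def queries_per_day(self, wanted_results: int) -> int:
        # How many distinct queries can be served per day at that depth.
        return self.requests_per_day // self.requests_needed(wanted_results)


GOOGLE = ApiQuota("Google Web API", requests_per_day=1000, results_per_request=10)
YAHOO = ApiQuota("Yahoo Web API", requests_per_day=5000, results_per_request=50)

if __name__ == "__main__":
    for quota in (GOOGLE, YAHOO):
        print(quota.name,
              quota.max_results_per_day(),
              quota.queries_per_day(100))
```

Under these numbers, a client that wants the top 100 results for reranking can serve about 100 queries per day through Google and about 2500 through Yahoo.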

Notions of Personalization in Industry

Besides personalized search engines, industry also offers personalized portals and recommendation systems, which are briefly discussed below.

Personalized Portal: My Yahoo is the pioneer of the personalized web portal, which includes personalized news, weather forecasts, comics, and TV listings. The user can customize the portal by choosing the content of interest, colors, layout, and so on. Findory is a web site that provides a personalized news service. Unlike My Yahoo, it does not require the user to specify interests explicitly. Instead, the site infers the user's interests implicitly from the user's interaction history on the site. The more browsing history it collects, the better its selection of personalized news articles becomes.

Recommendation System: Many e-commerce web sites try to build a personalized store for each online customer. Amazon is the most famous example. It uses collaborative filtering techniques to recommend products to customers based on the products they have previously purchased or viewed.
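As a rough illustration of the collaborative filtering idea, the following Python sketch counts how often two products appear in the same customer's history and recommends the products that co-occur most with what a customer has already seen. It is a toy, not Amazon's actual algorithm; the function names and the sample data are made up for the example.

```python
from collections import Counter, defaultdict
from itertools import permutations


def build_cooccurrence(histories):
    """Count, for each item, how many customers also have each other item."""
    co = defaultdict(Counter)
    for items in histories.values():
        # set() so that a repeated purchase is counted once per customer
        for a, b in permutations(set(items), 2):
            co[a][b] += 1
    return co


def recommend(co, user_items, k=3):
    """Return up to k unseen items ranked by summed co-occurrence counts."""
    seen = set(user_items)
    scores = Counter()
    for item in seen:
        for other, count in co.get(item, {}).items():
            if other not in seen:
                scores[other] += count
    # Sort by score (descending), then by name so ties are deterministic.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:k]]


# Hypothetical purchase/view histories.
histories = {
    "alice": ["book_a", "book_b"],
    "bob": ["book_a", "book_b", "book_c"],
    "carol": ["book_b", "book_c"],
    "dave": ["book_a", "book_d"],
}

if __name__ == "__main__":
    co = build_cooccurrence(histories)
    print(recommend(co, ["book_a"], k=2))
```

A real system would normalize the counts (popular items co-occur with everything) and would have to scale to millions of products, but the core signal is the same: what other customers with similar histories bought or viewed.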

Notions of Personalization in Personalized Search Engine

Web search engines have been very successful in helping people find information on the Web, especially for simple information needs such as homepage finding. However, search engines still perform poorly on many other tasks. There are many reasons for this poor performance, and two important ones are frequently pointed out. First, many user queries are ambiguous, or the user does not know how to specify the information need exactly. Thus the search engine cannot infer the user's real information need from the current query alone. Second, information retrieval is an interactive process in which users adjust their queries as they go. Therefore, the search engine should also adjust its inference of the user's information need. Nevertheless, most, if not all, search engines currently use only the user’s current query to do the search. Some search engine companies, such as Google, MSN Search and Yahoo, are trying to use contextual and personal information to help the search, and some, such as Google, have already released a test version of personalized search. Yahoo co-founder Jerry Yang said that the relevance of search is still the Holy Grail for any search application and that the key challenge for Yahoo and all search companies going forward will be to find ways to increase the personalization of results, i.e., making sure that a user truly finds what he or she is looking for when typing in a keyword search.

Notions of Personalization in Human-Computer Interaction Community

Currently, there is much interest in the personalization of product interfaces. For example, mobile phones are now sold with replaceable colored covers, e-commerce sites learn a user's preferences, and word processors allow you to customize the menus and toolbars. In an HCI 2000 poster, personalization is defined as follows.

  Personalization is defined here as a process that changes the functionality, interface, information
content, or distinctiveness of a system to increase its personal relevance to an individual.

The motivations for personalization are divided into those that primarily facilitate work, e.g., bookmarking a web page, and those that primarily accommodate social requirements, e.g., expressing the identity of the user. The HCI community focuses on how to model user search behavior, which user actions (such as mouse movements) are related to user interests, and how the system can extract useful information from the user's interaction with the system in order to personalize. In an IUI 2004 paper, the author studied the correlation between four mouse operations and user interests, and used these mouse operations as clues to extract context keywords for similarity search.

Notions of Personalization in Information Retrieval Community

Research in information retrieval has a long history dating back to the 1950s. Over the decades, significant progress has been made in developing retrieval models such as the vector space model, the probabilistic model, and more recently the statistical language model, in performing large-scale empirical evaluation, and in building useful systems such as SMART, Lemur and Google. Nevertheless, almost all existing retrieval models and systems can be characterized as “one size fits all”. Only the user's query is used to represent the information need, and there is no representation of search context or user preference. Thus the same query submitted by different users is treated in exactly the same way, and much of the responsibility for finding relevant information falls on the user. The ideal retrieval system, however, should proactively incorporate both the user’s search context and personal preferences into the retrieval decision process. In a recent workshop on challenges in information retrieval and language modeling, personalization and contextual search was considered one of the two big challenges in information retrieval. The participants define contextual search as follows.

Contextual Search: Combine search technologies and knowledge about query and search context into a single framework in order to provide the most “appropriate” answer for a user’s
information needs.


However, despite recent attention to this problem, little progress has been made, due to the difficulty of capturing and representing knowledge about the user, context and task in the general web search environment. Although there are many studies of retrieval models (by computer science researchers) and of user models and the user's information seeking process (by information science researchers), research on user models and research on retrieval models are currently not well integrated.

Participants of the workshop believe that future search engines should be able to collect and use context and query features unobtrusively to infer characteristics of the information need. A retrieval framework that integrates the retrieval model and the user model needs to be proposed, studied and evaluated empirically.
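To give a feel for what combining a retrieval model with a user model could look like, here is a minimal Python sketch that reranks results by interpolating two cosine similarities in a simple term-frequency vector space: one between the query and the document, and one between the user's search context and the document. This is only an illustration under my own assumptions (whitespace tokenization, raw term counts, a hand-picked weight `alpha`), not the framework proposed at the workshop or the method used in any particular system.

```python
import math
from collections import Counter


def tf_vector(text):
    """Bag-of-words term-frequency vector using naive whitespace tokenization."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rerank(query, context_texts, results, alpha=0.5):
    """Sort results by alpha * sim(query, doc) + (1 - alpha) * sim(context, doc).

    context_texts stands in for the user model, e.g. previous queries or
    snippets of results the user clicked earlier in the session.
    """
    q = tf_vector(query)
    ctx = Counter()
    for text in context_texts:
        ctx.update(tf_vector(text))

    def score(doc):
        d = tf_vector(doc)
        return alpha * cosine(q, d) + (1 - alpha) * cosine(ctx, d)

    return sorted(results, key=score, reverse=True)


if __name__ == "__main__":
    results = [
        "jaguar car dealership prices",
        "jaguar big cat wildlife facts",
    ]
    context = ["big cat habitat", "wildlife conservation"]
    print(rerank("jaguar", context, results))
```

With `alpha=1.0` the context is ignored and the ranking depends on the query alone, which is the “one size fits all” behavior. Lowering `alpha` lets the same ambiguous query produce different rankings for users with different contexts.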

Web Browser, Search Engine and Toolbar

The web browser is the most important window onto the immense information on the Internet. There are Internet Explorer (IE) (IE 7.0 is in beta testing), Firefox (more than 80 million downloads since its release on November 9, 2004), Opera (it just celebrated its 10th anniversary on August 30, 2005), Netscape (watch the drama of the browser war between Netscape/Firefox and IE), Safari, and others. Developers can add new functionality to the web browser through add-ins such as the Google toolbar.

The search engine helps the user find information on the Internet. Google, Yahoo and MSN are the dominant players in the search engine arena (watch the drama of Microsoft’s suit against Google and Kai-Fu Lee). All of them offer IE toolbars, which let the user search without visiting the search engine homepage, and APIs, which help developers add new functionality based on those search engines.

The UCAIR toolbar is an IE toolbar that uses Google search results as its basic results. So far, it does not make use of the Google APIs, but that is an option under consideration.
