The XML Keyword Query Suggestion in Sponsored Search Project

ABSTRACT

In sponsored search, where an enormous number of queries must be matched against a much smaller corpus of advertiser listings, a search engine or information retrieval system may not be able to retrieve documents matching the query as stated. In this project we study query suggestion, that is, generating a new query to replace a user's original search query so that it finds matches among the advertiser listings. We focus on XML keyword queries because XML is a standard representation format for advertising data. Compared with query suggestion for text document search, XML offers a better opportunity for generating meaningful refined queries, since XML data is semi-structured and its markup provides meaningful annotations of the data content. In this project, we design a novel, effective, and efficient two-level XML query suggestion model by mining clickthrough data and analyzing the semi-structure of XML documents. The model includes clickthrough data processing, which extracts the semantic relationships between queries, and a query transformation technique, which uses term deletion, merging, splitting, and substitution operations to find better search results in XML trees.
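To make the four transformation operations concrete, the following is a minimal Python sketch of how candidate queries might be generated from a keyword query. It is an illustration only, not the project's implementation: the function names, the `vocabulary` set (assumed here to hold terms that occur in the XML data), and the `related` map (assumed here to hold term pairs mined from clickthrough data) are all hypothetical, and the ranking of candidates against XML trees is left out.

```python
from typing import Dict, List, Set

Query = List[str]


def delete_terms(query: Query) -> List[Query]:
    """Term deletion: drop one term at a time (never empties the query)."""
    if len(query) < 2:
        return []
    return [query[:i] + query[i + 1:] for i in range(len(query))]


def merge_terms(query: Query, vocabulary: Set[str]) -> List[Query]:
    """Term merging: join two adjacent terms if the result is a known term."""
    out = []
    for i in range(len(query) - 1):
        merged = query[i] + query[i + 1]
        if merged in vocabulary:
            out.append(query[:i] + [merged] + query[i + 2:])
    return out


def split_terms(query: Query, vocabulary: Set[str]) -> List[Query]:
    """Term splitting: cut one term into two parts that are both known terms."""
    out = []
    for i, term in enumerate(query):
        for cut in range(1, len(term)):
            left, right = term[:cut], term[cut:]
            if left in vocabulary and right in vocabulary:
                out.append(query[:i] + [left, right] + query[i + 1:])
    return out


def substitute_terms(query: Query, related: Dict[str, List[str]]) -> List[Query]:
    """Term substitution: swap a term for a related one.

    `related` is a hypothetical stand-in for relationships between queries
    extracted from clickthrough data.
    """
    out = []
    for i, term in enumerate(query):
        for alternative in related.get(term, []):
            out.append(query[:i] + [alternative] + query[i + 1:])
    return out


def suggest(query: Query, vocabulary: Set[str],
            related: Dict[str, List[str]]) -> List[Query]:
    """Collect unique candidates from all four operations, in a fixed order."""
    candidates = (delete_terms(query) + merge_terms(query, vocabulary)
                  + split_terms(query, vocabulary)
                  + substitute_terms(query, related))
    seen, result = set(), []
    for candidate in candidates:
        key = tuple(candidate)
        if key not in seen and candidate != query:
            seen.add(key)
            result.append(candidate)
    return result


if __name__ == "__main__":
    vocabulary = {"note", "book", "notebook", "cheap", "laptop"}
    related = {"notebook": ["laptop"]}
    for candidate in suggest(["cheap", "note", "book"], vocabulary, related):
        print(" ".join(candidate))
```

In a full system, each candidate would then be scored, for example by how well its terms match element names and content in the XML trees, before being shown as a suggestion.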

Demo

Data

Currently, our repository collects XML data from the following websites: We are performing experiments to collect more real-world data for XML sponsored search.

For any questions regarding this project, please send email to jiahenglu AT gmail.com