Friday, August 5, 2011

net search considers relationships


Etzioni proposes that, instead of simply looking for strings of text, a web search engine should identify basic entities (people, places, things) and uncover the relationships between them. This is the goal of the UW's Turing Center, which he directs.

The Turing Center has developed an open-source tool called ReVerb that uses information on the web to determine the relationship between two entities.


ReVerb is a program that automatically identifies and extracts binary relationships from English sentences. ReVerb is designed for Web-scale information extraction, where the target relations cannot be specified in advance and speed is important.
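To make "binary relationship" concrete, here is a minimal Python sketch that pulls one (argument, relation, argument) triple out of a sentence that has already been part-of-speech tagged. It is not ReVerb's code or API. The relation pattern (a verb, optionally followed by intervening words and a preposition) is a simplification in the spirit of the syntactic constraint described in the ReVerb paper, and the tag grouping, the noun-run argument heuristic, and all function names are assumptions made for illustration.

```python
import re
from typing import List, Optional, Tuple

Tagged = List[Tuple[str, str]]  # (word, Penn Treebank-style POS tag)


def tag_class(tag: str) -> str:
    """Collapse a POS tag into one letter so a regex can scan the sentence.

    Assumed grouping: V = verb, P = preposition/particle/infinitive marker,
    W = noun, adjective, adverb, pronoun or determiner, O = anything else.
    """
    if tag.startswith("VB"):
        return "V"
    if tag in ("IN", "TO", "RP"):
        return "P"
    if tag.startswith(("NN", "JJ", "RB", "PRP")) or tag == "DT":
        return "W"
    return "O"


# Simplified relation pattern: V, or V P, or V W* P.
RELATION = re.compile(r"V(?:W*P)?")


def noun_run_left(tagged: Tagged, end: int) -> Optional[str]:
    """Contiguous noun tokens ending immediately before index `end`."""
    words = []
    i = end - 1
    while i >= 0 and tagged[i][1].startswith("NN"):
        words.append(tagged[i][0])
        i -= 1
    return " ".join(reversed(words)) or None


def noun_run_right(tagged: Tagged, start: int) -> Optional[str]:
    """First contiguous run of noun tokens at or after index `start`."""
    i = start
    while i < len(tagged) and not tagged[i][1].startswith("NN"):
        i += 1
    words = []
    while i < len(tagged) and tagged[i][1].startswith("NN"):
        words.append(tagged[i][0])
        i += 1
    return " ".join(words) or None


def extract_triple(tagged: Tagged) -> Optional[Tuple[str, str, str]]:
    """Return (arg1, relation, arg2) for the first relation phrase, if any."""
    classes = "".join(tag_class(tag) for _, tag in tagged)
    match = RELATION.search(classes)
    if match is None:
        return None
    # One class letter per token, so regex offsets are token indices.
    relation = " ".join(word for word, _ in tagged[match.start():match.end()])
    arg1 = noun_run_left(tagged, match.start())
    arg2 = noun_run_right(tagged, match.end())
    if arg1 is None or arg2 is None:
        return None
    return (arg1, relation, arg2)
```

Given the hand-tagged sentence "Edison invented the phonograph", `extract_triple` returns `("Edison", "invented", "phonograph")`. A real extractor would also need a tagger, noun-phrase chunking, and some way to score or filter its output, none of which is attempted here.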

Open Information Extraction (IE) is the task of extracting assertions from massive corpora without requiring a pre-specified vocabulary. This paper shows that the output of state-of-the-art Open IE systems is rife with uninformative and incoherent extractions. To overcome these problems, we introduce two simple syntactic and lexical constraints on binary relations expressed by verbs. We implemented the constraints in the ReVerb Open IE system, which more than doubles the area under the precision-recall curve relative to previous extractors such as TextRunner and WOE-pos. More than 30% of ReVerb's extractions are at precision 0.8 or higher, compared to virtually none for earlier systems. The paper concludes with a detailed analysis of ReVerb's errors, suggesting directions for future work.
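For readers unfamiliar with the metric, the "area under the precision-recall curve" can be approximated from a handful of (recall, precision) points with the trapezoid rule. The sketch below is only an illustration of that idea: the function name is hypothetical, and the abstract does not say how the authors computed the area.

```python
from typing import Iterable, Tuple


def pr_auc(points: Iterable[Tuple[float, float]]) -> float:
    """Trapezoid-rule area under a curve given as (recall, precision) points.

    Assumption: the points are samples of one precision-recall curve;
    they are sorted by recall before integrating.
    """
    pts = sorted(points)
    area = 0.0
    for (r0, p0), (r1, p1) in zip(pts, pts[1:]):
        # Width of the recall interval times the mean precision over it.
        area += (r1 - r0) * (p0 + p1) / 2.0
    return area
```

A system that held precision 1.0 across all recall values would score an area of 1.0, so a larger area means the extractor stays accurate as it returns more extractions.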
