Stanford NER - Extract Multi word entities

开发者 (https://www.devze.com), 2023-02-27 05:06. Source: the web
How can I tag collocations in Stanford NER? Currently it tags Federal Reserve Bank of New York as

<wi num="11" entity="ORGANIZATION">Federal</wi> <wi num="12" entity="ORGANIZATION">Reserve</wi> <wi num="13" entity="ORGANIZATION">Bank</wi> <wi num="14" entity="ORGANIZATION">of</wi> <wi num="15" entity="ORGANIZATION">New</wi> <wi num="16" entity="ORGANIZATION">York</wi>

I want it to be recognized as

<wi num="11" entity="ORGANIZATION">Federal Reserve Bank of New York</wi>

Is this possible?


Something similar is possible, yes. If you pass the flag

-outputFormat inlineXML

then you'll get:

<ORGANIZATION>Federal Reserve Bank of New York</ORGANIZATION>

(Note that this isn't really changing how Stanford NER works but just the formatting of output. If you don't like any of the provided output formats, it is fairly simple to write your own.)
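If you do end up writing your own formatter, the core step is just collapsing runs of adjacent tokens that share an entity label. Here is a minimal sketch in Python, assuming you have already turned the tagger output into (word, label) pairs and that non-entity tokens carry the label O. The names merge_entities and to_inline_xml are made up for this example and are not part of Stanford NER.

```python
from itertools import groupby
from xml.sax.saxutils import escape


def merge_entities(tagged_tokens, outside_label="O"):
    """Collapse runs of adjacent tokens sharing a label into (text, label) spans.

    tagged_tokens is an iterable of (word, label) pairs. Tokens labelled
    outside_label are kept as individual items. Hypothetical helper, not a
    Stanford NER API.
    """
    spans = []
    for label, group in groupby(tagged_tokens, key=lambda pair: pair[1]):
        words = [word for word, _ in group]
        if label == outside_label:
            # Non-entity tokens stay separate.
            spans.extend((word, label) for word in words)
        else:
            # A run of same-label tokens becomes one multi-word entity.
            spans.append((" ".join(words), label))
    return spans


def to_inline_xml(spans, outside_label="O"):
    """Render merged spans in an inlineXML-like style."""
    parts = []
    for text, label in spans:
        if label == outside_label:
            parts.append(escape(text))
        else:
            parts.append(f"<{label}>{escape(text)}</{label}>")
    return " ".join(parts)


tokens = [
    ("Federal", "ORGANIZATION"),
    ("Reserve", "ORGANIZATION"),
    ("Bank", "ORGANIZATION"),
    ("of", "ORGANIZATION"),
    ("New", "ORGANIZATION"),
    ("York", "ORGANIZATION"),
]
print(to_inline_xml(merge_entities(tokens)))
```

One caveat with this approach: two distinct entities of the same type that sit directly next to each other get merged into a single span, because the labels alone carry no boundary information.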
