
English dictionary as txt or xml file with support of synonyms [closed]

Developer https://www.devze.com 2022-12-27 14:46 Source: Internet
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.

We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.

Closed 5 years ago.


Can someone point me to where I can download an English dictionary as a txt or XML file? I am building a simple app for myself and am looking for something I can start using immediately, without learning a complex API.

Support for synonyms would be great; that is, it should be easy to retrieve all the synonyms for a particular word.

It would be absolutely fantastic if the dictionary listed both the British and American spellings of words where they differ.

Even a small dictionary (a few thousand words) would be OK; I only need it for a small project.

I would even be willing to buy one if the price is reasonable and the dictionary is easy to use. Simple XML would be great.

Any directions, please?


WordNet is what you want. It's big, containing over a hundred thousand entries, and it's freely available.

However, it's not stored as XML. To access the data, you'll want to use one of the existing WordNet APIs for your language of choice.

Using the APIs is generally pretty straightforward, so I don't think you have to worry much about "learning (a) complex API". For example, borrowing from the WordNet HOWTO for the Python-based Natural Language Toolkit (NLTK):

 >>> from nltk.corpus import wordnet
 >>> 
 >>> # Note: in NLTK 3+, definition(), examples(), lemmas() are methods
 >>> # Get all synsets for 'dog'
 >>> # This is essentially all senses of the word in the db
 >>> wordnet.synsets('dog')
 [Synset('dog.n.01'), Synset('frump.n.01'), Synset('dog.n.03'), 
  Synset('cad.n.01'), Synset('frank.n.02'),Synset('pawl.n.01'), 
  Synset('andiron.n.01'), Synset('chase.v.01')]
 
 >>> # Get the definition and usage for the first synset
 >>> wordnet.synset('dog.n.01').definition()
 'a member of the genus Canis (probably descended from the common 
 wolf) that has been domesticated by man since prehistoric times; 
 occurs in many breeds'
 >>> wordnet.synset('dog.n.01').examples()
 ['the dog barked all night']

 >>> # Get antonyms for 'good'
 >>> wordnet.synset('good.a.01').lemmas()[0].antonyms()
 [Lemma('bad.a.01.bad')]

 >>> # Get synonyms for the first noun sense of 'dog'
 >>> wordnet.synset('dog.n.01').lemmas()
 [Lemma('dog.n.01.dog'), Lemma('dog.n.01.domestic_dog'), 
 Lemma('dog.n.01.Canis_familiaris')]

 >>> # Get synonyms for all senses of 'dog'
 >>> for synset in wordnet.synsets('dog'): print(synset.lemmas())
 [Lemma('dog.n.01.dog'), Lemma('dog.n.01.domestic_dog'), 
 Lemma('dog.n.01.Canis_familiaris')]
 ...
 [Lemma('frank.n.02.frank'), Lemma('frank.n.02.frankfurter'), 
 ...
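
If all you need is a flat list of synonyms for a word, the per-synset lookups above can be wrapped in a small helper. This is only a sketch: the names `synonyms` and `lookup` are made up for illustration, and it assumes NLTK 3 or later (where `lemmas()` and `name()` are methods) with the WordNet corpus already downloaded, e.g. via `nltk.download('wordnet')`.

```python
def synonyms(word, lookup=None):
    """Return the distinct synonyms of `word` across all of its senses.

    `lookup` is a hypothetical hook: any callable that maps a word to a
    list of synset-like objects. It defaults to NLTK's wordnet.synsets.
    """
    if lookup is None:
        # Imported lazily so the helper can be tried without NLTK installed.
        from nltk.corpus import wordnet
        lookup = wordnet.synsets

    found = []
    for synset in lookup(word):
        for lemma in synset.lemmas():
            # WordNet joins multi-word lemmas with underscores.
            name = lemma.name().replace('_', ' ')
            if name.lower() != word.lower() and name not in found:
                found.append(name)
    return found
```

Calling `synonyms('dog')` should then give the lemma names from every sense of the word in one list, in the order WordNet returns the senses.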

While there is an American English bias in WordNet, it supports British spellings and usage. For example, you can look up 'colour' and one of the synsets for 'lift' is 'elevator.n.01'.

Notes on XML

If having the data represented as XML is essential, you could easily use one of the APIs to access the WordNet database and convert it into XML, e.g. see Thinking XML: Querying WordNet as XML.
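
As a rough sketch of such a conversion, the following uses Python's standard `xml.etree.ElementTree`. The element names (`dictionary`, `entry`, `sense`, `definition`, `synonym`) and the function name `thesaurus_to_xml` are invented for illustration and are not a WordNet or NLTK format; the input is assumed to be a plain dict that you have already filled from whichever WordNet API you use.

```python
import xml.etree.ElementTree as ET


def thesaurus_to_xml(entries):
    """Serialize {word: [(definition, [synonym, ...]), ...]} to an XML string.

    The schema here is hypothetical; adapt the element names to your app.
    """
    root = ET.Element("dictionary")
    for word, senses in entries.items():
        entry = ET.SubElement(root, "entry", word=word)
        for definition, synonym_names in senses:
            sense = ET.SubElement(entry, "sense")
            ET.SubElement(sense, "definition").text = definition
            for name in synonym_names:
                ET.SubElement(sense, "synonym").text = name
    return ET.tostring(root, encoding="unicode")


# Hand-written sample data standing in for values pulled from WordNet.
sample = {
    "dog": [("a domesticated canine", ["domestic dog", "Canis familiaris"])],
}
xml_text = thesaurus_to_xml(sample)
```

Keeping the conversion separate from the lookup means the XML file only has to be generated once, after which the app can read it without any WordNet dependency.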


I know this question is quite old, but I had trouble finding this as a txt file myself. If anyone is looking for a synonyms and antonyms database as a txt file, the simplest yet very detailed one I found is https://ia801407.us.archive.org/10/items/synonymsantonyms00ordwiala/synonymsantonyms00ordwiala_djvu.txt .


I have used Roget's Thesaurus in the past. It has the synonymy information in plain text files. There is also some Java code to help you parse the text.

These pages provide links to a number of thesauri and lexical resources, some of which are freely downloadable:

http://www.w3.org/2001/sw/Europe/reports/thes/thes_links.html

http://www-a2k.is.tokushima-u.ac.jp/member/kita/NLP/lex.html


Try WordNet.
