Can I create a corpus from a collection of strings in NLTK? [duplicate]

Developer https://www.devze.com 2023-02-01 02:48 Source: Internet
This question already has answers here: Creating a new corpus with NLTK (4 answers) Closed 9 years ago.

Is there a way to create a corpus without having to store the items in files? For instance, I want to manipulate tweets or paragraphs that I am grabbing from the web. Can I do something like:

myCorpus = MyCorpus([
    ('id', 'item', 'category'), 
    ('id', 'item', 'category'),
    ('id', 'item', 'category'), 
    ... ])

Or

myCorpus.add('id', 'item', 'category')

The purpose is to manipulate the corpus with existing NLTK capabilities. I checked TextCollection but it seems that it doesn't handle categories.


Why not just write the strings out to a file or files and then process them as a corpus?
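
A minimal sketch of that suggestion, assuming each item is an (id, text, category) tuple as in the question and that the ids are safe to use as file names. The helper names `write_corpus` and `load_corpus`, the sample data, and the one-directory-per-category layout are illustrative choices, not something from the original post. The reader class is NLTK's `CategorizedPlaintextCorpusReader`, which here takes the category from the directory part of each file id.

```python
import os
import tempfile


def write_corpus(items, root):
    """Write (item_id, text, category) tuples to root/<category>/<item_id>.txt."""
    for item_id, text, category in items:
        category_dir = os.path.join(root, category)
        os.makedirs(category_dir, exist_ok=True)
        # Assumption: item_id contains no characters that are invalid in file names.
        path = os.path.join(category_dir, f"{item_id}.txt")
        with open(path, "w", encoding="utf-8") as handle:
            handle.write(text)


def load_corpus(root):
    """Open the directory written by write_corpus as a categorized NLTK corpus."""
    # Imported lazily so that write_corpus can be used without NLTK installed.
    from nltk.corpus.reader import CategorizedPlaintextCorpusReader

    return CategorizedPlaintextCorpusReader(
        root,
        r".+/.+\.txt",                 # file ids: every .txt file one level down
        cat_pattern=r"(.+)/.+\.txt",   # category = directory part of the file id
        encoding="utf-8",
    )


# Hypothetical sample data standing in for tweets or scraped paragraphs.
items = [
    ("t1", "NLTK makes corpus work easier.", "tech"),
    ("t2", "The match ended in a draw.", "sports"),
    ("t3", "Python 3 is widely used.", "tech"),
]

# A temporary directory keeps the files out of the way; any writable path works.
root = tempfile.mkdtemp(prefix="my_corpus_")
write_corpus(items, root)
```

With NLTK installed, `corpus = load_corpus(root)` should then give access to the usual reader methods, for example `corpus.categories()`, `corpus.fileids(categories="tech")`, and `corpus.words(categories="tech")`. Adding an item later is just another `write_corpus` call followed by reopening the reader.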
