Remove common words but when asked to return an understandable content?

Developers https://www.devze.com 2023-04-12 07:09 Source: Internet

I was wondering whether a submitted text like the one below can somehow (maybe with an algorithm) be summarized by removing the common words

Scarlet and blue have featured on the club shirt for more than one hundred years and the club is widely known as the ‘Blaugrana’ in reference to the names of these colours in the Catalan language.

but, when asked for, still be turned back from the saved data into understandable content. It does not have to be the same text, just something that you can easily understand.

Will this make use of artificial intelligence? What methods exist today for doing this?

Update (to clear things up): I want to know how a computer can connect keywords to produce understandable content. For example, "Scarlet, blue, club, shirt" would be returned as something like "Scarlet and blue are the club shirt".

There are 2 different tasks:

1. Extract important information.
2. Generate meaningful content.

To accomplish both, you need some meaningful intermediate representation of the text between (1) and (2). The best option I can think of is an ontology: first extract facts from the free text and put them into the ontology, then generate text from that ontology. Something like this. Anyway, you need to extract facts, not keywords.
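To make the "facts, not keywords" idea more concrete, here is a minimal sketch of the second half only: generating text from stored facts. It assumes the facts have already been extracted and are kept as simple subject/predicate/object records, which is a very reduced stand-in for a real ontology. The predicate names, the templates and the functions `joinValues` and `generateText` are all made up for this illustration.

```php
<?php
// Hypothetical mini "ontology": facts stored as subject/predicate/object records.
// They are written by hand here; a real system would extract them from free text.
$facts = [
    ['subject' => 'club shirt', 'predicate' => 'has_colours', 'object' => ['scarlet', 'blue']],
    ['subject' => 'club',       'predicate' => 'known_as',    'object' => ['Blaugrana']],
];

// One sentence template per predicate (illustrative names, not a standard vocabulary).
$templates = [
    'has_colours' => 'The %s is %s.',
    'known_as'    => 'The %s is known as %s.',
];

// Join a list of values as "a", "a and b", or "a, b and c".
function joinValues(array $values): string
{
    if (count($values) <= 1) {
        return implode('', $values);
    }
    $last = array_pop($values);
    return implode(', ', $values) . ' and ' . $last;
}

// Turn every stored fact back into a readable sentence.
function generateText(array $facts, array $templates): array
{
    $sentences = [];
    foreach ($facts as $fact) {
        $template = $templates[$fact['predicate']] ?? null;
        if ($template === null) {
            continue; // no template for this kind of fact, so skip it
        }
        $sentences[] = sprintf($template, $fact['subject'], joinValues($fact['object']));
    }
    return $sentences;
}

echo implode("\n", generateText($facts, $templates)), "\n";
// The club shirt is scarlet and blue.
// The club is known as Blaugrana.
```

Because each record says how its parts relate, a template can produce a grammatical sentence. A bare keyword list such as "Scarlet, blue, club, shirt" carries no such relation, which is why the extraction step has to produce facts.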

Why do you need this? It looks like you need compression rather than intelligent word removal and restoration. Try this:

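// Compress the text with gzip, then base64-encode it so it can be stored as plain text.
function compress($text)
{
    return base64_encode(gzencode($text));
}

// Reverse compress(): base64-decode, then gunzip back to the original text.
function decompress($text)
{
    return gzdecode(base64_decode($text));
}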
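One thing worth checking before relying on this: base64 output is about a third larger than its input, so on short texts the stored string may not be smaller than the original. A quick sketch to measure it, using the same two calls as above (the sample text is shortened from the question, and PHP's zlib extension is assumed to be available):

```php
<?php
// Sample text, shortened from the question.
$text = 'Scarlet and blue have featured on the club shirt for more than one hundred years '
      . 'and the club is widely known as the Blaugrana.';

$gzipped = gzencode($text);          // binary gzip data (needs the zlib extension)
$stored  = base64_encode($gzipped);  // text-safe form, roughly 4/3 the size of $gzipped

// The round trip is lossless: every word comes back, common or not.
$restored = gzdecode(base64_decode($stored));
var_dump($restored === $text);

// Compare sizes in bytes; on short inputs the saving may be small or even negative.
printf("original: %d, gzipped: %d, stored: %d\n", strlen($text), strlen($gzipped), strlen($stored));
```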

The keyword is "Text Summarization".
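As a rough illustration of what that term covers, here is a sketch of one of the simplest variants, frequency-based extractive summarization: drop the common words, count the remaining ones, and keep the sentences whose words occur most often. The stop-word list, the sample text and the function names `contentWords` and `summarize` are assumptions made for this example, not part of any library.

```php
<?php
// Illustrative stop-word list; real lists are much longer.
const STOP_WORDS = ['the', 'and', 'is', 'in', 'of', 'to', 'a', 'as', 'on', 'for', 'it', 'was'];

// Lower-case a text and return its words with the stop words removed.
function contentWords(string $text): array
{
    preg_match_all('/[a-z]+/', strtolower($text), $matches);
    return array_values(array_diff($matches[0], STOP_WORDS));
}

// Return the $count highest-scoring sentences, in their original order.
function summarize(string $text, int $count = 1): array
{
    // Split after sentence-ending punctuation followed by whitespace.
    $sentences = preg_split('/(?<=[.!?])\s+/', trim($text), -1, PREG_SPLIT_NO_EMPTY);
    $frequency = array_count_values(contentWords($text));

    // Score each sentence by the document-wide frequency of its distinct words.
    $scores = [];
    foreach ($sentences as $i => $sentence) {
        $score = 0;
        foreach (array_unique(contentWords($sentence)) as $word) {
            $score += $frequency[$word];
        }
        $scores[$i] = $score;
    }

    arsort($scores);                                     // best score first, keys preserved
    $keep = array_slice(array_keys($scores), 0, $count); // indexes of the top sentences
    sort($keep);                                         // restore document order
    return array_map(fn($i) => $sentences[$i], $keep);
}

// Made-up sample text for the demonstration.
$text = 'The club shirt is scarlet and blue. The weather was mild. '
      . 'Fans call the club Blaugrana after the shirt colours.';

echo implode("\n", summarize($text, 2)), "\n";
// The club shirt is scarlet and blue.
// Fans call the club Blaugrana after the shirt colours.
```

With this sample, the sentence about the weather is dropped because none of its words recur elsewhere. Note that summing frequencies favours long sentences; this is only meant to show the idea, and it returns existing sentences rather than generating new ones.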

Update: Based on your update, I have expanded my answer. You can store your documents in a text search engine such as Lucene/Elasticsearch and query it with your keywords (such as "Scarlet, Blue, Club, Shirt") to retrieve the matching documents. This is not exactly the "other way round", but you can build additional domain-specific analysis on top of the results returned by the query.
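To show the retrieval idea without a running search server, here is a toy in-memory inverted index. It is only a stand-in for what a search engine does internally and is not the Lucene or Elasticsearch API; the sample documents and the functions `tokenize`, `buildIndex` and `search` are made up for this sketch.

```php
<?php
// Toy stand-in for a search index; this is NOT the Lucene or Elasticsearch API.
// The sample documents are made up for illustration.
$documents = [
    1 => 'Scarlet and blue have featured on the club shirt for more than one hundred years.',
    2 => 'The club opened a new stadium museum.',
    3 => 'Blue is a popular shirt colour.',
];

// Split a text into distinct lower-case word tokens.
function tokenize(string $text): array
{
    preg_match_all('/[a-z]+/', strtolower($text), $matches);
    return array_unique($matches[0]);
}

// Build an inverted index: word => ids of the documents that contain it.
function buildIndex(array $documents): array
{
    $index = [];
    foreach ($documents as $id => $text) {
        foreach (tokenize($text) as $word) {
            $index[$word][] = $id;
        }
    }
    return $index;
}

// Rank documents by how many of the query keywords they contain.
function search(array $index, array $keywords): array
{
    $hits = [];
    foreach ($keywords as $keyword) {
        foreach ($index[strtolower($keyword)] ?? [] as $id) {
            $hits[$id] = ($hits[$id] ?? 0) + 1;
        }
    }
    arsort($hits); // documents matching the most keywords come first
    return $hits;
}

$index = buildIndex($documents);
$hits  = search($index, ['Scarlet', 'Blue', 'Club', 'Shirt']);
foreach ($hits as $id => $matched) {
    printf("doc %d matched %d keyword(s)\n", $id, $matched);
}
// doc 1 matched 4 keyword(s)
// doc 3 matched 2 keyword(s)
// doc 2 matched 1 keyword(s)
```

A real engine adds stemming, relevance scoring and persistence on top of this, but the principle is the same: the keywords lead you back to the original, readable documents instead of having to regenerate the text.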