开发者

If a word is made up of two valid words

开发者 https://www.devze.com 2023-02-08 09:56 出处:网络
Given a dictionary find out if given word can be made by two words in dictionary. For eg. given \"newspaper\" you have to find if it can be made by two words. (news and paper in this case). Only thing

Given a dictionary find out if given word can be made by two words in dictionary. For eg. given "newspaper" you have to find if it can be made by two words. (news and paper in this case). Only thing i can think o开发者_运维百科f is starting from beginning and checking if current string is a word. In this case checking n, ne, new, news..... check for the remaining part if current string is a valid word.

Also how do you generalize it for k(means if a word is made up of k words) ? Any thoughts?


Starting your split at the center may yield results faster. For example, for newspaper, you would first try splitting at 'news paper' or 'newsp aper'. As you can see, for this example, you would find your result on the first or second try. If you do not find a result, just search outwards. See the example for 'crossbow' below:

cros sbow
cro ssbow
cross bow


For the case with two words, the problem can be solved by just considering all possible ways of splitting the word into two, then checking each half to see if it's a valid word. If the input string has length n, then there are only O(n) different ways of splitting the string. If you store the strings in a structure supporting fast lookup (say, a trie, or hash table).

The more interesting case is when you have k > 2 words to split the word into. For this, we can use a really elegant recursive formulation:

A word can be split into k words if it can be split into a word followed by a word splittable into k - 1 words.

The recursive base case would be that a word can be split into zero words only if it's the empty string, which is trivially true.

To use this recursive insight, we'll modify the original algorithm by considering all possible splits of the word into two parts. Once we have that split, we can check if the first part of the split is a word and if the second part of the split can be broken apart into k - 1 words. As an optimization, we don't recurse on all possible splits, but rather just on those where we know the first word is valid. Here's some sample code written in Java:

 public static boolean isSplittable(String word, int k, Set<String> dictionary) {
     /* Base case: If the string is empty, we can only split into k words and vice-
      * versa.
      */
     if (word.isEmpty() || k == 0)
         return word.isEmpty() && k == 0;

     /* Generate all possible non-empty splits of the word into two parts, recursing on
      * problems where the first word is known to be valid.
      *
      * This loop is structured so that we always try pulling off at least one letter
      * from the input string so that we don't try splitting the word into k pieces
      * of which some are empty.
      */
     for (int i = 1; i <= word.length(); ++i) {
          String first = word.substring(0, i), last = word.substring(i);

          if (dictionary.contains(first) &&
              isSplittable(last, k - 1, dictionary)
              return true;
     }

     /* If we're here, then no possible split works in this case and we should signal
      * that no solution exists.
      */
     return false;
 }

 }

This code, in the worst case, runs in time O(nk) because it tries to generate all possible partitions of the string into k different pieces. Of course, it's unlikely to hit this worst-case behavior because most possible splits won't end up forming any words.


I'd first loop through the dictionary using a strpos(-like) function to check if it occurs at all. Then try if you can find a match with the results. So it would do something like this:

  • Loop through the dictionary strpos-ing every word in the dictionary and saving results into an array, let's say it gives me the results 'new', 'paper', and 'news'.
  • Check if new+paper==newspaper, check if new+news==newspaper, etc, untill you get to paper+news==newspaper which returns.

Not sure if it is a good method though, but it seems more efficient than checking a word letter for letter (more iterations) and you didn't explain how you'd check when the second word started.

Don't know what you mean by 'how do you generalize it for k'.

0

精彩评论

暂无评论...
验证码 换一张
取 消

关注公众号