I was asked to help solve the following word puzzle: find valid n-characters-word from m characters.
In the following example, The answer is combine.
I am not good at this kind of game. But I am a programmer, so I can write a program to do it.
I use open source English dictionary libraries:
WordNet http://wordnet.princeton.edu/wordnet/download/.
JWI: http://projects.csail.mit.edu/jwi/.
JWI is a Java library for interfacing with Wordnet.
First I will load all English words into memory, later, when we call dict.getIndexWord, it will directly search the word HashMap in the memory.
First I will load all English words into memory, later, when we call dict.getIndexWord, it will directly search the word HashMap in the memory.
private void init(String dictFile) throws IOException, InterruptedException { dict = new RAMDictionary(new File(dictFile), ILoadPolicy.NO_LOAD); dict.open(); dict.load(true); }In this article, I solve this problem with recursion.
1. Get all combinations n-characters-word of m characters.
2. Iterative every word in the list returned in step 1, determine whether it is a valid word.
Another similar word game is to find the longest words from m characters.
To solve this, I tried get m-length word, then (m-1) length word, etc. The code is like below:
public List<String> getMaxLenValidWord(String str) throws MalformedURLException, IOException { List<String> list = new ArrayList<>(); for (int len = str.length(); len > 0; len--) { list = getAllValidWords(str, len); if (!list.isEmpty()) { return list; } } return list; }This is not very efficient, as it will stores all all combinations n-characters-word of m characters in memory, this is huge. When n and m is large, it will throw OOM error.
In later post , I will explain how to solve this problem iteratively.
Implementation
The code is like below: you can review the complete code at Github.
The output is like below:
getAllValidWords took 5095 mill seconds.
getMaxLenValidWord throws Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
package org.codeexample.algorithms.collected.wordpuzzle; import edu.mit.jwi.RAMDictionary; import edu.mit.jwi.data.ILoadPolicy; import edu.mit.jwi.item.IIndexWord; import edu.mit.jwi.item.POS; public class WordPuzzleWithRecusion { private RAMDictionary dict; public WordPuzzleWithRecusion(String dictFile) throws IOException, InterruptedException { init(dictFile); } private void init(String dictFile) throws IOException, InterruptedException { dict = new RAMDictionary(new File(dictFile), ILoadPolicy.NO_LOAD); dict.open(); dict.load(true); } public void close() { dict.close(); } public List<String> getAllValidWords(String str, int length) throws MalformedURLException, IOException { List<String> lists = new ArrayList<String>(); long start = new Date().getTime(); HashSet<String> result = getNCombination(str, length); System.out.println("getAllValidWords took " + (new Date().getTime() - start) + " mill seconds."); for (String word : result) { if (isValidWorld(word)) { lists.add(word); } } System.out.println("getAllValidWords took " + (new Date().getTime() - start) + " mill seconds."); return lists; } public List<String> getMaxLenValidWord(String str) throws MalformedURLException, IOException { long start = new Date().getTime(); List<String> list = new ArrayList<>(); for (int len = str.length(); len > 0; len--) { list = getAllValidWords(str, len); if (!list.isEmpty()) { return list; } } System.out.println("getMaxLenValidWord took " + (new Date().getTime() - start) + " mill seconds, found " + list); return list; } /** * Solve the problem using recursion, need a lot of memory. */ private HashSet<String> getNCombination(String strs, int len) { HashSet<String> result = new HashSet<String>(); if (strs == null) { // error case return result; } if (len == 0 || len > strs.length()) return result; if (strs.length() == 1 && len == 1) { // base case result.add(strs); return result; } HashSet<String> result1 = getNCombination(strs.substring(1), len); result.addAll(result1); String firstStr = strs.substring(0, 1); if (firstStr.length() == len) { result.add(firstStr); } else { HashSet<String> result2 = getNCombination(strs.substring(1), len - 1); for (String tmp : result2) { for (int i = 0; i < tmp.length(); i++) { String tmpResult = tmp.substring(0, i) + firstStr + tmp.substring(i); result.add(tmpResult); } result.add(tmp + firstStr); } } return result; } public static void main(String[] args) throws Exception { String str = "oivmwujbecn"; int length = 7; // getAllValidWords took 5095 mill seconds. WordPuzzleWithRecusion instance = new WordPuzzleWithRecusion( "C:/Program Files (x86)/WordNet/2.1/dict"); System.out.println(instance.getAllValidWords(str, length)); // This will throw OOM System.out.println(instance.getMaxLenValidWord(str)); instance.close(); } private boolean isValidWorld(String word) { Boolean isWord = false; IIndexWord idxWord = dict.getIndexWord(word, POS.NOUN); if (idxWord != null && !idxWord.getWordIDs().isEmpty()) { isWord = true; } return isWord; } }