Skip navigation

Man, I’ve had to deal with writing so brutal java programs this quarter for my 143 class. The LinkList program which requires you to write your own node class, and LinkLint class, plus three other classes that create a mini word editor was probably the hardest. You had to be able to insert or delete a letter at your cursor using LinkLists. If you don’t know what I’m talking about, a LinkList works by using only one class and the only way to get to a node in the list is to call the same class. LinkLists don’t bother me when you just have to use a class that is already written, but when you have to write your own class to make it work that’s a whole nother story… who ever came up with the concept I bow to, lol.

 

Anyways, I spent all night on another ridiculous program. Basically I had to find all possible words that can be created within a certain phrase/word. Like, how many words can you make out of the word "sink". The solution took me forever to find. After asking my teacher for help I finally came up with one. It’s a recursive solution that explodes extremely fast in memory due to the amount of possible solutions. For brushhair there are 362,880 solutions just for using all the letters (also know as a anagram – if i printed them out the list would be huge), and then when counting smaller words it grows to an even more extremely large number. If you enter a word over 10 letters it takes about 5 minutes to finish gathering all the possible solutions. Out of the millions of possible solutions there are only a couple words that match in the dictionary file that was given to us!  Anyways, here is the solution I came up with the code:

 

/* Joey Sipos
 * Assignmetn 07 Anagram solver
 * May 19, 2009
 */
import java.io.*;
import java.util.*;
public class AnagramSolver {
  //ArrayList of all anagram-phrases
  private List<String> finalAnagramList = new ArrayList<String>();
  List<String> gumbleAnagramList = new ArrayList<String>();
  private String phrase = "";
  Set<String> dictionary = new TreeSet<String>();
 
  public AnagramSolver(String phrase1, String dictName) throws FileNotFoundException {
    //add the dictionary words to the set
    String dictWords = "";
    Scanner input = new Scanner(new File(dictName));
    while (input.hasNext()) {
      dictWords = input.next();
      dictionary.add(dictWords);
    }
   
    //remove white spaces from phrase, cosolidate into one word
    Scanner tokenScanner = new Scanner(phrase1);
    String temp = "";
    while (tokenScanner.hasNext()) {
      temp = temp + tokenScanner.next();
    }
    phrase = temp;
  }
 
  //takes a word and produces all possible anagrams
  //all anagrams that are the orginal length of the phrase are stored in a list
  //The list is procesed for smaller words is the solver method
  public void produceAllAnagrams(char insertChar, String word) {
   
    //store the letters into a char array
    char[] arrayWord = word.toCharArray();
    //turn the char array into a linked list
    LinkedList listWord = new LinkedList();
    for (int i = 0; i < arrayWord.length; i++) {
      listWord.add(arrayWord[i]);
    }
   
    for (int i = 0; i <= listWord.size(); i++) {
      //add the insert char to create the new word
      listWord.add(i,insertChar);
      int j = 0;
      //turns the new Word into a string
      String newWord = "";
      while (j < listWord.size()) {
        newWord += listWord.get(j);
        j++;
      }
     
      //recurse
      if (newWord.length() <  phrase.length()) {
        produceAllAnagrams(phrase.charAt(newWord.length()), newWord);
      }
      //store if it the length of the orginal phrase
      if (newWord.length() == phrase.length()) {
        gumbleAnagramList.add(newWord);
        //System.out.println("test");
      }
      //remove our insertchar 
      listWord.remove(i);
    }
  }
 
  public void printAnagrams() {
    String firstLetter = "" + phrase.charAt(0);
    produceAllAnagrams(phrase.charAt(1), firstLetter);
   
    for(int j=0; j < gumbleAnagramList.size(); j++) {
      String anagram = gumbleAnagramList.get(j);
      for(int k=1; k <= phrase.length(); k++) {
        //if the dictionary has the word and is not already in the final list
        if ((dictionary.contains(anagram.substring(0, k))) &&
             (finalAnagramList.contains(anagram.substring(0, k)) == false)) {
          finalAnagramList.add(anagram.substring(0,k));
        }
      }
    }
    System.out.println();
    for (int i=0; i < finalAnagramList.size(); i++) {
      System.out.println(finalAnagramList.get(i));
    }
  }
}
 
 
 
 
 
 
Advertisements

5 Comments

  1. You might want to take this down. Other students can cheat by copying your solution.

    • hey, where’s your main class?

  2. you suck, where’s your main class, i made a even more complex java program before.

    • Main class was in a separate file 😉 This is simply the AnagramSolver class. You have to call the methods from your main. If you were so good you would know this. Fun to look back at this little project through haha

  3. Hi Joe,

    I realize this was posted a few years ago and I don’t know where you are currently at in your computer science journey. Congratulations on coming up with an algorithm to solve your assignment. You mentioned how the time complexity of your algorithm was fairly poor in that as the size of the word increased the time it took to solve increased in a non linear manner (much slower actually).

    I think it’s always good idea to think about these problems a bit before you solve them. The first question would obviously be what is an anagram and how do anagram relate. As you know the relationship between two anagrams is such that you can form the second word by rearranging the letters in the first.

    Naturally when you first look at the problem it seems like an obvious solution is to find all possible combinations of letters for a given word. You can then compare this against a dictionary and return the matches.

    However let’s think about this for a second. That solution might be fine for two letter words because there is only two possible combinations. And it’s probably ok for three letter words with 6 combinations. I would say it’s ok for four and five too. But can you tell me how many combinations there are for 6, 7, 8, 9, and 10 letter words?

    Let’s take a step back and think about what an anagram really is by looking at the difference between words that are not anagrams as oppose to the similarities between anagrams. If you had to describe to me an anagram by what it is not or how it is dissimilar to other words what would you say?

    Well an anagram is not any word of different lengths, correct? And any word that contains any letter that does not appear in the first word is also not an anagram of that word. Also two words are not anagrams if the frequency of any letter used in word 1 is different than the frequency used in word 2.

    Two things pop out to me after breaking this problem down. Firstly, it seems to me that anagrams are somewhat unique to each other and different from all others. And what is our goal here again. We want to know if one word is an anagram of another word. IE we want to know if two words are the same in a particular way with each other and different from others based of a particular property. Let’s start to think mathematically here and instead of looking at a word as a string of characters let’s look at them as a number. Let’s take a three character word as an example for simplicity. Let’s think of the letters in this word in terms of numbers. Is there any way to assign a number to every letter of the alphabet such that if you do an operation (such as multiple, subtract, divide, etc) on all the letters you will arrive at a total that is the same for all anagrams and different for non anagrams? Well what do you know about prime numbers? A prime number is a number that only has one pair of multiples, correct? What about prime factorization? Let’s take the word “abc”. Let’s assign a=1, b=3, c=5. Multiples together is 15. Can you think of any other 3 letter word that would multiple to be 15 if you assign each letter to a sequential prime number. Well, acb, bac, bca, cba, and cab obviously. But is there any else? No? What did we just do there? If we wanted to tell if a word is an anagram, could we compare words that way?

    I think we can, but I also want to take another step back. Before we said an anagram has a few properties. It always has the same length, it always has the same letters, and it always has the same frequency of letters (if word 1 has 2 a and 3 b so must word 2). Well which one of those properties makes the others redundant? Does two words of the same length always have the same letters? Do two words with same letters always have the same frequency? No? But do two words that have the same frequency of every letter always have the same length and the same letters? Yes? So maybe we are determining anagrams we can compare the histogram of two words based on letter frequency


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: