Walking Randomly » Programming challenge: Which words have the most anagrams?

Programming challenge: Which words have the most anagrams?

September 18th, 2013 | Categories: Problem of the week, programming | Tags:

While on the train to work this morning, I wondered which English words have the most anagrams that are also valid English words. So, I knocked up few lines of Mathematica code and came up with 4 sets of 7:

{{"ates", "east", "eats", "etas", "sate", "seat", "teas"},
{"pares", "parse", "pears", "rapes", "reaps", "spare", "spear"}, 
{"capers", "crapes", "pacers", "parsec", "recaps", "scrape", "spacer"},
{"carets", "caster", "caters", "crates", "reacts", "recast", "traces"}}

So, according to my program (and Mathematica’s dictionary), the best you can do is 7. I’m not going to post the code until later because I don’t want to influence the challenge which is ‘Write a program in your language of choice that queries a dictionary to find which English words have the most anagrams that are also valid English words.’

Leave a comment | Trackback

Christian Perfect

September 18th, 2013 at 16:29

Reply | Quote | #1

Here are a couple of ways of doing it in Python: https://gist.github.com/christianp/6610845.

Because my wordlist has loads of not-really-words in it, I get:
[‘apers’, ‘apres’, ‘asper’, ‘pares’, ‘parse’, ‘pears’, ‘prase’, ‘presa’, ‘rapes’, ‘reaps’, ‘spare’, ‘spear’]
Thales Fernandes

September 18th, 2013 at 17:14

Reply | Quote | #2

You forgot {“least”, “slate”, “Stael”, “stale”, “steal”, “tales”, “teals”, “Tesla”}
Or names don’t count? If not we get your solution, but a 1 set of 8 words exist. :)

Code Bellow:

char = CharacterRange[“a”, “z”];
char = char ~ Join ~ ToUpperCase@char;
n = Length@char;
WordToNumber[str_String] := Position[char, #] & /@ StringSplit[str, “”] // Flatten // Sort // FromDigits[#, n] &
SetAttributes[WordToNumber, {Listable}];

words = ToLowerCase /@ DictionaryLookup[{“English”, All}] // DeleteDuplicates;
wordnumb = WordToNumber@words;

tally = Transpose@Tally@wordnumb;
count = Last@tally;
vals = First@tally;

pos = Extract[vals, Position[count, Max@count]];
wordPos = (Position[wordnumb, #] & /@ pos);
wordAnagram = Extract[words, #] & /@ wordPos;
wordAnagram /. str_String :> First@DictionaryLookup[str, IgnoreCase -> True]

* Bonus Function (Slow as hell for big words, 1st function try, LOL)
Anagram[str_String] := Select[StringJoin /@ Permutations@StringSplit[str, “”], Length@DictionaryLookup@# > 0 &]
Thales Fernandes

September 18th, 2013 at 17:30

Reply | Quote | #3

@Thales Fernandes
Improved version for non-english words. In Portuguese (my language) we have an anagram of 15 words!

{“aporeis”, “aproeis”, “aprosei”, “arpoeis”, “esporai”, “operais”, “opiares”, “poeiras”, “Poiares”, “posarei”, “proseai”, “proseia”, “raposei”, “repoisa”, “sopeira”}

{“arrastei”, “arrestai”, “arretais”, “arriaste”, “aterrais”, “atirares”, “iteraras”, “rasteira”, “rastreai”, “rastreia”, “ratareis”, “restaria”, “retraias”, “tarareis”, “traseira”}

{“arroteis”, “estoirar”, “ostreira”, “rasteiro”, “rastreio”, “reitoras”, “retroais”, “rotareis”, “soterrai”, “tesoirar”, “torareis”, “torreais”, “torreias”, “traseiro”, “troareis”}

Code:
Clear[words, lang, char, n, WordToNumber, wordnumb, tally, count, vals, pos, wordPos, wordAnagram, Anagram]

lang = “Portuguese”;

Anagram[str_String] := Select[StringJoin /@ Permutations@StringSplit[str, “”], Length@DictionaryLookup@# > 0 &]

words = ToLowerCase /@ DictionaryLookup[{lang, All}] // DeleteDuplicates;
Print[“Number of words to test (“, lang, “): “, Length@words];

char = Union@StringSplit[StringJoin@words, “”];
n = Length@char;
Print[“Character used(“,n,”): “, StringJoin@char];

WordToNumber[str_String] := Position[char, #] & /@ StringSplit[str, “”] // Flatten // Sort // FromDigits[#, n] &
SetAttributes[WordToNumber, {Listable}];

wordnumb = WordToNumber@words;

tally = Transpose@Tally@wordnumb;
count = Last@tally;
vals = First@tally;
Print[“Word with maximum number of anagrams: “, Max@count];

pos = Extract[vals, Position[count, Max@count]];
wordPos = (Position[wordnumb, #] & /@ pos);
wordAnagram = Extract[words, #] & /@ wordPos;
wordAnagram /. str_String :> First@DictionaryLookup[{lang, str}, IgnoreCase -> True]
Szabolcs

September 18th, 2013 at 18:59

Reply | Quote | #4

Mathematica, without much thinking:

SortBy[GatherBy[DictionaryLookup[“*”],
Sort@Characters@ToLowerCase[#] &], Length][[-10 ;;]]

It gives the 10 longest lists. Should be a good exercise for other languages, which I’ll try later.
Mike Croucher

September 18th, 2013 at 19:09

Reply | Quote | #5

OK, so its not quite a week but here’s mine. Much longer than @Szabolcs!

(*Given a word, this function finds all permutations that are also \
words*)
wordperms = Flatten[
Map[DictionaryLookup, Map[StringJoin, Permutations[Characters[#]]]]
] &;
words = DictionaryLookup[]; (*Get all words from the dictionary*)
sorted = Sort[
Map[Sort,
Map[Characters,
words]]]; (*Sort letters of each word into alphabetical order \
and then sort the resulting words*)
found = Part[Select[Tally@sorted, Part[#, 2] > 6 &], All,
1]; (*Select all lists that are repeated more than 6 times. More \
than 7 gives an empty list*)
Map[wordperms, found]

I missed {“least”, “slate”, “Stael”, “stale”, “steal”, “tales”, “teals”, “Tesla”} because I didn’t consider case.
Sander Huisman

September 18th, 2013 at 21:54

Reply | Quote | #6

words = {#, Sort[Characters[ToLowerCase[#]]]} & /@ DictionaryLookup[];
words = SortBy[GatherBy[words, Last], Length][[All, All, 1]];
words[[-20 ;;]] // Grid

Same idea as Szabolcs
Szabolcs

September 19th, 2013 at 17:17

Reply | Quote | #7

So far there are five Mathematica solutions and a single Python one. I would like to see solutions in other languages, in particular R. People on Unix based platforms can use /usr/share/dict/words as the database.
Sean Handley

September 20th, 2013 at 16:37

Reply | Quote | #8

organ has 9 anagrams: rogan, ronga, orang, nagor, groan, grano, goran, argon, angor
reset has 8 anagrams: reest, estre, ester, stree, stere, steer, terse, tsere
trace has 8 anagrams: react, recta, crate, creta, creat, cater, carte, caret
plate has 7 anagrams: pleat, palet, patel, pelta, petal, leapt, tepal

This is using Ruby with words of length 7 or smaller.

WORDS = File.readlines(‘/usr/share/dict/words’).select{|w| w.length < 7 }
puts "There are #{WORDS.count} WORDS in the dictionary."
WORDS_HASH = Hash[WORDS.map{|w| [w.strip,w.strip]}]

results = []

class Result
include Comparable

attr_reader :original_word, :valid_anagrams, :size

def initialize(original_word, valid_anagrams)
@original_word = original_word
@valid_anagrams = valid_anagrams
end

def other
self.size other.size
end

def size
valid_anagrams.count
end

end

n = 0

for word in WORDS
word.strip!
word.downcase!
word_chars = word.chars
word_chars_a = word_chars.to_a
permutations = word_chars_a.permutation(word.length)
permutations = permutations.map(&:join)
valid_anagrams = []
permutations.each do |perm|
valid_anagrams < 0
results <= 25
#break
end
permutations = nil
word = nil
valid_anagrams = nil
word_chars = nil
word_chars_a = nil
end

results.sort.reverse.take(30).each do |result|
puts “#{result.original_word} has #{result.size} anagrams: #{result.valid_anagrams.join(‘, ‘)}”
end
Sean Handley

September 20th, 2013 at 16:39

Reply | Quote | #9

Looks like some characters and formatting are lost. There is a gist here: https://gist.github.com/seanhandley/48d4e42c90b533f6d966

Still needs some work with regard to efficiency!
Globules

September 22nd, 2013 at 07:56

Reply | Quote | #10

Here’s a Haskell version. (I don’t consider case.)

import Control.Arrow
import Data.Function
import Data.List
import Data.Ord
import System.Environment
import qualified Data.Map as M
import qualified Data.Set as S

anagrams :: [String] -> [[String]]
anagrams = map (S.toList . snd) . M.toList
. M.fromListWith S.union
. map (sort &&& S.singleton)

rank :: [[a]] -> [[a]]
rank = sortBy (compare `on` (Down . length))

mostPopular :: [[a]] -> [[a]]
mostPopular = concat . take 1 . groupBy ((==) `on` length)

mostAnagrams :: [String] -> [[String]]
mostAnagrams = mostPopular . rank . anagrams

main :: IO ()
main = do
args readFile path >>= mapM_ print . mostAnagrams . words
_ -> error “Supply one argument giving the path to a dictionary.”

When run on MacOS I get:

$ ./anawords /usr/share/dict/words
[“caret”,”carte”,”cater”,”crate”,”creat”,”creta”,”react”,”recta”,”trace”]
[“angor”,”argon”,”goran”,”grano”,”groan”,”nagor”,”orang”,”organ”,”rogan”]
[“ester”,”estre”,”reest”,”reset”,”steer”,”stere”,”stree”,”terse”,”tsere”]
$
Claudio

September 23rd, 2013 at 06:22

Reply | Quote | #11

listIta = {Sort[Characters[#]], #} & /@ DictionaryLookup[{“Italian”, All}];
Reverse[Sort[Length /@ GatherBy[listIta, First]]][[1 ;; 20]]
Reverse[Sort[GatherBy[listIta, First]]][[1 ;; 3, All, 2]]

Result
{{“aperti”, “aprite”, “pareti”, “patire”, “patrie”, “perita”,
“pietra”, “rapite”, “ripeta”}, {“avresti”, “restavi”, “servita”,
“stivare”, “svitare”, “versati”, “viraste”, “vistare”}, {“parasti”,
“piastra”, “rapasti”, “raspati”, “satrapi”, “sparati”, “sparita”,
“spirata”}}
Arthur G

September 27th, 2013 at 15:04

Reply | Quote | #12

Here’s how I did it in MATLAB, which gave me the same 10 “apers”:

wordList = urlread(‘http://www-01.sil.org/linguistics/wordlists/english/wordlist/wordsEn.txt’);
words = regexp(wordList, ‘\s+’, ‘split’);
map = containers.Map(‘KeyType’, ‘char’, ‘ValueType’, ‘any’);
for i = 1:numel(words)
key = sort(lower(words{i}));
if isKey(map, key)
map(key) = [map(key) words(i)];
else
map(key) = words(i);
end
end
vals = values(map);
counts = cellfun(@numel, vals);
maxCount = max(counts);
vals{counts == maxCount}
David Carlisle

October 10th, 2013 at 17:11

Reply | Quote | #13

A bash one-liner (the line got a bit longer than I intended) It prints out any sets of anagrams with 10 or more words.

bash-4.2$ c=0; while read l; do echo `echo $l | fold -1 | sort ` @ $l;done < /usr/share/dict/words |sort | while read x; do oldk=$k;k=${x%%@*}; if [ "$k" = "$oldk" ] ; then c=$(($c + 1)); w="$w ${x#*@}"; else if [ $c -gt 9 ]; then echo $c $w; fi; c=1; w=${x#*@}; fi; done

14 anestri antsier asterin eranist nastier ratines resiant restain retains retinas retsina stainer starnie stearin
11 alerts alters artels estral laster lastre rastle ratels relast resalt salter
12 astel least salet setal slate stale steal stela taels tales teals tesla
10 aster astre rates reast resat serta stare strae tares tarse
10 estrous oestrus ousters sestuor sourest souters stoures toruses trousse tussore

so /usr/share/dict/words claims 14 (somewhat dubious:-) anagrams for nastier
Brendan Babb

November 19th, 2013 at 11:22

Reply | Quote | #14

This problem reminds me of playing Boggle: http://en.wikipedia.org/wiki/Boggle. Where if you can spell A R T from a triangle of letters, then you can spell R A T and T A R. I am not sure how you would do it, but could you modify the programs to look for a long word that would give you other smaller words in Boggle. I know it would really depend on the layout of the Boggle dice, and you would have less anagrams than without the Boggle constraint.

« Numpy, MATLAB and singular matrices Lamenting the lack of multiple assignment in MATLAB »

TOP