Merge pull request #99 from dcw803/master

Imported my solutions, plus modified README explaining how.
author: Mohammad S Anwar <Mohammad.Anwar@yahoo.com> 2019-04-28 11:11:13 +0100
committer: GitHub <noreply@github.com> 2019-04-28 11:11:13 +0100
commit: 3a58da7550c5cf5784e088859cf51ae569080b5e (patch)
tree: 3f741a6c6100ebf2f2195de7ea43d27968d8b567 /challenge-005
parent: 8db62ce29b1cb243a8c35363338bd0fff706cce3 (diff)
parent: 6320d00f9bcc63af64e6b35807b3f220fc450f4e (diff)
download: perlweeklychallenge-club-3a58da7550c5cf5784e088859cf51ae569080b5e.tar.gz
perlweeklychallenge-club-3a58da7550c5cf5784e088859cf51ae569080b5e.tar.bz2
perlweeklychallenge-club-3a58da7550c5cf5784e088859cf51ae569080b5e.zip
3 files changed, 142 insertions, 24 deletions
diff --git a/challenge-005/duncan-c-white/README b/challenge-005/duncan-c-white/README
index c5628c24dd..b785f171b8 100644
--- a/challenge-005/duncan-c-white/README
+++ b/challenge-005/duncan-c-white/README
@@ -1,24 +1,35 @@
-Challenge 1: "Write a script to output the same number of PI digits
-as the size of your script. Say, if your script size is 10, it should
-print 3.141592653."
-
-Note that it DOESN'T SAY "calculate the same number of PI digits..." so
-I took the liberty to grab them over the internet (via a bitly link I set
-up to shorten the URL and hence the program) rather than generate them
-on the fly, because generating Pi is so dull.
-
-Note that ch-1.pl takes an optional single command line argument to tell
-how many digits to print, if absent the default is to use the size of
-the script as the question wanted.  I built two cut down versions of the
-script, but didn't include them here.
-
-
-Challenge 2: "You are given a file containing a list of words (case
-insensitive 1 word per line) and a list of letters. Print each word from
-the file than can be made using only letters from the list. You can use
-each letter only once (though there can be duplicates and you can use
-each of them once), you don't have to use all the letters."
-
-This is a natural "bag of words" question, essentially we need to build
-a "bag subset" operation.  See ch-2.pl for the solution, simple and obvious
-(I love Perl's hashes especially for their idiomatic set and bag uses).
+Challenge 1: "Write a program which prints out all anagrams for a given word."
+
+Note: I assume that we'll need a wordlist to judge whether a rearrangement
+of letters is actually a word, /usr/share/dict/words is the obvious one
+to use.  The simplest way of solving this problem is to use alphabetically
+sorted SIGNATURES of words - simply the bag of letters in the word sorted.
+
+So "hello"'s signature is "ehllo".  The important thing for anagram purposes
+is that "ehllo" is the signature not only of "hello", but of any other anagram
+of hello too.
+
+So: calculate the signature of the given word, then for every word in the
+dictionary (of the right length if we want to save time), calculate that
+dictionary word's signature - then print out the dictionary word if it's
+signature is the sameas the given word's signature.
+
+Note that ch-1.pl takes two command line arguments: the first is the
+word, the second is the optional filename of the wordlist to use,
+if it's not given then /usr/share/dict/words is used by default.
+
+
+Challenge 2: "Write a program to find the sequence of characters that has
+the most anagrams."
+
+That's rather badly worded, but I choose to interprete it as "find which
+word in a wordlist has most anagrams also in that wordlist".
+
+Note that ch-2.pl takes a single optional command line argument: 
+the filename of the wordlist to use, if it's not given then
+/usr/share/dict/words is used by default.
+
+So: Calculate the signature of every word in the dictionary, building %anag:
+a mapping from signature -> list of words with that signature, and keep track
+of the longest word list of any signature (i.e. the biggest anagram set so far)
+as we go.  Print out the signature with the longest word list at the end.
diff --git a/challenge-005/duncan-c-white/perl5/ch-1.pl b/challenge-005/duncan-c-white/perl5/ch-1.pl
new file mode 100755
index 0000000000..6099e5b6e3
--- /dev/null
+++ b/challenge-005/duncan-c-white/perl5/ch-1.pl
@@ -0,0 +1,45 @@
+#!/usr/bin/perl
+
+# Print all anagrams for a given word, using a wordlist, using the concept of
+# alphabetically sorted SIGNATURES of words - simply the bag of letters in the
+# word sorted.  So "hello"'s signature is "ehllo".
+# The important thing for anagram purposes is that "ehllo" is the signature
+# of all anagrams of hello too.
+# 
+# So: calculate the signature of the given word, then for every word in the
+# dictionary (of the right length if we want to save time), calculate that
+# dictionary word's signature - then print out the dictionary word if it's
+# signature is the sameas the given word's signature.
+
+use strict;
+use warnings;
+use Data::Dumper;
+
+die "Usage: ch-1.pl word [word list filename, default /u/s/d/words\n"
+	if @ARGV == 0 || @ARGV > 2;
+
+my $word = shift;
+my $wordfile = shift // "/usr/share/dict/words";
+
+#
+# my $sig = makesig( $word );
+#	Build and return the SIGNATURE of the given word,
+#	as described at the top of this program.  Easy to
+#	do in Perl, I love "join '' sort split //"
+#
+sub makesig
+{
+	my( $word ) = @_;
+	return join( '', sort split(//,$word) );
+}
+
+my $goalsig = makesig( lc($word) );
+
+open(my $infh, '<', $wordfile) || die "can't open $wordfile\n";
+while( <$infh> )
+{
+	chomp;
+	my $sig = makesig( lc($_) );
+	print "$_\n" if $sig eq $goalsig;
+}
+close( $infh );
diff --git a/challenge-005/duncan-c-white/perl5/ch-2.pl b/challenge-005/duncan-c-white/perl5/ch-2.pl
new file mode 100755
index 0000000000..785d468d38
--- /dev/null
+++ b/challenge-005/duncan-c-white/perl5/ch-2.pl
@@ -0,0 +1,62 @@
+#!/usr/bin/perl
+
+# Find which word (from a wordlist) has the most anagrams.  Print out that
+# word and all it's anagrams.
+# 
+# Calculate the signature of every word in the dictionary, building %anag:
+# signature -> list of words with that signature, keeping track of the
+# longest list (i.e. the biggest anagram set so far) as we go.
+
+use strict;
+use warnings;
+use Data::Dumper;
+
+# "Usage: ch-2.pl [word list filename, default /u/s/d/words\n"
+
+my $wordfile = shift // "/usr/share/dict/words";
+
+#
+# my $sig = makesig( $word );
+#	Build and return the SIGNATURE of the given word,
+#	as described at the top of this program.  Easy to
+#	do in Perl, I love "join '' sort split //"
+#
+sub makesig
+{
+	my( $word ) = @_;
+	return join( '', sort split(//,$word) );
+}
+
+# build %anag: signature -> list of words with that signature
+# and keep track of the signature with the longest list of words
+my %anag;
+my $longestn   = 0;
+my $longestsig = "";
+open(my $infh, '<', $wordfile) || die "can't open $wordfile\n";
+while( <$infh> )
+{
+	chomp;
+	my $sig = makesig( lc($_) );
+	my $aref = ($anag{$sig} //= []);
+	push @$aref, $_;
+
+	# keep track of longest(n,sig) as we go
+	my $len = @$aref;
+	if( $len > $longestn )
+	{
+		$longestn   = $len;
+		$longestsig = $sig;
+	}
+}
+close( $infh );
+
+die "AARGH..\n" if $longestn==0;		# empty wordlist?
+
+#die Dumper \%anag;
+
+# ok, print answer - including the anagrams of the longest set
+print "signature with most anagrams ($longestn): $longestsig\n";
+
+my @anag = @{$anag{$longestsig}};
+
+print "those anagrams are: ".join(',',@anag)."\n";
author	Mohammad S Anwar <Mohammad.Anwar@yahoo.com>	2019-04-28 11:11:13 +0100
committer	GitHub <noreply@github.com>	2019-04-28 11:11:13 +0100
commit	3a58da7550c5cf5784e088859cf51ae569080b5e (patch)
tree	3f741a6c6100ebf2f2195de7ea43d27968d8b567 /challenge-005
parent	8db62ce29b1cb243a8c35363338bd0fff706cce3 (diff)
parent	6320d00f9bcc63af64e6b35807b3f220fc450f4e (diff)
download	perlweeklychallenge-club-3a58da7550c5cf5784e088859cf51ae569080b5e.tar.gz perlweeklychallenge-club-3a58da7550c5cf5784e088859cf51ae569080b5e.tar.bz2 perlweeklychallenge-club-3a58da7550c5cf5784e088859cf51ae569080b5e.zip