imported my solutions to this week's tasks, including a bonus (ch-2a.pl) which reads the relations from self-describing CSV files. All good clean fun:-)

author: dcw <d.white@imperial.ac.uk> 2021-10-03 23:52:32 +0100
committer: dcw <d.white@imperial.ac.uk> 2021-10-03 23:52:32 +0100
commit: 42519717810ae3f1605ee4d4a47bc5fe3738da85 (patch)
tree: ab99e23d9bd27165515d5faaa82422cd362ba759
parent: 0eacf73b0557a9b3ff654d14506744f6a84858ef (diff)
download: perlweeklychallenge-club-42519717810ae3f1605ee4d4a47bc5fe3738da85.tar.gz
perlweeklychallenge-club-42519717810ae3f1605ee4d4a47bc5fe3738da85.tar.bz2
perlweeklychallenge-club-42519717810ae3f1605ee4d4a47bc5fe3738da85.zip
6 files changed, 455 insertions, 37 deletions
diff --git a/challenge-132/duncan-c-white/README b/challenge-132/duncan-c-white/README
index 1df544175c..4918f2b94a 100644
--- a/challenge-132/duncan-c-white/README
+++ b/challenge-132/duncan-c-white/README
@@ -1,61 +1,106 @@
-Task 1: "Consecutive Arrays
+TASK #1 - Mirror Dates
 
-You are given a sorted list of unique positive integers.
+You are given a date (yyyy/mm/dd).
 
-Write a script to return list of arrays where the arrays are consecutive
-integers.
+Assuming, the given date is your date of birth. Write a script to find
+the mirror dates of the given date.
+
+Assuming today is 2021/09/22.
 
 Example 1:
 
-   Input:  (1, 2, 3, 6, 7, 8, 9)
-   Output: ([1, 2, 3], [6, 7, 8, 9])
+Input: 2021/09/18
+Output: 2021/09/14, 2021/09/26
+
+On the date you were born, someone who was your current age, would have
+been born on 2021/09/14.  Someone born today will be your current age
+on 2021/09/26.
 
 Example 2:
 
-   Input:  (11, 12, 14, 17, 18, 19)
-   Output: ([11, 12], [14], [17, 18, 19])
+Input: 1975/10/10
+Output: 1929/10/27, 2067/09/05
 
-Example 3:
+On the date you were born, someone who was your current age, would have
+been born on 1929/10/27.  Someone born today will be your current age
+on 2067/09/05.
 
-   Input:  (2, 4, 6, 8)
-   Output: ([2], [4], [6], [8])
+Example 3:
 
-Example 4:
+Input: 1967/02/14
+Output: 1912/07/08, 2076/04/30
 
-   Input:  (1, 2, 3, 4, 5)
-   Output: ([1, 2, 3, 4, 5])
-"
+On the date you were born, someone who was your current age, would have
+been born on 1912/07/08.  Someone born today will be your current age
+on 2076/04/30.
 
-My notes: easy, should be able to do this in 1-pass.
+MY NOTES: Sounds like a pretty easy date manipulation exercise:  dob - delta,
+today + delta where delta = today - date.  The hardest part is working out
+which Date manipulation module to use, as Perl has so many.
 
 
-Task 2: "Find Pairs
+TASK #2 - Hash Join
 
-You are given a string of delimiter pairs and a string to search.
+Write a script to implement Hash Join algorithm as suggested by wikipedia.
 
-Write a script to return two strings, the first with any characters
-matching the 'opening character' set, the second with any matching
-the 'closing character' set.
+1. For each tuple r in the build input R
+    1.1 Add r to the in-memory hash table
+    1.2 If the size of the hash table equals the maximum in-memory size:
+        1.2.1 Scan the probe input S, and add matching join tuples to the output relation
+        1.2.2 Reset the hash table, and continue scanning the build input R
+2. Do a final scan of the probe input S and add the resulting join tuples to the output relation
 
-Example 1:
+Example
 
 Input:
-    Delimiter pairs: ""[]()
-    Search String: "I like (parens) and the Apple ][+" they said.
 
-Output:
-    "(["
-    ")]"
-
-Example 2:
-
-Input:
-    Delimiter pairs: **//<>
-    Search String: /* This is a comment (in some languages) */ <could be a tag>
+    @player_ages = (
+        [20, "Alex"  ],
+        [28, "Joe"   ],
+        [38, "Mike"  ],
+        [18, "Alex"  ],
+        [25, "David" ],
+        [18, "Simon" ],
+    );
+
+    @player_names = (
+        ["Alex", "Stewart"],
+        ["Joe",  "Root"   ],
+        ["Mike", "Gatting"],
+        ["Joe",  "Blog"   ],
+        ["Alex", "Jones"  ],
+        ["Simon","Duane"  ],
+    );
 
 Output:
-    /**/<
-    /**/>
-"
 
-My notes: also pretty easy, if I've understood it right.  Also doable in 1-pass.
+    Based on index = 1 of @players_age and index = 0 of @players_name.
+
+    20, "Alex",  "Stewart"
+    20, "Alex",  "Jones"
+    18, "Alex",  "Stewart"
+    18, "Alex",  "Jones"
+    28, "Joe",   "Root"
+    28, "Joe",   "Blog"
+    38, "Mike",  "Gatting"
+    18, "Simon", "Duane"
+
+MY NOTES: Ok, I think I understand, but I'm going to ignore the
+whole "out of memory" part as that's too complicated.
+Also, I can't see what logical order the example output is ordered by,
+as far as I can see, the described algorithm leads to the order that I
+produce - not the order the above example output shows; so I'm going to
+ignore that too.  After all, in a relation, order doesn't matter, right?
+
+So for the example I build %name2ages containing:
+"Alex" => [20, 18],
+"Joe" => [28],
+"Mike" => [38],
+"David" => [25],
+"Simon" => [18].
+Then use %name2ages while iterating over @player_names.
+
+There's also the question of how to provide the relations,
+in ch-2.pl I hard-coding them as arrays of pairs as shown above,
+but see also ch-2a.pl which generalises them as files, read by Text::CSV
+and containing fieldnames in row 1.
diff --git a/challenge-132/duncan-c-white/perl/ages b/challenge-132/duncan-c-white/perl/ages
new file mode 100644
index 0000000000..149e6c7563
--- /dev/null
+++ b/challenge-132/duncan-c-white/perl/ages
@@ -0,0 +1,7 @@
+age,forename
+20,"Alex"
+28,"Joe"
+38,"Mike"
+18,"Alex"
+25,"David"
+18,"Simon"
diff --git a/challenge-132/duncan-c-white/perl/ch-1.pl b/challenge-132/duncan-c-white/perl/ch-1.pl
new file mode 100755
index 0000000000..c91f966cec
--- /dev/null
+++ b/challenge-132/duncan-c-white/perl/ch-1.pl
@@ -0,0 +1,70 @@
+#!/usr/bin/perl
+# 
+# TASK #1 - Mirror Dates
+# 
+# You are given a date (yyyy/mm/dd).
+# 
+# Assuming, the given date is your date of birth. Write a script to find
+# the mirror dates of the given date.
+# 
+# Assuming today is 2021/09/22.
+# 
+# Example 1:
+# 
+# Input: 2021/09/18
+# Output: 2021/09/14, 2021/09/26
+# 
+# On the date you were born, someone who was your current age, would have
+# been born on 2021/09/14.  Someone born today will be your current age
+# on 2021/09/26.
+# 
+# Example 2:
+# 
+# Input: 1975/10/10
+# Output: 1929/10/27, 2067/09/05
+# 
+# On the date you were born, someone who was your current age, would have
+# been born on 1929/10/27.  Someone born today will be your current age
+# on 2067/09/05.
+# 
+# Example 3:
+# 
+# Input: 1967/02/14
+# Output: 1912/07/08, 2076/04/30
+# 
+# On the date you were born, someone who was your current age, would have
+# been born on 1912/07/08.  Someone born today will be your current age
+# on 2076/04/30.
+# 
+# MY NOTES: Sounds like a pretty easy date manipulation exercise:  dob - delta,
+# today + delta where delta = today - date.  The hardest part is working out
+# which Date manipulation module to use, as Perl has so many.
+# 
+
+use strict;
+use warnings;
+use feature 'say';
+use Getopt::Long;
+#use Data::Dumper;
+use Date::Simple ('date', 'today');
+
+my $debug=0;
+die "Usage: mirror-dates YOUR_DOB\n" unless
+	GetOptions( "debug"=>\$debug ) && @ARGV==1;
+my $dobstr = shift @ARGV;
+$dobstr =~ s|/|-|g;		# Date::Simple likes YYYY-MM-DD not YYYY/MM/DD
+my $dob = date($dobstr) || die "bad date: $dobstr\n";
+
+my $today = today();
+$today = date("2021-09-22") if $debug;	# use today=exampletoday if debugging
+say "Using today = $today" if $debug;
+
+say "dob=$dob, today=$today" if $debug;
+
+my $delta = $today - $dob;
+say "delta=$delta" if $debug;
+
+my $before = $dob - $delta;
+my $after = $today + $delta;
+
+say "$before, $after";
diff --git a/challenge-132/duncan-c-white/perl/ch-2.pl b/challenge-132/duncan-c-white/perl/ch-2.pl
new file mode 100755
index 0000000000..e38e9bb34c
--- /dev/null
+++ b/challenge-132/duncan-c-white/perl/ch-2.pl
@@ -0,0 +1,139 @@
+#!/usr/bin/perl
+# 
+# TASK #2 - Hash Join
+# 
+# Write a script to implement Hash Join algorithm as suggested by wikipedia.
+# 
+# 1. For each tuple r in the build input R
+#     1.1 Add r to the in-memory hash table
+#     1.2 If the size of the hash table equals the maximum in-memory size:
+#         1.2.1 Scan the probe input S, and add matching join tuples to
+#	        the output relation
+#         1.2.2 Reset the hash table, and continue scanning the build input R
+# 2. Do a final scan of the probe input S and add the resulting join tuples
+#    to the output relation
+# 
+# Example
+# 
+# Input:
+# 
+#     @player_ages = (
+#         [20, "Alex"  ],
+#         [28, "Joe"   ],
+#         [38, "Mike"  ],
+#         [18, "Alex"  ],
+#         [25, "David" ],
+#         [18, "Simon" ],
+#     );
+# 
+#     @player_names = (
+#         ["Alex", "Stewart"],
+#         ["Joe",  "Root"   ],
+#         ["Mike", "Gatting"],
+#         ["Joe",  "Blog"   ],
+#         ["Alex", "Jones"  ],
+#         ["Simon","Duane"  ],
+#     );
+# 
+# Output:
+# 
+#     Based on index = 1 of @players_age and index = 0 of @players_name.
+# 
+#     20, "Alex",  "Stewart"
+#     20, "Alex",  "Jones"
+#     18, "Alex",  "Stewart"
+#     18, "Alex",  "Jones"
+#     28, "Joe",   "Root"
+#     28, "Joe",   "Blog"
+#     38, "Mike",  "Gatting"
+#     18, "Simon", "Duane"
+# 
+# MY NOTES: Ok, I think I understand, but I'm going to ignore the
+# whole "out of memory" part as that's too complicated.
+# Also, I can't see what logical order the example output is ordered by,
+# as far as I can see, the described algorithm leads to the order that I
+# produce - not the order the above example output shows; so I'm going to
+# ignore that too.  After all, in a relation, order doesn't matter, right?
+#
+# So for the example I build %name2ages containing:
+# "Alex" => [20, 18],
+# "Joe" => [28],
+# "Mike" => [38],
+# "David" => [25],
+# "Simon" => [18].
+# Then use %name2ages while iterating over @player_names.
+#
+# There's also the question of how to provide the relations,
+# (in this file) let's start hard-coding them as arrays of pairs as above,
+# but see also ch-2a.pl which generalises them as files.
+#
+
+use strict;
+use warnings;
+use feature 'say';
+use Function::Parameters;
+use Getopt::Long;
+use Data::Dumper;
+
+my $debug = 0;
+
+die "Usage: hash-join [-d|--debug]\n"
+	unless GetOptions( "debug"=>\$debug ) && @ARGV==0;
+
+my @player_ages = (
+	[20, "Alex"  ],
+	[28, "Joe"   ],
+	[38, "Mike"  ],
+	[18, "Alex"  ],
+	[25, "David" ],
+	[18, "Simon" ],
+);
+
+my @player_names = (
+	["Alex", "Stewart"],
+	["Joe",  "Root"   ],
+	["Mike", "Gatting"],
+	["Joe",  "Blog"   ],
+	["Alex", "Jones"  ],
+	["Simon","Duane"  ],
+);
+
+#
+# my @result = hashjoin( $relation1, $fieldno1, $relation2, $fieldno2 );
+#	Do the hash join of two relations $relation1 and $relation2 (each a
+#	reference to an array of pairs), on field no $fieldno1 in $relation1
+#	with $fieldno2 from $relation2.  Returns an array of pairs.
+#
+fun hashjoin( $relation1, $fieldno1, $relation2, $fieldno2 )
+{
+	my %hash;
+	foreach my $ref (@$relation1)
+	{
+		my @r = @$ref;
+		my $aref = ($hash{$r[$fieldno1]}//=[]);
+		push @$aref, $r[1-$fieldno1];
+	}
+	#die Dumper \%hash;
+
+	my @result;
+
+	foreach my $ref (@$relation2)
+	{
+		my @r = @$ref;
+		my $key = $r[$fieldno2];
+		my $other = $r[1-$fieldno2];
+		foreach my $val (@{$hash{$key}})
+		{
+			push @result, [ $val, $key, $other ];
+		}
+	}
+
+	return @result;
+}
+
+
+my @result = hashjoin( 
+	\@player_ages, 1,
+	\@player_names, 0,
+	);
+say join(', ',@$_) for @result;
diff --git a/challenge-132/duncan-c-white/perl/ch-2a.pl b/challenge-132/duncan-c-white/perl/ch-2a.pl
new file mode 100755
index 0000000000..50c09af444
--- /dev/null
+++ b/challenge-132/duncan-c-white/perl/ch-2a.pl
@@ -0,0 +1,150 @@
+#!/usr/bin/perl
+# 
+# TASK #2a - Hash Join (with relations read from files)
+# 
+# Write a script to implement Hash Join algorithm as suggested by wikipedia.
+# This version reads the relations from CSV-formatted files, which contain
+# the fieldnames in row 1.
+# 
+# 1. For each tuple r in the build input R
+#     1.1 Add r to the in-memory hash table
+# 2. Do a final scan of the probe input S and add the resulting join tuples
+#    to the output relation
+# 
+# Example
+# 
+# Input:
+# 
+#     ages:
+#	  age,forename
+#         20,"Alex"
+#         28,"Joe"
+#         38,"Mike"
+#         18,"Alex"
+#         25,"David"
+#         18,"Simon"
+# 
+#     names:
+#	  forename,surname
+#         "Alex","Stewart"
+#         "Joe","Root"
+#         "Mike","Gatting"
+#         "Joe","Blog"
+#         "Alex","Jones"
+#         "Simon","Duane"
+# 
+# Output:
+# 
+#     Join based on forename fields
+# 
+#     20, "Alex",  "Stewart"
+#     20, "Alex",  "Jones"
+#     18, "Alex",  "Stewart"
+#     18, "Alex",  "Jones"
+#     28, "Joe",   "Root"
+#     28, "Joe",   "Blog"
+#     38, "Mike",  "Gatting"
+#     18, "Simon", "Duane"
+# 
+# MY NOTES: Ok, I think I understand, but I'm going to ignore the
+# whole "out of memory" part as that's too complicated.
+# Also, I can't see what logical order the example output is ordered by,
+# as far as I can see, the described algorithm leads to the order that I
+# produce - not the order the above example output shows; so I'm going to
+# ignore that too.  After all, in a relation, order doesn't matter, right?
+#
+# So for the example I build %name2ages containing:
+# "Alex" => [20, 18],
+# "Joe" => [28],
+# "Mike" => [38],
+# "David" => [25],
+# "Simon" => [18].
+# Then use %name2ages while iterating over @player_names.
+#
+
+use strict;
+use warnings;
+use feature 'say';
+use Function::Parameters;
+use Getopt::Long;
+use List::Util qw(first);
+use Data::Dumper;
+use Text::CSV;
+
+my $debug = 0;
+
+die "Usage: hash-join [-d|--debug] relation1 fieldname1 relation2 [fieldname2]\n"
+	unless GetOptions( "debug"=>\$debug ) && (@ARGV==3 || @ARGV==4);
+my( $relname1, $field1, $relname2, $field2 ) = @ARGV;
+$field2 //= $field1;		# reuse same fieldname if fieldname2 omitted
+
+
+#
+# my @relation = read_relation( $relname );
+#	Read a relation from file $relname, and return the array of rows.
+#
+fun read_relation( $relname )
+{
+	my $csv = Text::CSV->new();
+	open( my $fh, '<', $relname ) || die "can't open $relname\n";
+	my @rows;
+	while( my $row = $csv->getline ($fh) )
+	{
+		#die Dumper \$row;
+		push @rows, $row;
+	}
+	close $fh;
+	return @rows;
+}
+
+
+#
+# my @result = hashjoin( $relation1, $fieldno1, $relation2, $fieldno2 );
+#	Do the hash join of two relations $relation1 and $relation2 (each a
+#	reference to an array of pairs), on field no $fieldno1 in $relation1
+#	with $fieldno2 from $relation2.  Returns an array of pairs.
+#
+fun hashjoin( $relation1, $fieldno1, $relation2, $fieldno2 )
+{
+	my %hash;
+	foreach my $ref (@$relation1)
+	{
+		my @r = @$ref;
+		my $aref = ($hash{$r[$fieldno1]}//=[]);
+		push @$aref, $r[1-$fieldno1];
+	}
+	#die Dumper \%hash;
+
+	my @result;
+
+	foreach my $ref (@$relation2)
+	{
+		my @r = @$ref;
+		my $key = $r[$fieldno2];
+		my $other = $r[1-$fieldno2];
+		foreach my $val (@{$hash{$key}})
+		{
+			push @result, [ $val, $key, $other ];
+		}
+	}
+
+	return @result;
+}
+
+
+my @player_ages = read_relation( $relname1 );
+#die Dumper \@player_ages;
+my @age_fields = @{shift @player_ages};
+my $fieldno1 = first { $age_fields[$_] eq $field1 } 0..$#age_fields;
+say "fieldno1($field1)=$fieldno1" if $debug;
+
+my @player_names = read_relation( $relname2 );
+my @name_fields = @{shift @player_names};
+my $fieldno2 = first { $name_fields[$_] eq $field2 } 0..$#name_fields;
+say "fieldno2($field2)=$fieldno2" if $debug;
+
+my @result = hashjoin( 
+	\@player_ages, $fieldno1,
+	\@player_names, $fieldno2,
+	);
+say join(', ',@$_) for @result;
diff --git a/challenge-132/duncan-c-white/perl/names b/challenge-132/duncan-c-white/perl/names
new file mode 100644
index 0000000000..2c2d3c98f4
--- /dev/null
+++ b/challenge-132/duncan-c-white/perl/names
@@ -0,0 +1,7 @@
+forename,surname
+"Alex","Stewart"
+"Joe","Root"
+"Mike","Gatting"
+"Joe","Blog"
+"Alex","Jones"
+"Simon","Duane"
author	dcw <d.white@imperial.ac.uk>	2021-10-03 23:52:32 +0100
committer	dcw <d.white@imperial.ac.uk>	2021-10-03 23:52:32 +0100
commit	42519717810ae3f1605ee4d4a47bc5fe3738da85 (patch)
tree	ab99e23d9bd27165515d5faaa82422cd362ba759
parent	0eacf73b0557a9b3ff654d14506744f6a84858ef (diff)
download	perlweeklychallenge-club-42519717810ae3f1605ee4d4a47bc5fe3738da85.tar.gz perlweeklychallenge-club-42519717810ae3f1605ee4d4a47bc5fe3738da85.tar.bz2 perlweeklychallenge-club-42519717810ae3f1605ee4d4a47bc5fe3738da85.zip