diff options
| -rw-r--r-- | challenge-128/james-smith/README.md | 153 | ||||
| -rw-r--r-- | challenge-128/james-smith/blog.txt | 1 | ||||
| -rw-r--r-- | challenge-128/james-smith/perl/ch-1.pl | 62 | ||||
| -rw-r--r-- | challenge-128/james-smith/perl/ch-2.pl | 66 |
4 files changed, 241 insertions, 41 deletions
diff --git a/challenge-128/james-smith/README.md b/challenge-128/james-smith/README.md index 10adb6a346..be4fb1ed86 100644 --- a/challenge-128/james-smith/README.md +++ b/challenge-128/james-smith/README.md @@ -1,4 +1,4 @@ -# Perl Weekly Challenge #127 +# Perl Weekly Challenge #128 You can find more information about this weeks, and previous weeks challenges at: @@ -10,74 +10,145 @@ submit solutions in whichever language you feel comfortable with. You can find the solutions here on github at: -https://github.com/drbaggy/perlweeklychallenge-club/tree/master/challenge-127/james-smith/perl +https://github.com/drbaggy/perlweeklychallenge-club/tree/master/challenge-128/james-smith/perl -# Task 1 - Disjoint Sets +# Task 1 - Maximum Sub-Matrix -***You are given two sets with unique integers. Write a script to figure out if they are disjoint.*** +***You are given m x n binary matrix having 0 or 1 in each cell. Write a script to find out maximum sub-matrix having only 0*** + +## Ambiguity + +There may be multiple solutions (e.g. in Example 1 depending on how you write the algorithm) so rather than returning the matrix - my tests will be on the area of the matrix ## The solution -This is a rewording of simple perl, which is to see if there any elements of set 1 in set 2. To avoid nested loops, we use the efficiency of perl's hash key search to search for members. So for set 1 we create a hash who's keys are the members of set 1. We then search through set 2 to see which members are keys in this hash (and so in both sets). If there are any we return 0 o/w return 1. +Initialy this looks like an `O(n^4)` problem - you would need to scan the area to the right and below `O(n^2)` for a given cell `O(n^2)`. But with a bit of preprocessing - we can remove at least one of these loops - so the challenge becomes `O(n^3)`. + +### Preprocessing the matrix. + +To remove the inner loop - we can pre-compute this - the number of 0s in a continuouse line starting at the point and going right. For the first matrix we get + +``` + [ 1 0 0 0 1 0 ] [ 0 3 2 1 0 1 ] + [ 1 1 0 0 0 1 ] -> [ 0 0 3 2 1 0 ] + [ 1 0 0 0 0 0 ] [ 0 5 4 3 2 1 ] +``` + +The code that does that is this.... ```perl -sub disjoint_sets { - my %m = map { $_=>1 } @{$_[0]}; - return grep( { $m{$_} } @{$_[1]}) ? 0 : 1; -} + ## Last column 1s become 0s, 0s become 1s + my @runs = map { [1 - $_->[-1]] } @rows; + + ## Remaining columns we are working backwards along the rows + ## Column is 0 if the matrix contains a 1 - o/w it is 1 more + ## than the cell to the right (which is the first cell in the row) + ## we use unshift to extend each row left... + foreach my $i (reverse 0..$w-1) { + unshift @{$runs[$_]}, $rows[$_][$i] ? 0 : $runs[$_][0]+1 foreach 0..$h; + } ``` -# Task 2 - Conflict Intervals +We then have the `O(n^3)` to find the maximum area. -***You are given a list of intervals. Write a script to find out if the current interval conflicts with any of the previous intervals.*** +For each cell we work out the maximum area of any rectangle. -## Background +For the first row it is just `$run[$y][$x]`. For subsequent rows it is the minimum of all `$run[$y][$x]` we have seen times height -This is going back to my day job again - but I thought I would add a bit of background this time. I'm not a geneticist, but a web developer working at an institution that uses DNA sequencing to use genomics to investigate genetics. My first project was to develop/lead the developers for a genome browser. Often we had to display information about genomic features and whether or not they overlapped a particular region (to know whether to display them or not) or to bump them for display (to make sure features didn't merge/overlap)... +```perl + my $max_w = 1e9; + foreach my $j ( $y .. $h ) { + last unless $runs[$j][$x]; ## We have a 1 in the rectangle quit + $max_w = $runs[$j][$x] if $runs[$j][$x] < $max_w; + my $area = $max_w * ( $j - $y + 1 ); + $max_area = [ $area, $max_w, $j - $y + 1 ] if $area > $max_area->[0]; + } +``` -## Solution +The variable `$max_area` contains three values `$max_area`, `$max_w`, `$max_h` - the latter two if you wish to draw the empty matrix..... +Put it all together we have... +```perl +sub find_empty { + my @runs = map { [ 1 - $_->[-1] ] } my @rows = @{$_[0]}; + my ( $h, $w ) = ( @rows - 1, @{$rows[0]} - 1 ); -There are six arrangements more any two regions..... See picture below... + foreach my $i ( reverse 0 .. $w - 1 ) { + unshift @{$runs[$_]}, $rows[$_][$i] ? 0 : $runs[$_][0] + 1 foreach 0 .. $h; + } + my $max_area = [ 0, 0 , 0 ]; + foreach my $x ( 0 .. $w ) { + foreach my $y ( 0 .. $h ) { + next unless $runs[$y][$x]; + my $max_w = 1e9; + foreach my $j ( $y .. $h ) { + last unless $runs[$j][$x]; + $max_w = $runs[$j][$x] if $runs[$j][$x] < $max_w; + my $area = $max_w * ( $j - $y + 1 ); + $max_area = [ $area, $max_w, $j - $y + 1 ] if $area > $max_area->[0]; + } + } + } + + return $max_area; +} ``` - [============] -[------] [++++++] [------] - [++++++++++++++++++++] - [++++++] [++++++] -``` -Over the 6 we have two of regions which are disjoint from the top region: Where `region_2_start > region_1_end` or `region_1_start > region_2_end`... -Our loop finally becomes.... +# Task 2 - Minimum Platforms + +***You are given two arrays of arrival and departure times of trains at a railway station. Write a script to find out the minimum number of platforms needed so that no train needs to wait.*** + +## Background + +As mentioned this is effectively my day job again. To display information about genomic features and whether or not they overlapped a particular region (to know whether to display them or not) or to bump them for display (to make sure features didn't merge/overlap) which is exactly this problem. This one is in someways easier as we have the trains already sorted into date order. If we didn't sorting them would make life a lot easier! - not so easy on a genome browser where there may be 10s of thousands of features in a region. + +This is actually more my brother's line of work - he works for what was BR computing - and one of his jobs is just this. For a while BR had 66 minutes in an hour to allow them to get trains in and out of one of the large busy stations. + +## Solution + +As we are assuming that starts are in time order we don't have to do a 2-sided test for overlaps. + +So foreach train we loop through all platforms - seeing if there is a platform with a slot in it (i.e. the last train has already left the platform). If there isn't we make a new platform and add the train to it, and repeat. All we do is store the last departure time for each platform, and because 24-hr date strings alphabetic/time order are the same - we only need to use the string comparison operator (in this case `gt`) to compare times. + +Here we use a little used concept in perl the label "`OUTER`" - this allows our inner loop to break out of it's own `foreach` loop and also jump to the next interation of it's parent loop. ```perl -sub conflict_intervals { - my @in = @{ $_[0] }; - my @conf; - while( my $int = pop @in ) { - foreach(@in) { - next unless $int->[1] < $_->[0] || $int->[0] < $_->[1]; - unshift @conf, $int; - last; +sub bump_platform { + my @arr = @{shift @_}; + my @dep = @{shift @_}; + my @platforms = (); + OUTER: foreach my $st (@arr) { + foreach(0..$#platforms) { + ($platforms[$_] = shift @dep) && (next OUTER) + if $st gt $platforms[$_]; } + push @platforms,shift @dep; } - return \@conf; + return scalar @platforms; } ``` -**Notes:** - # Note we work from the end backwards - this is easier as we just pop the last interval off the loop each time and compare it with previous ones. - # We then use unshift to add it to the start of what we return. - -We can compact this slightly using the `&&` trick in the foor loop to do away with the `next`/`last` combination: +**Notes: +We can also keep information about which trains are on by re-writing what is stored in `@plat` rather than storing the last departure time - we can store the arrival/departure time of trains on each platform... + ```perl -sub conflict_intervals_compact { - my(@i,@o) = @{$_[0]}; - while(my $r = pop @i) { - ($r->[1]<$_->[0] || $r->[0]<$_->[1]) && (unshift @o, $r) && last foreach @i; +sub bump_platform_keep_trains { + my @arr = @{shift @_}; + my @dep = @{shift @_}; + my($train_no, @platforms) = (0); + + OUTER: foreach my $st (@arr) { + foreach(@platforms) { + (push @{$_}, [ $st, (shift @dep), ++$train_no ]) && + (next OUTER) if $st gt $_->[-1][1]; + } + push @platforms, [ [ $st, (shift @dep), ++$train_no ] ]; } - return \@o; + say ' ', join ' ', map { "Train $_->[2]: $_->[0]-$_->[1]" } @{$_} + foreach @platforms; + return scalar @platforms; } ``` diff --git a/challenge-128/james-smith/blog.txt b/challenge-128/james-smith/blog.txt new file mode 100644 index 0000000000..b84ed0fc12 --- /dev/null +++ b/challenge-128/james-smith/blog.txt @@ -0,0 +1 @@ +https://github.com/drbaggy/perlweeklychallenge-club/tree/master/challenge-128/james-smith diff --git a/challenge-128/james-smith/perl/ch-1.pl b/challenge-128/james-smith/perl/ch-1.pl new file mode 100644 index 0000000000..6db18f8378 --- /dev/null +++ b/challenge-128/james-smith/perl/ch-1.pl @@ -0,0 +1,62 @@ +#!/usr/local/bin/perl + +use strict; + +use warnings; +use feature qw(say); +use Test::More; +use Benchmark qw(cmpthese timethis); +use Data::Dumper qw(Dumper); + +my @arr1 = ( + [qw( 1 0 0 0 1 0 )], + [qw( 1 1 0 0 0 1 )], + [qw( 1 0 0 0 0 0 )], +); +my @arr2 = ( + [qw( 0 0 1 1 )], + [qw( 0 0 0 1 )], + [qw( 0 0 1 0 )], +); +my @TESTS = ( [ \@arr1, '6 2 3' ], [ \@arr2, '6 2 3' ],); + +is( "@{ find_empty( $_->[0]) }", $_->[1] ) foreach @TESTS; + +done_testing(); + +sub find_empty { + my @runs = map { [1 - $_->[-1]] } my @rows = @{$_[0]}; + my ($h,$w) = ( @rows-1, @{$rows[0]}-1 ); + + ## First pass through the array - we calculate how many + ## 0s are in the cell and to the right... So for example 1 we get + ## 0 3 2 1 0 1 + ## 0 0 3 2 1 0 + ## 0 5 4 3 2 1 + ## This is O(n^2) + foreach my $i (reverse 0..$w-1) { + unshift @{$runs[$_]}, $rows[$_][$i] ? 0 : $runs[$_][0]+1 foreach 0..$h; + } + ## Now we have to loop over all squares and check rectangles starting + ## at the square and going down and to the right... + ## This is now an O(n^3) operation reduced from the O(n^4) operation + ## by working with run lengths... + ## Effectively the O(n^2) operation above removes the inner loop of + ## scanning right for 0s... + + my $max_area = [0,0,0]; + foreach my $x ( 0..$w ) { + foreach my $y ( 0..$h ) { + next unless $runs[$y][$x]; ## Short cut answer will be 0 + my $max_w = 1e9; + foreach my $j ( $y..$h ) { + last unless $runs[$j][$x]; ## Short cut all subsequent answers are 0 + $max_w = $runs[$j][$x] if $runs[$j][$x] < $max_w; + my $area = $max_w * ($j-$y+1); + $max_area = [$area,$max_w,$j-$y+1] if $area>$max_area->[0]; + } + } + } + return $max_area; +} + diff --git a/challenge-128/james-smith/perl/ch-2.pl b/challenge-128/james-smith/perl/ch-2.pl new file mode 100644 index 0000000000..34a31da1f3 --- /dev/null +++ b/challenge-128/james-smith/perl/ch-2.pl @@ -0,0 +1,66 @@ +#!/usr/local/bin/perl + +use strict; + +use warnings; +use feature qw(say); +use Test::More; +use Benchmark qw(cmpthese timethis); +use Data::Dumper qw(Dumper); + +my @TESTS = ( + [ [qw(11:20 14:30)], + [qw(11:50 15:00)], + 1 ], + [ [qw(10:20 11:00 11:10 12:20 16:20 19:00)], + [qw(10:30 13:20 12:40 12:50 20:20 21:20)], + 3 ] +); + +is( bump_platform( $_->[0], $_->[1] ), $_->[2] ) foreach @TESTS; +is( bump_platform_keep_trains( $_->[0], $_->[1] ), $_->[2] ) foreach @TESTS; + +done_testing(); + +sub bump_platform { + my @arr = @{shift @_}; + my @dep = @{shift @_}; + my @platforms = (); + OUTER: foreach my $st (@arr) { + foreach(0..$#platforms) { + ## If train fits on platform - we extend the last departure time + ## and start working with new train + ($platforms[$_] = shift @dep) && (next OUTER) if $st gt $platforms[$_]; + } + ## Otherwise we start a new platform... + push @platforms,shift @dep; + } + return scalar @platforms; +} + +sub bump_platform_keep_trains { + my @arr = @{shift @_}; + my @dep = @{shift @_}; + my($train_no, @platforms) = (0); + + OUTER: foreach my $st (@arr) { + foreach(@platforms) { + ## If we can fit on this platform - add train details + ## and go to the next train + (push @{$_}, [ $st, (shift @dep), ++$train_no ]) && + (next OUTER) if $st gt $_->[-1][1]; + } + ## No room on any of the existing platforms - so we create + ## a new one and push the train + push @platforms, [ [ $st, (shift @dep), ++$train_no ] ]; + } + ## Display details about trains on platform. + say ' ', + join ' ', + map { "Train $_->[2]: $_->[0]-$_->[1]" } + @{$_} foreach @platforms; + ## Return the number of platforms + return scalar @platforms; +} + + |
