Merge pull request #7350 from drbaggy/master

Added write up - and have added some minor optimizations to shave a bit more time!
author: Mohammad S Anwar <Mohammad.Anwar@yahoo.com> 2023-01-05 20:03:27 +0000
committer: GitHub <noreply@github.com> 2023-01-05 20:03:27 +0000
commit: d59e50d399b7921f0b6e994bdb325fe5eb7873d1 (patch)
tree: 366cc3089776b5517754a2e9edc4bd9b5c55cf00 /challenge-198
parent: 5838bd862d01e88f5814daee604cf369a6d95629 (diff)
parent: ca0685c7e29a38ca0d2345f949b212874abd9e68 (diff)
download: perlweeklychallenge-club-d59e50d399b7921f0b6e994bdb325fe5eb7873d1.tar.gz
perlweeklychallenge-club-d59e50d399b7921f0b6e994bdb325fe5eb7873d1.tar.bz2
perlweeklychallenge-club-d59e50d399b7921f0b6e994bdb325fe5eb7873d1.zip
2 files changed, 124 insertions, 9 deletions
diff --git a/challenge-198/james-smith/README.md b/challenge-198/james-smith/README.md
index 9ba480f98e..0d51fd73b5 100644
--- a/challenge-198/james-smith/README.md
+++ b/challenge-198/james-smith/README.md
@@ -20,6 +20,14 @@ https://github.com/drbaggy/perlweeklychallenge-club/tree/master/challenge-198/ja
 ***You are given a list of integers, `@list`. Write a script to find the total pairs in the sorted list where 2 consecutive elements has the max gap. If the list contains less then 2 elements then return 0.***
 
 ## Solution
+
+I will present two solutions to this problem. They both start the same way - by first sorting the numbers.
+
+### Method 1 - `max_gap_sort`
+
+This computes the difference between each pair, and then sorts these into order (reverse)
+and counts the number of entries who have the same **max** value as the first element of the sorted list...
+
 ```perl
 sub max_gap_sort {
   return 0 unless $#_;
@@ -31,43 +39,140 @@ sub max_gap_sort {
 }
 ```
 
+### Method 2 - `max_gap_nosort`
+
+This computes the difference of each pair and if it is the highest value updates a counter...
+
+**Notes:** If the new value of the difference is greater than the current best value - we update the current best value and
+reset the count to 1. We do this in the for loop with a ternary `?:` and then using `&&` to replace another if or `?:` with
+an empty second value. Finally we update `$p`. We can separate variables etc within the for loop by replacing `;` for `,` in
+a lot of cases...
+
 ```perl
 sub max_gap_nosort {
   return 0 unless $#_;
   @_ = sort { $a <=> $b } @_;
   my($p,$b,$c)=(shift,0,0);
   $_-$p>$b ? ($b,$c)=($_-$p,1) : $_-$p==$b && $c++, $p=$_ for @_;
+  $c
+}
+```
+
+### Bonus Method 3 - `max_gap_nosort_faster`
+
+One issue with method 2 is that it has to store the sorted list back into @_ AND then loop through it - as we have to get the first value of `$p`...
+Now - what if we didn't need to! This would save that store....
+
+Well we can avoid that (in a way) but it is hacky... We set `$p` the be one more than first value in the input! This means that the first gap is always going to be negative so it will only occur once, which as we we aren't interested in the "size" of the gap only the number with the biggest gap we can will always get the right answer...
+
+```perl
+sub max_gap_nosort_faster {
+  return 0 unless $#_;
+  my($p,$t,$c) = ($_[0]+1,0,0);
+  $_-$p>$t ? ($t,$c)=($_-$p,1) : $_-$p==$t && $c++, $p=$_ for sort { $a<=>$b } @_;
   $c;
 }
 ```
 
+### Performance
+
+We can see the relative performance of each method {the timing is based on various lists including a list of 1,000,000 numbers}
+
+|               |    Rate  | sort | slide | nosort | nosort_faster |
+| :------------ | -------: | ---: | ----: | -----: | ------------: |
+| sort          |  0.772/s |   -- |  -31% |   -52% |          -56% |
+| slide         |   1.11/s |  44% |    -- |   -30% |          -37% |
+| nosort        |   1.59/s | 106% |   43% |     -- |           -9% |
+| nosort_faster |   1.75/s | 127% |   58% |    10% |            -- |
+
+The gain between `nosort` & `nosort_faster` is not "visible" in cases wheren the length of list is up to around `100,000` the overhead of the "hack" vs the overhead of storing the intermediate array....
+
+Using List::MoreUtils `slide` has a roughly 30% hit on performance - this again is the effect of having to store relatively large intermediate arrays.
+
 # Task 2 - Prime Count
 
 ***You are given an integer `$n > 0`. Write a script to print the count of primes less than `$n`.***
 
 ## Solution
 
+Now this is going back sometime since we have a had a prime generator question. We could use `Math::Prime::Util` to get the number of primes with `prime_count` but that sort of defeats the issue here...
+
+We will look at a couple of ways of doing this - the first which is "cute" but not particularly efficient, and then another which is very efficient {obv no where near the performance of the `Math::Prime::Util` version.
+
+### Method 1 - "compact"
+
+First we return `0` if `$n` is less than 3 {there are no primes less than 2}. We then initialize the array with the entry 2, and for each number - we look to see it it's prime... Our prime check is in the line:
+```perl
+  //,(grep{($'%$_)||next}@_),push@_,$_
+```
+which is quite compact:
+  * `//,` - match nothing - `$'` the post-match string is set to the string i.e `$_` from the outer loop.
+  * `(grep{$'%$_||next}@_)` - loop through the array of primes, if `$'%$_` is false we check the right-hand side of the or `||` and evaluate it - which skips out of the map and goes to the next number in the outer `for` loop....
+  * `,push@_,$_` if we don't hit the `next` we get to this point - the number is prime so push it onto the list....
+
+The full code is:
+
 ```perl
 sub n_primes_compact {
   return 0if(my$n=shift)<3;
   @_=2;
-  //,(grep{($'%$_)||next}@_),push@_,$_ for 3..$n-1;
+  //,(grep{$'%$_||next}@_),push@_,$_ for 3..$n-1;
   1*@_
 }
 ```
+If we switch `shift` for `pop` & `grep` for `map`.. This comes down to just 85 bytes (if we us a 1 letter function name)
+```perl
+sub p{return 0if(my$n=pop)<3;@_=2;//,(map{$'%$_||next}@_),push@_,$_ for 3..$n-1;1*@_}
+```
+
+### Method 2 - faster!
+
+The previous method is small - but not particularly fast.. There are a few things we can do to speed it up!
+
+ * We only need to try odd numbers (even numbers will not be included)
+ * the largest element of the prime array we need to check is the square root of the number we are testing!
+   * square roots are expensive!
+   * we keep a track of the smallest integer which is greater than or equal the square root of the number. Note we don't ever have to work out the square root. We can just increment the value `$q` by 1 each time `$i` is greater than it's square... the `last` in the tight `last`/`next` line....
+ * a number isn't prime if it has a prime factor - this is the `$i%$_?...:next O` which skips out of the inner `for` and jumps to the start of the next outer `for`.
+ * we don't include `2` in the array as we only check odd-numbers, but add `1` at the end to include it in the count... this gains us about `2%` over keeping it in the array.
+ 
+This give us:
 
 ```perl
-sub n_primes_fast { # for all tests 0.066 seconds
+sub n_primes_fast {
   return 0 if (my $n=shift) <3;
-  my @p = (my $q=2);
+  my($q,@p)=2;
   O: for( my $i=3; $i<$n; $i+=2 ) {
     $q++ if $i>$q*$q;
-    for(@p) {
-      next O unless $i%$_;
-      last if $_>$q;
-    }
+    $i%$_?($_>$q&&last):next O for @p;
     push @p, $i;
   }
-  scalar @p
+  1+@p
+}
+```
+
+There is a more compact version of the code:
+
+```perl
+sub n_primes_fastx {
+  return 0 if (my $n=shift) <3;
+  O: for( my($i,$q)=(3,2); $i<$n; $i+=2,($i>$q*$q)&&$q++ ) {
+    $i%$_?($_>$q&&last):next O for @_;
+    push @_, $i;
+  }
+  1+@_
 }
 ```
+which makes use of the the initalization and increment parts of the class **C-style** `for` to combine lines, and re-using `@_` as after the shift it will be empty...
+
+Times are comparable between `fast` and `fastx` - although I think `fast` is slightly faster than the more compact version.... (and is definitely easier to read)
+
+### Performance
+
+We can see the relative performance of each method {the largest test value we use is `$n=100,000`} so timings are based on that! (factor is approx 180:1) if we increase to `$n=1,000,000` the factor is approximately 630:1
+
+
+|               |    Rate  | compact |  fast |  
+| :------------ | -------: | ------: | ----: |
+| compact       |  0.110/s |      -- |  -99% |
+| fast          | 18.800/s | 16 979% |    -- |
diff --git a/challenge-198/james-smith/perl/ch-1.pl b/challenge-198/james-smith/perl/ch-1.pl
index 371a2c4d13..2f6c10f2a2 100644
--- a/challenge-198/james-smith/perl/ch-1.pl
+++ b/challenge-198/james-smith/perl/ch-1.pl
@@ -22,11 +22,13 @@ my @TESTS = (
 
 is( max_gap_sort(    @{$_->[0]} ), $_->[1] ) for @TESTS;
 is( max_gap_nosort(  @{$_->[0]} ), $_->[1] ) for @TESTS;
+is( max_gap_nosort_faster(  @{$_->[0]} ), $_->[1] ) for @TESTS;
 done_testing();
 
-cmpthese( -10, {
+cmpthese( -2, {
   'sort'   => sub { max_gap_sort(   @{$_->[0]} ) for @TESTS }, # 1700/s
   'nosort' => sub { max_gap_nosort( @{$_->[0]}) for @TESTS },  # 3535/s
+  'nosort_faster' => sub { max_gap_nosort_faster( @{$_->[0]}) for @TESTS },  # 3535/s
 } );
 
 sub max_gap_sort {
@@ -45,3 +47,11 @@ sub max_gap_nosort {
   $_-$p>$b ? ($b,$c)=($_-$p,1) : $_-$p==$b && $c++, $p=$_ for @_;
   $c;
 }
+
+sub max_gap_nosort_faster {
+  return 0 unless $#_;
+  my($p,$t,$c) = ($_[0]+1,0,0);
+  $_-$p>$t ? ($t,$c)=($_-$p,1) : $_-$p==$t && $c++, $p=$_ for sort { $a<=>$b } @_;
+  $c;
+}
+
author	Mohammad S Anwar <Mohammad.Anwar@yahoo.com>	2023-01-05 20:03:27 +0000
committer	GitHub <noreply@github.com>	2023-01-05 20:03:27 +0000
commit	d59e50d399b7921f0b6e994bdb325fe5eb7873d1 (patch)
tree	366cc3089776b5517754a2e9edc4bd9b5c55cf00 /challenge-198
parent	5838bd862d01e88f5814daee604cf369a6d95629 (diff)
parent	ca0685c7e29a38ca0d2345f949b212874abd9e68 (diff)
download	perlweeklychallenge-club-d59e50d399b7921f0b6e994bdb325fe5eb7873d1.tar.gz perlweeklychallenge-club-d59e50d399b7921f0b6e994bdb325fe5eb7873d1.tar.bz2 perlweeklychallenge-club-d59e50d399b7921f0b6e994bdb325fe5eb7873d1.zip