Update README.md

author: James Smith <js5@sanger.ac.uk> 2021-10-07 11:29:42 +0100
committer: James Smith <js5@sanger.ac.uk> 2021-10-07 11:29:42 +0100
commit: ddd032709998a36ca06d4a82b3e4e8ced43a5485 (patch)
tree: 4472e0b37f5bbbd35fa584c868832ca8acc4e294
parent: 12438a4961abbf613dff8d2fe15552ee1e54e632 (diff)
download: perlweeklychallenge-club-ddd032709998a36ca06d4a82b3e4e8ced43a5485.tar.gz
perlweeklychallenge-club-ddd032709998a36ca06d4a82b3e4e8ced43a5485.tar.bz2
perlweeklychallenge-club-ddd032709998a36ca06d4a82b3e4e8ced43a5485.zip
1 files changed, 115 insertions, 3 deletions
diff --git a/challenge-133/james-smith/README.md b/challenge-133/james-smith/README.md
index e467044ffc..c50473185e 100644
--- a/challenge-133/james-smith/README.md
+++ b/challenge-133/james-smith/README.md
@@ -58,18 +58,130 @@ And the support functions:
    We cache the digit sum for each number in `%ds`;
  * `prime_factors` - for a prime returns nothing, for a composite returns the factors,
    We keep a list of primes `@primes`, and also the prime factors for each composite `%comp`.
-   
+   **Note:** We don't need to continue with primes bigger than `sqrt($N)` so short cut the loop.
+
 ```perl
 sub sum        { my $t = 0; $t+=$_ foreach @_; $t;          }
+
 sub sum_digits { return $ds{$_[0]} ||= sum split //, $_[0]; }
 
 sub prime_factors {
   my $N = shift;
-  ( $N % $_) or ( return @{ $comp{$N} = [ $_, @{$comp{ $N / $_ }||[$N/$_]}] } ) foreach @primes;
+  ( $N % $_) ? ( ( $N < $_ * $_ ) && last )
+             : ( return $sum_pf[$N] = $sum_pf[$N/$_] + $sum_pf[$_] )  foreach @primes;
+  $sum_pf[$N] = cl_sum_digits $N;
   push @primes, $N;
-  return;
+  return 0;
 }
+```
+
+## Performance
+
+A slight modification to the code ( to compute Smith Numbers < N ) computes the 29,928 Smith numbers less than 1 million in about 3.3 seconds, and for 10,000,000 about 45 seconds. For large N the script falls over with out-of-memory errors.
+
+## Re-write in C
+
+The algorithm does lend itself to rewriting in a lower level language - it doesn't use any nice features of Perl except for the "auto-sizing" of arrays - no hashes, complex IO etc.
+
+So here is my first C programme for probably 30 years!
+
+### Timings for C vs perl
+
+|     Max value | Count      | Time C  | Time perl |
+| ------------: | ---------: | ------: | --------: |
+|           100 |          6 |   0.00s |     0.01s |
+|         1,000 |         49 |   0.00s |     0.01s |
+|        10,000 |        376 |   0.00s |     0.03s |
+|       100,000 |      3,924 |   0.01s |     0.26s |
+|     1,000,000 |     29,928 |   0.07s |     3.25s |
+|    10,000,000 |    278,411 |   1.18s |    45.38s |
+|   100,000,000 |  2,632,758 |  23.65s |    *fail* |
+| 1,000,000,000 | 25,154,060 | 514.17s |    *fail* |
+
+This is the limit for the algorithm (would need 20+ Gbytes for the 10 billion case)
+
+### The C code
 
+```c
+#include <stdio.h>
 
+// Compute all Smith Numbers below MAX_N
 
+// Use guess that PSIZE should approximately 1.1 * PFSIZE / ln(PFSIZE)
+
+// 10^6
+   #define MAX_N   1000000
+   #define PSIZE   42000        // Have to guess this!
+// 10^7
+// #define MAX_N   10000000
+// #define PSIZE   400000       // Have to guess this!
+// 10^8
+// #define MAX_N   100000000
+// #define PSIZE   3200000      // Have to guess this!
+// 10^9
+// #define MAX_N   1000000000
+// #define PSIZE   28000000     // Have to guess this!
+
+#define PFSIZE (MAX_N/2)
+
+// Set up arrays
+int sum_pf[ PFSIZE+1 ], primes[ PSIZE ], prime_index = 0;
+
+// Compute sum of digits - unlike Perl we can't use split
+// so we use repeated modulus/divide by 10..
+
+int sum_digits(int n) {
+  int total = 0;
+  do { total += n%10; } while( n /= 10 );
+  return total;
+}
+
+// Get the sum of prime factors -
+// as we build this in order we only need to find a
+// factorisation then we just add together the
+// digit sum of the two factors (Here for speed we
+// know one will be prime.
+// We go through all primes we have until prime^2
+// is greater than the number itself.
+//
+// To make the last bit easier IF we have a prime
+// we return 0 as not composite...
+//
+// Note to save memory we only store the sum if
+// n < MAX_N/2 as we won't need it again (can't
+// be a factor of a larger number less than MAX_N
+
+int sum_prime_factors( int n ) {
+  int p;
+  for(int i=0; i< prime_index; i++ ) {
+    p = primes[i];
+    if( !(n % p) ) {
+      if( n <= PFSIZE ) {
+        return sum_pf[n] = sum_pf[n/p] + sum_pf[p];
+      } else {
+        return sum_pf[ n/p ] + sum_pf[ p ];
+      }
+    }
+    if( n < p*p  ) { break; }
+  }
+  if( n <= PFSIZE ) {
+    sum_pf[ n             ] = sum_digits(n);
+    primes[ prime_index++ ] = n;
+  }
+  return 0;
+}
+
+// Main is simple just loop and search, printing out
+// Smith numbers
+int main() {
+  int count = 0, n = 1;
+  while( n++ <= MAX_N ) {
+    if( sum_digits(n) == sum_prime_factors(n) ) {
+      printf( "%11d\n", n );
+    }
+  }
+}
 ```
+
+**Notes:** In this version we use an optimisation - we know that we will never need to use factorisations or primes greater than `MAX_N/2` so we don't store/cache these.
+
author	James Smith <js5@sanger.ac.uk>	2021-10-07 11:29:42 +0100
committer	James Smith <js5@sanger.ac.uk>	2021-10-07 11:29:42 +0100
commit	ddd032709998a36ca06d4a82b3e4e8ced43a5485 (patch)
tree	4472e0b37f5bbbd35fa584c868832ca8acc4e294
parent	12438a4961abbf613dff8d2fe15552ee1e54e632 (diff)
download	perlweeklychallenge-club-ddd032709998a36ca06d4a82b3e4e8ced43a5485.tar.gz perlweeklychallenge-club-ddd032709998a36ca06d4a82b3e4e8ced43a5485.tar.bz2 perlweeklychallenge-club-ddd032709998a36ca06d4a82b3e4e8ced43a5485.zip