diff options
| author | James Smith <js5@sanger.ac.uk> | 2022-05-17 16:12:02 +0100 |
|---|---|---|
| committer | GitHub <noreply@github.com> | 2022-05-17 16:12:02 +0100 |
| commit | 40283704ea926e979fa65b3caefcd1eedaff1ccf (patch) | |
| tree | 1d4addca00ea89e65473cf0b182d14f63ec49ca2 /challenge-164 | |
| parent | 0dd01ba039f5984febf8fdc28200ca06c5d877d6 (diff) | |
| download | perlweeklychallenge-club-40283704ea926e979fa65b3caefcd1eedaff1ccf.tar.gz perlweeklychallenge-club-40283704ea926e979fa65b3caefcd1eedaff1ccf.tar.bz2 perlweeklychallenge-club-40283704ea926e979fa65b3caefcd1eedaff1ccf.zip | |
Update README.md
Diffstat (limited to 'challenge-164')
| -rw-r--r-- | challenge-164/james-smith/README.md | 296 |
1 files changed, 164 insertions, 132 deletions
diff --git a/challenge-164/james-smith/README.md b/challenge-164/james-smith/README.md index 2ddb7d9199..b819fe5b32 100644 --- a/challenge-164/james-smith/README.md +++ b/challenge-164/james-smith/README.md @@ -14,169 +14,201 @@ You can find the solutions here on github at: https://github.com/drbaggy/perlweeklychallenge-club/tree/master/challenge-164/james-smith -# Challenge 1 - Scalable Vector Graphics (SVG) & Challenge 2 - Line of Best Fit +# Challenge 1 - Prime Palindrome -Usually I write up to separate pieces for our weekly challenge - but as these are linked I though I would write this as a single blog. +***Write a script to find all prime numbers less than 1000, which are also palindromes in base 10. Palindromic numbers are numbers whose digits are the same in reverse. For example, 313 is a palindromic prime, but 337 is not, even though 733 (337 reversed) is also prime.*** -## Line of best fit +## The solution -Although this is challenge 2, let me start by computing explaining how to compute the best fit line (using linear regression) of a set of points. +We use `Math::Prime::Util` and it's functions `next_prime` to loop through primes.... So the code is quite simple.. -The equation for the gradient is: - -``` - n * sum(xy) - sum(x) * sum(y) - b = ----------------------------- - n * sum(xx) - sum(x) * sum(x) -``` - -we can then get the intercept as: - -``` - s(y) - b * s(x) - a = --------------- - n +```perl +my ($p,$lim,@pal)=(1,shift//1e3); +($p ^ reverse $p) || (push @pal,$p) while ($p=next_prime $p) < $lim; +say for @pal; ``` -This is encaptulated in the function: +If we don't want to store the values in `@pal` but instead just print them as generated we can simplify this to: ```perl -sub best_fit { - my $sx = my $sy = my $sxy = my $sxx = 0, my $n = @{$_[0]}; - $sx += $_->[0], $sxy += $_->[0]*$_->[1], - $sy += $_->[1], $sxx += $_->[0]*$_->[0] foreach @{$_[0]}; - my $b = ( $n*$sxy-$sx*$sy ) / ( $n*$sxx - $sx*$sx ); - return ( ($sy-$b*$sx)/$n, $b ); -} +$_=1,my$lim=shift//1e3; +($_^reverse$_)||say while($_=next_prime$_)<$lim; ``` -## SVG +**Note** we use the `//` operator to set a default value `shift//1e3` sets the parameter to the first command line argument or 1000 if there is nothing passed. -There are two parts to the SVG render. +# Challenge 2 - Happy Numbers - * Working out what range of points to draw; - * rendering the points +***Write a script to find the first 8 Happy Numbers in base 10.*** -## Using `$0` +***Starting with any positive integer, replace the number by the sum of the squares of its digits, and repeat the process until the number equals 1 (where it will stay), or it loops endlessly in a cycle which does not include 1. Those numbers for which this process end in 1 are happy numbers, while those numbers that do not end in 1 are unhappy numbers.*** -Rather than writing two programmes this week - I have written a single program - and used `$0` the programme name two switch on/off parts of the code. +## Solution. - * If we are running `ch-1.pl` we need to include the lines in the range calculation in the renderer, if we are running `ch-2.pl` we don't want to (it adds the margins a second time which we don't want) +```perl +sub is_happy { + my($n,$t,%seen) = shift; + while($n>1) { + return 0 if $seen{$n}; + $seen{$t=$n}=1,$n=0; + do {$n+=($t%10)**2} while $t=int($t/10); + } + 1; +} +``` +Not much to see in this code - we loop through the function until we see the same value twice (or we get to `1`). We simply store the values seen as the keys to a hash to achieve this. - * If we are running `ch-2.pl` we need to compute the line of best fit, but obviously in `ch-1.pl` we don't. +**Note:** We use `do { } until` rather than just `{ } until` as this force the block of code within the `{ }` to be executed before the check for the conditional, rather than afterwards. -So (pseudo code) we have: +### Optimized solution -``` - get_data() - get_best_fit_and_limit_to_box() if $0 eq 'ch-2.pl'; - ## Render - get_range( $0 eq 'ch-1.pl' ? "include lines" : "points only" ); - get_sizes_and_rescale() - render(); -``` - -## Altogether now: +If memory is no issue then you can cache the answers (happy/unhappy) for every number in the chain and if you see them for subsequent calculations you can just return that number. Obviously this requires lots of memory if you are working with lists of larger numbers.... ```perl -#!/usr/local/bin/perl -use strict; -use warnings; -use feature qw(say); - -my( @lines, @pts ); - -get_points_and_lines( ); -add_best_fit_line( 20 ) if $0 eq 'ch-2.pl'; -say render_svg( \@pts, \@lines, { - 'max_w' => 960, 'max_h' => 540, - 'margin' => 10, 'radius' => 10, 'fill' => '#090', 'stroke' => 5, 'color' => '#900' -} ); - -sub get_points_and_lines { - my @t; - while(<>) { - chomp; - ## If the entry has 4 numbers it is a line not a point. - 4 == (@t = split /,/) ? ( push @lines, [@t] ) : ( push @pts, [@t] ) for split; +sub happy_list_cache { + my $count= @_ ? $_[0] : 8, my %seen, my $t, state @happy = (0,my @ret = my $N = 1); + OUT: for (2..$count) { + %seen=(); + my $n = ++$N; + while($n>1) { + last if defined $happy[$n] && $happy[$n]==1; + if(defined $happy[$n] || $seen{$n}) { ## Unhappy no... + $happy[$_] = 0 foreach keys %seen; + redo OUT; + } + $seen{$t=$n}=1,$n=0; + do {$n+=($t%10)**2} while $t=int($t/10); + } + $happy[$_]=1 for keys %seen; + $happy[$N]=1; + push @ret,$N; } + @ret; } +``` -sub best_fit { - my $sx = my $sy = my $sxy = my $sxx = 0, my $n = @{$_[0]}; - ## Loop through all points computing sums of x, x^2, y & x.y - $sx += $_->[0], $sxy += $_->[0]*$_->[1], $sy += $_->[1], $sxx += $_->[0]*$_->[0] foreach @{$_[0]}; +**Notes** - ## Use these to compute b with formulae below - my $b = ( $n*$sxy-$sx*$sy ) / ( $n*$sxx - $sx*$sx ); + * If you don't have unlimited memory you could consider only cacheing lower values of "happy".... for a larger value of `n` - we note that the sum of the digits is only going to be at most `81 x #digits` so you could consider only storing the first `1,539` to allow you quickly compute the happy status of all 64-bit signed integers (9,223,372,036,854,775,808) - the second iteration will always be below `1,539`... - ## Work out a by solving y = a + bx for each point (we know that (sx/n,sy/n) is on the line and return - ( ($sy-$b*$sx)/$n, $b ); +```perl +sub happy_list_cache_limited { + my $count= @_ ? $_[0] : 8, my %seen, my $t, state @happy = (0,my @ret = my $N = 1); + OUT: for (2..$count) { + %seen=(); + my $n = ++$N; + while($n>1) { + last if defined $happy[$n] && $happy[$n]==1; + if(defined $happy[$n] || $seen{$n}) { ## Unhappy no... + ($_<1540) && ($happy[$_] = 0) for keys %seen; + redo OUT; + } + $seen{$t=$n}=1,$n=0; + do {$n+=($t%10)**2} while $t=int($t/10); + } + ($_<1540) && ($happy[$_]=1) for $N, keys %seen; + push @ret,$N; + } + @ret; } +``` -sub get_ranges { - my( $pts, $lines ) = @_; - ## We start with the first point/line to get an initial value for min/max x/y. - my($min_x,$min_y)=my($max_x,$max_y) = @{$pts} ? @{$pts->[0]} : @{$lines->[0]}; +**Note** This value is roughly linear in `n` - asymptotic value is around `24.4 x n`. - ## Loop through all points/lines and get the min/max... - ($_->[0]<$min_x)&&($min_x=$_->[0]), ($_->[0]>$max_x)&&($max_x=$_->[0]), - ($_->[1]<$min_y)&&($min_y=$_->[1]), ($_->[1]>$max_y)&&($max_y=$_->[1]) for @{$pts}, map {($_,[$_->[2],$_->[3]])} @{$lines}; +### Pre-computing cache - ## Return the values - ( $min_x, $max_x, $min_y, $max_y ); -} +``` +sub happy_list_precache { + state @happy; + my( $L, $count, $t, @ret, %seen ) = ( 1_640, @_ ? $_[0] : 8 ); + + unless( @happy ) { + @happy=(0,1); + O: for my $N ( 2..$L ) { + my $n = $N;%seen=(); + while($n>1){ + last if defined $happy[$n] && $happy[$n]==1; + if( defined $happy[$n] || $seen{$n} ) { + $happy[$_]=0 for keys %seen; + next O; + } + $seen{$t=$n}=1,$n=0; + do {$n+=($t%10)**2} while $t=int($t/10); + } + $happy[$_]=1 for $N, keys %seen; + } + } -sub add_best_fit_line { - my $extn = shift; - my( $a, $b ) = best_fit( \@pts ); - my( $min_x, $max_x, $min_y, $max_y ) = get_ranges( \@pts ); - my $l_y = $a + $b * ($min_x - $extn); - my $r_y = $a + $b * ($max_x + $extn); - my $l_x = $l_y < $min_y - $extn ? ( ($l_y = $min_y - $extn ) - $a)/$b - : $l_y > $max_y + $extn ? ( ($l_y = $max_y + $extn ) - $a)/$b : $min_x - $extn; - my $r_x = $r_y < $min_y - $extn ? ( ($r_y = $min_y - $extn ) - $a)/$b - : $r_y > $max_y + $extn ? ( ($r_y = $max_y + $extn ) - $a)/$b : $max_x + $extn; - push @lines, [ $l_x,$l_y,$r_x,$r_y ]; + ## Now loop through until we have a list of first $count happy + ## numbers. + ## If we wanted to use this method in an if_happy function - could + ## equally replace this with + ## return $happy[$N] if $N <= $L; + ## my $n=0; + ## do {$n+=($N%10)**2} while $N=int($N/10); + ## return $happy[$n]; + my $N=0; + for (1..$count) { + $N++; + if( $N <= $L ) { + $happy[$N] || redo; + } else { + my $n=0,$t=$N; + do {$n+=($t%10)**2} while $t=int($t/10); + $happy[$n] || redo; + } + push @ret,$N; + } + @ret; } +``` + +We can use this technique to create an optimized version of `is_happy` - as we are using a `state` variable to store the cache -sub render_svg { - my( $pts, $lines, $config ) = @_; - - # Get range (I know we do this twice for ch-2.pl - but want to keep this code simple! - my( $min_x, $max_x, $min_y, $max_y ) = get_ranges( $pts, $0 eq 'ch-2.pl' ? [] : $lines ); - my $margin = $config->{'margin'}//20; - - ## Get size of image and size of chart {we add a margin to each side of the graph so "points" don't fall off the edge... - my($W,$H,$width,$height) = ($config->{'max_w'}//800,$config->{'max_h'}//600,$max_x-$min_x+2*$margin,$max_y-$min_y+2*$margin); - - ## resize bounding box so that graph's aspect ratio is preserved, but the graph just fits in the box. - ( $width/$height > $W/$H ) ? ( $H = $height/$width*$W ) : ( $W = $width/$height*$H ); - - ## calculate scale factor - so the spots and lines are the same size/width irrespective of scaling from size of image in px - ## to dimensions of graph... - my $sf = $width/$W; - - ## Finally render the chart - with one mega sprintf... - - return sprintf '<?xml version="1.0" encoding="UTF-8" standalone="yes"?> -<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.0//EN" "http://www.w3.org/TR/2001/REC-SVG-20010904/DTD/svg10.dtd"> -<svg height="%s" width="%s" viewBox="%s %s %s %s" xmlns="http://www.w3.org/2000/svg" - xmlns:svg="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink"> - <rect stroke="%s" stroke-width="%s" fill="%s" x="%s" y="%s" width="%s" height="%s" /> - <g stroke="%s" stroke-width="%s"> - %s - </g> - <g fill="%s"> - %s - </g> -</svg>', - $H, $W, $min_x - $margin, $min_y - $margin, $width, $height, ## svg element - $config->{'border'}//'#000', $sf, $config->{'bg'}//'#eee', ## background rectangle - $min_x - $margin, $min_y - $margin, $width, $height, - $config->{'fill'}//'#000', ($config->{'stroke'}//5) * $sf, ## lines - join( qq(\n ), map { sprintf '<line x1="%s" y1="%s" x2="%s" y2="%s" />', @{$_} } @{$lines} ), - $config->{'color'}//'#ccc', ## dots - join qq(\n ), map { sprintf '<circle cx="%s" cy="%s" r="%s" />', @{$_}, ($config->{'radius'}//10)*$sf } @{$pts}; +```perl +sub is_happy_precache { + state @happy; + my( $L, $N, $t, @ret, %seen ) = ( 1_640, $_[0] ); + + ## Set up cache if empty + unless( @happy ) { + @happy=(0,1); + O: for my $N ( 2..$L ) { + my $n = $N;%seen=(); + while($n>1){ + last if defined $happy[$n] && $happy[$n]==1; + if( defined $happy[$n] || $seen{$n} ) { + $happy[$_]=0 for keys %seen; + next O; + } + $seen{$t=$n}=1,$n=0; + do {$n+=($t%10)**2} while $t=int($t/10); + } + $happy[$_]=1 for $N, keys %seen; + } + } + + ## Get value from cache.... + if( $N > $L ) { ## If not in cached array we replace + my $n=$N,$N=0; ## $N by the sum of it's digits squared + do {$N+=($n%10)**2} while $n = int($n/10); + } + $happy[$N]; ## And look up value in the cache.. } ``` + +### Relative performance: + +Computing list of first 1,000,000 happy values + +| | time | is_happy | list_cache | list_cache_limit | is_precache | list_precache | +| :---------------- | ----: | -------: | ---------: | ---------------: | ----------: | ------------: | +| is_happy | 69.1 | -- | -75% | -78% | -84% | -87% | +| list_cache | 17.0 | 306% | -- | -11% | -37% | -47% | +| list_cache_limit | 15.1 | 356% | 12% | -- | -29% | -40% | +| is_precache | 10.8 | 542% | 58% | 41% | -- | -16% | +| list_precache | 9.04 | 664% | 88% | 67% | 19% | -- | + + * So we can see both `precache` methods are most efficient being over **40%** faster that every other method. + * The best method `list_precache` is nearly **8** times faster than the naive looping and calling our `is_happy` function repeatedly. + * Limiting the cache to just those numbers below 1540 has a slight performance gain about `1/8` over the one where we store every value in the cache. For values over `1540` we only ever store data in the cache and not retrieve it. |
