diff options
| author | drbaggy <js5@sanger.ac.uk> | 2021-06-06 22:16:11 +0100 |
|---|---|---|
| committer | drbaggy <js5@sanger.ac.uk> | 2021-06-06 22:16:11 +0100 |
| commit | 86f1bdedafd2a12e2354979a8d50d12d8d43ea8a (patch) | |
| tree | 4f3b667a52f1d44c0973da773f6af9babb1bb83a /challenge-115/james-smith | |
| parent | 7deeff0a33cc808985745bd3dce4a3aeb1b2d3bb (diff) | |
| parent | 200c5f7c9c0d4882d800a9a25b8f04f14b034fdf (diff) | |
| download | perlweeklychallenge-club-86f1bdedafd2a12e2354979a8d50d12d8d43ea8a.tar.gz perlweeklychallenge-club-86f1bdedafd2a12e2354979a8d50d12d8d43ea8a.tar.bz2 perlweeklychallenge-club-86f1bdedafd2a12e2354979a8d50d12d8d43ea8a.zip | |
x
Diffstat (limited to 'challenge-115/james-smith')
| -rw-r--r-- | challenge-115/james-smith/README.md | 356 |
1 files changed, 175 insertions, 181 deletions
diff --git a/challenge-115/james-smith/README.md b/challenge-115/james-smith/README.md index 95ca983fee..aa10b1d2cd 100644 --- a/challenge-115/james-smith/README.md +++ b/challenge-115/james-smith/README.md @@ -1,6 +1,6 @@ -# Perl Weekly Challenge #114 +# Perl Weekly Challenge #115 -# What no regexs or loops.... +# Cursing at recursion You can find more information about this weeks, and previous weeks challenges at: @@ -12,228 +12,222 @@ submit solutions in whichever language you feel comfortable with. You can find the solutions here on github at: -https://github.com/drbaggy/perlweeklychallenge-club/tree/master/challenge-114/james-smith/perl +https://github.com/drbaggy/perlweeklychallenge-club/tree/master/challenge-115/james-smith/perl -# Challenge 1 - Next highest palindrome +# Challenge 1 - String Chain -***You are given a positive integer `$N`. Write a script to find out -the next Palindrome Number higher than the given integer `$N`.*** +***You are given an array of strings. Write a script to find out if the given strings can be chained to form a circle. Print 1 if found otherwise 0.*** -## The solution - naive +## Clarification -We will see this again for the next challenge we just increment `$N` -until we find another palindrome. +Here we make the assumption that the chain includes **ALL** elements. -```perl -sub next_palindrome_naive { - my ($n) = @_; - 1 until ++$n eq reverse $n; - return $n -} -``` - -## The solution - optimized - -First we note that it is easier to compute the palindrome greater than -equal to itself {so we just incremement the passed parameter}. +## The solution - a quick filter -We should then be able to do away with the loop entirely as the -palindromic number will either have the same first half as itself OR -will have this value incrememented by 1 as the first half.... No loop -requried.. +There is a trick to see if we have **NO** solution. If we keep a track +of all the times a letter appears at the beginning of the word AND at +the end then these have to be equal! We can do this in perl using a hash, +for initial letters we increment the value of the hash, for final letters +we decrement it. -### The cases.. +If any value in the hash at the end of the loop is non-zero then we have +an imbalance and so we can't find a solution. -There are two cases we need to consider: +```perl + my %F; + ($F{substr$_,0,1}++,$F{substr$_,-1}--) foreach @words; + return 0 if grep {$_} values %F; +``` - * There are an even number of digits - * There are an odd number of digits.. +If this check is passed we may still not have a solution as we may have +two or more circles. e.g. -The first case is slightly easiers as we just check to see if the -palindrome created by reversing the first digits and putting them -at the end is greater than or equal to the number, and if not -increment and try again. +``` + abc a->b->c m->n->o + cde ^ | ^ | + efg | v | v + gha h d t p + mno ^ | ^ | + opq | v | v + qrs g<-f<-e s<-r<-q + stm +``` +## The curse of recursion. -The second case is slightly more interesting as we have the middle -digit to consider. In the 2nd half above we can increment the middle -digit if (less than 9) OR incremennt the first digits.. +We can use recursion to find out if we have ANY solution which +satisfies this criteria. ```perl -sub next_palindrome { - my $p = 1 + shift; - my $x = substr $p, 0, (length $p)>>1; - if( 1 & length $p ) { - my $y = substr $p, (length$p)>>1, 1; - return $x.$y.reverse $x if $p <= $x.$y.reverse $x; - return $x.($y+1).reverse $x if $y<9; - $x++; - return $x.'0'.reverse $x; - } else { - $x++ if $p > $x.reverse $x; - return $x.reverse $x; +sub exhaust { + my ($init,@words) = @_; + my $n = @words; + if( $n==1) { + return substr($init,-1) eq substr($words[0],0,1) + && substr($init,0,1) eq substr($words[0],-1) ? 1 : 0; + } + foreach(1..$n) { + push @words,shift @words; + next unless (substr $init,-1) eq substr $words[0],0,1; + return 1 if exhaust( $init.$words[0], @words[1..($n-1)] ); } + return 0; } ``` -## Notes and Summary +We rotate the words array to avoid needing to do a complex `splice`... -You will note I've used the "Yoda" form of some of the expressions -inequalities. It is much easier for instance to realise that: -`1 & length $p` is "and"ing `1` with the length of `$p` rather than -"and"ing `1` with `$p` and then taking the length (which will be 1) if -you were to write `length $p & 1`... +This works - but for complex examples can hit the dreaded +"Deep recursion" warning... -There were some cases where I thought assigning the result of -`reverse$x` and `length$p` would speed things up - but it seemed to -slow things down by 10% or so - So I'm assuming there is some neat -code in the interpreter/compiler does this cacheing for you. +## The cure for recursion... -For small numbers of `$N` there is little difference in the performance -15% - but as soon as numbers are up to 3/4 digits then the optimised -version is 6 times faster, for 5/6 digits 80 times faster, for 7/8 -approximately 1000 times faster... +We don't actually need to recurse here, we know that we can combine +the strings into 2 or more circles (in many different ways potentially). +But if we have two circles that touch - it is is easy to see that by +splicing the two circles together at this point makes a single large +circle... -# Challenge 2 - Higher Integer Set Bits +So here comes solution 2.... -***You are given a positive integer `$N`. Write a script to find -the next higher integer having the same number of 1 bits in binary -representation as `$N`.*** +Find a loop, if we have any words left - see if any of those words start +with a letter we have already seen in the loop. If so we repeat the +loop finder with the rest of the words... Probably easier to see in +pictures: +``` +abc + +cde -> a-b-c -> a-b-c -> a b c + | | | | | | +efg h d h d h d + | | | | | | +gha g-f-e g-f-e-o-p g e-o-p + | / | / +eop e-o-p t q f q + | / |/ | / +pqz t q z e-t-z + |/ +zte z +``` -## The solution - naive - -There is a simple solution we can try - and that is to take the number, -count the number of 1-bits, and then just increment repeatedly until we -get a number with the same amount of 1-bits. - +So the non-recursive routine is: ```perl -sub next_bin { - my $n = shift; - my $c = (sprintf '%b', $n) =~ tr/1/1/; - while(++$n) { - return $n if $c == ( (sprintf '%b', $n) =~ tr/1/1/ ); +sub circ_single_non_recursive { + ## This quickly filters out those cases in which we + ## can't join end on end + my @words = @_; + my %F; + ($F{substr$_,0,1}++,$F{substr$_,-1}--) foreach @words; + return 0 if grep {$_} values %F; + + ## Now we start at any point and get the first circle, keeping + ## track of letters we have included in the loop(s) `%seen` + my %seen; + while(@words) { + my $init = shift @words; + $seen{ord $init}=1; + my $ptr = 0; + ## Skip this bit if the word is "self-closing" ie starts/ends + ## with same letter... + if( substr($init,0,1) ne substr $init, -1 ) { + while($ptr++ < @words) { + ## If we have a match - we just start again until + ## we do not find a match.... + if( (substr $init,-1) eq substr $words[0],0,1 ) { + $seen{ ord $words[0] } = 1; + $init =shift @words; + $ptr = 0; + return 1 unless @words; ## Return 1 we have got to end of list! + } + ## Rotate the list. + push @words, shift @words; + } + } + return 1 unless @words; ## We have no words left - success... + ## Do we have a loop that will extend the first loop... + ## Find any word which starts with a letter we have already seen! + $ptr=0; + $init=undef; + while( $ptr++ < @words) { + if($seen{ord $words[0]}) { + $init=1; + last; + } + push @words,shift@words; + } + return 0 unless $init; ## No words - so will return 0 } + return 1; ## Got to the end - no words left! YAY!!! } ``` +## Summary - * We convert the number to binary using sprintf with the format `'%b'`; - * We count the number of "1"s in the string using `tr`. `tr/1/1/` leaves - the string unchanged but returns the number of "1"s that were replaced. - -## The solution - optimized - -We can easily find a solution to this problem. - -If the number contains a pair of digits "01" then we can find a number -that is larger but has the same number of digits by swapping the "01" to "10". -(Note we can force the binary representation to always have a "01" by prefixing -the binary representation with "0") - -So e.g. `174 = 1010 1110` - you can replace either of the `01`s to give either: - - * `1100 1110 = 206` - * `1011 0110 = 182` - -We note that to minimize the number we start by replacing the last `01` by `10` - -So we have: `182 = 1011 0110 > 174 = 1010 1100` +Looking at performance - avoiding recursion is good and increases +performance considerably. For small examples it is 5-20% faster, but +for more complex examples the benefit grows rapidly. -The digits after the last `01` will be of the form `1...10..0`, so we can again -reduce the value by flipping this string around to be `0...01...1`; +### A difficult example... -So now we have: `179 = 1011 0011 > 174 = 1010 110` +To test performance I created a random sequence of words (all combinations +of 2 letters!)... There is obvious a solution to this, but there are also +many many shorter solutions... -The code then becomes either: +The non-recursive solution is approximately 6 times faster. -```perl -sub next_bin_rex { - return oct '0b'.sprintf('0%b',shift) =~ s{01(1*)(0*)$}{10$2$1}r; -} ``` -or - -```perl -sub next_bin_rrev { - my $t = rindex my $s = sprintf('0%b',shift),'01'; - return oct '0b'.substr($s,0,$t).'10'.reverse substr $s,$t+2; -} +pk fz iy oz cf xm gm uy ur te ct zz rw jm aq oq xy mi me rv jc iv sx pq lz nd +cm vj uf rq ij zk ef wm bb cj vv oo og ft fq mj os uh gn ml mz fm az yr zh wa +bm gj xn df yf er xc xb bl uw ri nq nn oi pc ym jr da rz bq vm sr ni jz po oj +wf iu ja tu lk yt nc sl wi zb hm uv th kn hk pv yq ez we im gt za sj nh qr bt +rr ok ai xx qs lg ue fc ws vc vy ki xi wy fv lt rl xw nj gu dz ip zl je pf hn +uk di mv ug vf uz wt yn qx rh sz fh pm sa qz gp gw jp ve le fe ia nw pu km uo +gy li pe hj mn ew hg qg se mq by vh ca hd bn nl xu dd ji bd ol vp wp yo st ac +bh bx fa md zx mm ox qi mc lc jx wz jd xt vr yj pn uj zy ih ul pb id xh wb qq +xg ou sp bo yz bu ec vz fy io hl jo cl zs ge tz qp mf zf kq sk qw as vs rm jw +yk tr cw tk xz kt ra qo tj fu dy hx ic ej nt jt gf ko rd od ep qn sg ek ui bg +iw zm at dv fx kb xr nu xl xs lr xk na ne xe rt jf ga hv kf xq sq pt cp rj fr +fp qf gg ii ey tn ce ya kl wc ks qh em gc ts dn wk is fo pa sd ly uc zn dk wq +bw tq kg xf vn ea hh ik lm nr wd kc mu ru co kx nb gl fs bi hu fk ld qa qy qm +wl cn cr zd ke jl gz wr xp tm tl kj no ex wj su dc sh ee hf dw ax ms hq jj sb +ed qj vw ha ju wx yi sf ln jy rx ei sy ar dm hr al ah mb on ob uq ps lv ad jk +rb fb gk cc rp jq ka my ix nv re vk tp zw rg tb up pp uu ds ho zv nf ty cu kp +eg in lu hy mk um zt hc qt yp tv rk hb pj ph bk af to lq qe ib gi bz jh iz lo +yv ci jg gh yx bj il cz gx ro ff kk vd ub et bs tw si qd ql au ti xj yh yl kr +om nx lw wo gs gq ku mw py tx ll fl xd ch rc go dt lx zu ry hz bc lj la lh ux +sc fn it ir tt mo pr gd sm mx jn cd vl vb mt pz vt he eh ss dh dj yw xv be ov +wh ww pg ao es ye xa vq hs yc yd vu ns zr lf rs pl cs eo zi qb qk so oc wn el +zc yb de mg fw cx cv wv ot cb qu wu nm ow zp rn hw lp ma en vg cy rf tc am ut +cq sv kv oh jb np us ck mh ny gr gb op an kw aa vo zq iq ba px dr un or ze bp +zg eb ud if dl dq zj ky bf vi cg ua br yu mp bv sn pi db ae of kh pd hp qc jv +xo du fg ta do kd dx av ys tg ls fi kz tf eu aj sw vx oa pw fd ab hi va lb dg +ig dp nz js qv ag aw eq mr zo yy ie nk yg ap oe gv oy ht ev ak td ay fj ng wg ``` -depending on whether or not you use a regular expression to find -the last "`01`" in the binary representaiton. - -## The solution - with go faster stripes... +# Challenge 2 - Largest even -After a discussion on facebook with Eliza Skr, about whether or not -to use regexs rather than `rindex` she supplied a different algorithm -for finding the next number - which didn't involve manipulating the -binary string but by working out the arithmetic to make the changes. +***You are given a list of positive integers (`0`-`9`), single digit. Write a script to find the largest multiple of `2` that can be formed from the list.*** - * The number is of the form is `0 1111 00000000` - * The next hightest number is `1 000000000 111` - * To map `0 1111 00000000` to `1 000000000 000` we need to add - `1 00000000` (which is 2^#zeros) - * To map `1 000000000 000` to `1 000000000 111` we need to add `111` - which is 2^(#ones -1) -1 +## The solution -Eliza's solution was to obtain counts of `0`s and `1`s using a simple -regex `/(1+)(0*)$/` which works - but is still a regular expression, -which as we discussed above is a slow operation. +For once challenge 2 is easier. -We can replace this again with using `rindex`... Also rather than -using `2**$n` we replace it with the much quicker bit-shift operator -`1<<$n` - which achieves the same effect. +To find the largest number we just sort the digits in descending order +and stitch them together. -This gives us: +To find the largest even number we just sort the digits in descending +order, but move the lowest even number to the end. ```perl -sub next_bin_rindex2 { - my $t=rindex my$s=sprintf('%b',$_[0]),'1'; - return $_[0] + (1<<(-1-$t+length$s)) - - 1 + (1<<(-1+$t-rindex$s,'0',$t)); +sub biggest_even { + my $ptr = my @digits = reverse sort @{$_[0]}; + while( $ptr-- ) { + next if $digits[$ptr] & 1; ## Skip if odd... + return join '', + @digits[ 0..$ptr-1, $ptr+1..$#digits, $ptr ]; + } + return ''; } ``` -A few notes: - - * here we use the three parameter version of `rindex`, - which allows you to specify an offset for the search to start (in this - case we want the last "`0`" before the last "`1`" so we use the position - of the "`1`" as the offset) - * We use the bit-shift operator `<<` to raise to the power `2` - rather than the power operator.... If we break down all the efficiency - gains between the the rrev & rind2 methods - most of the gain would - be lost if we reverted back to `2**$n`. - * We looked to see if unpack was more efficient than sprintf - but found - that this was not the case {about 20-40% slower}. - -## Summary - -Both the performance of `next_bin_regex` and `next_bin_rrev` appear -to slow down only slightly as `$N` increases - comparabale with -"linear" scans for the last "`01`". - -Interestingly the `next_bin_rind2` seems to run at similar speeds for -all ranges of `$N`. - -The naive `next_bin` - doesn't have that property - at all and it -rapidly tails off performance wise. - -Running this a large number of times - we have the following -approximate rates for the calculations.... - -| Size of number | Rate rind2 | Rate rrev | Rate regex | Rate naive | -| -------------- | ---------: | ---------: | ---------: | ---------: | -| 1-500 | 1,900,000 | 1,550,000 | 500,000 | 600,000 | -| Approx 1000 | 1,800,000 | 1,500,000 | 440,000 | 400,000 | -| Approx 1x10^6 | 1,800,000 | 1,350,000 | 390,000 | 4,000 | -| Approx 1x10^9 | 1,850,000 | 1,250,000 | 330,000 | 1 | - -The calls do include the hardest example `2^n-1` for which the next -number is `2^(n-1)` more - so much of the time in the naive loop is -taken up by that example - in the 1x10^9 example this would require -500_000_000 iterations of the increment/check loop. - -We see as we did a few weeks ago that if you don't actually need to -use regexs then you can get an appreciable speed boost. Obviously -remembering there is trade off between coding and running time. +The while loop just looks for the smallest even number & moves it +to the end using an array slice. |
