x

author: drbaggy <js5@sanger.ac.uk> 2021-06-06 22:16:11 +0100
committer: drbaggy <js5@sanger.ac.uk> 2021-06-06 22:16:11 +0100
commit: 86f1bdedafd2a12e2354979a8d50d12d8d43ea8a (patch)
tree: 4f3b667a52f1d44c0973da773f6af9babb1bb83a /challenge-115/james-smith
parent: 7deeff0a33cc808985745bd3dce4a3aeb1b2d3bb (diff)
parent: 200c5f7c9c0d4882d800a9a25b8f04f14b034fdf (diff)
download: perlweeklychallenge-club-86f1bdedafd2a12e2354979a8d50d12d8d43ea8a.tar.gz
perlweeklychallenge-club-86f1bdedafd2a12e2354979a8d50d12d8d43ea8a.tar.bz2
perlweeklychallenge-club-86f1bdedafd2a12e2354979a8d50d12d8d43ea8a.zip
1 files changed, 175 insertions, 181 deletions
diff --git a/challenge-115/james-smith/README.md b/challenge-115/james-smith/README.md
index 95ca983fee..aa10b1d2cd 100644
--- a/challenge-115/james-smith/README.md
+++ b/challenge-115/james-smith/README.md
@@ -1,6 +1,6 @@
-# Perl Weekly Challenge #114
+# Perl Weekly Challenge #115
 
-# What no regexs or loops....
+# Cursing at recursion
 
 You can find more information about this weeks, and previous weeks challenges at:
 
@@ -12,228 +12,222 @@ submit solutions in whichever language you feel comfortable with.
 
 You can find the solutions here on github at:
 
-https://github.com/drbaggy/perlweeklychallenge-club/tree/master/challenge-114/james-smith/perl
+https://github.com/drbaggy/perlweeklychallenge-club/tree/master/challenge-115/james-smith/perl
 
-# Challenge 1 - Next highest palindrome
+# Challenge 1 - String Chain
 
-***You are given a positive integer `$N`. Write a script to find out
-the next Palindrome Number higher than the given integer `$N`.***
+***You are given an array of strings. Write a script to find out if the given strings can be chained to form a circle. Print 1 if found otherwise 0.***
 
-## The solution - naive
+## Clarification
 
-We will see this again for the next challenge we just increment `$N`
-until we find another palindrome.
+Here we make the assumption that the chain includes **ALL** elements.
 
-```perl
-sub next_palindrome_naive {
-  my ($n) = @_;
-  1 until ++$n eq reverse $n;
-  return $n
-}
-```
-
-## The solution - optimized
-
-First we note that it is easier to compute the palindrome greater than
-equal to itself {so we just incremement the passed parameter}.
+## The solution - a quick filter
 
-We should then be able to do away with the loop entirely as the
-palindromic number will either have the same first half as itself OR
-will have this value incrememented by 1 as the first half.... No loop
-requried..
+There is a trick to see if we have **NO** solution. If we keep a track
+of all the times a letter appears at the beginning of the word AND at
+the end then these have to be equal! We can do this in perl using a hash,
+for initial letters we increment the value of the hash, for final letters
+we decrement it.
 
-### The cases..
+If any value in the hash at the end of the loop is non-zero then we have
+an imbalance and so we can't find a solution.
 
-There are two cases we need to consider:
+```perl
+  my %F;
+  ($F{substr$_,0,1}++,$F{substr$_,-1}--) foreach @words;
+  return 0 if grep {$_} values %F;
+```
 
- * There are an even number of digits
- * There are an odd number of digits..
+If this check is passed we may still not have a solution as we may have
+two or more circles. e.g.
 
-The first case is slightly easiers as we just check to see if the
-palindrome created by reversing the first digits and putting them
-at the end is greater than or equal to the number, and if not
-increment and try again.
+```
+ abc           a->b->c    m->n->o
+ cde           ^     |    ^     |
+ efg           |     v    |     v
+ gha           h     d    t     p
+ mno           ^     |    ^     |
+ opq           |     v    |     v
+ qrs           g<-f<-e    s<-r<-q
+ stm
+```
+## The curse of recursion.
 
-The second case is slightly more interesting as we have the middle
-digit to consider. In the 2nd half above we can increment the middle
-digit if (less than 9) OR incremennt the first digits..
+We can use recursion to find out if we have ANY solution which
+satisfies this criteria.
 
 ```perl
-sub next_palindrome {
-  my $p  = 1 + shift;
-  my $x = substr $p, 0, (length $p)>>1;
-  if( 1 & length $p ) {
-    my $y = substr $p, (length$p)>>1, 1;
-    return $x.$y.reverse $x      if $p <= $x.$y.reverse $x;
-    return $x.($y+1).reverse $x  if $y<9;
-    $x++;
-    return $x.'0'.reverse $x;
-  } else {
-    $x++ if $p > $x.reverse $x;
-    return $x.reverse $x;
+sub exhaust {
+  my ($init,@words) = @_;
+  my $n = @words;
+  if( $n==1) {
+    return substr($init,-1)  eq substr($words[0],0,1)
+        && substr($init,0,1) eq substr($words[0],-1)  ? 1 : 0;
+  }
+  foreach(1..$n) {
+    push @words,shift @words;
+    next unless (substr $init,-1) eq substr $words[0],0,1;
+    return 1 if exhaust( $init.$words[0], @words[1..($n-1)] );
   }
+  return 0;
 }
 ```
 
-## Notes and Summary
+We rotate the words array to avoid needing to do a complex `splice`...
 
-You will note I've used the "Yoda" form of some of the expressions
-inequalities. It is much easier for instance to realise that:
-`1 & length $p` is "and"ing `1` with the length of `$p` rather than
-"and"ing `1` with `$p` and then taking the length (which will be 1) if
-you were to write `length $p & 1`...
+This works - but for complex examples can hit the dreaded
+"Deep recursion" warning...
 
-There were some cases where I thought assigning the result of
-`reverse$x` and `length$p` would speed things up - but it seemed to
-slow things down by 10% or so - So I'm assuming there is some neat
-code in the interpreter/compiler does this cacheing for you.
+## The cure for recursion...
 
-For small numbers of `$N` there is little difference in the performance
-15% - but as soon as numbers are up to 3/4 digits then the optimised
-version is 6 times faster, for 5/6 digits 80 times faster, for 7/8
-approximately 1000 times faster...
+We don't actually need to recurse here, we know that we can combine
+the strings into 2 or more circles (in many different ways potentially).
+But if we have two circles that touch - it is is easy to see that by 
+splicing the two circles together at this point makes a single large
+circle...
 
-# Challenge 2 - Higher Integer Set Bits
+So here comes solution 2....
 
-***You are given a positive integer `$N`. Write a script to find
-the next higher integer having the same number of 1 bits in binary
-representation as `$N`.***
+Find a loop, if we have any words left - see if any of those words start
+with a letter we have already seen in the loop. If so we repeat the
+loop finder with the rest of the words... Probably easier to see in 
+pictures:
 
+```
+abc
+
+cde  ->   a-b-c    ->  a-b-c        ->  a b c
+          |   |        |   |            |   |
+efg       h   d        h   d            h   d
+          |   |        |   |            |   |
+gha       g-f-e        g-f-e-o-p        g   e-o-p
+                           |  /         |      /
+eop       e-o-p            t q          f     q
+          |  /             |/           |    /
+pqz       t q              z            e-t-z
+          |/
+zte       z 
+```
 
-## The solution - naive
-
-There is a simple solution we can try - and that is to take the number,
-count the number of 1-bits, and then just increment repeatedly until we
-get a number with the same amount of 1-bits.
-
+So the non-recursive routine is:
 ```perl
-sub next_bin {
-  my $n = shift;
-  my $c = (sprintf '%b', $n) =~ tr/1/1/;
-  while(++$n) {
-    return $n if $c == ( (sprintf '%b', $n) =~ tr/1/1/ );
+sub circ_single_non_recursive {
+  ## This quickly filters out those cases in which we
+  ## can't join end on end
+  my @words = @_;
+  my %F;
+  ($F{substr$_,0,1}++,$F{substr$_,-1}--) foreach @words;
+  return 0 if grep {$_} values %F; 
+
+  ## Now we start at any point and get the first circle, keeping
+  ## track of letters we have included in the loop(s) `%seen`
+  my %seen;
+  while(@words) {
+    my $init = shift @words;
+    $seen{ord $init}=1;
+    my $ptr = 0;
+    ## Skip this bit if the word is "self-closing" ie starts/ends
+    ## with same letter...
+    if( substr($init,0,1) ne substr $init, -1 ) {
+      while($ptr++ < @words) {
+        ## If we have a match - we just start again until
+        ## we do not find a match....
+        if( (substr $init,-1) eq substr $words[0],0,1 ) {
+          $seen{ ord $words[0] } = 1;
+          $init =shift @words;
+          $ptr = 0;
+          return 1 unless @words; ## Return 1 we have got to end of list!
+        }
+        ## Rotate the list.
+        push @words, shift @words;
+      }
+    }
+    return 1 unless @words; ## We have no words left - success...
+    ## Do we have a loop that will extend the first loop...
+    ## Find any word which starts with a letter we have already seen!
+    $ptr=0;
+    $init=undef;
+    while( $ptr++ < @words) {
+      if($seen{ord $words[0]}) {
+        $init=1;
+        last;
+      }
+      push @words,shift@words;
+    }
+    return 0 unless $init; ## No words - so will return 0
   }
+  return 1; ## Got to the end - no words left! YAY!!!
 }
 ```
+## Summary
 
- * We convert the number to binary using sprintf with the format `'%b'`;
- * We count the number of "1"s in the string using `tr`. `tr/1/1/` leaves
-   the string unchanged but returns the number of "1"s that were replaced.
-
-## The solution - optimized
-
-We can easily find a solution to this problem.
-
-If the number contains a pair of digits "01" then we can find a number
-that is larger but has the same number of digits by swapping the "01" to "10".
-(Note we can force the binary representation to always have a "01" by prefixing
-the binary representation with "0")
-
-So e.g. `174 = 1010 1110` - you can replace either of the `01`s to give either:
-
- * `1100 1110 = 206`
- * `1011 0110 = 182`
-
-We note that to minimize the number we start by replacing the last `01` by `10`
-
-So we have: `182 = 1011 0110 > 174 = 1010 1100`
+Looking at performance - avoiding recursion is good and increases
+performance considerably. For small examples it is 5-20% faster, but
+for more complex examples the benefit grows rapidly.
 
-The digits after the last `01` will be of the form `1...10..0`, so we can again
-reduce the value by flipping this string around to be `0...01...1`;
+### A difficult example...
 
-So now we have: `179 = 1011 0011 > 174 = 1010 110`
+To test performance I created a random sequence of words (all combinations
+of 2 letters!)... There is obvious a solution to this, but there are also
+many many shorter solutions...
 
-The code then becomes either:
+The non-recursive solution is approximately 6 times faster.
 
-```perl
-sub next_bin_rex {
-  return oct '0b'.sprintf('0%b',shift) =~ s{01(1*)(0*)$}{10$2$1}r;
-}
 ```
-or
-
-```perl
-sub next_bin_rrev {
-  my $t = rindex my $s = sprintf('0%b',shift),'01';
-  return oct '0b'.substr($s,0,$t).'10'.reverse substr $s,$t+2;
-}
+pk fz iy oz cf xm gm uy ur te ct zz rw jm aq oq xy mi me rv jc iv sx pq lz nd
+cm vj uf rq ij zk ef wm bb cj vv oo og ft fq mj os uh gn ml mz fm az yr zh wa
+bm gj xn df yf er xc xb bl uw ri nq nn oi pc ym jr da rz bq vm sr ni jz po oj
+wf iu ja tu lk yt nc sl wi zb hm uv th kn hk pv yq ez we im gt za sj nh qr bt
+rr ok ai xx qs lg ue fc ws vc vy ki xi wy fv lt rl xw nj gu dz ip zl je pf hn
+uk di mv ug vf uz wt yn qx rh sz fh pm sa qz gp gw jp ve le fe ia nw pu km uo
+gy li pe hj mn ew hg qg se mq by vh ca hd bn nl xu dd ji bd ol vp wp yo st ac
+bh bx fa md zx mm ox qi mc lc jx wz jd xt vr yj pn uj zy ih ul pb id xh wb qq
+xg ou sp bo yz bu ec vz fy io hl jo cl zs ge tz qp mf zf kq sk qw as vs rm jw
+yk tr cw tk xz kt ra qo tj fu dy hx ic ej nt jt gf ko rd od ep qn sg ek ui bg
+iw zm at dv fx kb xr nu xl xs lr xk na ne xe rt jf ga hv kf xq sq pt cp rj fr
+fp qf gg ii ey tn ce ya kl wc ks qh em gc ts dn wk is fo pa sd ly uc zn dk wq
+bw tq kg xf vn ea hh ik lm nr wd kc mu ru co kx nb gl fs bi hu fk ld qa qy qm
+wl cn cr zd ke jl gz wr xp tm tl kj no ex wj su dc sh ee hf dw ax ms hq jj sb
+ed qj vw ha ju wx yi sf ln jy rx ei sy ar dm hr al ah mb on ob uq ps lv ad jk
+rb fb gk cc rp jq ka my ix nv re vk tp zw rg tb up pp uu ds ho zv nf ty cu kp
+eg in lu hy mk um zt hc qt yp tv rk hb pj ph bk af to lq qe ib gi bz jh iz lo
+yv ci jg gh yx bj il cz gx ro ff kk vd ub et bs tw si qd ql au ti xj yh yl kr
+om nx lw wo gs gq ku mw py tx ll fl xd ch rc go dt lx zu ry hz bc lj la lh ux
+sc fn it ir tt mo pr gd sm mx jn cd vl vb mt pz vt he eh ss dh dj yw xv be ov
+wh ww pg ao es ye xa vq hs yc yd vu ns zr lf rs pl cs eo zi qb qk so oc wn el
+zc yb de mg fw cx cv wv ot cb qu wu nm ow zp rn hw lp ma en vg cy rf tc am ut
+cq sv kv oh jb np us ck mh ny gr gb op an kw aa vo zq iq ba px dr un or ze bp
+zg eb ud if dl dq zj ky bf vi cg ua br yu mp bv sn pi db ae of kh pd hp qc jv
+xo du fg ta do kd dx av ys tg ls fi kz tf eu aj sw vx oa pw fd ab hi va lb dg
+ig dp nz js qv ag aw eq mr zo yy ie nk yg ap oe gv oy ht ev ak td ay fj ng wg
 ```
 
-depending on whether or not you use a regular expression to find
-the last "`01`" in the binary representaiton.
-
-## The solution - with go faster stripes...
+# Challenge 2 - Largest even
 
-After a discussion on facebook with Eliza Skr, about whether or not
-to use regexs rather than `rindex` she supplied a different algorithm
-for finding the next number - which didn't involve manipulating the
-binary string but by working out the arithmetic to make the changes.
+***You are given a list of positive integers (`0`-`9`), single digit. Write a script to find the largest multiple of `2` that can be formed from the list.***
 
- * The number is of the form is `0 1111 00000000`
- * The next hightest number is  `1 000000000 111`
- * To map `0 1111 00000000` to `1 000000000 000` we need to add
-   `1 00000000` (which is 2^#zeros)
- * To map `1 000000000 000` to `1 000000000 111` we need to add `111`
-   which is 2^(#ones -1) -1
+## The solution
 
-Eliza's solution was to obtain counts of `0`s and `1`s using a simple
-regex `/(1+)(0*)$/` which works - but is still a regular expression,
-which as we discussed above is a slow operation.
+For once challenge 2 is easier.
 
-We can replace this again with using `rindex`... Also rather than
-using `2**$n` we replace it with the much quicker bit-shift operator
-`1<<$n` - which achieves the same effect.
+To find the largest number we just sort the digits in descending order
+and stitch them together.
 
-This gives us:
+To find the largest even number we just sort the digits in descending
+order, but move the lowest even number to the end.
 
 ```perl
-sub next_bin_rindex2 {
-  my $t=rindex my$s=sprintf('%b',$_[0]),'1';
-  return $_[0] + (1<<(-1-$t+length$s))
-           - 1 + (1<<(-1+$t-rindex$s,'0',$t));
+sub biggest_even {
+  my $ptr = my @digits = reverse sort @{$_[0]};
+  while( $ptr-- ) {
+    next if $digits[$ptr] & 1; ## Skip if odd...
+    return join '',
+      @digits[ 0..$ptr-1, $ptr+1..$#digits, $ptr ];
+  }
+  return '';
 }
 ```
 
-A few notes:
-
- * here we use the three parameter version of `rindex`,
-   which allows you to specify an offset for the search to start (in this
-   case we want the last "`0`" before the last "`1`" so we use the position
-   of the "`1`" as the offset)
- * We use the bit-shift operator `<<` to raise to the power `2`
-   rather than the power operator.... If we break down all the efficiency
-   gains between the the rrev & rind2 methods - most of the gain would
-   be lost if we reverted back to `2**$n`.
- * We looked to see if unpack was more efficient than sprintf - but found
-   that this was not the case {about 20-40% slower}.
-
-## Summary
-
-Both the performance of `next_bin_regex` and `next_bin_rrev` appear
-to slow down only slightly as `$N` increases - comparabale with 
-"linear" scans for the last "`01`".
-
-Interestingly the `next_bin_rind2` seems to run at similar speeds for
-all ranges of `$N`.
-
-The naive `next_bin` - doesn't have that property - at all and it
-rapidly tails off performance wise.
-
-Running this a large number of times - we have the following
-approximate rates for the calculations....
-
-| Size of number | Rate rind2 | Rate rrev  | Rate regex | Rate naive |
-| -------------- | ---------: | ---------: | ---------: | ---------: |
-| 1-500          |  1,900,000 |  1,550,000 |    500,000 |    600,000 |
-| Approx 1000    |  1,800,000 |  1,500,000 |    440,000 |    400,000 |
-| Approx 1x10^6  |  1,800,000 |  1,350,000 |    390,000 |      4,000 |
-| Approx 1x10^9  |  1,850,000 |  1,250,000 |    330,000 |          1 |
-
-The calls do include the hardest example `2^n-1` for which the next
-number is `2^(n-1)` more - so much of the time in the naive loop is
-taken up by that example - in the 1x10^9 example this would require
-500_000_000 iterations of the increment/check loop.
-
-We see as we did a few weeks ago that if you don't actually need to
-use regexs then you can get an appreciable speed boost. Obviously
-remembering there is trade off between coding and running time.
+The while loop just looks for the smallest even number & moves it
+to the end using an array slice.
author	drbaggy <js5@sanger.ac.uk>	2021-06-06 22:16:11 +0100
committer	drbaggy <js5@sanger.ac.uk>	2021-06-06 22:16:11 +0100
commit	86f1bdedafd2a12e2354979a8d50d12d8d43ea8a (patch)
tree	4f3b667a52f1d44c0973da773f6af9babb1bb83a /challenge-115/james-smith
parent	7deeff0a33cc808985745bd3dce4a3aeb1b2d3bb (diff)
parent	200c5f7c9c0d4882d800a9a25b8f04f14b034fdf (diff)
download	perlweeklychallenge-club-86f1bdedafd2a12e2354979a8d50d12d8d43ea8a.tar.gz perlweeklychallenge-club-86f1bdedafd2a12e2354979a8d50d12d8d43ea8a.tar.bz2 perlweeklychallenge-club-86f1bdedafd2a12e2354979a8d50d12d8d43ea8a.zip