Merge pull request #4161 from drbaggy/master

Binary addition code version...
author: Mohammad S Anwar <Mohammad.Anwar@yahoo.com> 2021-05-30 20:54:56 +0100
committer: GitHub <noreply@github.com> 2021-05-30 20:54:56 +0100
commit: 4cfa58ea3629e9a4d11cb07a6a7ceecd624f603e (patch)
tree: 3292398a82f00c76d4d52c61990fb23060377088
parent: b4f2c135093c3d380c25c426b66b54e1ec908f32 (diff)
parent: b35ab7938cf2b04cb6b13cb934d1672fcde14e52 (diff)
download: perlweeklychallenge-club-4cfa58ea3629e9a4d11cb07a6a7ceecd624f603e.tar.gz
perlweeklychallenge-club-4cfa58ea3629e9a4d11cb07a6a7ceecd624f603e.tar.bz2
perlweeklychallenge-club-4cfa58ea3629e9a4d11cb07a6a7ceecd624f603e.zip
2 files changed, 120 insertions, 16 deletions
diff --git a/challenge-114/james-smith/README.md b/challenge-114/james-smith/README.md
index 9bf832ce51..cbfc55a4df 100644
--- a/challenge-114/james-smith/README.md
+++ b/challenge-114/james-smith/README.md
@@ -161,22 +161,68 @@ sub next_bin_rrev {
 depending on whether or not you use a regular expression to find
 the last "`01`" in the binary representaiton.
 
+## The solution - with go faster stripes...
+
+After a discussion on facebook with Eliza Skr, about whether or not
+to use regexs rather than `rindex` she supplied a different algorithm
+for finding the next number - which didn't involve manipulating the
+binary string but by working out the arithmetic to make the changes.
+
+ * The number is of the form is `0 1111 00000000`
+ * The next hightest number is  `1 000000000 111`
+ * To map `0 1111 00000000` to `1 000000000 000` we need to add
+   `1 00000000` (which is 2^#zeros)
+ * To map `1 000000000 000` to `1 000000000 111` we need to add `111`
+   which is 2^(#ones -1) -1
+
+Eliza's solution was to obtain counts of `0`s and `1`s using a simple
+regex `/(1+)(0*)$/` which works - but is still a regex... so we can
+replace this again with using `rindex`... Also rather than using 
+`2**$n` this can be replaced by the much quicker `1<<$n` which does
+the same thing but is much more efficient.
+
+This gives us:
+
+```perl
+sub next_bin_rindex2 {
+  my $t=rindex my$s=sprintf('%b',$_[0]),'1';
+  return $_[0] + (1<<(-1-$t+length$s))
+           - 1 + (1<<(-1+$t-rindex$s,'0',$t));
+}
+```
+
+A few notes:
+
+ * here we use the three parameter version of `rindex`,
+   which allows you to specify an offset for the search to start (in this
+   case we want the last "`0`" before the last "`1`" so we use the position
+   of the "`1`" as the offset)
+ * I repeat we use the bit-shift operator `<<` to raise to the power `2`
+   rather than the power operator.... This is very much more efficient
+   - infact all the efficiency gain between the 2 `rindex` solutions is
+   due to using this over '`**`'...
+ * I looked to see if unpack was more efficient than sprintf - but found
+   that this was not the case {about 20-40% slower}.
+
 ## Summary
 
 Both the performance of `next_bin_regex` and `next_bin_rrev` appear
 to slow down only slightly as `$N` increases - comparabale with 
-"linear" scans for the last "`01`". Whereas the `next_bin` naive
-method has no such property.
+"linear" scans for the last "`01`". Interestingly the rind2 seems to
+run at similar speeds for all ranges of `$N`.
+
+The naive `next_bin` - doesn't have that property - at all and it
+rapidly tails off performance wise.
 
 Running this a large number of times - we have the following
 approximate rates for the calculations....
 
-| Size of number | Rate rrev  | Rate regex | Rate naive |
-| -------------- | ---------: | ---------: | ---------: |
-| 1-500          |  1,600,000 |    500,000 |    600,000 |
-| Approx 1000    |  1,550,000 |    440,000 |    400,000 |
-| Approx 1x10^6  |  1,500,000 |    390,000 |      4,000 |
-| Approx 1x10^9  |  1,450,000 |    330,000 |          1 |
+| Size of number | Rate rind2 | Rate rrev  | Rate regex | Rate naive |
+| -------------- | ---------: | ---------: | ---------: | ---------: |
+| 1-500          |  1,900,000 |  1,550,000 |    500,000 |    600,000 |
+| Approx 1000    |  1,800,000 |  1,500,000 |    440,000 |    400,000 |
+| Approx 1x10^6  |  1,800,000 |  1,350,000 |    390,000 |      4,000 |
+| Approx 1x10^9  |  1,850,000 |  1,250,000 |    330,000 |          1 |
 
 The calls do include the hardest example `2^n-1` for which the next
 number is `2^(n-1)` more - so much of the time in the naive loop is
diff --git a/challenge-114/james-smith/perl/ch-2.pl b/challenge-114/james-smith/perl/ch-2.pl
index 773f79b0cb..43cd954ab4 100644
--- a/challenge-114/james-smith/perl/ch-2.pl
+++ b/challenge-114/james-smith/perl/ch-2.pl
@@ -10,25 +10,31 @@ use Benchmark qw(cmpthese);
 my @sols = (
 [7,11],[11,13],[13,14],[14,19],[19,21],[21,22],[22,25],[25,26],[26,28],[28,35],[35,37],[37,38],[38,41],[41,42],[42,44],[44,49],[49,50],[50,52],[52,56],[56,67],[67,69],[69,70],[70,73],[73,74],[74,76],[76,81],[81,82],[82,84],[84,88],[88,97],[97,98],[98,100],[100,104],[104,112],[112,131],[131,133],[133,134],[134,137],[137,138],[138,140],[140,145],[145,146],[146,148],[148,152],[152,161],[161,162],[162,164],[164,168],[168,176],[176,193],[193,194],[194,196],[196,200],[200,208],[208,224],[224,259],[259,261],[261,262],[262,265],[265,266],[266,268],[268,273],[273,274],[274,276],[276,280],[280,289],[289,290],[290,292],[292,296],[296,304],[304,321],[321,322],[322,324],[324,328],[328,336],[336,352],[352,385],[385,386],[386,388],[388,392],[392,400],[400,416],[416,448],[255,383],[383,447],[447,479],[479,495],[495,503],[503,507],[507,509],[509,510],[3,5],[5,6],[6,9],[9,10],[10,12],[12,17],[17,18],[18,20],[20,24],[24,33],[33,34],[34,36],[36,40],[40,48],[48,65],[65,66],[66,68],[68,72],[72,80],[80,96],[96,129],[129,130],[130,132],[132,136],[136,144],[144,160],[160,192],[192,257],[257,258],[258,260],[260,264],[264,272],[272,288],[288,320],[320,384],[15,23],[23,27],[27,29],[29,30],[30,39],[39,43],[43,45],[45,46],[46,51],[51,53],[53,54],[54,57],[57,58],[58,60],[60,71],[71,75],[75,77],[77,78],[78,83],[83,85],[85,86],[86,89],[89,90],[90,92],[92,99],[99,101],[101,102],[102,105],[105,106],[106,108],[108,113],[113,114],[114,116],[116,120],[120,135],[135,139],[139,141],[141,142],[142,147],[147,149],[149,150],[150,153],[153,154],[154,156],[156,163],[163,165],[165,166],[166,169],[169,170],[170,172],[172,177],[177,178],[178,180],[180,184],[184,195],[195,197],[197,198],[198,201],[201,202],[202,204],[204,209],[209,210],[210,212],[212,216],[216,225],[225,226],[226,228],[228,232],[232,240],[240,263],[263,267],[267,269],[269,270],[270,275],[275,277],[277,278],[278,281],[281,282],[282,284],[284,291],[291,293],[293,294],[294,297],[297,298],[298,300],[300,305],[305,306],[306,308],[308,312],[312,323],[323,325],[325,326],[326,329],[329,330],[330,332],[332,337],[337,338],[338,340],[340,344],[344,353],[353,354],[354,356],[356,360],[360,368],[368,387],[387,389],[389,390],[390,393],[393,394],[394,396],[396,401],[401,402],[402,404],[404,408],[408,417],[417,418],[418,420],[420,424],[424,432],[432,449],[449,450],[450,452],[452,456],[456,464],[464,480],[127,191],[191,223],[223,239],[239,247],[247,251],[251,253],[253,254],[254,319],[319,351],[351,367],[367,375],[375,379],[379,381],[381,382],[382,415],[415,431],[431,439],[439,443],[443,445],[445,446],[446,463],[463,471],[471,475],[475,477],[477,478],[478,487],[487,491],[491,493],[493,494],[494,499],[499,501],[501,502],[502,505],[505,506],[506,508],[31,47],[47,55],[55,59],[59,61],[61,62],[62,79],[79,87],[87,91],[91,93],[93,94],[94,103],[103,107],[107,109],[109,110],[110,115],[115,117],[117,118],[118,121],[121,122],[122,124],[124,143],[143,151],[151,155],[155,157],[157,158],[158,167],[167,171],[171,173],[173,174],[174,179],[179,181],[181,182],[182,185],[185,186],[186,188],[188,199],[199,203],[203,205],[205,206],[206,211],[211,213],[213,214],[214,217],[217,218],[218,220],[220,227],[227,229],[229,230],[230,233],[233,234],[234,236],[236,241],[241,242],[242,244],[244,248],[248,271],[271,279],[279,283],[283,285],[285,286],[286,295],[295,299],[299,301],[301,302],[302,307],[307,309],[309,310],[310,313],[313,314],[314,316],[316,327],[327,331],[331,333],[333,334],[334,339],[339,341],[341,342],[342,345],[345,346],[346,348],[348,355],[355,357],[357,358],[358,361],[361,362],[362,364],[364,369],[369,370],[370,372],[372,376],[376,391],[391,395],[395,397],[397,398],[398,403],[403,405],[405,406],[406,409],[409,410],[410,412],[412,419],[419,421],[421,422],[422,425],[425,426],[426,428],[428,433],[433,434],[434,436],[436,440],[440,451],[451,453],[453,454],[454,457],[457,458],[458,460],[460,465],[465,466],[466,468],[468,472],[472,481],[481,482],[482,484],[484,488],[488,496],[1,2],[2,4],[4,8],[8,16],[16,32],[32,64],[64,128],[128,256],[63,95],[95,111],[111,119],[119,123],[123,125],[125,126],[126,159],[159,175],[175,183],[183,187],[187,189],[189,190],[190,207],[207,215],[215,219],[219,221],[221,222],[222,231],[231,235],[235,237],[237,238],[238,243],[243,245],[245,246],[246,249],[249,250],[250,252],[252,287],[287,303],[303,311],[311,315],[315,317],[317,318],[318,335],[335,343],[343,347],[347,349],[349,350],[350,359],[359,363],[363,365],[365,366],[366,371],[371,373],[373,374],[374,377],[377,378],[378,380],[380,399],[399,407],[407,411],[411,413],[413,414],[414,423],[423,427],[427,429],[429,430],[430,435],[435,437],[437,438],[438,441],[441,442],[442,444],[444,455],[455,459],[459,461],[461,462],[462,467],[467,469],[469,470],[470,473],[473,474],[474,476],[476,483],[483,485],[485,486],[486,489],[489,490],[490,492],[492,497],[497,498],[498,500],[500,504]);
 
+#is( next_bin_rindex2ns( $_->[0]), $_->[1] ) foreach @sols;exit;
+#is( next_bin_rindex2( $_->[0]), $_->[1] ) foreach @sols;exit;
 #is( next_bin_rex( $_->[0]), $_->[1] ) foreach @sols;exit;
 #is( next_bin_rrev($_->[0]), $_->[1] ) foreach @sols;exit;
 #is( next_bin(     $_->[0]), $_->[1] ) foreach @sols;exit;
 #done_testing();
 
 my @ranges = (
-  [ 1000, 1,                       500 ],
-  [ 500,  500,                    2500 ],
-  [ 100,  1_047_576,         1_049_576 ],
-  [ 50,   1_073_740_824, 1_073_742_824 ],
+  [ 20000, 1..500,1..500,1..500,1..500 ],
+  [ 20000,  500..2499 ],
+  [ 20000,  1_047_576..1_049_575 ],
+  [ 20000,  1_073_740_824..1_073_742_823 ],
 );
 
 foreach my $r (@ranges) {
-  cmpthese( 10*$r->[0], {
-    'rind' => sub { next_bin_rrev( $_ ) foreach $r->[1] .. $r->[2] },
-    'rex'  => sub { next_bin_rex( $_ ) foreach $r->[1] .. $r->[2] },
-#    'simp' => sub { next_bin(      $_ ) foreach $r->[1] .. $r->[2] },
+  my($c,@n) = @{$r};
+  cmpthese( $c, {
+    'rind2x'=> sub { next_bin_rindex2ns($_) foreach @n },
+    'rind2' => sub { next_bin_rindex2( $_ ) foreach @n },
+    'rind'  => sub { next_bin_rrev( $_    ) foreach @n },
+#    'rex'   => sub { next_bin_rex( $_     ) foreach @n },
+#   'simp'  => sub { next_bin($_          ) foreach @n },
   });
 }
+
 sub next_bin {
   my $n = shift;
   my $c = (sprintf '%b', $n) =~ tr/1/1/;
@@ -68,3 +74,55 @@ sub next_bin_rrev {
   return oct '0b'.substr($s,0,$t).'10'.reverse substr $s,$t+2;
 }
 
+## We can get further optimization by avoiding the call to "oct" by
+## converting the above to simple arithmetic...
+## To move all the 1s to the end and flip the '01' to '10'
+## We note that if the number is of the form..
+##
+##    0 1111 00000000
+##
+## The next highest number (with the same number of bits is:
+##
+##    1 000000000 111
+##
+## To get this we first do:
+##
+##    0 1111 00000000
+##  + 0 0001 00000000
+##    ---------------
+##    1 0000 00000000
+##
+## and then:
+##
+##    1 0000 00000000
+##  + 0 0000 00000111
+##    ---------------
+##    1 0000 00000111
+##
+## which is the answer we are looking for....
+##
+## This is basically:
+##
+##    $N + 2^(#0s) +2^(#1s-1) - 1;
+##
+## We can get these with rindex
+##
+## #0s = length of the binary string - (last index of 1 + 1)
+## #1s = last index of 1 - last index of 0 (before this 1)
+##
+## This all leads into the following fn.
+##
+## Note we don't use 2**$N but 1<<$N which is much more efficient
+## Investigated unpack vs sprintf to do the dec->binary conversion
+## the latter is faster by about 20%...
+
+sub next_bin_rindex2ns {
+  my $t = rindex my $s = sprintf('%b',$_[0]),'1';
+  return $_[0] - 1 + (1<<(-1-$t+length$s)) + (1<<(-1+$t-rindex $s,'0',$t));
+}
+
+sub next_bin_rindex2 {
+  my $t = rindex my $s = sprintf('%b',$_[0]), '1';
+  return $_[0] + (1<<(-$t-1+length$s)) - 1 + (1<<($t-1-rindex$s,'0',$t));
+}
+
author	Mohammad S Anwar <Mohammad.Anwar@yahoo.com>	2021-05-30 20:54:56 +0100
committer	GitHub <noreply@github.com>	2021-05-30 20:54:56 +0100
commit	4cfa58ea3629e9a4d11cb07a6a7ceecd624f603e (patch)
tree	3292398a82f00c76d4d52c61990fb23060377088
parent	b4f2c135093c3d380c25c426b66b54e1ec908f32 (diff)
parent	b35ab7938cf2b04cb6b13cb934d1672fcde14e52 (diff)
download	perlweeklychallenge-club-4cfa58ea3629e9a4d11cb07a6a7ceecd624f603e.tar.gz perlweeklychallenge-club-4cfa58ea3629e9a4d11cb07a6a7ceecd624f603e.tar.bz2 perlweeklychallenge-club-4cfa58ea3629e9a4d11cb07a6a7ceecd624f603e.zip