aboutsummaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authordrbaggy <js5@sanger.ac.uk>2021-05-30 10:03:28 +0100
committerdrbaggy <js5@sanger.ac.uk>2021-05-30 10:03:28 +0100
commit1e3ad98df9e85924cd5f60910a189be17bd042c5 (patch)
tree6a98615aa7a0a6b2b481d36041903263dac27fdd
parent973c096b1d2b6e063457a2fe37b795749e9071cb (diff)
parent41abc68a5a99e57cb8bf1cca501b09c6aa909be0 (diff)
downloadperlweeklychallenge-club-1e3ad98df9e85924cd5f60910a189be17bd042c5.tar.gz
perlweeklychallenge-club-1e3ad98df9e85924cd5f60910a189be17bd042c5.tar.bz2
perlweeklychallenge-club-1e3ad98df9e85924cd5f60910a189be17bd042c5.zip
Merge branch 'master' of github.com:drbaggy/perlweeklychallenge-club
-rw-r--r--challenge-114/james-smith/README.md62
1 files changed, 54 insertions, 8 deletions
diff --git a/challenge-114/james-smith/README.md b/challenge-114/james-smith/README.md
index 9bf832ce51..ebff4e20e8 100644
--- a/challenge-114/james-smith/README.md
+++ b/challenge-114/james-smith/README.md
@@ -161,22 +161,68 @@ sub next_bin_rrev {
depending on whether or not you use a regular expression to find
the last "`01`" in the binary representaiton.
+## The solution - with go faster stripes...
+
+After a discussion on facebook with Eliza Skr, about whether or not
+to use regexs rather than `rindex` she supplied a different algorithm
+for finding the next number - which didn't involve manipulating the
+binary string but by working out the arithmetic to make the changes.
+
+ * The number is of the form is `0 1111 00000000`
+ * The next hightest number is `1 000000000 111`
+ * To map `0 1111 00000000` to `1 000000000 000` we need to add
+ `1 00000000` (which is 2^#zeros)
+ * To map `1 000000000 000` to `1 000000000 111` we need to add `111`
+ which is 2^(#ones -1) -1
+
+Eliza's solution was to obtain counts of `0`s and `1`s using a regex
+`/(1+)(0*)$/` which works - but is still a regex... so we can replace
+this again with using `rindex`... Also rather than using `2**$n` this
+can be replaced by the much quicker `1<<$n` which does the same thing
+but is much more efficient.
+
+This gives us:
+
+```perl
+sub next_bin_rindex2 {
+ my $t = rindex my $s = sprintf('%b',my $n = shift),'1';
+ return $n - 1 + (1<<(-1-$t+length$s))
+ + (1<<(-1+$t-rindex $s,'0',$t));
+}
+```
+
+A few notes:
+
+ * here we use the three parameter version of `rindex`,
+ which allows you to specify an offset for the search to start (in this
+ case we want the last "`0`" before the last "`1`" so we use the position
+ of the "`1`" as the offset)
+ * I repeat we use the bit-shift operator `<<` to raise to the power `2`
+ rather than the power operator.... This is very much more efficient
+ - infact all the efficiency gain between the 2 `rindex` solutions is
+ due to using this over '`**`'...
+ * I looked to see if unpack was more efficient than sprintf - but found
+ that this was not the case {about 20-40% slower}.
+
## Summary
Both the performance of `next_bin_regex` and `next_bin_rrev` appear
to slow down only slightly as `$N` increases - comparabale with
-"linear" scans for the last "`01`". Whereas the `next_bin` naive
-method has no such property.
+"linear" scans for the last "`01`". Interestingly the rind2 seems to
+run at similar speeds for all ranges of `$N`.
+
+The naive `next_bin` - doesn't have that property - at all and it
+rapidly tails off performance wise.
Running this a large number of times - we have the following
approximate rates for the calculations....
-| Size of number | Rate rrev | Rate regex | Rate naive |
-| -------------- | ---------: | ---------: | ---------: |
-| 1-500 | 1,600,000 | 500,000 | 600,000 |
-| Approx 1000 | 1,550,000 | 440,000 | 400,000 |
-| Approx 1x10^6 | 1,500,000 | 390,000 | 4,000 |
-| Approx 1x10^9 | 1,450,000 | 330,000 | 1 |
+| Size of number | Rate rind2 | Rate rrev | Rate regex | Rate naive |
+| -------------- | ---------: | ---------: | ---------: | ---------: |
+| 1-500 | 1,850,000 | 1,550,000 | 500,000 | 600,000 |
+| Approx 1000 | 1,750,000 | 1,500,000 | 440,000 | 400,000 |
+| Approx 1x10^6 | 1,700,000 | 1,350,000 | 390,000 | 4,000 |
+| Approx 1x10^9 | 1,800,000 | 1,250,000 | 330,000 | 1 |
The calls do include the hardest example `2^n-1` for which the next
number is `2^(n-1)` more - so much of the time in the naive loop is