aboutsummaryrefslogtreecommitdiff
path: root/challenge-198/james-smith/README.md
blob: 0d51fd73b54cd9527d4e7a1b874cd074ed54d6e9 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
[< Previous 197](https://github.com/drbaggy/perlweeklychallenge-club/tree/master/challenge-197/james-smith) |
[Next 199 >](https://github.com/drbaggy/perlweeklychallenge-club/tree/master/challenge-199/james-smith)

# The Weekly Challenge 198

You can find more information about this weeks, and previous weeks challenges at:

  https://theweeklychallenge.org/

If you are not already doing the challenge - it is a good place to practise your
**perl** or **raku**. If it is not **perl** or **raku** you develop in - you can
submit solutions in whichever language you feel comfortable with.

You can find the solutions here on github at:

https://github.com/drbaggy/perlweeklychallenge-club/tree/master/challenge-198/james-smith

# Task 1 - Max Gap

***You are given a list of integers, `@list`. Write a script to find the total pairs in the sorted list where 2 consecutive elements has the max gap. If the list contains less then 2 elements then return 0.***

## Solution

I will present two solutions to this problem. They both start the same way - by first sorting the numbers.

### Method 1 - `max_gap_sort`

This computes the difference between each pair, and then sorts these into order (reverse)
and counts the number of entries who have the same **max** value as the first element of the sorted list...

```perl
sub max_gap_sort {
  return 0 unless $#_;
  @_ = sort { $a <=> $b } @_;
  my $p = shift;
  @_ = sort { $b <=> $a } map { ($_-$p,$p=$_)[0] } @_;
  $_[0]==$_[$_] || return $_ for 1..$#_;
  1*@_
}
```

### Method 2 - `max_gap_nosort`

This computes the difference of each pair and if it is the highest value updates a counter...

**Notes:** If the new value of the difference is greater than the current best value - we update the current best value and
reset the count to 1. We do this in the for loop with a ternary `?:` and then using `&&` to replace another if or `?:` with
an empty second value. Finally we update `$p`. We can separate variables etc within the for loop by replacing `;` for `,` in
a lot of cases...

```perl
sub max_gap_nosort {
  return 0 unless $#_;
  @_ = sort { $a <=> $b } @_;
  my($p,$b,$c)=(shift,0,0);
  $_-$p>$b ? ($b,$c)=($_-$p,1) : $_-$p==$b && $c++, $p=$_ for @_;
  $c
}
```

### Bonus Method 3 - `max_gap_nosort_faster`

One issue with method 2 is that it has to store the sorted list back into @_ AND then loop through it - as we have to get the first value of `$p`...
Now - what if we didn't need to! This would save that store....

Well we can avoid that (in a way) but it is hacky... We set `$p` the be one more than first value in the input! This means that the first gap is always going to be negative so it will only occur once, which as we we aren't interested in the "size" of the gap only the number with the biggest gap we can will always get the right answer...

```perl
sub max_gap_nosort_faster {
  return 0 unless $#_;
  my($p,$t,$c) = ($_[0]+1,0,0);
  $_-$p>$t ? ($t,$c)=($_-$p,1) : $_-$p==$t && $c++, $p=$_ for sort { $a<=>$b } @_;
  $c;
}
```

### Performance

We can see the relative performance of each method {the timing is based on various lists including a list of 1,000,000 numbers}

|               |    Rate  | sort | slide | nosort | nosort_faster |
| :------------ | -------: | ---: | ----: | -----: | ------------: |
| sort          |  0.772/s |   -- |  -31% |   -52% |          -56% |
| slide         |   1.11/s |  44% |    -- |   -30% |          -37% |
| nosort        |   1.59/s | 106% |   43% |     -- |           -9% |
| nosort_faster |   1.75/s | 127% |   58% |    10% |            -- |

The gain between `nosort` & `nosort_faster` is not "visible" in cases wheren the length of list is up to around `100,000` the overhead of the "hack" vs the overhead of storing the intermediate array....

Using List::MoreUtils `slide` has a roughly 30% hit on performance - this again is the effect of having to store relatively large intermediate arrays.

# Task 2 - Prime Count

***You are given an integer `$n > 0`. Write a script to print the count of primes less than `$n`.***

## Solution

Now this is going back sometime since we have a had a prime generator question. We could use `Math::Prime::Util` to get the number of primes with `prime_count` but that sort of defeats the issue here...

We will look at a couple of ways of doing this - the first which is "cute" but not particularly efficient, and then another which is very efficient {obv no where near the performance of the `Math::Prime::Util` version.

### Method 1 - "compact"

First we return `0` if `$n` is less than 3 {there are no primes less than 2}. We then initialize the array with the entry 2, and for each number - we look to see it it's prime... Our prime check is in the line:
```perl
  //,(grep{($'%$_)||next}@_),push@_,$_
```
which is quite compact:
  * `//,` - match nothing - `$'` the post-match string is set to the string i.e `$_` from the outer loop.
  * `(grep{$'%$_||next}@_)` - loop through the array of primes, if `$'%$_` is false we check the right-hand side of the or `||` and evaluate it - which skips out of the map and goes to the next number in the outer `for` loop....
  * `,push@_,$_` if we don't hit the `next` we get to this point - the number is prime so push it onto the list....

The full code is:

```perl
sub n_primes_compact {
  return 0if(my$n=shift)<3;
  @_=2;
  //,(grep{$'%$_||next}@_),push@_,$_ for 3..$n-1;
  1*@_
}
```
If we switch `shift` for `pop` & `grep` for `map`.. This comes down to just 85 bytes (if we us a 1 letter function name)
```perl
sub p{return 0if(my$n=pop)<3;@_=2;//,(map{$'%$_||next}@_),push@_,$_ for 3..$n-1;1*@_}
```

### Method 2 - faster!

The previous method is small - but not particularly fast.. There are a few things we can do to speed it up!

 * We only need to try odd numbers (even numbers will not be included)
 * the largest element of the prime array we need to check is the square root of the number we are testing!
   * square roots are expensive!
   * we keep a track of the smallest integer which is greater than or equal the square root of the number. Note we don't ever have to work out the square root. We can just increment the value `$q` by 1 each time `$i` is greater than it's square... the `last` in the tight `last`/`next` line....
 * a number isn't prime if it has a prime factor - this is the `$i%$_?...:next O` which skips out of the inner `for` and jumps to the start of the next outer `for`.
 * we don't include `2` in the array as we only check odd-numbers, but add `1` at the end to include it in the count... this gains us about `2%` over keeping it in the array.
 
This give us:

```perl
sub n_primes_fast {
  return 0 if (my $n=shift) <3;
  my($q,@p)=2;
  O: for( my $i=3; $i<$n; $i+=2 ) {
    $q++ if $i>$q*$q;
    $i%$_?($_>$q&&last):next O for @p;
    push @p, $i;
  }
  1+@p
}
```

There is a more compact version of the code:

```perl
sub n_primes_fastx {
  return 0 if (my $n=shift) <3;
  O: for( my($i,$q)=(3,2); $i<$n; $i+=2,($i>$q*$q)&&$q++ ) {
    $i%$_?($_>$q&&last):next O for @_;
    push @_, $i;
  }
  1+@_
}
```
which makes use of the the initalization and increment parts of the class **C-style** `for` to combine lines, and re-using `@_` as after the shift it will be empty...

Times are comparable between `fast` and `fastx` - although I think `fast` is slightly faster than the more compact version.... (and is definitely easier to read)

### Performance

We can see the relative performance of each method {the largest test value we use is `$n=100,000`} so timings are based on that! (factor is approx 180:1) if we increase to `$n=1,000,000` the factor is approximately 630:1


|               |    Rate  | compact |  fast |  
| :------------ | -------: | ------: | ----: |
| compact       |  0.110/s |      -- |  -99% |
| fast          | 18.800/s | 16 979% |    -- |