Special and frequent numbers

Weekly challenge 195 — 12 December 2022

Week 195 - 12 Dec 2022

Task 1

You are given a positive integer, $n > 0. Write a script to print the count of all special integers between 1 and $n. An integer is special when all of its digits are unique.

Example 1: Input: $n = 15 Output: 14 as except 11 all other integers between 1 and 15 are special.Example 2: Input: $n = 35 Output: 32 as except 11, 22, 33 all others are special.

Let's start by thinking of a clever way to quickly identify special numbers in a given range. If you've come up with that - well done! After trying a number of leads I came to the conclusion that none was much better - and most were more complicated - than simply testing all the numbers in the range.

How do you test a number for being special? This was my initial thought:

$is_special = 1; for $digit (0..9) { if ($j =~ m|$digit.*$digit|) { $is_special = 0; last; } }

The regular expression in there says if there is a digit, followed by anything or nothing, followed by the
same digit again, then `$j`

is not special. So we can wrap that in a loop of `$j`

going from 1 to `$n`

and count the specials.

And that does work. But if `$n`

is large, it takes a while. On my (quite slow) computer
using that method (method 1), if `$n`

is 1 million,
it takes 55 seconds, and for 5 million, 223 seconds. So I applied a couple of techniques to speed it up.

The first is to note that as we go from 1 to `$n`

we will hit some numbers with trailing zeroes, such as 10, 200,
55000 and so on. Let's split these numbers into the bit before the zeroes (call it A), and the zeroes themselves (B).

So here is my observation: if A isn't special then we can quickly jump to the number that starts with A + 1 and follows that with B. So for example when checking 55000, we can skip 55001 to 55999 (because they all start with that unspecial 55) and continue at 56000. So my loop now looks like this:

$j = 0; while (1) { $j ++; last if $j > $test; if ($j =~ m|(.+?)(0+)$|) { $j = ($1 + 1) * 10 ** length($2) - 1 if unspecial($1); } $count ++ unless unspecial($j); }

So for my 55000 example, the first `if`

splits that into `$1 = 55`

and `$2 = 000`

, and the next lines jumps `$j`

to
56 * 10^3 - ie 56000 (actually to 55999, because the loop increments `$j`

at the top.) You'll note too that I have
taken the test for specialness out to a subroutine, but it's the same method as I showed as my first code example above.
Doing that roughly halves the time: for 1 million it now takes 24 seconds. But that's still quite a long time to
wait. So for my next trick I unfolded `sub unspecial`

into 10 lines such as:

return 1 if $_[0] =~ m|1.*1|;

Now you might think 'why do that?' Well, my first answer is that for `$n`

= 5 million, this
(method 2) now takes just 7
seconds as opposed to 223 seconds for method 1, so clearly it's worth it.

But why that's so much more efficient lies in the guts of Perl, about which I know very little. I am assuming that
there is a lot of overhead in working with a regular expression containing a variable like `m|$digit.*$digit|`

because the regular expression
has to be reconsidered 5 million times at run time, whereas in the unfolded version it can be done just once at
compile time.

So all in all, a very interesting task.

#!/usr/bin/perl # Peter Campbell Smith - 2022-12-12 # PWC 195 task 1 use v5.28; use utf8; use warnings; my (@tests, $test, $count, $j, $digit, $start); @tests = (15, 34, 77, 123, 1234, 12345, 1000000, 5000000); # loop over tests say qq[\n--- METHOD 1 ---]; for $test (@tests) { # loop over 1 to $n $start = time; $count = 0; TEST: for $j (1 .. $test) { # check for 2 of each digit for $digit (0..9) { # don't count if digit repeated next TEST if $j =~ m|$digit.*$digit|; } # is good $count ++; } say qq[\nInput: $test\nOutput: $count (] . (time - $start) . qq[ secs)]; } # loop over tests say qq[\n--- METHOD 2 ---]; for $test (@tests) { # loop over 1 to $n $start = time; $count = 0; $j = 0; while (1) { $j ++; last if $j > $test; if ($j =~ m|(.+?)(0+)$|) { $j = ($1 + 1) * 10 ** length($2) - 1 if unspecial($1); } $count ++ unless unspecial($j); } say qq[\nInput: $test\nOutput: $count (] . (time - $start) . qq[ secs)]; } sub unspecial { # returns 0 if special, 1 if not return 1 if $_[0] =~ m|1.*1|; return 1 if $_[0] =~ m|2.*2|; return 1 if $_[0] =~ m|3.*3|; return 1 if $_[0] =~ m|4.*4|; return 1 if $_[0] =~ m|5.*5|; return 1 if $_[0] =~ m|6.*6|; return 1 if $_[0] =~ m|7.*7|; return 1 if $_[0] =~ m|8.*8|; return 1 if $_[0] =~ m|9.*9|; return 1 if $_[0] =~ m|0.*0|; return 0; }

--- METHOD 1 --- Input: 15 Output: 14 (0 secs) Input: 34 Output: 31 (0 secs) Input: 77 Output: 70 (0 secs) Input: 123 Output: 100 (0 secs) Input: 1234 Output: 803 (0 secs) Input: 12345 Output: 5660 (1 secs) Input: 1000000 Output: 168570 (55 secs) Input: 5000000 Output: 410490 (223 secs) --- METHOD 2 --- Input: 15 Output: 14 (0 secs) Input: 34 Output: 31 (0 secs) Input: 77 Output: 70 (0 secs) Input: 123 Output: 100 (0 secs) Input: 1234 Output: 803 (0 secs) Input: 12345 Output: 5660 (0 secs) Input: 1000000 Output: 168570 (2 secs) Input: 5000000 Output: 410490 (7 secs)

Note: the running times above are on my slow development computer and are not comparable with those generated by 'Try it', which run on my external webserver.

The content of this website which has been created by

Peter Campbell Smith is hereby placed in the public domain

Peter Campbell Smith is hereby placed in the public domain