DEV Community

Bob Lied
Bob Lied

Posted on

PWC 350 Good Substring / Shuffle Pairs

Musical Interlude

The movie version of Wicked is in theaters right now, so I am reminded of the song For Good -- but I'm gonna link to the Broadway version, because I'm classy like that. It's relevant to programming in Perl because "I don't know if I've been changed for the better, but I have been changed for good." For part two, Lido Shuffle by Boz Scaggs.

Task 1: Good Substrings

The Task

You are given a string. Write a script to return the number of good substrings of length three in the given string. A string is good if there are no repeated characters.

  • Example 1: Input $str = "abcaefg", Output: 5
    • Good substrings of length 3: abc, bca, cae, aef and efg
  • Example 2: Input: $str = "xyzzabc", Output: 3
  • Example 3: Input: $str = "aababc", Output: 1
  • Example 4: Input: $str = "qwerty", Output: 4
  • Example 5: Input: $str = "zzzaaa", Output: 0

The Think-y Part

There's probably a regular expression for this, but I'm not going to find it. Do the simplest thing that works: take three characters at a time and see if they're different.

The Code-y Part

sub goodSubstring($str) { my $good = 0; my @s = split(//, $str); for ( 0 .. $#s - 2 ) { my ($first, $second, $third) = @s[$_, $_+1, $_+2]; $good++ if ( $first ne $second && $first ne $third && $second ne $third ); } return $good; } 
Enter fullscreen mode Exit fullscreen mode

Notes:

  • Start by turning the string into a list of characters. It could be done with substr, but that would be untidy.
  • @s[$_, $_+1, $_+2] -- With a nod to readability, I'll extract three consecutive characters with an array slice. It occurs to me that I'll always have two of the next three characters in hand at the bottom of the loop, so doing a complete splice every time could probably be optimized, but I declare it "good" enough.
  • Since there's exactly three characters in play, check for uniqueness in the most obvious way.

Task 2:Shuffle Pairs

The Task

If two integers A <= B have the same digits but in different orders, we say that they belong to the same shuffle pair if and only if there is an integer k such that A = B * k. k is called the witness of the pair. For example, 1359 and 9513 belong to the same shuffle pair, because 1359 * 7 = 9513.

Interestingly, some integers belong to several different shuffle pairs. For example, 123876 forms one shuffle pair with 371628, and another with 867132, as 123876 * 3 = 371628, and 123876 * 7 = 867132.

Write a function that for a given $from, $to, and $count returns the number of integers $i in the range $from <= $i <= $to that belong to at least $count different shuffle pairs.

  • Example 1:
    • Input: $from = 1, $to = 1000, $count = 1
    • Output: 0
    • There are no shuffle pairs with elements less than 1000.
  • Example 2:

    • Input: $from = 1500, $to = 2500, $count = 1
    • Output: 3
    • There are 3 integers between 1500 and 2500 that belong to shuffle pairs.
      • 1782, the other element is 7128 (witness 4)
      • 2178, the other element is 8712 (witness 4)
      • 2475, the other element is 7425 (witness 3)
  • Example 3:

    • Input: $from = 1_000_000, $to = 1_500_000, $count = 5
    • Output: 2
    • There are 2 integers in the given range that belong to 5 different shuffle pairs.
      • 1428570 pairs with 2857140, 4285710, 5714280, 7142850, and 8571420
      • 1429857 pairs with 2859714, 4289571, 5719428, 7149285, and 8579142
  • Example 4:

    • Input: $from = 13_427_000, $to = 14_100_000, $count = 2
    • Output: 11
    • 6 integers in the given range belong to 3 different shuffle pairs,
    • 5 integers belong to 2 different ones.
  • Example 5:

    • Input: $from = 1030, $to = 1130, $count = 1
    • Output: 2
    • There are 2 integers between 1020 and 1120 that belong to at least one shuffle pair:
      • 1035, the other element is 3105 (witness k = 3)
      • 1089, the other element is 9801 (witness k = 9)

Deliberations

It takes a minute to digest this one.

I first wondered if there's some algebraic number theory trick that would cut the search space way down, but that made my head hurt, so I moved to doing what computers do best: grinding through a lot of possibilities.

A bad first thought was to try all combinations of the digits, but that's going to die an excruciating slow death on the crucifix of combinatorics, not to mention that we'd be completely wasting our time on all but a few combinations.

A better thought is to look only at the multiples of a given number. There are at most 8 multiples of a number in play: the 10th would add a digit, and therefore can't possibly be a reordering. Examples 3 and 4 have a lot of numbers to grind through, but how long can it take, really? It's one banana, Michael; how much could it cost, ten dollars?

How will I decide that a number is a re-ordering? I think I'll reduce each number to a canonical form where the digits are sorted, then use string compare to see if a multiple has the same canonical form.

To the bat-editor, Robin!

First, a little function to turn a number into a canonical form with its digits in sorted order. Turn the number into a list of digits, sort, and then join the digits back into a string.

sub canonical($n) { join("", sort split(//, $n)); } 
Enter fullscreen mode Exit fullscreen mode

Now the main course. I'll want to examine every number in the range $from to $to, inclusive. For each number, I'll want to examine its multiples to see if they have the same digits. I need to count the ones that work so that I can check that there are at least $count of them.

sub shufflePair($from, $to, $count) { my $answer = 0; for my $n ( $from .. $to ) { my $base = canonical($n); my $max = (9 x length($n))+0; my $pair = 0; for ( 2 .. 9 ) { my $multiple = $n * $_; next if $multiple > $max || index($base, substr($multiple, 0, 1)) < 0 || index($base, substr($multiple, -1, 1)) < 0; if ( canonical($multiple) eq $base ) { $pair++; } } $answer++ if $pair >= $count; } return $answer; } 
Enter fullscreen mode Exit fullscreen mode

Notes:

  • my $base = canonical($n) -- hang on to this for comparison.
  • my $max = (9 x length($n))+0; -- An optimization. The maximum number we need to be concerned with is one that has the same number of digits, but is all 9s. For example, if $n is 480, then we are dealing with 3-digit numbers, so the largest possible is 999. That's less than 480*3=1440, so we don't have to examine any of the multiples beyond 480*2.
  • for ( 2..9 ) -- These are the only multiples of $n that could possibly have the same number of digits.
  • next if ... -- Besides the check on $max, we can make cheap checks on a single digit. If the first or last digit isn't one of the possible digits, we can avoid the overhead of canonical(), which isn't horrendous, but it does involve allocating lists and a sort.
  • canonical($multiple) eq $base -- This string compare is where we decide if we have a shuffle pair.
  • $answer++ if $pair >= $count -- We increment the answer if this number has at least $count shuffle pairs.

This solution takes a few seconds to run the examples. My optimizations to bail early in many cases cut the run time approximately in half (from about 8 seconds to about 4.5).

Top comments (0)