Camel
Peter
Peter Campbell Smith

Questions are good

Weekly challenge 328 — 30 June 2025

Week 328: 30 Jun 2025

Task 2

Task — Good string

You are given a string made up of lower and upper case English letters only. Write a script to return the good string of the given string. A string is called good string if it doesn’t have two adjacent same characters, one in upper case and the other in lower case.

Examples


Example 1
Input: $str = 'WeEeekly'
Output: 'Weekly'
We can remove either, 'eE' or 'Ee' to make it good.

Example 2
Input: $str = 'abBAdD'
Output: ''
We remove 'bB' first: 'aAdD'
Then we remove 'aA': 'dD'
Finally remove 'dD'.

Example 3
Input: $str = 'abc'
Output: 'abc'

Analysis

An interesting and at first sight tricky challenge to solve efficiently. In particular, is there any way to solve it without a repetitive search? By a repetitive search I mean something like this:

  • Search for the first bad pair (Xx or xX)
  • Delete that from the string
  • Repeat until the search fails

That works, but the search has to start each time from the beginning of the string in order to catch cases like this: ABCDEedcba, where we will not identify the Aa pair until we have successively eliminated Ee, Dd, Cc and Bb.

It is also a little awkward in that I don't know of any regex element that matches an upper case any letter followed by the lower case of the same letter (or vice versa).

So here is my algorithm, which is not quite single pass, but which will be much faster for a string that, say, has a million good characters followed by 'Aa'.

The method involves creating a partition $p, where $string[0] to $string[$p] is good, ie contains no bad pairs. $p is initially zero, and clearly the one-character string $p[0] to $p[0] cannot contain a bad pair.

We then enter a loop which goes like this:

  • Is $string[p] . $string[p + 1] a bad pair?
  • If yes, delete $string[$p] and $string[p + 1] and decrement $p
  • If no, increment $p, and if $string[$p] is now the last character in $string, quit.

Here is an example, with | shown as the partition:

  • X|AbBa ... XA is good, so increment $p
  • XA|bBa ... Ab is good, so increment $p
  • XAb|Ba ... bB is bad so delete it and decrement $p
  • XA|a ... Aa is bad, so delete it and decrement $p
  • X| ... $string[$p] is the end of the string, so quit

I debated where it would be more efficient to hold the string as a scalar and use substr() to perform the algorithm steps, or to split() the intial string into an array and then use array slices. I didn't have time to try both so went with the array.

Try it 

Try running the script with any input:



example: oOStTuccYyrResmMsZziI

Script


#!/usr/bin/perl

# Blog: http://ccgi.campbellsmiths.force9.co.uk/challenge

use v5.26;    # The Weekly Challenge - 2025-06-30
use utf8;     # Week 328 - task 2 - Good string
use warnings; # Peter Campbell Smith
binmode STDOUT, ':utf8';
use Encode;

good_string('WeEeekly');
good_string('abcddeFfgghH');
good_string('goodstringsGOODSTRINGS');
good_string('BbaADdsSTtrRIinNGg');

sub good_string {
    
    my (@string, $p, @new_string, $last_p);
    
    # initialise
    @string = split('', $_[0]);
    $p = 0;
    $last_p = $#string;
    
    # move partition along
    while (1) {
        
        # finished
        last if $p >= $#string;
        
        # pair at p, p + 1 is bad, cut them out and move partition left
        if (is_bad($string[$p], $string[$p + 1])) {
            @new_string = ();
            @new_string = @string[0 .. $p - 1] if $p > 0;
            @new_string = (@new_string, @string[$p + 2 .. $#string]) if $p + 2 <= $#string;
            @string = @new_string;
            $p -- if $p > 0;
            
        # pair at p, p + 1 is good - move partition right
        } else {
            $p ++;
        }
    }           
    
    say qq[\nInput:  '$_[0]'];
    say qq[Output: '] . join('', @string) . q['];
}

sub is_bad {
    
    # good unless they are the same character
    return 0 unless lc($_[0]) eq lc($_[1]);
    
    # bad if they are Xx or xX
    return 1 if $_[0] le 'Z' and $_[1] ge 'a' or
                $_[0] ge 'a' and $_[1] le 'Z';
    
    # otherwise good (xx or XX)
    return 0;
}

Output


Input:  'WeEeekly'
Output: 'Weekly'

Input:  'abcddeFfgghH'
Output: 'abcddegg'

Input:  'goodstringsGOODSTRINGS'
Output: 'goodstringsGOODSTRINGS'

Input:  'BbaADdsSTtrRIinNGg'
Output: ''

 

Any content of this website which has been created by Peter Campbell Smith is in the public domain