Peter
Peter Campbell Smith

Duplicates and triplets

Weekly challenge 234 — 11 September 2023

Week 234 - 11 Sep 2023

Task 1

Task — Common characters

You are given an array of words made up of alphabetic characters only. Write a script to return all alphabetic characters that show up in all words including duplicates.

Examples


Example 1
Input: @words = ("java", "javascript", "julia")
Output: ("j", "a")

Example 2
Input: @words = ("bella", "label", "roller")
Output: ("e", "l", "l")

Example 3
Input: @words = ("cool", "lock", "cook")
Output: ("c", "o")

Analysis

Let's start by sorting each word's letters into alphabetical order, so, for example, 'javascript' becomes 'aacijprstv'. This ensures that any repeated letters - 'a' in this case - are together.

Now we can be sure that any letters that appear in all the words must appear in $words[0]. So let's iterate over the letters in $words[0], and see if they appear in $words[1] etc. Any that do, we add to our list of common characters.

But there is a complication, which is that we need to allow for repeated letters.

The first way round this is for our iteration not just to be over individual characters but over sequences of the same character, which I've called samples. So in our 'javascipt' example, first we check that sample 'aa' appears in all the words, then 'c' and so on.

But that still doesn't quite work, because while 'julia' doesn't contain 'aa', it does contain 'a'. So for every sample in $words[0] which is more than 1 character we need to check shorter versions. So in our example, after checking 'aa' we need to check 'a'.

And that's my algorithm. Note especially the while ($sample) loop with $sample =~ s|.||; at the end to repeatedly shorten the sample by one character until it's empty.

Try it 

Try running the script with any input, for example:
java, javascript, julia


Script


#!/usr/bin/perl

use v5.16;    # The Weekly Challenge - 2023-09-11
use utf8;     # Week 234 task 1 - Common characters
use strict;   # Peter Campbell Smith
use warnings; # Blog: http://ccgi.campbellsmiths.force9.co.uk/challenge

common_characters('java', 'javascript', 'julia');
common_characters('bella', 'label', 'roller'); 
common_characters('cool', 'lock', 'cook');
common_characters('abcdefffffgh','deffffxudlge', 'jrwtsiffftmhdeff', 'deffxxxxxxx', 'xxexxxdxfxxxxf');
common_characters('abc', 'def', 'ghi');

sub common_characters {
    
    my (@words, $word, $sample, $j, @result, $good);
    
    # sanity check
    @words = @_;
    return if scalar @words < 2;
    
    # sort all words' letters into ascending order
    for $word (@words) {
        $word = join('', sort(split('', lc($word))));
    }
    
    # loop over 'samples' in word 0 - eg a or bbb or cc
    SAMPLE: while ($words[0] =~ m|(.)(\1*)|g) { 
        $sample = $1 . $2;
        while ($sample) {
            
            # check whether the sample appears in all the other words, else go to next sample
            $good = 1;
            for $j (1 .. scalar @words - 1) {
                $good = 0 unless $words[$j] =~ m|$sample|;
                next SAMPLE if (not $good and length($sample) == 1);
            }
            
            # if all the words matched, add the individual characters of sample to the result
            if ($good) {
                push(@result, $1) while $sample =~ m|(.)|g;
                next SAMPLE;
            }
            
            # if eg aaa didn't match, try again with aa then a
            $sample =~ s|.||;
        }
    }
    
    # show results
    say qq[\nInput:  \@words = ('] . join(qq[', '], @_) . q[')];
    say qq[Output: ('] . join(qq[', '], @result) . q[')];
}

Output


Input:  @words = ('java', 'javascript', 'julia')
Output: ('a', 'j')

Input:  @words = ('bella', 'label', 'roller')
Output: ('e', 'l', 'l')

Input:  @words = ('cool', 'lock', 'cook')
Output: ('c', 'o')

Input:  @words = ('abcdefffffgh', 'deffffxudlge', 'jrwtsiffftmhdeff', 'deffxxxxxxx', 'xxexxxdxfxxxxf')
Output: ('d', 'e', 'f', 'f')

Input:  @words = ('abc', 'def', 'ghi')
Output: ('')