The Weekly Challenge

Peter Campbell Smith

Orders and anagrams

Weekly challenge 307 — 3 February 2025

Week 307: 3 Feb 2025

Task 2

Task — Find anagrams

You are given a list of words, @words. Write a script to find any two consecutive words and if they are anagrams, drop the first word and keep the second. You continue this until there are no more anagrams in the given list and return the count of final list.

Examples


Example 1
Input: @words = ('acca', 'dog', 'god', 'perl', 'repl')
Output: 3
Step 1: 'dog' and 'god' are anagrams, so dropping 'dog'
and keeping 'god' => ('acca', 'god', 'perl', 'repl')
Step 2: 'perl' and 'repl' are anagrams, so dropping 
'perl' and keeping 'repl' => ('acca', 'god', 'repl')

Example 2
Input: @words = ('abba', 'baba', 'aabb', 'ab', 'ab')
Output: 2
Step 1: 'abba' and 'baba' are anagrams, so dropping 
'abba' and keeping 'baba' => ('baba', 'aabb', 'ab', 
'ab')
Step 2: 'baba' and 'aabb' are anagrams, so dropping 
'baba' and keeping 'aabb' => ('aabb', 'ab', 'ab')
Step 3: 'ab' and 'ab' are anagrams, so dropping 'ab' 
and keeping 'ab' => ('aabb', 'ab')

Analysis

Key to this challenge is identifying anagrams, and we might use something like are_anagrams($a, $b). But we're going to need that a lot for a long list, so how can we make it efficient?

My solution to that is to start by sorting each word individually into alphabetical order, so 'perl' becomes 'elpr' and 'challenge' becomes 'aceeghlln'. Now, are_anagrams can simply be $a eq $b.

After that, we need to loop along the list in a slighly unusual fashion, examining each element to see if it is the same as the following one. If it isn't, we simply move to the next element, but if it does match, we remove one of the pair from the array and - here's the unusual bit - we don't increment the loop variable. That way we can catch and eliminate 3 (or more) consecutive anagrams in a single pass.

I did the removal of anagrams by using array slices. Conceivably it would be faster to set the eliminated element to an empty string, and then count the non-empty strings at the end to provide the answer. I would certainly consider doing that in a production environment if the array was more than say 1000 elements or the code had to be run say every second. But I'm not sure I can think of any real-life example of needing that.

Try it

Script


#!/usr/bin/perl

# Blog: http://ccgi.campbellsmiths.force9.co.uk/challenge

use v5.26;    # The Weekly Challenge - 2025-02-03
use utf8;     # Week 307 - task 2 - Find anagrams
use warnings; # Peter Campbell Smith
binmode STDOUT, ':utf8';

find_anagrams('acca', 'dog', 'god', 'perl', 'repl');
find_anagrams('abba', 'baba', 'aabb', 'ab', 'ab');
find_anagrams(qw[abcd abdc acbd acdb adbc adcb bacd badc 
    bcad bcda mouse bdac bdca cabd cadb cbad cbda cdab 
    cdba dabc dacb dbac dbca dcab dcba]); 

sub find_anagrams {
    
    my (@words, $point);
    
    @words = @_;
    say qq[\nInput:  \@words = ('] . join(q[', '], @words) . q[')];
    
    # sort each word alphabetically
    $words[$_] = join('', sort(split('', $words[$_]))) for 0 .. $#words;
    
    # walk along the array
    $point = 0;
    while (1) {
        
        # finished
        last if $point == $#words;
        
        # word followed by anagram
        if ($words[$point] eq $words[$point + 1]) {
            
            # .. followed by no more words
            if ($point + 2 > $#words) {
                @words = @words[0 .. $point];
                
            # .. followed by more words
            } else {
                @words = (@words[0 .. $point], @words[$point + 2 .. $#words]);
            }
        
        # word not followed by anagram
        } else {
            $point ++;
        }
    }   
    say qq[Output: ] . scalar(@words);
}

Output


Input:  @words = ('acca', 'dog', 'god', 'perl', 'repl')
Output: 3

Input:  @words = ('abba', 'baba', 'aabb', 'ab', 'ab')
Output: 2

Input:  @words = ('abcd', 'abdc', 'acbd', 'acdb', 'adbc',
'adcb', 'bacd', 'badc', 'bcad', 'bcda', 'mouse', 'bdac',
'bdca', 'cabd', 'cadb', 'cbad', 'cbda', 'cdab', 'cdba',
'dabc', 'dacb', 'dbac', 'dbca', 'dcab', 'dcba')
Output: 3

Any content of this website which has been created by Peter Campbell Smith is in the public domain