Peter Campbell Smith

Latin buses

Weekly challenge 274 — 17 June 2024


Task 1

Task — Goat Latin

You are given a sentence, $sentence. Write a script to convert the given sentence to Goat Latin, a made up language similar to Pig Latin.

  1. If a word begins with a vowel ("a", "e", "i", "o", "u"), append "ma" to the end of the word.
  2. If a word begins with a consonant (i.e. not a vowel), remove the first letter, append it to the end and then add "ma".
  3. Add letter "a" to the end of first word in the sentence, "aa" to the second word, etc etc.

Examples


Example 1
Input: $sentence = "I love Perl"
Output: "Imaa ovelmaaa erlPmaaaa"

Example 2
Input: $sentence = "Perl and Raku are friends"
Output: "erlPmaa andmaaa akuRmaaaa aremaaaaa riendsfmaaaaaa"

Example 3
Input: $sentence = "The Weekly Challenge"
Output: "heTmaa eeklyWmaaa hallengeCmaaaa"

Analysis

On the face of it, this is an easy challenge. Just split the sentence into words and apply the three rules.
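As a starting point, here is a minimal sketch of that naive approach, assuming a word is simply whatever sits between runs of whitespace. The name naive_goat_latin is hypothetical and is not part of the script further down.

```perl
use v5.26;
use warnings;

# Naive sketch: a 'word' is any run of non-space characters.
# naive_goat_latin is a hypothetical name used for illustration only.
sub naive_goat_latin {
    my ($sentence) = @_;
    my $append = '';
    my @out;
    for my $word (split ' ', $sentence) {
        # rule 2: move a leading consonant to the end
        # (rule 1: a word starting with a vowel is left as it is)
        $word = substr($word, 1) . substr($word, 0, 1)
            unless $word =~ m|^[aeiou]|i;
        # rule 3: the run of 'a's grows by one for each word
        $append .= 'a';
        push @out, $word . 'ma' . $append;
    }
    return join ' ', @out;
}

say naive_goat_latin('I love Perl');   # Imaa ovelmaaa erlPmaaaa
```

This reproduces the three examples from the task, but it treats punctuation and quote marks as if they were letters.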

But, wait a mo, what exactly is a word? Obviously it could be just a sequence of letters, i.e. m|\w+|. That will find any sequence of upper or lower case letters. (It will also match digits or _, but if they exist in 'words' we might as well include them.)

But what about words like half-baked or don't? OK, maybe we need m|[\w'\-]+|.
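To see the difference between the two patterns, this small sketch prints the first match of each against those two words. The helper first_match is a hypothetical name introduced only for this illustration.

```perl
use v5.26;
use warnings;

# Return the first substring of $text matched by $pattern, or '' if none.
# first_match is a hypothetical helper, not part of the script below.
sub first_match {
    my ($text, $pattern) = @_;
    my ($match) = $text =~ m/($pattern)/;
    return $match // '';
}

for my $text ('half-baked', "don't") {
    say "$text: plain \\w+ gives '", first_match($text, qr|\w+|),
        "', the wider class gives '", first_match($text, qr|[\w'\-]+|), "'";
}
```

The plain \w+ stops at the hyphen or apostrophe (half, don), while the wider class keeps the whole of half-baked and don't.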

But we might also have something like

The good ship 'Jolly Roger'

How do we apply rule 2 to 'Jolly ? Is the first letter ', or is it J?

So I came up with the following:

  1. Split $sentence into 'words' using m|([^\s]+)(\s*)|g, which captures each run of non-space characters as $word. The following run of spaces is matched but not kept; a single space is added back when the words are rejoined.
  2. I then look more closely at $word using
    m|^([^\w]*)([\w'\-]+)([^\w]*)$| which splits it into 3 parts: any leading non-word characters, word characters including - or ', and any following characters, assigning these 3 parts to $pre, $word, $post.
  3. I then look at this new $word, and if it ends with ' I remove that and prepend it to $post, so that a closing quote (as in 'sausage') counts as punctuation rather than as part of the word.
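Steps 2 and 3 can be sketched as a standalone function. This is an illustration under my own assumptions rather than the code in the script: split_token is a hypothetical name, and returning an empty list for a token with no word characters is a choice made here, not something the script does.

```perl
use v5.26;
use warnings;

# Split one whitespace-delimited token into ($pre, $word, $post).
# split_token is a hypothetical helper name used for illustration.
sub split_token {
    my ($token) = @_;

    # leading non-word chars, then word chars plus ' and -, then trailing non-word chars
    my ($pre, $word, $post) = $token =~ m|^([^\w]*)([\w'\-]+)([^\w]*)$|
        or return;    # assumption: give up on tokens that do not fit this shape

    # a trailing ' is a closing quote, so move it from the word to $post
    if ($word =~ s|'$||) {
        $post = "'" . $post;
    }
    return ($pre, $word, $post);
}

for my $token ("half-baked", "don't", "'sausage'", "why?") {
    say join ' | ', map { "[$_]" } split_token($token);
}
```

For 'sausage' this yields a leading ', the word sausage and a trailing ', whereas the apostrophe inside don't stays in the word.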

Now, at last, I can follow the first 2 rules in the task description, applying them to $word, and then concatenate the pieces, where $append is the run of 'a's from rule 3, one longer for each word:

$pre . $word . 'ma' . $append . $post . ' '

That will cope with the following examples and combinations thereof:

half-baked, don't, 'sausage', why?, hello

which I think is good enough for a goat.


Script


#!/usr/bin/perl

# Blog: http://ccgi.campbellsmiths.force9.co.uk/challenge

use v5.26;    # The Weekly Challenge - 2024-06-17
use utf8;     # Week 274 - task 1 - Goat latin
use warnings; # Peter Campbell Smith
binmode STDOUT, ':utf8';

goat_latin("I love Perl");
goat_latin("Perl and Raku are friends");
goat_latin("The Weekly Challenge");
goat_latin(qq[Can't you say 'Hello!' to my mother-in-law?]);

sub goat_latin {
    
    my ($sentence, $append, $word, $pre, $post, $goated);
    
    $sentence = shift;
    $append = '';
    
    # split sentence into words and punctuation
    while ($sentence =~ m|([^\s]+)(\s*)|ig) {
        $word = $1;
        $word =~ m|^([^\w]*)([\w'\-]+)([^\w]*)$|;
        ($pre, $word, $post) = ($1, $2, $3);
        # a trailing ' is a closing quote: drop it from the word and put it in $post
        if ($word =~ m|'$|) {
            $word = substr($word, 0, -1);
            $post = qq['$post];
        }
            
        # apply the rules
        $word = substr($word, 1) . substr($word, 0, 1)
            unless $word =~ m|^[aeiou]|i;   
        $append .= 'a';
        
        # join it all up
        $goated .= $pre . $word . 'ma' . $append . $post . ' ';
    }
    
    printf(qq[\nInput:  \$sentence = "%s"\n], $sentence);
    printf(qq[Output: "%s"\n], substr($goated, 0, -1));
}

Output


Input:  $sentence = "I love Perl"
Output: "Imaa ovelmaaa erlPmaaaa"

Input:  $sentence = "Perl and Raku are friends"
Output: "erlPmaa andmaaa akuRmaaaa aremaaaaa riendsfmaaaaaa"

Input:  $sentence = "The Weekly Challenge"
Output: "heTmaa eeklyWmaaa hallengeCmaaaa"

Input:  $sentence = "Can't you say 'Hello!' to my mother-in-law?"
Output: "an'tCmaa ouymaaa aysmaaaa 'elloHmaaaaa!' otmaaaaaa ymmaaaaaaa other-in-lawmmaaaaaaaa?"