Peter’s blog ✴ Week 365 ✴ 16 March 2026

THE WEEKLY CHALLENGE
Lots of counting

Task 2

Valid token counter

You are given a $sentence. Write a script to split the given sentence into space-separated tokens and count how many are valid words. A token is valid if it

  • contains no digits,
  • has at most one hyphen surrounded by lowercase letters, and
  • at most one punctuation mark [!,.] appearing only at the end.

Examples


Example 1
Input: $str = 'cat and dog'
Output: 3
Tokens: 'cat', 'and', 'dog'

Example 2
Input: $str = 'a-b c! d,e'
Output: 2
Tokens: 'a-b', 'c!', 'd,e'
'a-b' -> valid (one hyphen between letters)
'c!'  -> valid (punctuation at end)
'd,e' -> invalid (punctuation not at end)

Example 3
Input: $str = 'hello-world! this is fun'
Output: 4
Tokens: 'hello-world!', 'this', 'is', 'fun'
All satisfy the rules.

Example 4
Input: $str = 'ab- cd-ef gh- ij!'
Output: 2
Tokens: 'ab-', 'cd-ef', 'gh-', 'ij!'
'ab-'   -> invalid (hyphen not surrounded by letters)
'cd-ef' -> valid
'gh-'   -> invalid
'ij!'   -> valid

Example 5
Input: $str = 'wow! a-b-c nice.'
Output: 2
Tokens: 'wow!', 'a-b-c', 'nice.'
'wow!'  -> valid
'a-b-c' -> invalid (more than one hyphen)
'nice.' -> valid

Analysis

So let's start by splitting the sentence into words separated by one or more spaces. Then let's examine each word:

  • Contains a digit? If so, bad.
  • Contains hyphens?
    • More than one? If so, bad.
    • One without a letter before and after? If so, bad.
  • Contains punctuation?
    • More than 1? If so, bad.
    • Not at the end? If so, bad.
  • Otherwise, good!
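
The checklist above can also be folded into a single pattern. What follows is a minimal sketch rather than the method used in the script further down: is_valid_token and count_valid_tokens are hypothetical names, and it assumes split ' ' in place of /\s+/ so that leading spaces do not produce an empty first token.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if $token passes all three rules, else 0.
sub is_valid_token {
    my ($token) = @_;
    return $token =~ m{
        \A
        [^\d!.,-]*                       # no digits, punctuation or hyphen here
        (?: [a-z]-[a-z] [^\d!.,-]* )?    # at most one hyphen, lowercase on each side
        [!.,]?                           # at most one punctuation mark, at the end
        \z
    }x ? 1 : 0;
}

# Hypothetical helper: counts the valid tokens in a sentence.
sub count_valid_tokens {
    my ($sentence) = @_;
    # split ' ' skips leading whitespace, unlike split /\s+/
    return scalar grep { is_valid_token($_) } split ' ', $sentence;
}

for my $s ('cat and dog', 'a-b c! d,e', 'hello-world! this is fun',
           'ab- cd-ef gh- ij!', 'wow! a-b-c nice.') {
    printf "%-28s -> %d\n", "'$s'", count_valid_tokens($s);
}
```

For the five examples this gives 3, 2, 4, 2 and 2. The trade-off is that one regex is harder to read than the step-by-step tests, which say exactly which rule a token broke.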

Perhaps the least intuitive line is:

$count = @z = $word =~ m|\-|g;

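In list context, the $word =~ m...g construct returns a list of all the matches, which is assigned to @z. Assigning the result of that list assignment to the scalar $count then gives the number of elements in @z, and thus the number of matches. The g modifier matters here: without it the match stops at the first hit, so the count can never exceed 1.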
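
As a small standalone sketch of this counting idiom (the variable names here are illustrative and not taken from the script), the same trick works with an empty list in place of @z, and tr/// is a common alternative when counting single characters:

```perl
use strict;
use warnings;

my $word = 'a-b-c';

# A list assignment evaluated in scalar context yields the number of
# elements on its right-hand side, so no named array is needed.
my $hyphens = () = $word =~ /-/g;

# tr/// with an empty replacement counts characters without changing $word.
my $hyphens_tr = ($word =~ tr/-//);

# Without /g, a pattern with no captures returns just (1) on success,
# so the count stops at 1 however many marks there are.
my $without_g = () = 'a!!' =~ /[!.,]/;
my $with_g    = () = 'a!!' =~ /[!.,]/g;

print "$hyphens $hyphens_tr $without_g $with_g\n";   # prints "2 2 1 2"
```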

Perl Weekly’s review

from Perl Weekly issue 765

This is a good example of a solid engineering solution. It shows a structured and clear thinking process, as well as how well you have used the basic features of Perl to accomplish the task at hand. Your implementation is both concise and expressive; thus, demonstrating your mastery of decomposing problems into their components and using clean, idiomatic coding methods in your programming experience.


Script


#!/usr/bin/perl

# Blog: http://ccgi.campbellsmiths.force9.co.uk/challenge

use v5.26;    # The Weekly Challenge - 2026-03-16
use utf8;     # Week 365 - task 2 - Valid token counter
use warnings; # Peter Campbell Smith
binmode STDOUT, ':utf8';
use Encode;

valid_token_counter('cat and dog');
valid_token_counter('a-b c! d,e');
valid_token_counter('hello-world! this is fun');
valid_token_counter('ab- cd-ef gh- ij!');
valid_token_counter('wow! a-b-c nice.');

sub valid_token_counter {
    
    my ($string, $valid, $word, @words, $count, @z, $good);
    
    # initialise
    $string = $_[0];
    $valid = 0;
    $good = '';
    
    # loop over words
    @words = split(/\s+/, $string);
    for $word (@words) {
        next if $word =~ m|\d|;   # no digits
        $count = @z = $word =~ m|\-|g; # count hyphens
        next if $count > 1;
        next if ($count == 1 and not $word =~ m|[a-z]\-[a-z]|);
        $count = @z = $word =~ m|[!\.,]|g; # count punctuation (g counts every mark)
        next if $count > 1;
        next if ($count == 1 and not $word =~ m|[!\.,]$|);
        
        # passes all tests
        $valid ++;
        $good .= qq['$word', ];
    }
    $good = ' - (' . substr($good, 0, -2) . ')' if $good;
    
    say qq[\nInput:  '$string'];
    say qq[Output: $valid valid tokens $good];
}

19 lines of code

Output from script


Input:  'cat and dog'
Output: 3 valid tokens  - ('cat', 'and', 'dog')

Input:  'a-b c! d,e'
Output: 2 valid tokens  - ('a-b', 'c!')

Input:  'hello-world! this is fun'
Output: 4 valid tokens  - ('hello-world!', 'this', 'is', 'fun')

Input:  'ab- cd-ef gh- ij!'
Output: 2 valid tokens  - ('cd-ef', 'ij!')

Input:  'wow! a-b-c nice.'
Output: 2 valid tokens  - ('wow!', 'nice.')

 

Any content of this website which has been created by Peter Campbell Smith is in the public domain