Peter Campbell Smith

Lots of counting

Weekly challenge 365 — 16 March 2026

Task 2

Task — Valid token counter

You are given a $sentence. Write a script to split the given sentence into space-separated tokens and count how many are valid words. A token is valid if it

  • contains no digits,
  • has at most one hyphen, which must be surrounded by lowercase letters, and
  • has at most one punctuation mark [!,.], appearing only at the end.

Examples


Example 1
Input: $str = 'cat and dog'
Output: 3
Tokens: 'cat', 'and', 'dog'

Example 2
Input: $str = 'a-b c! d,e'
Output: 2
Tokens: 'a-b', 'c!', 'd,e'
'a-b' -> valid (one hyphen between letters)
'c!'  -> valid (punctuation at end)
'd,e' -> invalid (punctuation not at end)

Example 3
Input: $str = 'hello-world! this is fun'
Output: 4
Tokens: 'hello-world!', 'this', 'is', 'fun'
All satisfy the rules.

Example 4
Input: $str = 'ab- cd-ef gh- ij!'
Output: 2
Tokens: 'ab-', 'cd-ef', 'gh-', 'ij!'
'ab-'   -> invalid (hyphen not surrounded by letters)
'cd-ef' -> valid
'gh-'   -> invalid
'ij!'   -> valid

Example 5
Input: $str = 'wow! a-b-c nice.'
Output: 2
Tokens: 'wow!', 'a-b-c', 'nice.'
'wow!'  -> valid
'a-b-c' -> invalid (more than one hyphen)
'nice.' -> valid

Analysis

So let's start by splitting the sentence into words separated by one or more spaces. Then let's examine each word:

  • Contains a digit? If so, bad.
  • Contains hyphens?
    • More than one? If so, bad.
    • One without a letter before and after? If so, bad.
  • Contains punctuation?
    • More than 1? If so, bad.
    • Not at the end? If so, bad.
  • Otherwise, good!

Perhaps the least intuitive line is:

$count = @z = $word =~ m|\-|g;

With the g modifier and in list context, the $word =~ m... construct returns the list of all matches, which is stored in @z. Assigning that list assignment to $count gives the number of elements in @z and thus the number of matches.
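
As a small illustration of that counting idiom, here is a minimal sketch. The variable names $count2 and $count3 are made up for this example and do not appear in the script; the sample word is taken from Example 5.

```perl
use strict;
use warnings;

my $word = 'a-b-c';

# The inner list assignment runs the match in list context; assigning
# that list assignment to a scalar yields the number of matches.
my @z;
my $count = @z = $word =~ m|\-|g;

# The same idiom with an empty list, which avoids the throwaway array.
my $count2 = () = $word =~ m|\-|g;

# For a single character, tr/// counts occurrences without a regex.
my $count3 = ($word =~ tr/-//);

print "$count $count2 $count3\n";   # prints "2 2 2"
```

All three forms give the same count here; the empty-list version is the one usually seen when the matches themselves are not needed.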

Script


#!/usr/bin/perl

# Blog: http://ccgi.campbellsmiths.force9.co.uk/challenge

use v5.26;    # The Weekly Challenge - 2026-03-16
use utf8;     # Week 365 - task 2 - Valid token counter
use warnings; # Peter Campbell Smith
binmode STDOUT, ':utf8';
use Encode;

valid_token_counter('cat and dog');
valid_token_counter('a-b c! d,e');
valid_token_counter('hello-world! this is fun');
valid_token_counter('ab- cd-ef gh- ij!');
valid_token_counter('wow! a-b-c nice.');

sub valid_token_counter {
    
    my ($string, $valid, $word, @words, $count, @z, $good);
    
    # initialise
    $string = $_[0];
    $valid = 0;
    $good = '';
    
    # loop over words
    @words = split(' ', $string);   # ' ' also skips any leading whitespace
    for $word (@words) {
        next if $word =~ m|\d|;   # no digits
        $count = @z = $word =~ m|\-|g; # count hyphens
        next if $count > 1;
        next if ($count == 1 and not $word =~ m|[a-z]\-[a-z]|);
        $count = @z = $word =~ m|[!\.,]|g; # count punctuation
        next if $count > 1;
        next if ($count == 1 and not $word =~ m|[!\.,]$|);
        
        # passes all tests
        $valid ++;
        $good .= qq['$word', ];
    }
    $good = ' - (' . substr($good, 0, -2) . ')' if $good;
    
    say qq[\nInput:  '$string'];
    say qq[Output: $valid valid tokens $good];
}

last updated 2026-03-16 — 19 lines of code

Output


Input:  'cat and dog'
Output: 3 valid tokens  - ('cat', 'and', 'dog')

Input:  'a-b c! d,e'
Output: 2 valid tokens  - ('a-b', 'c!')

Input:  'hello-world! this is fun'
Output: 4 valid tokens  - ('hello-world!', 'this', 'is', 'fun')

Input:  'ab- cd-ef gh- ij!'
Output: 2 valid tokens  - ('cd-ef', 'ij!')

Input:  'wow! a-b-c nice.'
Output: 2 valid tokens  - ('wow!', 'nice.')
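
For comparison with the step-by-step checks in the script, the three rules can also be folded into one pattern. This is only a sketch of an alternative, not the method used above. It assumes every token is made up solely of lowercase letters, hyphens, digits and the marks [!,.]; under that assumption it should agree with the script, but it rejects tokens containing anything else (capital letters, for instance), which the script does not check for. The name count_valid_tokens is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the original script.
# Assumes tokens contain only lowercase letters, hyphens, digits and [!,.].
sub count_valid_tokens {
    my ($sentence) = @_;
    my $valid = 0;
    for my $token (split ' ', $sentence) {
        # An optional word (letters, with at most one hyphen that has
        # letters on both sides), then at most one punctuation mark
        # at the very end of the token.
        $valid++ if $token =~ /\A(?:[a-z]+(?:-[a-z]+)?)?[!.,]?\z/;
    }
    return $valid;
}

print count_valid_tokens('ab- cd-ef gh- ij!'), "\n";   # prints 2
```

Digits need no separate test here, because the pattern simply has no place for them.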

 

Any content of this website which has been created by Peter Campbell Smith is in the public domain