Peter Campbell Smith

Abecedarian words
and pangrams

Weekly challenge 161 — 18 April 2022

Task 1

Task — Abecedarian words

An abecedarian word is a word whose letters are arranged in alphabetical order. Output or return a list of all abecedarian words in this dictionary, sorted in decreasing order of length.

Optionally, using only abecedarian words, leave a short comment in your code to make your reviewer smile.

Examples


Example 1:
“knotty” is an abecedarian word

Example 2:
“knots” is not an abecedarian word
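
The two examples can be checked with a few lines of Perl. This is only a minimal sketch: `is_abecedarian` is a hypothetical helper name, not something the task or the dictionary provides, and it assumes plain a-z words.

```perl
use strict;
use warnings;

# is_abecedarian is a hypothetical helper, named here for illustration only.
# A word passes if no letter is alphabetically earlier than the one before it.
sub is_abecedarian {
    my ($word) = @_;
    my @letters = split //, lc $word;
    for my $k (1 .. $#letters) {
        return 0 if $letters[$k] lt $letters[$k - 1];
    }
    return 1;
}

for my $example (qw(knotty knots)) {
    printf "%s: %s\n", $example,
        is_abecedarian($example) ? 'abecedarian' : 'not abecedarian';
}
```

Repeated letters are allowed, which is why the double "t" in "knotty" does not disqualify it.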

Analysis

There are, maybe surprisingly, few abecedarian words in the supplied dictionary: of 39172 words only 319 (0.81%) are abecedarian, and none is more than 6 letters long.

But if we consider any pair of random letters, c1 and c2, the probability that c1 alphabetically precedes c2 and the probability that c2 precedes c1 are clearly the same. The only other option is that they are the same letter, and the probability of that is 1 in 26. So the probability that c2 is the same as or comes alphabetically after c1 is 13.5 in 26 (12.5 in 26 for 'after' plus 1 in 26 for 'same'), which is close to 52%. Not bad odds, you might think, and indeed there are quite a few two-letter words in the list (some, it has to be said, are dubiously valid as words).
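
The 13.5 in 26 figure can be confirmed by brute force. The sketch below simply enumerates all 26 x 26 letter pairs, under the same assumption that both letters are chosen uniformly and independently; `count_ordered_pairs` is a hypothetical name used only for this illustration.

```perl
use strict;
use warnings;

# Count pairs (c1, c2) where c2 is the same as or comes after c1,
# assuming each letter is drawn uniformly and independently from a-z.
sub count_ordered_pairs {
    my @alphabet = ('a' .. 'z');
    my $ordered  = 0;
    for my $c1 (@alphabet) {
        for my $c2 (@alphabet) {
            $ordered++ if $c2 ge $c1;
        }
    }
    return ($ordered, scalar(@alphabet) ** 2);
}

my ($ordered, $total) = count_ordered_pairs();
printf "%d of %d pairs are in order (%.2f%%)\n",
    $ordered, $total, 100 * $ordered / $total;
```

That gives 351 of 676 pairs, which is the same ratio as 13.5 in 26.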

Now you might think that a 3-letter word would have a 52% x 52% = 27% chance of being abecedarian. Actually, it's less than that. In a 3-letter word w1w2w3 where w1w2 is already abecedarian, w2 lies in the range w1-Z and so tends to fall late in the alphabet, whereas a randomly selected w3 can be anywhere in A-Z. So w3 has a less than 52% chance of making w1w2w3 abecedarian.

How much less I leave to those with younger brains than mine. Of course, these theoretical numbers rest on the demonstrably false assumptions that words consist of random letters and that every letter is equally likely to appear in a word. Even so, the fact that the dictionary contains no abecedarian words of more than 6 letters is quite explicable.
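
For anyone curious about the 'how much less', here is a rough sketch that counts the in-order letter sequences of a given length, still under the unrealistic assumption of uniformly random letters. `abecedarian_probability` is a hypothetical name, and the counting method (a running total of the sequences ending in each letter) is just one way of doing it.

```perl
use strict;
use warnings;

# Probability that $length uniformly random letters are in alphabetical order.
# $ending_in[j] holds the number of in-order sequences ending in letter j.
sub abecedarian_probability {
    my ($length) = @_;
    my @ending_in = (1) x 26;    # one-letter sequences: one per letter
    for (2 .. $length) {
        my $running = 0;
        # a sequence ending in letter j extends any sequence ending in a
        # letter at or before j, so each entry becomes a running total
        for my $count (@ending_in) {
            $running += $count;
            $count = $running;   # $count is aliased to the array element
        }
    }
    my $total = 0;
    $total += $_ for @ending_in;
    return $total / 26 ** $length;
}

printf "%d letters: %.2f%%\n", $_, 100 * abecedarian_probability($_) for 2 .. 7;
```

Under these assumptions a 3-letter word comes out at about 18.6% rather than 27%, and the odds keep shrinking quickly as the length grows.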

And as optionally requested: 'Ah no, a bent hoop, a gory box, a dirty floor, all beg for deep effort'.

Script


#!/usr/bin/perl

# Peter Campbell Smith - 2022-04-18
# PWC 161 task 1

use v5.28;
use strict;
use warnings;
use utf8;

my ($dictionary, $k, $word, @letters, %results, $line, $word_count, $result_count, $abcd); 

# fetch dictionary
$dictionary = `curl -s -L https://github.com/manwar/perlweeklychallenge-club/raw/master/data/dictionary.txt`;

# loop over dictionary words
WORD: while ($dictionary =~ m|(.*)?\n|g) {
    $word = $1;
    $word_count ++;
    
    # split word into array of letters
    @letters = split(//, $word);
    
    # single letter words are ok ($#letters is the last index, so 0 means one letter)
    if ($#letters == 0) {
        $results{99 . $word} = 1;   # key prefix (99 - last index) makes longer words sort first
        
    # check successive pairs
    } else {
        for $k (0 .. $#letters - 1) {
            next WORD if ($letters[$k + 1] lt $letters[$k]);
        }
        
        # result!
        $results{(99 - $#letters) . $word} = 1;
    }
}

# print them out in the specified order
say qq[\nAbecedarian words from dictionary:];
for $k (sort keys %results) {
    $line .= substr($k, 2, 99)  . ' ';
    if (length $line > 100) {
        say $line;
        $line = '';
    }
    $abcd ++;
}
say $line if $line;

say qq[\nFrom $word_count words found $abcd that are abecedarian (] .
    (int($abcd / $word_count * 10000) / 100) . '%)';
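
The numeric prefix on the hash keys is there only to get the output order right. A possible alternative, sketched here and not what the script above does, is to sort with a comparator on length, breaking ties alphabetically. `sort_by_length_desc` and the sample word list are made up for the illustration.

```perl
use strict;
use warnings;

# sort_by_length_desc is a hypothetical helper: longest words first,
# with words of equal length in alphabetical order.
sub sort_by_length_desc {
    my @words = @_;
    return sort { length($b) <=> length($a) or $a cmp $b } @words;
}

my @sample = qw(a knotty ad abhors box beefy);
print join(' ', sort_by_length_desc(@sample)), "\n";
# prints: abhors knotty beefy box ad a
```

This keeps the words themselves as the data and avoids having to strip the prefix again with substr when printing.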

Output


Abecedarian words from dictionary:
abhors accent accept access accost almost begins bellow 
billow cellos chills chilly chimps chintz choosy 
choppy effort floors floppy glossy knotty abbey abbot
abhor abort adept adopt affix afoot aglow allot 
allow alloy annoy beefs beefy beers befit begin bells
belly below berry bills boors boost booty bossy 
cello cells chill chimp chins chips chops coops deems
deeps deity dills dirty ditty doors empty fills 
filly films filmy first floor flops floss forty ghost
gills glory gloss hills hilly hippy hoops loops 
lorry moors mossy abet ably aces adds ahoy ails aims
airs airy ally alms amps beef been beer bees beet 
begs bell belt bent best bill bins blot blow boor boos
boot boss buzz cell cent chin chip chop chow city 
clot coop coos cops copy cost crux deem deep deer deft
defy dens dent deny dill dims dins dips dirt door 
eels eggs egos elms envy errs fill film fins firs fist
fizz flop flow flux foot fort foxy fuzz gill gilt 
gins gist glow gory hill hilt hims hint hips hiss hoop
hoot hops host ills imps inns knot know loop loot 
lops loss lost moor moos moot mops moss most nosy ace
act add ado ads ago ail aim air all amp ant any 
apt art ass bee beg bet bin bit boo bop bow box boy buy
chi coo cop cot cow cox coy cry den dew dim din 
dip dos dot dry eel egg ego elm err fin fir fit fix flu
fly for fox fry gin gnu goo got guy him hip his 
hit hop hot how iii ill imp inn ins ivy jot joy lop lot
low moo mop mow nor not now opt pry xxx ad ah 
am an as at ax be by cc cs do eh em go hi ho ii in is it iv
ix ms mu my no or ox qt xx a m x 

From 39172 words found 319 that are abecedarian (0.81%)