Big ones and jelmbud wrods
Weekly challenge 289 — 30 September 2024
An Internet legend dating back to at least 2001 goes something like this:
Aoccdrnig to a rscheearch at Cmabrigde Uinervtisy, it deosn’t mttaer in waht oredr the ltteers in a wrod are, the olny iprmoetnt tihng is taht the frist and lsat ltteer be at the rghit pclae. The rset can be a toatl mses and you can sitll raed it wouthit porbelm. Tihs is bcuseae the huamn mnid deos not raed ervey lteter by istlef, but the wrod as a wlohe.
This supposed Cambridge research is unfortunately an urban legend. However, the effect has been studied. For example, in a paper whose title probably made the journal’s editor a little nervous, “Raeding wrods with jubmled lettres: there is a cost”, Rayner, White, et al. looked at reading speed and comprehension of jumbled text.
Your task is to write a program that takes English text as its input and outputs a jumbled version as follows:
I don’t know if this effect has been studied in other languages besides English, but please consider sharing your results if you try!
Example 1: “Perl” could become “Prel”, or stay as “Perl”, but it could not become “Pelr” or “lreP”.
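The rule illustrated by Example 1 can be checked mechanically. The following is only a sketch: is_valid_jumble is a hypothetical helper, not part of the task or of the solution further down, and it assumes a 'word' with no embedded punctuation.

```perl
use strict;
use warnings;

# Hypothetical helper (an assumption, not from the task description):
# returns 1 if $jumbled is an allowed jumble of $word, i.e. it has the
# same first and last character and the same characters in between,
# in any order. Returns 0 otherwise.
sub is_valid_jumble {
    my ($word, $jumbled) = @_;

    return 0 unless length($word) == length($jumbled);

    # words of up to three characters have nothing that can move
    return $word eq $jumbled ? 1 : 0 if length($word) < 4;

    # the first and last characters must be unchanged
    return 0 unless substr($word, 0, 1) eq substr($jumbled, 0, 1);
    return 0 unless substr($word, -1) eq substr($jumbled, -1);

    # the interior must be a rearrangement of the same characters
    my $sorted_word    = join('', sort(split(//, substr($word, 1, -1))));
    my $sorted_jumbled = join('', sort(split(//, substr($jumbled, 1, -1))));
    return $sorted_word eq $sorted_jumbled ? 1 : 0;
}

for my $candidate (qw(Prel Perl Pelr lreP)) {
    printf "%s: %s\n", $candidate,
        is_valid_jumble('Perl', $candidate) ? 'allowed' : 'not allowed';
}
```

A checker like this is handy for testing a jumbler, because the output is random and cannot simply be compared with a fixed expected string.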
The task description is silent as to words with embedded punctuation such as Don't or half-baked. I have assumed that the punctuation stays where it is but that the letters can move across it, so, for example, half-baked could become heka-flabd.
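So my algorithm looks like this, taking (half-baked,) as the example:
1. Take the next whitespace-delimited 'word': (half-baked,)
2. Split off everything up to and including the first letter, and everything from the last letter onwards: (h and d,)
3. What is left is the middle: alf-bake
4. Extract just the letters from the middle: alfbake
5. Shuffle those letters: flbakea
6. Put the shuffled letters back in the letter positions of the middle: flb-akea
7. Reassemble the word: (hflb-akead,)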
That leaves any embedded non-letters where they are, but jumbles the letters before and after them.
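The steps above can be sketched more compactly. This is an alternative sketch, not the solution shown further down: jumble_word and jumble_text are hypothetical names, and the shuffle uses shuffle from the core List::Util module rather than repeated random swaps.

```perl
use strict;
use warnings;
use List::Util qw(shuffle);

# Hypothetical helper: jumble one whitespace-delimited 'word', keeping the
# first and last \w character and any embedded non-\w characters in place.
sub jumble_word {
    my ($word) = @_;

    # split into before (up to the first \w), middle, and after (from the
    # last \w); if there are fewer than two \w characters, leave it alone
    my ($before, $middle, $after) = $word =~ m/^(\W*\w)(.*)(\w\W*)$/
        or return $word;

    # extract just the \w characters from the middle and shuffle them
    my @letters = shuffle($middle =~ m/\w/g);

    # put one shuffled character back in place of each original \w character
    $middle =~ s/\w/shift @letters/ge;

    return $before . $middle . $after;
}

# Hypothetical helper: jumble every word in a text. Note that runs of
# whitespace are collapsed to single spaces by split and join.
sub jumble_text {
    my ($text) = @_;
    return join ' ', map { jumble_word($_) } split ' ', $text;
}

print jumble_text('The quick brown fox (half-baked,) jumps over the lazy dog.'), "\n";
```

A single shuffle gives every ordering of the middle letters the same chance, including the original ordering, which the task allows.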
If the text mainly comprises common words with up to six or seven letters, then I would say the assertion that it is easily read is true. However, with longer words, unfamiliar words and technical words it becomes steadily less comprehensible.
I tried German, French, Greek and Russian examples (see below). These demonstrate that Perl's \w correctly matches letters with accents and letters in (at least some) non-Latin alphabets. I would say that the comprehensibility of the French and German phrases is about the same as the English ones, but German is given to long compound words which will probably be incomprehensible when scrambled. I don't speak Greek or Russian, so can't comment on those.
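The behaviour of \w on non-ASCII text can be checked in isolation. The following is a minimal sketch, assuming the source file is saved as UTF-8; count_word_chars is a hypothetical helper and the sample words are taken from the test phrases.

```perl
use v5.26;     # also enables strict and the unicode_strings feature
use warnings;
use utf8;      # this source file contains non-ASCII string literals

binmode STDOUT, ':encoding(UTF-8)';

# Hypothetical helper: how many characters of $text does \w match?
sub count_word_chars {
    my ($text) = @_;
    my $count = () = $text =~ m/\w/g;    # count the matches in list context
    return $count;
}

for my $sample ('Mädchen', 'disparaître', 'γρήγορη', 'Быстрая', 'half-baked') {
    printf "%-12s %d of %d characters match \\w\n",
        $sample, count_word_chars($sample), length($sample);
}
```

Note that \w also matches digits and the underscore, so in a word such as propan-2-ol the digit is treated as one of the letters that may be moved.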
#!/usr/bin/perl

# Blog: http://ccgi.campbellsmiths.force9.co.uk/challenge

use v5.26;    # The Weekly Challenge - 2024-09-30
use utf8;     # Week 289 - task 2 - Jumbled letters
use warnings; # Peter Campbell Smith
binmode STDOUT, ':utf8';

jumbled_letters(qq[The quick brown fox jumps over the lazy dog.]);
jumbled_letters(qq[The X-factor's inventor was Bloggs-Jones, who said 'Hello!']);
jumbled_letters(qq[Psychological Abstracts contains nonevaluative abstracts of literature in psychology and related disciplines, grouped into 22 major classification categories]);
jumbled_letters(qq[Deoxyribonucleic acid, mucopolysaccharides and propan-2-ol are organic chemicals.]);
jumbled_letters(qq[Das Mädchen möchte die Straẞe früh überqueren.]);
jumbled_letters(qq[L'accent circonflexe va disparaître des manuels scolaires à la rentrée: que s'est-il passé, et qu'en pense le ministre de l'Éducation nationale Najat Vallaud Belkacem?]);
jumbled_letters(qq[Η γρήγορη καφετιά αλεπού πηδά πάνω από το τεμπέλικο σκυλί]);
jumbled_letters(qq[Быстрая бурая лиса перепрыгивает через ленивую собаку.]);

sub jumbled_letters {

    my ($str, $before, $rest, $middle, $after, $one, $two, $x, $letters, $count,
        $length, $word, $lm, $s, $m, $jumbled);

    $str = $_[0] . ' ';
    $jumbled = '';

    # loop over 'words'
    while ($str =~ m|([^\s]*)\s+|gi) {

        # split word into $before, $middle and $after
        $word = $1;
        if ($word =~ m|\w| and length($word) >= 4) {
            ($before, $rest) = $word =~ m|([^\w]*\w)(.*)|;
            ($middle, $after) = $rest =~ m|(.*?)(\w[^\w]*)$|;

            # put just the letters (\w) into letters
            $lm = length($middle);
            $letters = $middle;
            $letters =~ s|[^\w]||g;
            $count = length($letters);

            # swap letters around randomly lots of times
            if ($count > 1) {
                for (0 .. $count + rand(7)) {
                    do {
                        $one = int(rand($count));
                        $two = int(rand($count));
                    } until $one != $two;
                    $x = substr($letters, $one, 1);
                    substr($letters, $one, 1) = substr($letters, $two, 1);
                    substr($letters, $two, 1) = $x;
                }

                # now put the jumbled letters in place of the originals
                $s = 0;
                for $m (0 .. length($middle) - 1) {
                    if (substr($middle, $m, 1) =~ m|\w|) {
                        substr($middle, $m, 1) = substr($letters, $s ++, 1);
                    }
                }
            }

            # reassemble the word
            $word = $before . $middle . $after;
        }

        # and add it to the jumbled output
        $jumbled .= $word . ' ';
    }

    say qq[\nInput: $str];
    say qq[Output: $jumbled];
}
Input: The quick brown fox jumps over the lazy dog.
Output: The qcuik brown fox jmups oevr the lazy dog.

Input: The X-factor's inventor was Bloggs-Jones, who said 'Hello!'
Output: The X-cafort's invoetnr was BsngJl-ooges, who siad 'Hlleo!'

Input: Psychological Abstracts contains nonevaluative abstracts of literature in psychology and related disciplines, grouped into 22 major classification categories
Output: Poihcsocalygl Atabtrscs cnitoans noavavnlteiue aatsbtcrs of lteaitrure in pcyolgsohy and retaled depiiisncls, geuprod into 22 maojr clctfoissiiaan cgaeiortes

Input: Deoxyribonucleic acid, mucopolysaccharides and propan-2-ol are organic chemicals.
Output: Dinooeuilyrcebxc aicd, mlaiadcrhoocycsueps and ppanor-2-ol are onagirc clmiahces.

Input: Das Mädchen möchte die Straẞe früh überqueren.
Output: Das Mhcäedn mtcöhe die Sẞrate früh üererubqen.

Input: L'accent circonflexe va disparaître des manuels scolaires à la rentrée: que s'est-il passé, et qu'en pense le ministre de l'Éducation nationale Najat Vallaud Belkacem?
Output: L'acenct croxlecfine va dtrpiîarsae des maleuns solaciers à la rétrnee: que s'sti-el pssaé, et qe'un pnsee le mtisinre de l'tiacuÉdon nnitaloae Njaat Vaualld Balecekm?

Input: Η γρήγορη καφετιά αλεπού πηδά πάνω από το τεμπέλικο σκυλί
Output: Η γροήργη κιατεφά αεπολύ πηδά πνάω από το τειλκέμπο συλκί

Input: Быстрая бурая лиса перепрыгивает через ленивую собаку.
Output: Баысртя буаря лсиа пыреапвигреет через лвиуеню собаку.
Any content of this website which has been created by Peter Campbell Smith is in the public domain