Peter’s blog ✴ Week 166 ✴ 23 May 2022

THE WEEKLY CHALLENGE
D00DlE5 and directory compare


Task 2

K-directory diff

Given a few (three or more) directories (non-recursively), display a side-by-side difference of files that are missing from at least one of the directories. Do not display files that exist in every directory.

Since the task is non-recursive, if you encounter a subdirectory, append a '/', but otherwise treat it the same as a regular file.
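
The script further down works from hard-coded file lists rather than real directories. As a rough sketch of how one directory might be read non-recursively, with subdirectories marked by a trailing '/', something like the following could be used. The helper name list_dir is hypothetical and not part of the original solution.

```perl
use strict;
use warnings;

# Hypothetical helper (not in the original script): list the entries of one
# directory, non-recursively, appending '/' to the names of subdirectories.
sub list_dir {
    my ($path) = @_;
    opendir(my $dh, $path) or die "Cannot open $path: $!";
    my @entries;
    while (defined(my $entry = readdir $dh)) {
        next if $entry eq '.' or $entry eq '..';   # skip self and parent
        push @entries, (-d "$path/$entry") ? "$entry/" : $entry;
    }
    closedir $dh;
    my @sorted = sort @entries;
    return @sorted;
}
```

Calling list_dir once per directory would give lists in the same shape as the hard-coded @dirs in the script.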

Examples


Example 1:
Given the following directory structure:
 
dir_a:
Arial.ttf  Comic_Sans.ttf  Georgia.ttf  Helvetica.ttf  
Impact.otf  Verdana.ttf  Old_Fonts/
 
dir_b:
Arial.ttf  Comic_Sans.ttf  Courier_New.ttf
Helvetica.ttf  Impact.otf  Tahoma.ttf  Verdana.ttf
 
dir_c:
Arial.ttf  Courier_New.ttf  Helvetica.ttf  Impact.otf  
Monaco.ttf  Verdana.ttf
 
... the output should look similar to the following:
 
dir_a          | dir_b           | dir_c
-------------- | --------------- | ---------------
Comic_Sans.ttf | Comic_Sans.ttf  |
               | Courier_New.ttf | Courier_New.ttf
Georgia.ttf    |                 |
               |                 | Monaco.ttf
Old_Fonts/     |                 |
               | Tahoma.ttf      |

Analysis

The logic of this is:

  • Loop over directories and files
  • Find the longest file name (to determine column width)
  • Create %files with key = file name and value = a list of the containing directories
  • Create a string ($all) that indicates the file is in all directories
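
The first pass can be sketched as a small function. This mirrors the marker-string encoding used in the script (each directory index n is recorded as /n/), but the function name build_index and its return shape are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: takes one array ref of file names per directory and
# returns (index, all-marker, width), where the index maps each file name to
# a string such as '/0//2/' naming the directories that contain it.
sub build_index {
    my @dirs = @_;
    my %files;
    my ($all, $width) = ('', 0);
    for my $d (0 .. $#dirs) {
        for my $file (@{ $dirs[$d] }) {
            $files{$file} .= "/$d/";                           # record directory $d
            $width = length($file) if length($file) > $width;  # longest name so far
        }
        $all .= "/$d/";   # marker string of a file present everywhere
    }
    return (\%files, $all, $width);
}
```

Because directories are visited in order, a file present in every directory ends up with exactly the same string as $all, so a plain eq comparison is enough to detect it.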

Print the heading lines using the longest file name (above) to determine column width and then:

  • Loop over directories and files
  • Skip file if $files{$file} eq $all
  • Print the file name in every column that appears in $files{$file}
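
The second pass can be sketched separately from the formatting. The function below assumes an index in the marker-string form described above (file name to a string such as '/0//2/'). The name diff_rows is hypothetical, and it returns rows of cells rather than printing them.

```perl
use strict;
use warnings;

# Hypothetical helper: given the index (hash ref), the marker string meaning
# "present in all directories", and the number of directories, return one
# array ref of cells per file that is missing from at least one directory.
sub diff_rows {
    my ($files, $all, $count) = @_;
    my @rows;
    for my $file (sort keys %$files) {
        next if $files->{$file} eq $all;   # in every directory: skip
        # one cell per directory: the name if present there, else empty
        push @rows, [ map { index($files->{$file}, "/$_/") >= 0 ? $file : '' } 0 .. $count - 1 ];
    }
    return @rows;
}
```

Keeping the rows as data makes it easy to try different column layouts without touching the comparison logic.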

Getting the columns to line up is a little messy, made slightly worse by the example having one fewer vertical line than columns. The %-s format in sprintf is useful for padding the file names to the right width.
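
One way to sidestep the separator bookkeeping, offered only as a sketch, is to pad every cell first and then join the cells, so the vertical bars land between columns automatically. The helper name format_row is hypothetical. It uses sprintf's %-*s form, where the width is passed as an argument.

```perl
use strict;
use warnings;

# Hypothetical helper: left-justify each cell to $width characters and join
# the cells with ' | ', giving one fewer separator than there are columns.
sub format_row {
    my ($width, @cells) = @_;
    return join(' | ', map { sprintf('%-*s', $width, $_) } @cells);
}
```

For example, format_row(15, 'Comic_Sans.ttf', 'Comic_Sans.ttf', '') would produce a row in the style of the task's example, with trailing padding in the last column.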

Perl Weekly’s review

from PW issue 566

Algorithm laid out clearly for anyone to get to the bottom of the task. Thank you for sharing the knowledge.

Script


#!/usr/bin/perl

# Peter Campbell Smith - 2022-05-23
# PWC 166 task 2

use v5.28;
use strict;
use warnings;
use utf8;

my (@dirs, $dir, $file, $width, $header, $all, %files, $line1, $line2, $prefix);

@dirs = ([qw[Arial.ttf  Comic_Sans.ttf  Georgia.ttf  Helvetica.ttf  Impact.otf  Verdana.ttf  Old_Fonts/]],
         [qw[Arial.ttf  Comic_Sans.ttf  Courier_New.ttf  Helvetica.ttf  Impact.otf  Tahoma.ttf  Verdana.ttf]],
         [qw[Arial.ttf  Courier_New.ttf  Helvetica.ttf  Impact.otf  Monaco.ttf  Verdana.ttf]]);

# loop over directories
$width = 0;
for $dir (0 .. scalar @dirs - 1) {
    
    # loop over files within directory
    for $file (@{$dirs[$dir]}) {
        $files{$file} .= qq[/$dir/];   # if file exists within directory n, $files{$file} matches /n/
        $width = length($file) if length($file) > $width;   # get max file name length
    }
    $all .= qq[/$dir/];   # if $files{$file} eq /0//1//2/ (etc) then file exists in all directories and is skipped below
}

# heading lines
$line1 = qq[\n];
$prefix = ' ';
for $dir (0 .. scalar @dirs - 1) {
    $line1 .= $prefix . 'dir_' . sprintf('%-' . ($width - 4) . 's', chr(ord('a') + $dir)) . ' ';
    $line2 .= $prefix . ('-' x ($width)) . ' ';
    $prefix = '| ';
}
say qq[$line1\n$line2];

# file lines
for $file (sort keys %files) {
    next if $files{$file} eq $all; # skip file if in all directories
    
    # loop over directories
    $prefix = '';
    for $dir (0 .. scalar @dirs - 1) {
        
        # file is in this directory
        if ($files{$file} =~ m|/$dir/|) {
            print sprintf($prefix . " %-${width}s", $file);
        
        # file isn't in this directory
        } else {
            print sprintf($prefix . " %-${width}s", ' ');
        }
        $prefix = ' |';
    }
    print qq[\n];
}
print qq[\n];

32 lines of code

Output from script


 dir_a           | dir_b           | dir_c           
 --------------- | --------------- | --------------- 
 Comic_Sans.ttf  | Comic_Sans.ttf  |                
                 | Courier_New.ttf | Courier_New.ttf
 Georgia.ttf     |                 |                
                 |                 | Monaco.ttf     
 Old_Fonts/      |                 |                
                 | Tahoma.ttf      |                


 

Any content of this website which has been created by Peter Campbell Smith is in the public domain