Peter
Peter Campbell Smith

Striped arrays
and balanced splits

Weekly challenge 211 — 3 April 2023

Week 211 - 3 Apr 2023

Task 2

Task — Split same average

You are given an array of integers. Write a script to find out if the array elements can be split into two separate arrays whose averages are the same.

Analysis

Suppose the input array is @array. we are trying to find @array1 and @array2 which between them contain all the members of @array, and whose means are equal.

I can't see any solution which doesn't involve a search across a number of possible solutions, which could be quite large in number if @array is long. My initial thought was generate all permutations of @array, and then try splitting each of these at all possible points and testing all the results. But we can do better!

The first saving comes from the fact that the order of the members within @array1 and @array2 does not matter: the mean of 1, 2, 3, 4 is the same as that of 3, 1, 4, 2 or any other ordering. For that reason we can use combinations rather than permutations, which are far fewer in number.

Let's stipulate that $array1 is no longer than @array2. As an example, let's suppose @array has 9 members. We need to look at all the possible sets of values of @array1 with 1, 2, 3 or 4 members. We can stop at 4, because we know that @array1 is shorter than @array2 and we've looked at all the possible sets of values for @array1 up to length 4. But in doing so we've also covered all the possible sets of values for @array2, which will be 5, 6, 7 or 8 members long.

So here is the algorithm in words:

  • Calculate $sum and $count of the elements of @array
  • $loop from 1 to $count / 2, or $count / 2 - 0.5 if $count is even
    • Loop over all combinations of $loop members from @array, and for each:
    • Calculate $sum1 and $count1 and thus $mean1
    • Calculate $sum2 = $sum - $sum1 and $count2 = $count - $count1 and thus $mean2
    • If $mean1 == $mean2 then stop! - we have a pair of arrays with equal means
  • And if we get here without stopping then no such pair exists.

    Interestingly, there is no need to construct @array2 as we can calculate its mean without doing so. However, I do construct it just so that I can list it and check that the result is correct.

    This algorithm copes with 20 random numbers in the range (0 .. 14) in just a few seconds even if no mean-equal subsets exist - and over 600 000 combinations are tested.

    Try it 

    Unfortunately I can't let you try it from this script, because the Algorithm::Combinatorics module is not installed, and I don't have access to the server to install it. You are welcome to copy my script onto your server and try it.

  • Script

    
    #!/usr/bin/perl
    
    use v5.16;    # The Weekly Challenge - 2023-04-03
    use utf8;     # Week 211 task 2 - Split same average
    use strict;   # Peter Campbell Smith
    use warnings; # Blog: http://ccgi.campbellsmiths.force9.co.uk/challenge
    
    use Algorithm::Combinatorics ('combinations');
    
    my ($j, @data);
    
    equal_means(1, 2, 3, 4, 5, 6, 7, 8);
    equal_means(1, 3);
    equal_means(10, 1, 7, 5, 3, 11, 8, 4, 2, 9);
    equal_means(6, 3, 1, 9, 12, 25, 18, 20);
    equal_means(8, 6, 3, 5, 1, 4, 2, 7);
    
    for $j (0 .. 19) {
        $data[$j] = int(rand(15));
    }
    equal_means(@data);
    
    sub equal_means {
        
        my (@array, $count, $mean, $sum, $count1, @array1, $mean1, $sum1, 
            $count2, $sum2, $mean2, $comb, $iter, $d, $array2p, $combs);
        
        # initialise
        @array = @_;
        ($count, $sum, $mean) = stats(@array);
        
        # loop over sizes of @array1 to consider
        OUTER: for $count1 (1 .. int($count / 2)) {
            
            # loop over combinations from @array of that size
            $iter = combinations(\@array, $count1);
            while ($comb = $iter->next) {
    
                # calculate @array1 data
                $combs ++;
                @array1 = @$comb;
                ($count1, $sum1, $mean1) = stats(@array1);
                
                # deduce @array2 data
                $count2 = $count - $count1;
                $sum2 = $sum - $sum1;
                $mean2 = $sum2 / $count2;
                
                # means match - result!
                if ($mean1 == $mean2) {
                    
                    # format and print
                    $array2p = ', ' . join(', ', @array) . ', ';
                    for $d (@array1) {
                        $array2p =~ s|, $d, |,|;
                    }
                    say qq[\nInput:  (] . join(', ', @array) . q[)];
                    say qq[Output: true];
                    say qq[   array1: (] . join(', ', @array1) . q[)];
                    say qq[   array2: (] . substr($array2p, 1, -2) . q[)];
                    say qq[   mean = $mean1 (after $combs combinations tested)];
                    return;
                }
            }
        }
        say qq[\nInput:  (] . join(', ', @array) . q[)];
        say qq[Output: false (after $combs combinations tested)];
    }
    
    sub stats {
        
        my ($sum, $count, $d);
        
        $sum = $count = 0;
        for $d (@_) {
            $sum += $d;
            $count ++;
        }
        return ($count, $sum, $sum / $count);
    }
            
    

    Output

    
    Input:  (1, 2, 3, 4, 5, 6, 7, 8)
    Output: true
       array1: (1, 8)
       array2: (2, 3, 4, 5, 6, 7)
       mean = 4.5 (after 15 combinations tested)
    
    Input:  (1, 3)
    Output: false (after 2 combinations tested)
    
    Input:  (10, 1, 7, 5, 3, 11, 8, 4, 2, 9)
    Output: true
       array1: (10, 2)
       array2: (1, 7, 5, 3, 11, 8, 4, 9)
       mean = 6 (after 18 combinations tested)
    
    Input:  (6, 3, 1, 9, 12, 25, 18, 20)
    Output: true
       array1: (6, 3, 18, 20)
       array2: (1, 9, 12, 25)
       mean = 11.75 (after 107 combinations tested)
    
    Input:  (8, 6, 3, 5, 1, 4, 2, 7)
    Output: true
       array1: (8, 1)
       array2: (6, 3, 5, 4, 2, 7)
       mean = 4.5 (after 12 combinations tested)
    
    Input:  (10, 3, 6, 8, 8, 7, 6, 12, 8, 5, 2, 5, 7, 11, 0, 14, 14, 13, 8, 12)
    Output: false (after 616665 combinations tested)