Peter
Peter Campbell Smith

Sequential permutations

Weekly challenge 294 — 4 November 2024

Week 294: 4 Nov 2024

Task 1

Task — Consecutive sequence

You are given an unsorted array of integers, @ints. Write a script to return the length of the longest consecutive elements sequence. Return -1 if none is found. The algorithm must run in O(n) time.

Examples


Example 1
Input: @ints = (10, 4, 20, 1, 3, 2)
Output: 4
The longest consecutive sequence (1, 2, 3, 4).
The length of the sequence is 4.

Example 2
Input: @ints = (0, 6, 1, 8, 5, 2, 4, 3, 0, 7)
Output: 9

Example 3
Input: @ints = (10, 30, 20)
Output: -1

Analysis

Let's start withe questions. Firstly, the challenge says that @ints is comprised of integers, but the examples consist of only non-negative integers. I have assumed that non-negative integers was implied, but my solution can easily be adapted to work with negative ones as well.

Secondly, we have requirement to run a time which is O(n). But what is n? My solution certainly increases in execution time proportional to the length of @ints, but it also depends on factors such as the number of sequences, and the number of elements of @ints which are part of a sequence.

So I am loosely interpreting the O(n) requirement to mean we can't sort @ints, because most sort algorithms increase in execution time faster than the number of items to be sorted.

The essence of my solution is this:

  • Create an array @seen, where $seen[$j] == 1 if $j occurs (once or more often) in @ints. The array @seen will probably be sparse - that is, it will have undefined elements - but Perl handles sparse arrays quite well.
  • Loop over the elements of @seen sequentially. If $seen[$j] == 1 and $seen[$j - 1] == 1, then $j is part of a sequence, and we need only count the length of the sequence and keep track of the longest.

The first step above is probably O(n) as we are handling each element of @ints just once. I ran some timing experiments which confirmed that assertion, although the timings even for an 10 000 element array are in the milliseconds. The second step's run time is primarily proportional to the number of unique numbers in @ints, which could range from none to all of them.

I also wondered if rather than creating a sparse array, I could use a hash %seen. That has the advantage of coping with negative numbers. It does (of course!) work, but is noticeably slower (by a factor around 3), so I stuck with the array solution.

With @ints comprising 10 000 random numbers in the range 0 to 999 999 my sparse array solution takes under 0.2 seconds to find a sequence of around 90 consecutive numbers, and multiplying by 10 - so 100 000 random numbers in the range 0 to 9 999 999 finds a run of about 1000 in 1.5 seconds. So multiplying n by 10 increases the run time by a factor of about 8, which I think justifies my conclusion that it's about O(n).

Try it 

Try running the script with any input:



example: 1, 7, 3, 6, 2, 9, 11, 4

Script


#!/usr/bin/perl

# Blog: http://ccgi.campbellsmiths.force9.co.uk/challenge

use v5.26;    # The Weekly Challenge - 2024-11-04
use utf8;     # Week 294 - task 1 - Consecutive sequence
use warnings; # Peter Campbell Smith
binmode STDOUT, ':utf8';

consecutive_sequence(0, 10, 1, 12, 14, 2, 4, 3, 1, 16);
consecutive_sequence(10, 4, 20, 1, 3, 2);
consecutive_sequence(10, 30, 20);
consecutive_sequence(1000000, 1000001, 1000002, 42);
consecutive_sequence(1, 2, 1, 2, 3, 1, 2, 3, 4, 1, 2, 3, 4, 5, 7);
consecutive_sequence(1, 7, 3, 6, 2, 9, 11, 4);

sub consecutive_sequence {
    
    my (@ints, @seen, $sequence, $best, $end, $j, $this, $run);
    
    # create $seen[$j] = 1 for all $j in @ints
    @ints = @_;
    $seen[$_] = 1 for @ints;
    
    # loop over %seen
    $sequence = $best = 0;
    $end = -1;
    for $j (1 .. $#seen) {
        unless (defined $seen[$j]) {
            $sequence = 0;
            next;
        }
            
        # if consecutive elements of @seen exist, we have a sequence
        if (defined $seen[$j - 1]) {
            $sequence ++;
            if ($sequence > $best) {
                $best = $sequence;
                $end = $j;
            }
        }
    }
    
    # show results
    $run .= qq[$_, ] for ($end - $best) .. $end;
    say qq[\nInput:  \@ints = (] . join(', ', @ints) . ')';
    say '' . ($end > 0) ?
        sprintf ('%s %d: (%s)', 'Output:', $best + 1, substr($run, 0, -2)) :
        qq[Output: -1];
}

Output


Input:  @ints = (0, 10, 1, 12, 14, 2, 4, 3, 1, 16)
Output: 5: (0, 1, 2, 3, 4)

Input:  @ints = (10, 4, 20, 1, 3, 2)
Output: 4: (1, 2, 3, 4)

Input:  @ints = (10, 30, 20)
Output: -1

Input:  @ints = (1000000, 1000001, 1000002, 42)
Output: 3: (1000000, 1000001, 1000002)

Input:  @ints = (1, 2, 1, 2, 3, 1, 2, 3, 4, 1, 2, 3, 4, 
  5, 7)
Output: 5: (1, 2, 3, 4, 5)

Input:  @ints = (1, 7, 3, 6, 2, 9, 11, 4)
Output: 4: (1, 2, 3, 4)

 

Any content of this website which has been created by Peter Campbell Smith is in the public domain