Peter Campbell Smith

All on the same line,
and am I being cited?

Weekly challenge 207 — 6 March 2023

Task 2

Task — The H-index for citations

We are given a list of the number of citations a researcher has received for each of his published papers, ordered from most cited to least. We are asked to write a script to compute the researcher’s H-index, which is the maximum n where the n'th number in the list is at least n.

Analysis

The logic behind this is not hard, but there are a few potential pitfalls. Firstly, the supplied array is zero-based, so the n'th number in the list is $list[$j] with n = $j + 1. We are therefore looking for the first $j where $list[$j] is less than $j + 1, and then the answer is $j - ie the number of entries before that one.

We need to allow for the case where the author has never been cited and the index is therefore zero - so he or she has an array of (0) or even (). We also need to remember the other edge case where all the papers have been cited enough - eg (4, 4, 4, 4) - when the loop runs off the end of the list and the answer is simply the number of papers.
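Those edge cases can be checked with a small function that returns the index instead of printing it. This is only a sketch under the same assumption as the task (the list is already sorted from most cited to least), and the name h_index_value is hypothetical rather than part of the script below.

```perl
use v5.28;
use warnings;

# Hypothetical helper (not in the original script): returns the H-index
# of a list that is already sorted from most cited to least.
sub h_index_value {
    my @list = @_;
    my $j;
    for ($j = 0; $j < scalar @list; $j ++) {
        # stop at the first paper with fewer than $j + 1 citations
        last if $list[$j] < $j + 1;
    }
    # $j is now the number of papers that passed the test
    return $j;
}

say h_index_value(0);            # never cited
say h_index_value();             # no papers at all
say h_index_value(4, 4, 4, 4);   # every paper is cited enough
```

The three calls print 0, 0 and 4: in the first two the loop stops (or never starts) with $j still at zero, and in the last it runs off the end of the list with $j equal to the number of papers.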

I reckon the best solution is a C-style for loop with $j declared outside it, so that we preserve the loop variable $j on leaving the loop. I considered folding the 'if' condition into the second parameter of the 'for' and leaving the body of the 'for' loop empty:


for ($j = 0; $j < scalar @list and $list[$j] >= $j + 1; $j ++) {
}

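but that relies on Perl evaluating the 'and' left to right and stopping as soon as the first test fails, so that $list[$j] is never read beyond the end of the list. Perl does document that short-circuit behaviour, but leaning on it in an empty-bodied loop is harder to read and is perhaps unwise.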
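For comparison, here is a sketch of the folded form as a complete function. The name h_index_folded is hypothetical, and the loop continues while each paper has at least $j + 1 citations. It leans on the documented short-circuit behaviour of 'and', so $list[$j] is only examined while $j is within the list.

```perl
use v5.28;
use warnings;

# Hypothetical variant (not in the original script): the test is folded
# into the loop condition and the loop body is left empty.
sub h_index_folded {
    my @list = @_;
    my $j;
    # 'and' short-circuits, so $list[$j] is read only while $j is in range
    for ($j = 0; $j < scalar @list and $list[$j] >= $j + 1; $j ++) {
    }
    return $j;
}

say h_index_folded(25, 8, 5, 3, 3);
say h_index_folded();
```

With an empty list the first test fails immediately and the second is never evaluated, so there is no 'uninitialized value' warning even under 'use warnings'.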

Script


#!/usr/bin/perl

# Peter Campbell Smith - 2023-02-20

use v5.28;
use utf8;
use warnings;

# We are given a list of the number of citations a researcher has received 
# for each of his published papers, ordered from most cited to least.
# We are asked to write a script to compute the researcher’s H-index, which is the maximum n
# where the n'th number in the list is at least n.

# Blog: http://ccgi.campbellsmiths.force9.co.uk/challenge/207/2

h_index(10, 8, 5, 4, 3);
h_index(25, 8, 5, 3, 3);
h_index(10, 9, 8, 7, 6, 5, 4, 3, 2, 1);
h_index(0);
h_index();
h_index(4, 4, 4, 4);

sub h_index {
    
    my (@list, $j);
    
    # loop over list (0-based!) to find the first $j where $list[$j] < $j + 1
    @list = @_;
    for ($j = 0; $j < scalar @list; $j ++) {
        last unless $list[$j] >= $j + 1;
    }
    say qq[\nInput:  \@citations = (] . join(', ', @list) . 
        qq[)\nOutput: $j];
}

Output


Input:  @citations = (10, 8, 5, 4, 3)
Output: 4

Input:  @citations = (25, 8, 5, 3, 3)
Output: 3

Input:  @citations = (10, 9, 8, 7, 6, 5, 4, 3, 2, 1)
Output: 5

Input:  @citations = (0)
Output: 0

Input:  @citations = ()
Output: 0

Input:  @citations = (4, 4, 4, 4)
Output: 4