PDL, The Perl Data Language

by levien on Wed 26 November 2008 // Posted in misc

PDL is an extension to Perl for numeric and scientific data processing. It was originally developed by astrophysicists as a free alternative to packages like IDL and Matlab. It is fast, memory-efficient and very powerful. I've found it most useful when you need to combine number crunching with the strengths of Perl: list and hash operations, regular expressions, and text parsing or output. Its main drawback is a rather steep learning curve, because the documentation is fragmented and not always clear. I've therefore collected some useful links and examples.
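
To give an idea of what mixing Perl's text handling with PDL looks like, here is a minimal sketch. The log format, the sensor names and the variable names are all made up for this illustration; only the PDL calls (pdl, nelem, avg, max) are real.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use PDL;

# Hypothetical input lines; the format is invented for this example.
my @lines = (
    "sensor=a temp=20.5",
    "sensor=b temp=21.0",
    "sensor=a temp=22.5",
    "this line does not match and is skipped",
);

# Perl side: parse with a regular expression and group values in a hash.
my %temps_by_sensor;
for my $line (@lines) {
    next unless $line =~ /^sensor=(\w+)\s+temp=([-\d.]+)$/;
    push @{ $temps_by_sensor{$1} }, $2 + 0;   # "+ 0" forces a number
}

# PDL side: turn each list into a piddle and compute some statistics.
for my $sensor (sort keys %temps_by_sensor) {
    my $temps = pdl($temps_by_sensor{$sensor});
    printf "%s: n=%d mean=%.2f max=%.2f\n",
        $sensor, $temps->nelem, $temps->avg, $temps->max;
}
```

The parsing and grouping stay in plain Perl, and PDL only takes over once the numbers are collected, which is roughly the division of labour I have in mind.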

Introduction material

Reference material

Official site

Add-ons

Installing PDL

  • In Ubuntu, you can simply install the pdl package using Synaptic.
  • In RPM/yum-based systems, install the package perl-PDL.
  • If you don't have root/sudo access, you can get PDL from CPAN and install it in a local directory. From the documentation:
    PDL depends on a number of other Perl modules for feature complete operation.
    These modules are generally available at the CPAN. The easiest way to 
    resolve these dependencies is to use the CPAN module to install PDL.
    Installation should be as simple as
    
    cpan install PDL # if the cpan script is in your path
    
    or if you don't have the cpan script try
    
    perl -MCPAN -e shell
    cpan> install PDL
    
    NOTE: if this is your first time running the cpan shell, you'll be prompted 
    to configure the running environment.
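
Once the installation has finished, a quick way to check that Perl can find PDL is to load it and print its version. This is only a minimal sketch. The commented-out use lib line shows a hypothetical location for a local install, so adjust or drop it to match your own setup.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumption: only needed if PDL was installed in a local directory.
# The path below is a hypothetical example, not a fixed location.
# use lib "$ENV{HOME}/perl5/lib/perl5";

use PDL;

# $PDL::VERSION is set by the PDL module itself.
print "PDL version $PDL::VERSION loaded\n";

# A tiny computation to confirm that the numeric core works.
my $check = sequence(3) + 1;
print "sequence(3) + 1 = $check\n";
```

If the use PDL line fails with "Can't locate PDL.pm in @INC", Perl is not looking in the directory where PDL was installed.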
    

Some examples of using PDL

#!/usr/bin/perl -w

use PDL;        # To include basic PDL functionality
use PDL::NiceSlice; # For a shorter "slicing" syntax
use strict;     # Always a good idea

# Create a 256-element vector of double-precision floats
my $vector = zeroes(256);

# Create a 100 x 100 matrix of bytes
# (see also http://pdl.sourceforge.net/PDLdocs/Core.html#datatype_conversions )
my $matrix = zeroes(byte, 100, 100);

# Set values with index 64-128 to a numerical sequence:
$vector->slice("64:128") .= sequence(65);


# Same thing, but with the shorter NiceSlice syntax:
$vector(64:128) .= sequence(65);

# Get length of first dimension
# (for alternative methods see http://pdl.sourceforge.net/PDLdocs/Core.html#nelem )
my $elements = $vector->getdim(0);
print "Vector has $elements elements\n";

# Get and print value at index 100
my $value = $vector->at(100);
print "Value at index 100 is: $value\n";

# Get all values > 32
my $largevalues = $vector->where($vector > 32);
print "There were " . nelem($largevalues) . " values > 32, namely: ";
print $largevalues . "\n";

# Replace all 0's with 42's
my $indices = which($vector == 0);
$vector->dice($indices) .= 42;

# Make a reversed copy (without ->copy you get a view that shares data with $vector)
my $reversed_vector = $vector(-1:0)->copy;    # -1 = last element

# Write both vectors to a file
# (you could use *FILEHANDLE instead of a filename)
my $file = "/tmp/bla.dat";
wcols($vector, $reversed_vector, $file);

# Read them back
my ($column1, $column2) = rcols($file);

# Put some values in our matrix:
$matrix(:,:) .= sequence(100) * 2;

# Create a (rather boring) PNG picture
my $red = $matrix;
my $green = transpose($matrix);
my $blue = $matrix->xchg(0,1)->slice("-1:0,:");  # xchg swaps dimensions

my $picture = zeroes(byte, 3, 100, 100);
# Double parentheses drop the selected dimension, so each slice is 100 x 100
$picture((0)) .= $red;
$picture((1)) .= $green;
$picture((2)) .= $blue;

wpic ($picture, "/tmp/foo.png");

# OK, we're done
exit(0);

# If you're new to PDL, the best way to start is by reading this:
# http://www.johnlapeyre.com/pdl/pdldoc/newbook/node4.html
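
One thing that tends to trip up newcomers is that a slice is a view on its parent, not a copy: assigning to the slice with .= writes through to the original data. The sketch below illustrates this with a small vector. The variable names are made up for the example, and it uses the plain slice method rather than NiceSlice so that it runs with nothing but use PDL.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use PDL;

my $parent = sequence(5);                      # values 0..4

# An independent reversed copy, taken before anything is modified.
my $copy = $parent->slice("-1:0")->copy;

# A reversed view: it shares its data with $parent.
my $view = $parent->slice("-1:0");

# Assigning to the view changes the parent as well ...
$view .= 7;

print "parent: $parent\n";   # every element is now 7
print "copy:   $copy\n";     # ... but the copy keeps the old values, reversed
```

So whenever you want to keep a snapshot of (part of) a piddle, call copy on the slice.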