Augustina's Technological Blog

Technology, Perl, Linux, and a Woman's perspective on the FOSS community

Archive for the ‘Perl’ Category

Getting Started with Devel::NYTProf

I went to OSCON 2010 and saw Tim Bunce’s talk on Devel::NYTProf. For my local Perl User Group (SPUG) I decided to do a talk on how to get started with NYTProf and to provide an overview of the benefits of code profiling. This entry is a summary of the talk that I gave.

What is Devel::NYTProf?

Devel::NYTProf is a source code profiler: a tool that gives you another perspective on how your code is functioning. People typically use NYTProf to find places in their code that could be optimized to run faster, but it can also be used to better understand what happens when your program executes.

Why should I use Devel::NYTProf?

If you’re working with unfamiliar code, NYTProf provides a high-level overview of which parts need the most attention. It breaks the source code down line by line and reports execution times for each block and subroutine. NYTProf highlights hotspots in your code, which helps to identify priority areas that need improvement. Its visualization tools are not only useful for communicating with other folks who might not be as familiar with the source code, but are also useful for right-brained artsy-fartsy visually-oriented people like myself 😉

How do I use Devel::NYTProf?

1. Install the module Devel::NYTProf (I used CPAN)

Run the following command:

perl -d:NYTProf my-perl.pl

2. Set options to customize behavior

Command line:

 -addpid=1

Environment variable:

export NYTPROF=file=/tmp/nytprof.out

(See the POD for more info about custom options)

3. Format output

nytprofhtml

When you run NYTProf, it creates an output file in the same location where you ran it (unless you’ve specified something different in the options). To read the output, format it using NYTProf’s formatters. These are discussed in the POD.

If you are using the HTML formatter, by default it creates a folder called nytprof in the same directory you run it from. Open the index.html file contained in that directory to view the formatted profile.

How do I read the output?

Once you’ve formatted the output into something human-readable, making sense of what’s being reported is the next step. Besides an HTML layout, there are visualizations of the code in Graphviz and a Treemap based on subroutine frequency and inclusive time.

The key terms to be aware of when profiling are Inclusive Time and Exclusive Time. Inclusive Time is the total time to run a subroutine including any additional subroutines being called within the subroutine. This is also referred to as the Cumulative Time in other profilers. Exclusive Time is the total time spent in this subroutine excluding calls to any other subroutines.
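To make the two terms concrete, here is a minimal sketch that times a subroutine by hand with Time::HiRes. It is not how NYTProf measures anything internally; the subroutine names outer and inner and the sleep durations are made up for illustration.

```perl
use strict;
use warnings;
use Time::HiRes qw(time sleep);

# Hypothetical child subroutine: its time counts toward outer()'s
# inclusive time, but not toward outer()'s exclusive time.
sub inner {
    sleep 0.05;
}

# Returns (inclusive, exclusive) time in seconds for one call.
sub outer {
    my $start = time();

    sleep 0.02;                          # work done by outer() itself

    my $child_start = time();
    inner();
    my $child_time = time() - $child_start;

    my $inclusive = time() - $start;     # everything, including inner()
    my $exclusive = $inclusive - $child_time;
    return ($inclusive, $exclusive);
}

my ($inclusive, $exclusive) = outer();
printf "inclusive: %.3fs  exclusive: %.3fs\n", $inclusive, $exclusive;
```

The inclusive figure should always be the larger of the two, because it contains the time spent in inner() as well.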

Another important thing to note about the reported times is that there are no hard and fast rules for what is considered the ideal time for any subroutine. Everything is relative. Look at things in red to see if the reported time makes sense given the nature of the data and function being performed. The hotspots only indicate what has taken the most time in the program; they do not mean that something is necessarily inefficient.

Statement Profiling

NYTProf reports on blocks of code in addition to subroutines. It measures the time between entering one perl statement and entering the next.

Perl makes heavy use of additional system calls like regexes, operating system functions, and user input. NYTProf intercepts perl opcodes and reports them as subroutines. This allows you to evaluate whether a block of code was inefficient due to something within your code or if the system call is what’s slowing you down. If there is an alternative to using that system call, you can see how much of an improvement it makes by implementing it and re-running the profiler.

Use the -slowops=N option to customize how NYTProf reports opcodes.

Subroutine Profiling

NYTProf determines subroutine durations by measuring the time between entering a subroutine and leaving it. As discussed earlier, it accumulates both Inclusive and Exclusive times for each subroutine. It also reports how frequently a subroutine was called, both for the whole program and in a given location. The profile reports separate values for each location where a subroutine was called.

Recursive Subroutines

NYTProf differentiates regular subroutines from recursive ones. It reports the Inclusive time for the outermost call and also indicates that the subroutine is recursive. It also reports the maximum recursion depth.

Application Profiling

NYTProf is a fairly sophisticated profiler in that it can support threads and forks. For complex applications, it records extra info in a data file, including:

  • Filename
  • Line ranges of subroutines
  • Fork detection

It also supports Apache Profiling with mod_perl hooks. See CPAN module notes for more info on how to hook into Apache.

A note about optimization

Optimization is not always necessary. You really have to ask yourself if the amount of time to execute something makes sense given the amount of work. If the task is pretty hairy, chances are you’re better off finding a different area to optimize.

A lot of small changes add up. If you can make a few small optimizations to subroutines that are called the most frequently, you can shave seconds off of your times.

Devel::NYTProf just finds the hotspots (kind of like SQL Explain). It really doesn’t know anything about how your program is supposed to work; it just gives you a bunch of interesting statistics 🙂

More Info

Tim Bunce’s Devel::NYTProf Talk

Devel::NYTProf POD

 

Written by missaugustina

February 5, 2011 at 5:15 am

Posted in Perl, Programming

An Introduction to Object-Oriented Perl

I’ve been using Perl for a few years now and I am finally getting to a place where I’m working with more complex problems. Traditionally I’ve written little utility scripts or handlers for small specific tasks. In my current position, one of my roles is writing ad-hoc scripts for bulk reporting and bulk updates to a large number of database records for client requests. As I’m passing some of these duties on to my coworkers, I realized I needed a template to ensure consistency and to make their lives easier. I’ve also been tasked with a larger project for which I’ve decided the solution is to make a single Object-Oriented API in our code base that can be called by many different sources.

My original training is in Java/C++ and Object-Oriented Design. I wanted to approach both of these problems using my OO background, but I was having trouble grasping how OO really works in Perl. I’ve scoured the internet and pored over books, and none of them could really answer my questions. Finally, I sat down with someone recently who answered my “why why why” questions (and who also proofread this entry for accuracy! Thanks Dave!!). I decided I should document this here in hopes that it might help others who are in the same conundrum.

This article will be most useful if you already understand OO and you understand Perl but you just want to know from a bottom up perspective how OO works in Perl.

Key Concepts

Unlike languages like C++ and Java where Object-Orientation was considered during the language development, in Perl it was more of an afterthought.

Bless

Bless is a built-in Perl function that turns any reference into an object. While generally it’s good practice to use hash references, there are many differing schools of thought on Perl OO, including inside-out objects. These techniques are beyond the scope of this particular article (bad OO pun haha) so I won’t discuss them here.

There are 2 key differences between references and blessed references (aka objects). An object contains an OBJECT flag telling the interpreter that it’s an object. An object also contains a string with the name of its class so the interpreter can look up methods called on the object that are defined in its class.

When you instantiate an object in your executable Perl code, the Perl interpreter stores the class name in the object data. When a method is called on the object instance, Perl attempts to resolve the method at runtime. Using the class name, the interpreter looks for the method in the class definition. If it does not find the method, it looks in @ISA to see if the class is a child and if the method is defined in the parent class. If it does not find the method there either, the interpreter throws a “Can’t locate object method” exception.
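The lookup described above can be seen in a small sketch. The class names Animal and Dog and the method names are invented for this example; only @ISA and the error message come from Perl itself.

```perl
use strict;
use warnings;

package Animal;

sub new {
    my ($class) = @_;
    return bless {}, $class;
}

sub speak { return "generic noise" }

package Dog;

# Dog is a child of Animal: methods not found in Dog are looked up here.
our @ISA = ('Animal');

sub fetch { return "fetching" }

package main;

my $dog   = Dog->new;      # new() is found in Animal via @ISA
my $class = ref $dog;      # the object still remembers it is a Dog
my $noise = $dog->speak;   # speak() is also resolved through @ISA

# fly() is defined nowhere, so the call dies at runtime.
my $ok    = eval { $dog->fly; 1 };
my $error = $@;

print "class: $class, speak: $noise\n";
print "error: $error";
```

Wrapping the failing call in eval keeps the script running so that the error text can be inspected.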

Here are examples using Devel::Peek to view what a reference and an object look like:

First examine what a hash reference looks like:

perl -MDevel::Peek -wle 'my $not_an_object = {}; print Dump($not_an_object)'

Output:

SV = RV(0x90e492c) at 0x90e4920
REFCNT = 1
FLAGS = (PADMY,ROK)
RV = 0x90e4880
SV = PVHV(0x90ef340) at 0x90e4880
REFCNT = 1
FLAGS = (SHAREKEYS)
ARRAY = 0x0
KEYS = 0
FILL = 0
MAX = 7
RITER = -1
EITER = 0x0

Now examine what a blessed hash reference (aka object) looks like:

perl -MDevel::Peek -wle 'my $object = bless {}, "main"; print Dump($object)'

Output:

SV = RV(0x979d92c) at 0x979d920
REFCNT = 1
FLAGS = (PADMY,ROK)
RV = 0x979d880
SV = PVHV(0x97a8340) at 0x979d880
REFCNT = 1
FLAGS = (OBJECT,SHAREKEYS)
STASH = 0x979d770 "main"
ARRAY = 0x0
KEYS = 0
FILL = 0
MAX = 7
RITER = -1
EITER = 0x0

Notice the second FLAGS entry and the STASH entry. STASH contains both the string of the class name and the address where the string is stored in memory. FLAGS contains the value OBJECT which tells the Perl interpreter that $object is an object.
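The same difference can be checked from ordinary Perl code, without looking at the internals. This is a small sketch using the built-in ref function and the blessed and reftype functions from the core Scalar::Util module; the variable names mirror the one-liners above.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed reftype);

my $not_an_object = {};
my $object        = bless {}, 'main';

# ref() returns the reference type for a plain reference
# and the class name for a blessed one.
my $plain_ref  = ref($not_an_object);
my $object_ref = ref($object);

# blessed() returns undef for a plain reference
# and the class name for an object.
my $plain_class  = blessed($not_an_object);
my $object_class = blessed($object);

# reftype() reports the underlying data type, ignoring the blessing.
my $underlying = reftype($object);

print "plain: $plain_ref, object: $object_ref, underlying: $underlying\n";
```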

Intrinsic vs. Explicit Methods

In Perl, a method does not intrinsically know what object or class it’s called on. When a method is called, the Perl interpreter passes either a reference to the object or the class name as the first argument, depending on whether the method was called on an object or on the class itself. For instance, the new() constructor is called on the class, so the first argument the interpreter passes in is a string containing the class name. For a method called on an object instance, the interpreter passes in a reference to that object.

While it’s good practice to call the first argument $self or $me, you can call it whatever you want.

Here are two examples of handling arguments in methods:

sub my_method {
    my $self = shift;
    my ($arg1, $arg2) = @_;
}

sub my_method {
    my ($self, $arg1, $arg2) = @_;
}

Here’s an example of calling the method. For this example the class is “MyClass”.

my $my_class = MyClass->new();

$my_class->my_method($arg1, $arg2);

When calling a class method without an object instance, the first argument is a string representing the name of the class.

MyClass->my_method($arg1, $arg2);

Class Methods vs. Object Methods

In Perl, you don’t have to instantiate an object to call its class methods, so you have to provide your own enforcement if this is a possible risk.

When a method is called by a class, the first argument passed is a string containing the name of the class. When a method is called by an object, the first argument is a reference to the object. To enforce calling methods by object, check that the first argument passed is a reference.


die "Please call this method on an object instance." if (! ref $self);
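Here is a self-contained sketch of both points: what arrives as the first argument, and how the ref check rejects a class-method call. The methods who_called and object_only are hypothetical names made up for this example.

```perl
use strict;
use warnings;

package MyClass;

sub new {
    my ($class) = @_;
    return bless {}, $class;
}

# Reports what Perl passed in as the first argument.
sub who_called {
    my ($invocant) = @_;
    return ref($invocant)
        ? "object of class " . ref($invocant)
        : "class name '$invocant'";
}

# Refuses to run unless it was called on an object instance.
sub object_only {
    my ($self) = @_;
    die "Please call this method on an object instance.\n" if !ref $self;
    return "ok";
}

package main;

my $object     = MyClass->new;
my $via_object = $object->who_called;    # first argument is a reference
my $via_class  = MyClass->who_called;    # first argument is a string

# Calling the guarded method on the class dies; eval captures the message.
my $ok    = eval { MyClass->object_only; 1 };
my $error = $@;

print "$via_object\n$via_class\n";
print "error: $error";
```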

Unenforced Privacy

There are no private methods in Perl except by convention. To designate a method as private, the general practice is to begin the name with an _ character. While this does not prevent use of this method in other contexts, it makes your code and your intentions easier to read.

sub _my_private_method {}

sub my_public_method {}

Data Attributes

One of the advantages of using a blessed hash ref for your object is that you can manage basic data attributes without using accessor methods or having to maintain a long list of variables. I believe accessor methods should be used under two circumstances: 1. when they add some value, additional functionality, or enforcement of data types and data values, and 2. when a data attribute needs to be modified from outside the class definition. When dealing with simple values used only inside the class definition, maintaining a long list of variables and getter/setter methods is overkill. This is my opinion and I’m sure there are plenty of others out there who differ, so do what works best for the problem you are trying to solve. I’m just presenting an alternative that was shown to me that I’ve found to be pretty handy.


$self->{NAME} = $name;

Instead of having to create and maintain a get/set_name method, your object can access its own “name” with $myobject->{NAME}.

Garbage Collection

Once the last reference to an object goes away (for example, when the variable holding it goes out of scope), the Perl interpreter calls DESTROY on it before freeing the memory. Unless your class has specific requirements for what needs to happen when the object is destroyed, you don’t need to implement the DESTROY method.

To implement the DESTROY method, define a method called DESTROY in all caps.

If your class defines its own DESTROY, Perl does not automatically call DESTROY on super classes. To DESTROY a super class, you must explicitly call $self->SUPER::DESTROY.
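A minimal sketch of chaining destructors, assuming two made-up classes called Base and Derived. Each DESTROY records its name in a log array so the order of the calls is visible.

```perl
use strict;
use warnings;

package Base;

sub new {
    my ($class) = @_;
    return bless {}, $class;
}

sub DESTROY {
    push @main::log, "Base::DESTROY";
}

package Derived;

our @ISA = ('Base');

sub DESTROY {
    my ($self) = @_;
    push @main::log, "Derived::DESTROY";
    # Without this line, Base::DESTROY would never run for this object.
    $self->SUPER::DESTROY();
}

package main;

our @log;

{
    my $object = Derived->new;
}   # $object goes out of scope here, so DESTROY is called

print join(", ", @log), "\n";
```

The child's cleanup runs first and then hands off to the parent, which is usually the order you want when the child depends on resources owned by the parent.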

How To

As stated in the previous section, there are many philosophies about how to do Object Oriented Perl. The intent of this article is to provide basic instructions for getting started with writing a simple class and to explain some of the Perl internals, because I was having a hard time finding the information I needed and up-to-date examples. Once you’ve gotten a general idea of the basics, it’s worthwhile to explore different OO Perl philosophies.

1. Create a class file ending in .pm

For this example, create a file called MyClass.pm and save it where Perl can find it.

Define a new() method that blesses a reference into the class and returns it. Define other methods as appropriate, and end the file with 1; so that the file returns a true value when it is loaded.


package MyClass;
use strict;
use warnings;

sub new {
    # first argument is the class name
    my ($class, %args) = @_;

    # create an anonymous hash ref called $self
    # bless $self to set it as obj with class name $class
    my $self = bless {}, $class;

    # call a method on $self to set an initial value
    if (defined $args{name}) {
        $self->set_name($args{name});
        # alternatively we can set the name using
        # $self->{NAME} = $args{name};
    } else {
        die "Please specify a name for the $class instance.";
    }

    # return the obj instance
    return $self;
}

sub set_name {
    my ($self, $name) = @_;

    # if called as a class method, $self is a string, not a ref
    if (! ref $self) {
        die "Please call this method on an object instance.";
    } else {
        # assign class data attributes to keys within the $self hash
        $self->{NAME} = "My name is " . $name;
    }
}

sub print_name {
    my ($self) = @_;
    print $self->{NAME}, "\n";
}

1;

2. Create a Perl executable file ending in .pl. Instantiate your class by creating a variable and using the new() method. Call a method on your class to see if it works!


#!/usr/bin/perl -w
use strict;
use MyClass;

# create an object by instantiating MyClass
my $object = MyClass->new(name => 'Eugene');

# call an object method on the MyClass instance
$object->print_name();

# calling set_name as a class method dies with an error
MyClass->set_name('Marsha');

Written by missaugustina

June 6, 2010 at 10:31 pm

Posted in Perl, Programming

FileMaker ODBC Fetch Forward Error

After scouring the internet for this mysterious “Fetch Forward” error I sometimes get when migrating FileMaker records to other database formats via ODBC and Perl DBI, I discovered what causes it.

The error is seemingly random and goes like this:
[FileMaker][ODBC FileMaker Pro driver][FileMaker Pro]Unknown error (SQL-HY000)
[FileMaker][ODBC FileMaker Pro driver]An attempt to fetch forward has failed for table: TABLENAME (SQL-HY000)(DBD: st_fetc/SQLFetch err=-1) at myperlscript.pl line 123.

This occurs when FileMaker drops the ODBC connection or is unable to respond in a timely manner.

Solutions are as follows:
1) Set up your access script in a while loop so that when it aborts due to an error it will resume after x amount of time.
2) Minimize the number of indexes (if you had to recover a corrupted database check for indexes in your field definitions, FileMaker likes to create them for you).
3) Minimize the number of calculated fields and summary fields.  Only use what is absolutely necessary.
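The first solution can be sketched as a small retry helper. The function name with_retries and its max_attempts and delay options are made up for this example, and the failing action below only simulates the error so that the sketch runs without a FileMaker connection. In a real script the code reference would hold the DBI calls (with RaiseError set, so that a failed fetch dies).

```perl
use strict;
use warnings;

# Hypothetical helper: run $action, retrying after a pause if it dies.
sub with_retries {
    my ($action, %opt) = @_;
    my $max   = $opt{max_attempts} // 5;
    my $delay = $opt{delay}        // 30;   # seconds between attempts

    for my $attempt (1 .. $max) {
        my @result;
        my $ok = eval { @result = $action->(); 1 };
        return @result if $ok;

        warn "Attempt $attempt failed: $@";
        die "Giving up after $max attempts\n" if $attempt == $max;
        sleep $delay if $delay;
    }
    return;
}

# Simulated flaky action: fails twice, then succeeds.
my $calls = 0;
my @rows  = with_retries(
    sub {
        $calls++;
        die "An attempt to fetch forward has failed\n" if $calls < 3;
        return ('row 1', 'row 2');
    },
    max_attempts => 5,
    delay        => 0,    # no pause, so the demo finishes immediately
);

print "Succeeded after $calls calls with ", scalar(@rows), " rows\n";
```

If the connection itself was dropped, the action would also need to reconnect before retrying, and a migration loop would need to remember which records were already copied.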

Written by missaugustina

October 29, 2007 at 1:03 pm

Posted in Perl, Programming

Inserting Line Breaks Into Text

This will force a line break after every 10th character.

$count = 0;
$_ = '1PATonthebackbackpatonthefoot2PAT3PAT4PAT';

s{
    .
}{
    if ( (++$count % 10) == 0) {
        $& . "\n";
    } else {
        $&;
    }
}gex;

print;
print "\n";

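The same substitution can be applied to a named scalar instead of $_. Note that this version matches \S instead of ., so only non-whitespace characters are counted.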

$count = 0;
$string = '1PATonthebackbackpatonthefoot2PAT3PAT4PAT';
#$_ = $string;   # <--- another option

$string =~ s{
    \S
}{
    if ( (++$count % 10) == 0) {
        $& . "\n";
    } else {
        $&;
    }
}gex;

print $string;
print "\n";

This was courtesy of the Perl Cookbook!
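For comparison, here is a shorter variant that is not from the Cookbook. Rather than counting characters one at a time, it captures each run of 10 characters and appends a newline. It is a sketch that assumes the input contains no newlines of its own, since . does not match them.

```perl
use strict;
use warnings;

my $string = '1PATonthebackbackpatonthefoot2PAT3PAT4PAT';

# Capture each run of exactly 10 characters and put a newline after it.
# The copy-then-substitute idiom leaves $string itself unchanged.
(my $wrapped = $string) =~ s/(.{10})/$1\n/g;

print $wrapped, "\n";
```

The counter-based version is still the more flexible one, because the replacement block can do anything it likes on each match.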

Written by missaugustina

April 6, 2007 at 6:04 pm

Posted in Perl, Programming

FileMaker and ODBC

One of the projects I’m working on right now is replicating a FileMaker Pro 6.0 database (hosted on FileMaker Server 5.5) with a MySQL database. MySQL is more robust and more efficient, and it handles connectivity a lot better. Even through ODBC, FileMaker is still a pig.

The best information and code samples I’ve found for interacting with a FileMaker database (via ODBC) with Perl come from FmPro Migrator. In particular there are code samples that have been extremely helpful for me in getting my databases talking to each other. One such example is here.

For older versions of FileMaker, like the one I’m using, you’ll want to set up the DSN on Windows. FileMaker ships with its own ODBC driver, and unless you want to pay more money for ODBC connectivity (there are some nice options out there), set theirs up on Windows.

To set up your DSN I recommend this walk through.

If you want to know specifically what SQL statements work with FileMaker’s ODBC, this document is extremely helpful (it’s a pdf):
FileMaker ODBC and JDBC Developers Guide

You’ll need Perl’s DBI and DBD::ODBC modules in order to connect.

Here are samples of simple programs I used to connect:

odbctest.pl

#!/usr/bin/perl

use DBI;

my $db_connect_string_fm = 'fmp_dsn';
my $fm_db_name = 'ODBCTest';
my @rowdata = ();

my $fm_dbh = DBI->connect("dbi:ODBC:$db_connect_string_fm", "", "", {RaiseError => 1, PrintError => 1, AutoCommit => 1})
    or die "Can't connect to the Filemaker $db_connect_string_fm database: $DBI::errstr\n";

my $fm_sth = $fm_dbh->prepare("select ID, Name from $fm_db_name where ID>42600");
$fm_sth->execute();

while (@rowdata = $fm_sth->fetchrow_array()) {
    my ($id, $name) = @rowdata;
    print "id:$id name:$name\n";
}

$fm_sth->finish();

$fm_dbh->disconnect or warn "Can't disconnect from Filemaker $db_connect_string_fm database: $DBI::errstr\n";

Here’s an example of updating (be sure you don’t have AutoCommit set to 0):

odbcdatetest.pl

#!/usr/bin/perl

# test mod date comparison

use DBI qw(:sql_types);

my $today = "\'3/27/2007\'";
my $db_connect_string_fm = 'fmp_dsn';
my $fm_db_name = 'ODBCTest';
my @rowdata = ();

my $fm_dbh = DBI->connect("dbi:ODBC:$db_connect_string_fm", "", "", {RaiseError => 1, PrintError => 1, AutoCommit => 1})
    or die "Can't connect to the Filemaker $db_connect_string_fm database: $DBI::errstr\n";

# do() runs the statement immediately and returns the number of rows affected
my $rows_updated = $fm_dbh->do("update $fm_db_name set Last_Export=$today");

my $fm_sth2 = $fm_dbh->prepare("select ID, Last_Export from $fm_db_name where ID>42600");
$fm_sth2->execute();

while (@rowdata = $fm_sth2->fetchrow_array()) {
    my ($id, $date) = @rowdata;
    print "id:$id last_export:$date\n";
}

$fm_sth2->finish();

$fm_dbh->disconnect or warn "Can't disconnect from Filemaker $db_connect_string_fm database: $DBI::errstr\n";

Regarding field types, FileMaker itself will let you insert values into fields that don’t correspond with a field’s defined type. Make sure your fields are defined correctly. If you try to use a WHERE clause on a text field and you pass in a number, the ODBC connection will complain (even if the data in the database doesn’t follow this rule). The same is true when using a SELECT statement: if the field in FileMaker is defined as a number but the actual data is text, ODBC will treat it as a number and you won’t get the results you expect.

When passing text values through SQL, the values must be surrounded by single quotes, for example, 'text'. In a double-quoted Perl string the single quotes can be written as-is; escaping them as \'text\' also works but is not required. Dates and times need to be surrounded by curly braces: {1/1/2007}.
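As a rough sketch of those quoting rules, the two helpers below build the literal pieces of a statement. The names sql_text and sql_date are made up for this example. Doubling an embedded single quote is the standard SQL convention and is an assumption here; I have not checked it against FileMaker's ODBC driver.

```perl
use strict;
use warnings;

# Hypothetical helper: wrap a text value in single quotes.
# Embedded quotes are doubled (standard SQL; an assumption for FileMaker).
sub sql_text {
    my ($value) = @_;
    (my $escaped = $value) =~ s/'/''/g;
    return "'" . $escaped . "'";
}

# Hypothetical helper: wrap a date or time in curly braces.
sub sql_date {
    my ($date) = @_;
    return '{' . $date . '}';
}

my $table = 'ODBCTest';
my $sql   = "select ID from $table where Name = " . sql_text("O'Brien")
          . " and Last_Export = " . sql_date('1/1/2007');

print "$sql\n";
```

The resulting string can be passed to prepare or do in the same way as the hand-written statements in the scripts above.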

Written by missaugustina

March 27, 2007 at 6:06 pm

Posted in Perl, Programming