Getting Started with Devel::NYTProf
I went to OSCON 2010 and saw Tim Bunce’s talk on Devel::NYTProf. For my local Perl User Group (SPUG) I decided to do a talk on how to get started with NYTProf and to provide an overview of the benefits of code profiling. This entry is a summary of the talk that I gave.
What is Devel::NYTProf?
Devel::NYTProf is a source code profiler. It is a tool that gives you another perspective on how your code is functioning. Typically people use NYTProf to find places in their code that could be optimized to run faster, however it can also be used to better understand what is happening when your program executes.
Why should I use Devel::NYTProf?
If you’re working with unfamiliar code, NYTProf provides a high level overview on what parts you need to pay the most attention to. It breaks the source code down line by line and reports execution times for each block and subroutine. NYTProf highlights hotspots in your code which helps to identify priority areas that need improvement. NYTProf provides visualization tools that are not only useful for communicating to other folks who might not be as familiar with the source code, but are also useful for right-brained artsy-fartsy visually-oriented people like myself 😉
How do I use Devel::NYTProf?
1. Install the module Devel::NYTProf (I used CPAN)
Run the following command:
perl -d:NYTProf my-perl.pl
2. Set options to customize behavior
(See the POD for more info about custom options)
3. Format output
When you run NYTProf, it creates an output file in the same location where you ran it (unless you’ve specified something different in the options). To read the output, format it using NYTProf’s formatters. These are discussed in the POD.
If you are using the HTML formatter, by default it creates a folder called nytprof in the same directory you run it from. Open the index.html file contained in that directory to view the formatted profile.
How do I read the output?
Once you’ve formatted the output into something human-readable, making sense of what’s being reported is the next step. Besides an HTML layout, there are visualizations of the code in Graphiz and a Treemap based on subroutine frequency and inclusive time.
The key terms to be aware of when profiling are Inclusive Time and Exclusive Time. Inclusive Time is the total time to run a subroutine including any additional subroutines being called within the subroutine. This is also referred to as the Cumulative Time in other profilers. Exclusive Time is the total time spent in this subroutine excluding calls to any other subroutines.
Another important thing to note about the reported times is that there are no hard and fast rules for what is considered the ideal time for any subroutine. Everything is relative. Look at things in red to see if the reported time makes sense given the nature of the data and function being performed. The hotspots only indicate what’s taken the most time in the program, it does not mean that something is necessarily inefficient.
NYTProf reports on blocks of code in addition to subroutines. It measures the time between entering one perl statement and entering the next.
Perl makes heavy use of additional system calls like regexes, operating sytem functions, and user input. NYTProf intercepts perl opcodes and reports them as subroutines. This allows you to evaluate whether a block of code was inefficient due to something within your code or if the system call is what’s slowing you down. If there is an alternative method to using that system call then you can see how much an improvement it makes by implementing it and re-running the profiler.
Use the -slowops=N option to customize how NYTProf reports opcodes.
NYTProf determines subroutine durations by measuring the time between entering a subroutine and leaving it. As discussed earlier, it accumulates both Inclusive vs Exclusive times for each subroutine. It also reports how frequently a subroutine was called both for the whole program and in a given location. The profile reports separate values for each location where a subroutine was called.
NYTProf differentiates regular subroutines from recursive ones. It reports the Inclusive time for outermost call and also indicates that the subroutine in recursive. It also reports the maximum recursion depth.
NYTProf is a fairly sophisticated profiler in that it can support threads and forks. For complex applications, it records extra info in a data file, including:
- Line ranges of subroutines
- Detects forks
It also supports Apache Profiling with mod_perl hooks. See CPAN module notes for more info on how to hook into Apache.
A note about optimization
Optimization is not always necessary. You really have to ask yourself if the amount of time to execute something makes sense given the amount of work. If the task is pretty hairy, chances are you’re better off finding a different area to optimize.
A lot of small changes add up. If you can make a few small optimizations to subroutines that are called the most frequently, you can shave seconds off of your times.
Devel::NYTProf just finds the hotspots (kind of like SQL Explain). It really doesn’t know anything about how your program is supposed to work, it just gives you a bunch of interesting statistics 🙂