Monday, February 22, 2010

Profiling

Although I like Criminal Minds, this blog entry is less thrilling: it's about profiling applications like Dolphin.

Whenever I faced a performance issue in Dolphin, I usually used Valgrind with its Callgrind tool to find the bottleneck (valgrind --tool=callgrind dolphin -nofork). The output can be visualized nicely in KCachegrind.


One big benefit of Valgrind is that it gives 100 % coverage of user-space code. But as usual, there are rarely benefits without drawbacks: as Valgrind is a kind of virtual machine that uses just-in-time compilation, the application runs around 5 times slower. This can become a problem if timers are involved in the application. Performance issues in combination with threads are also tricky to identify, because Valgrind does not give a non-obtrusive performance overview of the overall system.

So I started to get familiar with OProfile during the last week. OProfile is a non-obtrusive system-wide profiler, which means that the application runs at (nearly) the same speed as without a profiler. OProfile gives an overview of the workload of all processes in the system. This is very useful for Dolphin, as its overall performance also depends on e.g. Nepomuk and on asynchronous operations such as generating file previews. The OProfile output can be converted to a file readable by KCachegrind with 'opreport -gdf | op2calltree'.

Depending on the use case, both tools are really helpful. The next thing on my TODO list is to get more familiar with locating I/O-bound bottlenecks: when reading the number of subdirectories for 20000 directories, the bottleneck is definitely not on the CPU side (in this case neither Valgrind nor OProfile can detect the bottleneck). All in all, I hope that I find the time during the KDE SC 4.5 cycle to improve the Dolphin performance in some areas with the help of these tools.
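One rough way to see that such a workload is I/O bound is to compare wall-clock time with the CPU time the process actually consumed. The following is only a minimal sketch in Python, not what Dolphin does internally; the function names count_subdirs and measure are made up for this illustration.

```python
import os
import time


def count_subdirs(path):
    """Return the number of immediate subdirectories of path."""
    with os.scandir(path) as entries:
        return sum(1 for entry in entries if entry.is_dir(follow_symlinks=False))


def measure(paths):
    """Count subdirectories for every path; return (counts, wall seconds, CPU seconds)."""
    wall_start = time.perf_counter()   # elapsed real time
    cpu_start = time.process_time()    # user + system CPU time of this process
    counts = {path: count_subdirs(path) for path in paths}
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return counts, wall, cpu
```

If the wall-clock time is much larger than the CPU time, the process spent most of its time waiting, typically for the disk. Note that a second run is usually served from the kernel's cache, so the gap mostly shows up on the first (cold) run.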

4 comments:

Anonymous said...

You may want to take a look at perf http://kernelnewbies.org/Linux_2_6_31#head-6004ec219c203c60037057dbebaf0a04fe22f19c (there have been many changes in the last kernel releases as well)

It's one of the most actively developed parts of the kernel, it is being used instead of OProfile for many things, and once UPROBES gets into the kernel you will have user-space probes (like dtrace/systemtap)

Anonymous said...

For I/O "profiling", iotop may be a start (it's not profiling per se, but at least it allows one to easily see which process is responsible for the current I/O activity).

maninalift said...

Woohoo, improving Dolphin performance! I was just thinking this morning about the fact that Dolphin takes several seconds to load folders of a few thousand files and a noticeable time to load even small folders, and wishing I could do something to help fix it. Have fun and good luck.

Anonymous said...

I've written a file synchronizer called csync. To make it fast I profiled it using strace in the end.

I wanted to see which operations are performed, which are unnecessary, and which need to be called only once. The other thing was to look at the order of the syscalls and how it can be changed to make them faster. The result was pretty good.
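A rough sketch of how such a check for repeated syscalls could be automated: this is not how csync was actually profiled, just an illustration in Python. The name repeated_calls and the regular expression are assumptions of this sketch. It only understands plain strace lines (optionally with the PID prefix added by -f), ignores timestamps, and only looks at the first argument when it is a quoted string.

```python
import re
from collections import Counter

# Matches e.g. 'stat("/home/user/a", {...}) = 0', optionally prefixed by the
# PID that "strace -f" adds ("[pid 1234] " on the terminal, "1234 " with -o).
CALL_RE = re.compile(r'^(?:\[pid\s+\d+\]\s+|\d+\s+)?(\w+)\((?:"([^"]*)")?')


def repeated_calls(lines, min_count=2):
    """Return ((syscall, first string argument), count) pairs seen at least
    min_count times, most frequent first."""
    counts = Counter()
    for line in lines:
        match = CALL_RE.match(line)
        if match:  # signal and exit lines do not match and are skipped
            counts[(match.group(1), match.group(2))] += 1
    return [(call, n) for call, n in counts.most_common() if n >= min_count]
```

The input could come from something like 'strace -f -o trace.txt dolphin', with the lines of trace.txt passed to repeated_calls. A stat on the same path that shows up dozens of times is a candidate for being done only once.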