A while back, when I finished rewriting arename's metadata-reading code, I ran some tests to compare the performance of the new code with that of the old.

The new code uses Audio::Scan to read data from input files, while the old code used multiple backend modules: MP3::Tag, Ogg::Vorbis::Header and Audio::FLAC::Header. The idea behind changing such an integral part of arename was to remove much of the abstraction layer that hid the differences between the backends, to support more file types, and to improve performance significantly.

To see whether the last of these goals was achieved, I compared the performance of arename version 3 with that of version 4.

The tests were run on the following system:

% uname -srm
Linux 2.6.32-5-686 i686
% /lib/libc.so.6
GNU C Library (Debian EGLIBC 2.11.2-7) stable release version 2.11.2,
by Roland McGrath et al.
[...]
Compiled by GNU CC version 4.4.5.
Compiled on a Linux 2.6.32 system on 2010-10-31.

The computer used is an IBM T60 laptop with 2 GiB of RAM, a 1.66 GHz single-core Intel CPU (1 MiB of cache) and a Fujitsu MHZ2160B SATA hard disk (5400 rpm).

The shell used here is zsh. The array `$files[]' is a list of 7495 audio files (mostly .mp3 and .ogg), created using:

% files=(**/*.(#i)(mp3|flac|ogg)(.))

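The Linux kernel used here no longer has the small fixed limit on command-line size that older kernels had (since 2.6.23 the limit scales with the stack size limit), so passing the whole array to an external command works without xargs or other tricks. The total size of the files is about 40 GiB.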

Using the array means the file list does not have to be generated as part of each timed run. arename is also run in dry-run and very quiet mode (the -dQ options in the commands below).
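For readers without zsh: in the glob above, `**/' recurses into subdirectories, `(#i)' makes the match case-insensitive (this needs the EXTENDED_GLOB option) and the trailing `(.)' restricts the result to plain files. A rough portable approximation might look like the sketch below. It is not what was used for the measurements; the function name list_audio_files is made up for illustration, and it assumes a find that supports -iname (GNU and BSD find do, but it is not in POSIX).

```shell
# Hypothetical helper: list regular files whose names end in
# .mp3, .flac or .ogg (in any case), searching recursively below
# the directory given as $1 (default: current directory).
# Assumes find supports -iname (a GNU/BSD extension).
list_audio_files() {
    find "${1:-.}" -type f \
        \( -iname '*.mp3' -o -iname '*.flac' -o -iname '*.ogg' \)
}

# Count the matches (miscounts file names that contain newlines).
list_audio_files . | wc -l
```

Unlike the zsh glob, find also descends into hidden directories and does not sort its output, so the two lists may differ slightly.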

The audio files reside on an AES-256 encrypted ext2 partition.

The "cold cache" situation was created by using

% sync; echo 3 > /proc/sys/vm/drop_caches

with appropriate privileges.

The v4 times were produced using the code from this commit:

e9f672d3132e1a15607e1a3d62fdde06022a4695

The v3 times are based on code from this commit:

5be0e68ffcd407b643759084e07c305581f4052f

Here are the version three times:

Cold cache:
  % arename -dQ "${files[@]}" > /dev/null 2>&1
  33.07s user 3.58s system 14% cpu 4:18.06 total
Warm cache:
  % arename -dQ "${files[@]}" > /dev/null 2>&1
  18.67s user 1.04s system 88% cpu 22.178 total

And in comparison, the version four times are as follows:

Cold cache:
  % arename -dQ "${files[@]}" > /dev/null 2>&1
  9.23s user 2.70s system 5% cpu 3:46.39 total
Warm cache:
  % arename -dQ "${files[@]}" > /dev/null 2>&1
  4.87s user 0.51s system 87% cpu 6.151 total

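The `total' values with a warm cache fluctuate a little, but v4 is settling down at just under four times the speed of v3 (about 3.6 times in the runs shown above). The cold-cache times have not improved nearly as impressively: the wall-clock time drops by only about 12%. But here, hard-disk performance is the likely bottleneck, since the CPU was busy for just 14% (v3) and 5% (v4) of the run.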
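The arithmetic behind these statements can be redone from the numbers above. The following is a small sketch using awk; the helper names ratio and waited are made up for illustration, and the cold-cache totals are converted to seconds by hand (4:18.06 is 258.06s, 3:46.39 is 226.39s).

```shell
# ratio OLD NEW: how many times faster NEW is than OLD (two decimals).
ratio() {
    awk -v old="$1" -v new="$2" 'BEGIN { printf "%.2f\n", old / new }'
}

# waited USER SYS TOTAL: wall-clock seconds not spent on the CPU,
# i.e. time the process mostly spent waiting (here: for the disk).
waited() {
    awk -v u="$1" -v s="$2" -v t="$3" 'BEGIN { printf "%.2f\n", t - u - s }'
}

ratio 22.178 6.151         # warm cache, v3 total vs. v4 total
ratio 258.06 226.39        # cold cache, v3 total vs. v4 total
waited 33.07 3.58 258.06   # v3, cold cache
waited 9.23 2.70 226.39    # v4, cold cache
```

This gives ratios of roughly 3.6 (warm) and 1.14 (cold). In both cold-cache runs well over 200 seconds are not accounted for by CPU time, which fits the assumption that the disk, not arename, dominates there.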

Posted Fri 28 Oct 2011 17:06:28 CEST