Archive for May 2008

 
 

Lightzone for Linux

Great news for Linux and digital photography enthusiasts: apparently, since yesterday “Lightzone for Linux” has been available as an official product with the same functionality as the Windows and Mac versions. Until now, Lightcrafts had released a couple of betas for Linux, but no “real” product. They are even offering an introductory discount. With this discount, the purchase of Lightzone will set me back about 100 Euro.

Hopefully enough Linuxers decide to buy this great product, so that Lightzone continues to be available as a full product.

Still Struggling with Performance

I’m still struggling to get a handle on the performance problems that I am seeing on my Core 2 Duo box. Somehow I can’t convince myself to go back to 32-bit Linux. I’m still hoping that the Linux community will produce a fix in the not too distant future. Personally, I still think that the problem is hidden somewhere in the file system layer.

While I can reliably reproduce the problem by loading a pretty big MP3 file into Audacity, I didn’t have any luck reproducing the observed behavior any other way. For instance, I loaded the particular MP3 file into Audacity. Then, from the content of Audacity’s data directory, I generated a shell script that recreates an equivalent file tree (same directory names, file names and file sizes) by executing a sequence of mkdir and dd commands. No luck, however. The script ran at top speed and showed no performance problems whatsoever. Another approach I tried was to create a big tar archive from Audacity’s data directory and then untar it. Again, the tar archive was unpacked at top speed and didn’t run into these extended periods of large wait-I/O percentages. Apparently it takes the particular load profile that Audacity produces to trigger the problem.
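The file tree replay described above can be sketched in a few lines of Python instead of generated mkdir and dd commands. This is only an illustration under stated assumptions, not the script used for the test: the function name replicate_tree and its parameters are made up for this example, and the files are filled with zeros because only names and sizes are meant to match.

```python
import os


def replicate_tree(src_root, dst_root, chunk_size=1024 * 1024):
    """Recreate the directory layout of src_root under dst_root.

    Every file is replaced by a zero-filled file of the same name and size,
    which is roughly what a sequence of mkdir and dd commands would produce.
    (Hypothetical helper, not taken from the original test script.)
    """
    zeros = bytes(chunk_size)
    for dirpath, _dirnames, filenames in os.walk(src_root):
        rel = os.path.relpath(dirpath, src_root)
        target_dir = os.path.join(dst_root, rel)
        os.makedirs(target_dir, exist_ok=True)
        for name in filenames:
            remaining = os.path.getsize(os.path.join(dirpath, name))
            with open(os.path.join(target_dir, name), "wb") as out:
                # Write in chunks so that large files do not need much memory.
                while remaining > 0:
                    n = min(remaining, chunk_size)
                    out.write(zeros[:n])
                    remaining -= n
```

Like the shell script, such a replay reproduces only the final layout. It does not reproduce the order, interleaving or timing of the writes, which fits the observation that the layout alone is not enough to trigger the slowdown.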

Since 2.6.25 was released not too long ago, I was eager to redo my tests with this version. But first I did another run under 2.6.24, just to be sure. The problem is indeed still very visible, but the pattern changed slightly. In the past, Audacity would start loading at top speed for a certain amount of time, then for about half of the complete loading time the load process would slow down to a crawl. Then, at the end of the load, the speed would increase to the maximum again. Now, as I redid the test, Audacity crawled along for the first half and ran at top speed for the rest of the time (as can be seen in the leftmost picture below). The only difference that I can see between this run and the ones from the past is that the file system was filled in the meantime with a couple of 3-4 GB mpg video files.
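To pin down when such crawl phases begin and end, the wait-I/O share of CPU time can be sampled from /proc/stat during a load. The following is a minimal sketch, assuming the usual field order of the aggregate cpu line (user, nice, system, idle, iowait, ...). The function names are made up for this example, and this is not necessarily how the diagrams in this post were produced.

```python
import time


def parse_cpu_line(line):
    """Return (iowait, total) jiffies from the aggregate 'cpu' line of /proc/stat."""
    fields = line.split()
    if fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    values = [int(v) for v in fields[1:]]
    # Field order: user nice system idle iowait irq softirq steal ...
    # Only the first eight fields are summed; guest time is already part of user.
    return values[4], sum(values[:8])


def iowait_percent(before, after):
    """Share of the elapsed CPU time that was spent waiting for I/O, in percent."""
    d_iowait = after[0] - before[0]
    d_total = after[1] - before[1]
    return 100.0 * d_iowait / d_total if d_total > 0 else 0.0


def sample_iowait(interval=1.0, samples=10, path="/proc/stat"):
    """Print one iowait percentage per interval (Linux only)."""
    def read():
        with open(path) as f:
            return parse_cpu_line(f.readline())

    prev = read()
    for _ in range(samples):
        time.sleep(interval)
        cur = read()
        print(f"{time.strftime('%H:%M:%S')} iowait {iowait_percent(prev, cur):5.1f} %")
        prev = cur
```

Running sample_iowait(interval=1.0, samples=300) in a second terminal while Audacity loads the file would give a timestamped log that can be lined up with the progress bar.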

I then redid the tests with the vanilla 2.6.25 kernel and the Gentoo-patched kernel. With these kernels I enabled the latencytop kernel option. Here are the diagrams that I produced from the four runs overall. The leftmost is for 2.6.24, the next for vanilla 2.6.25, then Gentoo 2.6.25, and the last for another vanilla 2.6.25 run.

[Diagrams, left to right: Run with 2.6.24; Run with 2.6.25 (Vanilla, latencytop); Run with 2.6.25 (Gentoo, latencytop); 2nd Run with 2.6.25 (Vanilla, latencytop)]

The runtimes as observed from the Audacity progress bar were (in order): 3m36s, 4m35s, 2m49s, 3m36s. In general, the loading time is definitely much shorter than in the past, when I observed times in the 7-8 minute range. The only difference I can think of between now and then is the different fill level of the file system.
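Converted to seconds, the four runtimes are 216, 275, 169 and 216. The two vanilla 2.6.25 runs alone differ by 59 seconds, which is more than the 47 seconds between 2.6.24 and the Gentoo kernel, so four runs are probably too few to rank the kernels. A small sketch of the conversion (the helper name to_seconds and the run labels are made up for this example):

```python
import re


def to_seconds(text):
    """Convert a duration such as '3m36s' or '2m49' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+)s?", text.strip())
    if match is None:
        raise ValueError(f"unrecognised duration: {text!r}")
    return int(match.group(1)) * 60 + int(match.group(2))


# Runtimes as read from the Audacity progress bar, in the order of the runs.
runs = {
    "2.6.24": "3m36s",
    "2.6.25 vanilla (1st run)": "4m35s",
    "2.6.25 Gentoo": "2m49s",
    "2.6.25 vanilla (2nd run)": "3m36s",
}

baseline = to_seconds(runs["2.6.24"])
for label, duration in runs.items():
    seconds = to_seconds(duration)
    print(f"{label:26s} {seconds:4d} s  ({seconds / baseline:.2f}x of 2.6.24)")
```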

Since the latencytop kernel option was enabled, here are some snapshots from the latencytop command, which I took when the loading slowed down to a crawl.

Cause                                               Maximum          Average
Writing back inodes                               1142.8 msec          8.4 msec
Creating block layer request                      358.3 msec        167.7 msec
Writing to file                                    76.1 msec         76.1 msec
Reading EXT3 block bitmaps                         44.6 msec         36.0 msec
do_select core_sys_select sys_select system_call_a  5.0 msec          1.8 msec
Application requested delay                         5.0 msec          2.1 msec

Cause                                               Maximum          Average
Reading EXT3 block bitmaps                        1348.7 msec        1348.7 msec
Writing back inodes                               476.7 msec         45.0 msec
Creating block layer request                      452.5 msec        290.3 msec
do_select core_sys_select sys_select system_call_a  5.0 msec          1.6 msec
Application requested delay                         4.9 msec          1.8 msec
Waiting for event (poll)                            4.9 msec          0.9 msec

Cause                                               Maximum          Average
Reading EXT3 block bitmaps                        1080.2 msec        501.2 msec
EXT3 Creating a file                              657.6 msec         75.0 msec
Creating block layer request                      564.8 msec        209.5 msec
Writing back inodes                               147.4 msec         17.4 msec
do_select core_sys_select sys_select system_call_a  5.0 msec          1.3 msec

Cause                                               Maximum          Average
EXT3 Creating a file                              902.6 msec        351.8 msec
Writing a page to disk                            370.2 msec         99.4 msec
EXT3: Waiting for journal access                   61.2 msec         61.2 msec
Truncating file                                    25.9 msec         25.9 msec
do_select core_sys_select sys_select system_call_a  5.0 msec          1.6 msec

At least these snapshots seem to indicate that a pretty large amount of time is spent in the file system layer. The above snapshots come from the first vanilla 2.6.25 run.
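To compare several snapshots without eyeballing them, the latencytop 0.3 output can be parsed into (cause, maximum, average) rows and sorted. The sketch below assumes the three-column layout shown above, with both latencies given in msec. The names parse_snapshot and worst_first are made up for this example and are not part of latencytop.

```python
import re

# One data row: free-text cause, then maximum and average latency in msec.
ROW = re.compile(
    r"^(?P<cause>.+?)\s+(?P<max>\d+(?:\.\d+)?) msec\s+(?P<avg>\d+(?:\.\d+)?) msec\s*$"
)

SAMPLE = """\
Cause                                               Maximum          Average
Writing back inodes                               1142.8 msec          8.4 msec
Creating block layer request                      358.3 msec        167.7 msec
Writing to file                                    76.1 msec         76.1 msec
Reading EXT3 block bitmaps                         44.6 msec         36.0 msec
do_select core_sys_select sys_select system_call_a  5.0 msec          1.8 msec
Application requested delay                         5.0 msec          2.1 msec
"""


def parse_snapshot(text):
    """Return a list of (cause, max_msec, avg_msec); the header line is skipped."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            rows.append((m.group("cause").strip(), float(m.group("max")), float(m.group("avg"))))
    return rows


def worst_first(rows, column=2):
    """Sort rows by the given column (1 = maximum, 2 = average), largest first."""
    return sorted(rows, key=lambda row: row[column], reverse=True)


for cause, max_msec, avg_msec in worst_first(parse_snapshot(SAMPLE)):
    print(f"{avg_msec:8.1f} msec avg  {max_msec:8.1f} msec max  {cause}")
```

Sorting by the average rather than the maximum changes the picture for this snapshot: writing back inodes has the largest single delay, but creating block layer requests is the most expensive on average.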

Cause                                               Maximum          Average
Reading EXT3 block bitmaps                        1113.8 msec        172.2 msec
EXT3 Creating a file                              547.8 msec         24.7 msec
Writing a page to disk                            535.6 msec         53.2 msec
Truncating file                                   509.0 msec        509.0 msec
EXT3: Waiting for journal access                  348.3 msec        348.3 msec
Writing buffer to disk (synchronous)              126.6 msec        126.6 msec
Reading EXT3 indirect blocks                      108.3 msec         50.8 msec
Creating directory                                 38.3 msec         32.0 msec

This snapshot from the Gentoo 2.6.25 run seems to point in the same direction.

With the following snapshots the situation is not quite as clear, since this output comes from latencytop 0.4, while the others were from 0.3. Apparently the output format was changed.

Cause                                                Maximum     Percentage
sync_page sync_page_killable __lock_page_killable 973.7 msec         16.3 %
sync_buffer __wait_on_buffer bh_submit_read read_b 120.3 msec          0.5 %
Scheduler: waiting for cpu block_write_begin ext3_ 26.0 msec          5.5 %
hrtimer_nanosleep sys_nanosleep system_call_after_  5.0 msec         15.5 %
futex_wait do_futex sys_futex system_call_after_sw  5.0 msec          2.9 %
do_select core_sys_select sys_select system_call_a  5.0 msec         54.9 %
do_sys_poll sys_poll system_call_after_swapgs       5.0 msec          4.2 %
blk_execute_rq scsi_execute scsi_execute_req sr_te  1.9 msec          0.1 %
blk_execute_rq scsi_execute scsi_execute_req scsi_  1.8 msec          0.0 %

Cause                                                Maximum     Percentage
sync_page sync_page_killable __lock_page_killable 1133.1 msec         17.9 %
Scheduler: waiting for cpu                         18.2 msec          5.0 %
futex_wait do_futex sys_futex system_call_after_sw  5.0 msec          3.0 %
do_select core_sys_select sys_select system_call_a  5.0 msec         58.7 %
do_sys_poll sys_poll system_call_after_swapgs       5.0 msec          5.1 %
hrtimer_nanosleep sys_nanosleep system_call_after_  5.0 msec         10.1 %
blk_execute_rq scsi_execute scsi_execute_req sr_te  2.2 msec          0.1 %
blk_execute_rq scsi_execute scsi_execute_req scsi_  1.8 msec          0.0 %
blk_execute_rq scsi_execute scsi_execute_req sr_te  1.3 msec          0.0 %

I’m wondering what file system layout other people who are experiencing the same problems have. Do they have one large root file system like me (with 300 GB)? Do people without problems have separate file systems for OS and user data? Could that be an option for me as well?

Currently I’m using an external eSATA drive for all my Audacity work. This provides enough of a workaround at the moment.