Heureka, I’ve found it …

Ever since I built my PC at the beginning of 2008, I’ve been struggling with disk write performance. Evidence of my problems can be found in these posts: here, here, here and here.

According to the changelogs of the recently released Linux kernels, there have been quite a few changes in the ext3/ext4 file systems. So at the beginning of October I moved all my file systems from XFS to ext4 in the hope that things would get better. Once the data was restored, the situation was as bad as before; it even felt a bit worse. During this backup and restore process it dawned on me for the first time that my Samsung F1 hard disk might be the culprit, in particular once I looked at the S.M.A.R.T. data with GSmartControl.

[Image: S.M.A.R.T. data hardcopy]

When I looked at the data for the first time after the restore operation, it looked much worse than in this screenshot. There were recovered ECC and soft errors in 6-7 digit numbers. However, I never once saw the disk report an error to the operating system.

What I probably should have considered earlier: the transfer to the external eSATA drive always ran at top performance, while copying back to the original disk was awful again. I even went so far as to copy the OS to another external USB drive to remove the OS from the equation. I booted from the USB drive and redid the backup and restore operation, again with the same result: the Samsung disk showed very bad write performance. That was when I finally decided to order another hard drive. Tom’s Hardware gave the Samsung Spinpoint F3 pretty good marks, so that’s the one I ordered.

[Image: Gkrellm snapshot]

And what can I say? My problems were gone, completely and utterly gone. Look at the little Gkrellm screenshots on the right. The left half shows the copy process from the old, “bad” disk (top half) to the “good” new disk. As you can see, the new disk is even faster than the old one. The right half, however, shows the copy process from the “good” F3 disk to the “bad”, old F1 disk. Here you can observe the very low write performance of the old F1 disk (top half).

Then I obtained the Samsung hutil disk utility, did a low-level format of the old F1 disk and ran the self-diagnostic with a complete surface scan. Neither operation showed any problem whatsoever. I then redid some copy operations to the old F1 and the new F3 disks, logged the output of vmstat and created the diagrams below.

Each time the data was copied with the following command line:

tar -c -b 128 -f - source-dir | tar -x --checkpoint=1000 -b 128 -C /destination-dir/

The data was collected with vmstat 10 | tee /tmp/logfile. Roughly 120 GB were transferred.
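For anyone who wants to turn such a vmstat log into numbers, here is a minimal Python sketch. It is not the script used for the diagrams; the function names parse_vmstat and mean_rates are made up for illustration. It assumes the usual Linux vmstat layout, where a column header line names the bi and bo columns, and it assumes those values are 1 KiB blocks per second.

```python
def parse_vmstat(lines):
    """Return a list of (bi, bo) samples from vmstat output lines."""
    bi_idx = bo_idx = None
    samples = []
    for line in lines:
        fields = line.split()
        if "bi" in fields and "bo" in fields:
            # Column header line; vmstat repeats it periodically.
            bi_idx, bo_idx = fields.index("bi"), fields.index("bo")
            continue
        if bi_idx is None or len(fields) <= max(bi_idx, bo_idx):
            # No header seen yet, or a short line such as the
            # "procs ---memory---" banner.
            continue
        try:
            samples.append((int(fields[bi_idx]), int(fields[bo_idx])))
        except ValueError:
            continue  # not a data line
    return samples


def mean_rates(samples, skip_first=True):
    """Average (bi, bo) over all samples.

    The first vmstat sample is an average since boot, so it is
    skipped by default.
    """
    if skip_first:
        samples = samples[1:]
    if not samples:
        return (0.0, 0.0)
    n = len(samples)
    return (sum(s[0] for s in samples) / n,
            sum(s[1] for s in samples) / n)
```

With the log file from above, something like mean_rates(parse_vmstat(open("/tmp/logfile"))) would give the average read and write rate in blocks per second; dividing by 1024 gives an approximate figure in MiB/s under the 1 KiB block assumption.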

[Image: optimal performance diagram]

This is the diagram for the copy process to the “good” new F3 disk (bi is “blocks in” for the read operation, bo is “blocks out” for the write operation). The source disk was the old F1 disk. The whole process took about 38 minutes. This is the raw text file from the vmstat command. As you can see, the read performance of the old F1 disk is perfectly fine.

[Image: bad performance diagram]

And here is the diagram for the copy process from the new F3 to the old F1 disk. Simply bloody awful write transfer rate. The whole operation took about 98 minutes. This is the raw text file from the vmstat command. The spikes are probably due to the hard disk write cache. No wonder the whole desktop gets sluggish with such bad behavior.
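To put rough numbers on the two runs, here is a back-of-the-envelope calculation in Python. It assumes the “roughly 120 GB” figure is exact and uses decimal units (1 GB = 1000 MB), so the results are approximations only; the helper name avg_mb_per_s is made up for this sketch.

```python
def avg_mb_per_s(gigabytes, minutes):
    """Average throughput in MB/s, using decimal units (1 GB = 1000 MB)."""
    return gigabytes * 1000 / (minutes * 60)


to_f3 = avg_mb_per_s(120, 38)  # old F1 -> new F3, about 38 minutes
to_f1 = avg_mb_per_s(120, 98)  # new F3 -> old F1, about 98 minutes

print(f"F1 -> F3: {to_f3:.1f} MB/s")          # prints 52.6 MB/s
print(f"F3 -> F1: {to_f1:.1f} MB/s")          # prints 20.4 MB/s
print(f"slowdown: {to_f3 / to_f1:.1f}x")      # prints 2.6x
```

So on average the old F1 disk wrote at well under half the rate it could be read at, which matches the impression from the diagrams.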

I’m so annoyed that I didn’t draw the right conclusions earlier. I could have spared myself more than 1½ years of frustration with Linux in general.


2 Responses to “Heureka, I’ve found it …”

  1. Greg Surbey
    16. February 2010 at 22:16

    Glad you found the issue! Now you can enjoy your computer a lot more :-)

    Try these out:

    smartctl -a /dev/sda
    hdparm -tT /dev/sda

    Usually best for DRA to be zero:

    sdparm --get DRA /dev/sda

    For performance, WCE should be one:

    sdparm --get WCE /dev/sda
    iostat -m -x sda -d 2

  2. Greg Surbey
    16. February 2010 at 22:23

    The above post has mangled line breaks, sorry, donno why it did that…
