ement is very significant.
 We are getting killed by the cacheline bouncing of the files_struct
 lock here. Writes on ramdisk (ext2) seem to vary just too
 much to get any meaningful number.

 Also, with Tridge's thread_perf test on a 4(8)-way (HT) P4 Xeon system :

 2.6.12-rc5-vanilla :

 Running test 'readwrite' with 8 tasks
 Threads     0.34 +/- 0.01 seconds
 Processes   0.16 +/- 0.00 seconds

 2.6.12-rc5-fd :

 Running test 'readwrite' with 8 tasks
 Threads     0.17 +/- 0.02 seconds
 Processes   0.17 +/- 0.02 seconds

 I repeated the measurements on ramfs (as opposed to ext2 on ramdisk in
 the earlier measurement) and I got more consistent results from tiobench :

 4(8) way xeon P4
 -----------------
                                         (lock-free)
 Test            2.6.12-rc5      Stdev   2.6.12-rc5-fd   Stdev
 -------------------------------------------------------------
 Seqread         1282            18.59   1343.6          26.37
 Randread        1517            7       2415            34.27
 Seqwrite        702.2           5.27    709.46           5.9
 Randwrite       846.86          15.15   919.68          21.4

 4-way ppc64
 ------------
                                         (lock-free)
 Test            2.6.12-rc5      Stdev   2.6.12-rc5-fd   Stdev
 -------------------------------------------------------------
 Seqread         1549            91.16   1569.6          47.2
 Randread        1473.6          25.11   1585.4          69.99
 Seqwrite        1096.8          20.03   1136            29.61
 Randwrite       1189.6           4.04   1275.2          32.96

 Also running Tridge's thread_perf test on ppc64 :

 2.6.12-rc5-vanilla
 --------------------
 Running test 'readwrite' with 4 tasks
 Threads     0.20 +/- 0.02 seconds
 Processes   0.16 +/- 0.01 seconds

 2.6.12-rc5-fd
 --------------------
 Running test 'readwrite' with 4 tasks
 Threads     0.18 +/- 0.04 seconds
 Processes   0.16 +/- 0.01 seconds

 The benefits are huge (up to ~60%) in some cases on x86, primarily
 because the lock-free path avoids the atomic operations during
 acquisition of ->file_lock and the cache line bouncing in the fast
 path. ppc64 benefits are modest due to LL/SC based locking, but
 still statistically significant.
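
 As a rough check of the "up to ~60%" figure, the relative gain can be
 computed from the table rows above. This is only a sketch: pct_gain()
 is a hypothetical helper, not part of the patch or of tiobench, and it
 assumes higher numbers are better.

```c
#include <assert.h>

/* Hypothetical helper: relative improvement of `after` over `before`,
 * in percent. Assumes a throughput-style metric (higher is better). */
double pct_gain(double before, double after)
{
	assert(before > 0.0);
	return (after - before) / before * 100.0;
}
```

 For the x86 Randread row, pct_gain(1517, 2415) comes out at roughly
 59%, while the ppc64 Randread row, pct_gain(1473.6, 1585.4), is under
 8%.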

This patch:

The RCU head initializer no longer needs the head variable name since we
don't use list.h lists anymore.
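
A minimal user-space sketch of why the name can be dropped (not the
kernel header itself). struct list_node and LIST_NODE_INIT are
hypothetical stand-ins for a list.h-style list, whose initializer must
point back at the variable being initialized. The rcu_head layout and
macro bodies below are assumptions for illustration: with only a next
pointer and a callback, every field starts out NULL, so no name is
needed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a list.h-style circular list node. An empty
 * node points at itself, so the initializer must know the variable name. */
struct list_node {
	struct list_node *next, *prev;
};
#define LIST_NODE_INIT(name) { &(name), &(name) }

/* Assumed layout: callback head chained through a plain next pointer. */
struct rcu_head {
	struct rcu_head *next;
	void (*func)(struct rcu_head *head);
};

/* Nothing refers back to the variable, so the initializer takes no argument. */
#define RCU_HEAD_INIT	{ .next = NULL, .func = NULL }
#define RCU_HEAD(head)	struct rcu_head head = RCU_HEAD_INIT

/* Small self-check that both initializers behave as described. */
void rcu_head_init_selfcheck(void)
{
	struct list_node n = LIST_NODE_INIT(n);
	RCU_HEAD(h);

	assert(n.next == &n && n.prev == &n);
	assert(h.next == NULL && h.func == NULL);
}
```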

Signed-off-by: Dipankar Sarma <dipankar@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>