Pipe performance results
------------------------

We took three patches for evaluation:

1. Large Pipe (our patch)
2. Zero copy (Manfred's)
3. Large Pipe + Zero copy (our newly integrated patch)

Benchmarks
----------
* Grep (over a 50Mb file)
* lmbench (bw_pipe)
* pipeflex

Result Summary:
--------------

Grep
----
* Large pipe showed an improvement of around 165% on 2-way and a 3% degradation on UP.
* Zero copy showed an improvement of around 39% on 2-way and a 53% degradation on UP.
* With the integrated patch, 2-way improvement rose to 202% and UP showed a 2% improvement.

lmbench
-------
* Large pipe showed up to 24% degradation on 2-way and up to 36% degradation on UP.
* Zero copy showed ~65% improvement on both 2-way and UP.
* The newly integrated patch showed up to 132% improvement for small transfer sizes (~4k),
  and still a 1% improvement up to 32k (which is the pipe size). Once zero copy kicks in
  again (transfer size > 32k), the improvement rises to 27% at a 128k transfer size. On UP
  there is still about a 42% degradation for transfer sizes below the large buffer size
  (32k), and around a 27% improvement for transfer sizes above it.

pipeflex
--------
* Large pipe showed an improvement of up to 292% on 2-way and a 1% improvement on UP.
* Zero copy showed up to 42% degradation on 2-way and up to 87% degradation on UP.
* The integrated patch showed up to 338% improvement on 2-way and up to 48% degradation on UP.

Overall, all the benchmarks show an improvement with the integrated patch on an SMP
system, but it does not help on a UP kernel (neither large pipe nor zero copy does). So
providing configurable support for larger pipes, while sticking with the 4K pipe size in
the UP case, does not add any additional overhead: we measured the 4k pipe size on UP and
observed negligible variation, in the range of about +1% to -2%. Results for the 32k pipe
size on SMP and UP kernels are in Table 1; results for the 4k pipe size on UP are in
Table 2.

Note
----
Grep:     grep over a 50Mb file for a non-existent pattern.

bw_pipe:  We altered the code so that the chunk size is a variable input
          parameter, i.e. bw_pipe [2,4,...,32] (a rough sketch of such a
          loop follows these notes).

pipeflex: In the version we tested, the token size to transfer is given
          in bytes (not in KB):
            -r => token size to read from the pipe
            -w => token size to write into the pipe
          We used a fixed 512 bytes for writes and 2k, 4k, ..., 128k bytes
          for reads (the transfer size, i.e. T.S).

          Sample run:
            pipeflex -c2 -t 20 -x 500 -y 0 -r [Transfer Size] -w 512 -o 0 -m 0
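For reference, the sketch below shows the kind of loop a variable-chunk bw_pipe run
times. It is only a minimal illustration, not the lmbench source: the fork/pipe
structure, the 50Mb total volume, the MB/s reporting, and the interpretation of the
argument as a chunk size in KB are all assumptions made for the sketch.

/*
 * Minimal pipe-bandwidth sketch: one writer child, one reader parent,
 * chunk size taken from argv (in KB here -- an assumption, not the
 * actual lmbench bw_pipe interface).
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/time.h>
#include <sys/wait.h>

#define TOTAL_BYTES (50 * 1024 * 1024)	/* total volume moved per run */

int main(int argc, char **argv)
{
	size_t chunk = (argc > 1 && atoi(argv[1]) > 0 ? atoi(argv[1]) : 4) * 1024;
	char *buf = malloc(chunk);
	struct timeval t0, t1;
	double secs;
	int fds[2];
	pid_t pid;

	if (!buf || pipe(fds) < 0) {
		perror("setup");
		return 1;
	}
	memset(buf, 'x', chunk);

	pid = fork();
	if (pid == 0) {			/* child: write TOTAL_BYTES in chunks */
		size_t left = TOTAL_BYTES;
		close(fds[0]);
		while (left > 0) {
			ssize_t n = write(fds[1], buf, chunk < left ? chunk : left);
			if (n <= 0)
				break;
			left -= n;
		}
		close(fds[1]);
		_exit(0);
	}

	/* parent: drain the pipe and time how long it takes */
	close(fds[1]);
	gettimeofday(&t0, NULL);
	while (read(fds[0], buf, chunk) > 0)
		;
	gettimeofday(&t1, NULL);
	close(fds[0]);
	wait(NULL);

	secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_usec - t0.tv_usec) / 1e6;
	printf("chunk %zuk: %.2f MB/s\n", chunk / 1024, TOTAL_BYTES / secs / 1e6);
	free(buf);
	return 0;
}

A sweep over chunk sizes then simply re-runs the binary with 2, 4, ..., 32 (and larger)
as the argument.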
The notations used for the different patches are:

  LP     - large pipe patch
  ZC     - zero copy patch
  LP+ZC  - large pipe and zero copy integrated patch
  T.S    - transfer size

All the results shown here are % improvement over their base kernel.
-------------------------------------------------------------------

Table 1
-------

Grep
----
               2-way                              UP
       -----------------------          -----------------------
       LP(32k)   ZC   LP(32k)+ZC        LP(32k)   ZC   LP(32k)+ZC
       -------   --   ----------        -------   --   ----------
% imp     165    40      203               -3    -53       -3

lmbench
-------
               2-way                              UP
       -----------------------          -----------------------
T.S    LP(32k)   ZC   LP(32k)+ZC        LP(32k)   ZC   LP(32k)+ZC
----   -------   --   ----------        -------   --   ----------
2k          4    54      102              -18     16      -10
4k         19    21      133              -27     21       -8
6k         50    70       89              -23     73      -42
8k         -1    64       55              -31     30      -62
12k       -13    38        8              -36     14      -23
16k       -16    38        6              -34     17      -24
24k       -24    31        6              -35     22      -14
32k       -26    27        1              -33     25      -13
64k       -24    23       23              -35     27       26
128k      -23    30       27              -36     41       28

pipeflex
--------
               2-way                              UP
       -----------------------          -----------------------
T.S    LP(32k)   ZC   LP(32k)+ZC        LP(32k)   ZC   LP(32k)+ZC
----   -------   --   ----------        -------   --   ----------
2k         -0   -24       -2               -1    -24       -5
4k          0   -42       -5               -2    -42      -11
6k         10   -11       11               -4    -23      -11
8k         46   -16       24               -7    -41      -20
12k        93   -15       95               -6    -57      -14
16k       163   -14       97               -6    -65      -31
24k       183   -16      174               -7    -74      -31
32k       359   -18      167               -8    -79      -43
64k       276   -15      338               -7    -85      -39
128k      292   -15      279               -4    -88      -48
-------------------------------------------------------------------

Table 2 (UP)
------------

Grep
----
       LP(4k)    ZC   LP(4k)+ZC
       ------    --   ---------
% imp    1.61   -53      -53

              lmbench                           pipeflex
       ----------------------          ----------------------
T.S    LP(4k)    ZC   LP(4k)+ZC        LP(4k)    ZC   LP(4k)+ZC
----   ------    --   ---------        ------    --   ---------
2k      -2.58    16        2            -0.27   -24      -24
4k      -2.47    21       23             0      -42       -4
6k       0.21    73       63             0.94   -23      -24
8k       5.98    30       20            -0.14   -41      -42
12k     -0.9     14       12             0.11   -57      -57
16k     -0.43    17       16            -0.42   -65      -66
24k      0.12    22       21            -0.63   -74      -74
32k      0.98    25       24             0.05   -79      -79
64k      0.07    27       46            -1.28   -85      -85
128k    -1.17    41       40            -1.64   -88      -87
-------------------------------------------------------------------