Pipe performance results
------------------------
We evaluated three patches:
1. Large Pipe (our patch)
2. Zero copy (Manfred's)
3. Large Pipe + Zero copy (our newly integrated patch)

Benchmarks
----------
* Grep (over a 50MB file)
* lmbench (bw_pipe)
* pipeflex

Result Summary:
--------------

Grep
----
* Large pipe showed around 165% improvement on 2-way and 3%
  degradation on UP.
* Zero copy showed around 39% improvement on 2-way and 53%
  degradation on UP.
* The integrated patch showed around 202% improvement on 2-way
  and a 2% improvement on UP.

lmbench
-------
* Large pipe showed up to 24% degradation on 2-way and up to
  36% degradation on UP.
* Zero copy showed ~65% improvement on 2-way and UP.
* The newly integrated patch showed up to 132% improvement on 2-way for
  small transfer sizes (~4k), and still a 1% improvement for sizes up to
  32k (which is the size of the pipe). Once zero copy kicks in again
  (transfer size > 32k), the improvement rises to 27% at a 128k transfer
  size. On UP, however, there is still about 42% degradation for transfer
  sizes below the large buffer size (32k), and around 27% improvement for
  transfer sizes above it.

pipeflex
--------
* Large pipe showed improvement up to 292% on 2-way, and 1%
  improvement on UP. 
* Zero copy showed up to 42% degradation on 2-way and up to 87%
  degradation on UP.
* The integrated patch showed up to 338% improvement on a 2-way and up to 
  48% degradation on UP.

Overall, every benchmark showed an improvement with the integrated patch
on SMP systems, but neither large pipe nor zero copy helps on a UP kernel.
So providing configurable support for larger pipes, while sticking with the
4K pipe size in the UP case, does not add any additional overhead. We
measured the 4k pipe size on UP and observed negligible variation, in the
range of about +1% to -2%.
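
The idea of a configurable pipe size that stays at 4k on UP can be sketched
as a compile-time switch. This is only an illustration, not code from any of
the three patches: the EXAMPLE_* names and example_pipe_size() are
hypothetical, and keying the choice on CONFIG_SMP is an assumption about how
such an option might be wired up.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names for illustration; not taken from the patches. */
#define EXAMPLE_SMALL_PIPE (4 * 1024)   /* 4k buffer, as measured on UP */
#define EXAMPLE_LARGE_PIPE (32 * 1024)  /* 32k buffer, as measured on SMP */

/* Assumption: the larger buffer is selected only for SMP builds. */
#ifdef CONFIG_SMP
#define EXAMPLE_PIPE_SIZE EXAMPLE_LARGE_PIPE
#else
#define EXAMPLE_PIPE_SIZE EXAMPLE_SMALL_PIPE
#endif

/* Returns the pipe buffer size chosen at compile time, in bytes. */
size_t example_pipe_size(void)
{
    return EXAMPLE_PIPE_SIZE;
}
```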


Results for the 32k pipe size on both SMP and UP kernels are in Table 1.
Results for the 4k pipe size on a UP kernel are in Table 2.


Note
----
Grep: Grep over a 50MB file for a non-existent pattern.

bw_pipe: We altered the code to take the chunk size as a variable
input parameter, i.e. bw_pipe [2,4,...,32]
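
A pipe bandwidth measurement with a variable chunk size can be sketched as
follows. This is not the lmbench source or our modification of it, only a
minimal illustration under stated assumptions: the function name
pipe_transfer() is hypothetical, and the timing here includes fork()
overhead, which a real benchmark would exclude.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Push `total` bytes through a pipe using `chunk`-byte writes and reads.
 * Returns the number of bytes the reader received (0 on setup failure)
 * and stores the elapsed wall-clock time in *secs if secs is not NULL. */
size_t pipe_transfer(size_t chunk, size_t total, double *secs)
{
    int fds[2];
    char *buf;
    pid_t pid;
    size_t got = 0;
    ssize_t n;
    struct timeval t0, t1;

    if (chunk == 0 || (buf = malloc(chunk)) == NULL)
        return 0;
    memset(buf, 'x', chunk);
    if (pipe(fds) < 0) {
        free(buf);
        return 0;
    }

    gettimeofday(&t0, NULL);
    pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        free(buf);
        return 0;
    }
    if (pid == 0) {                     /* child: the writer */
        size_t left = total;
        close(fds[0]);
        while (left > 0) {
            n = write(fds[1], buf, left < chunk ? left : chunk);
            if (n <= 0)
                _exit(1);
            left -= (size_t)n;
        }
        _exit(0);                       /* exiting closes the write end */
    }

    close(fds[1]);                      /* parent: the reader */
    while ((n = read(fds[0], buf, chunk)) > 0)
        got += (size_t)n;
    gettimeofday(&t1, NULL);

    close(fds[0]);
    waitpid(pid, NULL, 0);
    free(buf);
    if (secs != NULL)
        *secs = (double)(t1.tv_sec - t0.tv_sec)
              + (double)(t1.tv_usec - t0.tv_usec) / 1e6;
    return got;
}
```

Bandwidth for one chunk size is then the returned byte count divided by the
elapsed seconds, e.g. for chunk = 2 * 1024, 4 * 1024, ..., 32 * 1024.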

Pipeflex: In the version we tested, the token sizes to transfer are
given in bytes (not in KB):
-r  => token size to read from pipe
-w => token size to write into pipe

We used a fixed 512 bytes for writes and 2k, 4k, ..., 128k bytes for reads
(the Transfer Size, i.e. T.S)

sample run: pipeflex -c2 -t 20 -x 500 -y 0 -r [Transfer Size] -w 512 -o 0
-m 0
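
Because the token sizes are given in bytes, each transfer size from the
tables has to be converted before it is passed to -r. A small sketch that
builds the sample command line for every transfer size, assuming 1k = 1024
bytes (the report does not say so explicitly); the helper names are
hypothetical and the flags are copied from the sample run above:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Transfer sizes used in the result tables, in k. */
static const int ts_k[] = { 2, 4, 6, 8, 12, 16, 24, 32, 64, 128 };
#define TS_COUNT ((int)(sizeof ts_k / sizeof ts_k[0]))

/* Format the sample pipeflex command line for one transfer size.
 * Assumption: 1k = 1024 bytes. Returns what snprintf returns. */
int pipeflex_cmd(char *out, size_t outlen, int k)
{
    return snprintf(out, outlen,
                    "pipeflex -c2 -t 20 -x 500 -y 0 -r %d -w 512 -o 0 -m 0",
                    k * 1024);
}

/* Print one command line per transfer size. */
void print_pipeflex_cmds(void)
{
    char cmd[128];
    int i;

    for (i = 0; i < TS_COUNT; i++) {
        pipeflex_cmd(cmd, sizeof cmd, ts_k[i]);
        puts(cmd);
    }
}
```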

The notations we used for different patches are
LP - large pipe patch
ZC - zero copy patch
LP+ZC - large pipe and zero copy integrated patch
T.S - Transfer size
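
Every entry in the tables is a percentage improvement over the base kernel.
The exact formula is not spelled out in this report; the sketch below
assumes a higher-is-better metric such as bandwidth, and the function name
pct_improvement() is hypothetical. A negative result means degradation.

```c
#include <assert.h>

/* Percent improvement of a patched kernel over the base kernel.
 * Assumption: a higher-is-better metric (e.g. MB/s); for a
 * lower-is-better metric such as elapsed time the roles would differ. */
double pct_improvement(double base, double patched)
{
    assert(base > 0.0);
    return (patched - base) / base * 100.0;
}
```

For example, a base of 100 MB/s and a patched result of 265 MB/s gives 165,
while a patched result of 47 MB/s gives -53.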


All the results shown here are % improvement over their Base kernel
-------------------------------------------------------------------
                             Table 1
                             -------
Grep
----
                    2way                           UP
                    ----                           --

          LP(32k)   ZC   LP(32k)+ZC          LP(32k)   ZC    LP(32k)+ZC
          -------   --   ----------          -------   --    ----------
% imp       165     40       203                -3    -53        -3

lmbench
-------
                    2way                           UP
                    ----                           --

T.S       LP(32k)   ZC   LP(32k)+ZC       LP(32k)   ZC    LP(32k)+ZC
---       -------   --   ----------       -------   --    ----------
2k           4      54       102            -18     16      -10
4k          19      21       133            -27     21      -8
6k          50      70        89            -23     73      -42
8k          -1      64        55            -31     30      -62
12k        -13      38         8            -36     14      -23
16k        -16      38         6            -34     17      -24
24k        -24      31         6            -35     22      -14
32k        -26      27         1            -33     25      -13
64k        -24      23        23            -35     27       26
128k       -23      30        27            -36     41       28

pipeflex
--------
                    2way                           UP
                    ----                           --

T.S       LP(32k)   ZC   LP(32k)+ZC       LP(32k)   ZC    LP(32k)+ZC
---       -------   --   ----------       -------   --    ----------
2k          -0     -24       -2             -1     -24       -5
4k           0     -42       -5             -2     -42       -11
6k          10     -11       11             -4     -23       -11
8k          46     -16       24             -7     -41       -20
12k         93     -15       95             -6     -57       -14
16k        163     -14       97             -6     -65       -31
24k        183     -16      174             -7     -74       -31
32k        359     -18      167             -8     -79       -43
64k        276     -15      338             -7     -85       -39
128k       292     -15      279             -4     -88       -48 


-------------------------------------------------------------------
                        Table 2 (UP)
                        -------
Grep
----                                       

          LP(4k)   ZC    LP(4k)+ZC
          ------   --    ---------
% imp      1.61    -53      -53


                  lmbench                      pipeflex
                  -------                      ---------

T.S       LP(4k)   ZC    LP(4k)+ZC       LP(4k)    ZC    LP(4k)+ZC
---       ------   --    ---------       ------    --    ---------
2k        -2.58    16       2             -0.27   -24      -24
4k        -2.47    21       23                0   -42       -4
6k         0.21    73       63             0.94   -23      -24
8k         5.98    30       20            -0.14   -41      -42
12k       -0.9     14       12             0.11   -57      -57
16k       -0.43    17       16            -0.42   -65      -66
24k        0.12    22       21            -0.63   -74      -74
32k        0.98    25       24             0.05   -79      -79
64k        0.07    27       46            -1.28   -85      -85
128k      -1.17    41       40            -1.64   -88      -87

-------------------------------------------------------------------