Netperf3 Run Rules

 


 

Table of Contents

Introduction

What does Netperf3 do?

Test Environment

How do I run Netperf3?

Install Software

Prepare Test Hardware

Test Execution

Data Archive

Appendix 1 – Build and Install a Linux Kernel

Appendix 2 – Test Preparation Script Files

Appendix 3 – Test Script Files

Appendix 4 – Generate Netperf Results

Appendix 5 – Sample Results

 

Introduction:

    Netperf is a benchmark that can be used to measure various aspects of networking performance between a single pair of machines.  Its primary focus is on streaming and request/response tests using either the TCP or UDP protocol.  Netperf3 is an experimental version, available at www.netperf.org, that supports both a thread and a process model.  Netperf3+ (an updated version that must be built from source) is an upgraded version that supports multi-adapter, TCP_RR, and TCP_CRR features. At present Netperf3+ does not synchronize the test among multiple clients, so we run the test between only two machines (client and server).

 

What does Netperf3 do?

 

Netperf3 is a client/server network application that measures network throughput using TCP/UDP streaming, TCP request/response, and TCP connect/request/response tests across multiple adapters on both the server and the client.  Throughput is measured in Mbits/second, and the throughput statistics are reported at the client end.

 

Test Environment:

 

    Hardware

    Server Configuration
      CPU:       4 x Intel Pentium III 500 MHz
      CACHE:     2 MB
      MEMORY:    2.5 GB
      NETWORK:   Intel Ether Pro adapters, 100 Mbps
      OS:        Red Hat 6.2 with kernel 2.4.0 / 2.4.4 / 2.4.7

    Client Configuration
      CPU:       8 x Intel Pentium III 700 MHz
      CACHE:     2 MB
      MEMORY:    2.5 GB
      NETWORK:   Intel Ether Pro adapters, 100 Mbps
      OS:        Red Hat 6.2 with kernel 2.4.0 / 2.4.4 / 2.4.7

 

  Network Setup:

         The network NICs are connected with cross-over cables for the present workload. For future workloads we will use a fast Ethernet, full-duplex, 24-port Foundry Networks FastIron Workgroup switch. The network traffic is balanced across all the adapters through static host routes and permanent ARPs.

 

How do I run Netperf3?

      a.   Linux Kernel: Get the correct kernel from kernel.org.

                Compile and install the kernel (see Appendix 1).

   

      b.   Installation of Netperf and its script files:

 

  Download netperf3, available at www.netperf.org under the experimental directory, and apply netperf_adp.patch to get multi-adapter support. Build Netperf3+ and Netserver3+ with the pthread flag turned on.  In this document the Netperf3+ server is referred to as netserver3 and the client as netperf3.

 

  Copy the server (netserver3) to the /usr/local/bin directory on the server and the client (netperf3) to /usr/local/bin on the client machine.  Netserver is a daemon that can be run either standalone or under the inetd daemon.  Add /usr/local/bin to the PATH.

  Also copy 4adp_1thread.sh, 2adp_1thread.sh, 1adp_1thread.sh, 4adp_100thread.sh, 2adp_100thread.sh, 1adp_100thread.sh, start.sh, stop.sh, bsetup.sh, esetup.sh, and vm.sh to the /usr/local/bin directory on the client, and copy serv4.sh, serv2.sh, serv1.sh, start.sh, stop.sh, bsetup.sh, esetup.sh, and vm.sh to /usr/local/bin on the server.

  Make a separate directory for each configuration. For example, for running 4-adapter stream tests using kernel 2.4.0, make s240_4adp_1way_1thread, s240_4adp_2way_1thread, and s240_4adp_4way_1thread directories on both the server and client machines, as sketched below.  Since I run different tests back-to-back, I chose to run the server daemon standalone rather than through inetd.  I also use the default port for Netperf. (Change the script files to take the port, remote host, and local host as inputs.)
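
  A minimal sketch of the directory setup for the example above. The /usr/local/bin/stream240 parent path follows the example in the Test Execution section; the exact naming is illustrative, so adjust it for other kernels, adapter counts, and CPU configurations:

        # one directory per test configuration, created on both the client and the server
        mkdir -p /usr/local/bin/stream240/s240_4adp_1way_1thread
        mkdir -p /usr/local/bin/stream240/s240_4adp_2way_1thread
        mkdir -p /usr/local/bin/stream240/s240_4adp_4way_1thread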


 Linux Kernel: Build the kernel, selecting the correct drivers for your network cards under the Network device support option in menuconfig. If you are using IBM EtherJet or Intel Ether Pro adapters, bump up TX_RING_SIZE and RX_RING_SIZE to 128 before building the kernel; otherwise the adapter will fail with an "out of resources" error. This will be fixed in later kernels (it is not fixed in the 2.4.0/2.4.4 kernels used for this test).
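
 A minimal sketch of the ring-size change, assuming the Intel adapters use the eepro100 driver (the file name and default values are assumptions and may differ in your kernel tree):

        # in the kernel source tree, locate and raise the ring sizes before building
        cd /usr/src/linux/drivers/net
        grep -n 'RING_SIZE' eepro100.c      # find the TX_RING_SIZE / RX_RING_SIZE defines
        # edit the file so that both defines read:
        #     #define TX_RING_SIZE 128
        #     #define RX_RING_SIZE 128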

 

2. Prepare Test Hardware

 

a. Kernel Selection and Server/Client Reboot (see the lilo.conf sketch after this list):

    1. Log on to the server/client as root

    2. Edit /etc/lilo.conf to boot the correct kernel  

    3. Run "lilo"  

    4. Run "reboot" or "shutdown -r now" to reboot the server

 

b. Load Balance Network Traffic: 

Copy route.sh to /usr/local/bin and execute it from /etc/rc.local so that the static routes and permanent ARPs are set every time the system starts, for example:
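
The following line can be appended to /etc/rc.local (a sketch; the path matches where route.sh is copied above):

            # set static routes and permanent ARPs at every boot
            /usr/local/bin/route.sh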

 

How to set Static Routes and Permanent Arps:  

 Type "netstat -rn" to see the existing routes:

execute /usr/local/bin/route.sh to set the static routes and permanent arps on both the client and server systems if you want to refresh that is, if you restart the network. Route.sh gets executed from startup files. route.sh has the following commands:

 

route add -host remote_ipaddress interface_device_name

arp -s remote_host remote_hwaddr

 

Issue the above "route" command for each interface_device_name, namely in our case, eth1, eth2, eth3, eth4 and Issue the above "arp" command for each remote_interface.After setting the static route address type "netstat -rn" to see the static routes and issue "arp -a" to see the arp entries.

  Make sure that all the adapters are functioning using the following commands:

 ifconfig   -- shows all the interfaces that are up

 ping       -- ping each remote host from both the server and the client

 netstat -i -- shows statistics such as bytes transmitted and received, errors, collisions, etc.

 

 Each interface should show a roughly equal number of packets as a result of the ping commands executed above.

 

cat /proc/net/dev --  also shows per-interface statistics, including the number of bytes transferred.

 

Avoid other network interference: Bring down the interfaces that are not used for the test, especially the ones connected to the backbone, by issuing "ifdown interface" (eth0 in our case).  The interface connected to the backbone can be identified with the "ifconfig" command.

 

Set up the hosts file:  Update the /etc/hosts file on both the server and client systems to add the remote hostnames and their IP addresses, for example as sketched below.
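
A minimal sketch of the /etc/hosts additions, using the interface addresses from route.sh in Appendix 2 and the hostnames passed to netserver3 in serv4.sh (the pairing of names to addresses here is illustrative; substitute your own):

        # /etc/hosts entries for the test interfaces
        198.110.20.15   perf1
        198.110.20.16   perf2
        198.110.20.17   perf3
        198.110.20.18   perf4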

 

Config changes for 1000 connection test:

Execute the following commands in each session where you will run netperf and netserver.

 

ulimit -n 8192                          -- increases the number of open files per process to 8192 (the default is 1024)

echo "32767" > /proc/sys/fs/file-max    -- increases the system-wide maximum number of open file handles

 

 Test Execution:

   Make sure /usr/local/bin is included in the PATH environment variable. The convention followed for the directory setup on both the client and the server is:

 Testname_kernelversion_adp_cpuused_connections/  

   For example, s240_4adp_1way_1connection/

 

Go to the test directory on the server and execute serv4.sh to start the server.  In another session, start vm.sh just before starting the test on the client machine.  The script vm.sh collects all the system information at the beginning of the test, collects vmstat (CPU usage) output while the test is running, and finally collects network statistics after the test completes.

 

On the client machine, open a session, go to the test-specific directory, and execute vm.sh, which collects system information, CPU usage every 4 seconds continuously, and network statistics at the end.  Open another session, go to the test-specific directory, and execute the test-specific script file.  Note that changing to the test-specific directory, such as /usr/local/bin/stream240/4adp_1way_1thread, is important in order to collect test-specific data.  Note: when you open a session you need to log in as root to set static routes, permanent ARPs, etc.

Since netperf3 does not include a convergence test, each test is run 3 times to make sure the results converge within 5% variation.  Take the average of the 3 runs as the result, as sketched below.
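
A minimal sketch of the convergence check, assuming the three per-run throughput totals (in Mbits/s) are supplied as arguments to a hypothetical helper script (not part of the original test scripts):

        #!/bin/sh
        # usage: check_runs.sh run1 run2 run3   (throughput totals in Mbits/s)
        echo "$1 $2 $3" | awk '{
            avg = ($1 + $2 + $3) / 3
            max = $1; min = $1
            for (i = 2; i <= 3; i++) { if ($i > max) max = $i; if ($i < min) min = $i }
            printf "average = %.2f Mbits/s, spread = %.1f%%\n", avg, (max - min) / avg * 100
        }'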

 

Data Archive: 

1. On the client, under the test directory, network throughput log files for each message size (4, 16, 1024, 2048, 4096, 8192, 32768) are collected along with vmstat.log, network statistics, and system information from before the test starts and after it completes.

 2. On the server, under the test directory, the vmstat.log file, network statistics, and system information from before and after the test are collected.

 All of this data needs to be saved for later data review and verification, and for future reference.

  

Appendix 1: Build and Install a Linux Kernel

 

Get the kernel

Get the Linux kernel from ftp://www.kernel.org.   Kernel images are usually located at /pub/linux/kernel/v2.4.   Kernels are available in gzip and bzip2 compressed format.

 

Unpack the Kernel Tar File

Change directory to '/usr/src' and unpack the tarball (substitute "2.4.0" with the version you are using).  For gzip compressed kernels: 'tar zxvf linux-2.4.0.tar.gz'.  For bzip2 compressed kernels: 'bunzip2 -c linux-2.4.0.tar.bz2 | tar xvf -'.  This will create a new directory called 'linux'.  Move linux to linux-<version>: 'mv linux linux-2.4.0'.  Create a symbolic link 'linux' to the new kernel directory: 'ln -s linux-2.4.0 linux'.

 

 

Configure the Kernel

Make sure to have no stale .o files and dependencies lying around:

            ‘cd linux’

            ‘make mrproper’

Use "make menuconfig" to configure the kernel using the default options plus:

·        Select SMP option for SMP kernels

·        For UP kernels, enable APIC support

·        Add 4GB mem support.

·        Add IBM ServeRAID support (in kernel, not module) as needed.

·        Add Adaptec AIC7xxx support (in kernel, not module) as needed.

·        Add appropriate network NIC support (in kernel, not module) as needed

 

Make the Kernel

Set up all the dependencies and build a compressed kernel image

            ‘make dep’

            ‘make bzImage’

 

If you configured any parts of the kernel as modules, you also need to run

            ‘make modules’

            ‘make modules_install’

 

Boot the Kernel

To boot the new kernel, you need to copy the kernel image bzImage that you just built to the directory where your bootable kernel is normally found, e.g., /boot, and rename it to bzImage-240-SMP

            ‘cp arch/i386/boot/bzImage   /boot/bzImage-240-SMP’

 

You also need to save the file System.map

            'cp System.map    /boot/System.map'

 

Configure the lilo boot loader by editing the file /etc/lilo.conf and adding the following entry:

            image=/boot/bzImage-240-SMP   # the kernel image installed above
            label=240SMP                  # give it the name "240SMP"
            root=/dev/hda1                # change /dev/hda1 to your root filesystem partition
            read-only
            append="CPU=4 idle=poll"

Install lilo on your drive by running

            ‘lilo’

 

Shutdown and reboot the system

            'shutdown -r now'

 

Appendix 2:  Test Preparation Script Files:

     An example of  each script file is given below:

         route.sh is used to set the static host routes and permanent ARPs:

                 route add -host 198.110.20.15 eth1

        route add -host 198.110.20.16 eth2

        route add -host 198.110.20.17 eth3

        route add -host 198.110.20.18 eth4

        arp -s 198.110.20.15  00:04:AC:D8:37:FB

        arp -s 198.110.20.16  00:04:AC:D8:81:D6

        arp -s 198.110.20.17  00:04:AC:D8:83:2C

        arp -s 198.110.20.18  00:04:AC:D8:38:3A

 

         Serv4.sh is used to start the server using 4 adapters:

                 netserver3 -H perf1,perf2,perf3,perf4

 

      

         rstart.sh collects data before the test starts; its contents are:

          start.sh                  -- collects system information
          netstat -i &> netbb       -- collects network statistics per interface
          cat /proc/stat &> bb      -- collects CPU statistics
          vmstat 4 | tee vmstat.log -- collects CPU statistics every 4 seconds continuously

 

          rstop.sh collects data after the test run; its contents are:

             stop.sh                -- collects system information
             netstat -i &> netee    -- network statistics
             cat /proc/stat &> ee   -- CPU utilization during the test

              

            The contents of start.sh:

               cat /proc/meminfo > meminfo-start

        cat /proc/slabinfo > slabinfo-start

        cat /proc/cpuinfo > cpuinfo

        cat /proc/version > version

        cat /proc/interrupts > interrupts-start

        cat /proc/net/netstat > netstat-start

        uname -a > uname

        ifconfig -a > ifconfig-start 2>&1

        ps axu > ps-start 2>&1

 

           The contents of stop.sh:

              ps axu > ps-end 2>&1

       cat /proc/meminfo > meminfo-end

       cat /proc/slabinfo > slabinfo-end

       cat /proc/interrupts > interrupts-end

       cat /proc/net/netstat > netstat-end

       ifconfig -a > ifconfig-end 2>&1

 

Appendix 3:   Test Script Files

  There are three sets of script files for the TCP protocol tests. Right now we use only the stream test files.

StreamTestfiles_tcp:    

  s_4adp_1thread.sh, s_2adp_1thread.sh, s_1adp_1thread.sh,
  s_4adp_100thread.sh, s_2adp_100thread.sh, s_1adp_100thread.sh

 

Rrtestfiles_tcp:   

   rr_4adp_1thread.sh, rr_2adp_1thread.sh, rr_1adp_1thread.sh,
   rr_4adp_100thread.sh, rr_2adp_100thread.sh, rr_1adp_100thread.sh

 

CRRtestfiles_tcp:     

 crr_4adp_1thread.sh, crr_2adp_1thread.sh, crr_1adp_1thread.sh,
 crr_4adp_100thread.sh, crr_2adp_100thread.sh, crr_1adp_100thread.sh

 

 

 

Appendix 4: Generate Netperf Results:

   Collect the throughput log files for each test run from the test-specific directory. For example, for a 4-adapter, 1-way server, 1-connection test, copy the msgsize.log files (where msgsize is 4, 16, 1024, etc.) from the s240_4adp_1way_1thread directory.  The network throughput is collected per thread, so in test runs that use multiple threads (multiple connections) the throughput needs to be summed across threads, for example as sketched below.  Right now we use a single client, so the results are collected only at the client end.
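
 A minimal sketch of summing per-thread throughput for one message size, assuming each per-thread entry in the log ends with a throughput value in Mbits/s and that the file follows the msgsize.log naming above (the actual log layout may differ; adjust the field accordingly):

        # sum the last field (throughput in Mbits/s) across all per-thread entries
        awk '{ total += $NF } END { printf "total throughput = %.2f Mbits/s\n", total }' 8192.log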

 

 

Appendix 5:  Sample Results:

 

                                           Netperf3 Stream Test - 1 Connection/Adapter

                                             2P  500 MHz 770 MB Mem

                                                    4 adapter test