Minutes from 4/19/2002 LSE conference call
(compiled by Shailabh Nagar nagar@us.ibm.com)

Daniel Phillips : status of work
--------------------------------

- working on an embedded platform
- an early proof-of-concept patch for uml is working and has been posted
  to lkml/lse

Major changes :
1) ulong changes to architectures where phys_t type is long long
2) bring all occurrences of mem_map into the per-arch page.h files

He has introduced a new term, "ordinal" (formerly known as pagenum), which
is related to the logical address space in that
      logical address = ordinal << PAGE_SHIFT
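
The relation above can be sketched in C as follows. This is only an
illustration: the type name ordinal_t, the helper names and the 4 KiB
page size are assumptions made for the sketch and are not taken from
Daniel's patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed 4 KiB pages; the real PAGE_SHIFT is defined per architecture. */
#define PAGE_SHIFT 12

/* Hypothetical type name for an ordinal (formerly "pagenum"). */
typedef uint64_t ordinal_t;

/* logical address = ordinal << PAGE_SHIFT */
static uint64_t ordinal_to_logical(ordinal_t ordinal)
{
    /* Guard against shifting significant bits out of the 64-bit result. */
    assert(ordinal <= (UINT64_MAX >> PAGE_SHIFT));
    return ordinal << PAGE_SHIFT;
}

/* Inverse direction: drop the offset within the page. */
static ordinal_t logical_to_ordinal(uint64_t logical)
{
    return logical >> PAGE_SHIFT;
}
```

With these assumptions, ordinal 3 corresponds to logical address 0x3000,
and any logical address from 0x3000 to 0x3fff maps back to ordinal 3.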

3) Initializing the section table in a generic way is a challenging task.
One way to do it is to hardcode it for each platform (this would not affect
discontig_mem much). A more generic approach is through the bootprompt
using entries like
    mem = <start physical address>@<size>

A list of memory regions can be built up by appending such entries, and
these would then populate the physical address space of the kernel.
The entries would typically be passed from lilo. The order of the entries
would be significant.

The information could also come from the BIOS. If someone can provide
ACPI support for the code that Daniel has sent out, it would be good.
The bootprompt should override the ACPI settings.
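
A minimal sketch of how such entries might be collected into an ordered
region list is given below. It follows the field order as written in
these minutes (start, then size) and assumes no spaces inside an entry.
The structure names, the helper append_mem_entry() and the fixed limit
MAX_REGIONS are hypothetical and not part of Daniel's code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MAX_REGIONS 8            /* arbitrary limit for this sketch */

struct mem_region {              /* hypothetical structure */
    uint64_t start;              /* start physical address */
    uint64_t size;               /* size in bytes */
};

struct region_list {
    struct mem_region r[MAX_REGIONS];
    int count;                   /* entries are kept in the order given */
};

/*
 * Parse one "mem=<start>@<size>" entry and append it to the list.
 * Returns 0 on success, -1 if the entry is malformed or the list is full.
 */
static int append_mem_entry(struct region_list *list, const char *entry)
{
    const char *start_str, *size_str;
    char *end;
    uint64_t start, size;

    assert(list != NULL && entry != NULL);
    if (strncmp(entry, "mem=", 4) != 0 || list->count >= MAX_REGIONS)
        return -1;

    /* Base 0 accepts both decimal and 0x-prefixed hexadecimal. */
    start_str = entry + 4;
    start = strtoull(start_str, &end, 0);
    if (end == start_str || *end != '@')
        return -1;

    size_str = end + 1;
    size = strtoull(size_str, &end, 0);
    if (end == size_str || *end != '\0')
        return -1;

    list->r[list->count].start = start;
    list->r[list->count].size = size;
    list->count++;
    return 0;
}
```

Because each entry is appended in the order it is seen, the position of a
region in the list reflects its position on the command line, which is
where the significance of the ordering would come from.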


Bill Irwin : status of pagemap_lru lock breakup
-----------------------------------------------

Bill has identified some more issues :

- it's necessary to go through a truncate algorithm to be safe
- examined Arjan's version of lru lock breakup
- looked at pte chain locking : a port of Martin's work
- concerned about size of struct page
- made per zone pte chain free lists
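
As a rough illustration of what "per zone free lists" and lock breakup
mean in general, the userspace sketch below gives each zone its own free
list and its own lock, so that operations on different zones do not
contend on one global lock. It is not Bill's patch: the structure layout,
the function names, the zone count and the simple C11 spinlock are all
assumptions made for the example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define NR_ZONES 3               /* assumed zone count for the sketch */

struct pte_chain {               /* stand-in element; real layout differs */
    struct pte_chain *next;
};

struct zone_freelist {
    atomic_flag lock;            /* one lock per zone, not one global lock */
    struct pte_chain *head;
    int nr_free;
};

static struct zone_freelist zone_lists[NR_ZONES];

static void freelists_init(void)
{
    for (int i = 0; i < NR_ZONES; i++) {
        atomic_flag_clear(&zone_lists[i].lock);
        zone_lists[i].head = NULL;
        zone_lists[i].nr_free = 0;
    }
}

static void zone_lock(struct zone_freelist *z)
{
    /* Spin until the flag was previously clear. */
    while (atomic_flag_test_and_set_explicit(&z->lock, memory_order_acquire))
        ;
}

static void zone_unlock(struct zone_freelist *z)
{
    atomic_flag_clear_explicit(&z->lock, memory_order_release);
}

/* Return an element to the free list of the given zone. */
static void pte_chain_free(int zone, struct pte_chain *pc)
{
    assert(zone >= 0 && zone < NR_ZONES);
    struct zone_freelist *z = &zone_lists[zone];

    zone_lock(z);
    pc->next = z->head;
    z->head = pc;
    z->nr_free++;
    zone_unlock(z);
}

/* Take an element from the zone's free list, or NULL if it is empty. */
static struct pte_chain *pte_chain_alloc(int zone)
{
    assert(zone >= 0 && zone < NR_ZONES);
    struct zone_freelist *z = &zone_lists[zone];

    zone_lock(z);
    struct pte_chain *pc = z->head;
    if (pc != NULL) {
        z->head = pc->next;
        z->nr_free--;
    }
    zone_unlock(z);
    return pc;
}
```

Only the lock of the zone being touched is taken, so two CPUs working on
different zones never wait on each other in this sketch.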

rmap runs on 32-bit NUMA. Stress testing results are good, indicating that
the patches are stable. Thus lock breakup is working, but its effect on
performance is not known yet. Ruth Forrester is expected to have results
for rmap soon.

Preliminary results : lock breakup brings NUMA contention close to that
for scanning VM. Martin will send out a mail with some results.
There was a debate on the importance of guaranteeing stability before
measuring performance.


Dave Olien : proposed preemption deferral interface
---------------------------------------------------

Dave discussed an interface for applications to tell the kernel to hold off
on involuntary preemption of a task. A scenario in which this would be
useful is where one task holds a lock and gets preempted. Other tasks get
scheduled only to block while trying to acquire the lock. If the first task
could delay its preemption, some overhead could be saved.

Two issues need to be addressed :
1) mechanism to inform kernel should be inexpensive. System calls are too
heavy. For the prototype, Dave will use the gs segment register, which
will break the modify_ldt syscall. Eventually an architecture-independent
solution has to be found, e.g. using shared memory pages as in fast user
level locks.

2) preventing denial of service attacks/unfairness : the task requesting
non-preemption should not be allowed to hog the CPU. One way to address
the problem is to require privileges to use the interface. Another way is
to make the request advisory instead of mandatory. The kernel could give
the task a time window in which it would not be preempted. Beyond the
window, preemption would go on as before.
Daniel suggested using a credit balance of CPU quantum usage where the
non-preemption period would use up the invoking task's time-slice faster.
Thus an app would be penalized or charged for use of the non-preemption
interface, and CPU time would still be fairly distributed over a longer
time period.
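
The advisory window and the credit-balance idea can be combined in a
small model such as the one below. It is a sketch under stated
assumptions, not the proposed kernel interface: the structure, the
function tick_should_preempt(), the two-tick window and the double
charging rate are all invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEFER_WINDOW_TICKS 2     /* assumed length of the grace window */
#define DEFER_CHARGE       2     /* assumed cost of one deferred tick */

struct task {                    /* hypothetical, not the kernel's struct */
    int  timeslice;              /* remaining CPU credit, in ticks */
    bool defer_requested;        /* advisory hint set by the application */
    int  deferred_ticks;         /* ticks already spent inside the window */
};

/*
 * Called once per timer tick for the running task. want_preempt says
 * whether the scheduler would normally preempt the task at this tick.
 * Returns true if the task should be preempted now.
 */
static bool tick_should_preempt(struct task *t, bool want_preempt)
{
    assert(t != NULL);

    bool deferring = want_preempt && t->defer_requested &&
                     t->deferred_ticks < DEFER_WINDOW_TICKS;

    if (deferring) {
        /* Honour the hint, but charge the task at a higher rate. */
        t->deferred_ticks++;
        t->timeslice -= DEFER_CHARGE;
        return false;
    }

    /* Normal accounting: one tick of credit per tick of CPU time. */
    t->timeslice -= 1;
    if (want_preempt) {
        /* Window exhausted or no hint: preempt as before. */
        t->deferred_ticks = 0;
        return true;
    }
    return false;
}
```

In this model a task that asks for deferral keeps the CPU for at most
DEFER_WINDOW_TICKS extra ticks and pays DEFER_CHARGE ticks of credit for
each of them, so over a longer period it receives less CPU time in
exchange. How a negative balance would be carried over is left open here.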

Dave clarified that the work would be in the context of the 2.4 kernel.

Other
-----

Gerrit announced that the lse04 rollup patch was available from the lse
webpage. It collected the most stable patches that had been used to
enhance performance in the 2.4 kernel.
If there are issues with the rollup patch or suggestions for including
other patches, please contact Gerrit (gh@us.ibm.com) or send them out on
the lse mailing list.


Daniel revealed that his next project would be on soft page sizes for the
page cache.