Minutes from 4/19/2002 LSE conference call
(compiled by Shailabh Nagar nagar@us.ibm.com)

Daniel Phillips : status of work
--------------------------------
- working on an embedded platform
- an early proof-of-concept patch for uml is working and has been posted
  to lkml/lse

Major changes :
1) ulong changes to architectures where the phys_t type is long long
2) bring all occurrences of mem_map into the per-arch page.h files.
   He has introduced a new term "ordinal" (formerly known as pagenum),
   which is related to the logical address space in that
       logical address = ordinal << PAGE_SHIFT
3) Initializing the section table in a generic way is a challenging task.
   One way to do it is to hardcode it for each platform (this would not
   affect discontig_mem much). A more generic approach is through the
   boot prompt, using entries like
       mem = @
   A list of memory regions can be appended through such entries, and
   these would then populate the physical address space of the kernel.
   The entries would typically be passed from lilo, and their order
   would be significant.
   The information could also come from the BIOS. It would be good if
   someone could provide ACPI support for the code that Daniel has sent
   out. The boot prompt should override the ACPI settings.

Bill Irwin : Status of pagemap_lru lock breakup
-----------------------------------------------
Bill has identified some more issues :
- it is necessary to go through a truncate algorithm to be safe
- examined Arjan's version of lru lock breakup
- looked at pte chain locking : a port of Martin's work
- concerned about size of struct page
- made per-zone pte chain free lists

rmap runs on 32-bit NUMA. Stress testing results are good, indicating
that the patches are stable. Thus lock breakup is working, but its effect
on performance is not known yet. Ruth Forrester is expected to have
results for rmap soon.

Preliminary results : lock breakup brings NUMA contention close to that
for scanning VM. Martin will send out a mail on some results.
There was a debate on the importance of guaranteeing stability before
measuring performance.

Dave Olien : proposed preemption deferral interface
---------------------------------------------------
Dave discussed an interface for applications to tell the kernel to hold
off on involuntary preemption of a task. A scenario in which this would
be useful is where one task holds a lock and gets preempted. Other tasks
then get scheduled only to block while trying to acquire the lock. If the
first task could delay its preemption, some overhead could be saved.

Two issues need to be addressed :
1) The mechanism to inform the kernel should be inexpensive. System calls
   are too heavy. For the prototype, Dave will use the gs segment
   register, which will break the modify_ldt syscall. Eventually an
   architecture-independent solution has to be found, e.g. using shared
   memory pages as in fast user level locks.
2) Preventing denial of service attacks/unfairness : the task requesting
   non-preemption should not be allowed to hog the CPU. One way to
   address the problem is to require privileges to use the interface.
   Another way is to make the request advisory instead of mandatory. The
   kernel could give the task a time window in which it would not be
   preempted. Beyond the window, preemption would go on as before.
   Daniel suggested using a credit balance of CPU quantum usage, where
   the non-preemption period would use up the invoking task's time-slice
   faster. Thus an app would be penalized or charged for use of the
   non-preemption interface, and CPU time would still be fairly
   distributed over a longer time period.

Dave clarified that the work would be in the context of the 2.4 kernel.

Other
-----
Gerritt announced that the lse04 rollup patch was available from the lse
webpage. It collected the most stable patches that had been used to
enhance performance in the 2.4 kernel.
If there are issues with the rollup patch or suggestions for including
other patches, please contact Gerritt (gh@us.ibm.com) or send them to the
lse mailing list.

Daniel revealed that his next project would be on soft page sizes for the
page cache.
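
Illustrative sketch (not presented on the call) : the advisory time
window and Daniel's credit-balance suggestion for the preemption deferral
interface could be combined roughly as in the C fragment below. All names
(task_credit, credit_tick, DEFER_WINDOW_TICKS, DEFER_CHARGE), the constant
values and the tick-based accounting are assumptions made for
illustration. They are not taken from Dave's prototype or from any kernel
interface.

```c
#include <stdbool.h>

/* Hypothetical per-task accounting state (names are illustrative only). */
struct task_credit {
    long slice_left;      /* remaining time-slice, in timer ticks */
    bool defer_requested; /* task asked the kernel to hold off preemption */
    int  defer_ticks;     /* ticks already spent inside the deferral window */
};

/* Assumed tuning values: window length and per-tick cost while deferring. */
enum { DEFER_WINDOW_TICKS = 2, DEFER_CHARGE = 3 };

/*
 * Account for one timer tick of the running task.
 * want_preempt says whether the scheduler would like to preempt it now.
 * Returns true if the task should be preempted on this tick.
 */
static bool credit_tick(struct task_credit *t, bool want_preempt)
{
    /* The request is advisory: it is honoured only inside the window. */
    bool deferring = t->defer_requested &&
                     t->defer_ticks < DEFER_WINDOW_TICKS;

    if (deferring) {
        t->defer_ticks++;
        t->slice_left -= DEFER_CHARGE; /* deferral burns the slice faster */
    } else {
        t->slice_left -= 1;            /* normal accounting */
    }

    if (t->slice_left <= 0)
        return true;                   /* slice exhausted: always preempt */

    return want_preempt && !deferring;
}
```

With a slice of 10 ticks, a task that requests deferral stays on the CPU
for two ticks at a cost of 3 each (its slice drops to 4). On the third
tick the window has expired and a pending preemption is honoured as
before.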