LSE Con Call Minutes for May 17

I. VM discussion and status with Andrea Arcangeli.

   o ZONE_NORMAL over 1G phys on NUMAQ
   o config_nonlinear on iSeries
   o pte-highmem/highpte scalability
   o roadmap for atomic-persistent user-address-space-based kmaps in 2.5
   o ZONE_NORMAL exhaustion
   o 4GB - 4GB split

1. ZONE_NORMAL over 1G phys on NUMAQ

Andrea is not convinced the NUMAQ(tm) architecture needs the whole
config_nonlinear patch. Martin Bligh said they have been thinking of
the NUMAQ needs as a special case of config_nonlinear. NUMAQ needs
only half of the config_nonlinear changes Daniel has done, not all of
them.

Martin Bligh said there is still an interesting problem with doing DMA
to a ZONE_NORMAL above 4GB. Andrea said it is a secondary problem: if
you never use a pci32 device, it is not a problem. Martin said we
should separate out the DMA assumptions and let the allocator sort out
where it gets memory from. Andrea suggested adding a flag to
explicitly specify ZONE_NORMAL memory above 4GB on other nodes. Martin
liked that idea.

2. config_nonlinear on iSeries

Andrea said one architecture where config_nonlinear is absolutely
required is iSeries(tm). It can have huge holes in physical memory
spread over a large address space; without Daniel's patch, managing
memory could be very slow. This is an architecture where discontigmem
doesn't work well. With hundreds of holes, the nonlinear patches are
absolutely needed.

Andrea doesn't have access to a system with this architecture and
asked if it would be a big problem for the system administrator. Dave
Englebret from IBM iSeries said the memory is not visible to the
administrator. The hypervisor doesn't give the sysadmin access to the
memory, and Linux doesn't see any holes because of the virtual
mapping.

Dave said one thing iSeries would like to see in Linux as soon as it
is ready is hotplug CPU, which is already supported in OS/400.

Martin asked if they would have a problem with noncontiguous physical
memory assumptions being broken.
Dave Englebret said we don't have to worry about physically
noncontiguous memory because all their memory is mapped virtually
contiguous anyway. DMA doesn't need it on iSeries either.

3. pte-highmem/highpte scalability

Martin Bligh said Andrea's changes appear to have fixed the problem
they were having on the NUMAQ. Dave McCracken talked about reserving a
portion of user address space specifically for mapping pagetables.
However, there seems to be a performance problem due to the extra
allocations.

4. atomic-persistent user-address-space-based kmaps

Andrea said we should get per-cpu persistent kmaps working first,
before working on highpte for big memory issues. Bill said Manfred
Spraul has already been working on per-cpu persistent kmaps and it
mostly works, so we can look into other stuff. Andrea said there is
still some work to be done on per-cpu persistent kmaps. We need to run
more benchmarks to help figure out the best solutions.

5. ZONE_NORMAL exhaustion and 4GB/4GB split

Bill wondered whether "bigmem" might be reinstated as a config option
to help with this problem. That was a patch from Kanoj in the 2.2
kernel timeframe that would switch address spaces whenever doing a
context switch. The kernel would not have to map as much other stuff,
but it would switch much more often.

Andrea suggested kmapping struct page for real 64GB systems. Also
discussed was Ben LaHaise's idea of having 4GB for the kernel and 4GB
for user space, then using kmap whenever the kernel wants to access
user pages. Andrea said that would still be very invasive, but
potentially less so than kmapping struct page.

Bill said it would only be used for large 32-bit systems, via a config
option. Currently those systems don't work, so we need some solution.
Bill and Martin asked if it might be accepted into the mainline as a
config option. Andrea said that is up to Linus, not him, but if it
could be made clean and not intrude too much it might get in.

----------
minutes compiled by Hanna Linder (hannal@us.ibm.com)