To: Paul McKenney/Beaverton/IBM@IBMUS
cc: andi@suse.de, andrea@suse.de, aono@ed2.com1.fc.nec.co.jp, beckman@turbolabs.com, bjorn_helgaas@hp.com, Hubertus Franke/Watson/IBM@IBMUS, Jerry.Harrow@Compaq.com, jwright@engr.sgi.com, kanoj@engr.sgi.com, kumon@flab.fujitsu.co.jp, norton@mclinux.com, suganuma@hpc.bs1.fc.nec.co.jp, sunil.saxena@intel.com, tbutler@hi.com, woodman@missioncriticallinux.com, mikek@sequent.com, Kenneth Rozendal/Austin/IBM@IBMUS
Date: 03/29/01 03:16 PM
From: Kanoj Sarcar
Subject: Re: NUMA-on-Linux roadmap, version 2

> o General specification of "hints" that indicate relationships
> between different objects, including tasks, a given task's
> data/bss/stack/heap ("process" to the proprietary-Unix people
> among us), a given shm/mmap area, a given device,
> a given CPU, a given node, and a given IPC conduit.
> Status: Needs definition, research, prototyping, and
> experimentation. Urgency: Hopefully reasonably low,
> given the complexities that would be involved if not done
> correctly.
>
> I believe that this is more a research topic than a set of
> requirements that one could develop to at present (though there
> is certainly ample room for practical prototyping of some of the
> alternatives). I have asked Hubertus Franke to check and see
> if there are universities or other research institutions that
> would be interested in working on this. People interested in
> this area, please let Hubertus know!
>
If it wasn't apparent from the fact that I proposed this, I should
mention that I am trying to come up with an API based on this idea. I am
trying to make it general enough that you could also ask for physical
CPU/memory bindings via the API. I will refrain from talking about this
until I have something more concrete, at which point we can discuss
whether it is too complicated, etc. By the way, the IRIX mld* interfaces
were designed, in concept, to do something like this, but were later abused
as representations of actual hardware instead of providing the
virtualization that they should provide.

Instead of focusing on implementation specifics right away, I would
suggest focusing on the things that we want to do. It seems accepted
that we want to be able to tie a thread to a set of processors, allocate
memory from certain nodes, etc. _How_ to tell the kernel to do all this
could be via cpuset/memoryset/nodeset/this-new-hunky-api, etc. The choice
will depend at least partly on whether some of these become part of
standard Linux. For example, there is a good argument that
cpuset/memoryset are needed for non-NUMA too ... in which case there is
no point implementing nodeset/this-new-hunky-api. I am expecting Linus to
provide some feedback on cpuset/memoryset acceptability.
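
To make the "what we want" part concrete, here is a rough sketch of a
cpuset as a plain bitmask, plus the mapping from a node to the CPUs it
contains. Everything in it is hypothetical and only for illustration: the
names (cpuset_t, cpuset_add, cpuset_has, node_to_cpuset), the 64-CPU limit,
and the fixed 4-CPUs-per-node topology are assumptions, not an existing
or proposed kernel interface.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical: a cpuset modelled as a 64-bit mask, one bit per CPU. */
typedef uint64_t cpuset_t;

/* Assumed topology for the sketch: every node holds 4 consecutive CPUs. */
#define CPUS_PER_NODE 4u
#define MAX_CPUS      64u

/* Return a copy of 'set' with 'cpu' added; out-of-range CPUs are ignored. */
static cpuset_t cpuset_add(cpuset_t set, unsigned cpu)
{
    if (cpu >= MAX_CPUS)
        return set;
    return set | ((cpuset_t)1 << cpu);
}

/* True if 'cpu' is a member of 'set'. */
static bool cpuset_has(cpuset_t set, unsigned cpu)
{
    return cpu < MAX_CPUS && ((set >> cpu) & 1u) != 0;
}

/* Return the set of all CPUs on 'node', or an empty set if out of range. */
static cpuset_t node_to_cpuset(unsigned node)
{
    unsigned first = node * CPUS_PER_NODE;

    if (first >= MAX_CPUS)
        return 0;
    return (cpuset_t)((1u << CPUS_PER_NODE) - 1u) << first;
}
```

With something like this, "run this thread only on node 1" reduces to
handing the scheduler node_to_cpuset(1), which is why a nodeset could be
layered on top of cpusets instead of being a separate mechanism.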
Kanoj