Proposed Memory Binding API

Version: 0.4
Last Updated: November 5, 2002

Revision History

Versions 0.1 and 0.2 by Paul McKenney, IBM, 2001
Version 0.3 by Michael Hohnbaum, IBM, February 11, 2002
Version 0.4 by Matthew Dobson, IBM, October 25, 2002

Table of Contents

  1. General
    1. Include Files
    2. Memory Blocks vs. Nodes
  2. Memory Binding Calls
  3. Future Extensions


This Memory Binding API provides for the binding of a process's memory to specific Memory Blocks.  It relies on the in-kernel Topology API [link] for topology information.  The API is intended for processes whose memory allocation needs cannot be met by the default kernel allocation policy.  These will typically be large, NUMA-aware processes, such as databases and web servers.

Include Files

The definitions are accessed via:
#include <linux/membind.h>

struct memblk_list

[This is where you describe memblk_list structure]
[This is where you describe the bitmask part]
[This is where you describe the behavior part]

Memory Blocks vs. Nodes

There is often some confusion as to what exactly a Memory Block is, and how it differs from a 'node'.  The following definitions will hopefully dispel any such confusion.


Memory Block
A Memory Block is defined to be a physically contiguous block of memory.  Memory Block is often written more concisely as memblk.  Typically, nodes will contain exactly one memblk.  This is not a rule, but it is a basic assumption of the Linux kernel's memory management routines.  This API makes no assumptions about the relationship/mapping between nodes and memblks and leaves it entirely at the discretion of those implementing the Topology API.

Node
A Node is no more or less than an abstract container for other topology elements.  For the purposes of this API, the node concept is only useful in that nodes are what contain memblks.  This API attempts to deal with nodes as little as possible, and dwell solely in the realm of memory blocks.

Memory Binding Calls

Below are defined the function calls used to implement the Memory Binding API.

  1.  int sys_mem_setbinding(pid_t pid, unsigned int len, unsigned long *usr_mask, unsigned long usr_bhvr);
    Sets the memory binding of a given process.  Given the pid of a process, this call looks up the correct task_struct, copies the given binding bitmask and behavior from userspace, checks their validity, and then sets the process binding appropriately.

    Returns 0 on success, or a negative errno if an error occurs.

  2.  int sys_mem_getbinding(pid_t pid, unsigned int len, unsigned long *usr_mask, unsigned long *usr_bhvr);
    Gets the memory binding of a process.  Given the pid of a process, this call looks up the correct task_struct and copies its current binding bitmask and behavior to userspace via the pointers passed in.

    Returns the length of the bitmask on success, or a negative errno if an error occurs.
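
The calls above take the binding as an array of unsigned long used as a bitmask.  As a rough illustration of how a caller might build such a mask, here is a minimal sketch.  The helper names (memblk_mask_zero, memblk_mask_set, memblk_mask_test, MASK_WORDS) are hypothetical and not part of the proposal, and the sketch assumes that bit N of the mask stands for memblk N and that len counts bits; this document does not state either point.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Number of bits in one word of the mask. */
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Words needed to hold nbits bits (hypothetical helper). */
#define MASK_WORDS(nbits) (((nbits) + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* Clear every bit in a mask that holds nbits bits. */
void memblk_mask_zero(unsigned long *mask, unsigned int nbits)
{
    assert(mask != NULL);
    memset(mask, 0, MASK_WORDS(nbits) * sizeof(unsigned long));
}

/* Mark memblk as allowed.  Returns 0, or -1 if memblk is out of range.
 * Assumption: bit N of the mask corresponds to memblk N. */
int memblk_mask_set(unsigned long *mask, unsigned int nbits,
                    unsigned int memblk)
{
    assert(mask != NULL);
    if (memblk >= nbits)
        return -1;
    mask[memblk / BITS_PER_WORD] |= 1UL << (memblk % BITS_PER_WORD);
    return 0;
}

/* Returns 1 if memblk is set in the mask, 0 otherwise. */
int memblk_mask_test(const unsigned long *mask, unsigned int nbits,
                     unsigned int memblk)
{
    assert(mask != NULL);
    if (memblk >= nbits)
        return 0;
    return (int)((mask[memblk / BITS_PER_WORD] >>
                  (memblk % BITS_PER_WORD)) & 1UL);
}

/*
 * Hypothetical call shape, assuming len is given in bits and that some
 * userspace wrapper for the call exists (neither is defined here):
 *
 *   unsigned long mask[MASK_WORDS(64)];
 *   memblk_mask_zero(mask, 64);
 *   memblk_mask_set(mask, 64, 0);
 *   memblk_mask_set(mask, 64, 2);
 *   err = sys_mem_setbinding(pid, 64, mask, bhvr);
 */
```

The helpers only manipulate the mask in the caller's memory.  The validity checks described for sys_mem_setbinding would still be performed by the kernel on the copied mask.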

Future Extensions

There are additional capabilities that could be implemented with the Memory Binding API.  Some have been identified and are listed below:
  1. Allowing much finer-grained bindings, such as binding VMAs, memory regions, or individual pages to particular memory blocks.
  2. Following up the API with an efficient implementation of soft bindings (allowing bindings to fall back to non-bound memory blocks).
  3. Implementing different allocation schemes, such as Round-Robin, Striping, or First-Touch.
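
This document does not specify how any of these schemes would work.  Purely as an illustration of the Round-Robin idea, the sketch below picks the next memblk from a binding bitmask, wrapping around at the end.  The function name memblk_next_round_robin is hypothetical, and the sketch assumes that bit N of the mask stands for memblk N.

```c
#include <limits.h>
#include <stddef.h>

/* Number of bits in one word of the mask. */
#define RR_BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/*
 * Return the next memblk set in mask after `last`, wrapping around at
 * nbits.  If `last` is the only memblk set, it is returned again.
 * Returns -1 if the mask is empty or the arguments are invalid.
 * Pass last = nbits - 1 to start the search at memblk 0.
 * Assumption: bit N of the mask corresponds to memblk N.
 */
int memblk_next_round_robin(const unsigned long *mask, unsigned int nbits,
                            unsigned int last)
{
    unsigned int step;

    if (mask == NULL || nbits == 0)
        return -1;

    for (step = 1; step <= nbits; step++) {
        /* Candidate position, visited in increasing order with wraparound. */
        unsigned int cand = (last % nbits + step) % nbits;

        if ((mask[cand / RR_BITS_PER_WORD] >>
             (cand % RR_BITS_PER_WORD)) & 1UL)
            return (int)cand;
    }
    return -1;
}
```

An allocator using this scheme would remember the memblk returned for the previous allocation and pass it back in as `last` for the next one.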