Commit 01aa9d51 authored by Linus Torvalds's avatar Linus Torvalds
Browse files

Merge tag 'docs-4.20' of git://git.lwn.net/linux

Pull documentation updates from Jonathan Corbet:
 "This is a fairly typical cycle for documentation. There's some welcome
  readability improvements for the formatted output, some LICENSES
  updates including the addition of the ISC license, the removal of the
  unloved and unmaintained 00-INDEX files, the deprecated APIs document
  from Kees, more MM docs from Mike Rapoport, and the usual pile of typo
  fixes and corrections"

* tag 'docs-4.20' of git://git.lwn.net/linux: (41 commits)
  docs: Fix typos in histogram.rst
  docs: Introduce deprecated APIs list
  kernel-doc: fix declaration type determination
  doc: fix a typo in adding-syscalls.rst
  docs/admin-guide: memory-hotplug: remove table of contents
  doc: printk-formats: Remove bogus kobject references for device nodes
  Documentation: preempt-locking: Use better example
  dm flakey: Document "error_writes" feature
  docs/completion.txt: Fix a couple of punctuation nits
  LICENSES: Add ISC license text
  LICENSES: Add note to CDDL-1.0 license that it should not be used
  docs/core-api: memory-hotplug: add some details about locking internals
  docs/core-api: rename memory-hotplug-notifier to memory-hotplug
  docs: improve readability for people with poorer eyesight
  yama: clarify ptrace_scope=2 in Yama documentation
  docs/vm: split memory hotplug notifier description to Documentation/core-api
  docs: move memory hotplug description into admin-guide/mm
  doc: Fix acronym "FEKEK" in ecryptfs
  docs: fix some broken documentation references
  iommu: Fix passthrough option documentation
  ...
parents 5993692f aea74de4
This is a brief list of all the files in ./linux/Documentation and what
they contain. If you add a documentation file, please list it here in
alphabetical order as well, or risk being hunted down like a rabid dog.
Please keep the descriptions small enough to fit on one line.
Thanks -- Paul G.
Following translations are available on the WWW:
- Japanese, maintained by the JF Project (jf@listserv.linux.or.jp), at
http://linuxjf.sourceforge.jp/
00-INDEX
- this file.
ABI/
- info on kernel <-> userspace ABI and relative interface stability.
CodingStyle
- nothing here, just a pointer to process/coding-style.rst.
DMA-API.txt
- DMA API, pci_ API & extensions for non-consistent memory machines.
DMA-API-HOWTO.txt
- Dynamic DMA mapping Guide
DMA-ISA-LPC.txt
- How to do DMA with ISA (and LPC) devices.
DMA-attributes.txt
- listing of the various possible attributes a DMA region can have
EDID/
- directory with info on customizing EDID for broken gfx/displays.
IPMI.txt
- info on Linux Intelligent Platform Management Interface (IPMI) Driver.
IRQ-affinity.txt
- how to select which CPU(s) handle which interrupt events on SMP.
IRQ-domain.txt
- info on interrupt numbering and setting up IRQ domains.
IRQ.txt
- description of what an IRQ is.
Intel-IOMMU.txt
- basic info on the Intel IOMMU virtualization support.
Makefile
- It's not of interest for those who aren't touching the build system.
PCI/
- info related to PCI drivers.
RCU/
- directory with info on RCU (read-copy update).
SAK.txt
- info on Secure Attention Keys.
SM501.txt
- Silicon Motion SM501 multimedia companion chip
SubmittingPatches
- nothing here, just a pointer to process/coding-style.rst.
accounting/
- documentation on accounting and taskstats.
acpi/
- info on ACPI-specific hooks in the kernel.
admin-guide/
- info related to Linux users and system admins.
aoe/
- description of AoE (ATA over Ethernet) along with config examples.
arm/
- directory with info about Linux on the ARM architecture.
arm64/
- directory with info about Linux on the 64 bit ARM architecture.
auxdisplay/
- misc. LCD driver documentation (cfag12864b, ks0108).
backlight/
- directory with info on controlling backlights in flat panel displays
block/
- info on the Block I/O (BIO) layer.
blockdev/
- info on block devices & drivers
bt8xxgpio.txt
- info on how to modify a bt8xx video card for GPIO usage.
btmrvl.txt
- info on Marvell Bluetooth driver usage.
bus-devices/
- directory with info on TI GPMC (General Purpose Memory Controller)
bus-virt-phys-mapping.txt
- how to access I/O mapped memory from within device drivers.
cdrom/
- directory with information on the CD-ROM drivers that Linux has.
cgroup-v1/
- cgroups v1 features, including cpusets and memory controller.
cma/
- Continuous Memory Area (CMA) debugfs interface.
conf.py
- It's not of interest for those who aren't touching the build system.
connector/
- docs on the netlink based userspace<->kernel space communication mod.
console/
- documentation on Linux console drivers.
core-api/
- documentation on kernel core components.
cpu-freq/
- info on CPU frequency and voltage scaling.
cpu-hotplug.txt
- document describing CPU hotplug support in the Linux kernel.
cpu-load.txt
- document describing how CPU load statistics are collected.
cpuidle/
- info on CPU_IDLE, CPU idle state management subsystem.
cputopology.txt
- documentation on how CPU topology info is exported via sysfs.
crc32.txt
- brief tutorial on CRC computation
crypto/
- directory with info on the Crypto API.
dcdbas.txt
- information on the Dell Systems Management Base Driver.
debugging-modules.txt
- some notes on debugging modules after Linux 2.6.3.
debugging-via-ohci1394.txt
- how to use firewire like a hardware debugger memory reader.
dell_rbu.txt
- document demonstrating the use of the Dell Remote BIOS Update driver.
dev-tools/
- directory with info on development tools for the kernel.
device-mapper/
- directory with info on Device Mapper.
dmaengine/
- the DMA engine and controller API guides.
devicetree/
- directory with info on device tree files used by OF/PowerPC/ARM
digsig.txt
-info on the Digital Signature Verification API
dma-buf-sharing.txt
- the DMA Buffer Sharing API Guide
docutils.conf
- nothing here. Just a configuration file for docutils.
dontdiff
- file containing a list of files that should never be diff'ed.
driver-api/
- the Linux driver implementer's API guide.
driver-model/
- directory with info about Linux driver model.
early-userspace/
- info about initramfs, klibc, and userspace early during boot.
efi-stub.txt
- How to use the EFI boot stub to bypass GRUB or elilo on EFI systems.
eisa.txt
- info on EISA bus support.
extcon/
- directory with porting guide for Android kernel switch driver.
isa.txt
- info on EISA bus support.
fault-injection/
- dir with docs about the fault injection capabilities infrastructure.
fb/
- directory with info on the frame buffer graphics abstraction layer.
features/
- status of feature implementation on different architectures.
filesystems/
- info on the vfs and the various filesystems that Linux supports.
firmware_class/
- request_firmware() hotplug interface info.
flexible-arrays.txt
- how to make use of flexible sized arrays in linux
fmc/
- information about the FMC bus abstraction
fpga/
- FPGA Manager Core.
futex-requeue-pi.txt
- info on requeueing of tasks from a non-PI futex to a PI futex
gcc-plugins.txt
- GCC plugin infrastructure.
gpio/
- gpio related documentation
gpu/
- directory with information on GPU driver developer's guide.
hid/
- directory with information on human interface devices
highuid.txt
- notes on the change from 16 bit to 32 bit user/group IDs.
hwspinlock.txt
- hardware spinlock provides hardware assistance for synchronization
timers/
- info on the timer related topics
hw_random.txt
- info on Linux support for random number generator in i8xx chipsets.
hwmon/
- directory with docs on various hardware monitoring drivers.
i2c/
- directory with info about the I2C bus/protocol (2 wire, kHz speed).
x86/i386/
- directory with info about Linux on Intel 32 bit architecture.
ia64/
- directory with info about Linux on Intel 64 bit architecture.
ide/
- Information regarding the Enhanced IDE drive.
iio/
- info on industrial IIO configfs support.
index.rst
- main index for the documentation at ReST format.
infiniband/
- directory with documents concerning Linux InfiniBand support.
input/
- info on Linux input device support.
intel_txt.txt
- info on intel Trusted Execution Technology (intel TXT).
io-mapping.txt
- description of io_mapping functions in linux/io-mapping.h
io_ordering.txt
- info on ordering I/O writes to memory-mapped addresses.
ioctl/
- directory with documents describing various IOCTL calls.
iostats.txt
- info on I/O statistics Linux kernel provides.
irqflags-tracing.txt
- how to use the irq-flags tracing feature.
isapnp.txt
- info on Linux ISA Plug & Play support.
isdn/
- directory with info on the Linux ISDN support, and supported cards.
kbuild/
- directory with info about the kernel build process.
kdump/
- directory with mini HowTo on getting the crash dump code to work.
doc-guide/
- how to write and format reStructuredText kernel documentation
kernel-per-CPU-kthreads.txt
- List of all per-CPU kthreads and how they introduce jitter.
kobject.txt
- info of the kobject infrastructure of the Linux kernel.
kprobes.txt
- documents the kernel probes debugging feature.
kref.txt
- docs on adding reference counters (krefs) to kernel objects.
laptops/
- directory with laptop related info and laptop driver documentation.
ldm.txt
- a brief description of LDM (Windows Dynamic Disks).
leds/
- directory with info about LED handling under Linux.
livepatch/
- info on kernel live patching.
locking/
- directory with info about kernel locking primitives
lockup-watchdogs.txt
- info on soft and hard lockup detectors (aka nmi_watchdog).
logo.gif
- full colour GIF image of Linux logo (penguin - Tux).
logo.txt
- info on creator of above logo & site to get additional images from.
lsm.txt
- Linux Security Modules: General Security Hooks for Linux
lzo.txt
- kernel LZO decompressor input formats
m68k/
- directory with info about Linux on Motorola 68k architecture.
mailbox.txt
- How to write drivers for the common mailbox framework (IPC).
md/
- directory with info about Linux Software RAID
media/
- info on media drivers: uAPI, kAPI and driver documentation.
memory-barriers.txt
- info on Linux kernel memory barriers.
memory-devices/
- directory with info on parts like the Texas Instruments EMIF driver
memory-hotplug.txt
- Hotpluggable memory support, how to use and current status.
men-chameleon-bus.txt
- info on MEN chameleon bus.
mic/
- Intel Many Integrated Core (MIC) architecture device driver.
mips/
- directory with info about Linux on MIPS architecture.
misc-devices/
- directory with info about devices using the misc dev subsystem
mmc/
- directory with info about the MMC subsystem
mtd/
- directory with info about memory technology devices (flash)
namespaces/
- directory with various information about namespaces
netlabel/
- directory with information on the NetLabel subsystem.
networking/
- directory with info on various aspects of networking with Linux.
nfc/
- directory relating info about Near Field Communications support.
nios2/
- Linux on the Nios II architecture.
nommu-mmap.txt
- documentation about no-mmu memory mapping support.
numastat.txt
- info on how to read Numa policy hit/miss statistics in sysfs.
ntb.txt
- info on Non-Transparent Bridge (NTB) drivers.
nvdimm/
- info on non-volatile devices.
nvmem/
- info on non volatile memory framework.
output/
- default directory where html/LaTeX/pdf files will be written.
padata.txt
- An introduction to the "padata" parallel execution API
parisc/
- directory with info on using Linux on PA-RISC architecture.
parport-lowlevel.txt
- description and usage of the low level parallel port functions.
pcmcia/
- info on the Linux PCMCIA driver.
percpu-rw-semaphore.txt
- RCU based read-write semaphore optimized for locking for reading
perf/
- info about the APM X-Gene SoC Performance Monitoring Unit (PMU).
phy/
- ino on Samsung USB 2.0 PHY adaptation layer.
phy.txt
- Description of the generic PHY framework.
pi-futex.txt
- documentation on lightweight priority inheritance futexes.
pinctrl.txt
- info on pinctrl subsystem and the PINMUX/PINCONF and drivers
platform/
- List of supported hardware by compal and Dell laptop.
pnp.txt
- Linux Plug and Play documentation.
power/
- directory with info on Linux PCI power management.
powerpc/
- directory with info on using Linux with the PowerPC.
prctl/
- directory with info on the priveledge control subsystem
preempt-locking.txt
- info on locking under a preemptive kernel.
process/
- how to work with the mainline kernel development process.
pps/
- directory with information on the pulse-per-second support
pti/
- directory with info on Intel MID PTI.
ptp/
- directory with info on support for IEEE 1588 PTP clocks in Linux.
pwm.txt
- info on the pulse width modulation driver subsystem
rapidio/
- directory with info on RapidIO packet-based fabric interconnect
rbtree.txt
- info on what red-black trees are and what they are for.
remoteproc.txt
- info on how to handle remote processor (e.g. AMP) offloads/usage.
rfkill.txt
- info on the radio frequency kill switch subsystem/support.
robust-futex-ABI.txt
- documentation of the robust futex ABI.
robust-futexes.txt
- a description of what robust futexes are.
rpmsg.txt
- info on the Remote Processor Messaging (rpmsg) Framework
rtc.txt
- notes on how to use the Real Time Clock (aka CMOS clock) driver.
s390/
- directory with info on using Linux on the IBM S390.
scheduler/
- directory with info on the scheduler.
scsi/
- directory with info on Linux scsi support.
security/
- directory that contains security-related info
serial/
- directory with info on the low level serial API.
sgi-ioc4.txt
- description of the SGI IOC4 PCI (multi function) device.
sh/
- directory with info on porting Linux to a new architecture.
smsc_ece1099.txt
-info on the smsc Keyboard Scan Expansion/GPIO Expansion device.
sound/
- directory with info on sound card support.
spi/
- overview of Linux kernel Serial Peripheral Interface (SPI) support.
sphinx/
- no documentation here, just files required by Sphinx toolchain.
sphinx-static/
- no documentation here, just files required by Sphinx toolchain.
static-keys.txt
- info on how static keys allow debug code in hotpaths via patching
svga.txt
- short guide on selecting video modes at boot via VGA BIOS.
sync_file.txt
- Sync file API guide.
sysctl/
- directory with info on the /proc/sys/* files.
target/
- directory with info on generating TCM v4 fabric .ko modules
tee.txt
- info on the TEE subsystem and drivers
this_cpu_ops.txt
- List rationale behind and the way to use this_cpu operations.
thermal/
- directory with information on managing thermal issues (CPU/temp)
trace/
- directory with info on tracing technologies within linux
translations/
- translations of this document from English to another language
unaligned-memory-access.txt
- info on how to avoid arch breaking unaligned memory access in code.
unshare.txt
- description of the Linux unshare system call.
usb/
- directory with info regarding the Universal Serial Bus.
vfio.txt
- info on Virtual Function I/O used in guest/hypervisor instances.
video-output.txt
- sysfs class driver interface to enable/disable a video output device.
virtual/
- directory with information on the various linux virtualizations.
vm/
- directory with info on the Linux vm code.
w1/
- directory with documents regarding the 1-wire (w1) subsystem.
watchdog/
- how to auto-reboot Linux if it has "fallen and can't get up". ;-)
wimax/
- directory with info about Intel Wireless Wimax Connections
core-api/workqueue.rst
- information on the Concurrency Managed Workqueue implementation
x86/x86_64/
- directory with info on Linux support for AMD x86-64 (Hammer) machines.
xillybus.txt
- Overview and basic ui of xillybus driver
xtensa/
- directory with documents relating to arch/xtensa port/implementation
xz.txt
- how to make use of the XZ data compression within linux kernel
zorro.txt
- info on writing drivers for Zorro bus devices found on Amigas.
00-INDEX
- this file
acpi-info.txt
- info on how PCI host bridges are represented in ACPI
MSI-HOWTO.txt
- the Message Signaled Interrupts (MSI) Driver Guide HOWTO and FAQ.
PCIEBUS-HOWTO.txt
- a guide describing the PCI Express Port Bus driver
pci-error-recovery.txt
- info on PCI error recovery
pci-iov-howto.txt
- the PCI Express I/O Virtualization HOWTO
pci.txt
- info on the PCI subsystem for device driver authors
pcieaer-howto.txt
- the PCI Express Advanced Error Reporting Driver Guide HOWTO
endpoint/pci-endpoint.txt
- guide to add endpoint controller driver and endpoint function driver.
endpoint/pci-endpoint-cfs.txt
- guide to use configfs to configure the PCI endpoint function.
endpoint/pci-test-function.txt
- specification of *PCI test* function device.
endpoint/pci-test-howto.txt
- userguide for PCI endpoint test function.
endpoint/function/binding/
- binding documentation for PCI endpoint function
00-INDEX
- This file
arrayRCU.txt
- Using RCU to Protect Read-Mostly Arrays
checklist.txt
- Review Checklist for RCU Patches
listRCU.txt
- Using RCU to Protect Read-Mostly Linked Lists
lockdep.txt
- RCU and lockdep checking
lockdep-splat.txt
- RCU Lockdep splats explained.
NMI-RCU.txt
- Using RCU to Protect Dynamic NMI Handlers
rcu_dereference.txt
- Proper care and feeding of return values from rcu_dereference()
rcubarrier.txt
- RCU and Unloadable Modules
rculist_nulls.txt
- RCU list primitives for use with SLAB_TYPESAFE_BY_RCU
rcuref.txt
- Reference-count design for elements of lists/arrays protected by RCU
rcu.txt
- RCU Concepts
RTFP.txt
- List of RCU papers (bibliography) going back to 1980.
stallwarn.txt
- RCU CPU stall warnings (module parameter rcu_cpu_stall_suppress)
torture.txt
- RCU Torture Test Operation (CONFIG_RCU_TORTURE_TEST)
UP.txt
- RCU on Uniprocessor Systems
whatisRCU.txt
- What is RCU?
......@@ -87,7 +87,3 @@ o Where can I find more information on RCU?
See the RTFP.txt file in this directory.
Or point your browser at http://www.rdrop.com/users/paulmck/RCU/.
o What are all these files in this directory?
See 00-INDEX for the list.
......@@ -64,8 +64,8 @@ The sysctl settings (writable only with ``CAP_SYS_PTRACE``) are:
Using ``PTRACE_TRACEME`` is unchanged.
2 - admin-only attach:
only processes with ``CAP_SYS_PTRACE`` may use ptrace
with ``PTRACE_ATTACH``, or through children calling ``PTRACE_TRACEME``.
only processes with ``CAP_SYS_PTRACE`` may use ptrace, either with
``PTRACE_ATTACH`` or through children calling ``PTRACE_TRACEME``.
3 - no attach:
no processes may use ptrace with ``PTRACE_ATTACH`` nor via
......
......@@ -51,8 +51,7 @@ Documentation
- There are various README files in the Documentation/ subdirectory:
these typically contain kernel-specific installation notes for some
drivers for example. See Documentation/00-INDEX for a list of what
is contained in each file. Please read the
drivers for example. Please read the
:ref:`Documentation/process/changes.rst <changes>` file, as it
contains information about the problems, which may result by upgrading
your kernel.
......
......@@ -1764,7 +1764,7 @@
Format: { "0" | "1" }
0 - Use IOMMU translation for DMA.
1 - Bypass the IOMMU for DMA.
unset - Use IOMMU translation for DMA.
unset - Use value of CONFIG_IOMMU_DEFAULT_PASSTHROUGH.
io7= [HW] IO7 for Marvel based alpha systems
See comment before marvel_specify_io7 in
......
......@@ -553,7 +553,7 @@ When nested virtualization is in use, three operating systems are involved:
the bare metal hypervisor, the nested hypervisor and the nested virtual
machine. VMENTER operations from the nested hypervisor into the nested
guest will always be processed by the bare metal hypervisor. If KVM is the
bare metal hypervisor it wiil:
bare metal hypervisor it will:
- Flush the L1D cache on every switch from the nested hypervisor to the
nested virtual machine, so that the nested hypervisor's secrets are not
......
......@@ -29,6 +29,7 @@ the Linux memory management.
hugetlbpage
idle_page_tracking
ksm
memory-hotplug
numa_memory_policy
pagemap
soft-dirty
......
.. _admin_guide_memory_hotplug:
==============
Memory Hotplug
==============
......@@ -9,39 +11,19 @@ This document is about memory hotplug including how-to-use and current status.
Because Memory Hotplug is still under development, contents of this text will
be changed often.
.. CONTENTS
1. Introduction
1.1 purpose of memory hotplug
1.2. Phases of memory hotplug
1.3. Unit of Memory online/offline operation
2. Kernel Configuration
3. sysfs files for memory hotplug
4. Physical memory hot-add phase
4.1 Hardware(Firmware) Support
4.2 Notify memory hot-add event by hand
5. Logical Memory hot-add phase
5.1. State of memory
5.2. How to online memory
6. Logical memory remove
6.1 Memory offline and ZONE_MOVABLE
6.2. How to offline memory
7. Physical memory remove
8. Memory hotplug event notifier
9. Future Work List
.. contents:: :local:
.. note::
(1) x86_64's has special implementation for memory hotplug.
This text does not describe it.
(2) This text assumes that sysfs is mounted at /sys.
(2) This text assumes that sysfs is mounted at ``/sys``.
Introduction
============
purpose of memory hotplug
Purpose of memory hotplug
-------------------------
Memory Hotplug allows users to increase/decrease the amount of memory.
......@@ -57,7 +39,6 @@ hardware which supports memory power management.
Linux memory hotplug is designed for both purpose.
Phases of memory hotplug
------------------------
......@@ -92,7 +73,6 @@ phase by hand.
(However, if you writes udev's hotplug scripts for memory hotplug, these
phases can be execute in seamless way.)
Unit of Memory online/offline operation
---------------------------------------
......@@ -107,10 +87,9 @@ unit upon which memory online/offline operations are to be performed. The
default size of a memory block is the same as memory section size unless an
architecture specifies otherwise. (see :ref:`memory_hotplug_sysfs_files`.)
To determine the size (in bytes) of a memory block please read this file:
/sys/devices/system/memory/block_size_bytes
To determine the size (in bytes) of a memory block please read this file::
/sys/devices/system/memory/block_size_bytes
Kernel Configuration
====================
......@@ -119,22 +98,22 @@ To use memory hotplug feature, kernel must be compiled with following
config options.
- For all memory hotplug:
- Memory model -> Sparse Memory (CONFIG_SPARSEMEM)
- Allow for memory hot-add (CONFIG_MEMORY_HOTPLUG)
- Memory model -> Sparse Memory (``CONFIG_SPARSEMEM``)
- Allow for memory hot-add (``CONFIG_MEMORY_HOTPLUG``)
- To enable memory removal, the following are also necessary:
- Allow for memory hot remove (CONFIG_MEMORY_HOTREMOVE)
- Page Migration (CONFIG_MIGRATION)
- Allow for memory hot remove (``CONFIG_MEMORY_HOTREMOVE``)
- Page Migration (``CONFIG_MIGRATION``)
- For ACPI memory hotplug, the following are also necessary:
- Memory hotplug (under ACPI Support menu) (CONFIG_ACPI_HOTPLUG_MEMORY)
- Memory hotplug (under ACPI Support menu) (``CONFIG_ACPI_HOTPLUG_MEMORY``)
- This option can be kernel module.
- As a related configuration, if your box has a feature of NUMA-node hotplug
via ACPI, then this option is necessary too.
- ACPI0004,PNP0A05 and PNP0A06 Container Driver (under ACPI Support menu)
(CONFIG_ACPI_CONTAINER).
(``CONFIG_ACPI_CONTAINER``).
This option can be kernel module too.
......@@ -145,10 +124,11 @@ sysfs files for memory hotplug
==============================
All memory blocks have their device information in sysfs. Each memory block
is described under /sys/devices/system/memory as:
is described under ``/sys/devices/system/memory`` as::
/sys/devices/system/memory/memoryXXX
(XXX is the memory block id.)
where XXX is the memory block id.
For the memory block covered by the sysfs directory. It is expected that all
memory sections in this range are present and no memory holes exist in the
......@@ -157,7 +137,7 @@ the existence of one should not affect the hotplug capabilities of the memory
block.
For example, assume 1GiB memory block size. A device for a memory starting at
0x100000000 is /sys/device/system/memory/memory4::
0x100000000 is ``/sys/device/system/memory/memory4``::
(0x100000000 / 1Gib = 4)
......@@ -165,11 +145,11 @@ This device covers address range [0x100000000 ... 0x140000000)
Under each memory block, you can see 5 files:
- /sys/devices/system/memory/memoryXXX/phys_index
- /sys/devices/system/memory/memoryXXX/phys_device
- /sys/devices/system/memory/memoryXXX/state
- /sys/devices/system/memory/memoryXXX/removable
- /sys/devices/system/memory/memoryXXX/valid_zones
- ``/sys/devices/system/memory/memoryXXX/phys_index``
- ``/sys/devices/system/memory/memoryXXX/phys_device``
- ``/sys/devices/system/memory/memoryXXX/state``
- ``/sys/devices/system/memory/memoryXXX/removable``
- ``/sys/devices/system/memory/memoryXXX/valid_zones``
=================== ============================================================
``phys_index`` read-only and contains memory block id, same as XXX.
......@@ -207,13 +187,15 @@ Under each memory block, you can see 5 files:
These directories/files appear after physical memory hotplug phase.
If CONFIG_NUMA is enabled the memoryXXX/ directories can also be accessed
via symbolic links located in the /sys/devices/system/node/node* directories.
via symbolic links located in the ``/sys/devices/system/node/node*`` directories.
For example::
For example:
/sys/devices/system/node/node0/memory9 -> ../../memory/memory9
/sys/devices/system/node/node0/memory9 -> ../../memory/memory9
A backlink will also be created:
/sys/devices/system/memory/memory9/node0 -> ../../node/node0
A backlink will also be created::
/sys/devices/system/memory/memory9/node0 -> ../../node/node0
.. _memory_hotplug_physical_mem:
......@@ -240,7 +222,6 @@ If firmware supports NUMA-node hotplug, and defines an object _HID "ACPI0004",
calls hotplug code for all of objects which are defined in it.
If memory device is found, memory hotplug code will be called.
Notify memory hot-add event by hand
-----------------------------------
......@@ -251,8 +232,9 @@ CONFIG_ARCH_MEMORY_PROBE and can be configured on powerpc, sh, and x86
if hotplug is supported, although for x86 this should be handled by ACPI
notification.
Probe interface is located at
/sys/devices/system/memory/probe
Probe interface is located at::
/sys/devices/system/memory/probe
You can tell the physical address of new memory to the kernel by::
......@@ -263,7 +245,6 @@ memory_block_size] memory range is hot-added. In this case, hotplug script is
not called (in current implementation). You'll have to online memory by
yourself. Please see :ref:`memory_hotplug_how_to_online_memory`.
Logical Memory hot-add phase
============================
......@@ -301,7 +282,7 @@ This sets a global policy and impacts all memory blocks that will subsequently
be hotplugged. Currently offline blocks keep their state. It is possible, under
certain circumstances, that some memory blocks will be added but will fail to
online. User space tools can check their "state" files
(/sys/devices/system/memory/memoryXXX/state) and try to online them manually.
(``/sys/devices/system/memory/memoryXXX/state``) and try to online them manually.
If the automatic onlining wasn't requested, failed, or some memory block was
offlined it is possible to change the individual block's state by writing to the
......@@ -334,8 +315,6 @@ available memory will be increased.
This may be changed in future.
Logical memory remove
=====================
......@@ -413,88 +392,6 @@ Need more implementation yet....
- Notification completion of remove works by OS to firmware.
- Guard from remove if not yet.
Memory hotplug event notifier
=============================
Hotplugging events are sent to a notification queue.
There are six types of notification defined in include/linux/memory.h:
MEM_GOING_ONLINE
Generated before new memory becomes available in order to be able to
prepare subsystems to handle memory. The page allocator is still unable
to allocate from the new memory.
MEM_CANCEL_ONLINE
Generated if MEMORY_GOING_ONLINE fails.
MEM_ONLINE
Generated when memory has successfully brought online. The callback may
allocate pages from the new memory.
MEM_GOING_OFFLINE
Generated to begin the process of offlining memory. Allocations are no
longer possible from the memory but some of the memory to be offlined
is still in use. The callback can be used to free memory known to a
subsystem from the indicated memory block.
MEM_CANCEL_OFFLINE
Generated if MEMORY_GOING_OFFLINE fails. Memory is available again from
the memory block that we attempted to offline.
MEM_OFFLINE
Generated after offlining memory is complete.
A callback routine can be registered by calling::
hotplug_memory_notifier(callback_func, priority)
Callback functions with higher values of priority are called before callback
functions with lower values.
A callback function must have the following prototype::
int callback_func(
struct notifier_block *self, unsigned long action, void *arg);
The first argument of the callback function (self) is a pointer to the block
of the notifier chain that points to the callback function itself.
The second argument (action) is one of the event types described above.
The third argument (arg) passes a pointer of struct memory_notify::
struct memory_notify {
unsigned long start_pfn;
unsigned long nr_pages;
int status_change_nid_normal;
int status_change_nid_high;
int status_change_nid;
}
- start_pfn is start_pfn of online/offline memory.
- nr_pages is # of pages of online/offline memory.
- status_change_nid_normal is set node id when N_NORMAL_MEMORY of nodemask
is (will be) set/clear, if this is -1, then nodemask status is not changed.
- status_change_nid_high is set node id when N_HIGH_MEMORY of nodemask
is (will be) set/clear, if this is -1, then nodemask status is not changed.
- status_change_nid is set node id when N_MEMORY of nodemask is (will be)
set/clear. It means a new(memoryless) node gets new memory by online and a
node loses all memory. If this is -1, then nodemask status is not changed.
If status_changed_nid* >= 0, callback should create/discard structures for the
node if necessary.
The callback routine shall return one of the values
NOTIFY_DONE, NOTIFY_OK, NOTIFY_BAD, NOTIFY_STOP
defined in include/linux/notifier.h
NOTIFY_DONE and NOTIFY_OK have no effect on the further processing.
NOTIFY_BAD is used as response to the MEM_GOING_ONLINE, MEM_GOING_OFFLINE,
MEM_ONLINE, or MEM_OFFLINE action to cancel hotplugging. It stops
further processing of the notification queue.
NOTIFY_STOP stops further processing of the notification queue.
Future Work
===========
......
00-INDEX
- this file
Booting
- requirements for booting
CCN.txt
- Cache Coherent Network ring-bus and perf PMU driver.
Interrupts
- ARM Interrupt subsystem documentation
IXP4xx
- Intel IXP4xx Network processor.
Netwinder
- Netwinder specific documentation
Porting
- Symbol definitions for porting Linux to a new ARM machine.
Setup
- Kernel initialization parameters on ARM Linux
README
- General ARM documentation
SA1100/
- SA1100 documentation
Samsung-S3C24XX/
- S3C24XX ARM Linux Overview
SPEAr/
- ST SPEAr platform Linux Overview
VFP/
- Release notes for Linux Kernel Vector Floating Point support code
cluster-pm-race-avoidance.txt
- Algorithm for CPU and Cluster setup/teardown
empeg/
- Ltd's Empeg MP3 Car Audio Player
firmware.txt
- Secure firmware registration and calling.
kernel_mode_neon.txt
- How to use NEON instructions in kernel mode
kernel_user_helpers.txt
- Helper functions in kernel space made available for userspace.
mem_alignment
- alignment abort handler documentation
memory.txt
- description of the virtual memory layout
nwfpe/
- NWFPE floating point emulator documentation
swp_emulation
- SWP/SWPB emulation handler/logging description
tcm.txt
- ARM Tightly Coupled Memory
uefi.txt
- [U]EFI configuration and runtime services documentation
vlocks.txt
- Voting locks, low-level mechanism relying on memory system atomic writes.
00-INDEX
- This file
bfq-iosched.txt
- BFQ IO scheduler and its tunables
biodoc.txt
- Notes on the Generic Block Layer Rewrite in Linux 2.5
biovecs.txt
- Immutable biovecs and biovec iterators
capability.txt
- Generic Block Device Capability (/sys/block/<device>/capability)
cfq-iosched.txt
- CFQ IO scheduler tunables
cmdline-partition.txt
- how to specify block device partitions on kernel command line
data-integrity.txt
- Block data integrity
deadline-iosched.txt
- Deadline IO scheduler tunables
ioprio.txt
- Block io priorities (in CFQ scheduler)
pr.txt
- Block layer support for Persistent Reservations
null_blk.txt
- Null block for block-layer benchmarking.
queue-sysfs.txt
- Queue's sysfs entries
request.txt
- The members of struct request (in include/linux/blkdev.h)
stat.txt
- Block layer statistics in /sys/block/<device>/stat
switching-sched.txt
- Switching I/O schedulers at runtime
writeback_cache_control.txt
- Control of volatile write back caches
00-INDEX
- this file
README.DAC960
- info on Mylex DAC960/DAC1100 PCI RAID Controller Driver for Linux.
cciss.txt
- info, major/minor #'s for Compaq's SMART Array Controllers.
cpqarray.txt
- info on using Compaq's SMART2 Intelligent Disk Array Controllers.
floppy.txt
- notes and driver options for the floppy disk driver.
mflash.txt
- info on mGine m(g)flash driver for linux.
nbd.txt
- info on a TCP implementation of a network block device.
paride.txt
- information about the parallel port IDE subsystem.
ramdisk.txt
- short guide on how to set up and use the RAM disk.
00-INDEX
- this file (info on CD-ROMs and Linux)
Makefile
- only used to generate TeX output from the documentation.
cdrom-standard.tex
- LaTeX document on standardizing the CD-ROM programming interface.
ide-cd
- info on setting up and using ATAPI (aka IDE) CD-ROMs.
packet-writing.txt
- Info on the CDRW packet writing module
00-INDEX
- this file
blkio-controller.txt
- Description for Block IO Controller, implementation and usage details.
cgroups.txt
- Control Groups definition, implementation details, examples and API.
cpuacct.txt
- CPU Accounting Controller; account CPU usage for groups of tasks.
cpusets.txt
- documents the cpusets feature; assign CPUs and Mem to a set of tasks.
admin-guide/devices.rst
- Device Whitelist Controller; description, interface and security.
freezer-subsystem.txt
- checkpointing; rationale to not use signals, interface.
hugetlb.txt
- HugeTLB Controller implementation and usage details.
memcg_test.txt
- Memory Resource Controller; implementation details.
memory.txt
- Memory Resource Controller; design, accounting, interface, testing.
net_cls.txt
- Network classifier cgroups details and usages.
net_prio.txt
- Network priority cgroups details and usages.
pids.txt
- Process number cgroups details and usages.
......@@ -259,7 +259,7 @@ latex_elements = {
'papersize': 'a4paper',
# The font size ('10pt', '11pt' or '12pt').
'pointsize': '8pt',
'pointsize': '11pt',
# Latex figure (float) alignment
#'figure_align': 'htbp',
......@@ -272,8 +272,8 @@ latex_elements = {
'preamble': '''
% Use some font with UTF-8 support with XeLaTeX
\\usepackage{fontspec}
\\setsansfont{DejaVu Serif}
\\setromanfont{DejaVu Sans}
\\setsansfont{DejaVu Sans}
\\setromanfont{DejaVu Serif}
\\setmonofont{DejaVu Sans Mono}
'''
......
......@@ -76,7 +76,7 @@ These interfaces available only with bootmem, i.e when ``CONFIG_NO_BOOTMEM=n``
.. kernel-doc:: include/linux/bootmem.h
.. kernel-doc:: mm/bootmem.c
:nodocs:
:functions:
Memblock specific API
---------------------
......@@ -89,4 +89,4 @@ really happens under the hood.
.. kernel-doc:: include/linux/memblock.h
.. kernel-doc:: mm/memblock.c
:nodocs:
:functions:
.. _gfp_mask_from_fs_io:
=================================
GFP masks used from FS/IO context
=================================
......
......@@ -27,10 +27,13 @@ Core utilities
errseq
printk-formats
circular-buffers
memory-allocation
mm-api
gfp_mask-from-fs-io
timekeeping
boot-time-mm
memory-hotplug
Interfaces for kernel debugging
===============================
......
=======================
Memory Allocation Guide
=======================
Linux provides a variety of APIs for memory allocation. You can
allocate small chunks using `kmalloc` or `kmem_cache_alloc` families,
large virtually contiguous areas using `vmalloc` and its derivatives,
or you can directly request pages from the page allocator with
`alloc_pages`. It is also possible to use more specialized allocators,
for instance `cma_alloc` or `zs_malloc`.
Most of the memory allocation APIs use GFP flags to express how that
memory should be allocated. The GFP acronym stands for "get free
pages", the underlying memory allocation function.
Diversity of the allocation APIs combined with the numerous GFP flags
makes the question "How should I allocate memory?" not that easy to
answer, although very likely you should use
::
kzalloc(<size>, GFP_KERNEL);
Of course there are cases when other allocation APIs and different GFP
flags must be used.
Get Free Page flags
===================
The GFP flags control the allocators behavior. They tell what memory
zones can be used, how hard the allocator should try to find free
memory, whether the memory can be accessed by the userspace etc. The
:ref:`Documentation/core-api/mm-api.rst <mm-api-gfp-flags>` provides
reference documentation for the GFP flags and their combinations and
here we briefly outline their recommended usage:
* Most of the time ``GFP_KERNEL`` is what you need. Memory for the
kernel data structures, DMAable memory, inode cache, all these and
many other allocations types can use ``GFP_KERNEL``. Note, that
using ``GFP_KERNEL`` implies ``GFP_RECLAIM``, which means that
direct reclaim may be triggered under memory pressure; the calling
context must be allowed to sleep.
* If the allocation is performed from an atomic context, e.g interrupt
handler, use ``GFP_NOWAIT``. This flag prevents direct reclaim and
IO or filesystem operations. Consequently, under memory pressure
``GFP_NOWAIT`` allocation is likely to fail. Allocations which
have a reasonable fallback should be using ``GFP_NOWARN``.
* If you think that accessing memory reserves is justified and the kernel
will be stressed unless allocation succeeds, you may use ``GFP_ATOMIC``.
* Untrusted allocations triggered from userspace should be a subject
of kmem accounting and must have ``__GFP_ACCOUNT`` bit set. There
is the handy ``GFP_KERNEL_ACCOUNT`` shortcut for ``GFP_KERNEL``
allocations that should be accounted.
* Userspace allocations should use either of the ``GFP_USER``,
``GFP_HIGHUSER`` or ``GFP_HIGHUSER_MOVABLE`` flags. The longer
the flag name the less restrictive it is.
``GFP_HIGHUSER_MOVABLE`` does not require that allocated memory
will be directly accessible by the kernel and implies that the
data is movable.
``GFP_HIGHUSER`` means that the allocated memory is not movable,
but it is not required to be directly accessible by the kernel. An
example may be a hardware allocation that maps data directly into
userspace but has no addressing limitations.
``GFP_USER`` means that the allocated memory is not movable and it
must be directly accessible by the kernel.
You may notice that quite a few allocations in the existing code
specify ``GFP_NOIO`` or ``GFP_NOFS``. Historically, they were used to
prevent recursion deadlocks caused by direct memory reclaim calling
back into the FS or IO paths and blocking on already held
resources. Since 4.12 the preferred way to address this issue is to
use new scope APIs described in
:ref:`Documentation/core-api/gfp_mask-from-fs-io.rst <gfp_mask_from_fs_io>`.
Other legacy GFP flags are ``GFP_DMA`` and ``GFP_DMA32``. They are
used to ensure that the allocated memory is accessible by hardware
with limited addressing capabilities. So unless you are writing a
driver for a device with such restrictions, avoid using these flags.
And even with hardware with restrictions it is preferable to use
`dma_alloc*` APIs.
Selecting memory allocator
==========================
The most straightforward way to allocate memory is to use a function
from the :c:func:`kmalloc` family. And, to be on the safe size it's
best to use routines that set memory to zero, like
:c:func:`kzalloc`. If you need to allocate memory for an array, there
are :c:func:`kmalloc_array` and :c:func:`kcalloc` helpers.
The maximal size of a chunk that can be allocated with `kmalloc` is
limited. The actual limit depends on the hardware and the kernel
configuration, but it is a good practice to use `kmalloc` for objects
smaller than page size.
For large allocations you can use :c:func:`vmalloc` and
:c:func:`vzalloc`, or directly request pages from the page
allocator. The memory allocated by `vmalloc` and related functions is
not physically contiguous.
If you are not sure whether the allocation size is too large for
`kmalloc`, it is possible to use :c:func:`kvmalloc` and its
derivatives. It will try to allocate memory with `kmalloc` and if the
allocation fails it will be retried with `vmalloc`. There are
restrictions on which GFP flags can be used with `kvmalloc`; please
see :c:func:`kvmalloc_node` reference documentation. Note that
`kvmalloc` may return memory that is not physically contiguous.
If you need to allocate many identical objects you can use the slab
cache allocator. The cache should be set up with
:c:func:`kmem_cache_create` before it can be used. Afterwards
:c:func:`kmem_cache_alloc` and its convenience wrappers can allocate
memory from that cache.
When the allocated memory is no longer needed it must be freed. You
can use :c:func:`kvfree` for the memory allocated with `kmalloc`,
`vmalloc` and `kvmalloc`. The slab caches should be freed with
:c:func:`kmem_cache_free`. And don't forget to destroy the cache with
:c:func:`kmem_cache_destroy`.
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment