2 menu "Memory Management options"
4 config SELECT_MEMORY_MODEL
6 depends on ARCH_SELECT_MEMORY_MODEL
10 depends on SELECT_MEMORY_MODEL
11 default DISCONTIGMEM_MANUAL if ARCH_DISCONTIGMEM_DEFAULT
12 default SPARSEMEM_MANUAL if ARCH_SPARSEMEM_DEFAULT
13 default FLATMEM_MANUAL
17 depends on !(ARCH_DISCONTIGMEM_ENABLE || ARCH_SPARSEMEM_ENABLE) || ARCH_FLATMEM_ENABLE
19 This option allows you to change some of the ways that
20 Linux manages its memory internally. Most users will
21 only have one option here: FLATMEM. This is normal
24 Some users of more advanced features like NUMA and
25 memory hotplug may have different options here.
26 DISCONTIGMEM is a more mature, better tested system,
27 but is incompatible with memory hotplug and may suffer
28 decreased performance over SPARSEMEM. If unsure between
29 "Sparse Memory" and "Discontiguous Memory", choose
30 "Discontiguous Memory".
32 If unsure, choose this option (Flat Memory) over any other.
34 config DISCONTIGMEM_MANUAL
35 bool "Discontiguous Memory"
36 depends on ARCH_DISCONTIGMEM_ENABLE
38 This option provides enhanced support for discontiguous
39 memory systems, over FLATMEM. These systems have holes
40 in their physical address spaces, and this option provides
41 more efficient handling of these holes. However, the vast
42 majority of hardware has quite flat address spaces, and
43 can have degraded performance from the extra overhead that
46 Many NUMA configurations will have this as the only option.
48 If unsure, choose "Flat Memory" over this option.
50 config SPARSEMEM_MANUAL
52 depends on ARCH_SPARSEMEM_ENABLE
54 This will be the only option for some systems, including
55 memory hotplug systems. This is normal.
57 For many other systems, this will be an alternative to
58 "Discontiguous Memory". This option provides some potential
59 performance benefits, along with decreased code complexity,
60 but it is newer, and more experimental.
62 If unsure, choose "Discontiguous Memory" or "Flat Memory"
69 depends on (!SELECT_MEMORY_MODEL && ARCH_DISCONTIGMEM_ENABLE) || DISCONTIGMEM_MANUAL
73 depends on (!SELECT_MEMORY_MODEL && ARCH_SPARSEMEM_ENABLE) || SPARSEMEM_MANUAL
77 depends on (!DISCONTIGMEM && !SPARSEMEM) || FLATMEM_MANUAL
79 config FLAT_NODE_MEM_MAP
84 # Both the NUMA code and DISCONTIGMEM use arrays of pg_data_t's
85 # to represent different areas of memory. This variable allows
86 # those dependencies to exist individually.
88 config NEED_MULTIPLE_NODES
90 depends on DISCONTIGMEM || NUMA
92 config HAVE_MEMORY_PRESENT
94 depends on ARCH_HAVE_MEMORY_PRESENT || SPARSEMEM
97 # SPARSEMEM_EXTREME (which is the default) does some bootmem
98 # allocations when memory_present() is called. If this cannot
99 # be done on your architecture, select this option. However,
100 # statically allocating the mem_section[] array can potentially
101 # consume vast quantities of .bss, so be careful.
103 # This option will also potentially produce smaller runtime code
104 # with gcc 3.4 and later.
106 config SPARSEMEM_STATIC
110 # Architecture platforms which require a two level mem_section in SPARSEMEM
111 # must select this option. This is usually for architecture platforms with
112 # an extremely sparse physical address space.
114 config SPARSEMEM_EXTREME
116 depends on SPARSEMEM && !SPARSEMEM_STATIC
118 config SPARSEMEM_VMEMMAP_ENABLE
121 config SPARSEMEM_VMEMMAP
122 bool "Sparse Memory virtual memmap"
123 depends on SPARSEMEM && SPARSEMEM_VMEMMAP_ENABLE
126 SPARSEMEM_VMEMMAP uses a virtually mapped memmap to optimise
127 pfn_to_page and page_to_pfn operations. This is the most
128 efficient option when sufficient kernel resources are available.
130 config HAVE_MEMBLOCK_NODE_MAP
133 config HAVE_MEMBLOCK_PHYS_MAP
136 config HAVE_GENERIC_GUP
139 config ARCH_DISCARD_MEMBLOCK
142 config MEMORY_ISOLATION
146 # Only set this on architectures that have completely implemented the memory
147 # hotplug feature. If you are not sure, don't touch it.
149 config HAVE_BOOTMEM_INFO_NODE
152 # eventually, we can have this option just 'select SPARSEMEM'
153 config MEMORY_HOTPLUG
154 bool "Allow for memory hot-add"
155 depends on SPARSEMEM || X86_64_ACPI_NUMA
156 depends on ARCH_ENABLE_MEMORY_HOTPLUG
158 config MEMORY_HOTPLUG_SPARSE
160 depends on SPARSEMEM && MEMORY_HOTPLUG
162 config MEMORY_HOTPLUG_DEFAULT_ONLINE
163 bool "Online the newly added memory blocks by default"
165 depends on MEMORY_HOTPLUG
167 This option sets the default for the memory hotplug onlining
168 policy (/sys/devices/system/memory/auto_online_blocks), which
169 determines what happens to newly added memory regions. The policy
170 can always be changed at runtime.
171 See Documentation/memory-hotplug.txt for more information.
173 Say Y here if you want all hot-plugged memory blocks to appear in
174 'online' state by default.
175 Say N here if you want the default policy to keep all hot-plugged
176 memory blocks in 'offline' state.
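# Illustration only (a sketch, not part of the help text above): assuming the
# sysfs file named in the help text accepts the strings "online" and
# "offline", the default chosen here could be inspected and overridden at
# runtime as root with:
#   cat /sys/devices/system/memory/auto_online_blocks
#   echo online > /sys/devices/system/memory/auto_online_blocks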
178 config MEMORY_HOTREMOVE
179 bool "Allow for memory hot remove"
180 select MEMORY_ISOLATION
181 select HAVE_BOOTMEM_INFO_NODE if (X86_64 || PPC64)
182 depends on MEMORY_HOTPLUG && ARCH_ENABLE_MEMORY_HOTREMOVE
185 # Heavily threaded applications may benefit from splitting the mm-wide
186 # page_table_lock, so that faults on different parts of the user address
187 # space can be handled with less contention: split it at this NR_CPUS.
188 # Default to 4 for wider testing, though 8 might be more appropriate.
189 # ARM's adjust_pte (unused if VIPT) depends on mm-wide page_table_lock.
190 # PA-RISC 7xxx's spinlock_t would enlarge struct page from 32 to 44 bytes.
191 # DEBUG_SPINLOCK and DEBUG_LOCK_ALLOC spinlock_t also enlarge struct page.
193 config SPLIT_PTLOCK_CPUS
195 default "999999" if !MMU
196 default "999999" if ARM && !CPU_CACHE_VIPT
197 default "999999" if PARISC && !PA20
200 config ARCH_ENABLE_SPLIT_PMD_PTLOCK
204 # support for memory balloon
205 config MEMORY_BALLOON
209 # support for memory balloon compaction
210 config BALLOON_COMPACTION
211 bool "Allow for balloon memory compaction/migration"
213 depends on COMPACTION && MEMORY_BALLOON
215 Memory fragmentation introduced by ballooning might significantly
216 reduce the number of 2MB contiguous memory blocks that can be
217 used within a guest, thus imposing performance penalties associated
218 with the reduced number of transparent huge pages that could be used
219 by the guest workload. Allowing compaction and migration of memory
220 pages enlisted as being part of memory balloon devices avoids this
221 scenario and helps improve memory defragmentation.
224 # support for memory compaction
226 bool "Allow for memory compaction"
231 Compaction is the only memory management component to form
232 high order (larger physically contiguous) memory blocks
233 reliably. The page allocator relies on compaction heavily and
234 the lack of the feature can lead to unexpected OOM killer
235 invocations for high order memory requests. You shouldn't
236 disable this option unless there really is a strong reason for
237 it and then we would be really interested to hear about that at
241 # support for page migration
244 bool "Page migration"
246 depends on (NUMA || ARCH_ENABLE_MEMORY_HOTREMOVE || COMPACTION || CMA) && MMU
248 Allows the migration of the physical location of pages of processes
249 while the virtual addresses are not changed. This is useful in
250 two situations. The first is on NUMA systems to put pages nearer
251 to the processors accessing them. The second is when allocating huge
252 pages as migration can relocate pages to satisfy a huge page
253 allocation instead of reclaiming.
255 config ARCH_ENABLE_HUGEPAGE_MIGRATION
258 config ARCH_ENABLE_THP_MIGRATION
262 def_bool (MEMORY_ISOLATION && COMPACTION) || CMA
264 config PHYS_ADDR_T_64BIT
268 bool "Enable bounce buffers"
270 depends on BLOCK && MMU && (ZONE_DMA || HIGHMEM)
272 Enable bounce buffers for devices that cannot access
273 the full range of memory available to the CPU. Enabled
274 by default when ZONE_DMA or HIGHMEM is selected, but you
275 may say n to override this.
285 An architecture should select this if it implements the
286 deprecated interface virt_to_bus(). All new architectures
287 should probably not select this.
295 bool "Enable KSM for page merging"
299 Enable Kernel Samepage Merging: KSM periodically scans those areas
300 of an application's address space that an app has advised may be
301 mergeable. When it finds pages of identical content, it replaces
302 the many instances by a single page with that content, so
303 saving memory until one or another app needs to modify the content.
304 Recommended for use with KVM, or with other duplicative applications.
305 See Documentation/vm/ksm.rst for more information: KSM is inactive
306 until a program has madvised that an area is MADV_MERGEABLE, and
307 root has set /sys/kernel/mm/ksm/run to 1 (if CONFIG_SYSFS is set).
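# Illustration only: a minimal sketch of the two activation steps described
# in the help text above, assuming CONFIG_SYSFS is set and the shell command
# is run as root. The names addr and length are placeholders for a
# page-aligned region owned by the application.
#   echo 1 > /sys/kernel/mm/ksm/run            (shell: start the KSM scanner)
#   madvise(addr, length, MADV_MERGEABLE);     (C: mark the region mergeable)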
309 config DEFAULT_MMAP_MIN_ADDR
310 int "Low address space to protect from user allocation"
314 This is the portion of low virtual memory which should be protected
315 from userspace allocation. Keeping a user from writing to low pages
316 can help reduce the impact of kernel NULL pointer bugs.
318 For most ia64, ppc64 and x86 users with lots of address space
319 a value of 65536 is reasonable and should cause no problems.
320 On arm and other archs it should not be higher than 32768.
321 Programs which use vm86 functionality or have some need to map
322 this low address space will need CAP_SYS_RAWIO, or the protection
323 must be disabled by setting the value to 0.
325 This value can be changed after boot using the
326 /proc/sys/vm/mmap_min_addr tunable.
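# Illustration only: a sketch of reading and changing the tunable named in
# the help text above after boot, assuming a root shell. The value 65536 is
# the one the help text suggests for ia64, ppc64 and x86.
#   cat /proc/sys/vm/mmap_min_addr
#   echo 65536 > /proc/sys/vm/mmap_min_addr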
328 config ARCH_SUPPORTS_MEMORY_FAILURE
331 config MEMORY_FAILURE
333 depends on ARCH_SUPPORTS_MEMORY_FAILURE
334 bool "Enable recovery from hardware memory errors"
335 select MEMORY_ISOLATION
338 Enables code to recover from some memory failures on systems
339 with MCA recovery. This allows a system to continue running
340 even when some of its memory has uncorrected errors. This requires
341 special hardware support and typically ECC memory.
343 config HWPOISON_INJECT
344 tristate "HWPoison pages injector"
345 depends on MEMORY_FAILURE && DEBUG_KERNEL && PROC_FS
346 select PROC_PAGE_MONITOR
348 config NOMMU_INITIAL_TRIM_EXCESS
349 int "Turn on mmap() excess space trimming before booting"
353 The NOMMU mmap() frequently needs to allocate large contiguous chunks
354 of memory on which to store mappings, but it can only ask the system
355 allocator for chunks in 2^N*PAGE_SIZE amounts - which is frequently
356 more than it requires. To deal with this, mmap() is able to trim off
357 the excess and return it to the allocator.
359 If trimming is enabled, the excess is trimmed off and returned to the
360 system allocator, which can cause extra fragmentation, particularly
361 if there are a lot of transient processes.
363 If trimming is disabled, the excess is kept, but not used, which for
364 long-term mappings means that the space is wasted.
366 Trimming can be dynamically controlled through a sysctl option
367 (/proc/sys/vm/nr_trim_pages) which specifies the minimum number of
368 excess pages there must be before trimming should occur, or zero if
369 no trimming is to occur.
371 This option specifies the initial value of that sysctl. The default
372 of 1 says that all excess pages should be trimmed.
374 See Documentation/nommu-mmap.txt for more information.
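# Illustration only: a sketch of the runtime control described in the help
# text above, assuming a root shell on a NOMMU kernel. The values follow the
# help text: zero disables trimming, and 1 trims all excess pages.
#   echo 0 > /proc/sys/vm/nr_trim_pages
#   echo 1 > /proc/sys/vm/nr_trim_pages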
376 config TRANSPARENT_HUGEPAGE
377 bool "Transparent Hugepage Support"
378 depends on HAVE_ARCH_TRANSPARENT_HUGEPAGE
382 Transparent Hugepages allows the kernel to use huge pages and
383 huge tlb transparently to the applications whenever possible.
384 This feature can improve computing performance for certain
385 applications by speeding up page faults during memory
386 allocation, by reducing the number of tlb misses and by speeding
387 up the pagetable walking.
389 On memory-constrained embedded systems, you may want to say N.
392 prompt "Transparent Hugepage Support sysfs defaults"
393 depends on TRANSPARENT_HUGEPAGE
394 default TRANSPARENT_HUGEPAGE_ALWAYS
396 Selects the sysfs defaults for Transparent Hugepage Support.
398 config TRANSPARENT_HUGEPAGE_ALWAYS
401 Enabling Transparent Hugepage "always" can increase the
402 memory footprint of applications without a guaranteed
403 benefit, but it will work automatically for all applications.
405 config TRANSPARENT_HUGEPAGE_MADVISE
408 Enabling Transparent Hugepage "madvise" will only provide a
409 performance benefit to the applications using
410 madvise(MADV_HUGEPAGE), but it won't risk increasing the
411 memory footprint of applications without a guaranteed
415 config ARCH_WANTS_THP_SWAP
420 depends on TRANSPARENT_HUGEPAGE && ARCH_WANTS_THP_SWAP && SWAP
422 Swap transparent huge pages in one piece, without splitting.
423 XXX: For now, swap cluster backing transparent huge page
424 will be split after swapout.
426 For selection by architectures with reasonable THP sizes.
428 config TRANSPARENT_HUGE_PAGECACHE
430 depends on TRANSPARENT_HUGEPAGE
433 # UP and nommu archs use km based percpu allocator
435 config NEED_PER_CPU_KM
441 bool "Enable cleancache driver to cache clean pages if tmem is present"
444 Cleancache can be thought of as a page-granularity victim cache
445 for clean pages that the kernel's pageframe replacement algorithm
446 (PFRA) would like to keep around, but can't since there isn't enough
447 memory. So when the PFRA "evicts" a page, it first attempts to use
448 cleancache code to put the data contained in that page into
449 "transcendent memory", memory that is not directly accessible or
450 addressable by the kernel and is of unknown and possibly
451 time-varying size. And when a cleancache-enabled
452 filesystem wishes to access a page in a file on disk, it first
453 checks cleancache to see if it already contains it; if it does,
454 the page is copied into the kernel and a disk access is avoided.
455 When a transcendent memory driver is available (such as zcache or
456 Xen transcendent memory), a significant I/O reduction
457 may be achieved. When none is available, all cleancache calls
458 are reduced to a single pointer-compare-against-NULL resulting
459 in a negligible performance hit.
461 If unsure, say Y to enable cleancache.
464 bool "Enable frontswap to cache swap pages if tmem is present"
468 Frontswap is so named because it can be thought of as the opposite
469 of a "backing" store for a swap device. The data is stored into
470 "transcendent memory", memory that is not directly accessible or
471 addressable by the kernel and is of unknown and possibly
472 time-varying size. When space in transcendent memory is available,
473 a significant swap I/O reduction may be achieved. When none is
474 available, all frontswap calls are reduced to a single pointer-
475 compare-against-NULL resulting in a negligible performance hit
476 and swap data is stored as normal on the matching swap device.
478 If unsure, say Y to enable frontswap.
481 bool "Contiguous Memory Allocator"
484 select MEMORY_ISOLATION
486 This enables the Contiguous Memory Allocator which allows other
487 subsystems to allocate big physically-contiguous blocks of memory.
488 CMA reserves a region of memory and allows only movable pages to
489 be allocated from it. This way, the kernel can use the memory for
490 pagecache and when a subsystem requests a contiguous area, the
491 allocated pages are migrated away to serve the contiguous request.
496 bool "CMA debug messages (DEVELOPMENT)"
497 depends on DEBUG_KERNEL && CMA
499 Turns on debug messages in CMA. This produces KERN_DEBUG
500 messages for every CMA call as well as various messages while
501 processing calls such as dma_alloc_from_contiguous().
502 This option does not affect warning and error messages.
505 bool "CMA debugfs interface"
506 depends on CMA && DEBUG_FS
508 Turns on the DebugFS interface for CMA.
511 int "Maximum count of the CMA areas"
515 CMA allows creating CMA areas for a particular purpose, mainly
516 for use as device-private areas. This parameter sets the maximum
517 number of CMA areas in the system.
519 If unsure, leave the default value "7".
521 config MEM_SOFT_DIRTY
522 bool "Track memory changes"
523 depends on CHECKPOINT_RESTORE && HAVE_ARCH_SOFT_DIRTY && PROC_FS
524 select PROC_PAGE_MONITOR
526 This option enables tracking of memory changes by introducing a
527 soft-dirty bit on PTEs. This bit is set when someone writes
528 into a page, just like the regular dirty bit, but unlike the latter
529 it can be cleared by hand.
531 See Documentation/admin-guide/mm/soft-dirty.rst for more details.
534 bool "Compressed cache for swap pages (EXPERIMENTAL)"
535 depends on FRONTSWAP && CRYPTO=y
540 A lightweight compressed cache for swap pages. It takes
541 pages that are in the process of being swapped out and attempts to
542 compress them into a dynamically allocated RAM-based memory pool.
543 This can result in a significant I/O reduction on the swap device and,
544 in the case where decompressing from RAM is faster than swap device
545 reads, can also improve workload performance.
547 This is marked experimental because it is a new feature (as of
548 v3.11) that interacts heavily with memory reclaim. While these
549 interactions don't cause any known issues on simple memory setups,
550 they have not been fully explored on the large set of potential
551 configurations and workloads that exist.
554 tristate "Common API for compressed memory storage"
557 Compressed memory storage API. This allows using either zbud or
561 tristate "Low (Up to 2x) density storage for compressed pages"
564 A special purpose allocator for storing compressed pages.
565 It is designed to store up to two compressed pages per physical
566 page. While this design limits storage density, it has simple and
567 deterministic reclaim properties that make it preferable to a higher
568 density approach when reclaim will be used.
571 tristate "Up to 3x density storage for compressed pages"
575 A special purpose allocator for storing compressed pages.
576 It is designed to store up to three compressed pages per physical
577 page. It is a ZBUD derivative so the simplicity and determinism are
581 tristate "Memory allocator for compressed pages"
585 zsmalloc is a slab-based memory allocator designed to store
586 compressed RAM pages. zsmalloc uses virtual memory mapping
587 in order to reduce fragmentation. However, this results in a
588 non-standard allocator interface where a handle, not a pointer, is
589 returned by an alloc(). This handle must be mapped in order to
590 access the allocated space.
592 config PGTABLE_MAPPING
593 bool "Use page table mapping to access object in zsmalloc"
596 By default, zsmalloc uses a copy-based object mapping method to
597 access allocations that span two pages. However, if a particular
598 architecture (e.g., ARM) performs VM mapping faster than copying,
599 then you should select this. This causes zsmalloc to use page table
600 mapping rather than copying for object mapping.
602 You can check speed with zsmalloc benchmark:
603 https://github.com/spartacus06/zsmapbench
606 bool "Export zsmalloc statistics"
610 This option enables code in zsmalloc to collect various
611 statistics about what's happening in zsmalloc and exports that
612 information to userspace via debugfs.
615 config GENERIC_EARLY_IOREMAP
618 config MAX_STACK_SIZE_MB
619 int "Maximum user stack size for 32-bit processes (MB)"
622 depends on STACK_GROWSUP && (!64BIT || COMPAT)
624 This is the maximum stack size in Megabytes in the VM layout of 32-bit
625 user processes when the stack grows upwards (currently only on parisc
626 arch). The stack will be located at the highest memory address minus
627 the given value, unless the RLIMIT_STACK hard limit is changed to a
628 smaller value in which case that is used.
630 A sane initial value is 80 MB.
632 config DEFERRED_STRUCT_PAGE_INIT
633 bool "Defer initialisation of struct pages to kthreads"
636 depends on !NEED_PER_CPU_KM
639 Ordinarily all struct pages are initialised during early boot in a
640 single thread. On very large machines this can take a considerable
641 amount of time. If this option is set, large machines will bring up
642 a subset of memmap at boot and then initialise the rest in parallel
643 by starting a one-off "pgdatinitX" kernel thread for each node X. This
644 has a potential performance impact on processes running early in the
645 lifetime of the system until these kthreads finish the
648 config IDLE_PAGE_TRACKING
649 bool "Enable idle page tracking"
650 depends on SYSFS && MMU
651 select PAGE_EXTENSION if !64BIT
653 This feature allows estimating the number of user pages that have
654 not been touched during a given period of time. This information can
655 be useful to tune memory cgroup limits and/or for job placement
656 within a compute cluster.
658 See Documentation/admin-guide/mm/idle_page_tracking.rst for
661 # arch_add_memory() comprehends device memory
662 config ARCH_HAS_ZONE_DEVICE
666 bool "Device memory (pmem, HMM, etc...) hotplug support"
667 depends on MEMORY_HOTPLUG
668 depends on MEMORY_HOTREMOVE
669 depends on SPARSEMEM_VMEMMAP
670 depends on ARCH_HAS_ZONE_DEVICE
674 Device memory hotplug support allows for establishing pmem,
675 or other device driver discovered memory regions, in the
676 memmap. This allows pfn_to_page() lookups of otherwise
677 "device-physical" addresses which is needed for using a DAX
678 mapping in an O_DIRECT operation, among other things.
680 If FS_DAX is enabled, then say Y.
685 depends on (X86_64 || PPC64)
686 depends on ZONE_DEVICE
687 depends on MMU && 64BIT
688 depends on MEMORY_HOTPLUG
689 depends on MEMORY_HOTREMOVE
690 depends on SPARSEMEM_VMEMMAP
692 config MIGRATE_VMA_HELPER
695 config DEV_PAGEMAP_OPS
700 select MIGRATE_VMA_HELPER
703 bool "HMM mirror CPU page table into a device page table"
704 depends on ARCH_HAS_HMM
708 Select HMM_MIRROR if you want to mirror a range of the CPU page table of a
709 process into a device page table. Here, mirror means "keep synchronized".
710 Prerequisites: the device must provide the ability to write-protect its
711 page tables (at PAGE_SIZE granularity), and must be able to recover from
712 the resulting potential page faults.
714 config DEVICE_PRIVATE
715 bool "Unaddressable device memory (GPU memory, ...)"
716 depends on ARCH_HAS_HMM
718 select DEV_PAGEMAP_OPS
721 Allows creation of struct pages to represent unaddressable device
722 memory; i.e., memory that is only accessible from the device (or
723 group of devices). You likely also want to select HMM_MIRROR.
726 bool "Addressable device memory (like GPU memory)"
727 depends on ARCH_HAS_HMM
729 select DEV_PAGEMAP_OPS
732 Allows creation of struct pages to represent addressable device
733 memory; i.e., memory that is accessible from both the device and
739 config ARCH_USES_HIGH_VMA_FLAGS
741 config ARCH_HAS_PKEYS
745 bool "Collect percpu memory statistics"
748 This feature collects and exposes statistics via debugfs. The
749 information includes global and per chunk statistics, which can
750 be used to help understand percpu memory usage.
753 bool "Enable infrastructure for get_user_pages_fast() benchmarking"
756 Provides /sys/kernel/debug/gup_benchmark that helps with testing
757 performance of get_user_pages_fast().
759 See tools/testing/selftests/vm/gup_benchmark.c
761 config ARCH_HAS_PTE_SPECIAL