Linux Process Memory Layout and Mapping Information

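When a processor runs a program, it needs memory to hold the program and the data the program uses. Modern operating systems provide an abstraction over memory, virtual memory, so that applications do not have to think much about how physical memory is used, which simplifies memory management.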

Virtual memory

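Virtual memory is an elegant interaction of hardware exceptions, hardware address translation, main memory, disk files, and kernel software that provides each process with a large, uniform, and private address space.

– Computer Systems: A Programmer's Perspective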

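In complex scenarios with many applications, letting processes use physical memory directly causes several problems:

  • Data size: the memory an application can use is limited by the size of physical memory;
  • Complexity: the application has to know exactly which physical addresses it uses, when to allocate and free them, and even whether to compact memory once it fragments;
  • Safety: if one process accidentally overwrites memory that another process is using, that process may crash for no apparent reason.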

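Virtual memory solves these problems well:

  • Efficient use of memory: main memory is treated as a cache for an address space stored on disk, and only the active areas are kept in main memory
  • Simpler memory management: virtual memory gives every process a uniform address space
  • Protection of the process address space: it cannot be corrupted by other processes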

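Memory layout and mapping information

Virtual memory is a management layer; in the end physical memory is still what gets used. Linux provides several interfaces for inspecting a process's (virtual) memory layout and the mapping between virtual and physical memory.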

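Related proc files

The layout and mapping information can be viewed through the following proc files:

  • /proc/iomem: physical memory layout
  • /proc/pid/maps: virtual memory layout of a process
  • /proc/pid/mem: contents of a process's memory
  • /proc/pid/pagemap: mapping from virtual pages to physical pages
  • /proc/kpagecount: number of times each page is mapped
  • /proc/kpageflags: state flags of each page
  • /proc/kpagecgroup: inode of the cgroup each page belongs to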

Physical memory layout

  • /proc/iomem

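This file shows how the physical address space is mapped to system RAM and to each physical device. The virtual machine used here is configured with 512 MB of physical memory.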

00000000-00000fff : Reserved
00001000-0009ebff : System RAM
0009ec00-0009ffff : Reserved
000a0000-000bffff : PCI Bus 0000:00
000c0000-000c7fff : Video ROM
000ca000-000cafff : Adapter ROM
000cc000-000cffff : PCI Bus 0000:00
000d0000-000d3fff : PCI Bus 0000:00
000d4000-000d7fff : PCI Bus 0000:00
000d8000-000dbfff : PCI Bus 0000:00
000dc000-000fffff : Reserved
  000f0000-000fffff : System ROM
00100000-1fedffff : System RAM
  0ba00000-0c600de0 : Kernel code
  0c800000-0cc16fff : Kernel rodata
  0ce00000-0d00a8ff : Kernel data
  0d5dc000-0dbfffff : Kernel bss
1fee0000-1fefefff : ACPI Tables
1feff000-1fefffff : ACPI Non-volatile Storage
1ff00000-1fffffff : System RAM
...
  • First column: address range
  • Second column: what the range is used for

Several ranges are marked System RAM:

...
00001000-0009ebff : System RAM
...
00100000-1fedffff : System RAM
  0ba00000-0c600de0 : Kernel code
  0c800000-0cc16fff : Kernel rodata
  0ce00000-0d00a8ff : Kernel data
  0d5dc000-0dbfffff : Kernel bss
...
1ff00000-1fffffff : System RAM

Converted to decimal:

#!/bin/bash
# Print every /proc/iomem range in decimal bytes and in MB.
while IFS='-: ' read b e used_by; do
    set -- $(printf "%d %d" 0x$b 0x$e)
    echo "$1 to $2 bytes ($(($1/1024/1024)) to $(($2/1024/1024)) Mb) - $used_by"
done < /proc/iomem
...
4096 to 650239 bytes (0 to 0 Mb) - System RAM
...
1048576 to 535691263 bytes (1 to 510 Mb) - System RAM
195035136 to 207621600 bytes (186 to 198 Mb) - Kernel code
209715200 to 214003711 bytes (200 to 204 Mb) - Kernel rodata
216006656 to 218147071 bytes (206 to 208 Mb) - Kernel data
224247808 to 230686719 bytes (213 to 219 Mb) - Kernel bss
...
535822336 to 536870911 bytes (511 to 511 Mb) - System RAM
...
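
As a cross-check of the 512 MB figure, the System RAM ranges can be added up. The following is only an illustrative Python sketch, not part of the original tooling: the helper name system_ram_bytes is made up here, and the sample text is copied from the listing above. On a live system the function could be given the contents of /proc/iomem, which usually has to be read as root to show real addresses.

```python
def system_ram_bytes(iomem_text):
    """Sum the sizes of all top-level 'System RAM' ranges in /proc/iomem text."""
    total = 0
    for line in iomem_text.splitlines():
        # Nested entries (kernel code, data, ...) are indented; skip them.
        if line.startswith(" ") or " : " not in line:
            continue
        addr_range, name = line.split(" : ", 1)
        if name.strip() != "System RAM":
            continue
        start, end = (int(x, 16) for x in addr_range.split("-"))
        total += end - start + 1  # both ends of a range are inclusive
    return total


# Ranges copied from the /proc/iomem output shown earlier.
sample = """\
00001000-0009ebff : System RAM
00100000-1fedffff : System RAM
  0ba00000-0c600de0 : Kernel code
1ff00000-1fffffff : System RAM
"""

total = system_ram_bytes(sample)
print(total, "bytes =", round(total / 1024 / 1024, 2), "MB")
```

The total comes out slightly below 512 MB because the reserved, ROM and ACPI ranges are not counted as System RAM.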

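Process virtual memory layout

Memory layout of a Linux process
  • /proc/pid/maps

Seen from user space, the process memory layout looks as follows; the kernel part of the layout is not included.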

0
Program Text (.text)
Initialised Data (.data)
Uninitialised Data (.bss)
Heap
    |
    v
Memory Mapped Region for Shared Libraries or Anything Else
    ^
    |
User Stack
  • Example program memory_layout.c
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <unistd.h>
#include <string.h>
#include <sys/types.h>

void *thread_func(void *arg){
    char *addr;
    printf("Before malloc in thread 1\n");
    /* print the address of the local pointer variable, which lives on this thread's stack */
    printf("addr = %llx \n", (unsigned long long)&addr);
    getchar();
    addr = (char*)malloc(1000*sizeof(char));
    strcpy(addr, "thread");   /* strcpy also copies the terminating '\0' */
    printf("addr = %llx \n", (unsigned long long)&addr);
    printf("After malloc and before free in thread 1\n");
    printf("addr = %llx \n", (unsigned long long)&addr);
    getchar();
    printf("addr [%s] \n", addr);
    free(addr);
    printf("After free in thread 1\n");
    getchar();
    return NULL;
}

int main(){
    char *addr;
    int pthread_status;
    printf("Welcome to per thread arena example::%d\n",getpid());
    printf("Before malloc in main thread\n");

    printf("addr = %llx \n", (unsigned long long)&addr);

    getchar();

    addr = (char *)malloc(1000);
    strcpy(addr, "garlic");   /* strcpy also copies the terminating '\0' */
    printf("addr = %llx  value = %s\n", (unsigned long long)&addr, addr);
    printf("After malloc and before free in main thread \n");
    getchar();

    printf("addr [%s] \n", addr);
    free(addr);
    printf("After free in main thread\n");
    getchar();

    pthread_t thread_1;

    pthread_status = pthread_create(&thread_1, NULL, thread_func, NULL);
    if (pthread_status != 0) {
       printf("Thread creation error\n");
       return -1;
    }

    void * thread_1_status;

    pthread_status = pthread_join(thread_1, &thread_1_status);
    if (pthread_status != 0) {
        printf("thread join error\n");
        return -1;
    }
    return 0;
}

The program calls malloc twice: once before the thread is created and once while the thread is running.

  • Compile & run
# gcc memory_layout.c -o memory_layout -lpthread

# ./memory_layout
Welcome to per thread arena example::118902
Before malloc in main thread
  • View the process memory layout

[root@centosgpt vm]# cat /proc/118902/maps
00400000-00401000 r-xp 00000000 fd:00 13592455                           /root/src/memory/vm/memory_layout
00600000-00601000 r--p 00000000 fd:00 13592455                           /root/src/memory/vm/memory_layout
00601000-00602000 rw-p 00001000 fd:00 13592455                           /root/src/memory/vm/memory_layout
7f05691b4000-7f0569376000 r-xp 00000000 fd:00 33593981                   /usr/lib64/libc-2.17.so
7f0569376000-7f0569576000 ---p 001c2000 fd:00 33593981                   /usr/lib64/libc-2.17.so
7f0569576000-7f056957a000 r--p 001c2000 fd:00 33593981                   /usr/lib64/libc-2.17.so
7f056957a000-7f056957c000 rw-p 001c6000 fd:00 33593981                   /usr/lib64/libc-2.17.so
7f056957c000-7f0569581000 rw-p 00000000 00:00 0
7f0569581000-7f0569598000 r-xp 00000000 fd:00 33594199                   /usr/lib64/libpthread-2.17.so
7f0569598000-7f0569797000 ---p 00017000 fd:00 33594199                   /usr/lib64/libpthread-2.17.so
7f0569797000-7f0569798000 r--p 00016000 fd:00 33594199                   /usr/lib64/libpthread-2.17.so
7f0569798000-7f0569799000 rw-p 00017000 fd:00 33594199                   /usr/lib64/libpthread-2.17.so
7f0569799000-7f056979d000 rw-p 00000000 00:00 0
7f056979d000-7f05697bf000 r-xp 00000000 fd:00 33593974                   /usr/lib64/ld-2.17.so
7f05699ae000-7f05699b1000 rw-p 00000000 00:00 0
7f05699bb000-7f05699be000 rw-p 00000000 00:00 0
7f05699be000-7f05699bf000 r--p 00021000 fd:00 33593974                   /usr/lib64/ld-2.17.so
7f05699bf000-7f05699c0000 rw-p 00022000 fd:00 33593974                   /usr/lib64/ld-2.17.so
7f05699c0000-7f05699c1000 rw-p 00000000 00:00 0
7ffe05a82000-7ffe05aa3000 rw-p 00000000 00:00 0                          [stack]
7ffe05bdc000-7ffe05bdf000 r--p 00000000 00:00 0                          [vvar]
7ffe05bdf000-7ffe05be0000 r-xp 00000000 00:00 0                          [vdso]
ffffffffff600000-ffffffffff601000 --xp 00000000 00:00 0                  [vsyscall]

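address – start and end address of the region in the process address space
permissions – access permissions: read, write, execute, shared and private.

 r = read
 w = write
 x = execute
 s = shared
 p = private (copy on write)

offset – if the region is mapped from a file (with mmap), the offset in the file at which the mapping starts; 0 if the memory is not mapped from a file.
device – if the region is mapped from a file, the device number (in hexadecimal) of the device holding the file.
inode – if the region is mapped from a file, the inode number of the file.
pathname – if the region is mapped from a file, the name of the file. The field is empty for anonymous mappings. Some special regions have names such as [heap], [stack] or [vdso]. [vdso] is the virtual dynamic shared object, which lets certain system calls (for example gettimeofday) be served in user space without switching into kernel mode.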
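
The field descriptions above can be turned into a small parser. The following Python sketch is an illustration only; MapEntry and parse_maps_line are names invented for this example, and the sample line is one of the mappings of memory_layout shown earlier.

```python
from collections import namedtuple

MapEntry = namedtuple("MapEntry", "start end perms offset dev inode pathname")


def parse_maps_line(line):
    """Split one /proc/<pid>/maps line into its six fields."""
    # The pathname is optional and may contain spaces, so split at most 5 times.
    parts = line.split(None, 5)
    addr, perms, offset, dev, inode = parts[:5]
    pathname = parts[5].strip() if len(parts) > 5 else ""
    start, end = (int(x, 16) for x in addr.split("-"))
    return MapEntry(start, end, perms, int(offset, 16), dev, int(inode), pathname)


entry = parse_maps_line(
    "00601000-00602000 rw-p 00001000 fd:00 13592455 /root/src/memory/vm/memory_layout"
)
print(hex(entry.start), hex(entry.end), entry.perms, entry.pathname)
print("size in bytes:", entry.end - entry.start)
```

To inspect a running process, each line of /proc/<pid>/maps could be passed to parse_maps_line in a loop.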

Overall the output falls into three parts: 1. the program itself; 2. shared libraries, heap and stack; 3. the vdso part.

  • Program mappings
00400000-00401000 r-xp 00000000 fd:00 13592455                           /root/src/memory/vm/memory_layout
00600000-00601000 r--p 00000000 fd:00 13592455                           /root/src/memory/vm/memory_layout
00601000-00602000 rw-p 00001000 fd:00 13592455                           /root/src/memory/vm/memory_layout

Each range is 4 KB, which corresponds to one page (the page size on my machine is 4 KB).

[root@centosgpt vm]# bc <<< 'obase=10; ibase=16; 401000 - 400000'
4096
...
[root@centosgpt vm]# getconf PAGESIZE
4096

The program file is 8864 bytes, about 8.66 KB; the three pages mapped above are enough to hold it.

[root@centosgpt vm]# stat memory_layout | grep Size
  Size: 8864            Blocks: 24         IO Block: 4096   regular file

Program loading information


[root@centosgpt vm]# readelf -l -W ./memory_layout

Elf file type is EXEC (Executable file)
Entry point 0x400670
There are 9 program headers, starting at offset 64

Program Headers:
  Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
  PHDR           0x000040 0x0000000000400040 0x0000000000400040 0x0001f8 0x0001f8 R E 0x8
  INTERP         0x000238 0x0000000000400238 0x0000000000400238 0x00001c 0x00001c R   0x1
      [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
  LOAD           0x000000 0x0000000000400000 0x0000000000400000 0x000b7c 0x000b7c R E 0x200000
  LOAD           0x000e00 0x0000000000600e00 0x0000000000600e00 0x000264 0x000268 RW  0x200000
  DYNAMIC        0x000e18 0x0000000000600e18 0x0000000000600e18 0x0001e0 0x0001e0 RW  0x8
  NOTE           0x000254 0x0000000000400254 0x0000000000400254 0x000044 0x000044 R   0x4
  GNU_EH_FRAME   0x000a28 0x0000000000400a28 0x0000000000400a28 0x00003c 0x00003c R   0x4
  GNU_STACK      0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW  0x10
  GNU_RELRO      0x000e00 0x0000000000600e00 0x0000000000600e00 0x000200 0x000200 R   0x1

 Section to Segment mapping:
  Segment Sections...
   00
   01     .interp
   02     .interp .note.ABI-tag .note.gnu.build-id .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rela.dyn .rela.plt .init .plt .plt.got .text .fini .rodata .eh_frame_hdr .eh_frame
   03     .init_array .fini_array .jcr .dynamic .got .got.plt .data .bss
   04     .dynamic
   05     .note.ABI-tag .note.gnu.build-id
   06     .eh_frame_hdr
   07
   08     .init_array .fini_array .jcr .dynamic .got

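Sections are the linker's view of the file, while segments are the loader's view. The lowest LOAD segment starts at virtual address 0x400000, yet the entry point is 0x400670, which is the start of .text, the section that also contains main().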

[root@centosgpt vm]# objdump --disassemble-all --start-address=0x000000 --stop-address=0x401000 ./memory_layout | less  +/400670
0000000000400670 <_start>:
  400670:       31 ed                   xor    %ebp,%ebp
  400672:       49 89 d1                mov    %rdx,%r9
  400675:       5e                      pop    %rsi
  400676:       48 89 e2                mov    %rsp,%rdx
  400679:       48 83 e4 f0             and    $0xfffffffffffffff0,%rsp
  40067d:       50                      push   %rax
  40067e:       54                      push   %rsp
  40067f:       49 c7 c0 f0 08 40 00    mov    $0x4008f0,%r8
  400686:       48 c7 c1 80 08 40 00    mov    $0x400880,%rcx
  40068d:       48 c7 c7 b2 07 40 00    mov    $0x4007b2,%rdi
  400694:       e8 87 ff ff ff          callq  400620 <__libc_start_main@plt>
  400699:       f4                      hlt
  40069a:       66 0f 1f 44 00 00       nopw   0x0(%rax,%rax,1)

So what is stored between 0x400000 and 0x400670? The ELF file structure gives the answer:

Elf header
Program header table
Segment1
Segment2
...
Section header table

The linker puts sections with the same attributes into one segment. In this example, the range from 0x400000 to 0x400670 holds:

Elf header + Program header table + Segment1

Let's work it out, starting with the length of the Elf header plus the Program header table:

[root@centosgpt vm]# readelf -h ./memory_layout
ELF Header:
  Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
  Class:                             ELF64
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              EXEC (Executable file)
  Machine:                           Advanced Micro Devices X86-64
  Version:                           0x1
  Entry point address:               0x400670
  Start of program headers:          64 (bytes into file)
  Start of section headers:          6880 (bytes into file)
  Flags:                             0x0
  Size of this header:               64 (bytes)
  Size of program headers:           56 (bytes)
  Number of program headers:         9
  Size of section headers:           64 (bytes)
  Number of section headers:         31
  Section header string table index: 30

End offset of Elf header + Program header table = 64 + 56*9 = 568 = 0x238
The program's section information lines up with the virtual address range above: the first section, .interp, starts at 0x400238:

[root@centosgpt vm]# objdump -h ./memory_layout

./memory_layout:     file format elf64-x86-64

Sections:
Idx Name          Size      VMA               LMA               File off  Algn
  0 .interp       0000001c  0000000000400238  0000000000400238  00000238  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  1 .note.ABI-tag 00000020  0000000000400254  0000000000400254  00000254  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  2 .note.gnu.build-id 00000024  0000000000400274  0000000000400274  00000274  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  3 .gnu.hash     0000001c  0000000000400298  0000000000400298  00000298  2**3
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  4 .dynsym       00000108  00000000004002b8  00000000004002b8  000002b8  2**3
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  5 .dynstr       0000008b  00000000004003c0  00000000004003c0  000003c0  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  6 .gnu.version  00000016  000000000040044c  000000000040044c  0000044c  2**1
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  7 .gnu.version_r 00000040  0000000000400468  0000000000400468  00000468  2**3
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  8 .rela.dyn     00000018  00000000004004a8  00000000004004a8  000004a8  2**3
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  9 .rela.plt     000000d8  00000000004004c0  00000000004004c0  000004c0  2**3
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
 10 .init         0000001a  0000000000400598  0000000000400598  00000598  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
 11 .plt          000000a0  00000000004005c0  00000000004005c0  000005c0  2**4
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
 12 .plt.got      00000008  0000000000400660  0000000000400660  00000660  2**3
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
 13 .text         00000282  0000000000400670  0000000000400670  00000670  2**4
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
 14 .fini         00000009  00000000004008f4  00000000004008f4  000008f4  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
 15 .rodata       00000127  0000000000400900  0000000000400900  00000900  2**3
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
 16 .eh_frame_hdr 0000003c  0000000000400a28  0000000000400a28  00000a28  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
 17 .eh_frame     00000114  0000000000400a68  0000000000400a68  00000a68  2**3
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
 18 .init_array   00000008  0000000000600e00  0000000000600e00  00000e00  2**3
                  CONTENTS, ALLOC, LOAD, DATA
 19 .fini_array   00000008  0000000000600e08  0000000000600e08  00000e08  2**3
                  CONTENTS, ALLOC, LOAD, DATA
 20 .jcr          00000008  0000000000600e10  0000000000600e10  00000e10  2**3
                  CONTENTS, ALLOC, LOAD, DATA
 21 .dynamic      000001e0  0000000000600e18  0000000000600e18  00000e18  2**3
                  CONTENTS, ALLOC, LOAD, DATA
 22 .got          00000008  0000000000600ff8  0000000000600ff8  00000ff8  2**3
                  CONTENTS, ALLOC, LOAD, DATA
 23 .got.plt      00000060  0000000000601000  0000000000601000  00001000  2**3
                  CONTENTS, ALLOC, LOAD, DATA
 24 .data         00000004  0000000000601060  0000000000601060  00001060  2**0
                  CONTENTS, ALLOC, LOAD, DATA
 25 .bss          00000004  0000000000601064  0000000000601064  00001064  2**0
                  ALLOC
 26 .comment      0000002d  0000000000000000  0000000000000000  00001064  2**0
                  CONTENTS, READONLY

The range 0x400238 - 0x400670 holds the sections .interp .note.ABI-tag .note.gnu.build-id .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rela.dyn .rela.plt .init .plt .plt.got

 Section to Segment mapping:
  Segment Sections...
   00
   01     .interp
   02     .interp .note.ABI-tag .note.gnu.build-id .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rela.dyn .rela.plt .init .plt .plt.got .text .fini .rodata .eh_frame_hdr .eh_frame
   03     .init_array .fini_array .jcr .dynamic .got .got.plt .data .bss
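
The 0x238 figure can also be derived straight from the 64-byte ELF header. The Python sketch below is illustrative only: headers_end is a name invented here, and instead of opening the real binary it packs a synthetic header from the values that readelf -h printed above (type EXEC = 2 and machine x86-64 = 62 are the standard ELF constants).

```python
import struct

# ELF64 header layout: e_ident[16], e_type, e_machine, e_version, e_entry,
# e_phoff, e_shoff, e_flags, e_ehsize, e_phentsize, e_phnum, e_shentsize,
# e_shnum, e_shstrndx (little endian, 64 bytes in total).
ELF64_HEADER = struct.Struct("<16sHHIQQQIHHHHHH")


def headers_end(header_bytes):
    """Return (entry point, file offset just past the program header table)."""
    fields = ELF64_HEADER.unpack(header_bytes[:ELF64_HEADER.size])
    e_ident, e_entry, e_phoff = fields[0], fields[4], fields[5]
    e_phentsize, e_phnum = fields[9], fields[10]
    if e_ident[:4] != b"\x7fELF" or e_ident[4] != 2:
        raise ValueError("not a 64-bit ELF file")
    return e_entry, e_phoff + e_phentsize * e_phnum


# Synthetic header built from the readelf -h values of memory_layout.
ident = bytes.fromhex("7f454c46020101000000000000000000")
sample = ELF64_HEADER.pack(ident, 2, 62, 1, 0x400670, 64, 6880, 0,
                           64, 56, 9, 64, 31, 30)

entry, end = headers_end(sample)
print(hex(entry), hex(end))
```

For a real binary, the first 64 bytes of the file read in binary mode could be passed to headers_end instead of the synthetic header.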
  • Shared libraries, heap and stack
02166000-02187000 rw-p 00000000 00:00 0                                  [heap]
7f6f73907000-7f6f73ac9000 r-xp 00000000 fd:00 33593981                   /usr/lib64/libc-2.17.so
7f6f73ac9000-7f6f73cc9000 ---p 001c2000 fd:00 33593981                   /usr/lib64/libc-2.17.so
7f6f73cc9000-7f6f73ccd000 r--p 001c2000 fd:00 33593981                   /usr/lib64/libc-2.17.so
7f6f73ccd000-7f6f73ccf000 rw-p 001c6000 fd:00 33593981                   /usr/lib64/libc-2.17.so
7f6f73ccf000-7f6f73cd4000 rw-p 00000000 00:00 0
7f6f73cd4000-7f6f73ceb000 r-xp 00000000 fd:00 33594199                   /usr/lib64/libpthread-2.17.so
7f6f73ceb000-7f6f73eea000 ---p 00017000 fd:00 33594199                   /usr/lib64/libpthread-2.17.so
7f6f73eea000-7f6f73eeb000 r--p 00016000 fd:00 33594199                   /usr/lib64/libpthread-2.17.so
7f6f73eeb000-7f6f73eec000 rw-p 00017000 fd:00 33594199                   /usr/lib64/libpthread-2.17.so
7f6f73eec000-7f6f73ef0000 rw-p 00000000 00:00 0
7f6f73ef0000-7f6f73f12000 r-xp 00000000 fd:00 33593974                   /usr/lib64/ld-2.17.so
7f6f74101000-7f6f74104000 rw-p 00000000 00:00 0
7f6f7410e000-7f6f74111000 rw-p 00000000 00:00 0
7f6f74111000-7f6f74112000 r--p 00021000 fd:00 33593974                   /usr/lib64/ld-2.17.so
7f6f74112000-7f6f74113000 rw-p 00022000 fd:00 33593974                   /usr/lib64/ld-2.17.so
7f6f74113000-7f6f74114000 rw-p 00000000 00:00 0
7fffa50d1000-7fffa50f2000 rw-p 00000000 00:00 0                          [stack]
7fffa5142000-7fffa5145000 r--p 00000000 00:00 0                          [vvar]
7fffa5145000-7fffa5146000 r-xp 00000000 00:00 0                          [vdso]
ffffffffff600000-ffffffffff601000 --xp 00000000 00:00 0                  [vsyscall]

The shared library, heap and stack regions are all visible here.

[root@centosgpt vm]# ./memory_layout
Welcome to per thread arena example::1699
Before malloc in main thread
addr = 7fffa50f09f0

addr = 7fffa50f09f0  value = garlic
After malloc and before free in main thread

The local variable addr falls inside the stack region, and the stack sits in the relatively high part of the address space.

7fffa50d1000-7fffa50f2000 rw-p 00000000 00:00 0                          [stack]
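
That membership can be checked with a little arithmetic. The sketch below (Python, with helper names chosen for this example) tests whether the address lies in the [stack] range and splits it into a virtual page number and an offset within the page, assuming the 4 KB page size reported by getconf. The page number is the value that page-types prints in its voffset column.

```python
PAGE_SIZE = 4096  # value reported by `getconf PAGESIZE` on the machine above


def split_address(addr, page_size=PAGE_SIZE):
    """Return (virtual page number, offset inside that page)."""
    return addr // page_size, addr % page_size


def in_region(addr, start, end):
    """Ranges in /proc/<pid>/maps include the start and exclude the end."""
    return start <= addr < end


addr = 0x7fffa50f09f0
vpn, offset = split_address(addr)
print(hex(vpn), hex(offset))
print(in_region(addr, 0x7fffa50d1000, 0x7fffa50f2000))
```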

A script can be used to look at the page that holds addr:

#!/usr/bin/python

import sys
import os
import binascii
import struct

# Read one 8-byte entry at the given byte offset of a /proc file.
def read_entry(path, offset, size=8):
  with open(path, 'rb') as f:
      f.seek(offset, 0)
      return struct.unpack('Q', f.read(size))[0]

# Read /proc/$PID/pagemap
def get_pagemap_entry(pid, addr):
  maps_path = "/proc/{0}/pagemap".format(pid)
  if not os.path.isfile(maps_path):
    print "Process {0} doesn't exist.".format (pid)
    return
  page_size = os.sysconf ("SC_PAGE_SIZE")
  pagemap_entry_size = 8
  offset = (addr / page_size) * pagemap_entry_size
  return read_entry (maps_path, offset)

def get_pfn (entry):
  return entry & 0x7FFFFFFFFFFFFF

def is_present(entry):
  return ((entry & (1 << 63)) != 0)

def is_file_page(entry):
  return ((entry & (1 << 61)) != 0)

def get_pagecount(pfn):
    file_path = "/proc/kpagecount"
    offset = pfn * 8
    return read_entry(file_path, offset)

def get_page_flags(pfn):
     file_path = "/proc/kpageflags"
     offset = pfn * 8
     return read_entry(file_path, offset)

def get_page_cgroup(pfn):
     file_path = "/proc/kpagecgroup"
     offset = pfn * 8
     return read_entry(file_path, offset)

if __name__ == "__main__":
  pid = sys.argv[1]
  if sys.argv[2].startswith ("0x"):
    addr = long (sys.argv[2], base = 16)
  else:
    addr = long (sys.argv[2])
  entry = get_pagemap_entry (pid, addr)
  pfn = get_pfn (entry)
  print "PFN: {}".format (hex (pfn))
  print "Is Present? : {}".format(is_present(entry))
  print "Is file-page: {}".format(is_file_page(entry))
  print "Page count: {}".format(get_pagecount(pfn))
  print "Page flags: {}".format(hex(get_page_flags(pfn)))
  print "Page cgroup: {}".format(hex(get_page_cgroup(pfn)))
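
The script above targets Python 2. As a companion, here is a minimal Python 3 sketch of how a single pagemap entry can be decoded. The function names are made up for this example, and the bit positions follow the kernel's pagemap documentation as I understand it: bit 63 present, bit 62 swapped, bit 61 file-mapped or shared anonymous, bit 55 soft-dirty, and bits 0-54 holding either the PFN or the swap type and offset. Recent kernels report a PFN of zero to callers without CAP_SYS_ADMIN, so real PFNs generally require root.

```python
import struct


def decode_pagemap_entry(entry):
    """Decode one 64-bit /proc/<pid>/pagemap entry into a dict."""
    present = bool((entry >> 63) & 1)
    swapped = bool((entry >> 62) & 1)
    info = {
        "present": present,
        "swapped": swapped,
        "file_or_shared_anon": bool((entry >> 61) & 1),
        "soft_dirty": bool((entry >> 55) & 1),
    }
    low = entry & ((1 << 55) - 1)  # bits 0-54
    if present:
        info["pfn"] = low
    elif swapped:
        info["swap_type"] = low & 0x1F   # bits 0-4
        info["swap_offset"] = low >> 5   # bits 5-54
    return info


def read_pagemap_entry(pid, vaddr, page_size=4096):
    """Read the raw entry for vaddr (not called below; needs a live process)."""
    with open("/proc/{}/pagemap".format(pid), "rb") as f:
        f.seek((vaddr // page_size) * 8)  # one 8-byte entry per virtual page
        return struct.unpack("Q", f.read(8))[0]


# Synthetic entry: a present page whose PFN is 0x19fc.
print(decode_pagemap_entry((1 << 63) | 0x19fc))
```

A design note: the low 55 bits are only reported as a PFN when the present bit is set, because for a swapped-out page the same bits describe the swap location instead.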

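The page is currently not in memory. When the present bit is clear, bits 0-54 of the entry no longer hold a page frame number, so the PFN value printed below is not meaningful: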

[root@centosgpt vm]# ./v2pfn.py  1699 0x7fffa50f09f0
PFN: 0xc6280
Is Present? : False
Is file-page: False
Page count: 0
Page flags: 0x100000
Page cgroup: 0x0

page-types shows that the process's pages have been swapped out (a kernel build was running at the time):

[root@localhost vm]# ./page-types -p 1699  -L
voffset offset  flags
600     6319    ___________________________________________
601     6312    ___________________________________________
7f6f73cc9       631f    ___________________________________________
7f6f73cca       6320    ___________________________________________
7f6f73ccb       6321    ___________________________________________
7f6f73ccc       62cf    ___________________________________________
7f6f73ccd       6322    ___________________________________________
7f6f73cce       631c    ___________________________________________
7f6f73ccf       6326    ___________________________________________
7f6f73cd0       6324    ___________________________________________
7f6f73cd2       6325    ___________________________________________
7f6f73eea       631b    ___________________________________________
7f6f73eeb       6586    ___________________________________________
7f6f73eef       6323    ___________________________________________
7f6f74101       631d    ___________________________________________
7f6f74102       631e    ___________________________________________
7f6f74103       62d0    ___________________________________________
7f6f7410f       652d    ___________________________________________
7f6f74110       6318    ___________________________________________
7f6f74111       6315    ___________________________________________
7f6f74112       6313    ___________________________________________
7f6f74113       6316    ___________________________________________
7fffa50ee       631a    ___________________________________________
7fffa50ef       6317    ___________________________________________
7fffa50f0       6314    ___________________________________________
7fffa50f1       6311    ___________________________________________
7fffa5145       978c    ___________M_______________________________

             flags      page-count       MB  symbolic-flags                     long-symbolic-flags
0x0000000000000000              26        0  ___________________________________________
0x0000000000000800               1        0  ___________M_______________________________        mmap
             total              27        0

Now let memory_layout continue (press Enter in its terminal) and look again:

[root@centosgpt vm]# ./v2pfn.py  1699 0x7fffa50f09f0
PFN: 0x19fcL
Is Present? : True
Is file-page: False
Page count: 1
Page flags: 0x5838
Page cgroup: 0x1
[root@centosgpt vm]#

At this point the process's pages have been brought back into memory:

[root@localhost vm]# ./page-types -p 1699 -L
voffset offset  flags
400     adee    ___U_l_____M_______________________________
600     a050    ___U_l_____Masb____________________________
601     1c20b   ___UDl_____Ma_b____________________________
2166    16e9f   ___U_l_____Ma_b____________________________
...
7f6f74110       14aa7   ___U_l_____Masb____________________________
7f6f74111       19fd    ___U_l_____Masb____________________________
7f6f74112       1c20c   ___UDl_____Ma_b____________________________
7f6f74113       13d1a   ___U_l_____Masb____________________________
7fffa50ee       631a    ___________________________________________
7fffa50ef       6317    ___________________________________________
7fffa50f0       19fc    ___UDl_____Ma_b____________________________
7fffa50f1       1c215   ___U_l_____Masb____________________________
7fffa5145       978c    ___________M_______________________________
...

             flags      page-count       MB  symbolic-flags                     long-symbolic-flags
0x0000000000000000               7        0  ___________________________________________
0x0000000000000800               1        0  ___________M_______________________________        mmap
0x0000000000000828               1        0  ___U_l_____M_______________________________        uptodate,lru,mmap
0x000000000000086c             217        0  __RU_lA____M_______________________________        referenced,uptodate,lru,active,mmap
0x0000000000007828              10        0  ___U_l_____Masb____________________________        uptodate,lru,mmap,anonymous,swapcache,swapbacked
0x0000000000005828               2        0  ___U_l_____Ma_b____________________________        uptodate,lru,mmap,anonymous,swapbacked
0x0000000000005838               9        0  ___UDl_____Ma_b____________________________        uptodate,dirty,lru,mmap,anonymous,swapbacked
             total             247        0
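
The symbolic names in the summary can be reproduced from the hexadecimal flag values. The sketch below is an illustrative Python decoder; decode_kpageflags is a name invented here, and the bit order is taken from the kernel's kpageflags interface (bits 0 to 22 only, with higher bits ignored in this sketch).

```python
# Names of /proc/kpageflags bits 0..22, spelled as page-types prints them.
KPF_NAMES = [
    "locked", "error", "referenced", "uptodate", "dirty", "lru", "active",
    "slab", "writeback", "reclaim", "buddy", "mmap", "anonymous",
    "swapcache", "swapbacked", "compound_head", "compound_tail", "huge",
    "unevictable", "hwpoison", "nopage", "ksm", "thp",
]


def decode_kpageflags(flags):
    """Return the names of the flag bits (0..22) that are set in flags."""
    return [name for bit, name in enumerate(KPF_NAMES) if (flags >> bit) & 1]


# Flag values taken from the page-types summary and the v2pfn.py runs above.
for value in (0x5838, 0x86c, 0x7828, 0x100000):
    print(hex(value), ",".join(decode_kpageflags(value)))
```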

  • The vdso part
7ffe05bdc000-7ffe05bdf000 r--p 00000000 00:00 0                          [vvar]
7ffe05bdf000-7ffe05be0000 r-xp 00000000 00:00 0                          [vdso]
ffffffffff600000-ffffffffff601000 --xp 00000000 00:00 0                  [vsyscall]

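The vDSO (virtual dynamic shared object) is a small shared library that the kernel automatically maps into the address space of every user-space application.

It can be dumped to see what it contains: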

# gdb ./memory_layout
(gdb) b _start
Breakpoint 1 at 0x400670
(gdb) run
Starting program: /root/src/memory/vm/./memory_layout
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".

Breakpoint 1, 0x0000000000400670 in _start ()
(gdb) info program
        Using the running image of child Thread 0x7ffff7fe8740 (LWP 84240).
Program stopped at 0x400670.
It stopped at breakpoint 1.

(gdb) shell cat /proc/84240/maps |grep vdso
7ffff7ffb000-7ffff7ffc000 r-xp 00000000 00:00 0                          [vdso]
(gdb) dump memory /root/src/memory/vm/vdso.so 0x7ffff7ffb000 0x7ffff7ffc000
(gdb) quit

# objdump -T vdso.so

vdso.so:     file format elf64-x86-64

DYNAMIC SYMBOL TABLE:
00000000000008d0  w   DF .text  00000000000000a6  LINUX_2.6   clock_gettime
0000000000000820 g    DF .text  0000000000000085  LINUX_2.6   __vdso_gettimeofday
0000000000000980  w   DF .text  000000000000004b  LINUX_2.6   clock_getres
0000000000000980 g    DF .text  000000000000004b  LINUX_2.6   __vdso_clock_getres
0000000000000820  w   DF .text  0000000000000085  LINUX_2.6   gettimeofday
00000000000008b0 g    DF .text  0000000000000015  LINUX_2.6   __vdso_time
00000000000008b0  w   DF .text  0000000000000015  LINUX_2.6   time
00000000000008d0 g    DF .text  00000000000000a6  LINUX_2.6   __vdso_clock_gettime
0000000000000000 g    DO *ABS*  0000000000000000  LINUX_2.6   LINUX_2.6
00000000000009d0 g    DF .text  000000000000002a  LINUX_2.6   __vdso_getcpu
00000000000009d0  w   DF .text  000000000000002a  LINUX_2.6   getcpu

The vdso(7) man page also lists the exported functions:

   x86-64 functions
       The table below lists the symbols exported by the vDSO.  All of these
       symbols are also available without the "__vdso_" prefix, but you
       should ignore those and stick to the names below. 

       symbol                 version
       ─────────────────────────────────
       __vdso_clock_gettime   LINUX_2.6
       __vdso_getcpu          LINUX_2.6
       __vdso_gettimeofday    LINUX_2.6
       __vdso_time            LINUX_2.6

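vvar holds the data these functions return. The kernel can modify it, while ordinary processes can only read it. When a process is created it is given a vdso region in its virtual address space, and the page behind that region is provided by the kernel. The process can therefore obtain time information directly instead of going through a system call, which saves the switch from user mode to kernel mode and part of the context-switch work.
Below, two processes are started and the page behind the vdso mapping of each is examined.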

[root@centosgpt vm]# ./memory_layout
Welcome to per thread arena example::20833
Before malloc in main thread
addr = 7ffc13b34200

[root@centosgpt vdso]# cat /proc/20833/maps
...
7ffc13bc6000-7ffc13bc9000 r--p 00000000 00:00 0                          [vvar]
7ffc13bc9000-7ffc13bca000 r-xp 00000000 00:00 0                          [vdso]
ffffffffff600000-ffffffffff601000 --xp 00000000 00:00 0                  [vsyscall]
...

[root@centosgpt vm]# ./v2pfn.py 20833 0x7ffc13bc9000
PFN: 0xcd8cL
Is Present? : True
Is file-page: True
Page count: 34
Page flags: 0x100000800
Page cgroup: 0x0

[root@centosgpt vm]# ./memory_layout
Welcome to per thread arena example::20786
Before malloc in main thread
addr = 7ffc8b6a61b0

[root@centosgpt vm]# cat /proc/20786/maps
...
7ffc8b7c3000-7ffc8b7c6000 r--p 00000000 00:00 0                          [vvar]
7ffc8b7c6000-7ffc8b7c7000 r-xp 00000000 00:00 0                          [vdso]
ffffffffff600000-ffffffffff601000 --xp 00000000 00:00 0                  [vsyscall]

[root@centosgpt vm]# ./v2pfn.py 20786 0x7ffc8b7c6000
PFN: 0xcd8cL
Is Present? : True
Is file-page: True
Page count: 35
Page flags: 0x100000804
Page cgroup: 0x0

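Both processes report the same PFN, 0xcd8c, so their vdso mappings point at the same physical page, at physical address 0xcd8c000. Let's see where that lies in physical memory: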

[root@centosgpt vm]# cat /proc/iomem
...
00100000-1fedffff : System RAM
  0bc00000-0c800de0 : Kernel code
  0ca00000-0ce16fff : Kernel rodata
  0d000000-0d20a97f : Kernel data
  0d7dc000-0ddfffff : Kernel bss
...

0xcd8c000 lies within 0ca00000-0ce16fff (Kernel rodata), that is, inside the area where the kernel image is loaded.
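
The step from PFN to physical address and the range lookup can be written out as follows. This is a small Python sketch with helper names chosen for illustration; it assumes 4 KB pages, and the ranges are copied from the /proc/iomem excerpt above.

```python
PAGE_SHIFT = 12  # 4 KB pages, as reported by getconf PAGESIZE


def pfn_to_phys(pfn, page_shift=PAGE_SHIFT):
    """Physical address of the first byte of page frame pfn."""
    return pfn << page_shift


def find_ranges(phys, ranges):
    """Names of all (start, end, name) ranges that contain phys (ends inclusive)."""
    return [name for start, end, name in ranges if start <= phys <= end]


iomem = [
    (0x00100000, 0x1fedffff, "System RAM"),
    (0x0bc00000, 0x0c800de0, "Kernel code"),
    (0x0ca00000, 0x0ce16fff, "Kernel rodata"),
    (0x0d000000, 0x0d20a97f, "Kernel data"),
    (0x0d7dc000, 0x0ddfffff, "Kernel bss"),
]

phys = pfn_to_phys(0xcd8c)
print(hex(phys), find_ranges(phys, iomem))
```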

An excellent article on the history of the vDSO: 什麼是 Linux vDSO 與 vsyscall?——發展過程

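Memory contents of a Linux process

  • /proc/pid/mem

This file can be used to read and modify the data in a process's memory. Some earlier kernel versions required attaching to the process first with ptrace using PTRACE_ATTACH, and detaching with PTRACE_DETACH once the work was done.

  • Example using ptrace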
#include<stdio.h>
#include<stdlib.h>
#include<unistd.h>
#include<sys/mman.h>
#include<fcntl.h>
#include<string.h>
#include<sys/ptrace.h>
#include<sys/wait.h>   /* waitpid() */
int
main (int argc, char **argv)
{
  if (argc == 3)
    {
      int pid = atoi (argv[1]);
      unsigned long long mem_addr =  strtoll (argv[2], NULL, 16);

      printf("address %llx\n", mem_addr);

      long  r = ptrace (PTRACE_ATTACH, pid, NULL, NULL);
      if (r < 0) {
        printf(" unable to attach to the pid \n");
        return 0;
      }

      waitpid(pid, NULL, 0);

      char mem_fname[1024];
      sprintf (mem_fname, "/proc/%d/mem", pid);
      printf("mem file name %s\n", mem_fname);

      int mem_file = open(mem_fname, O_RDWR);

      unsigned long long var_addr;

      var_addr = mem_addr;
      pread(mem_file, &var_addr, sizeof(unsigned long long), mem_addr);
      printf ("var %llx\n", var_addr);

      char var[7]={0};
      pread(mem_file, var, sizeof(char)*6, var_addr);
      printf ("var %s\n", var);

      strncpy(var, "hello", 6);
      pwrite(mem_file, var, sizeof(char)*6 , var_addr);
      printf ("write var %s\n", var);

      pread(mem_file, var, sizeof(char)*6, var_addr);
      printf ("read againt var %s\n", var);

      close (mem_file);

      /* PTRACE_DETACH also resumes the stopped tracee, so no PTRACE_CONT is needed */
      ptrace (PTRACE_DETACH, pid, NULL, NULL);

      return 0;
    }
  else
    {
      printf ("%s <pid> <mem-address>\n", argv[0]);
      return 0;
    }
}

Newer kernels do not require ptrace. Example program:

#include<stdio.h>
#include<stdlib.h>
#include<unistd.h>
#include<sys/mman.h>
#include<fcntl.h>
#include<string.h>
#include<sys/ptrace.h>
#include <sys/uio.h>

int
main (int argc, char **argv)
{
  if (argc == 3)
    {
      int pid = atoi (argv[1]);
      unsigned long long mem_addr =  strtoll (argv[2], NULL, 16);

      printf("address %llx\n", mem_addr);

      char mem_fname[1024];
      sprintf (mem_fname, "/proc/%d/mem", pid);
      printf("mem file name %s\n", mem_fname);

      /* "r+" opens the file for reading and writing; with "r" the fwrite() below would fail */
      FILE *mem_file = fopen(mem_fname, "r+");

      unsigned long long var_addr;
      fseeko (mem_file, mem_addr, SEEK_SET);

      fread(&var_addr, 1 ,sizeof(unsigned long long), mem_file);
      printf ("var %llx\n", var_addr);

      fseeko (mem_file, var_addr, SEEK_SET);
      char var[7]={0};
      fread(var, 6 ,sizeof(char) , mem_file);

      printf ("read var %s\n", var);

      memcpy(var, "hello", 6);
      fseeko (mem_file, var_addr, SEEK_SET);
      fwrite(var, 6 ,sizeof(char) , mem_file);
      printf ("write var %s\n", var);

      fseeko (mem_file, var_addr, SEEK_SET);
      fread(var, 6 ,sizeof(char) , mem_file);
      printf ("againt read var %s\n", var);

      fclose (mem_file);

      return 0;
    }
  else
    {
      printf ("%s <pid> <mem-address>\n", argv[0]);
      return 0;
    }
}

Use memdump3 to read the data at the address that addr points to in the example process memory_layout.

  • Run the memory_layout process
[root@centosgpt vm]# ./memory_layout
Welcome to per thread arena example::27496
Before malloc in main thread
addr = 7ffd2c137f20

addr = 7ffd2c137f20  value = garlic
After malloc and before free in main thread
  • Read and modify the process memory with memdump3
[root@centosgpt vm]# ./memdump3 27496 7ffd2c137f20
address 7ffd2c137f20
mem file name /proc/27496/mem
var 1736010
read var garlic
write var ...hello
againt read var hello

Note that the program itself verifies that the write took effect, by reading the data back after this step:

...
      memcpy(var, "hello", 6);
      fseeko (mem_file, var_addr, SEEK_SET);
      fwrite(var, 6 ,sizeof(char) , mem_file);
...
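
The same read, write, and read-back sequence can be tried without a second process by going through /proc/self/mem. The Python sketch below is an illustration under that assumption, not the author's tool: a ctypes buffer inside the interpreter stands in for the remote variable, and the file is opened unbuffered so that every seek, read and write goes directly to the kernel.

```python
import ctypes

# A buffer inside this very process plays the role of the remote variable.
buf = ctypes.create_string_buffer(b"garlic", 16)
addr = ctypes.addressof(buf)

# buffering=0 keeps Python from caching data around the seek position.
with open("/proc/self/mem", "r+b", buffering=0) as mem:
    mem.seek(addr)           # the file offset is the virtual address
    before = mem.read(6)
    mem.seek(addr)
    mem.write(b"hello\0")    # overwrite the string in place

print(before, buf.value)
```

Going through /proc/self/mem avoids the permission checks that apply when opening another process's mem file, which makes it a convenient way to experiment with the interface.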
  • System calls for reading and writing process memory
    process_vm_readv()
    process_vm_writev()

Example


[root@centosgpt vm]# cat processwritea.c
#define _GNU_SOURCE   /* needed for process_vm_readv() and process_vm_writev() */
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <stdlib.h>
#include <errno.h>

int read_data(pid_t pid, unsigned long long addr,
              void *var_addr, size_t var_len);

int main (int argc, char **argv)
{
  struct iovec local[1];
  struct iovec remote[1];
  char buf1[10] = { 0 };
  ssize_t nwrite;

  if (argc == 3)
    {
      pid_t pid = atoi (argv[1]);
      unsigned long long addr = strtoull (argv[2], NULL, 16);
      printf("address %llx\n", addr);
      printf("reading ...\n");

      /* addr is the address of the pointer variable; read the pointer value first */
      unsigned long long var_addr;
      size_t datasize = sizeof(unsigned long long);
      read_data(pid, addr, &var_addr, datasize);
      printf("var_address =[%11llx] \n", var_addr);

      /* then read the 6 bytes the pointer refers to */
      read_data(pid, var_addr, buf1, 6);
      printf("buff = [%s]\n", buf1);

      /* overwrite them in the target process */
      memcpy (buf1, "hello", 6);
      local[0].iov_base = buf1;
      local[0].iov_len = 6;
      remote[0].iov_base = (void *)var_addr;
      remote[0].iov_len = 6;

      nwrite = process_vm_writev (pid, local, 1, remote, 1, 0);
      if (nwrite != 6) {
        printf("ER!\n");
        return 1;
      } else {
        printf("OK!\n");
        read_data(pid, var_addr, buf1, 6);
        printf("buff = [%s]\n", buf1);
        return 0;
      }
    }
  else
    {
      printf ("%s <pid> <mem-address>\n", argv[0]);
      return 0;
    }
}

/* Read var_len bytes at addr in process pid into var_addr. */
int read_data(pid_t pid, unsigned long long addr,
              void *var_addr, size_t var_len)
{
  struct iovec local[1];
  struct iovec remote[1];
  ssize_t nread;

  local[0].iov_base = (void *)(var_addr);
  local[0].iov_len = var_len;
  remote[0].iov_base = (void *)addr;
  remote[0].iov_len = var_len;

  nread = process_vm_readv(pid, local, 1, remote, 1, 0);
  if (nread < 0 || (size_t)nread != var_len) {
    printf("read ER! %zd %zu %d\n", nread, var_len, errno);
    return 1;
  } else {
    printf("read OK!\n");
    return 0;
  }
}

After the memory_layoutaddr process has allocated memory and assigned a value to it, the value can be modified from another process with process_vm_writev():

[root@centosgpt vm]# ./processwritea 28316 7ffc4c5f7f00
address 7ffc4c5f7f00
reading ...
read OK!
var_address =[    11f2010]
read OK!
buff = [garlic]
OK!
read OK!
buff = [hello]

Memory mapping

  • /proc/pid/pagemap
  • /proc/kpagecount
  • /proc/kpageflags
  • /proc/kpagecgroup

The memory cgroup needs the following kernel configuration:
CONFIG_CGROUPS=y
CONFIG_MEMCG=y
CONFIG_MEMCG_SWAP=y
CONFIG_MEMCG_KMEM=y

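This section uses page-types, a tool under tools/vm in the Linux kernel source tree. It only needs to be compiled before use.

Enable idle page tracking
  • Build the kernel
# Enter the Linux kernel source directory and copy over the current kernel configuration
cp /boot/config-xxxxx .config
# Configure the kernel; nothing was changed here
make menuconfig
# Remove stale build artifacts, then build the kernel image and the modules
make -j 4 clean bzImage modules
# Install the modules
make modules_install
# Install the kernel
make install
# Regenerate the boot menu
grub2-mkconfig -o /boot/grub2/grub.cfg
# Reboot to verify
reboot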

The bitmap file now shows up under the page_idle path:

[root@centosgpt page_idle]# ls | sed "s:^:$(pwd)/:"
/sys/kernel/mm/page_idle/bitmap
  • Set CONFIG_IDLE_PAGE_TRACKING=y
# make menuconfig
...
    Memory Management options  --->       
...

[*] Enable idle page tracking     

# cat .config|grep IDLE
...
CONFIG_IDLE_PAGE_TRACKING=y
...

While the kernel was being compiled, memory pages were swapped out:

[root@localhost vm]# ./page-types -p 1699  -L
voffset offset  flags
600     6319    ___________________________________________
601     6312    ___________________________________________
7f6f73cc9       631f    ___________________________________________
7f6f73cca       6320    ___________________________________________
7f6f73ccb       6321    ___________________________________________
7f6f73ccc       62cf    ___________________________________________
7f6f73ccd       6322    ___________________________________________
7f6f73cce       631c    ___________________________________________
7f6f73ccf       6326    ___________________________________________
7f6f73cd0       6324    ___________________________________________
7f6f73cd2       6325    ___________________________________________
7f6f73eea       631b    ___________________________________________
7f6f73eeb       6586    ___________________________________________
7f6f73eef       6323    ___________________________________________
7f6f74101       631d    ___________________________________________
7f6f74102       631e    ___________________________________________
7f6f74103       62d0    ___________________________________________
7f6f7410f       652d    ___________________________________________
7f6f74110       6318    ___________________________________________
7f6f74111       6315    ___________________________________________
7f6f74112       6313    ___________________________________________
7f6f74113       6316    ___________________________________________
7fffa50ee       631a    ___________________________________________
7fffa50ef       6317    ___________________________________________
7fffa50f0       6314    ___________________________________________
7fffa50f1       6311    ___________________________________________
7fffa5145       978c    ___________M_______________________________

             flags      page-count       MB  symbolic-flags                     long-symbolic-flags
0x0000000000000000              26        0  ___________________________________________
0x0000000000000800               1        0  ___________M_______________________________        mmap
             total              27        0
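  • Translating a user-space virtual address to a physical address
    • Look up the virtual address layout in /proc/pid/maps
    • Choose the address to read
    • Read the PFN (page frame number) from /proc/pid/pagemap
    • Physical address: paddr = pfn*page_size + vir%page_size
    • Get the page's map count and page flags from /proc/kpagecount and /proc/kpageflags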

Here the page that backs the vDSO shows up again:


[root@centosgpt vm]# ./memory_layout
Welcome to per thread arena example::81775
Before malloc in main thread
addr = 7ffc836ac480

[root@centosgpt vm]# cat /proc/81775/maps|grep vdso
7ffc836bf000-7ffc836c0000 r-xp 00000000 00:00 0                          [vdso]

[root@centosgpt vm]# ./page-types -p 81775 -L -a 0x7ffc836bf, -N
voffset offset  flags
7ffc836bf       cd8c    __R________M_______________________________

The corresponding PFN is cd8c, so the physical address is 0xcd8c*0x1000 = 0xcd8c000.

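There is one thing about this page that I have not figured out: the R (referenced) flag is shown on the first run but no longer on the next one. A possible reason is that the vDSO is a kernel page mapped into user space, so it cannot be managed on the LRU list and the PG_active flag never appears.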

page-types supports the following options:


page-types [options]
            -r|--raw                   Raw mode, for kernel developers
            -d|--describe flags        Describe flags
            -a|--addr    addr-spec     Walk a range of pages
            -b|--bits    bits-spec     Walk pages with specified bits
            -c|--cgroup  path|@inode   Walk pages within memory cgroup
            -p|--pid     pid           Walk process address space
            -f|--file    filename      Walk file address space
            -i|--mark-idle             Mark pages idle
            -l|--list                  Show page details in ranges
            -L|--list-each             Show page details one by one
            -C|--list-cgroup           Show cgroup inode for pages
            -M|--list-mapcnt           Show page map count
            -N|--no-summary            Don't show summary info
            -X|--hwpoison              hwpoison pages
            -x|--unpoison              unpoison pages
            -F|--kpageflags filename   kpageflags file to parse
            -h|--help                  Show this usage message

To start with, only the -p, -L and -l options are used:

[root@centosgpt vm]# ./page-types -p 106002 -l
voffset offset  len     flags
400     12077   1       __RUDlA____Ma_b____________________________
401     602e    1       __RU_lA____M_______________________________
601     8c05    1       __RU_lA____Ma_b____________________________
602     8e03    1       ___U_lA____Ma_b____________________________

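voffset: the offset of the virtual page, in units of one page (sysconf(_SC_PAGESIZE))
offset: the PFN
len: the number of pages; shown with -l, whereas -L lists the pages one by one
flags: the page flags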

          locked              error         referenced           uptodate
           dirty                lru             active               slab
       writeback            reclaim              buddy               mmap
       anonymous          swapcache         swapbacked      compound_head
   compound_tail               huge        unevictable           hwpoison
          nopage                ksm                thp            offline
       zero_page          idle_page            pgtable           reserved(r)
         mlocked(r)    mappedtodisk(r)         private(r)       private_2(r)
   owner_private(r)            arch(r)        uncached(r)       softdirty(r)
       readahead(o)       slob_free(o)     slub_frozen(o)      slub_debug(o)
            file(o)            swap(o)  mmap_exclusive(o)
                                   (r) raw mode bits  (o) overloaded bits

The -i option marks pages as idle:

[root@centosgpt vm]# ./page-types -p 106215 -i
[root@centosgpt vm]# ./page-types -p 106215 -l
voffset offset  len     flags
400     1d32    1       __RUDlA____Ma_b__________i_________________
401     602e    1       __RU_lA____M_____________i_________________
601     5be5    1       __RU_lA____Ma_b__________i_________________
602     dc49    1       ___U_lA____Ma_b__________i_________________
7ffff7a0e       dec3    1       __RU_lA____M_______________________________

The -C option shows the inode of the cgroup that each page belongs to:

[root@centosgpt vm]# ./page-types -p 106215 -l -C
voffset cgroup  offset  len     flags
400     @1      1d32    1       __RUDlA____Ma_b__________i_________________
401     @1      602e    1       __RU_lA____M_____________i_________________
601     @1      5be5    1       __RU_lA____Ma_b__________i_________________
602     @1      dc49    1       ___U_lA____Ma_b__________i_________________

[root@centosgpt cgroup]# ls -i /sys/fs/cgroup/|grep memory
   1 memory

cgroup: the inode of the corresponding cgroup

The -f option shows how many pages a file needs when it is loaded into memory:

[root@centosgpt vm]# ./page-types -f /root/src/memory/vm/translate -L
foffset offset  flags
/root/src/memory/vm/translate   Inode: 9733651  Size: 13664 (4 pages)
Modify: Sun Mar 22 21:12:39 2020 (1714962 seconds ago)
Access: Sat Apr 11 17:16:33 2020 (1128 seconds ago)
0       a9aa    __RU_lA____________________________________
1       602e    __RU_lA____M_____________i_________________
2       11d63   __RU_lA____________________________________
3       3ce1    __RU_lA____________________________________

             flags      page-count       MB  symbolic-flags                     long-symbolic-flags
0x000000000000006c               3        0  __RU_lA____________________________________        referenced,uptodate,lru,active
0x000000000200086c               1        0  __RU_lA____M_____________i_________________        referenced,uptodate,lru,active,mmap,idle_page
             total               4        0

Four pages are needed: the 13664-byte file rounds up to four 4096-byte pages.

-a selects a range of pages:

N                          one page at offset N (unit: pages)
N+M                        pages range from N to N+M-1
N,M                        pages range from N to M-1
N,                         pages range from N to end
,M                         pages range from 0 to M-1

[root@centosgpt vm]# ./page-types -p 106215 -a 0x400,0x603 -L
voffset offset  flags
400     1d32    __RUDlA____Ma_b__________i_________________
401     602e    __RU_lA____M_____________i_________________
601     5be5    __RU_lA____Ma_b__________i_________________
602     dc49    ___U_lA____Ma_b__________i_________________

             flags      page-count       MB  symbolic-flags                     long-symbolic-flags
0x000000000200086c               1        0  __RU_lA____M_____________i_________________        referenced,uptodate,lru,active,mmap,idle_page
0x0000000002005868               1        0  ___U_lA____Ma_b__________i_________________        uptodate,lru,active,mmap,anonymous,swapbacked,idle_page
0x000000000200586c               1        0  __RU_lA____Ma_b__________i_________________        referenced,uptodate,lru,active,mmap,anonymous,swapbacked,idle_page
0x000000000200587c               1        0  __RUDlA____Ma_b__________i_________________        referenced,uptodate,dirty,lru,active,mmap,anonymous,swapbacked,idle_page
             total               4        0

-M shows the map count of each page:

[root@centosgpt vm]# ./page-types -p 106215 -a 0x400,0x603 -M -L
voffset map-cnt offset  flags
400     1       1d32    __RUDlA____Ma_b__________i_________________
401     1       602e    __RU_lA____M_____________i_________________
601     1       5be5    __RU_lA____Ma_b__________i_________________
602     1       dc49    ___U_lA____Ma_b__________i_________________

             flags      page-count       MB  symbolic-flags                     long-symbolic-flags
0x000000000200086c               1        0  __RU_lA____M_____________i_________________        referenced,uptodate,lru,active,mmap,idle_page
0x0000000002005868               1        0  ___U_lA____Ma_b__________i_________________        uptodate,lru,active,mmap,anonymous,swapbacked,idle_page
0x000000000200586c               1        0  __RU_lA____Ma_b__________i_________________        referenced,uptodate,lru,active,mmap,anonymous,swapbacked,idle_page
0x000000000200587c               1        0  __RUDlA____Ma_b__________i_________________        referenced,uptodate,dirty,lru,active,mmap,anonymous,swapbacked,idle_page
             total               4        0

-r: raw mode, which also shows some extended flags:

/* [48-] take some arbitrary free slots for expanding overloaded flags
 * not part of kernel API
 */
#define KPF_READAHEAD           48
#define KPF_SLOB_FREE           49
#define KPF_SLUB_FROZEN         50
#define KPF_SLUB_DEBUG          51
#define KPF_FILE                61
#define KPF_SWAP                62
#define KPF_MMAP_EXCLUSIVE      63

Some of these flags can be found in fs/proc/task_mmu.c.

[root@centosgpt vm]# ./page-types -r -p 106215 -L
voffset offset  flags
400     1d32    __RUDlA____Ma_b__________i_________f______1
401     602e    __RU_lA____M_____________i_________f____F_1
601     5be5    __RU_lA____Ma_b__________i_________f______1
602     dc49    ___U_lA____Ma_b__________i_________f______1

-F: parses the given kpageflags file; add -L to locate the individual pages:

[root@centosgpt vm]# ./page-types -F /proc/kpageflags
             flags      page-count       MB  symbolic-flags                     long-symbolic-flags
0x0000000000000000           31220      121  ___________________________________________
0x0000000004000000            1521        5  __________________________g________________        pgtable
0x0000000001000000               1        0  ________________________z__________________        zero_page
0x0000000000000028            3004       11  ___U_l_____________________________________        uptodate,lru
0x0000000002000028            2968       11  ___U_l___________________i_________________        uptodate,lru,idle_page
0x000000000200002c            7397       28  __RU_l___________________i_________________        referenced,uptodate,lru,idle_page
0x000000000000002c             456        1  __RU_l_____________________________________        referenced,uptodate,lru

-X, -x: for debugging

[root@localhost boot]# ls -atlr *5.5.4*
-rw-------. 1 root root 52185823 Apr  5 14:48 initramfs-5.5.4.img
-rw-r--r--. 1 root root        0 Apr  5 15:12 vmlinuz-5.5.4
-rw-r--r--. 1 root root        0 Apr  5 15:12 System.map-5.5.4

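hwpoison: when a memory page goes bad, this flag is set on it so that applications no longer use the page, which keeps the system stable. The operations below only simulate such a failure in software.

Load the hwpoison-inject module: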

# modprobe hwpoison-inject
# lsmod |grep hw

Start the sample program:

[root@centosgpt vm]# ./memory_layout
Welcome to per thread arena example::8044
Before malloc in main thread
addr = 7ffec96940f0

Poison a single page; here the page at virtual page offset 0x400 is chosen:

[root@centosgpt vm]# ./page-types -p 8044 -L -N -a 0x400
voffset offset  flags
400     c963    __RU_lA____M_______________________________

[root@centosgpt vm]# ./page-types -p 8044 -L -N -a 0x400 -X
voffset offset  flags
400     c963    __RU_lA____M_______________________________

[root@centosgpt vm]# ./page-types -p 8044 -L -N -a 0x400
voffset offset  flags

At this point the page has been unmapped. Check the kernel log and the page flags:

# dmesg
...
[19664.318542] Injecting memory failure at pfn 0xc963
[19664.318794] Memory failure: 0xc963: corrupted page was clean: dropped without side effects
[19664.318815] Memory failure: 0xc963: recovery action for clean LRU page: Recovered

[root@centosgpt vm]# ./page-types -a 0xc963 -L -N
offset  flags
c963    __RU_______________X_______________________

Run a script that scans the PFNs mapped by every process. The scan is not very accurate, but it finds no process using the page:

[root@centosgpt shell]# ./findpdibypfn.sh c963
[root@centosgpt shell]# cat findpdibypfn.sh
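#!/bin/bash
# Usage: findpdibypfn.sh <pfn-in-hex>
# Print every process whose mapped pages include the given PFN.

pt=/srv/linux-5.5.4/tools/vm/page-types
for i in $(ls /proc | grep '[0-9]'); do
    # Column 2 of the page-types output is the PFN; grep does a substring
    # match, so the result is only approximate.
    $pt -p $i -N -l 2>/dev/null| awk '{print $2}' |grep $1   >/dev/null 2>&1
    if [ $? -eq 0 ]; then
       echo "process id : '$i' "
       echo " $(ps -ef |grep $i|grep -v grep)  "
    fi
done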

I stepped away to cook a meal; when I came back the PFN was still not mapped:

[root@centosgpt vm]# ./page-types -a 0xc963 -L -N
offset  flags
c963    __RU_______________X_______________________

[root@centosgpt shell]# ./findpdibypfn.sh c963
[root@centosgpt shell]#

After the flag is cleared, the page soon shows up as mmap-ed again:

[root@centosgpt vm]# ./page-types -a 0xc963 -L -N -x
offset  flags
c963    __RU_______________X_______________________
[root@centosgpt vm]# ./page-types -a 0xc963 -L -N
offset  flags
c963    ___________________________________________
[root@centosgpt vm]# ./page-types -a 0xc963 -L -N
offset  flags
c963    ___U_lA____Ma_b____________________________

Look up the process that uses it. The mapping can of course change, so a later lookup may find nothing:

[root@centosgpt shell]# ./findpdibypfn.sh c963
process id : '15680'
 root      15680  15673  0 13:18 ?        00:00:00 sendmail -i root
root      15693  15680  0 13:18 ?        00:00:00 /usr/sbin/postdrop -r

[root@centosgpt vm]# ./page-types -p 15680 -L |grep c963
7fb6669ff       c963    ___U_lA____Ma_b____________________________

These are notes taken while working on a homework assignment from Liu Chao's course <趣谈Linux操作系统> on Geek Time (极客时间).

References

Understanding the Memory Layout of Linux Executables
proc – process information pseudo-filesystem
RHEL-6 – 5.2.12. /proc/iomem
How to translate virtual to physical addresses through /proc/pid/pagemap
Pagemap Interface of Linux Explained
How do I read from /proc/$pid/mem under Linux?
Examining Process Page Tables
VDSO(7)
x86 架构下 Linux 的系统调用与 vsyscall, vDSO
什麼是 Linux vDSO 與 vsyscall?——發展過程
