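Linux Process Memory Layout and Mapping Information
To run a program, the processor needs memory to hold the program and the data it uses. Modern operating systems provide an abstraction over that memory, called virtual memory, so that applications do not have to think much about how physical memory is used. This simplifies memory management.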
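Virtual Memory
Virtual memory is an elegant interaction of hardware exceptions, hardware address translation, main memory, disk files, and kernel software that provides each process with a large, uniform, and private address space.
-- Computer Systems: A Programmer's Perspective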
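When many applications run complex tasks at the same time, letting processes use physical memory directly causes several problems:
- Data size: the amount of memory an application can use is limited by the size of physical memory;
- Complexity: the application has to know which physical addresses it uses, when to allocate and free them, and even whether to compact memory after fragmentation;
- Safety: if one process accidentally overwrites memory that another process is using, that other process may crash for no apparent reason.
Virtual memory solves these problems well:
- Efficient use of memory: main memory is treated as a cache for an address space stored on disk, and only the active regions are kept in main memory
- Simpler memory management: virtual memory gives every process a uniform address space
- Protection of the process address space: one process cannot corrupt the address space of another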
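Memory Layout and Mapping Information
Virtual memory is a management layer; in the end physical memory is still what gets used. Linux provides several interfaces for inspecting a process's (virtual) memory layout and the mapping between virtual and physical memory.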
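Related proc files
The following proc files expose layout and mapping information:
- /proc/iomem: physical memory layout
- /proc/pid/maps: virtual memory layout of a process
- /proc/pid/mem: contents of a process's memory
- /proc/pid/pagemap: how virtual pages map to physical memory
- /proc/kpagecount: number of times each page is mapped
- /proc/kpageflags: state (flags) of each page
- /proc/kpagecgroup: inode of the cgroup that each page belongs to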
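Physical memory layout
- /proc/iomem
This file shows how system memory and each physical device are mapped into the physical address space. The virtual machine used here is configured with 512 MB of physical memory.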
00000000-00000fff : Reserved
00001000-0009ebff : System RAM
0009ec00-0009ffff : Reserved
000a0000-000bffff : PCI Bus 0000:00
000c0000-000c7fff : Video ROM
000ca000-000cafff : Adapter ROM
000cc000-000cffff : PCI Bus 0000:00
000d0000-000d3fff : PCI Bus 0000:00
000d4000-000d7fff : PCI Bus 0000:00
000d8000-000dbfff : PCI Bus 0000:00
000dc000-000fffff : Reserved
000f0000-000fffff : System ROM
00100000-1fedffff : System RAM
0ba00000-0c600de0 : Kernel code
0c800000-0cc16fff : Kernel rodata
0ce00000-0d00a8ff : Kernel data
0d5dc000-0dbfffff : Kernel bss
1fee0000-1fefefff : ACPI Tables
1feff000-1fefffff : ACPI Non-volatile Storage
1ff00000-1fffffff : System RAM
...
- Column 1: address range
- Column 2: what the range is used for
Several ranges are marked System RAM:
...
00001000-0009ebff : System RAM
...
00100000-1fedffff : System RAM
0ba00000-0c600de0 : Kernel code
0c800000-0cc16fff : Kernel rodata
0ce00000-0d00a8ff : Kernel data
0d5dc000-0dbfffff : Kernel bss
...
1ff00000-1fffffff : System RAM
Converted to decimal:
#!/bin/bash
while IFS='-: ' read b e used_by; do
set -- $(printf "%d %d" 0x$b 0x$e)
echo "$1 to $2 bytes ($(($1/1024/1024)) to $(($2/1024/1024)) Mb) - $used_by"
done < /proc/iomem
...
4096 to 650239 bytes (0 to 0 Mb) - System RAM
...
1048576 to 535691263 bytes (1 to 510 Mb) - System RAM
195035136 to 207621600 bytes (186 to 198 Mb) - Kernel code
209715200 to 214003711 bytes (200 to 204 Mb) - Kernel rodata
216006656 to 218147071 bytes (206 to 208 Mb) - Kernel data
224247808 to 230686719 bytes (213 to 219 Mb) - Kernel bss
...
535822336 to 536870911 bytes (511 to 511 Mb) - System RAM
...
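The same conversion can be sketched in Python. This is only an illustration, not part of the original tooling: the helper names parse_iomem and total_bytes are made up here, and the sample text is copied from the listing above rather than read from /proc/iomem (where non-root readers see zeroed addresses).

```python
def parse_iomem(text):
    """Parse /proc/iomem-style lines into (start, end, name) tuples."""
    regions = []
    for line in text.splitlines():
        if " : " not in line:
            continue  # skip "..." placeholders and blank lines
        addr_range, name = line.strip().split(" : ", 1)
        start, end = (int(part, 16) for part in addr_range.split("-"))
        regions.append((start, end, name))
    return regions


def total_bytes(regions, name="System RAM"):
    """Sum the sizes of all regions with the given name (end is inclusive)."""
    return sum(end - start + 1 for start, end, n in regions if n == name)


# Sample lines taken from the /proc/iomem listing in this article.
SAMPLE = """\
00001000-0009ebff : System RAM
00100000-1fedffff : System RAM
0ba00000-0c600de0 : Kernel code
1ff00000-1fffffff : System RAM
"""

regions = parse_iomem(SAMPLE)
ram = total_bytes(regions)
print("System RAM: %d bytes (%.1f MiB)" % (ram, ram / 2**20))
```

Summing only the entries named System RAM avoids double counting the nested kernel ranges, and the total comes out slightly below the 512 MB configured for the VM because of the reserved and ACPI ranges.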
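Process virtual memory layout
Memory layout of a Linux process
/proc/pid/maps
From the user-space point of view, a process's memory is laid out as follows (the kernel part is not shown).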
0
Program Text (.text)
Initialised Data (.data)
Uninitialised Data (.bss)
Heap
|
v
Memory Mapped Region for Shared Libraries or Anything Else
^
|
User Stack
- Example process
memory_layout.c
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <pthread.h>
#include <unistd.h>
#include <string.h>
#include <sys/types.h>

/* Runs in the second thread: allocate, use and free a block, pausing at each step. */
void *thread_func(void *arg)
{
    char *addr;

    (void)arg;
    printf("Before malloc in thread 1\n");
    /* Print the address of the pointer variable itself (it lives on this thread's stack). */
    printf("addr = %llx \n", (unsigned long long)(uintptr_t)&addr);
    getchar();
    addr = malloc(1000);
    strcpy(addr, "thread");   /* also copies the terminating NUL */
    printf("addr = %llx \n", (unsigned long long)(uintptr_t)&addr);
    printf("After malloc and before free in thread 1\n");
    printf("addr = %llx \n", (unsigned long long)(uintptr_t)&addr);
    getchar();
    printf("addr [%s] \n", addr);
    free(addr);
    printf("After free in thread 1\n");
    getchar();
    return NULL;
}

int main(void)
{
    char *addr;
    int pthread_status;

    printf("Welcome to per thread arena example::%d\n", (int)getpid());
    printf("Before malloc in main thread\n");
    printf("addr = %llx \n", (unsigned long long)(uintptr_t)&addr);
    getchar();
    addr = malloc(1000);
    strcpy(addr, "garlic");   /* also copies the terminating NUL */
    /* The original format string was missing %s for the value. */
    printf("addr = %llx value = %s\n", (unsigned long long)(uintptr_t)&addr, addr);
    printf("After malloc and before free in main thread \n");
    getchar();
    printf("addr [%s] \n", addr);
    free(addr);
    printf("After free in main thread\n");
    getchar();

    pthread_t thread_1;
    pthread_status = pthread_create(&thread_1, NULL, thread_func, NULL);
    if (pthread_status != 0) {
        printf("Thread creation error\n");
        return -1;
    }
    void *thread_1_status;
    pthread_status = pthread_join(thread_1, &thread_1_status);
    if (pthread_status != 0) {
        printf("thread join error\n");
        return -1;
    }
    return 0;
}
malloc is called twice: once before the thread is created and once inside the thread.
- Compile and run
# gcc memory_layout.c -o memory_layout -lpthread
# ./memory_layout
Welcome to per thread arena example::118902
Before malloc in main thread
- Inspect the process's memory layout
[root@centosgpt vm]# cat /proc/118902/maps
00400000-00401000 r-xp 00000000 fd:00 13592455 /root/src/memory/vm/memory_layout
00600000-00601000 r--p 00000000 fd:00 13592455 /root/src/memory/vm/memory_layout
00601000-00602000 rw-p 00001000 fd:00 13592455 /root/src/memory/vm/memory_layout
7f05691b4000-7f0569376000 r-xp 00000000 fd:00 33593981 /usr/lib64/libc-2.17.so
7f0569376000-7f0569576000 ---p 001c2000 fd:00 33593981 /usr/lib64/libc-2.17.so
7f0569576000-7f056957a000 r--p 001c2000 fd:00 33593981 /usr/lib64/libc-2.17.so
7f056957a000-7f056957c000 rw-p 001c6000 fd:00 33593981 /usr/lib64/libc-2.17.so
7f056957c000-7f0569581000 rw-p 00000000 00:00 0
7f0569581000-7f0569598000 r-xp 00000000 fd:00 33594199 /usr/lib64/libpthread-2.17.so
7f0569598000-7f0569797000 ---p 00017000 fd:00 33594199 /usr/lib64/libpthread-2.17.so
7f0569797000-7f0569798000 r--p 00016000 fd:00 33594199 /usr/lib64/libpthread-2.17.so
7f0569798000-7f0569799000 rw-p 00017000 fd:00 33594199 /usr/lib64/libpthread-2.17.so
7f0569799000-7f056979d000 rw-p 00000000 00:00 0
7f056979d000-7f05697bf000 r-xp 00000000 fd:00 33593974 /usr/lib64/ld-2.17.so
7f05699ae000-7f05699b1000 rw-p 00000000 00:00 0
7f05699bb000-7f05699be000 rw-p 00000000 00:00 0
7f05699be000-7f05699bf000 r--p 00021000 fd:00 33593974 /usr/lib64/ld-2.17.so
7f05699bf000-7f05699c0000 rw-p 00022000 fd:00 33593974 /usr/lib64/ld-2.17.so
7f05699c0000-7f05699c1000 rw-p 00000000 00:00 0
7ffe05a82000-7ffe05aa3000 rw-p 00000000 00:00 0 [stack]
7ffe05bdc000-7ffe05bdf000 r--p 00000000 00:00 0 [vvar]
7ffe05bdf000-7ffe05be0000 r-xp 00000000 00:00 0 [vdso]
ffffffffff600000-ffffffffff601000 --xp 00000000 00:00 0 [vsyscall]
address: start and end address of the region in the process's address space
permissions: access permissions (read, write, execute, shared or private)
r = read
w = write
x = execute
s = shared
p = private (copy on write)
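offset: if the region is mapped from a file (with mmap), the offset in the file at which the mapping starts; 0 if the memory is not mapped from a file.
device: if the region is mapped from a file, the device number (in hexadecimal) of the device that holds the file.
inode: if the region is mapped from a file, the file's inode number.
pathname: if the region is mapped from a file, the name of that file. The field is empty for anonymous mappings. Some special regions have names such as [heap], [stack] or [vdso]. [vdso] is the virtual dynamic shared object, which lets certain system calls be served in user space without switching to kernel mode.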
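The field layout described above can be made concrete with a small parser. This is a sketch for illustration only; MapEntry and parse_maps_line are names invented here, and the sample line is copied from the maps output above.

```python
from collections import namedtuple

MapEntry = namedtuple("MapEntry", "start end perms offset dev inode pathname")


def parse_maps_line(line):
    """Split one /proc/<pid>/maps line into its six fields."""
    # At most 5 splits, so a pathname containing spaces stays in one piece.
    fields = line.split(None, 5)
    start, end = (int(x, 16) for x in fields[0].split("-"))
    pathname = fields[5].strip() if len(fields) > 5 else ""  # empty for anonymous mappings
    return MapEntry(start, end, fields[1], int(fields[2], 16),
                    fields[3], int(fields[4]), pathname)


entry = parse_maps_line(
    "00400000-00401000 r-xp 00000000 fd:00 13592455 /root/src/memory/vm/memory_layout"
)
print(hex(entry.start), hex(entry.end), entry.perms, entry.end - entry.start)
```

The difference end - start gives the size of the region in bytes, which is how the one-page (4096 byte) program mappings below can be checked.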
Overall the layout has three parts: 1. the program itself; 2. shared libraries, heap and stack; 3. the vdso part.
- Program mappings
00400000-00401000 r-xp 00000000 fd:00 13592455 /root/src/memory/vm/memory_layout
00600000-00601000 r--p 00000000 fd:00 13592455 /root/src/memory/vm/memory_layout
00601000-00602000 rw-p 00001000 fd:00 13592455 /root/src/memory/vm/memory_layout
Each range is 4 KB in size, which is exactly one page (the page size on my machine is 4 KB):
[root@centosgpt vm]# bc <<< 'obase=10; ibase=16; 401000 - 400000'
4096
...
[root@centosgpt vm]# getconf PAGESIZE
4096
The program file is 8864 bytes (about 8.66 KB). As shown above, three pages are mapped, which is enough to hold it.
[root@centosgpt vm]# stat memory_layout | grep Size
Size: 8864 Blocks: 24 IO Block: 4096 regular file
Program loading information:
[root@centosgpt vm]# readelf -l -W ./memory_layout
Elf file type is EXEC (Executable file)
Entry point 0x400670
There are 9 program headers, starting at offset 64
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
PHDR 0x000040 0x0000000000400040 0x0000000000400040 0x0001f8 0x0001f8 R E 0x8
INTERP 0x000238 0x0000000000400238 0x0000000000400238 0x00001c 0x00001c R 0x1
[Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
LOAD 0x000000 0x0000000000400000 0x0000000000400000 0x000b7c 0x000b7c R E 0x200000
LOAD 0x000e00 0x0000000000600e00 0x0000000000600e00 0x000264 0x000268 RW 0x200000
DYNAMIC 0x000e18 0x0000000000600e18 0x0000000000600e18 0x0001e0 0x0001e0 RW 0x8
NOTE 0x000254 0x0000000000400254 0x0000000000400254 0x000044 0x000044 R 0x4
GNU_EH_FRAME 0x000a28 0x0000000000400a28 0x0000000000400a28 0x00003c 0x00003c R 0x4
GNU_STACK 0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW 0x10
GNU_RELRO 0x000e00 0x0000000000600e00 0x0000000000600e00 0x000200 0x000200 R 0x1
Section to Segment mapping:
Segment Sections...
00
01 .interp
02 .interp .note.ABI-tag .note.gnu.build-id .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rela.dyn .rela.plt .init .plt .plt.got .text .fini .rodata .eh_frame_hdr .eh_frame
03 .init_array .fini_array .jcr .dynamic .got .got.plt .data .bss
04 .dynamic
05 .note.ABI-tag .note.gnu.build-id
06 .eh_frame_hdr
07
08 .init_array .fini_array .jcr .dynamic .got
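A section is a unit from the linker's point of view, while a segment is a unit from the loader's point of view. The lowest LOAD segment starts at virtual address 0x400000, yet the entry point is 0x400670. That address is the start of the .text section, where _start lives; as the disassembly below shows, _start loads the address of main() (0x4007b2) into %rdi and calls __libc_start_main.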
[root@centosgpt vm]# objdump --disassemble-all --start-address=0x000000 --stop-address=0x401000 ./memory_layout | less +/400670
0000000000400670 <_start>:
400670: 31 ed xor %ebp,%ebp
400672: 49 89 d1 mov %rdx,%r9
400675: 5e pop %rsi
400676: 48 89 e2 mov %rsp,%rdx
400679: 48 83 e4 f0 and $0xfffffffffffffff0,%rsp
40067d: 50 push %rax
40067e: 54 push %rsp
40067f: 49 c7 c0 f0 08 40 00 mov $0x4008f0,%r8
400686: 48 c7 c1 80 08 40 00 mov $0x400880,%rcx
40068d: 48 c7 c7 b2 07 40 00 mov $0x4007b2,%rdi
400694: e8 87 ff ff ff callq 400620 <__libc_start_main@plt>
400699: f4 hlt
40069a: 66 0f 1f 44 00 00 nopw 0x0(%rax,%rax,1)
So what is stored between 0x400000 and 0x400670? The ELF file structure gives the answer:
Elf header
Program header table
Segment1
Segment2
...
Section head table
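The linker groups sections with the same attributes into one segment. In this example, the range from 0x400000 to 0x400670 holds:
ELF header + Program header table + the beginning of Segment1 (the sections that precede .text)
Let's verify this, starting with the combined length of the ELF header and the program header table: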
[root@centosgpt vm]# readelf -h ./memory_layout
ELF Header:
Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
Class: ELF64
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: EXEC (Executable file)
Machine: Advanced Micro Devices X86-64
Version: 0x1
Entry point address: 0x400670
Start of program headers: 64 (bytes into file)
Start of section headers: 6880 (bytes into file)
Flags: 0x0
Size of this header: 64 (bytes)
Size of program headers: 56 (bytes)
Number of program headers: 9
Size of section headers: 64 (bytes)
Number of section headers: 31
Section header string table index: 30
End of ELF header + program header table = 64 + 56*9 = 568 = 0x238
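The arithmetic above can also be derived from the header bytes themselves. The sketch below assumes a little-endian ELF64 header; the function name program_header_end is made up for this example, and the sample header is rebuilt from the readelf -h values shown above instead of being read from the binary.

```python
import struct

# ELF64 header layout (64 bytes): e_ident, e_type, e_machine, e_version, e_entry,
# e_phoff, e_shoff, e_flags, e_ehsize, e_phentsize, e_phnum, e_shentsize,
# e_shnum, e_shstrndx
ELF64_HEADER = struct.Struct("<16sHHIQQQIHHHHHH")


def program_header_end(header_bytes):
    """Return the file offset just past the program header table."""
    fields = ELF64_HEADER.unpack(header_bytes[:ELF64_HEADER.size])
    e_phoff, e_phentsize, e_phnum = fields[5], fields[9], fields[10]
    return e_phoff + e_phentsize * e_phnum


# Header rebuilt from the readelf -h output in this article.
sample_header = ELF64_HEADER.pack(
    bytes.fromhex("7f454c46020101000000000000000000"),  # magic, ELF64, little endian
    2,         # e_type: EXEC
    62,        # e_machine: x86-64
    1,         # e_version
    0x400670,  # e_entry
    64,        # e_phoff
    6880,      # e_shoff
    0,         # e_flags
    64,        # e_ehsize
    56,        # e_phentsize
    9,         # e_phnum
    64,        # e_shentsize
    31,        # e_shnum
    30,        # e_shstrndx
)

print(hex(program_header_end(sample_header)))
```

To try it on a real file, the first 64 bytes of the binary (opened in "rb" mode) can be passed in place of sample_header.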
The section list of the program lines up with the virtual address range above; the first section, .interp, starts at 0x400238:
[root@centosgpt vm]# objdump -h ./memory_layout
./memory_layout: file format elf64-x86-64
Sections:
Idx Name Size VMA LMA File off Algn
0 .interp 0000001c 0000000000400238 0000000000400238 00000238 2**0
CONTENTS, ALLOC, LOAD, READONLY, DATA
1 .note.ABI-tag 00000020 0000000000400254 0000000000400254 00000254 2**2
CONTENTS, ALLOC, LOAD, READONLY, DATA
2 .note.gnu.build-id 00000024 0000000000400274 0000000000400274 00000274 2**2
CONTENTS, ALLOC, LOAD, READONLY, DATA
3 .gnu.hash 0000001c 0000000000400298 0000000000400298 00000298 2**3
CONTENTS, ALLOC, LOAD, READONLY, DATA
4 .dynsym 00000108 00000000004002b8 00000000004002b8 000002b8 2**3
CONTENTS, ALLOC, LOAD, READONLY, DATA
5 .dynstr 0000008b 00000000004003c0 00000000004003c0 000003c0 2**0
CONTENTS, ALLOC, LOAD, READONLY, DATA
6 .gnu.version 00000016 000000000040044c 000000000040044c 0000044c 2**1
CONTENTS, ALLOC, LOAD, READONLY, DATA
7 .gnu.version_r 00000040 0000000000400468 0000000000400468 00000468 2**3
CONTENTS, ALLOC, LOAD, READONLY, DATA
8 .rela.dyn 00000018 00000000004004a8 00000000004004a8 000004a8 2**3
CONTENTS, ALLOC, LOAD, READONLY, DATA
9 .rela.plt 000000d8 00000000004004c0 00000000004004c0 000004c0 2**3
CONTENTS, ALLOC, LOAD, READONLY, DATA
10 .init 0000001a 0000000000400598 0000000000400598 00000598 2**2
CONTENTS, ALLOC, LOAD, READONLY, CODE
11 .plt 000000a0 00000000004005c0 00000000004005c0 000005c0 2**4
CONTENTS, ALLOC, LOAD, READONLY, CODE
12 .plt.got 00000008 0000000000400660 0000000000400660 00000660 2**3
CONTENTS, ALLOC, LOAD, READONLY, CODE
13 .text 00000282 0000000000400670 0000000000400670 00000670 2**4
CONTENTS, ALLOC, LOAD, READONLY, CODE
14 .fini 00000009 00000000004008f4 00000000004008f4 000008f4 2**2
CONTENTS, ALLOC, LOAD, READONLY, CODE
15 .rodata 00000127 0000000000400900 0000000000400900 00000900 2**3
CONTENTS, ALLOC, LOAD, READONLY, DATA
16 .eh_frame_hdr 0000003c 0000000000400a28 0000000000400a28 00000a28 2**2
CONTENTS, ALLOC, LOAD, READONLY, DATA
17 .eh_frame 00000114 0000000000400a68 0000000000400a68 00000a68 2**3
CONTENTS, ALLOC, LOAD, READONLY, DATA
18 .init_array 00000008 0000000000600e00 0000000000600e00 00000e00 2**3
CONTENTS, ALLOC, LOAD, DATA
19 .fini_array 00000008 0000000000600e08 0000000000600e08 00000e08 2**3
CONTENTS, ALLOC, LOAD, DATA
20 .jcr 00000008 0000000000600e10 0000000000600e10 00000e10 2**3
CONTENTS, ALLOC, LOAD, DATA
21 .dynamic 000001e0 0000000000600e18 0000000000600e18 00000e18 2**3
CONTENTS, ALLOC, LOAD, DATA
22 .got 00000008 0000000000600ff8 0000000000600ff8 00000ff8 2**3
CONTENTS, ALLOC, LOAD, DATA
23 .got.plt 00000060 0000000000601000 0000000000601000 00001000 2**3
CONTENTS, ALLOC, LOAD, DATA
24 .data 00000004 0000000000601060 0000000000601060 00001060 2**0
CONTENTS, ALLOC, LOAD, DATA
25 .bss 00000004 0000000000601064 0000000000601064 00001064 2**0
ALLOC
26 .comment 0000002d 0000000000000000 0000000000000000 00001064 2**0
CONTENTS, READONLY
The range 0x400238 - 0x400670 holds the sections .interp .note.ABI-tag .note.gnu.build-id .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rela.dyn .rela.plt .init .plt .plt.got:
Section to Segment mapping:
Segment Sections...
00
01 .interp
02 .interp .note.ABI-tag .note.gnu.build-id .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rela.dyn .rela.plt .init .plt .plt.got .text .fini .rodata .eh_frame_hdr .eh_frame
03 .init_array .fini_array .jcr .dynamic .got .got.plt .data .bss
- Shared libraries, heap and stack
02166000-02187000 rw-p 00000000 00:00 0 [heap]
7f6f73907000-7f6f73ac9000 r-xp 00000000 fd:00 33593981 /usr/lib64/libc-2.17.so
7f6f73ac9000-7f6f73cc9000 ---p 001c2000 fd:00 33593981 /usr/lib64/libc-2.17.so
7f6f73cc9000-7f6f73ccd000 r--p 001c2000 fd:00 33593981 /usr/lib64/libc-2.17.so
7f6f73ccd000-7f6f73ccf000 rw-p 001c6000 fd:00 33593981 /usr/lib64/libc-2.17.so
7f6f73ccf000-7f6f73cd4000 rw-p 00000000 00:00 0
7f6f73cd4000-7f6f73ceb000 r-xp 00000000 fd:00 33594199 /usr/lib64/libpthread-2.17.so
7f6f73ceb000-7f6f73eea000 ---p 00017000 fd:00 33594199 /usr/lib64/libpthread-2.17.so
7f6f73eea000-7f6f73eeb000 r--p 00016000 fd:00 33594199 /usr/lib64/libpthread-2.17.so
7f6f73eeb000-7f6f73eec000 rw-p 00017000 fd:00 33594199 /usr/lib64/libpthread-2.17.so
7f6f73eec000-7f6f73ef0000 rw-p 00000000 00:00 0
7f6f73ef0000-7f6f73f12000 r-xp 00000000 fd:00 33593974 /usr/lib64/ld-2.17.so
7f6f74101000-7f6f74104000 rw-p 00000000 00:00 0
7f6f7410e000-7f6f74111000 rw-p 00000000 00:00 0
7f6f74111000-7f6f74112000 r--p 00021000 fd:00 33593974 /usr/lib64/ld-2.17.so
7f6f74112000-7f6f74113000 rw-p 00022000 fd:00 33593974 /usr/lib64/ld-2.17.so
7f6f74113000-7f6f74114000 rw-p 00000000 00:00 0
7fffa50d1000-7fffa50f2000 rw-p 00000000 00:00 0 [stack]
7fffa5142000-7fffa5145000 r--p 00000000 00:00 0 [vvar]
7fffa5145000-7fffa5146000 r-xp 00000000 00:00 0 [vdso]
ffffffffff600000-ffffffffff601000 --xp 00000000 00:00 0 [vsyscall]
The shared library, heap and stack regions are visible above.
[root@centosgpt vm]# ./memory_layout
Welcome to per thread arena example::1699
Before malloc in main thread
addr = 7fffa50f09f0
addr = 7fffa50f09f0 value = garlic
After malloc and before free in main thread
The local variable addr lies inside the stack region, and the stack sits at higher addresses than the heap.
7fffa50d1000-7fffa50f2000 rw-p 00000000 00:00 0 [stack]
A script (written for Python 2) can show information about the page that holds addr:
#!/usr/bin/python
import sys
import os
import binascii
import struct
def read_entry(path, offset, size=8):
with open(path, 'r') as f:
f.seek(offset, 0)
return struct.unpack('Q', f.read(size))[0]
# Read /proc/$PID/pagemap
def get_pagemap_entry(pid, addr):
maps_path = "/proc/{0}/pagemap".format(pid)
if not os.path.isfile(maps_path):
print "Process {0} doesn't exist.".format (pid)
return
page_size = os.sysconf ("SC_PAGE_SIZE")
pagemap_entry_size = 8
offset = (addr / page_size) * pagemap_entry_size
return read_entry (maps_path, offset)
def get_pfn (entry):
return entry & 0x7FFFFFFFFFFFFF
def is_present(entry):
return ((entry & (1 << 63)) != 0)
def is_file_page(entry):
return ((entry & (1 << 61)) != 0)
def get_pagecount(pfn):
file_path = "/proc/kpagecount"
offset = pfn * 8
return read_entry(file_path, offset)
def get_page_flags(pfn):
file_path = "/proc/kpageflags"
offset = pfn * 8
return read_entry(file_path, offset)
def get_page_cgroup(pfn):
file_path = "/proc/kpagecgroup"
offset = pfn * 8
return read_entry(file_path, offset)
if __name__ == "__main__":
pid = sys.argv[1]
if sys.argv[2].startswith ("0x"):
addr = long (sys.argv[2], base = 16)
else:
addr = long (sys.argv[2])
entry = get_pagemap_entry (pid, addr)
pfn = get_pfn (entry)
print "PFN: {}".format (hex (pfn))
print "Is Present? : {}".format(is_present(entry))
print "Is file-page: {}".format(is_file_page(entry))
print "Page count: {}".format(get_pagecount(pfn))
print "Page flags: {}".format(hex(get_page_flags(pfn)))
print "Page cgroup: {}".format(hex(get_page_cgroup(pfn)))
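The page is currently not in memory. (For a page that is not present, the low bits of the pagemap entry do not hold a PFN, so the PFN line below is not a physical frame number.)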
[root@centosgpt vm]# ./v2pfn.py 1699 0x7fffa50f09f0
PFN: 0xc6280
Is Present? : False
Is file-page: False
Page count: 0
Page flags: 0x100000
Page cgroup: 0x0
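For a page that is not present, the pagemap entry needs a different reading than the script gives it. The following Python 3 sketch decodes one 64-bit entry according to the bit layout in the kernel's pagemap documentation (bit 63 present, bit 62 swapped, bit 61 file page or shared anonymous, bit 55 soft-dirty, bits 0-54 PFN or swap information). The function name is invented here, and the assumption that bit 62 was set in the run above is mine; the script does not print it.

```python
def decode_pagemap_entry(entry):
    """Decode one 64-bit /proc/<pid>/pagemap entry into a dict."""
    present = bool((entry >> 63) & 1)
    swapped = bool((entry >> 62) & 1)
    info = {
        "present": present,
        "swapped": swapped,
        "file_or_shared_anon": bool((entry >> 61) & 1),
        "soft_dirty": bool((entry >> 55) & 1),
    }
    low = entry & ((1 << 55) - 1)  # bits 0-54
    if present:
        info["pfn"] = low  # reads as 0 without sufficient privileges on newer kernels
    elif swapped:
        info["swap_type"] = low & 0x1F  # bits 0-4
        info["swap_offset"] = low >> 5  # bits 5-54
    return info


# Low bits taken from the "PFN: 0xc6280" output above; bit 62 is assumed to be set.
print(decode_pagemap_entry((1 << 62) | 0xc6280))
# Entry shaped like the later run, where the page is present with PFN 0x19fc.
print(decode_pagemap_entry((1 << 63) | 0x19fc))
```

Read this way, 0xc6280 is swap type 0 with swap offset 0x6314, which is the same value page-types lists in its offset column for virtual page 7fffa50f0.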
page-types shows that the process's pages have been swapped out (a kernel build was running at the time):
[root@localhost vm]# ./page-types -p 1699 -L
voffset offset flags
600 6319 ___________________________________________
601 6312 ___________________________________________
7f6f73cc9 631f ___________________________________________
7f6f73cca 6320 ___________________________________________
7f6f73ccb 6321 ___________________________________________
7f6f73ccc 62cf ___________________________________________
7f6f73ccd 6322 ___________________________________________
7f6f73cce 631c ___________________________________________
7f6f73ccf 6326 ___________________________________________
7f6f73cd0 6324 ___________________________________________
7f6f73cd2 6325 ___________________________________________
7f6f73eea 631b ___________________________________________
7f6f73eeb 6586 ___________________________________________
7f6f73eef 6323 ___________________________________________
7f6f74101 631d ___________________________________________
7f6f74102 631e ___________________________________________
7f6f74103 62d0 ___________________________________________
7f6f7410f 652d ___________________________________________
7f6f74110 6318 ___________________________________________
7f6f74111 6315 ___________________________________________
7f6f74112 6313 ___________________________________________
7f6f74113 6316 ___________________________________________
7fffa50ee 631a ___________________________________________
7fffa50ef 6317 ___________________________________________
7fffa50f0 6314 ___________________________________________
7fffa50f1 6311 ___________________________________________
7fffa5145 978c ___________M_______________________________
flags page-count MB symbolic-flags long-symbolic-flags
0x0000000000000000 26 0 ___________________________________________
0x0000000000000800 1 0 ___________M_______________________________ mmap
total 27 0
Let memory_layout continue (press Enter) and look again:
[root@centosgpt vm]# ./v2pfn.py 1699 0x7fffa50f09f0
PFN: 0x19fcL
Is Present? : True
Is file-page: False
Page count: 1
Page flags: 0x5838
Page cgroup: 0x1
[root@centosgpt vm]#
Now the process's pages have been brought back into memory:
[root@localhost vm]# ./page-types -p 1699 -L
voffset offset flags
400 adee ___U_l_____M_______________________________
600 a050 ___U_l_____Masb____________________________
601 1c20b ___UDl_____Ma_b____________________________
2166 16e9f ___U_l_____Ma_b____________________________
...
7f6f74110 14aa7 ___U_l_____Masb____________________________
7f6f74111 19fd ___U_l_____Masb____________________________
7f6f74112 1c20c ___UDl_____Ma_b____________________________
7f6f74113 13d1a ___U_l_____Masb____________________________
7fffa50ee 631a ___________________________________________
7fffa50ef 6317 ___________________________________________
7fffa50f0 19fc ___UDl_____Ma_b____________________________
7fffa50f1 1c215 ___U_l_____Masb____________________________
7fffa5145 978c ___________M_______________________________
...
flags page-count MB symbolic-flags long-symbolic-flags
0x0000000000000000 7 0 ___________________________________________
0x0000000000000800 1 0 ___________M_______________________________ mmap
0x0000000000000828 1 0 ___U_l_____M_______________________________ uptodate,lru,mmap
0x000000000000086c 217 0 __RU_lA____M_______________________________ referenced,uptodate,lru,active,mmap
0x0000000000007828 10 0 ___U_l_____Masb____________________________ uptodate,lru,mmap,anonymous,swapcache,swapbacked
0x0000000000005828 2 0 ___U_l_____Ma_b____________________________ uptodate,lru,mmap,anonymous,swapbacked
0x0000000000005838 9 0 ___UDl_____Ma_b____________________________ uptodate,dirty,lru,mmap,anonymous,swapbacked
total 247 0
- vdso part
7ffe05bdc000-7ffe05bdf000 r--p 00000000 00:00 0 [vvar]
7ffe05bdf000-7ffe05be0000 r-xp 00000000 00:00 0 [vdso]
ffffffffff600000-ffffffffff601000 --xp 00000000 00:00 0 [vsyscall]
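The vDSO (virtual dynamic shared object) is a small shared library that the kernel automatically maps into the address space of every user-space application.
It can be dumped to see what it contains: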
# gdb ./memory_layout
(gdb) b _start
Breakpoint 1 at 0x400670
(gdb) run
Starting program: /root/src/memory/vm/./memory_layout
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Breakpoint 1, 0x0000000000400670 in _start ()
(gdb) info program
Using the running image of child Thread 0x7ffff7fe8740 (LWP 84240).
Program stopped at 0x400670.
It stopped at breakpoint 1.
(gdb) shell cat /proc/84240/maps |grep vdso
7ffff7ffb000-7ffff7ffc000 r-xp 00000000 00:00 0 [vdso]
(gdb) dump memory /root/src/memory/vm/vdso.so 0x7ffff7ffb000 0x7ffff7ffc000
(gdb) quit
# objdump -T vdso.so
vdso.so: file format elf64-x86-64
DYNAMIC SYMBOL TABLE:
00000000000008d0 w DF .text 00000000000000a6 LINUX_2.6 clock_gettime
0000000000000820 g DF .text 0000000000000085 LINUX_2.6 __vdso_gettimeofday
0000000000000980 w DF .text 000000000000004b LINUX_2.6 clock_getres
0000000000000980 g DF .text 000000000000004b LINUX_2.6 __vdso_clock_getres
0000000000000820 w DF .text 0000000000000085 LINUX_2.6 gettimeofday
00000000000008b0 g DF .text 0000000000000015 LINUX_2.6 __vdso_time
00000000000008b0 w DF .text 0000000000000015 LINUX_2.6 time
00000000000008d0 g DF .text 00000000000000a6 LINUX_2.6 __vdso_clock_gettime
0000000000000000 g DO *ABS* 0000000000000000 LINUX_2.6 LINUX_2.6
00000000000009d0 g DF .text 000000000000002a LINUX_2.6 __vdso_getcpu
00000000000009d0 w DF .text 000000000000002a LINUX_2.6 getcpu
The vdso(7) man page lists the same functions:
x86-64 functions
The table below lists the symbols exported by the vDSO. All of these
symbols are also available without the "__vdso_" prefix, but you
should ignore those and stick to the names below.
symbol version
─────────────────────────────────
__vdso_clock_gettime LINUX_2.6
__vdso_getcpu LINUX_2.6
__vdso_gettimeofday LINUX_2.6
__vdso_time LINUX_2.6
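vvar holds the data these functions return; the kernel can modify it, while ordinary processes can only read it. When a process is created it is given a vdso mapping in its virtual address space, and the page behind that mapping is loaded by the kernel. The process can therefore obtain time information directly instead of making a system call, which saves the switch from user mode to kernel mode and the associated context-switch work.
Start two processes and compare the corresponding page information.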
[root@centosgpt vm]# ./memory_layout
Welcome to per thread arena example::20833
Before malloc in main thread
addr = 7ffc13b34200
[root@centosgpt vdso]# cat /proc/20833/maps
...
7ffc13bc6000-7ffc13bc9000 r--p 00000000 00:00 0 [vvar]
7ffc13bc9000-7ffc13bca000 r-xp 00000000 00:00 0 [vdso]
ffffffffff600000-ffffffffff601000 --xp 00000000 00:00 0 [vsyscall]
...
[root@centosgpt vm]# ./v2pfn.py 20833 0x7ffc13bc9000
PFN: 0xcd8cL
Is Present? : True
Is file-page: True
Page count: 34
Page flags: 0x100000800
Page cgroup: 0x0
[root@centosgpt vm]# ./memory_layout
Welcome to per thread arena example::20786
Before malloc in main thread
addr = 7ffc8b6a61b0
[root@centosgpt vm]# cat /proc/20786/maps
...
7ffc8b7c3000-7ffc8b7c6000 r--p 00000000 00:00 0 [vvar]
7ffc8b7c6000-7ffc8b7c7000 r-xp 00000000 00:00 0 [vdso]
ffffffffff600000-ffffffffff601000 --xp 00000000 00:00 0 [vsyscall]
[root@centosgpt vm]# ./v2pfn.py 20786 0x7ffc8b7c6000
PFN: 0xcd8cL
Is Present? : True
Is file-page: True
Page count: 35
Page flags: 0x100000804
Page cgroup: 0x0
Both processes point to the same page: PFN 0xcd8c, i.e. physical address 0xcd8c000.
Where does that lie in physical memory?
[root@centosgpt vm]# cat /proc/iomem
...
00100000-1fedffff : System RAM
0bc00000-0c800de0 : Kernel code
0ca00000-0ce16fff : Kernel rodata
0d000000-0d20a97f : Kernel data
0d7dc000-0ddfffff : Kernel bss
...
0xcd8c000 falls inside 0ca00000-0ce16fff, which is Kernel rodata, part of the loaded kernel image.
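The range lookup can be written down as a short check. This is an illustrative sketch: find_region is a hypothetical helper, and the ranges are copied from the /proc/iomem excerpt above. Because nested entries overlap their parent, the narrowest matching range is taken as the answer.

```python
# (start, end, name) ranges copied from the /proc/iomem excerpt above.
REGIONS = [
    (0x00100000, 0x1fedffff, "System RAM"),
    (0x0bc00000, 0x0c800de0, "Kernel code"),
    (0x0ca00000, 0x0ce16fff, "Kernel rodata"),
    (0x0d000000, 0x0d20a97f, "Kernel data"),
    (0x0d7dc000, 0x0ddfffff, "Kernel bss"),
]


def find_region(paddr, regions):
    """Return the name of the narrowest region containing paddr, or None."""
    hits = [(end - start, name) for start, end, name in regions
            if start <= paddr <= end]
    return min(hits)[1] if hits else None


print(find_region(0xcd8c000, REGIONS))
```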
A great article on the history of the vDSO: "什麼是 Linux vDSO 與 vsyscall?" (發展過程).
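Memory contents of a Linux process
/proc/pid/mem
This file can be used to read and modify the data in a process's memory. Some earlier kernel versions require attaching to the process first with ptrace(PTRACE_ATTACH), and detaching with ptrace(PTRACE_DETACH) once the operation is done.
- Example using ptrace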
#include<stdio.h>
#include<stdlib.h>
#include<unistd.h>
#include<sys/mman.h>
#include<fcntl.h>
#include<string.h>
#include<sys/ptrace.h>
#include<sys/types.h>
#include<sys/wait.h>   /* waitpid() */
int
main (int argc, char **argv)
{
if (argc == 3)
{
int pid = atoi (argv[1]);
unsigned long long mem_addr = strtoll (argv[2], NULL, 16);
printf("address %llx\n", mem_addr);
long r = ptrace (PTRACE_ATTACH, pid, NULL, NULL);
if (r < 0) {
printf(" unable to attach to the pid \n");
return 0;
}
waitpid(pid, NULL, 0);
char mem_fname[1024];
sprintf (mem_fname, "/proc/%d/mem", pid);
printf("mem file name %s\n", mem_fname);
int mem_file = open(mem_fname, O_RDWR);
unsigned long long var_addr;
var_addr = mem_addr;
pread(mem_file, &var_addr, sizeof(unsigned long long), mem_addr);
printf ("var %llx\n", var_addr);
char var[7]={0};
pread(mem_file, var, sizeof(char)*6, var_addr);
printf ("var %s\n", var);
strncpy(var, "hello", 6);
pwrite(mem_file, var, sizeof(char)*6 , var_addr);
printf ("write var %s\n", var);
pread(mem_file, var, sizeof(char)*6, var_addr);
printf ("read againt var %s\n", var);
close (mem_file);
/* PTRACE_DETACH resumes the stopped tracee by itself; a PTRACE_CONT first would let it run and make the detach fail. */
ptrace (PTRACE_DETACH, pid, NULL, NULL);
return 0;
}
else
{
printf ("%s <pid> <mem-address>\n", argv[0]);
return 0;
}
}
Newer kernels do not require attaching with ptrace first. Example program:
#include<stdio.h>
#include<stdlib.h>
#include<unistd.h>
#include<sys/mman.h>
#include<fcntl.h>
#include<string.h>
#include<sys/ptrace.h>
#include <sys/uio.h>
int
main (int argc, char **argv)
{
if (argc == 3)
{
int pid = atoi (argv[1]);
unsigned long long mem_addr = strtoll (argv[2], NULL, 16);
printf("address %llx\n", mem_addr);
char mem_fname[1024];
sprintf (mem_fname, "/proc/%d/mem", pid);
printf("mem file name %s\n", mem_fname);
/* "r+" because the program also writes; a stream opened with "r" would make fwrite() fail */
FILE *mem_file = fopen(mem_fname, "r+");
unsigned long long var_addr;
fseeko (mem_file, mem_addr, SEEK_SET);
fread(&var_addr, 1 ,sizeof(unsigned long long), mem_file);
printf ("var %llx\n", var_addr);
fseeko (mem_file, var_addr, SEEK_SET);
char var[7]={0};
fread(var, 6 ,sizeof(char) , mem_file);
printf ("read var %s\n", var);
memcpy(var, "hello", 6);
fseeko (mem_file, var_addr, SEEK_SET);
fwrite(var, 6 ,sizeof(char) , mem_file);
printf ("write var %s\n", var);
fseeko (mem_file, var_addr, SEEK_SET);
fread(var, 6 ,sizeof(char) , mem_file);
printf ("againt read var %s\n", var);
fclose (mem_file);
return 0;
}
else
{
printf ("%s <pid> <mem-address>\n", argv[0]);
return 0;
}
}
Use memdump3 to read the data that addr points to in the example process memory_layout.
- Run the process
memory_layout
[root@centosgpt vm]# ./memory_layout
Welcome to per thread arena example::27496
Before malloc in main thread
addr = 7ffd2c137f20
addr = 7ffd2c137f20 value = garlic
After malloc and before free in main thread
- Read and modify the process's memory
memdump3
[root@centosgpt vm]# ./memdump3 27496 7ffd2c137f20
address 7ffd2c137f20
mem file name /proc/27496/mem
var 1736010
read var garlic
write var ...hello
againt read var hello
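The last line of the output comes from the program itself checking that the write took effect: after the write below, it seeks back and reads the value again.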
...
memcpy(var, "hello", 6);
fseeko (mem_file, var_addr, SEEK_SET);
fwrite(var, 6 ,sizeof(char) , mem_file);
...
- System calls for reading and writing process memory
process_vm_readv()
process_vm_writev()
Example
[root@centosgpt vm]# cat processwritea.c
#define _GNU_SOURCE /* needed for process_vm_readv() and process_vm_writev() */
#include <stdio.h>
#include <string.h>
#include <sys/uio.h>
#include <stdlib.h>
#include <errno.h>
int read_data(pid_t pid, unsigned long long addr , \
void *var_addr, size_t var_len);
int
main (int argc, char **argv)
{
struct iovec local[1];
struct iovec remote[1];
char buf1[10] = { 0 };
ssize_t nwrite;
if (argc == 3)
{
pid_t pid = atoi (argv[1]);
unsigned long long addr = strtoll (argv[2], NULL, 16);
printf("address %llx\n", addr);
printf("reading ...\n");
unsigned long long var_addr;
size_t datasize = sizeof(unsigned long long);
read_data(pid, addr , &var_addr, datasize);
printf("var_address =[%11llx] \n", var_addr);
read_data(pid, var_addr, buf1, 6);
printf("buff = [%s]\n", buf1);
memcpy (buf1, "hello", 6);
local[0].iov_base = buf1;
local[0].iov_len = 6;
remote[0].iov_base = (void *)var_addr;
remote[0].iov_len = 6;
nwrite = process_vm_writev (pid, local, 1, remote, 1, 0);
if (nwrite != 6)
{
printf("ER!\n");
return 1;
}
else {
printf("OK!\n");
read_data(pid, var_addr, buf1, 6);
printf("buff = [%s]\n", buf1);
return 0;
}
}
else
{
printf ("%s <pid> <mem-address>\n", argv[0]);
return 0;
}
}
int read_data(pid_t pid, unsigned long long addr , void *var_addr, size_t var_len)
{
struct iovec local[1];
struct iovec remote[1];
ssize_t nread;
local[0].iov_base = (void *)(var_addr);
local[0].iov_len = var_len;
remote[0].iov_base = (void *)addr;
remote[0].iov_len = var_len;
nread = process_vm_readv(pid, local, 1, remote, 1, 0);
if (nread < 0 || (size_t)nread != var_len ){
printf("read ER! %zd %zu %d\n", nread, var_len, errno);
return 1;
} else {
printf("read OK!\n");
return 0;
}
}
After the process memory_layout has allocated memory for addr and written a value to it, the data can be modified with process_vm_writev():
[root@centosgpt vm]# ./processwritea 28316 7ffc4c5f7f00
address 7ffc4c5f7f00
reading ...
read OK!
var_address =[ 11f2010]
read OK!
buff = [garlic]
OK!
read OK!
buff = [hello]
Memory mappings
/proc/pid/pagemap
/proc/kpagecount
/proc/kpageflags
/proc/kpagecgroup
The memory cgroup requires the following configuration:
CONFIG_CGROUPS=y
CONFIG_MEMCG=y
CONFIG_MEMCG_SWAP=y
CONFIG_MEMCG_KMEM=y
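This uses page-types, a tool under tools/vm in the Linux kernel source tree; it only needs to be compiled.
Enable idle page tracking
- Build the kernel
# Enter the kernel source directory and copy over the current kernel's configuration
cp /boot/config-xxxxx .config
# Configure the kernel (nothing was changed here)
make menuconfig
# Clean stale build files, then build the kernel image and modules
make -j 4 clean bzImage modules
# Install the modules
make modules_install
# Install the kernel
make install
# Regenerate the boot menu
grub2-mkconfig -o /boot/grub2/grub.cfg
# Reboot to verify
reboot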
The bitmap file now exists under this path:
[root@centosgpt page_idle]# ls | sed "s:^:`pwd`/:"
/sys/kernel/mm/page_idle/bitmap
- Configure CONFIG_IDLE_PAGE_TRACKING=y
# make menuconfig
...
Memory Management options --->
...
[*] Enable idle page tracking
# cat .config|grep IDLE
...
CONFIG_IDLE_PAGE_TRACKING=y
...
Pages swapped out while the kernel was being compiled:
[root@localhost vm]# ./page-types -p 1699 -L
voffset offset flags
600 6319 ___________________________________________
601 6312 ___________________________________________
7f6f73cc9 631f ___________________________________________
7f6f73cca 6320 ___________________________________________
7f6f73ccb 6321 ___________________________________________
7f6f73ccc 62cf ___________________________________________
7f6f73ccd 6322 ___________________________________________
7f6f73cce 631c ___________________________________________
7f6f73ccf 6326 ___________________________________________
7f6f73cd0 6324 ___________________________________________
7f6f73cd2 6325 ___________________________________________
7f6f73eea 631b ___________________________________________
7f6f73eeb 6586 ___________________________________________
7f6f73eef 6323 ___________________________________________
7f6f74101 631d ___________________________________________
7f6f74102 631e ___________________________________________
7f6f74103 62d0 ___________________________________________
7f6f7410f 652d ___________________________________________
7f6f74110 6318 ___________________________________________
7f6f74111 6315 ___________________________________________
7f6f74112 6313 ___________________________________________
7f6f74113 6316 ___________________________________________
7fffa50ee 631a ___________________________________________
7fffa50ef 6317 ___________________________________________
7fffa50f0 6314 ___________________________________________
7fffa50f1 6311 ___________________________________________
7fffa5145 978c ___________M_______________________________
flags page-count MB symbolic-flags long-symbolic-flags
0x0000000000000000 26 0 ___________________________________________
0x0000000000000800 1 0 ___________M_______________________________ mmap
total 27 0
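- Translating a user-space virtual address to a physical address
  - Use /proc/pid/maps to view the virtual address layout and choose the address to read
  - Use /proc/pid/pagemap to read the PFN (page frame number)
  - Compute the physical address: paddr = pfn*page_size + vaddr%page_size
  - Use /proc/kpagecount and /proc/kpageflags to get the page's map count and flags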
在这里再次看到了vDSO
对应的页面
[root@centosgpt vm]# ./memory_layout
Welcome to per thread arena example::81775
Before malloc in main thread
addr = 7ffc836ac480
[root@centosgpt vm]# cat /proc/81775/maps|grep vdso
7ffc836bf000-7ffc836c0000 r-xp 00000000 00:00 0 [vdso]
[root@centosgpt vm]# ./page-types -p 81775 -L -a 0x7ffc836bf, -N
voffset offset flags
7ffc836bf cd8c __R________M_______________________________
The corresponding PFN is cd8c, so the physical address is 0xcd8c * 0x1000 = 0xcd8c000.
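The translation steps can be sketched in Python as follows. This is a minimal sketch, not the code used in this post: the helper names decode_pagemap_entry, phys_addr and read_pagemap_entry are made up for illustration, and the bit layout (PFN in bits 0-54, swapped in bit 62, present in bit 63) is the one described in the kernel's pagemap documentation. Reading real PFNs requires root; without the needed capability, recent kernels report the PFN as 0.

```python
import os
import struct

PAGEMAP_ENTRY_SIZE = 8  # one 64-bit entry per virtual page


def decode_pagemap_entry(entry):
    """Split a 64-bit pagemap entry into (present, swapped, pfn)."""
    present = bool(entry >> 63 & 1)
    swapped = bool(entry >> 62 & 1)
    # Bits 0-54 hold the PFN only when the page is present in RAM.
    pfn = entry & ((1 << 55) - 1) if present and not swapped else None
    return present, swapped, pfn


def phys_addr(pfn, vaddr, page_size=4096):
    """paddr = pfn * page_size + vaddr % page_size."""
    return pfn * page_size + vaddr % page_size


def read_pagemap_entry(pid, vaddr, page_size=None):
    """Read the raw pagemap entry covering vaddr (hypothetical helper, needs root)."""
    page_size = page_size or os.sysconf("SC_PAGESIZE")
    with open(f"/proc/{pid}/pagemap", "rb") as f:
        # Entries are indexed by virtual page number.
        f.seek(vaddr // page_size * PAGEMAP_ENTRY_SIZE)
        data = f.read(PAGEMAP_ENTRY_SIZE)
    if len(data) != PAGEMAP_ENTRY_SIZE:
        raise OSError("address is outside the process address space")
    return struct.unpack("Q", data)[0]


if __name__ == "__main__":
    # The vDSO page from the transcript: PFN 0xcd8c, virtual address 0x7ffc836bf000.
    print(hex(phys_addr(0xcd8c, 0x7ffc836bf000)))
```

On a live system, something like decode_pagemap_entry(read_pagemap_entry(pid, vaddr)) would yield the PFN that page-types prints in its offset column.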
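One thing I have not figured out about this page is the R (referenced) flag: it shows up on one run and is gone on the next. Possibly this is because the vDSO is a kernel page mapped into user space, so it cannot be managed on an LRU list and never gets the PG_active flag.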
page-types supports the following options:
page-types [options]
-r|--raw Raw mode, for kernel developers
-d|--describe flags Describe flags
-a|--addr addr-spec Walk a range of pages
-b|--bits bits-spec Walk pages with specified bits
-c|--cgroup path|@inode Walk pages within memory cgroup
-p|--pid pid Walk process address space
-f|--file filename Walk file address space
-i|--mark-idle Mark pages idle
-l|--list Show page details in ranges
-L|--list-each Show page details one by one
-C|--list-cgroup Show cgroup inode for pages
-M|--list-mapcnt Show page map count
-N|--no-summary Don't show summary info
-X|--hwpoison hwpoison pages
-x|--unpoison unpoison pages
-F|--kpageflags filename kpageflags file to parse
-h|--help Show this usage message
So far only -p, -L and -l have been used. Some more usage:
[root@centosgpt vm]# ./page-types -p 106002 -l
voffset offset len flags
400 12077 1 __RUDlA____Ma_b____________________________
401 602e 1 __RU_lA____M_______________________________
601 8c05 1 __RU_lA____Ma_b____________________________
602 8e03 1 ___U_lA____Ma_b____________________________
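voffset: the virtual page offset, in units of one page (sysconf(_SC_PAGESIZE))
offset: the PFN
len: the number of pages in the range; shown with -l, whereas -L lists pages one by one
flags: the page flags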
locked error referenced uptodate
dirty lru active slab
writeback reclaim buddy mmap
anonymous swapcache swapbacked compound_head
compound_tail huge unevictable hwpoison
nopage ksm thp offline
zero_page idle_page pgtable reserved(r)
mlocked(r) mappedtodisk(r) private(r) private_2(r)
owner_private(r) arch(r) uncached(r) softdirty(r)
readahead(o) slob_free(o) slub_frozen(o) slub_debug(o)
file(o) swap(o) mmap_exclusive(o)
(r) raw mode bits (o) overloaded bits
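The hexadecimal values in the page-types summary tables are bitmasks over these flag names. The sketch below decodes such a value in Python; it assumes the names are listed above in bit order starting from bit 0, which matches the summary lines in this post (for example 0x000000000000006c is shown as referenced,uptodate,lru,active). KPF_NAMES and decode_kpageflags are illustrative names, not part of page-types.

```python
# Flag names in bit order (bit 0 first), assumed to match the listing above.
KPF_NAMES = [
    "locked", "error", "referenced", "uptodate",
    "dirty", "lru", "active", "slab",
    "writeback", "reclaim", "buddy", "mmap",
    "anonymous", "swapcache", "swapbacked", "compound_head",
    "compound_tail", "huge", "unevictable", "hwpoison",
    "nopage", "ksm", "thp", "offline",
    "zero_page", "idle_page", "pgtable",
]


def decode_kpageflags(flags):
    """Return the names of the bits set in a kpageflags value."""
    names = [name for bit, name in enumerate(KPF_NAMES) if flags >> bit & 1]
    # Bits beyond the table are reported by number instead of being guessed.
    bit = len(KPF_NAMES)
    extra = flags >> bit
    while extra:
        if extra & 1:
            names.append(f"bit{bit}")
        extra >>= 1
        bit += 1
    return names


if __name__ == "__main__":
    print(",".join(decode_kpageflags(0x000000000200086c)))
```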
The -i option marks pages as idle:
[root@centosgpt vm]# ./page-types -p 106215 -i
[root@centosgpt vm]# ./page-types -p 106215 -l
voffset offset len flags
400 1d32 1 __RUDlA____Ma_b__________i_________________
401 602e 1 __RU_lA____M_____________i_________________
601 5be5 1 __RU_lA____Ma_b__________i_________________
602 dc49 1 ___U_lA____Ma_b__________i_________________
7ffff7a0e dec3 1 __RU_lA____M_______________________________
The -C option shows the inode of the cgroup the process's pages belong to:
[root@centosgpt vm]# ./page-types -p 106215 -l -C
voffset cgroup offset len flags
400 @1 1d32 1 __RUDlA____Ma_b__________i_________________
401 @1 602e 1 __RU_lA____M_____________i_________________
601 @1 5be5 1 __RU_lA____Ma_b__________i_________________
602 @1 dc49 1 ___U_lA____Ma_b__________i_________________
[root@centosgpt cgroup]# ls -i /sys/fs/cgroup/|grep memory
1 memory
cgroup: the inode of the corresponding cgroup
The -f option walks a file's pages, showing how many pages the program needs once it is loaded into memory:
[root@centosgpt vm]# ./page-types -f /root/src/memory/vm/translate -L
foffset offset flags
/root/src/memory/vm/translate Inode: 9733651 Size: 13664 (4 pages)
Modify: Sun Mar 22 21:12:39 2020 (1714962 seconds ago)
Access: Sat Apr 11 17:16:33 2020 (1128 seconds ago)
0 a9aa __RU_lA____________________________________
1 602e __RU_lA____M_____________i_________________
2 11d63 __RU_lA____________________________________
3 3ce1 __RU_lA____________________________________
flags page-count MB symbolic-flags long-symbolic-flags
0x000000000000006c 3 0 __RU_lA____________________________________ referenced,uptodate,lru,active
0x000000000200086c 1 0 __RU_lA____M_____________i_________________ referenced,uptodate,lru,active,mmap,idle_page
total 4 0
Four pages are needed.
-a specifies the page range to walk:
N one page at offset N (unit: pages)
N+M pages range from N to N+M-1
N,M pages range from N to M-1
N, pages range from N to end
,M pages range from 0 to M-1
[root@centosgpt vm]# ./page-types -p 106215 -a 0x400,0x603 -L
voffset offset flags
400 1d32 __RUDlA____Ma_b__________i_________________
401 602e __RU_lA____M_____________i_________________
601 5be5 __RU_lA____Ma_b__________i_________________
602 dc49 ___U_lA____Ma_b__________i_________________
flags page-count MB symbolic-flags long-symbolic-flags
0x000000000200086c 1 0 __RU_lA____M_____________i_________________ referenced,uptodate,lru,active,mmap,idle_page
0x0000000002005868 1 0 ___U_lA____Ma_b__________i_________________ uptodate,lru,active,mmap,anonymous,swapbacked,idle_page
0x000000000200586c 1 0 __RU_lA____Ma_b__________i_________________ referenced,uptodate,lru,active,mmap,anonymous,swapbacked,idle_page
0x000000000200587c 1 0 __RUDlA____Ma_b__________i_________________ referenced,uptodate,dirty,lru,active,mmap,anonymous,swapbacked,idle_page
total 4 0
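The -a forms listed above all describe half-open page ranges. As a rough model of how they can be read (a sketch only, not the parser inside page-types; parse_addr_spec is a hypothetical name), each form maps to a (start, stop) pair:

```python
def parse_addr_spec(spec, end=None):
    """Turn a page-types style addr-spec into a half-open (start, stop) page range.

    `end` stands for "end of the address space" and is returned as-is for "N,".
    """
    def num(text):
        return int(text, 0)  # base 0 accepts both 0x400 and 1024

    if "," in spec:
        # "N,M", "N," and ",M": explicit start and stop, either may be omitted.
        left, right = spec.split(",", 1)
        start = num(left) if left else 0
        stop = num(right) if right else end
    elif "+" in spec:
        # "N+M": M pages starting at N.
        left, count = spec.split("+", 1)
        start = num(left)
        stop = start + num(count)
    else:
        # "N": a single page.
        start = num(spec)
        stop = start + 1
    return start, stop


if __name__ == "__main__":
    start, stop = parse_addr_spec("0x400,0x603")
    print(hex(start), hex(stop))
```

With this reading, -a 0x400,0x603 covers pages 0x400 through 0x602, which is why page 602 is the last one listed in the output above.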
-M shows the map count:
[root@centosgpt vm]# ./page-types -p 106215 -a 0x400,0x603 -M -L
voffset map-cnt offset flags
400 1 1d32 __RUDlA____Ma_b__________i_________________
401 1 602e __RU_lA____M_____________i_________________
601 1 5be5 __RU_lA____Ma_b__________i_________________
602 1 dc49 ___U_lA____Ma_b__________i_________________
flags page-count MB symbolic-flags long-symbolic-flags
0x000000000200086c 1 0 __RU_lA____M_____________i_________________ referenced,uptodate,lru,active,mmap,idle_page
0x0000000002005868 1 0 ___U_lA____Ma_b__________i_________________ uptodate,lru,active,mmap,anonymous,swapbacked,idle_page
0x000000000200586c 1 0 __RU_lA____Ma_b__________i_________________ referenced,uptodate,lru,active,mmap,anonymous,swapbacked,idle_page
0x000000000200587c 1 0 __RUDlA____Ma_b__________i_________________ referenced,uptodate,dirty,lru,active,mmap,anonymous,swapbacked,idle_page
total 4 0
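The map count comes from /proc/kpagecount, which, like /proc/kpageflags, stores one 64-bit value per page frame, indexed by PFN. A minimal Python sketch of that lookup follows; read_pfn_record is an illustrative helper name, and reading these files requires root.

```python
import struct

RECORD_SIZE = 8  # /proc/kpagecount and /proc/kpageflags hold one u64 per PFN


def read_pfn_record(path, pfn):
    """Return the 64-bit value stored for `pfn` in a kpagecount-style file."""
    with open(path, "rb") as f:
        f.seek(pfn * RECORD_SIZE)
        data = f.read(RECORD_SIZE)
    if len(data) != RECORD_SIZE:
        raise ValueError(f"no record for pfn {pfn:#x} in {path}")
    return struct.unpack("Q", data)[0]


if __name__ == "__main__":
    # PFN 0x1d32 backs virtual page 0x400 in the transcript above.
    try:
        print("map count:", read_pfn_record("/proc/kpagecount", 0x1d32))
        print("flags:", hex(read_pfn_record("/proc/kpageflags", 0x1d32)))
    except (OSError, ValueError) as exc:
        print("cannot read:", exc)
```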
-r: raw mode, which also shows some extended flags:
/* [48-] take some arbitrary free slots for expanding overloaded flags
 * not part of kernel API
 */
#define KPF_READAHEAD 48
#define KPF_SLOB_FREE 49
#define KPF_SLUB_FROZEN 50
#define KPF_SLUB_DEBUG 51
#define KPF_FILE 61
#define KPF_SWAP 62
#define KPF_MMAP_EXCLUSIVE 63
Some of these flags can be found in /fs/proc/task_mmu.c:
[root@centosgpt vm]# ./page-types -r -p 106215 -L
voffset offset flags
400 1d32 __RUDlA____Ma_b__________i_________f______1
401 602e __RU_lA____M_____________i_________f____F_1
601 5be5 __RU_lA____Ma_b__________i_________f______1
602 dc49 ___U_lA____Ma_b__________i_________f______1
-F: parses the given kpageflags file; combine it with -L to find the relevant pages
[root@centosgpt vm]# ./page-types -F /proc/kpageflags
flags page-count MB symbolic-flags long-symbolic-flags
0x0000000000000000 31220 121 ___________________________________________
0x0000000004000000 1521 5 __________________________g________________ pgtable
0x0000000001000000 1 0 ________________________z__________________ zero_page
0x0000000000000028 3004 11 ___U_l_____________________________________ uptodate,lru
0x0000000002000028 2968 11 ___U_l___________________i_________________ uptodate,lru,idle_page
0x000000000200002c 7397 28 __RU_l___________________i_________________ referenced,uptodate,lru,idle_page
0x000000000000002c 456 1 __RU_l_____________________________________ referenced,uptodate,lru
-X, -x: for debugging
[root@localhost boot]# ls -atlr *5.5.4*
-rw-------. 1 root root 52185823 Apr 5 14:48 initramfs-5.5.4.img
-rw-r--r--. 1 root root 0 Apr 5 15:12 vmlinuz-5.5.4
-rw-r--r--. 1 root root 0 Apr 5 15:12 System.map-5.5.4
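hwpoison: this flag is set when a memory page goes bad, so that applications stop using the page and the system stays stable. The operations below, of course, only simulate such a failure in software.
Load the hwpoison-inject module: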
# modprobe hwpoison-inject
# lsmod |grep hw
Start the sample program:
[root@centosgpt vm]# ./memory_layout
Welcome to per thread arena example::8044
Before malloc in main thread
addr = 7ffec96940f0
Poison a single page; here the page at virtual page offset 0x400 is chosen:
[root@centosgpt vm]# ./page-types -p 8044 -L -N -a 0x400
voffset offset flags
400 c963 __RU_lA____M_______________________________
[root@centosgpt vm]# ./page-types -p 8044 -L -N -a 0x400 -X
voffset offset flags
400 c963 __RU_lA____M_______________________________
[root@centosgpt vm]# ./page-types -p 8044 -L -N -a 0x400
voffset offset flags
At this point the page has been unmapped. Check the kernel log and the page flags:
# dmesg
...
[19664.318542] Injecting memory failure at pfn 0xc963
[19664.318794] Memory failure: 0xc963: corrupted page was clean: dropped without side effects
[19664.318815] Memory failure: 0xc963: recovery action for clean LRU page: Recovered
[root@centosgpt vm]# ./page-types -a 0xc963 -L -N
offset flags
c963 __RU_______________X_______________________
Run a script that scans the PFNs mapped by every process. The scan is not very precise, but it finds that the page is not in use:
[root@centosgpt shell]# ./findpdibypfn.sh c963
[root@centosgpt shell]# cat findpdibypfn.sh
#!/bin/bash
# Print every process whose page-types listing contains the PFN given as $1.
pt=/srv/linux-5.5.4/tools/vm/page-types
for i in $(ls /proc | grep '[0-9]'); do
    $pt -p $i -N -l 2>/dev/null | awk '{print $2}' | grep $1 >/dev/null 2>&1
    if [ $? -eq 0 ]; then
        echo "process id : '$i' "
        echo "$(ps -ef | grep $i | grep -v grep)"
    fi
done
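One source of imprecision in the script is that grep matches substrings and only sees the first PFN of each range printed by -l. A hedged Python sketch of a stricter check is shown below. It assumes the voffset / offset / len / flags column layout shown earlier and that the len column is hexadecimal like the others, which is worth verifying against your page-types build; pfn_in_listing is a hypothetical helper name.

```python
def pfn_in_listing(listing, pfn):
    """Check whether `pfn` falls inside any range of a `page-types -p PID -l` listing."""
    for line in listing.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        try:
            start = int(fields[1], 16)   # "offset" column: first PFN of the range
            length = int(fields[2], 16)  # "len" column, assumed hexadecimal
        except ValueError:
            continue  # header line or anything else that is not a range
        if start <= pfn < start + length:
            return True
    return False


if __name__ == "__main__":
    sample = "voffset offset len flags\n400 12077 1 __RUDlA\n401 602e 3 __RU_lA\n"
    print(pfn_in_listing(sample, 0x6030))
```

The listing text would come from running page-types -p PID -N -l for each process, as the shell script does.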
I went off to cook a meal in between; when I came back the PFN was still not mapped:
[root@centosgpt vm]# ./page-types -a 0xc963 -L -N
offset flags
c963 __RU_______________X_______________________
[root@centosgpt shell]# ./findpdibypfn.sh c963
[root@centosgpt shell]#
After clearing the flag, the page can be seen to get mmap'ed again:
[root@centosgpt vm]# ./page-types -a 0xc963 -L -N -x
offset flags
c963 __RU_______________X_______________________
[root@centosgpt vm]# ./page-types -a 0xc963 -L -N
offset flags
c963 ___________________________________________
[root@centosgpt vm]# ./page-types -a 0xc963 -L -N
offset flags
c963 ___U_lA____Ma_b____________________________
Look up the process that is using it. The mapping may of course change, so a later lookup may no longer find it:
[root@centosgpt shell]# ./findpdibypfn.sh c963
process id : '15680'
root 15680 15673 0 13:18 ? 00:00:00 sendmail -i root
root 15693 15680 0 13:18 ? 00:00:00 /usr/sbin/postdrop -r
[root@centosgpt vm]# ./page-types -p 15680 -L |grep c963
7fb6669ff c963 ___U_lA____Ma_b____________________________
These are notes taken while working through a homework exercise from Liu Chao's Geek Time (极客时间) course <趣谈Linux操作系统>.
References
Understanding the Memory Layout of Linux Executables
proc – process information pseudo-filesystem
RHEL-6 – 5.2.12. /proc/iomem
How to translate virtual to physical addresses through /proc/pid/pagemap
Pagemap Interface of Linux Explained
How do I read from /proc/$pid/mem under Linux?
Examining Process Page Tables
VDSO(7)
x86 架构下 Linux 的系统调用与 vsyscall, vDSO
什麼是 Linux vDSO 與 vsyscall?——發展過程