So, let's run some experiments to verify this.
First, we need GDB to inspect the running kernel's memory, typically invoked as:
gdb -x /dev/shm/gdb-script /usr/src/lki/vmlinux /proc/kcore
where 'gdb-script' is my small set of GDB scripts that make these observations easier, and /usr/src/lki/vmlinux is the ELF image of the running kernel. Note that the kernel must be built with CONFIG_DEBUG_INFO set, which turns on GCC's '-g' flag in the Makefile.
Here are the command lines that set up the mounts for the experiment:
[bshyu@vmBShyu gdb-scripts]$ sudo mount /dev/disk/by-uuid/448c9c30-4a09-40f6-8e01-d3c577cd250d /media/linux
[bshyu@vmBShyu gdb-scripts]$ sudo mount /dev/disk/by-label/FAT32 /media/FAT32
[bshyu@vmBShyu gdb-scripts]$ ls /media/FAT32
bsdbg data knoppix.img mypage setup-cygwin.exe VirtMachine zycova
BShyu linux Recycled temp www
[bshyu@vmBShyu gdb-scripts]$ sudo mount --bind /media/FAT32 /media/FAT32/temp
[bshyu@vmBShyu gdb-scripts]$ sudo mount --bind /media/linux /media/FAT32/linux/
Now we are going to play with GDB. But first, here are the aforementioned GDB scripts, which help us walk Linux's linked-list data structures (assuming you are already familiar with Linux's struct list_head :-) ):
define bsScan
# $arg0 = address of the list head, $arg1 = struct name, $arg2 = list_head member name
# get the address of the list_head
set $head = (struct list_head*) $arg0
# byte offset of member $arg2 inside struct $arg1 (the offsetof idiom)
set $CStructOffset = &(((struct $arg1 *)0)->$arg2)
printf "\n================================================================\n"
printf "\tdisplay list of structures\n"
print *($arg0)
printf "================================================================\n"
set $ptr = $head->next
bsNext $arg0 $arg1 $arg2
end
define bsNext
if $ptr != $head
set $ent = ((struct $arg1 *)((char*)($ptr) - (unsigned)$CStructOffset))
printf "\n\n#--------------- 0x%08x ---------------#\n", $ent
print *$ent
set $ptr = $ptr->next
printf "\nUse print *$ent to show list entity\n"
printf "Use set $a = $ent to store the entity\n"
end
end
document bsScan
-----------------------------------------------------------
Bernard Shyu's LIST display script
-----------------------------------------------------------
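bsScan <list_head address> <struct name> <list_head member name>
bsNext <list_head address> <struct name> <list_head member name>
Scan a list of elements one by one
Ex. bsScan &inode_in_use inode i_list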
end
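For readers who want to see the pointer arithmetic behind bsNext outside of GDB, here is a minimal user-space C sketch. It is not kernel code: struct item, entry_of, list_add_after, sum_ids and demo_sum are names made up for illustration, and struct list_head is re-declared locally as a stand-in for the kernel's type. The point is only that the address of the enclosing structure is the address of the embedded list node minus the member's offset, which is exactly what $CStructOffset is used for in the script.

```c
#include <assert.h>
#include <stddef.h>

/* User-space stand-in for the kernel's struct list_head. */
struct list_head {
    struct list_head *next, *prev;
};

/* Hypothetical structure for illustration; not a kernel type. */
struct item {
    int id;
    struct list_head link;   /* embedded node, like vfsmount.mnt_child */
};

/* Same arithmetic as bsNext: node address minus the member's offset. */
#define entry_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Insert node right after head in a circular doubly linked list. */
void list_add_after(struct list_head *head, struct list_head *node)
{
    assert(head != NULL && node != NULL);
    node->next = head->next;
    node->prev = head;
    head->next->prev = node;
    head->next = node;
}

/* Walk the list until we are back at the head, summing the ids
 * of the structures that contain each node. */
int sum_ids(struct list_head *head)
{
    int sum = 0;
    for (struct list_head *p = head->next; p != head; p = p->next)
        sum += entry_of(p, struct item, link)->id;
    return sum;
}

/* Build a two-element list and return the sum of the ids. */
int demo_sum(void)
{
    struct list_head head = { &head, &head };   /* empty list */
    struct item a = { 10, { NULL, NULL } };
    struct item b = { 32, { NULL, NULL } };

    list_add_after(&head, &a.link);
    list_add_after(&head, &b.link);
    return sum_ids(&head);
}
```

Calling demo_sum() from a main() of your own walks both elements and stops when the pointer comes back to the head, the same termination test as `if $ptr != $head` in bsNext.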
So, we will start from the INIT process's namespace, which in more than 99% of cases is the namespace used by all processes. My kernel version is 2.6.25.4, in which the namespace structure has changed considerably since the ULK book was written - the process's namespace is now reached through nsproxy :-) (%% Always note the kernel version in use when you tell a story about the Linux kernel; otherwise readers will very likely miss the point you are making. %%)
(gdb) p *init_task.nsproxy->mnt_ns->root
$114 = {mnt_hash = {next = 0xcf80c808, prev = 0xcf80c808}, mnt_parent = 0xcf80c808,
mnt_mountpoint = 0xcf4018e0, mnt_root = 0xcf4018e0, mnt_sb = 0xcf80dab8, mnt_mounts = {
next = 0xcf80cf98, prev = 0xcf80cf98}, mnt_child = {next = 0xcf80c828, prev = 0xcf80c828},
mnt_flags = 0, mnt_devname = 0xcf802440 "rootfs", mnt_list = {next = 0xcf80cfa8,
prev = 0xcf802410}, mnt_expire = {next = 0xcf80c840, prev = 0xcf80c840}, mnt_share = {
next = 0xcf80c848, prev = 0xcf80c848}, mnt_slave_list = {next = 0xcf80c850,
prev = 0xcf80c850}, mnt_slave = {next = 0xcf80c858, prev = 0xcf80c858}, mnt_master = 0x0,
mnt_ns = 0xcf802408, mnt_count = {counter = 2}, mnt_expiry_mark = 0, mnt_pinned = 0,
mnt_ghosts = 0}
(gdb) bsScan &init_task.nsproxy->mnt_ns->root->mnt_mounts vfsmount mnt_child
================================================================
display list of structures
$115 = {next = 0xcf80cf98, prev = 0xcf80cf98}
================================================================
#--------------- 0xcf80cf78 ---------------#
$116 = {mnt_hash = {next = 0xcf80b408, prev = 0xcf80b408}, mnt_parent = 0xcf80c808,
mnt_mountpoint = 0xcf4018e0, mnt_root = 0xcf421718, mnt_sb = 0xcf1a1318, mnt_mounts = {
next = 0xcf80cf10, prev = 0xcf80cbe0}, mnt_child = {next = 0xcf80c820, prev = 0xcf80c820},
mnt_flags = 0, mnt_devname = 0xcf183590 "/dev/root", mnt_list = {next = 0xcf80cf20,
prev = 0xcf80c838}, mnt_expire = {next = 0xcf80cfb0, prev = 0xcf80cfb0}, mnt_share = {
next = 0xcf80cfb8, prev = 0xcf80cfb8}, mnt_slave_list = {next = 0xcf80cfc0,
prev = 0xcf80cfc0}, mnt_slave = {next = 0xcf80cfc8, prev = 0xcf80cfc8}, mnt_master = 0x0,
mnt_ns = 0xcf802408, mnt_count = {counter = 755}, mnt_expiry_mark = 0, mnt_pinned = 0,
mnt_ghosts = 0}
Use print *$ent to show list entity
Use set $a = $ent to store the entity
(gdb) set $root = $ent
The 2nd vfsmount structure is the real root of the filesystem directory tree we will explore shortly. It is the mount structure corresponding to '/', the root directory of every pathname we use (note that in most cases there is only one namespace).
What about the 1st? It is rootfs (see its mnt_devname above), the initial in-memory file system into which the initramfs is unpacked.
"For each mounted filesystem, a circular doubly linked list including all child mounted filesystems. The head of each list is stored in the mnt_mounts field of the mounted filesystem descriptor; moreover, the mnt_child field of the descriptor stores the pointers to the adjacent elements in the list." -- ULK
Note that we've set $root to the struct vfsmount for the root directory (/dev/root), so $root->mnt_mounts is the head of the linked list of all its child mounted filesystems, which we will walk through their mnt_child fields.
(gdb) bsScan &$root->mnt_mounts vfsmount mnt_child
================================================================
display list of structures
$117 = {next = 0xcf80cf10, prev = 0xcf80cbe0}
================================================================
#--------------- 0xcf80cef0 ---------------#
$118 = {mnt_hash = {next = 0xcf80b310, prev = 0xcf80b310}, mnt_parent = 0xcf80cf78,
mnt_mountpoint = 0xcf420128, mnt_root = 0xcf417c70, mnt_sb = 0xcf1526c8, mnt_mounts = {
next = 0xcf80c938, prev = 0xcf80ca48}, mnt_child = {next = 0xcf80ce88, prev = 0xcf80cf90},
mnt_flags = 0, mnt_devname = 0xcf1fca60 "/dev", mnt_list = {next = 0xcf80ce98,
prev = 0xcf80cfa8}, mnt_expire = {next = 0xcf80cf28, prev = 0xcf80cf28}, mnt_share = {
next = 0xcf80cf30, prev = 0xcf80cf30}, mnt_slave_list = {next = 0xcf80cf38,
prev = 0xcf80cf38}, mnt_slave = {next = 0xcf80cf40, prev = 0xcf80cf40}, mnt_master = 0x0,
mnt_ns = 0xcf802408, mnt_count = {counter = 48}, mnt_expiry_mark = 0, mnt_pinned = 0,
mnt_ghosts = 0}
Use print *$ent to show list entity
Use set $a = $ent to store the entity
(gdb) bsNext &$root->mnt_mounts vfsmount mnt_child
#--------------- 0xcf80ce68 ---------------#
$119 = {mnt_hash = {next = 0xcf80b3e0, prev = 0xcf80b3e0}, mnt_parent = 0xcf80cf78,
mnt_mountpoint = 0xcf420e38, mnt_root = 0xcf4017b0, mnt_sb = 0xcf80d688, mnt_mounts = {
next = 0xcf80c1c8, prev = 0xcf80cb58}, mnt_child = {next = 0xcf80ce00, prev = 0xcf80cf10},
mnt_flags = 0, mnt_devname = 0xcf1fca28 "/proc", mnt_list = {next = 0xcf80ce10,
prev = 0xcf80cf20}, mnt_expire = {next = 0xcf80cea0, prev = 0xcf80cea0}, mnt_share = {
next = 0xcf80cea8, prev = 0xcf80cea8}, mnt_slave_list = {next = 0xcf80ceb0,
prev = 0xcf80ceb0}, mnt_slave = {next = 0xcf80ceb8, prev = 0xcf80ceb8}, mnt_master = 0x0,
mnt_ns = 0xcf802408, mnt_count = {counter = 6}, mnt_expiry_mark = 0, mnt_pinned = 0,
mnt_ghosts = 0}
Use print *$ent to show list entity
Use set $a = $ent to store the entity
(gdb)
#--------------- 0xcf80cde0 ---------------#
$130 = {mnt_hash = {next = 0xcf80b3e8, prev = 0xcf80b3e8}, mnt_parent = 0xcf80cf78,
mnt_mountpoint = 0xcf420ed0, mnt_root = 0xcf401978, mnt_sb = 0xcf80dcd0, mnt_mounts = {
next = 0xcf80cdf8, prev = 0xcf80cdf8}, mnt_child = {next = 0xcf80c9c0, prev = 0xcf80ce88},
mnt_flags = 0, mnt_devname = 0xcf1fc9f0 "/sys", mnt_list = {next = 0xcf80c1d8,
prev = 0xcf80ce98}, mnt_expire = {next = 0xcf80ce18, prev = 0xcf80ce18}, mnt_share = {
next = 0xcf80ce20, prev = 0xcf80ce20}, mnt_slave_list = {next = 0xcf80ce28,
prev = 0xcf80ce28}, mnt_slave = {next = 0xcf80ce30, prev = 0xcf80ce30}, mnt_master = 0x0,
mnt_ns = 0xcf802408, mnt_count = {counter = 1}, mnt_expiry_mark = 0, mnt_pinned = 0,
mnt_ghosts = 0}
Use print *$ent to show list entity
Use set $a = $ent to store the entity
To continue scanning the list of child mounts, we can either repeat the command bsNext &$root->mnt_mounts vfsmount mnt_child or simply press ENTER (GDB repeats the last command). We find the child mounts, such as '/dev', '/proc' and '/sys', as shown on my machine.
Not shown here, but if we keep pressing ENTER to observe the child mounts of the root, we will notice that the BIND mounts (mount --bind /media/FAT32 /media/FAT32/temp, mount --bind /media/linux /media/FAT32/linux/) are not found here. Where did they go? They are grandchildren, rather than direct children, of the root mount.
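To make the parent/child/grandchild relationship concrete, here is a small user-space C model of the mount tree. It is only a sketch under my own assumptions: struct mini_mount keeps just the three fields that matter here (mnt_parent, mnt_mounts, mnt_child), and mount_init, find_depth and demo_depth are hypothetical helpers, not kernel functions. A mount's depth is the number of mnt_mounts lists you must descend through to reach it, so a bind mount placed under /media/FAT32 sits at depth 2 and never shows up when scanning only the root's own mnt_mounts list.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* User-space stand-in for the kernel's struct list_head. */
struct list_head {
    struct list_head *next, *prev;
};

/* Hypothetical, heavily reduced model of struct vfsmount. */
struct mini_mount {
    const char *name;
    struct mini_mount *mnt_parent;
    struct list_head mnt_mounts;   /* head of the list of children */
    struct list_head mnt_child;    /* link in the parent's mnt_mounts */
};

#define entry_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Initialise a mount and append it to its parent's child list.
 * A mount without a parent is its own parent, as seen for rootfs. */
void mount_init(struct mini_mount *m, const char *name,
                struct mini_mount *parent)
{
    assert(m != NULL && name != NULL);
    m->name = name;
    m->mnt_parent = parent ? parent : m;
    m->mnt_mounts.next = m->mnt_mounts.prev = &m->mnt_mounts;
    m->mnt_child.next = m->mnt_child.prev = &m->mnt_child;
    if (parent) {
        struct list_head *head = &parent->mnt_mounts;
        m->mnt_child.prev = head->prev;
        m->mnt_child.next = head;
        head->prev->next = &m->mnt_child;
        head->prev = &m->mnt_child;
    }
}

/* Depth-first search; returns the depth of the named mount, or -1. */
int find_depth(struct mini_mount *m, const char *name, int depth)
{
    if (strcmp(m->name, name) == 0)
        return depth;
    for (struct list_head *p = m->mnt_mounts.next;
         p != &m->mnt_mounts; p = p->next) {
        int d = find_depth(entry_of(p, struct mini_mount, mnt_child),
                           name, depth + 1);
        if (d >= 0)
            return d;
    }
    return -1;
}

/* Rebuild a fragment of the tree from this experiment. */
int demo_depth(void)
{
    struct mini_mount root, dev, fat, temp;

    mount_init(&root, "/", NULL);
    mount_init(&dev, "/dev", &root);
    mount_init(&fat, "/media/FAT32", &root);
    mount_init(&temp, "/media/FAT32/temp", &fat);
    return find_depth(&root, "/media/FAT32/temp", 0);
}
```

In this model /dev and /media/FAT32 are found at depth 1, while /media/FAT32/temp is only reached by descending into /media/FAT32's own mnt_mounts list.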
So, let's find the mount for /media/FAT32, and from there find its children.
...
...
(gdb)
#--------------- 0xcf80cab0 ---------------#
$143 = {mnt_hash = {next = 0xcf80b978, prev = 0xcf80b978}, mnt_parent = 0xcf80cf78,
mnt_mountpoint = 0xcf406848, mnt_root = 0xcf56a1c0, mnt_sb = 0xcf294470, mnt_mounts = {
next = 0xcf80cac8, prev = 0xcf80cac8}, mnt_child = {next = 0xcf80c140, prev = 0xcf80c9c0},
mnt_flags = 0,
mnt_devname = 0xcf26f238 "/dev/disk/by-uuid/448c9c30-4a09-40f6-8e01-d3c577cd250d",
mnt_list = {next = 0xcf80cb68, prev = 0xcf80ca58}, mnt_expire = {next = 0xcf80cae8,
prev = 0xcf80cae8}, mnt_share = {next = 0xcf80caf0, prev = 0xcf80caf0}, mnt_slave_list = {
next = 0xcf80caf8, prev = 0xcf80caf8}, mnt_slave = {next = 0xcf80cb00, prev = 0xcf80cb00},
mnt_master = 0x0, mnt_ns = 0xcf802408, mnt_count = {counter = 2}, mnt_expiry_mark = 0,
mnt_pinned = 0, mnt_ghosts = 0}
Use print *$ent to show list entity
Use set $a = $ent to store the entity
(gdb) set $L1 = $ent
...
...
(gdb)
#--------------- 0xcf80cbc0 ---------------#
$137 = {mnt_hash = {next = 0xcf80be00, prev = 0xcf80be00}, mnt_parent = 0xcf80cf78,
mnt_mountpoint = 0xcf7098e0, mnt_root = 0xcf709ed0, mnt_sb = 0xcfa68080, mnt_mounts = {
next = 0xcf80cc68, prev = 0xcf80cd78}, mnt_child = {next = 0xcf80cf90, prev = 0xcf80c0b8},
mnt_flags = 0, mnt_devname = 0xc94cc750 "/dev/disk/by-label/FAT32", mnt_list = {
next = 0xcf80cc78, prev = 0xcf80c0c8}, mnt_expire = {next = 0xcf80cbf8,
prev = 0xcf80cbf8}, mnt_share = {next = 0xcf80cc00, prev = 0xcf80cc00}, mnt_slave_list = {
next = 0xcf80cc08, prev = 0xcf80cc08}, mnt_slave = {next = 0xcf80cc10, prev = 0xcf80cc10},
mnt_master = 0x0, mnt_ns = 0xcf802408, mnt_count = {counter = 3}, mnt_expiry_mark = 0,
mnt_pinned = 0, mnt_ghosts = 0}
Use print *$ent to show list entity
Use set $a = $ent to store the entity
(gdb) set $F1 = $ent
So we've set two GDB variables, $L1 and $F1, pointing to the mounts of interest. Since nothing is mounted under $L1, its mnt_mounts is an empty list: next and prev both hold the address of the list head itself, as we can see:
(gdb) p $L1->mnt_mounts
$153 = {next = 0xcf80cac8, prev = 0xcf80cac8}
(gdb) p &$L1->mnt_mounts
$154 = (struct list_head *) 0xcf80cac8
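The check we just did by eye (next equals the address of the head) is the usual emptiness test for a circular list. A minimal user-space C sketch of it follows; init_list_head, list_is_empty, list_add_tail_node and demo_empty are illustrative names of my own, and struct list_head is again a local stand-in for the kernel's type.

```c
#include <assert.h>
#include <stddef.h>

/* User-space stand-in for the kernel's struct list_head. */
struct list_head {
    struct list_head *next, *prev;
};

/* An empty circular list points back at itself in both directions. */
void init_list_head(struct list_head *head)
{
    assert(head != NULL);
    head->next = head;
    head->prev = head;
}

/* The list is empty exactly when head->next is the head itself. */
int list_is_empty(const struct list_head *head)
{
    return head->next == head;
}

/* Append node just before head, i.e. at the tail of the list. */
void list_add_tail_node(struct list_head *head, struct list_head *node)
{
    node->prev = head->prev;
    node->next = head;
    head->prev->next = node;
    head->prev = node;
}

/* Returns 1 if the list is empty before the insert and not after. */
int demo_empty(void)
{
    struct list_head mnt_mounts, child;

    init_list_head(&mnt_mounts);
    int before = list_is_empty(&mnt_mounts);
    list_add_tail_node(&mnt_mounts, &child);
    int after = list_is_empty(&mnt_mounts);
    return before && !after;
}
```

This is the state $L1->mnt_mounts is in above: both pointers are 0xcf80cac8, which is the address of the list head itself.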
Let's go through the children of $F1.
(gdb) bsScan &$F1->mnt_mounts vfsmount mnt_child
================================================================
display list of structures
$158 = {next = 0xcf80cc68, prev = 0xcf80cd78}
================================================================
#--------------- 0xcf80cc48 ---------------#
$159 = {mnt_hash = {next = 0xcf80be30, prev = 0xcf80be30}, mnt_parent = 0xcf80cbc0,
mnt_mountpoint = 0xcf709f68, mnt_root = 0xcf709ed0, mnt_sb = 0xcfa68080, mnt_mounts = {
next = 0xcf80cc60, prev = 0xcf80cc60}, mnt_child = {next = 0xcf80cd78, prev = 0xcf80cbd8},
mnt_flags = 0, mnt_devname = 0xc94c61d8 "/dev/disk/by-label/FAT32", mnt_list = {
next = 0xcf80cd88, prev = 0xcf80cbf0}, mnt_expire = {next = 0xcf80cc80,
prev = 0xcf80cc80}, mnt_share = {next = 0xcf80cc88, prev = 0xcf80cc88}, mnt_slave_list = {
next = 0xcf80cc90, prev = 0xcf80cc90}, mnt_slave = {next = 0xcf80cc98, prev = 0xcf80cc98},
mnt_master = 0x0, mnt_ns = 0xcf802408, mnt_count = {counter = 1}, mnt_expiry_mark = 0,
mnt_pinned = 0, mnt_ghosts = 0}
Use print *$ent to show list entity
Use set $a = $ent to store the entity
(gdb) set $F2 = $ent
(gdb) bsNext &$F1->mnt_mounts vfsmount mnt_child
#--------------- 0xcf80cd58 ---------------#
$160 = {mnt_hash = {next = 0xcf80bf38, prev = 0xcf80bf38}, mnt_parent = 0xcf80cbc0,
mnt_mountpoint = 0xc4eaf2f0, mnt_root = 0xcf56a1c0, mnt_sb = 0xcf294470, mnt_mounts = {
next = 0xcf80cd70, prev = 0xcf80cd70}, mnt_child = {next = 0xcf80cbd8, prev = 0xcf80cc68},
mnt_flags = 0,
mnt_devname = 0xcf26fbd8 "/dev/disk/by-uuid/448c9c30-4a09-40f6-8e01-d3c577cd250d",
mnt_list = {next = 0xcf802410, prev = 0xcf80cc78}, mnt_expire = {next = 0xcf80cd90,
prev = 0xcf80cd90}, mnt_share = {next = 0xcf80cd98, prev = 0xcf80cd98}, mnt_slave_list = {
next = 0xcf80cda0, prev = 0xcf80cda0}, mnt_slave = {next = 0xcf80cda8, prev = 0xcf80cda8},
mnt_master = 0x0, mnt_ns = 0xcf802408, mnt_count = {counter = 1}, mnt_expiry_mark = 0,
mnt_pinned = 0, mnt_ghosts = 0}
Use print *$ent to show list entity
Use set $a = $ent to store the entity
(gdb) set $L2 = $ent
So, we have $L1, $L2, $F1 and $F2 for the 4 mounts we want to observe. Let's first check $F1 and $F2, the mounts of the FAT32 file system.
- They are indeed different structures: $F1 = 0xcf80cbc0, $F2 = 0xcf80cc48
- They have the same superblock: $F1->mnt_sb = $F2->mnt_sb = 0xcfa68080. This reflects what is described in ULK: there is only one superblock structure, however many times the file system is mounted.
- Just as expected, $F2->mnt_parent = 0xcf80cbc0, the address of $F1
- Even though they are different mounts, their mnt_root (struct dentry) is the same: $F1->mnt_root = $F2->mnt_root = 0xcf709ed0. This indicates that mnt_root belongs to the file system's dentry tree rather than to the mount. (Here the whole file system was bind-mounted; binding a subdirectory instead would make mnt_root that subdirectory's dentry.)
- On the other hand, the vfsmount's other dentry, mnt_mountpoint, obviously differs for their respective mount points: $F2->mnt_mountpoint = 0xcf709f68, $F1->mnt_mountpoint = 0xcf7098e0
- $F2->mnt_mountpoint->d_mounted = 1
- $F2->mnt_mountpoint->d_parent->d_mounted = 0
- The reference counters: $F1->mnt_count = 3, $F2->mnt_count = 1.
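The observations on $F1 and $F2 can be summed up in a toy C model. This is a sketch under my own assumptions, not how the kernel implements bind mounts: struct mini_mount, bind_mount and demo_bind are hypothetical, and struct super_block and struct dentry are reduced to placeholders because only pointer identity matters here. The model copies the superblock pointer from the source mount, takes the dentry being bound as the new mnt_root, and records a separate parent and mount point.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholders; only pointer identity matters in this model. */
struct super_block { const char *fs_name; };
struct dentry { const char *name; };

/* Hypothetical, heavily reduced model of struct vfsmount. */
struct mini_mount {
    struct mini_mount *mnt_parent;
    struct dentry *mnt_mountpoint;   /* where this mount is attached */
    struct dentry *mnt_root;         /* root of the mounted tree */
    struct super_block *mnt_sb;      /* shared by all mounts of one fs */
};

/* Toy bind mount: a new mount structure that shares the superblock of
 * src and uses bound_root as its root, attached under parent at
 * mountpoint. */
struct mini_mount bind_mount(const struct mini_mount *src,
                             struct dentry *bound_root,
                             struct mini_mount *parent,
                             struct dentry *mountpoint)
{
    struct mini_mount m;

    assert(src != NULL);
    m.mnt_parent = parent;
    m.mnt_mountpoint = mountpoint;
    m.mnt_root = bound_root;
    m.mnt_sb = src->mnt_sb;
    return m;
}

/* Returns 1 if the model reproduces the relations seen in GDB. */
int demo_bind(void)
{
    struct super_block fat_sb = { "vfat" };
    struct dentry fat_root = { "/" };         /* root of the FAT32 fs */
    struct dentry media_fat32 = { "FAT32" };  /* under /media */
    struct dentry temp = { "temp" };          /* inside the FAT32 fs */
    struct mini_mount f1 = { NULL, &media_fat32, &fat_root, &fat_sb };
    struct mini_mount f2 = bind_mount(&f1, f1.mnt_root, &f1, &temp);

    return f2.mnt_sb == f1.mnt_sb &&
           f2.mnt_root == f1.mnt_root &&
           f2.mnt_parent == &f1 &&
           f2.mnt_mountpoint != f1.mnt_mountpoint;
}
```

The four conditions checked in demo_bind correspond to the superblock, mnt_root, mnt_parent and mnt_mountpoint bullets above.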
That's it. Using my small script snippets, bsScan and bsNext, you can also explore the Linux kernel's internals. Here we have observed the tree of file system mounts by going through mnt_mounts and mnt_child, starting from the namespace.
Note that another, much more straightforward way to explore the system's mount structures is simply to loop through the namespace's list field. Below are the GDB commands to walk all the mounts of the namespace:
(gdb) bsScan &init_task.nsproxy->mnt_ns->list vfsmount mnt_list
(gdb) bsNext &init_task.nsproxy->mnt_ns->list vfsmount mnt_list
Enjoy it.