fs::symlink_metadata
Recently I wrote a program that recursively collects folder statistics, and it got stuck while measuring the size of one folder.
I used find -type l to search for soft links and found two links that reference each other recursively: a points to b’s folder while b points to a’s folder.
Looking at the source code, I realized that std::fs::metadata calls the stat() system call underneath, and stat() follows links.
I switched to symlink_metadata, which calls lstat() underneath. lstat() does not follow soft links, so the links are skipped and the program no longer gets stuck.
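As a minimal sketch of that fix (not the original program; the function name dir_size is made up for illustration), a recursive size walk can call symlink_metadata on every entry and simply skip anything that is a soft link:

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Sums Metadata::len() of everything under `root` without following soft links.
/// `dir_size` is a hypothetical helper name, not part of std.
fn dir_size(root: &Path) -> io::Result<u64> {
    // symlink_metadata is lstat(): it describes the link itself, not its target.
    let meta = fs::symlink_metadata(root)?;
    let file_type = meta.file_type();
    if file_type.is_symlink() {
        // Skip links, so an a <-> b cycle cannot make the recursion loop.
        return Ok(0);
    }
    if file_type.is_dir() {
        let mut total = 0;
        for entry in fs::read_dir(root)? {
            total += dir_size(&entry?.path())?;
        }
        return Ok(total);
    }
    // Regular files (and other non-directory entries) contribute their length.
    Ok(meta.len())
}
```

Calling dir_size on a tree where a/to_b points to b and b/to_a points to a terminates, because neither link is ever entered.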
errno 40 ELOOP
On Linux (glibc), errno 40 (ELOOP) means “Too many levels of symbolic links”.
I expected the standard library to report an ELOOP error, but Mr. Wildcat said that std::fs made a lot of sacrifices to be cross-platform.
(It seems that the std::fs API has no dedicated handling for the ELOOP error code? But I have never actually seen errno ELOOP myself, so I won’t talk nonsense.)
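One way to check what std::fs actually does with ELOOP is to ask the returned io::Error for its raw OS error code. The sketch below assumes a Unix system; stat_errno is a made-up helper name, and the value 40 is specific to Linux (other platforms may number ELOOP differently):

```rust
use std::fs;
use std::path::Path;

/// Returns the raw errno if fs::metadata (stat) fails on `path`, otherwise None.
/// `stat_errno` is a hypothetical helper name, not part of std.
fn stat_errno(path: &Path) -> Option<i32> {
    // io::Error::raw_os_error exposes the errno that the system call set.
    fs::metadata(path).err().and_then(|e| e.raw_os_error())
}
```

If two soft links point directly at each other, stat() cannot resolve either of them, so the error from fs::metadata should carry the ELOOP code, while fs::symlink_metadata on the same path still succeeds because it never follows the link.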
is_symlink() always returns false?
The standard library makes trade-offs for cross-platform purposes. One example is that metadata().is_symlink() always returns false, which is obviously not a great fit for Linux systems.
This is because metadata/stat will “eat” the soft link: it resolves the link to its target, so from its point of view the abstraction of a soft link does not exist.
So metadata().is_symlink() must always return false on Linux.
That’s why the Rust documentation is kind enough to emphasize that is_symlink() must be used together with symlink_metadata() to be effective.
The only way to know whether a file is a soft link is to use symlink_metadata/lstat, which does not follow soft links.
Of course, there must be a hint in the man page as well, but when I first read it, it was too long (TL;DR) and I didn’t read it carefully…
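The difference is easy to see side by side. This is a small sketch (symlink_flags is a made-up name) that asks both calls the same question about one path; it assumes a toolchain where Metadata::is_symlink is available:

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Returns (is_symlink according to stat, is_symlink according to lstat).
/// `symlink_flags` is a hypothetical helper name, not part of std.
fn symlink_flags(path: &Path) -> io::Result<(bool, bool)> {
    // metadata() follows the link, so it describes the target.
    let followed = fs::metadata(path)?.is_symlink();
    // symlink_metadata() does not follow the link, so it describes the link itself.
    let not_followed = fs::symlink_metadata(path)?.is_symlink();
    Ok((followed, not_followed))
}
```

For a soft link that points to an existing file, the first value is false and the second is true; for a regular file both are false.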
is_symlink is mutually exclusive with is_dir
In the value returned by symlink_metadata()/lstat(), is_symlink() and is_dir() are mutually exclusive: at most one of them can be true.
So a soft link that points to a folder reports is_symlink() == true and is_dir() == false.
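Because the two flags never overlap under lstat, an entry can be classified with a simple if/else chain. A minimal sketch, with classify as a made-up helper name:

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Classifies `path` by its own file type, without following soft links.
/// `classify` is a hypothetical helper name, not part of std.
fn classify(path: &Path) -> io::Result<&'static str> {
    let file_type = fs::symlink_metadata(path)?.file_type();
    // Under lstat a path has exactly one type, so these branches cannot overlap.
    Ok(if file_type.is_symlink() {
        "symlink"
    } else if file_type.is_dir() {
        "dir"
    } else if file_type.is_file() {
        "file"
    } else {
        "other"
    })
}
```

A soft link to a folder is classified as "symlink" here, even though fs::metadata on the same path would report is_dir() == true because it follows the link.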
is_symlink is not stable yet, how can I use it?
I have to say that the standard library lacks a lot of support for Linux-specific extensions, and Metadata::is_symlink is not expected to be stable until early 2022.
Since Metadata’s member fields are all private, you can only transmute it or look for a Unix extension trait (a UnixExt or something like that).
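Two stable routes that avoid transmute are worth sketching. The first goes through Metadata::file_type(), whose FileType::is_symlink() has long been stable. The second uses the Unix extension trait std::os::unix::fs::MetadataExt and checks the file-type bits of st_mode by hand. The function names are made up for illustration, and the two constants are the usual Linux S_IFMT and S_IFLNK values written out rather than imported from a crate:

```rust
use std::fs;
use std::io;
use std::os::unix::fs::MetadataExt;
use std::path::Path;

// File-type mask and soft-link type bits of st_mode (values as defined on Linux).
const S_IFMT: u32 = 0o170000;
const S_IFLNK: u32 = 0o120000;

/// Stable route 1: ask the FileType instead of the Metadata.
fn is_symlink_via_file_type(path: &Path) -> io::Result<bool> {
    Ok(fs::symlink_metadata(path)?.file_type().is_symlink())
}

/// Stable route 2: read st_mode through the Unix extension trait.
fn is_symlink_via_mode(path: &Path) -> io::Result<bool> {
    let mode = fs::symlink_metadata(path)?.mode();
    Ok((mode & S_IFMT) == S_IFLNK)
}
```

Both functions must start from symlink_metadata; starting from metadata would resolve the link first and both would return false.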
Why does the du command report a different size than Metadata::len()?
Hard disk 4k alignment
For example, if a.txt contains only one character, the stat command and fs::Metadata::len() both report a size of 1.
However, if you look at it with du, it says 4k, because the block size of the Linux ext4 file system is usually 4k.
You can think of 4k as the minimum storage unit the file system hands out, so every file occupies an integer multiple of 4k on disk. This seems to be called 4k alignment.
It is a bit like the memory layout of a struct: fields are aligned to the CPU register size of 8 bytes, so the struct size ends up being an integer multiple of 8 bytes as much as possible.
If you add a block-size parameter to the du command, such as du --apparent-size --block-size 1, the result is the same as the stat command.
du --bytes or du -b is short for du --apparent-size --block-size 1
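The two numbers can also be read from Rust. In this sketch (helper names are made up), Metadata::len() gives the apparent size, and the Unix extension method MetadataExt::blocks() gives the number of 512-byte units allocated, which is roughly what du counts by default. The round-up function only illustrates the 4k arithmetic; the real allocation depends on the file system:

```rust
use std::fs;
use std::io;
use std::os::unix::fs::MetadataExt;
use std::path::Path;

/// Rounds `len` up to the next multiple of `block_size` (for example 4096).
fn round_up_to_block(len: u64, block_size: u64) -> u64 {
    (len + block_size - 1) / block_size * block_size
}

/// Returns (apparent size in bytes, allocated size in bytes) for `path`.
fn apparent_and_allocated(path: &Path) -> io::Result<(u64, u64)> {
    let meta = fs::symlink_metadata(path)?;
    // st_blocks is counted in 512-byte units, independent of the file system block size.
    Ok((meta.len(), meta.blocks() * 512))
}
```

For a one-byte file, round_up_to_block(1, 4096) gives 4096, which matches the 4k that du shows on a typical ext4 setup, while the apparent size stays 1.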
Is /proc really zero size?
The du command is not lying: the three virtual filesystems /dev, /proc, and /sys really are zero size on the hard disk (because they don’t exist on the hard disk at all).
Although most of the files in these three folders show a size of zero in the stat command, some still report a size, for example “/proc/bus/pci/00/01.2”.
Another example is “/proc/config.gz”, which stores the compile-time parameters of the Linux kernel.
Many of the parameters are strings, so the size of the zcat output is also “variable”, i.e. of indeterminate length.
So stat only says what the length of /proc/config.gz would be if it were read at the current moment.
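To see how far the reported size and the readable content can diverge, one can compare Metadata::len() with the number of bytes actually read. This is a small sketch with a made-up function name; which /proc files exist and what size they report depends on the kernel, so no particular path is assumed inside the function:

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Returns (size reported by stat, number of bytes actually read) for `path`.
/// `reported_and_read_len` is a hypothetical helper name, not part of std.
fn reported_and_read_len(path: &Path) -> io::Result<(u64, usize)> {
    let reported = fs::metadata(path)?.len();
    // fs::read keeps reading until end of file, whatever size stat reported.
    let actual = fs::read(path)?.len();
    Ok((reported, actual))
}
```

On a typical Linux system, calling it on a file such as /proc/version usually shows a reported size of 0 next to a non-empty read, which is why Metadata::len() cannot be trusted for files in these virtual filesystems.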