An archived article submitted when the site was set up; the original was written around February 2023.

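File and Directory Operations

Basic Principles

Standard C Library I/O vs. Linux System File I/O

  • I/O (Input / Output): input and output, usually the transfer of files between storage media.
  • Standard C library I/O functions and Linux I/O functions stand in a caller/callee relationship: the standard C library functions call the Linux functions (system calls) to perform file I/O. An application has no permission to operate on files directly; the operating system must do it on the application's behalf. The standard C library functions are the higher-level interface. Note, however, that for many small reads and writes they are usually also more efficient than calling the Linux functions directly: the standard C library implements a buffer for I/O, which reduces the number of slow accesses to external devices.
(Figure: how standard C library file I/O works)
  • Because of the buffer, standard C library I/O is not immediate, while Linux I/O is immediate in the sense that every call is handed to the kernel at once. Network communication, for example, needs the latter; in ordinary scenarios the former is usually the better choice.
(Figure: the relationship between standard C library file I/O and the kernel)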
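The buffering described above can be observed directly. The following is a minimal sketch, assuming a libc whose stdio honors a full-buffering request on a regular file; `buffered_write_demo` and `file_size` are hypothetical helper names introduced only for this illustration.

```c
#include <stdio.h>
#include <sys/stat.h>

// Hypothetical helper: size of `path` as the kernel sees it, or -1 on error.
static long file_size(const char *path) {
    struct stat st;
    return stat(path, &st) == 0 ? (long)st.st_size : -1;
}

// Writes 5 bytes through stdio and records the file size before and after
// fflush(). Returns 0 on success, -1 on error.
int buffered_write_demo(const char *path, long *before, long *after) {
    FILE *fp = fopen(path, "w");
    if (fp == NULL) return -1;
    setvbuf(fp, NULL, _IOFBF, BUFSIZ);  // ask explicitly for full buffering
    fputs("hello", fp);                 // lands in the user-space buffer
    *before = file_size(path);          // the kernel has not seen the data yet
    fflush(fp);                         // hands the buffer to the kernel via write()
    *after = file_size(path);
    fclose(fp);
    return 0;
}
```

Under these assumptions `before` should be 0 and `after` should be 5: the data only becomes visible to the kernel once the buffer is flushed.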

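Virtual Address Space

  • Every address obtained while working with files and memory is a virtual address. Physical addresses are invisible to user programs, and user programs do not need to care about them.
  • MMU (Memory Management Unit): implements the mapping between virtual and physical addresses and thereby serves the CPU's memory accesses. The virtual address mapping table (page table) belongs to each process and is reachable through its PCB.
  • The stack grows from high addresses toward low addresses; the heap grows from low addresses toward high addresses.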

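File Descriptors

  • A file descriptor describes an open file and is stored in the process's PCB (Process Control Block).
  • A process accesses an open file through its file descriptor.
  • The file descriptor table in the PCB has a default size of 1024 on typical Linux systems (the soft limit shown by "ulimit -n", which can be raised), so by default a process can have at most 1024 files open at the same time. A descriptor can be reused after its file is closed.
  • Opening the same file several times occupies several file descriptors.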
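The last two points can be checked with a few calls. This is a small sketch rather than a definitive test; `fd_reuse_demo` is a hypothetical name, and it assumes a single-threaded process so that no other code opens or closes descriptors in between.

```c
#include <fcntl.h>
#include <unistd.h>

// Opens `path` twice, closes the first descriptor, then opens once more.
// fds[0], fds[1]: descriptors from the first two opens;
// fds[2]: descriptor returned after fds[0] was closed.
// Returns 0 on success, -1 on error.
int fd_reuse_demo(const char *path, int fds[3]) {
    int a = open(path, O_RDONLY);
    if (a == -1) return -1;
    int b = open(path, O_RDONLY);      // same file, but a second descriptor
    if (b == -1) { close(a); return -1; }
    close(a);                          // frees the number held by a
    int c = open(path, O_RDONLY);      // open() picks the lowest free number
    if (c == -1) { close(b); return -1; }
    fds[0] = a;
    fds[1] = b;
    fds[2] = c;
    close(b);
    close(c);
    return 0;
}
```

With any readable path, `fds[0]` and `fds[1]` differ, and `fds[2]` equals `fds[0]` because the closed descriptor number is handed out again.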

Linux File Operation Functions

Opening a File

  • open
  • O_RDONLY, O_WRONLY, O_RDWR, O_APPEND, O_TRUNC, O_NONBLOCK, O_CREAT
  • umask: the requested mode is masked with the process umask before it is applied. The same masking applies to the other creation calls later in this article, such as mkdir.
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
// open a file
// pathname:
//     path of the file to open
// flags:
//     required, exactly one of:
//         O_RDONLY    read only,
//         O_WRONLY    write only,
//         O_RDWR      read and write.
//     optional, any number of (combined with bitwise OR):
//         O_APPEND    append mode,
//         O_TRUNC     truncate the file to length zero on open,
//         O_NONBLOCK  non-blocking mode,
//         O_CREAT     create the file if it does not exist.
// mode:
//     Required when the O_CREAT flag is set.
//     An octal number that determines the permissions of the new file.
//     It has three digits (such as 0664), for user - group - others.
//     Each digit expands to three binary bits, for read - write - execute.
//     The final mode is (mode & ~umask).
//     * umask:
//         A mask applied to the mode so that a new file is not given inappropriate permissions.
//         You can change the umask; run "$ umask" to show the current value.
//         A common default is 0022 or 0002 (it depends on the system), because letting others write is usually inappropriate.
// return value:
//     a file descriptor, or -1 on error
int open(const char *pathname, int flags);
int open(const char *pathname, int flags, mode_t mode);

// if the return value is -1, you can then call perror to print the error message
#include <stdio.h>
perror("open"); // For more: $ man 3 perror

// For more:
// $ man 2 open
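To make the interaction between mode and umask concrete, here is a minimal sketch. `create_with_umask` is a hypothetical helper; it sets the umask to 0022 itself so that the result does not depend on the caller's environment, and it uses stat (covered later in this article) to read the permissions back.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

// Creates `path` with a requested mode of 0666 under a umask of 0022 and
// returns the permission bits the file actually received, or -1 on error.
int create_with_umask(const char *path) {
    unlink(path);                       // make sure open() really creates the file
    mode_t old = umask(0022);           // umask() returns the previous mask
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0666);
    umask(old);                         // restore the caller's umask
    if (fd == -1) return -1;
    close(fd);

    struct stat st;
    if (stat(path, &st) == -1) return -1;
    return (int)(st.st_mode & 0777);    // keep only the permission bits
}
```

On an ordinary filesystem the function should return 0644, which is 0666 & ~0022: the write bits for group and others are removed by the mask.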

Closing a File

  • close
#include <unistd.h>
// close a file
// open files are closed automatically when the process exits
// fd:
//     file descriptor
// return value:
//     0 on success, -1 on error
int close(int fd);

// if the return value is -1, you can then call perror to print the error message
#include <stdio.h>
perror("close");

// For more:
// $ man 2 close

Reading a File

  • read
#include <unistd.h>
// read from a file
// fd:
//     file descriptor of a readable file
// buf:
//     the buffer that receives the data
// count:
//     maximum number of bytes to read in this call
// return value:
//     the number of bytes actually read, or -1 on error
//     It can be smaller than count when less data remains, and it is 0 at the end of the file.
ssize_t read(int fd, void *buf, size_t count);

// if the return value is -1, you can then call perror to print the error message
#include <stdio.h>
perror("read");

// For more:
// $ man 2 read

Writing a File

  • write
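#include <unistd.h>
// write to a file; the parameters mirror those of read
// returns the number of bytes actually written (possibly fewer than count), or -1 on error
ssize_t write(int fd, const void *buf, size_t count);

// if the return value is -1, you can then call perror to print the error message
#include <stdio.h>
perror("write");

// For more:
// $ man 2 write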
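Since both read and write may transfer fewer bytes than requested, they are normally called in loops. The sketch below copies one file to another under that assumption; `copy_file` is a hypothetical helper and the 4096-byte buffer size is an arbitrary choice.

```c
#include <fcntl.h>
#include <unistd.h>

// Copies `src` to `dst` using read() and write().
// Returns the number of bytes copied, or -1 on error.
long copy_file(const char *src, const char *dst) {
    int in = open(src, O_RDONLY);
    if (in == -1) return -1;
    int out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (out == -1) { close(in); return -1; }

    char buf[4096];
    long total = 0;
    ssize_t n;
    while ((n = read(in, buf, sizeof buf)) > 0) {   // 0 means end of file
        ssize_t done = 0;
        while (done < n) {                          // write() may be partial
            ssize_t w = write(out, buf + done, (size_t)(n - done));
            if (w == -1) { close(in); close(out); return -1; }
            done += w;
        }
        total += n;
    }
    close(in);
    close(out);
    return n == -1 ? -1 : total;                    // n == -1: read() failed
}
```

The inner loop retries until the whole chunk is written, and the outer loop stops when read returns 0 at the end of the file.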

File Offset

  • lseek
  • SEEK_SET, SEEK_CUR, SEEK_END
#include <sys/types.h>
#include <unistd.h>
// reposition the read / write file offset
// fd:
//     file descriptor
// offset:
//     offset relative to whence
// whence:
//     SEEK_SET for the beginning of the file, SEEK_CUR for the current file offset, SEEK_END for the end of the file
// return value:
//     the resulting absolute file offset, or -1 on error
off_t lseek(int fd, off_t offset, int whence);

// if the return value is -1, you can then call perror to print the error message
#include <stdio.h>
perror("lseek");

// For more:
// $ man 2 lseek
off_t offset = lseek(fd, 0, SEEK_CUR);  // get the current file offset
lseek(fd, offset, SEEK_SET);            // reposition to an absolute offset
lseek(fd, 0, SEEK_END);                 // reposition to the end of the file
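These three calls combine into a common idiom: measuring a file's size without disturbing the current position. The following is only a sketch; `size_via_lseek` and `lseek_demo` are hypothetical names, and the demo writes a fixed 11-byte string so the expected values are easy to follow.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

// Returns the size of the open file `fd` by seeking to its end, then puts
// the offset back where it was. Returns -1 on error.
off_t size_via_lseek(int fd) {
    off_t cur = lseek(fd, 0, SEEK_CUR);           // remember the position
    if (cur == (off_t)-1) return -1;
    off_t end = lseek(fd, 0, SEEK_END);           // the end offset is the size
    if (end == (off_t)-1) return -1;
    if (lseek(fd, cur, SEEK_SET) == (off_t)-1) return -1;
    return end;
}

// Usage sketch: writes 11 bytes, moves to offset 3, measures the file, and
// stores the offset observed afterwards in *pos_after.
off_t lseek_demo(const char *path, off_t *pos_after) {
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd == -1) return -1;
    off_t size = -1;
    *pos_after = -1;
    if (write(fd, "hello world", 11) == 11 && lseek(fd, 3, SEEK_SET) == 3) {
        size = size_via_lseek(fd);
        *pos_after = lseek(fd, 0, SEEK_CUR);
    }
    close(fd);
    return size;
}
```

In the demo the measured size is 11 and the offset is still 3 afterwards, since the helper restores it.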

Truncating a File

  • truncate
#include <unistd.h>
#include <sys/types.h>
// resize a file
// path:
//     the file path
// length:
//     new length of the file
//     if the file grows, the added part reads as null bytes ('\0'); if it shrinks, the data past the new length is lost
// return value:
//     0 on success, -1 on error
int truncate(const char *path, off_t length);

// if the return value is -1, you can then call perror to print the error message
#include <stdio.h>
perror("truncate");

// For more:
// $ man 2 truncate
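Both directions of resizing can be seen in one short run. This is a sketch under simple assumptions: `truncate_demo` is a hypothetical helper, the file content and the lengths 4 and 100 are arbitrary, and stat (covered later in this article) is used to read the size back.

```c
#define _POSIX_C_SOURCE 200809L  // request the declaration of truncate()
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

// Creates `path` with 11 bytes, shrinks it to 4 bytes, then grows it to 100.
// Stores the size after each step. Returns 0 on success, -1 on error.
int truncate_demo(const char *path, long *shrunk, long *grown) {
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd == -1) return -1;
    ssize_t w = write(fd, "hello world", 11);
    close(fd);
    if (w != 11) return -1;

    struct stat st;
    if (truncate(path, 4) == -1 || stat(path, &st) == -1) return -1;
    *shrunk = (long)st.st_size;   // only "hell" remains
    if (truncate(path, 100) == -1 || stat(path, &st) == -1) return -1;
    *grown = (long)st.st_size;    // bytes 4..99 now read as '\0'
    return 0;
}
```

After the call, `shrunk` is 4 and `grown` is 100.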

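Linux File Attribute Functions

  • In Linux everything is a file, so the functions below also work on special files such as directories.

Getting File Attributes

  • stat, lstat

  • Commonly used fields:

    st_mode: encoded file type and permissions.

    st_size: file size, in bytes.

    st_atime: time of last access.

    st_mtime: time of last modification of the content.

    st_ctime: time of last status change, that is, a change to the attributes.
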
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
// get file status
// pathname:
//     the path of the target file
// statbuf:
//     the buffer that receives the data
//     struct stat {
//         dev_t st_dev; /* ID of device containing file */
//         ino_t st_ino; /* Inode number */
//         mode_t st_mode; /* File type and mode */
//         nlink_t st_nlink; /* Number of hard links */
//         uid_t st_uid; /* User ID of owner */
//         gid_t st_gid; /* Group ID of owner */
//         dev_t st_rdev; /* Device ID (if special file) */
//         off_t st_size; /* Total size, in bytes */
//         blksize_t st_blksize; /* Block size for filesystem I/O */
//         blkcnt_t st_blocks; /* Number of 512B blocks allocated */
//
//         struct timespec st_atim; /* Time of last access */
//         struct timespec st_mtim; /* Time of last modification */
//         struct timespec st_ctim; /* Time of last status change */
//
//         #define st_atime st_atim.tv_sec
//         #define st_mtime st_mtim.tv_sec
//         #define st_ctime st_ctim.tv_sec
//     };
// return value:
//     0 on success, -1 on error
int stat(const char *pathname, struct stat *statbuf);
// like stat, but does not follow a symbolic link
int lstat(const char *pathname, struct stat *statbuf);

// if the return value is -1, you can then call perror to print the error message
#include <stdio.h>
perror("stat");

// For more:
// $ man 2 stat
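  • Difference between lstat and stat: when the path is a symbolic link, stat follows the link and returns the attributes of the file it points to; lstat returns the attributes of the link itself.
$ ln -s src.txt link.lnk  # create the symbolic link link.lnk -> src.txt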
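The difference shows up in the file type stored in st_mode. The sketch below is one way to check it; `link_kind_demo` is a hypothetical helper, and it relies on two things this article does not otherwise introduce: the symlink function (the C counterpart of "ln -s") and the S_ISLNK macro from <sys/stat.h>.

```c
#define _POSIX_C_SOURCE 200809L  // request declarations of lstat() and symlink()
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

// Creates a regular file `target` and a symbolic link `linkpath` to it, then
// records whether stat() and lstat() report the link path as a symbolic link.
// Returns 0 on success, -1 on error.
int link_kind_demo(const char *target, const char *linkpath,
                   int *stat_says_link, int *lstat_says_link) {
    unlink(linkpath);                             // clear leftovers; errors ignored
    unlink(target);
    int fd = open(target, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd == -1) return -1;
    close(fd);
    if (symlink(target, linkpath) == -1) return -1;

    struct stat followed, own;
    if (stat(linkpath, &followed) == -1) return -1;   // follows the link
    if (lstat(linkpath, &own) == -1) return -1;       // inspects the link itself
    *stat_says_link = S_ISLNK(followed.st_mode) ? 1 : 0;
    *lstat_says_link = S_ISLNK(own.st_mode) ? 1 : 0;
    return 0;
}
```

With absolute paths for both arguments, stat reports a regular file (0) while lstat reports a symbolic link (1).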

Checking File Access

  • access
#include <unistd.h>
// check the calling process's permissions for a file
// (the check uses the process's real user ID and group ID)
// pathname:
//     the file path
// mode:
//     F_OK to test whether the file exists,
//     or a bitwise OR of one or more of:
//         R_OK    read permission for the process,
//         W_OK    write permission for the process,
//         X_OK    execute permission for the process.
// return value:
//     0 if every requested permission is granted, -1 otherwise (permission denied or another error)
int access(const char *pathname, int mode);

// if the return value is -1, you can then call perror to print the error message
#include <stdio.h>
perror("access");

// For more:
// $ man 2 access
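A typical use is to separate "does not exist" from "exists but is not usable". The sketch below shows one way to do that; `probe_access` and its three return codes are conventions made up for this example.

```c
#include <unistd.h>

// Classifies `path` for the calling process:
//   -1  the path does not exist (or cannot be reached),
//    0  it exists but is not both readable and writable,
//    1  it exists and is readable and writable.
int probe_access(const char *path) {
    if (access(path, F_OK) == -1) return -1;       // existence check only
    return access(path, R_OK | W_OK) == 0 ? 1 : 0; // both bits must be granted
}
```

Keep in mind that the answer can be stale by the time the file is actually opened, so the result of open itself still has to be checked.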

Changing File Permissions

  • chmod
#include <sys/stat.h>
// change the mode of a file
// pathname:
//     the file path
// mode:
//     an octal number such as 0664
//     the umask is not applied here: chmod sets exactly the given mode
// return value:
//     0 on success, -1 on error
int chmod(const char *pathname, mode_t mode);

// if the return value is -1, you can then call perror to print the error message
#include <stdio.h>
perror("chmod");

// For more:
// $ man 2 chmod
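Whether the umask affects chmod is easy to get wrong, so here is a small sketch that checks it. `chmod_demo` is a hypothetical helper; it deliberately sets a restrictive umask of 0077 before calling chmod and then reads the permissions back with stat.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

// Creates `path`, sets a restrictive umask, changes the mode to 0664 and
// returns the permission bits found afterwards, or -1 on error.
int chmod_demo(const char *path) {
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd == -1) return -1;
    close(fd);

    mode_t old = umask(0077);        // would strip all group/other bits on creation
    int rc = chmod(path, 0664);      // chmod() does not consult the umask
    umask(old);                      // restore the caller's umask
    if (rc == -1) return -1;

    struct stat st;
    if (stat(path, &st) == -1) return -1;
    return (int)(st.st_mode & 0777);
}
```

The function returns 0664 even though the umask was 0077, which shows that the mask only matters when a file or directory is created.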

Changing File Ownership

  • chown
#include <unistd.h>
// change the owner of a file
// pathname:
//     the file path
// owner:
//     the UID of the new owner
// group:
//     the GID of the new group
// return value:
//     0 on success, -1 on error
int chown(const char *pathname, uid_t owner, gid_t group);

// if the return value is -1, you can then call perror to print the error message
#include <stdio.h>
perror("chown");

// For more:
// $ man 2 chown

Linux Directory Operation Functions

Changing the Working Directory

  • chdir
#include <unistd.h>
int chdir(const char *path);
// $ man 2 chdir

Getting the Working Directory

  • getcwd
#include <unistd.h>
char *getcwd(char *buf, size_t size);
// $ man 2 getcwd
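chdir and getcwd are usually used together. The following sketch switches to a directory, reads the working directory back and then returns to where it started; `cwd_demo` is a hypothetical helper and the 4096-byte buffer for the saved path is an assumption that is large enough for typical paths.

```c
#include <stddef.h>
#include <unistd.h>

// Switches to `dir`, stores the resulting working directory in `out`
// (at most `outsz` bytes), then switches back to the original directory.
// Returns 0 on success, -1 on error.
int cwd_demo(const char *dir, char *out, size_t outsz) {
    char saved[4096];                            // assumed large enough here
    if (getcwd(saved, sizeof saved) == NULL) return -1;
    if (chdir(dir) == -1) return -1;
    int ok = getcwd(out, outsz) != NULL;         // NULL if `out` is too small
    if (chdir(saved) == -1) return -1;           // restore the original directory
    return ok ? 0 : -1;
}
```

getcwd returns an absolute path, so calling the helper with "/" stores "/" in the output buffer.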

Creating a Directory

  • mkdir
#include <sys/stat.h>
#include <sys/types.h>
int mkdir(const char *pathname, mode_t mode);
// $ man 2 mkdir

Renaming a Directory

  • rename
#include <stdio.h>
// rename or move a file or directory
int rename(const char *oldpath, const char *newpath);
// $ man 2 rename

Removing a Directory

  • rmdir, unlink, remove
#include <unistd.h>
// remove an empty directory
int rmdir(const char *pathname);
// $ man 2 rmdir
#include <unistd.h>
// delete a file
int unlink(const char *pathname);
// $ man 2 unlink
#include <stdio.h>
// remove a file or an empty directory
int remove(const char *pathname);
// $ man 3 remove
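The create, rename and remove calls above fit together into a short life cycle. This is a sketch only; `dir_lifecycle` is a hypothetical helper and both path arguments are expected to name directories that the caller is free to create and delete.

```c
#include <stdio.h>      // rename()
#include <sys/stat.h>   // mkdir()
#include <sys/types.h>
#include <unistd.h>     // rmdir(), access()

// Creates directory `a`, renames it to `b`, verifies that the old name is
// gone, and finally removes `b`. Returns 0 on success, -1 on error.
int dir_lifecycle(const char *a, const char *b) {
    rmdir(a);                                    // clear leftovers; errors ignored
    rmdir(b);
    if (mkdir(a, 0755) == -1) return -1;         // final mode is 0755 & ~umask
    if (rename(a, b) == -1) { rmdir(a); return -1; }
    if (access(a, F_OK) == 0) { rmdir(b); return -1; }  // old name must be gone
    return rmdir(b);                             // works because b is empty
}
```

rmdir only succeeds on an empty directory, which is why the helper never puts anything inside it.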

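Traversing a Directory

  • opendir, readdir, seekdir, closedir

  • Commonly used fields:

    d_off: position of the entry in the directory stream

    d_name: file name

    d_reclen: length of this directory record (not the length of the file name)
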
#include <sys/types.h>
#include <dirent.h>
// open a directory
// name:
//     the directory path
// return value:
//     a pointer to DIR, or NULL on error
DIR *opendir(const char *name);
// $ man 3 opendir
#include <dirent.h>
// read one entry from a directory
// dirp:
//     a pointer to DIR returned by "opendir"
// return value:
//     the next entry, stored in a struct dirent; NULL at the end of the directory or on error
//     struct dirent {
//         ino_t d_ino; /* Inode number */
//         off_t d_off; /* Current position in the directory stream */
//         unsigned short d_reclen; /* Length of this record */
//         unsigned char d_type; /* Type of file; not supported by all filesystem types */
//         char d_name[256]; /* Null-terminated filename */
//     };
// As with reading a file, each call advances the position to the next entry. You can use "seekdir" to move the position.
struct dirent *readdir(DIR *dirp);
// $ man 3 readdir
#include <dirent.h>
// set the current position of the directory stream
// dirp:
//     a pointer to DIR returned by "opendir"
// loc:
//     the new position, a value previously returned by "telldir"
void seekdir(DIR *dirp, long loc);
// $ man 3 seekdir
#include <sys/types.h>
#include <dirent.h>
// close a directory
// dirp:
//     a pointer to DIR returned by "opendir"
// return value:
//     0 on success, -1 on error
int closedir(DIR *dirp);
// $ man 3 closedir
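Put together, a traversal is an opendir call, a readdir loop and a closedir call. The sketch below counts the entries of a directory; `count_entries` is a hypothetical helper, and skipping "." and ".." is a choice made for this example.

```c
#include <dirent.h>
#include <errno.h>
#include <string.h>
#include <sys/types.h>

// Counts the entries of directory `path`, not counting "." and "..".
// Returns the count, or -1 on error.
int count_entries(const char *path) {
    DIR *dir = opendir(path);
    if (dir == NULL) return -1;

    int count = 0;
    struct dirent *entry;
    errno = 0;                                   // lets us tell the end from an error
    while ((entry = readdir(dir)) != NULL) {     // each call advances one entry
        if (strcmp(entry->d_name, ".") == 0 || strcmp(entry->d_name, "..") == 0)
            continue;
        count++;
    }
    int failed = (errno != 0);                   // NULL with errno set means error
    closedir(dir);
    return failed ? -1 : count;
}
```

Because readdir returns NULL both at the end of the directory and on error, errno is cleared beforehand and inspected after the loop.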

Linux File Descriptor Operations

Duplicating a File Descriptor

  • dup, dup2
#include <unistd.h>
// duplicate a file descriptor
// oldfd:
//     old file descriptor
// return value:
//     the new file descriptor, or -1 on error
int dup(int oldfd);
// duplicate a file descriptor onto a chosen number
// oldfd:
//     old file descriptor
// newfd:
//     new file descriptor
// return value:
//     newfd, or -1 on error
// If newfd is already open, it is closed first.
// The copy refers to the same open file as oldfd and shares its offset and status flags, unlike opening the file a second time.
int dup2(int oldfd, int newfd);
// $ man 2 dup
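What a duplicated descriptor shares with the original is easiest to see through the file offset. In this sketch, `dup_offset_demo` is a hypothetical helper that compares a descriptor obtained with dup against one obtained from a second, independent open of the same file.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

// Writes 5 bytes through one descriptor, then records the offset seen
// through a dup() of it and through a separate open() of the same file.
// Returns 0 on success, -1 on error.
int dup_offset_demo(const char *path, off_t *via_dup, off_t *via_open) {
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd == -1) return -1;
    int copy = dup(fd);                  // shares the offset with fd
    int again = open(path, O_WRONLY);    // has an offset of its own
    int rc = -1;
    if (copy != -1 && again != -1 && write(fd, "hello", 5) == 5) {
        *via_dup = lseek(copy, 0, SEEK_CUR);
        *via_open = lseek(again, 0, SEEK_CUR);
        rc = 0;
    }
    if (copy != -1) close(copy);
    if (again != -1) close(again);
    close(fd);
    return rc;
}
```

The offset seen through the duplicate is 5, because it moved together with the original, while the independently opened descriptor still reports 0.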

Getting File Status Flags

  • fcntl: fcntl has five main functions, selected by the cmd parameter. Getting and setting the file status flags is one of them, and it is the only one covered here.
#include <unistd.h>
#include <fcntl.h>
// manipulate file descriptor: get file status flags
// fd:
//     file descriptor
// cmd:
//     F_GETFL: get file status flags
// return value:
//     the file status flags (access mode plus flags such as O_APPEND), or -1 on error
int fcntl(int fd, int cmd);
// $ man 2 fcntl

Setting File Status Flags

  • fcntl
#include <unistd.h>
#include <fcntl.h>
// manipulate file descriptor: set file status flags
// fd:
//     file descriptor
// cmd:
//     F_SETFL: set file status flags
// flags:
//     the new file status flags, usually the value from F_GETFL with bits added or removed
//     can be changed:
//         O_APPEND    append mode,
//         O_NONBLOCK  non-blocking mode.
//     ignored here (fixed when the file is opened):
//         the access mode O_RDONLY, O_WRONLY, O_RDWR,
//         creation flags such as O_CREAT and O_TRUNC.
// return value:
//     0 on success, -1 on error
int fcntl(int fd, int cmd, int flags);
// $ man 2 fcntl
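Getting and setting are normally combined into a read-modify-write sequence so that the existing flags are preserved. The sketch below adds O_NONBLOCK to a descriptor in that way; `set_nonblocking` is a hypothetical helper name.

```c
#include <fcntl.h>
#include <unistd.h>

// Adds O_NONBLOCK to the file status flags of `fd` while keeping the others.
// Returns 1 if the flag is set afterwards, 0 if it is not, -1 on error.
int set_nonblocking(int fd) {
    int flags = fcntl(fd, F_GETFL);                  // read the current flags
    if (flags == -1) return -1;
    if (fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1) return -1;
    flags = fcntl(fd, F_GETFL);                      // read back to confirm
    if (flags == -1) return -1;
    return (flags & O_NONBLOCK) != 0;
}
```

Passing only O_NONBLOCK to F_SETFL would clear flags such as O_APPEND, which is why the current value is fetched first and combined with bitwise OR.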