Skip to content

lab 1

约 4153 个字 338 行代码 预计阅读时间 18 分钟

运行指令

build and run: make and make qemu

clean 缓存:make clean

评测:make grade

tmux 复制

shift 选中文本,再 Ctrl + insert 复制,shift + insert 粘贴

git版本控制

查看本地仓库的 git 配置文件,可以看到 origin 主机名下已经有了对应的上游仓库地址

cd xv6-labs-2021/
cat .git/config

因此我们不要使用origin,可以使用其他主机名对应到github仓库,例如,我使用github

git remote add github 你的仓库地址
cat .git/config

将实验 \(1\) 用到的 util 分支推送到github

git checkout util
git add .
git commit -am "xxxxxxxx"
git push github util:util

system call

int fork() Create a process, return childs PID.
int exit(int status) Terminate the current process; status reported to wait(). No return.
int wait(int *status) Wait for a child to exit; exit status in *status; returns child PID.
int kill(int pid) Terminate process PID. Returns 0, or -1 for error.
int getpid() Return the current processs PID.
int sleep(int n) Pause for n clock ticks.
int exec(char *file, char *argv[]) Load a file and execute it with arguments; only returns if error.
char *sbrk(int n) Grow processs memory by n bytes. Returns start of new memory.
int open(char *file, int flags) Open a file; flags indicate read/write; returns an fd (file descriptor).
int write(int fd, char *buf, int n) Write n bytes from buf to file descriptor fd; returns n.
int read(int fd, char *buf, int n) Read n bytes into buf; returns number read; or 0 if end of file.
int close(int fd) Release open file fd.
int dup(int fd) Return a new file descriptor referring to the same file as fd.
int pipe(int p[]) Create a pipe, put read/write file descriptors in p[0] and p[1].
int chdir(char *dir) Change the current directory.
int mkdir(char *dir) Create a new directory.
int mknod(char *file, int, int) Create a device file.
int fstat(int fd, struct stat *st) Place info about an open file into *st.
int stat(char *file, struct stat *st) Place info about a named file into *st.
int link(char *file1, char *file2) Create another name (file2) for the file file1.
int unlink(char *file) Remove a file

sleep (easy)

Implement the UNIX program sleep for xv6; your sleep should pause for a user-specified number of ticks. A tick is a notion of time defined by the xv6 kernel, namely the time between two interrupts from the timer chip. Your solution should be in the file user/sleep.c.

Some hints:

  1. Before you start coding, read Chapter 1 of the xv6 book.
  2. Look at some of the other programs in user/ (e.g., user/echo.c, user/grep.c, and user/rm.c) to see how you can obtain the command-line arguments passed to a program.
  3. If the user forgets to pass an argument, sleep should print an error message.
  4. The command-line argument is passed as a string; you can convert it to an integer using atoi (see user/ulib.c).
  5. Use the system call sleep.
  6. See kernel/sysproc.c for the xv6 kernel code that implements the sleep system call (look for sys_sleep), user/user.h for the C definition of sleep callable from a user program, and user/usys.S for the assembler code that jumps from user code into the kernel for sleep.
  7. Make sure main calls exit() in order to exit your program.
  8. Add your sleep program to UPROGS in Makefile; once you've done that, make qemu will compile your program and you'll be able to run it from the xv6 shell.
  9. Look at Kernighan and Ritchie's book The C programming language (second edition) (K&R) to learn about C.

concept

fork

A process may create a new process using the fork system call. Fork gives the new process exactly the same memory contents (both instructions and data) as the calling process.

Fork returns in both the original and new processes. In the original process, fork returns the new process’s PID. In the new process, fork returns zero. The original and new processes are often called the parent and child.

exit

The exit system call causes the calling process to stop executing and to release resources such as memory and open files. Exit takes an integer status argument, conventionally 0 to indicate success and 1 to indicate failure.

The wait system call returns the PID of an exited (or killed) child of the current process and copies the exit status of the child to the address passed to wait; if none of the caller’s children has exited, wait waits for one to do so. If the caller has no children, wait immediately returns \(-1\). If the parent doesn’t care about the exit status of a child, it can pass a 0 address to wait.

exec

The exec system call replaces the calling process’s memory with a new memory image loaded from a file stored in the file system. The file must have a particular format, which specifies which part of the file holds instructions, which part is data, at which instruction to start, etc.

Xv6 uses the ELF format, which Chapter 3 discusses in more detail. When exec succeeds, it does not return to the calling program; instead, the instructions loaded from the file start executing at the entry point declared in the ELF header. Exec takes two arguments: the name of the file containing the executable and an array of string arguments.

Most programs ignore the first element of the argument array,which is conventionally the name of the program.

参考代码

echo.c
1  #include "kernel/types.h"
2  #include "kernel/stat.h"
3  #include "user/user.h"
4
5  int
6  main(int argc, char *argv[])
7  {
8    int i;
9
10    for(i = 1; i < argc; i++){
11      write(1, argv[i], strlen(argv[i]));
12      if(i + 1 < argc){                            
13        write(1, " ", 1);
14      } else {
15        write(1, "\n", 1);
16      }
17    }
18    exit(0);
19  }
rm.c
1  #include "kernel/types.h"
2  #include "kernel/stat.h"
3  #include "user/user.h"
4
5  int
6  main(int argc, char *argv[])
7  {
8    int i;
9
10    if(argc < 2){
11      fprintf(2, "Usage: rm files...\n");
12      exit(1);
13    }
14
15    for(i = 1; i < argc; i++){
16      if(unlink(argv[i]) < 0){
17        fprintf(2, "rm: %s failed to delete\n", argv[i]);
18        break;
19      }
20    }
21
22    exit(0);
23  }

sys_sleep

sys_sleep

user/usys.S

98  .global sleep
99  sleep:
100   li a7, SYS_sleep
101   ecall
102   ret

sleep.c

#include <kernel/types.h>
#include <kernel/stat.h>
#include <user/user.h>

int 
main(int argc, char *argv[])
{
    int n;
    if(argc < 2){
        fprintf(2, "Usage: argc error message...\n");
        exit(1);
    }

    n = atoi(argv[1]);
    sleep(n);
    exit(0);
}

问题一

出现的问题:

Alt text

解决方案:

这个命令和 make qemu 一样的,不是进入 qemu 模拟的 xv6 里去执行,要放在根目录那里执行

最终结果:

Alt text

pingpong (easy)

Write a program that uses UNIX system calls to ''ping-pong'' a byte between two processes over a pair of pipes, one for each direction. The parent should send a byte to the child; the child should print ": received ping", where is its process ID, write the byte on the pipe to the parent, and exit; the parent should read the byte from the child, print ": received pong", and exit. Your solution should be in the file user/pingpong.c.

Some hints:

  1. Use pipe to create a pipe.
  2. Use fork to create a child.
  3. Use read to read from the pipe, and write to write to the pipe.
  4. Use getpid to find the process ID of the calling process.
  5. Add the program to UPROGS in Makefile.
  6. User programs on xv6 have a limited set of library functions available to them. You can see the list in user/user.h; the source (other than for system calls) is in user/ulib.c, user/printf.c, and user/umalloc.c.

concept

xv6内核将文件描述符用作每个进程表的索引,以便每个进程从零开始拥有私有的文件描述符空间。按照惯例,进程从 file descriptor \(0\) (standard input),将输出写入 file descriptor \(1\) (standard output),并将错误消息写入 file descriptor \(2\) (standard error)。正如我们将看到的,shell 利用这一惯例来实现I/O重定向和管道。shell确保始终打开了三个文件描述符(user/sh.c:152),它们默认是控制台的文件描述符。

read

The call read(fd, buf, n) 最多从文件描述符 fd 中读取 \(n\) 个字节,将它们复制到 buf 中,并返回读取的字节数。每个与文件相关的文件描述符都有一个偏移量。read从当前文件偏移量读取数据,然后将偏移量增加已读取的字节数:随后的read将返回第一个read返回的字节之后的字节。当没有更多字节可读取时,read返回零,表示文件结束。

The call write(fd, buf, n) 从缓冲区 buf 向文件描述符 fd 写入 \(n\) 个字节,并返回写入的字节数。只有在发生错误时才会写入少于n个字节。类似于read,write将数据写入当前文件偏移量,并将偏移量增加写入的字节数:每次写入都从前一次写入的位置继续。

I/O redirection

文件描述符和fork相互作用,使得I/O重定向易于实现。fork 会将父进程的文件描述符表与其内存一起复制,因此子进程从父进程完全相同的打开文件开始。系统调用exec 会替换调用进程的内存,但保留其文件表。这种行为使得shell 可以通过fork,在子进程中重新打开选择的文件描述符,然后调用exec来运行新程序来实现I/O重定向。

open

open("input.txt", O_RDONLY);

The second argument to open consists of a set of flags, expressed as bits, that control what open does. The possible values are defined in the file control (fcntl) header (kernel/fcntl.h:1-5):O_RDONLY, O_WRONLY, O_RDWR, O_CREATE, and O_TRUNC, which instruct open to open the file for reading, or for writing, or for both reading and writing, to create the file if it doesn’t exist, and to truncate the file to zero length

forkexec

将fork和exec分开成为不同的调用是有帮助的:在这两者之间,shell有机会重定向子进程的I/O,而不会干扰主shell的I/O设置。我们可以想象一个假设的合并的forkexec系统调用,但是使用这样的调用进行I/O重定向的选项似乎很笨拙。shell可以在调用forkexec之前修改自己的I/O设置(然后撤消这些修改);或者forkexec可以将I/O重定向的指令作为参数传递;或者(最不理想的情况)每个类似cat的程序都可以被教导如何进行自己的I/O重定向。

Alt text

thanks to wait, runs only after the child is done

dup

The dup system call duplicates an existing file descriptor, returning a new one that refers to the same underlying I/O object. Both file descriptors share an offset, just as the file descriptors duplicated by fork do.

Alt text

图中的这个指令2>&1就是,将 标准错误输出(standard error) 的 file descriptor \(2\) 转变为 标准输出(standard output) 的 file descriptor \(1\) 的地址。

pipe

管道是一个小的内核缓冲区,它以文件描述符对的形式提供给进程,一个用于写操作,一个用于读操作。从管道的一端写的数据可以从管道的另一端读取。管道提供了一种进程间交互的方式。

接下来的示例代码运行了程序 wc,它的标准输出绑定到了一个管道的读端口。

int p[2];
char *argv[2];
argv[0] = "wc";
argv[1] = 0;
pipe(p);
if(fork() == 0) {
    close(0);
    dup(p[0]);
    close(p[0]);
    close(p[1]);
    exec("/bin/wc", argv);
} else {
    write(p[1], "hello world\n", 12);
    close(p[0]);
    close(p[1]);
}
The program calls pipe, which creates a new pipe and records the read and write file descriptors in the array p. After fork, both parent and child have file descriptors referring to the pipe. The child calls close and dup to make file descriptor zero(\(0\)) refer to the read end of the pipe, closes the file descriptors in p, and calls exec to run wc. When wc reads from its standard input, it reads from the pipe. The parent closes the read side of the pipe, writes to the pipe, and then closes the write side.

如果数据没有准备好,那么对管道执行的read会一直等待,直到有数据了或者其他绑定在这个管道写端口的描述符都已经关闭了。在后一种情况中,read 会返回 0,就像是一份文件读到了最后。读操作会一直阻塞直到不可能再有新数据到来了,这就是为什么我们在执行 wc 之前要关闭子进程的写端口。如果 wc 指向了一个管道的写端口,那么 wc 就永远看不到 eof 了。

example:

echo hello world | wc

可以用无管道的方式实现:

echo hello world > /tmp/xyz; wc < /tmp/xyz

但管道和临时文件起码有三个关键的不同点。

  1. 首先,管道会进行自我清扫,如果是 shell 重定向的话,我们必须要在任务完成后删除 /tmp/xyz
  2. 第二,管道可以传输任意长度的数据。
  3. 第三,管道允许同步:两个进程可以使用一对管道来进行二者之间的信息传递,每一个读操作都阻塞调用进程,直到另一个进程用 write 完成数据的发送。

pingpong.c

在使用管道时,不再需要使用PV(信号量)操作。管道本身提供了一种机制,用于在进程之间进行通信和同步。它允许一个进程将数据写入管道的写端,而另一个进程可以从管道的读端读取这些数据。

管道的读操作和写操作是阻塞的,这意味着如果没有数据可读,读取操作将会阻塞进程,直到有数据可用。同样,如果管道已满,写入操作将会阻塞进程,直到有空间可用。

因此,使用管道可以有效地实现进程之间的同步和通信,而不需要显式使用PV操作来进行信号量的管理。管道提供了一种自动的同步机制,确保数据在进程之间按照适当的顺序传递,并避免竞态条件和数据损坏的问题。

将题目大题翻译一遍就可以了,值得注意的是,用完读操作/写操作后,及时 close 掉,防止不加锁带来的冲突。

#include <kernel/types.h>
#include <kernel/stat.h>
#include <user/user.h>

int 
main(int argc, char *argv[])
{
    int p_child[2];
    int p_parent[2];
    char buffer[8]; // 一个字节
    pipe(p_child);
    pipe(p_parent);
    // p[0] is read,p[1] is write

    int pid = fork();

    if(pid == 0) { // child

        // close(p[1]);
        read(p_child[0],buffer,8);
        printf("%d: received ping\n",getpid());
        close(p_child[0]);

        // write the byte to the parents
        write(p_parent[1], "child", sizeof("child"));
        close(p_parent[1]);
        exit(0);
    } else { // parents
        // write the byte to the child
        write(p_child[1], "parent", sizeof("parent"));
        close(p_child[1]);

        read(p_parent[0],buffer,8);
        wait(0);
        printf("%d: received pong\n",getpid());

        close(p_parent[0]);
        exit(0);
    }

}

最终结果:

Alt text

primes (moderate)/(hard)

Write a concurrent(并发的) version of prime sieve using pipes. This idea is due to Doug McIlroy, inventor of Unix pipes. The picture halfway down this page and the surrounding text explain how to do it. Your solution should be in the file user/primes.c.

Your goal is to use pipe and fork to set up the pipeline. The first process feeds the numbers 2 through 35 into the pipeline. For each prime number, you will arrange to create one process that reads from its left neighbor over a pipe and writes to its right neighbor over another pipe. Since xv6 has limited number of file descriptors and processes, the first process can stop at 35.

Some hints:

  1. Be careful to close file descriptors that a process doesn't need, because otherwise your program will run xv6 out of resources before the first process reaches 35.
  2. Once the first process reaches 35, it should wait until the entire pipeline terminates, including all children, grandchildren, &c. Thus the main primes process should only exit after all the output has been printed, and after all the other primes processes have exited.
  3. Hint: read returns zero when the write-side of a pipe is closed.
  4. It's simplest to directly write 32-bit (4-byte) ints to the pipes, rather than using formatted ASCII I/O.
  5. You should create the processes in the pipeline only as they are needed.
  6. Add the program to UPROGS in Makefile.

concept

p = get a number from left neighbor
print p
loop:
    n = get a number from left neighbor
    if (p does not divide n)
        send n to right neighbor

Alt text

primes.cpp

值得注意的是,关闭通道之后就不能再用了,除非 exit() 释放后。

所以在刚开始,先把不用的通道关闭。

所有与该通道相关的命令执行完之后在关闭,并不是执行一个命令,关闭一次,关闭一次之后以后就写不进去了!!!

#include <kernel/types.h>
#include <kernel/stat.h>
#include <user/user.h>

int is_prime(int x) {
    for(int i = 2; i <= x/i; i ++) {
        if(x%i == 0) return 0;
    }
    return 1;
}
int 
main(int argc, char *argv[])
{

    int p_child[2];
    int p_parent[2];
    char buffer[32]; // 四个字节

    pipe(p_child);
    pipe(p_parent);

    if(fork() > 0) { // parents
        close(p_child[0]); 
        close(p_parent[1]); 

        for(int i = 2; i <= 35; i ++) {
            buffer[0] = i;
            write(p_child[1],buffer,1);
        }
        close(p_child[1]); 

        while(read(p_parent[0],buffer,1) != 0){
            printf("prime %d\n", (int)buffer[0]);
        }
        close(p_parent[0]);
        wait(0); // 等待所有子进程都完成
        exit(0); // 好习惯,最后return 0;

    } else { // child
        close(p_child[1]);
        close(p_parent[0]);

        while(read(p_child[0],buffer,1) != 0) {
            int n = (int)buffer[0];
            if(is_prime(n) == 1) {
                if(fork() == 0) {
                    close(p_parent[0]); 
                    write(p_parent[1],buffer,1);
                    close(p_parent[1]);
                    exit(0);
                } 
                wait(0); // 等待child 的子进程执行完毕。
            }
            if(n == 35) break;
        }
        close(p_child[0]);
        close(p_parent[1]);
        exit(0);
    }

}

最终结果

Alt text

find (moderate)

Write a simple version of the UNIX find program: find all the files in a directory tree with a specific name. Your solution should be in the file user/find.c.

Some hints:

  1. Look at user/ls.c to see how to read directories.
  2. Use recursion to allow find to descend into sub-directories.
  3. Don't recurse into "." and "..".
  4. Changes to the file system persist across runs of qemu; to get a clean file system run make clean and then make qemu.
  5. You'll need to use C strings. Have a look at K&R (the C book), for example Section 5.5.
  6. Note that == does not compare strings like in Python. Use strcmp() instead.
  7. Add the program to UPROGS in Makefile.

concept

fstat 可以获取一个文件描述符指向的文件的信息。它填充一个名为 stat 的结构体,它在 stat.h 中定义为:

fstat 感觉就是 file state.

#define T_DIR  1
#define T_FILE 2
#define T_DEV  3
// Directory
// File
// Device
     struct stat {
       short type;  // Type of file
       int dev;     // File system’s disk device
       uint ino;    // Inode number
       short nlink; // Number of links to file
       uint size;   // Size of file in bytes
};

文件名和这个文件本身是有很大的区别。同一个文件(称为 inode)可能有多个名字,称为连接 (links)。系统调用 link 创建另一个文件系统的名称,它指向同一个 inode。下面的代码创建了一个既叫做 a 又叫做 b 的新文件。

open("a", O_CREATE|O_WRONGLY);
link("a", "b");

读写 a 就相当于读写 b。每一个 inode 都由一个唯一的 inode 号 直接确定。在上面这段代码中,我们可以通过 fstat 知道 a 和 b 都指向同样的内容:a 和 b 都会返回同样的 inode 号(ino),并且 nlink 数会设置为 \(2\)

问题一

这里被卡了,要再敲一下回车,才能推出,不知道在那卡住了

Alt text

修改之前的代码:

if (strcmp(de.name, ".") != 0 && strcmp(de.name, ".." != 0)) {
    find(buf,name);
}

if (st.type == T_FILE)
{
    if (strcmp(de.name, name) == 0)
    {
        printf("%s\n", buf);
    }
}
else {
    if (strcmp(de.name, ".") == 0){}
    else if (strcmp(de.name, "..") == 0) {}
    else
    {
        find(buf, name);
    }
}

修改之后的代码:

switch(st.type){
case T_FILE:
    if (strcmp(de.name, name) == 0)
    {
        printf("%s\n", buf);
    }
    break;
case T_DIR:
    if (strcmp(de.name, ".") == 0){}
    else if (strcmp(de.name, "..") == 0) {}
    else
    {
        find(buf, name);
    }
    break;
}

Alt text

错误原因:

文件有三种状态:file,dir,device , 第一种写法少判断了 device,改过来就好啦!

if (st.type == T_FILE)
{
    if (strcmp(de.name, name) == 0)
    {
        printf("%s\n", buf);
    }
}
else if (st.type == T_DIR){
    if (strcmp(de.name, ".") == 0){}
    else if (strcmp(de.name, "..") == 0) {}
    else
    {
        find(buf, name);
    }
}

最终结果

Alt text

xargs (moderate)

Write a simple version of the UNIX xargs program: read lines from the standard input and run a command for each line, supplying the line as arguments to the command. Your solution should be in the file user/xargs.c.

Some hints:

  1. Use fork and exec to invoke the command on each line of input. Use wait in the parent to wait for the child to complete the command.
  2. To read individual lines of input, read a character at a time until a newline ('\n') appears.
  3. kernel/param.h declares MAXARG, which may be useful if you need to declare an argv array.
  4. Add the program to UPROGS in Makefile.
  5. Changes to the file system persist across runs of qemu; to get a clean file system run make clean and then make qemu.

concept

执行命令 echo "1\n2" | xargs -n 1 echo line 将产生以下输出:

line 1
line 2

让我们来解析这个命令:

  1. echo "1\n2" 将字符串 "1\n2" 打印到标准输出。其中 \n 表示换行符。
  2. |(管道)符号将上一个命令的输出作为输入传递给下一个命令。
  3. xargs -n 1 从输入中读取每一行,并将其作为参数传递给后续的命令,每次执行命令时最多只传递一个参数。
  4. echo line 是接收来自 xargs 的参数,并在每个参数后附加单词 "line" 的命令。然后将结果打印到标准输出。

因此,xargs 命令将从输入中获取每一行,该输入在这个例子中是 "1" 和 "2"(由换行符分隔),然后逐个将它们作为参数传递给 echo line。结果是,"line" 被附加到每个参数上,最终的输出显示为单独的行 "line 1" 和 "line 2"。

grep

grepGlobal Regular Expression Print 的简写。它是一个在 Unix 系统中广泛使用的命令行工具,其中 "global" 表示全局搜索。"global" 在这里表示 grep 命令会在整个文件中搜索匹配的行,而不仅仅是第一个匹配的行。默认情况下,grep 命令会打印出所有匹配的行,而不仅仅是第一个匹配的行。

除了打印匹配的行,grep 还具有其他选项和功能,例如可以指定匹配的模式是大小写敏感还是不敏感、递归搜索目录、显示匹配行的行号等。

grep 命令与正则表达式(Regular Expression)密切相关。正则表达式是一种强大的模式匹配工具,用于描述和匹配文本中的模式。

以下是一些常见的正则表达式元字符和用法:

  • .:匹配任意单个字符。
  • *:匹配零个或多个前面的字符。
  • +:匹配一个或多个前面的字符。
  • ?:匹配零个或一个前面的字符。
  • []:字符类,匹配方括号内的任意一个字符。
  • ():分组,将表达式中的一部分作为一个整体进行匹配。

例如,可以使用 grep 命令与正则表达式一起执行以下操作:

搜索包含特定单词的行:

grep "word" file.txt

忽略大小写搜索模式:

grep -i "pattern" file.txt

使用正则表达式进行更复杂的匹配:

grep "[0-9]+[A-Za-z]*" file.txt

find . b | xargs grep win

find . b | xargs grep win 可以搜索文件里面的内容,这是因为 xargs 命令会将 find 命令的输出作为参数传递给 grep 命令,并且 grep 命令会在这些文件中搜索匹配的行。

目录结构:a/b.cpp, b.cpp 内容为:win

  1. 执行 find . b 得到 ./a ./a/b.cpp

  2. | 将拿到的find 的标准输出 传给 xargs

  3. xargs 将标准输出用 \n 分割成多个参数,将 \n 替换为结束符 \0,并将这些参数传递给 grep 命令

  4. 最后实际执行:

    grep win ./a

    grep win ./a/b.cpp

输出如图所示

Alt text

整个命令执行过程如图所示

Alt text

xargs.c

Alt text

#include "kernel/types.h"
#include "kernel/stat.h"
#include "kernel/param.h"
#include "user/user.h"
#include "kernel/fs.h"


int 
main(int argc, char *argv[])
{ 
    // char *program = argv[1];
    // char *context = argv[4];

    char buf[26];
    // 拿到上一个命令的标准输出,即 xargs 的标准输入
    // 一直读,直到拿到为止
    int n = 0;
    int buf_idx = 0;
    while((n = read(0,&buf[buf_idx],sizeof(buf)) != 0)) {
        buf_idx += n;
    }

    // 存储每一个 Xargs 的参数
    char *xargv[MAXARG];
    int xargc = 0;
    for(int i = 1; i < argc; i ++,xargc ++) {
        xargv[xargc] = argv[i];
    }

    // 读 buf,碰到 \n 执行命令
    char *p = buf;
    for(int i = 0; i < 26; i ++) {
        if(buf[i] == '\n') {

            if(fork() > 0) {
                // 跳过上一个 \0
                p = &buf[i+1];
                wait(0);
            } else {
                buf[i] = '\0';
                // 把文件地址传进去
                xargv[xargc] = p;
                xargc ++;
                xargv[xargc] = "\0";
                xargc ++;
                exec(xargv[0],xargv);
            }
        }

    }

    exit(0);
}

最终结果

Alt text

Summary

好难啊,终于做完了 lab 1

Alt text


Last update: January 17, 2024
Created: January 7, 2024