6.S081 lab1 write up

2021-02-18 921 words 5 minutes

Preface

Having an overview of operating systems is widely considered important. However, due to my own laziness and the radio-focused subject I major in, I have hardly completed any courses on operating systems.

After the Spring Festival, feeling that I still knew little about operating systems and that my programming skills were slipping, I picked up the xv6 labs from MIT's 6.S081 course again, hoping that working through them would help.

The hardest part I ran into was not what to do, but how to do it and why doing so works.

Lab: Utils

In this lab, the task is to implement a few UNIX utilities. It is not that difficult. The declarations of all system calls can be found in user/user.h.

Always read the hints before starting on a task.

sleep

As a single program running on an OS, you cannot rely on yourself to sleep for an exact period of time. Instead of spinning in a busy loop, you should use a system call to get help from the kernel.

After referring to the xv6 book, I found a system call named sleep(). All I need to do next is wrap it in a user program.

How do you write a user program for xv6? By reading other programs, I got a rough idea.

All user programs start executing from their main() function. Arguments from the console are passed to the program in the string array argv, and argc is the number of arguments in argv.

# console input
sleep 114514

// The console parses the arguments and passes them to the program.
// Be aware that argv[0] is the program name.
int argc = 2;
char *argv[] = {"sleep", "114514"};

Programs can also take user input from stdin, whose fd is normally 0. It can be redirected so that the input comes from a file or from another program's output. In standard C, scanf() in stdio.h reads from it.

Functions like printf() write data to stdout, whose fd is normally 1. It can also be redirected to a file.

If there is an error in your program and you want to print a message about what is going on, you should print it to stderr, whose fd is normally 2. fprintf() is used for this purpose (in xv6, fprintf() takes the fd as its first argument).
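To make the fd numbers concrete, here is a minimal sketch in host-side POSIX C (not xv6 code). The helper name write_str is made up for illustration; STDOUT_FILENO and STDERR_FILENO are the standard names for fds 1 and 2 in unistd.h.

```c
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: write the whole string to fd, retrying on short
 * writes. Returns 0 on success, -1 on error. */
int write_str(int fd, const char *s)
{
    size_t left = strlen(s);
    while (left > 0) {
        ssize_t n = write(fd, s, left);
        if (n <= 0)
            return -1;
        s += n;
        left -= (size_t)n;
    }
    return 0;
}
```

Calling write_str(STDOUT_FILENO, "result\n") and write_str(STDERR_FILENO, "something went wrong\n") and then running the program with its output redirected to a file should send only the first line to the file, while the error message still shows up on the console.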

So what we need to do is wrap the system call sleep() in a program implemented in user/sleep.c and add it to the Makefile. The program should take its argument from argv.
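The argument handling can be sketched in plain C as follows. This is not the actual user/sleep.c: parse_ticks is a hypothetical helper that compiles on a normal host, and the xv6 program would then pass the result to sleep() and call exit(), both declared in user/user.h.

```c
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: validate the command line and return the tick
 * count, or -1 if the argument is missing or is not a non-negative
 * number that fits in an int. */
int parse_ticks(int argc, char *argv[])
{
    if (argc != 2) {
        fprintf(stderr, "usage: sleep ticks\n");
        return -1;
    }
    char *end;
    long n = strtol(argv[1], &end, 10);
    if (end == argv[1] || *end != '\0' || n < 0 || n > INT_MAX) {
        fprintf(stderr, "sleep: invalid tick count: %s\n", argv[1]);
        return -1;
    }
    return (int)n;
}
```

Inside main(), the idea would be to call parse_ticks(argc, argv), exit with a non-zero status if it returns -1, and otherwise hand the value to the sleep() system call.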

pingpong & primes

As far as I know, there are two main ways for processes to communicate:

Share memory to communicate

Use a buffer or file that both the sender and the receiver can access. However, this usually leads to troublesome issues such as data races, which you have to handle carefully. The code that accesses the shared buffer/file is called a critical section, and mechanisms like mutexes and semaphores exist to prevent data races and dirty data.

Communicate with pipes or channels

Using pipes is much easier than dealing with shared memory, in my view. When a process reads from an empty pipe, it blocks until new data is written to the pipe or every write end of the pipe is closed.
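The blocking and end-of-file behaviour of a pipe can be seen with a small host-side POSIX C sketch (the function name count_until_eof is made up for illustration). read() returns 0 only after every write end has been closed, which is why forgetting to close an unused write end makes a reader hang forever.

```c
#include <unistd.h>

/* Count the bytes readable from fd until end-of-file.
 * read() blocks while the pipe is empty and a write end is still open;
 * it returns 0 once all write ends are closed. Returns -1 on error. */
long count_until_eof(int fd)
{
    char buf[64];
    long total = 0;
    ssize_t n;
    while ((n = read(fd, buf, sizeof buf)) > 0)
        total += n;
    return n < 0 ? -1 : total;
}
```

A typical use is to create a pipe, write a few bytes into the write end, close that end, and then call count_until_eof() on the read end.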

Pipes are provided by the OS, and some modern languages like Go have language-level support for a similar mechanism (channels, declared with chan in Go).

As for pingpong, it can be solved with 2 processes connected by 2 pipes, but be careful: if the 2 processes call printf() at the same time, their output will be interleaved.

graph LR
    parent{{parent}} -- ping --> child{{child}}
    child -- pong --> parent
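The two-pipe arrangement can be sketched as a host-side POSIX C function. This is an illustration rather than my lab solution: the name pingpong and the return convention are assumptions, and the xv6 version would use the calls from user/user.h instead. The child prints before it replies and the parent prints only after the reply arrives, so the two messages cannot interleave.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Send one byte parent -> child -> parent. Returns 0 on success. */
int pingpong(void)
{
    int to_child[2], to_parent[2];
    char byte = 'x';

    if (pipe(to_child) < 0 || pipe(to_parent) < 0)
        return -1;

    fflush(stdout);               /* do not duplicate buffered output in the child */
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {               /* child: wait for ping, then answer */
        close(to_child[1]);
        close(to_parent[0]);
        if (read(to_child[0], &byte, 1) != 1)
            _exit(1);
        printf("%d: received ping\n", (int)getpid());
        fflush(stdout);           /* print before replying */
        if (write(to_parent[1], &byte, 1) != 1)
            _exit(1);
        _exit(0);
    }

    /* parent: send ping, then block until pong arrives */
    close(to_child[0]);
    close(to_parent[1]);
    if (write(to_child[1], &byte, 1) != 1)
        return -1;
    if (read(to_parent[0], &byte, 1) != 1)
        return -1;
    printf("%d: received pong\n", (int)getpid());
    close(to_child[1]);
    close(to_parent[0]);

    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```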

As for primes, things get a little tricky. However, the link in the problem description almost gives away the answer.

I completed the task by implementing a void subprime(int* r_end) function, which works as shown below:

I will change the graph’s engine into flowchart.js someday later…

graph TD
    A["void subprime(int* read_end)"] --> B["int div <- read_end; print div"]
    B --> C1{"Is pipe's write end closed?"}
    C1 -- yes --> G["return"]
    C1 -- no --> C["int following <- read_end;"]
    C --> D["pipe p; fork: subprime(p[READEND]);"]
    D --> E{"following % div == 0 ?"}
    E -- yes --> C1
    E -- no --> F["p[WRITEEND] <- following"] --> C1
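For comparison, here is a host-side POSIX C sketch of the same pipeline idea. It is not my lab code: the names sieve and primes and the out_fd parameter are assumptions made so the result can be captured, and this version creates the next stage only once, when the first number that survives the filter arrives.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* One stage: the first number read is prime; the remaining numbers are
 * filtered and forwarded to a child stage created on demand. */
static void sieve(int in_fd, int out_fd)
{
    int div, n;
    int next[2] = {-1, -1};
    pid_t child = -1;
    char line[32];

    if (read(in_fd, &div, sizeof div) != (ssize_t)sizeof div) {
        close(in_fd);
        return;
    }
    int len = snprintf(line, sizeof line, "prime %d\n", div);
    if (write(out_fd, line, (size_t)len) != len) {
        close(in_fd);
        return;
    }
    while (read(in_fd, &n, sizeof n) == (ssize_t)sizeof n) {
        if (n % div == 0)
            continue;                     /* multiple of our prime: drop it */
        if (child < 0) {                  /* first survivor: start next stage */
            if (pipe(next) < 0)
                break;
            child = fork();
            if (child < 0) {
                close(next[0]);
                close(next[1]);
                break;
            }
            if (child == 0) {
                close(next[1]);           /* child only reads */
                close(in_fd);
                sieve(next[0], out_fd);
                _exit(0);
            }
            close(next[0]);               /* parent only writes */
        }
        if (write(next[1], &n, sizeof n) != (ssize_t)sizeof n)
            break;
    }
    close(in_fd);
    if (child > 0) {
        close(next[1]);                   /* lets the child see end-of-file */
        waitpid(child, NULL, 0);
    }
}

/* Feed 2..limit into the first stage. Returns 0 on success. */
int primes(int limit, int out_fd)
{
    int feed[2];
    if (pipe(feed) < 0)
        return -1;
    pid_t pid = fork();
    if (pid < 0) {
        close(feed[0]);
        close(feed[1]);
        return -1;
    }
    if (pid == 0) {
        close(feed[1]);
        sieve(feed[0], out_fd);
        _exit(0);
    }
    close(feed[0]);
    for (int i = 2; i <= limit; i++)
        if (write(feed[1], &i, sizeof i) != (ssize_t)sizeof i)
            break;
    close(feed[1]);
    waitpid(pid, NULL, 0);
    return 0;
}
```

Closing every unused pipe end matters here: a stage sees end-of-file only when no process still holds the write end of its input pipe, and xv6 has few file descriptors to spare.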

find

Just by adapting user/ls.c, you can get a reasonably good find program.

Files and directories in Linux (and in xv6) are represented by inodes. Most file system layouts look like a tree, with files and directories as its nodes.

graph TD
    r --> run["run"]
    r["/"] --> usr["usr"]
    r --> var["var"]
    r --> rot["other files/dirs..."]
    usr --> lib["lib"]
    usr --> bin["bin"]
    usr --> uot["other files/dirs..."]
    bin --> grep{{grep}}
    bin --> cat{{cat}}
    bin --> bot["other files/dirs..."]

To list the entries of a directory, open() the directory path to get an fd and call read() on it repeatedly. Each read returns one struct dirent, the data structure that describes a child file or directory.

Since the task is a find program, you should call the same function on each child directory, so that the program searches recursively. Do not recurse into . and .., or the search never terminates.
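The recursion can be sketched in host-side POSIX C. Note the difference from xv6: on a normal host you cannot read() a directory fd directly, so this sketch uses opendir() and readdir() instead of the xv6 approach described above. The function name find and its return value (the number of matches) are assumptions for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Print every path under dir whose last component equals name.
 * Returns the number of matches found. */
int find(const char *dir, const char *name)
{
    DIR *d = opendir(dir);
    if (d == NULL) {
        fprintf(stderr, "find: cannot open %s\n", dir);
        return 0;
    }
    int matches = 0;
    struct dirent *de;
    while ((de = readdir(d)) != NULL) {
        /* Skip "." and "..", otherwise the recursion never ends. */
        if (strcmp(de->d_name, ".") == 0 || strcmp(de->d_name, "..") == 0)
            continue;
        char path[1024];
        int n = snprintf(path, sizeof path, "%s/%s", dir, de->d_name);
        if (n < 0 || (size_t)n >= sizeof path)
            continue;                 /* path too long for this sketch */
        if (strcmp(de->d_name, name) == 0) {
            printf("%s\n", path);
            matches++;
        }
        struct stat st;
        /* lstat() does not follow symlinks, which avoids link loops. */
        if (lstat(path, &st) == 0 && S_ISDIR(st.st_mode))
            matches += find(path, name);
    }
    closedir(d);
    return matches;
}
```

For example, find(".", "README") would print the path of every entry named README below the current directory.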

xargs

In fact, this task is pretty simple, but the way xargs works turns out to be a little hard to understand.

The way xargs works can be summarized in Python-like pseudocode like this:

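import os
import sys

# sys.argv[0] is "xargs", which should be dropped.
command = sys.argv[1:]
for line in sys.stdin.read().splitlines():
    arg_full = command + [line]  # the input line becomes an extra argument
    if os.fork() == 0:
        os.execvp(arg_full[0], arg_full)  # child: replace itself with the command
    os.wait()  # parent: wait for the child before handling the next line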
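The same idea can be sketched in host-side POSIX C, which is closer to what the lab asks for. This is an illustration under assumptions, not my submitted solution: xargs_lines and MAX_ARGS are hypothetical names, each input line is appended as a single extra argument, and lines longer than the buffer are not handled properly.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_ARGS 32   /* hypothetical limit for this sketch */

/* Run cmd[0..ncmd-1] once per input line, with the line appended as the
 * last argument. Returns how many commands exited with status 0,
 * or -1 on a setup error. */
int xargs_lines(FILE *in, int ncmd, char *cmd[])
{
    char line[512];
    char *args[MAX_ARGS + 2];
    int ok = 0;

    if (ncmd < 1 || ncmd > MAX_ARGS)
        return -1;
    memcpy(args, cmd, (size_t)ncmd * sizeof *cmd);

    while (fgets(line, sizeof line, in) != NULL) {
        line[strcspn(line, "\n")] = '\0';   /* strip the trailing newline */
        if (line[0] == '\0')
            continue;                       /* ignore empty lines */
        args[ncmd] = line;
        args[ncmd + 1] = NULL;              /* exec needs a NULL terminator */

        fflush(stdout);
        pid_t pid = fork();
        if (pid < 0)
            return -1;
        if (pid == 0) {
            execvp(args[0], args);          /* only returns on failure */
            perror("exec");
            _exit(127);
        }
        int status;
        if (waitpid(pid, &status, 0) == pid &&
            WIFEXITED(status) && WEXITSTATUS(status) == 0)
            ok++;
    }
    return ok;
}
```

In a real xargs program, main() would pass stdin together with argc - 1 and argv + 1, which is how argv[0] ("xargs") gets dropped.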

Conclusion

Working on the labs did give me a rough impression of concurrent programming, file systems and OS programming. There is still much work to be done.

My lab repository is -> here.