Ish Shell

1. The main loop

As a first step, implement your shell’s main loop. Write a program which provides a prompt, reads user input and then writes it back to stdout. This is an opportunity to make one of the more important decisions in the exercise: what your prompt should look like! For our shell (snowshell), the interaction looks a little like:

⛄ hi
hi
⛄ hello
hello

GNU has some nice C docs. Rather than making things too hard on myself with getc or gets, I’ll use getline.

Before calling getline, you should place in lineptr the address of a buffer n bytes long, allocated with malloc.

allocate a buffer for the line of user input

size_t* bufsize = malloc(sizeof(size_t));
*bufsize = 80;
/* linebuf holds one pointer; *linebuf is the actual buffer of *bufsize bytes */
char** linebuf = malloc(sizeof(char*));
*linebuf = malloc(*bufsize);

Used by 1

When getline is successful, it returns the number of characters read (including the newline, but not including the terminating null). This value enables you to distinguish null characters that are part of the line from the null character inserted as a terminator.

allocate a buffer for the line of user input +=

ssize_t chars_read;
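To make the getline contract quoted above concrete, here is a minimal sketch. The helper name `read_one_line` is hypothetical and not part of ish; it only wraps getline so the buffer handling and the return value are easy to see.

```c
#define _POSIX_C_SOURCE 200809L  /* exposes getline() on POSIX systems */
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical helper: read one line from `in` and return its length
 * (newline included, terminating null excluded), or -1 on EOF or error.
 * The line is stored in *line, which getline allocates or grows. */
ssize_t read_one_line(FILE *in, char **line, size_t *cap) {
    /* If *line is NULL and *cap is 0, getline mallocs a buffer for us.
     * Otherwise *line must point to a malloc'd buffer of *cap bytes. */
    return getline(line, cap, in);
}
```

A caller would start with `char *line = NULL; size_t cap = 0;`, reuse the same pair for every call, and `free(line)` once at the end.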

Now, in the main loop, we’ll just read a line then echo it back.

Function: size_t fwrite (const void *data, size_t size, size_t count, FILE *stream) Preliminary: | MT-Safe | AS-Unsafe corrupt | AC-Unsafe lock corrupt | See POSIX Safety Concepts. This function writes up to count objects of size size from the array data, to the stream stream. The return value is normally count, if the call succeeds. Any other value indicates some sort of error, such as running out of space.
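As a rough illustration of that return-value check, the sketch below echoes a line with fwrite. The helper `echo_line` is a hypothetical name used only for this example.

```c
#include <stdio.h>

/* Hypothetical helper: write `len` bytes of `line` to `out`.
 * Returns 0 on success, -1 if fwrite wrote fewer objects than requested. */
int echo_line(const char *line, size_t len, FILE *out) {
    /* size = 1 and count = len, so the return value is a byte count. */
    size_t written = fwrite(line, 1, len, out);
    return written == len ? 0 : -1;
}
```

Passing the length returned by getline as `len` writes the line back exactly as it was read, embedded null characters included.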

includes

#include <string.h>

Used by 1

main loop

for (;;) {
    @{print prompt}
    chars_read = getline(linebuf, bufsize, stdin);
    @{handle EOF}
    @{strip newline from linebuf}
    pid_t pid = fork();
    @{handle fork error}
    if (pid == 0) {
        /* Child: by convention argv[0] is the program name. */
        execvp(*linebuf, (char* const []){*linebuf, NULL});
        /* execvp returns only on failure; save errno before printf can change it. */
        int err = errno;
        printf("%s: command not found\n", *linebuf);
        exit(err);
    } else {
        /* Parent: block until the child finishes. */
        int status;
        wait(&status);
    }
}

Used by 1

The linebuf will have a newline at the end. The command that we want to use for our subprocess needs that newline removed.

strip newline from linebuf

for (char* c = *linebuf; *c != '\0'; c++) {
    if (*c == '\n')
        *c = '\0';
}

Used by 1
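An alternative to the loop above, shown here only as a sketch, is to let strcspn from <string.h> find the newline. The helper name `chomp` is hypothetical.

```c
#include <string.h>

/* Hypothetical helper: cut `line` at its first newline, if any. */
void chomp(char *line) {
    /* strcspn returns the index of the first '\n', or the string's
     * length when there is none, so the write is always in bounds. */
    line[strcspn(line, "\n")] = '\0';
}
```

In the main loop this would be called as `chomp(*linebuf);`.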

print prompt

printf("⛄ ");
fflush(stdout);  /* the prompt has no newline, so flush it before reading or forking */

Used by 1

From the line-input section of the GNU libc docs:

If an error occurs or end of file is reached without any bytes read, getline returns -1.

This is a little awkward because the return value alone can’t tell us whether we hit an error or EOF (calling feof or ferror on the stream afterwards can). For now, I can’t think of any error that we’d reasonably encounter, so it’s pragmatic to treat every -1 as EOF.

handle EOF

if (chars_read == -1) {
    printf("❄❅❄❅ Goodbye and stay warm! ❄❅❄❅\n");
    exit(0);  /* EOF is a normal way to leave the shell */
}

Used by 1
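If the difference between EOF and a read error ever matters, the stream's own state can be checked after getline returns -1. The sketch below assumes nothing beyond standard stdio; the enum and the helper `read_line_status` are hypothetical names, not part of ish.

```c
#define _POSIX_C_SOURCE 200809L  /* exposes getline() on POSIX systems */
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

enum read_status { READ_OK, READ_EOF, READ_ERROR };

/* Hypothetical helper: read a line and classify the outcome.
 * The caller owns *line and releases it with free(). */
enum read_status read_line_status(FILE *in, char **line, size_t *cap) {
    ssize_t n = getline(line, cap, in);
    if (n != -1)
        return READ_OK;
    /* getline returned -1: ask the stream which case it was. */
    return ferror(in) ? READ_ERROR : READ_EOF;
}
```

A shell could then print the goodbye message for READ_EOF and call perror for READ_ERROR.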

/src/main.c

@{includes}

int main(int argc, char** argv) {
    @{parse arguments}
    @{handle using ish to execute a command or script}
    @{handle using ish as a REPL}
}

handle using ish as a REPL

@{allocate a buffer for the line of user input}
@{main loop}

Used by 1

handle using ish to execute a command or script

if (command_string != NULL) {
    @{fork and exec a command}
    @{handle fork error}
}

Used by 1

includes +=

#include <stdio.h>
#include <stdlib.h>

2. Using `ish` to execute a command

You may also wish to structure your code such that your shell can either be run as a REPL or execute a single command (e.g. bash -c 'ls'), as this may help you with testing.

From the GNU libc docs (lots of love for these docs), it looks like we have two reasonable options for parsing arguments.

If the syntax for the command line arguments to your program is simple enough, you can simply pick the arguments off from argv by hand. But unless your program takes a fixed number of arguments, or all of the arguments are interpreted in the same way (as file names, for example), you are usually better off using getopt (see Parsing program options using getopt) or argp_parse (see Parsing Program Options with Argp) to do the parsing.

getopt is more standard (the short-option only version of it is a part of the POSIX standard), but using argp_parse is often easier, both for very simple and very complex option structures, because it does more of the dirty work for you.

And from StackOverflow, we learn that getopt is part of the POSIX standard.

argp may be more flexible / powerful / etc, but getopt is part of the POSIX standard. Thats a choice you’ve to make based on whether you expect your program to be portable.

After reading the docs for each, I’m going to go with getopt. argp looks sweet. It does a lot for you. But it involves a lot of boilerplate and is really verbose for my simple needs. I can always refactor later.

includes +=

#include <unistd.h>

parse arguments

int c;
char* command_string = NULL;
while ((c = getopt (argc, argv, "c:")) != -1) {
    switch (c) {
        case 'c':
            command_string = optarg;
            break;
    }
}

Used by 1

3. Running a subprocess

Our shell isn’t very useful unless it can run a child process for us. As a next goal, have your shell fork and exec a simple command with no arguments such as ls or pwd. Once you have done this, extend it to also support simple arguments. Your shell should obviously be able to run one program, wait for it to complete, then run another, like so:

⛄ whoami
ozan
⛄ echo hello
hello
⛄ ^D
❄❅❄❅ Goodbye and stay warm! ❄❅❄❅

You should also gracefully handle cases where the command is not found:

⛄ a
🌲⛷  a: command not found
⛄ b
🌲⛷  b: command not found
⛄ ^D
❄❅❄❅ Goodbye and stay warm! ❄❅❄❅

man 2 fork has all the info we need.

On success, the PID of the child process is returned in the parent’s thread of execution, and a 0 is returned in the child’s thread of execution. On failure, a -1 will be returned in the parent’s context, no child process will be created, and errno will be set appropriately.

fork and exec a command

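pid_t pid = fork();
@{handle fork error}
if (pid == 0) {
    /* Child: execvp replaces this process and returns only on failure. */
    execvp(command_string, (char* const []){command_string, NULL});
    int err = errno;  /* save errno before perror can change it */
    perror(command_string);
    exit(err);
} else {
    @{wait for child process to finish}
}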

Used by 1
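To support the simple arguments mentioned at the start of this section, the input line has to be split into an argv array before it is handed to execvp. The following is a minimal sketch under stated assumptions: it splits on spaces and tabs only, has no quoting or escaping, and the names `split_args` and `MAX_ARGS` are hypothetical.

```c
#include <string.h>

#define MAX_ARGS 64  /* arbitrary limit chosen for this sketch */

/* Hypothetical helper: split `line` in place on spaces and tabs, filling
 * argv with pointers into it. Returns the argument count; argv is always
 * NULL-terminated, as execvp requires. */
int split_args(char *line, char *argv[MAX_ARGS]) {
    int argc = 0;
    for (char *tok = strtok(line, " \t");
         tok != NULL && argc < MAX_ARGS - 1;
         tok = strtok(NULL, " \t")) {
        argv[argc++] = tok;
    }
    argv[argc] = NULL;
    return argc;
}
```

After a successful split with a non-zero count, the child would call `execvp(argv[0], argv)`. Because strtok writes null characters into the line, the buffer must be writable, which is the case for the buffer getline fills.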

See this opengroup.org page for a description of wait vs waitpid.

The waitpid() function is provided for three reasons:
    To support job control
    To permit a non-blocking version of the wait() function
    To permit a library routine, such as system() or pclose(), to wait for its children without interfering with other terminated children for which the process has not waited

includes +=

#include <sys/wait.h>

wait for child process to finish

int status;
wait(&status);
/* Report the child's exit code rather than the raw wait status. */
return WIFEXITED(status) ? WEXITSTATUS(status) : 1;

Used by 1
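For comparison, here is a sketch of the same step using waitpid, which waits for one specific child, together with the status macros from <sys/wait.h>. The helper `run_and_wait` and the shell-style return codes (127 for a failed exec, 128 + N for signal N) are assumptions made for this example, not something ish defines.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] with arguments argv and return a
 * shell-style exit code: the child's own exit status, 128 + N if it was
 * killed by signal N, or -1 if fork or waitpid failed. */
int run_and_wait(char *const argv[]) {
    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0) {
        execvp(argv[0], argv);
        _exit(127);  /* only reached if execvp failed */
    }
    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    if (WIFEXITED(status))
        return WEXITSTATUS(status);
    if (WIFSIGNALED(status))
        return 128 + WTERMSIG(status);
    return -1;
}
```

The raw `status` filled in by wait or waitpid packs several fields together, so the macros are the portable way to pull out the exit code.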

includes +=

#include <errno.h>

handle fork error

if (pid == -1) {
    perror("error forking");
    exit(errno);
}

Used by 1 2 3

4. Build and dev tools

The next two snippets are small and I haven’t taken the time to make them literate, but I wanted to include them here for completeness.

Build

/Makefile

CC = clang
CFLAGS = -g -Og -Wall

all: ish docs/ish.html docs/ish.md

ish: src/main.c
    $(CC) $(CFLAGS) -o ish src/main.c

src/main.c docs/ish.html: ish.lit
    srcweave --tangle . --weave docs --formatter srcweave-format $<

docs/ish.md: docs/ish.html
    pandoc --from html --to gfm docs/ish.html -o docs/ish.md

init: FORCE
    srcweave-format-init -m docs
FORCE:

Dev

/watch.sh

#!/bin/sh
while inotifywait -e modify ish.lit; do make all; done