Sunday, January 14, 2007

Stack-less multi-task system

In traditional multi-task operating system, there is one stack for one task. The stack is mainly used for storing call chain, function parameters and local variables, that is, the context of the task. When task is switched, the stack of task is switched too.

The problem of stack based multi-task system is the size of stack should be the maximum possible size that the stack maybe grow to. In Linux system, the size of kernel task stack is 4kB or 8kB. But, in 99% condition, only small part of whole stack is used. The efficiency of stack memory usage is low.

Another solution is that all tasks share the same stack. When task is executed, the stack is used for storing call chain, function parameters and local variables as that of task based multi-task. When task is about to be switched out of CPU, it saves all state (context) needed to continue the work elsewhere (such as global variable, malloced memory) and hands the stack to the task to be switched into CPU. The new task will ruin all the contents in the stack and use it exclusively until the next switching.

The problem of stack-less multi-task system is that when switching point is deeply nested, it is hard to storing all state needed in a modular way. For example, when a "page fault" occurs in a memory mapped page, the handler will go through memory subsystem and file system and maybe be switched out of CPU for waiting for disk reading. How can the task save the state of memory subsystem and file system in a modular way? So, the stack-less multi-task system is used mainly in the kernel of micro-kernel operating system, because there are only few simple subsystems, the state saving can be done in a modular or manageable way.

A resolution is adaptive stack scheme. If task is working for a system call that have very small stack usage (such as the wait in POSIX), all the state needed is stored elsewhere and the stack is freed, otherwise the stack is preserved. But this resolution is not very effective if most system calls use more stack.

I think another possible resolution is split task into more small tasks. Because every task is so small, its state can be stored without refer to other subsystem, that is in a modular way. And because the overhead of task is much less than before, the total overhead is not too much. Perhaps one small task corresponds one subsystem in traditional system. For example, the memory system, file system can be tasks too. In this system, create a process turns to create several tasks.

No comments: