Arm Cortex-M Context Switching Part 1

In this example we are going to implement a basic framework to handle multiple concurrent tasks, i.e. task (or context) switching.

A task is defined by an area of memory for its stack and the current state of its registers. If we wish to halt a task and run another, we must first save the state of the currently running task and then restore the state of the task about to be run.

The sequence we are going to use to perform task switching is as follows:

  1. SysTick exception is triggered
  2. The CPU pushes registers psr, pc, lr, r12, r3-r0 onto the process stack
  3. The task list is searched to find a task that is runnable
  4. If a task is found, save the current task state, i.e. the remaining unsaved registers {r4-r11} are pushed onto the task stack and the stack pointer is saved
  5. Restore the state of the next task, using that task's stack, i.e. pop the registers that are not handled by the exception mechanism {r4-r11} off the stack
  6. Instruct the CPU to use the new task's stack when returning from exception
  7. Return from exception into Thread Mode
  8. The CPU restores registers psr, pc, lr, r12, r3-r0 from the task stack, which means execution of the task continues from the newly loaded program counter (pc)

In this example we represent a task's information with the following structure:

struct task_t {
        uint32_t *sp;    // Task stack pointer
        uint32_t state; 
};

In order to “allocate” memory for each task’s stack we can use the following arbitrary layout: each task has a stack of 256 bytes, and the task stacks are placed below the main stack (also 256 bytes) at 256-byte intervals.

Therefore, to calculate the stack pointer value for task N, the code would look something like:

task_table[N].sp = (void *) (PROC_STACK_TOP - (N * PROC_STACK_SIZE));

The diagram below shows the memory layout required:

[Figure: stack_layout]

We use the following startup.ld linker script to define these sections:

MEMORY
{
    mem : ORIGIN = 0x00000000, LENGTH = 0x1000
    data : ORIGIN = 0x20000000, LENGTH = 0x1000
}

SECTIONS
{
    .text : { *(.text*) } > mem
    .data : { *(.data*) } > data
}

Note that the .data section is required because we use a global array in the code, and we have not yet set up code to handle the .bss section, which is where zero-initialised global variables are usually placed. We must also indicate in the code that the global array is to be placed in the .data section, for example:

static struct task_t task_table[TASK_MAX] __attribute__((section(".data")));

 

When we first start a task we must set up its stack so that a call to task_restore() will do the right thing, i.e. pop registers r4-r11 off the stack. After that pop, the stack pointer points to the values that will be loaded into the psr, pc, lr, r12 and r3-r0 registers when the exception is instructed to return to Thread mode.

This is achieved with the following sample code:

/* setup initial stack frame; sp starts at the top of the task's stack,
   so decrement before each store to stay inside the task's own stack */
*(--task_table[i].sp) =  PSR_INIT;                      /* psr */
*(--task_table[i].sp) = (uint32_t) task & TASK_PC_MASK; /* pc */
*(--task_table[i].sp) = 0;      /* lr */
*(--task_table[i].sp) = 12;     /* r12 */
*(--task_table[i].sp) = 3;      /* r3  */
*(--task_table[i].sp) = 2;      /* r2  */
*(--task_table[i].sp) = 1;      /* r1  */
*(--task_table[i].sp) = 0;      /* r0  */

*(--task_table[i].sp) = 11;     /* r11  */
*(--task_table[i].sp) = 10;     /* r10  */
*(--task_table[i].sp) = 9;      /* r9   */
*(--task_table[i].sp) = 8;      /* r8   */
*(--task_table[i].sp) = 7;      /* r7   */
*(--task_table[i].sp) = 6;      /* r6   */
*(--task_table[i].sp) = 5;      /* r5   */
*(--task_table[i].sp) = 4;      /* r4   */

Note that registers r0-r12 are initialised with their register number (i.e. r5 is initialised to 5). This provides a visual sanity check when examining registers in the debugger.

The task stack should now be as shown below:

[Figure: stack_registers]

In this example we are going to set up a task with its initial stack frame. We are only going to have a single task and, in order to simplify the code, we are not going to attempt to save a task’s state. We also disable the SysTick timer after its first invocation.

This has the effect of the code simply starting the first task, which makes the code simpler and easier to understand.

Take the code from the SysTick example and update the startup.ld as above, then add a file called task.h with the following contents:

#ifndef __TASK_H
#define __TASK_H

#include <stdint.h>

#define TASK_MAX  1

#define TASK_FREE       (-1)
#define TASK_READY      1
#define TASK_RUNNING    2

/* Need to mask out thumb bit */
#define TASK_PC_MASK 0xfffffffe

#define MAIN_SP         0x20001000
#define PROC_STACK_SIZE 256
#define PROC_STACK_TOP  (MAIN_SP - PROC_STACK_SIZE)

/* Initial xPSR: T (Thumb state) bit 24 set, plus the C flag (bit 29) */
#define PSR_INIT 0x21000000

/* EXC_RETURN value: return to Thread mode using the main stack */
#define EXC_RET_THREAD 0xfffffff9


struct task_t {
    uint32_t *sp;
    uint32_t state;
};

void task_init(void);
int task_create(void (*task)(void));
struct task_t *task_ready(void);
__attribute__((naked)) void task_switch(void);
#endif /* __TASK_H */

Add a file task.c with the following:

#include <stdint.h>
#include <task.h>

static struct task_t task_table[TASK_MAX] __attribute__((section(".data")));

void task_init()
{
        int i = 0;

        for(i = 0; i < TASK_MAX; i++){
                task_table[i].state = TASK_FREE;
        }
}

int task_create(void (*task)(void))
{
        int i = 0;

        for(i = 0; i < TASK_MAX; i++) {
                if(task_table[i].state == TASK_FREE) {
                        task_table[i].sp = (void *) (PROC_STACK_TOP
                                                - (i * PROC_STACK_SIZE));

                        /* setup initial stack frame; sp starts at the top of
                           the stack, so decrement before each store */
                        *(--task_table[i].sp) =  PSR_INIT;
                        *(--task_table[i].sp) = (uint32_t) task & TASK_PC_MASK;
                        *(--task_table[i].sp) = 0;      /* lr */
                        *(--task_table[i].sp) = 12;     /* r12 */
                        *(--task_table[i].sp) = 3;      /* r3  */
                        *(--task_table[i].sp) = 2;      /* r2  */
                        *(--task_table[i].sp) = 1;      /* r1  */
                        *(--task_table[i].sp) = 0;      /* r0  */

                        *(--task_table[i].sp) = 11;     /* r11  */
                        *(--task_table[i].sp) = 10;     /* r10  */
                        *(--task_table[i].sp) = 9;      /* r9   */
                        *(--task_table[i].sp) = 8;      /* r8   */
                        *(--task_table[i].sp) = 7;      /* r7   */
                        *(--task_table[i].sp) = 6;      /* r6   */
                        *(--task_table[i].sp) = 5;      /* r5   */
                        *(--task_table[i].sp) = 4;      /* r4   */
        
                        task_table[i].state = TASK_READY;
                        return 0;
                }
        } 
        
        return -1;
}

struct task_t * task_ready()
{
        int i = 0;

        for(i = 0; i < TASK_MAX; i++) {
                if(task_table[i].state == TASK_READY) {
                        return &(task_table[i]);
                }
        }

        return (struct task_t *) 0;
}

struct task_t * task_running()
{
        int i = 0;

        for(i = 0; i < TASK_MAX; i++) {
                if(task_table[i].state == TASK_RUNNING) {
                        return &(task_table[i]);
                }
        }

        return (struct task_t *) 0;
}

/* Placeholder: saving a task's state is not implemented in this part */
void task_save(struct task_t *task)
{
        return;
}

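/*
 * Start running the task whose saved stack pointer is task_sp (passed in r0).
 * Note: in Handler mode sp is the main stack pointer (MSP), so "mov sp, r0"
 * also leaves the MSP pointing into the task's stack.
 */
__attribute__((naked)) void task_restore(void *task_sp)
{
        /* Restore task stackpointer */
        asm volatile("mov sp, r0");

        /* Restore sw frame (r4-r11) */
        asm volatile("pop {r4-r11}");

        /* set psp to point at the hw frame (r0-r3, r12, lr, pc, psr) */
        asm volatile("mov r0, sp");
        asm volatile("msr psp, r0");

        /* Return from exception to Thread mode using the process stack */
        asm volatile("mov lr, #0xfffffffd");
        asm volatile("bx lr");
}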

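/*
 * Called from the SysTick handler. If a task is ready, task_restore() does
 * not return here. Note that GCC only documents basic asm statements as safe
 * inside naked functions; mixing in C code, as done here, is not guaranteed
 * to work.
 */
__attribute__((naked)) void task_switch()
{
        struct task_t *task = (struct task_t *) 0;
        
        task = task_ready();
        if(task) {
                task_restore(task->sp);
        }       

        /* No task ready: return from exception to Thread mode on the main stack */
        asm volatile("mov lr, #0xfffffff9");
        asm volatile("bx lr");
}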

Then change timer.c to call the task_switch() function:

#include <stdint.h>
#include <task.h>

/* Hardware registers must be accessed through volatile pointers */
#define SYSTICK_BASE ((volatile uint32_t *) 0xe000e000)
#define SYSTICK_CTRL ((volatile uint32_t *) 0xe000e010)
#define SYSTICK_TRELOAD ((volatile uint32_t *) 0xe000e014)
#define SYSTICK_CTRL_ACTIVATE (0x7)

void timer_start()
{
    *SYSTICK_CTRL = SYSTICK_CTRL_ACTIVATE;
}

void timer_stop()
{
    *SYSTICK_CTRL = 0x0;
}

void timer_reload(uint32_t reload)
{
    *SYSTICK_TRELOAD = reload;
}

__attribute__((naked)) void timer_func()
{
    timer_stop();
    asm volatile("b task_switch");
}

Then change main.c as follows:

#include <task.h>

#define TIMER_RELOAD 0x000ffff

void task_1(void)
{
    puts("TASK_1");
    while(1) { ; }    
}

int main() 
{
    task_init();
    task_create(task_1);

    timer_reload(TIMER_RELOAD);
    timer_start();

    while(1) { ; }

    return 0;    
}

Then change Makefile as follows:

TARGET=startup.elf
OBJ := vector_table.o main.o serial.o timer.o task.o 
LDSCRIPT=startup.ld
CROSS=arm-none-eabi-
CFLAGS := -I. -nostdlib -g -mcpu=cortex-m3 -mthumb
LDFLAGS := -g -o
ASMFLAGS := -g -o

TTY := $(shell tty)
MACH := lm3s6965evb
CDEV := tty,id=any,path=$(TTY)

QEMU_FLAGS := -S -gdb tcp::1234 -nographic -kernel

all: $(TARGET)

%.o: %.s
	$(CROSS)as $(ASMFLAGS) $@ $<

%.o: %.c
	$(CROSS)gcc $(CFLAGS) -c $<

$(TARGET): $(OBJ) $(LDSCRIPT)
	$(CROSS)ld $(LDFLAGS) $@ -T $(LDSCRIPT) $(OBJ)

clean:
	$(RM) *.o *.elf

run: $(TARGET)
	qemu-system-arm -machine $(MACH) $(QEMU_FLAGS) $< -chardev $(CDEV)

debug: $(TARGET)
	$(CROSS)gdb -x gdbinit

 

In a terminal window use make run to start the emulator. Note that there is no output from our task so far:

make run
arm-none-eabi-as -g -o vector_table.o vector_table.s
arm-none-eabi-gcc -I. -nostdlib -g -mcpu=cortex-m3 -mthumb -c main.c
arm-none-eabi-gcc -I. -nostdlib -g -mcpu=cortex-m3 -mthumb -c serial.c
arm-none-eabi-gcc -I. -nostdlib -g -mcpu=cortex-m3 -mthumb -c timer.c
arm-none-eabi-gcc -I. -nostdlib -g -mcpu=cortex-m3 -mthumb -c task.c
arm-none-eabi-ld -g -o startup.elf -T startup.ld vector_table.o main.o serial.o timer.o task.o 
qemu-system-arm -machine lm3s6965evb -S -gdb tcp::1234 -nographic -kernel startup.elf -chardev tty,id=any,path=/dev/pts/1

In another terminal window use make debug to start the debugger, then issue a command to continue execution:

make debug
arm-none-eabi-gdb -x gdbinit
GNU gdb (GNU Tools for ARM Embedded Processors) 7.6.0.20131129-cvs
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "--host=i686-linux-gnu --target=arm-none-eabi".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
0x00000040 in ?? ()
Loading section .text, size 0x400 lma 0x0
Loading section .rodata, size 0x8 lma 0x400
Loading section .data, size 0x8 lma 0x20000000
Start address 0x0, load size 1040
Transfer rate: 1015 KB/sec, 346 bytes/write.
Breakpoint 1 at 0x40: file vector_table.s, line 29.

Breakpoint 1, reset_handler () at vector_table.s:29
29        bl main
r0             0x0    0
r1             0x0    0
r2             0x0    0
r3             0x0    0
r4             0x0    0
r5             0x0    0
r6             0x0    0
r7             0x0    0
r8             0x0    0
r9             0x0    0
r10            0x0    0
r11            0x0    0
r12            0x0    0
sp             0x20001000    0x20001000
lr             0x0    0
pc             0x40    0x40 
cpsr           0x40000173    1073742195
(gdb) cont
Continuing.

Returning to the previous terminal running the emulator, we see output from our task:

make run
arm-none-eabi-as -g -o vector_table.o vector_table.s
arm-none-eabi-gcc -I. -nostdlib -g -mcpu=cortex-m3 -mthumb -c main.c
arm-none-eabi-gcc -I. -nostdlib -g -mcpu=cortex-m3 -mthumb -c serial.c
arm-none-eabi-gcc -I. -nostdlib -g -mcpu=cortex-m3 -mthumb -c timer.c
arm-none-eabi-gcc -I. -nostdlib -g -mcpu=cortex-m3 -mthumb -c task.c
arm-none-eabi-ld -g -o startup.elf -T startup.ld vector_table.o main.o serial.o timer.o task.o 
qemu-system-arm -machine lm3s6965evb -S -gdb tcp::1234 -nographic -kernel startup.elf -chardev tty,id=any,path=/dev/pts/1    
TASK_1

Return to the terminal window running gdb and press CTRL-C to break. We can see that the pc is now pointing to code in our task, as shown below:

(gdb) cont
Continuing.
^C
Program received signal SIGINT, Interrupt.
task_1 () at main.c:8
8        while(1) { ; }    
(gdb) list
3    #define TIMER_RELOAD 0x000ffff
4    
5    void task_1(void)
6    {
7        puts("TASK_1");
8        while(1) { ; }    
9    }
10    
11    int main() 
12    {
(gdb)

In this example we have initialised and started a single task. In the following example we shall execute multiple tasks concurrently by adding code to save the state of the current task.
