update cheri article

Jack Bond-Preston 2022-11-30 00:53:42 +00:00
parent 25594e233e
commit 27dd36aa3a
Signed by: jack
GPG Key ID: 010071F1482BA852
3 changed files with 82 additions and 7 deletions

@ -210,13 +210,16 @@ as we can see, the bounds for our `user_name` capability (which is stored in cap
csw a1, 0 (ca2)
{% endhighlight %}
### capability monotonicity
### chains of capabilities
at this point you may be thinking "okay, that's great, but if we can just set the bounds of a capability with an instruction then what's the point? surely I can just set global bounds on some random pointer and access whatever I want?"
fundamental to the idea of capabilities is their _provenance_ and _monotonicity_. simply put, the first says we can only construct a capability using specific instructions, from an existing capability. we can't just create a capability from some random number. let's see what happens when we try to run our `ptrs_as_numbers` program on CheriBSD:
fundamental to the idea of capabilities is their *provenance* and *monotonicity*.
*provenance*, simply put, means we can only construct a capability from an existing capability, using specific instructions. we can't just create a capability from some random `size_t` and use it to load/store something. let's see what happens when we try to run our `ptrs_as_numbers` program on CheriBSD:
{% highlight plaintext %}
(gdb) runStarting program: /root/ptrs_as_numbers-cheribsd
(gdb) run
Starting program: /root/ptrs_as_numbers-cheribsd
*x=1234
Program received signal SIGPROT, CHERI protection violation.
Capability tag fault caused by register ca1.
@ -226,13 +229,48 @@ Capability tag fault caused by register ca1.
$1 = () 0x3fffdfff74
{% endhighlight %}
we can see we get a fault - the tag isn't set. any capability with a tag not set to 1 cannot be dereferenced - it is invalid. in fact, this capability has no capability metadata - when we copied it into our `unsigned long`, we just copied the 64-bit address.
we get a fault, because the tag isn't set. any capability with a tag not set to 1 cannot be dereferenced -- it is invalid. in fact, this capability has no capability metadata -- when we copied it into our `unsigned long`, we just copied the 64-bit address.
*monotonicity* is what stops us taking an existing capability, and creating a capability with more permissions and/or access than the original. it stipulates that when we create a capability from another capability (which we have to do - provenance), the permissions and bounds of the new capability must be equal to or less than the original. so our bounds can only get narrower as we create new capabilites from an existing capability. this means that capabilities trace back in a chain - they are all created from other capabilities, and narrowed as necessary. in this case, (simplified) when the kernel loads our program it will give us capabilities that are wide enough to do everything we need to do, and the compiler will try and make sure all the capabilities that we make and use from these are as tightly bound and unpermissive as possible.
*monotonicity* is what stops us taking an existing capability and creating a capability with more permissions and/or access than the original. it stipulates that when we create a capability from another capability (which we have to do -- provenance), the permissions and bounds of the new capability must be no greater than those of the original. so our bounds can only stay the same or get narrower as we create new capabilities from an existing capability. this means that capabilities trace back in a chain -- they are all created from other capabilities, and narrowed as necessary. in this case, (simplified) when the kernel loads our program it will give us capabilities that are wide enough to do everything we need to do, and the compiler will try to make sure all the capabilities that we make and use from these are as tightly bound and unpermissive as possible.
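to make the two rules a bit more concrete, here is a minimal software model of them in plain C. everything in it is hypothetical and for illustration only (`model_cap`, `cap_from_integer`, `cap_narrow`, `cap_allows` and the `PERM_*` values are made-up names): real CHERI enforces this in hardware, packs the metadata very differently, and on some architectures traps on an illegal derivation instead of clearing the tag as this sketch does.

```c
#include <stdbool.h>
#include <stdint.h>

/* hypothetical permission bits for the model */
enum { PERM_LOAD = 1u << 0, PERM_STORE = 1u << 1 };

/* hypothetical software model of a capability (not the real encoding) */
typedef struct {
    uint64_t base;    /* lowest address the capability covers */
    uint64_t length;  /* number of bytes covered, starting at base */
    uint32_t perms;   /* bitmask of PERM_* values */
    bool     tag;     /* true only if derived from a valid capability */
} model_cap;

/* provenance: a bare integer carries no metadata, so the result is untagged */
model_cap cap_from_integer(uint64_t addr) {
    model_cap c = { addr, 0, 0, false };
    return c;
}

/* monotonicity: the child is valid only if it asks for no more than the parent */
model_cap cap_narrow(model_cap parent, uint64_t base, uint64_t length,
                     uint32_t perms) {
    model_cap child = { base, length, perms, false };
    bool within_bounds = base >= parent.base
        && length <= parent.length
        && base - parent.base <= parent.length - length;
    bool within_perms = (perms & ~parent.perms) == 0;
    child.tag = parent.tag && within_bounds && within_perms;
    return child;
}

/* an access needs a tagged capability, the permission, and in-bounds bytes */
bool cap_allows(model_cap c, uint64_t addr, uint64_t size, uint32_t perm) {
    return c.tag
        && (c.perms & perm) == perm
        && addr >= c.base
        && size <= c.length
        && addr - c.base <= c.length - size;
}
```

in this model, narrowing a root capability to a 16-byte load-only window gives a valid child, while trying to widen that child back out, add store permission to it, or build a capability from a plain integer all yield something `cap_allows` will reject.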
### CHERI-fying code
you'll notice we got a lot of these benefits "for free". we only had to recompile our code, and we got this extra security. of course, CHERI does require changes to programs. naturally, the compiler had to be changed a lot to implement this behaviour. it also especially requires changes to things like the C library and kernel in order to take advantage of the features fully. sufficiently large userspace programs do need changes too. one common issue is that a lot of existing C code assumes that `sizeof (*void) == sizeof(size_t)`. with CHERI, our pointers are now twice as big. however, `size_t` hasn't changed size, as the address space size hasn't changed - for example, if we index into an array with `size_t`, the index should still be the same size; the extra data in our `void *` capability is the metadata, not extra address data. any program that tries to convert from some `unsigned long` or `size_t` to a capability will fault - this violates provenance. so, sometimes code changes have to be made to ensure we are keeping the capability metadata around.
you'll notice we got a lot of these benefits "for free": we only had to recompile our code to gain this extra security. that said, CHERI can require changes to program sources. naturally, the compiler had to change a lot to implement this behaviour. the C library and kernel in particular also need changes in order to take full advantage of the features. sufficiently large userspace programs will generally require some source changes too.
one common issue is that a lot of existing C code assumes that `sizeof(void *) == sizeof(size_t)`. with CHERI, our pointers are now twice as big. however, `size_t` hasn't changed size, as the address space size hasn't changed -- for example, if we index into an array with `size_t`, the index should still be the same size; the extra data in our `void *` capability is the metadata, not extra address data. any program that converts some `unsigned long` or `size_t` back into a pointer and then dereferences it will fault -- the integer carried no tag, so this violates provenance. so, sometimes code changes have to be made to ensure we are keeping the capability metadata around. in CHERI C, we can use `ptraddr_t` to store addresses and `[u]intptr_t` to store capabilities.
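a typical place this bites is code that stashes a pointer in an integer field, such as an opaque "user data" cookie. the sketch below shows the portable form of that pattern. the `event` struct and both helper functions are hypothetical names invented for this example, and it assumes only standard C: on a conventional machine it behaves the same as using `size_t`, while on CHERI the `uintptr_t` field is what should let the tag and bounds survive the round trip.

```c
#include <stdint.h>

/* hypothetical event record carrying an opaque pointer as an integer */
typedef struct {
    int       id;
    uintptr_t cookie;  /* uintptr_t rather than size_t or unsigned long */
} event;

/* store the pointer; on CHERI, uintptr_t is wide enough to hold the capability */
event event_make(int id, void *user_data) {
    event e = { id, (uintptr_t)user_data };
    return e;
}

/* recover the pointer; had cookie been a size_t, only the address would remain */
void *event_user_data(const event *e) {
    return (void *)e->cookie;
}
```

the only change from the "classic" version of this code is the type of `cookie`, which is often all a CHERI port needs in places like this.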
let's make a program to see some differences in types, and demonstrate how `uintptr_t` can preserve capabilities:
{% highlight c linenos %}
{% include_relative code/ptrtypes.c %}
{% endhighlight %}
running this on our non-CHERI host will give us:
{% highlight terminal %}
$ ./ptrtypes
type size (hex) size (dec)
=====================================
uintptr_t 0x08 08
size_t 0x08 08
void* 0x08 08
=====================================
{% endhighlight %}
running this on CHERI (64-bit):
{% highlight terminal %}
$ ./ptrtypes-cheribsd
type size (hex) size (dec)
=====================================
ptraddr_t 0x08 08
uintptr_t 0x10 16
size_t 0x08 08
void* 0x10 16
=====================================
*b: 888
*b: 111
*b: 999
{% endhighlight %}
## epilogue
I appreciate this has been a fragmented and surface-level introduction to CHERI. hopefully it has still provided some education in the basic aims of CHERI. the potential benefits and uses for CHERI go much deeper than anything I've touched on here, so please, read more about everything - and get your hands dirty messing about with qemu and CheriBSD!

@ -1,5 +1,5 @@
CFLAGS += -Wall -g -fno-stack-protector -O0
CC ?= clang
CC := clang
PURECAP_CC ?= ~/cheri/output/sdk/utils/cheribsd-riscv64-purecap-clang
SOURCES := $(wildcard *.c)

@ -0,0 +1,37 @@
#include <stdio.h>
#include <stdint.h>
int main(void) {
    printf("type size (hex) size (dec)\n");
    printf("=====================================\n");
#ifdef __PTRADDR_TYPE__
    // ptraddr_t only exists on CHERI toolchains: an integer holding just the address
    printf("ptraddr_t 0x%.2zx %.2zu\n", sizeof(ptraddr_t), sizeof(ptraddr_t));
#endif
    // sizeof yields a size_t, so print it with the z length modifier
    printf("uintptr_t 0x%.2zx %.2zu\n", sizeof(uintptr_t), sizeof(uintptr_t));
    printf("size_t 0x%.2zx %.2zu\n", sizeof(size_t), sizeof(size_t));
    printf("void* 0x%.2zx %.2zu\n", sizeof(void*), sizeof(void*));
    printf("=====================================\n");
#ifdef __CHERI__
    int x = 111;
    int y[] = { 888, 999 };
    int *a = &x;
    int *b = &(y[0]);
    printf("*b: %d\n", *b);
    // rebuild a pointer from capability a plus an integer address:
    // the tag and bounds come from a, so b now refers to x
    ptraddr_t a_addr = __builtin_cheri_address_get(a);
    b = __builtin_cheri_address_set(a, a_addr);
    printf("*b: %d\n", *b);
    b = &(y[0]);
    // uintptr_t is an unsigned integer type that preserves capabilities
    uintptr_t b_uintptr = (uintptr_t) b;
    b_uintptr += sizeof(int);
    b = (int*) b_uintptr;
    printf("*b: %d\n", *b);
#endif
    return 0;
}