diff --git a/_posts/cheri/2022-11-19-cheri.md b/_posts/cheri/2022-11-19-cheri.md
index 9770f85..bbf1e9d 100644
--- a/_posts/cheri/2022-11-19-cheri.md
+++ b/_posts/cheri/2022-11-19-cheri.md
@@ -210,13 +210,16 @@ as we can see, the bounds for our `user_name` capability (which is stored in cap
 csw a1, 0 (ca2)
 {% endhighlight %}
 
-### capability monotonicity
+### chains of capabilities
 at this point you may be thinking "okay, that's great, but if we can just set the bounds of a capability with an instruction then what's the point? surely I can just set global bounds on some random pointer and access whatever I want?"
 
-fundamental to the idea of capabilities is their _provenance_ and _monotonicity_. simply put, the first says we can only construct a capability using specific instructions, from an existing capability. we can't just create a capability from some random number. let's see what happens when we try to run our `ptrs_as_numbers` program on CheriBSD:
+fundamental to the idea of capabilities is their *provenance* and *monotonicity*.
+
+*provenance*, simply put, means we can only construct a capability from an existing capability, using specific instructions. we can't just create a capability from some random `size_t` and use it to load/store something. let's see what happens when we try to run our `ptrs_as_numbers` program on CheriBSD:
 
 {% highlight plaintext %}
-(gdb) runStarting program: /root/ptrs_as_numbers-cheribsd
+(gdb) run
+Starting program: /root/ptrs_as_numbers-cheribsd
 *x=1234
 Program received signal SIGPROT, CHERI protection violation.
 Capability tag fault caused by register ca1.
@@ -226,13 +229,48 @@ Capability tag fault caused by register ca1.
 $1 = () 0x3fffdfff74
 {% endhighlight %}
 
-we can see we get a fault - the tag isn't set. any capability with a tag not set to 1 cannot be dereferenced - it is invalid. in fact, this capability has no capability metadata - when we copied it into our `unsigned long`, we just copied the 64-bit address.
+we get a fault, because the tag isn't set. any capability with a tag not set to 1 cannot be dereferenced -- it is invalid. in fact, this capability has no capability metadata -- when we copied it into our `unsigned long`, we just copied the 64-bit address.
 
-*monotonicity* is what stops us taking an existing capability, and creating a capability with more permissions and/or access than the original. it stipulates that when we create a capability from another capability (which we have to do - provenance), the permissions and bounds of the new capability must be equal to or less than the original. so our bounds can only get narrower as we create new capabilites from an existing capability. this means that capabilities trace back in a chain - they are all created from other capabilities, and narrowed as necessary. in this case, (simplified) when the kernel loads our program it will give us capabilities that are wide enough to do everything we need to do, and the compiler will try and make sure all the capabilities that we make and use from these are as tightly bound and unpermissive as possible.
+*monotonicity* is what stops us taking an existing capability and creating a capability with more permissions and/or access than the original. it stipulates that when we create a capability from another capability (which we have to do -- provenance), the permissions and bounds of the new capability must be less than or equal to the original. so our bounds can only get narrower as we create new capabilities from an existing capability. this means that capabilities trace back in a chain - they are all created from other capabilities, and narrowed as necessary. in this case, (simplified) when the kernel loads our program it will give us capabilities that are wide enough to do everything we need to do, and the compiler will try and make sure all the capabilities that we make and use from these are as tightly bound and unpermissive as possible.
 
 ### CHERI-fying code
 
-you'll notice we got a lot of these benefits "for free". we only had to recompile our code, and we got this extra security. of course, CHERI does require changes to programs. naturally, the compiler had to be changed a lot to implement this behaviour. it also especially requires changes to things like the C library and kernel in order to take advantage of the features fully. sufficiently large userspace programs do need changes too. one common issue is that a lot of existing C code assumes that `sizeof (*void) == sizeof(size_t)`. with CHERI, our pointers are now twice as big. however, `size_t` hasn't changed size, as the address space size hasn't changed - for example, if we index into an array with `size_t`, the index should still be the same size; the extra data in our `void *` capability is the metadata, not extra address data. any program that tries to convert from some `unsigned long` or `size_t` to a capability will fault - this violates provenance. so, sometimes code changes have to be made to ensure we are keeping the capability metadata around.
+you'll notice we got a lot of these benefits "for free". we only had to recompile our code, and we gained this extra security. of course, CHERI does require changes to program sources. naturally, the compiler was changed a lot to implement this behaviour. in particular, CHERI also requires changes to things like the C library and kernel in order to take advantage of the features fully. sufficiently large userspace programs will generally require source changes.
+one common issue is that a lot of existing C code assumes that `sizeof(void *) == sizeof(size_t)`. with CHERI, our pointers are now twice as big. however, `size_t` hasn't changed size, as the address space size hasn't changed - for example, if we index into an array with `size_t`, the index should still be the same size; the extra data in our `void *` capability is the metadata, not extra address data. any program that converts some `unsigned long` or `size_t` to a pointer and then dereferences it will fault - this violates provenance. so, sometimes code changes have to be made to ensure we are keeping the capability metadata around. in CHERI, we can use `ptraddr_t` to store addresses and `[u]intptr_t` to store capabilities.
+
+let's make a program to see some differences in types, and demonstrate how `uintptr_t` can preserve capabilities:
+
+{% highlight c linenos %}
+{% include_relative code/ptrtypes.c %}
+{% endhighlight %}
+
+running this on our non-CHERI host will give us:
+
+{% highlight terminal %}
+$ ./ptrtypes
+type size (hex) size (dec)
+=====================================
+uintptr_t 0x08 08
+size_t 0x08 08
+void* 0x08 08
+=====================================
+{% endhighlight %}
+
+running this on CHERI (64-bit):
+
+{% highlight terminal %}
+$ ./ptrtypes-cheribsd
+type size (hex) size (dec)
+=====================================
+ptraddr_t 0x08 08
+uintptr_t 0x10 16
+size_t 0x08 08
+void* 0x10 16
+=====================================
+*b: 888
+*b: 111
+*b: 999
+{% endhighlight %}
 
 ## epilogue
 I appreciate this has been a fragmented and surface level introduction to CHERI. hopefully it has provided some education in some basic aims of CHERI regardless. potential benefits and uses for CHERI go much deeper than anything I've touched on here, so please, read more about everything - and get your hands dirty trying out messing about with qemu and CheriBSD!
diff --git a/_posts/cheri/code/Makefile b/_posts/cheri/code/Makefile
index 2f5ffb7..116ec36 100644
--- a/_posts/cheri/code/Makefile
+++ b/_posts/cheri/code/Makefile
@@ -1,5 +1,5 @@
 CFLAGS += -Wall -g -fno-stack-protector -O0
-CC ?= clang
+CC := clang
 PURECAP_CC ?= ~/cheri/output/sdk/utils/cheribsd-riscv64-purecap-clang
 
 SOURCES := $(wildcard *.c)
diff --git a/_posts/cheri/code/ptrtypes.c b/_posts/cheri/code/ptrtypes.c
new file mode 100644
index 0000000..e86b263
--- /dev/null
+++ b/_posts/cheri/code/ptrtypes.c
@@ -0,0 +1,37 @@
+#include <stdio.h>
+#include <stdint.h>
+
+int main() {
+    printf("type size (hex) size (dec)\n");
+    printf("=====================================\n");
+#ifdef __PTRADDR_TYPE__
+    printf("ptraddr_t 0x%.2lx %.2lu\n", sizeof(ptraddr_t), sizeof(ptraddr_t));
+#endif
+    printf("uintptr_t 0x%.2lx %.2lu\n", sizeof(uintptr_t), sizeof(uintptr_t));
+    printf("size_t 0x%.2lx %.2lu\n", sizeof(size_t), sizeof(size_t));
+    printf("void* 0x%.2lx %.2lu\n", sizeof(void*), sizeof(void*));
+    printf("=====================================\n");
+
+#ifdef __CHERI__
+    int x = 111;
+    int y[] = { 888, 999 };
+
+    int *a = &x;
+    int *b = &(y[0]);
+
+    // transplant address from capability a to capability b
+    printf("*b: %d\n", *b);
+    ptraddr_t a_addr = __builtin_cheri_address_get(a);
+    b = __builtin_cheri_address_set(a, a_addr);
+    printf("*b: %d\n", *b);
+
+    b = &(y[0]);
+    // uintptr_t is an unsigned integer type that preserves capabilities
+    uintptr_t b_uintptr = (uintptr_t) b;
+    b_uintptr += sizeof(int);
+    b = (int*) b_uintptr;
+    printf("*b: %d\n", *b);
+#endif
+
+    return 0;
+}
\ No newline at end of file