This Summer, ZeroRISC had the pleasure of hosting two Summer interns in our Boston office. This blog post is the first in a two-part series highlighting their work tackling challenging real-world problems at the intersection of hardware and software.

Beshr Bouli, this post's primary author, is a rising junior at MIT majoring in Computer Science and Mathematics. He’s interested in both low-level systems, like operating systems and compilers, and large-scale systems, like distributed networks. He’s a dedicated competitive programmer outside the classroom and an avid soccer enthusiast who plays, watches, and organizes the game with equal passion.

His project focused on enabling Position-Independent Code (PIC) and implementing dynamic memory management in ZeroRISC’s Rust-based secure operating system, built on top of TockOS, allowing multi-application execution on MMU-less systems with greater runtime flexibility and improved memory efficiency.

The following is Beshr’s report on his Summer internship with us.

Enabling Position-Independent Code and Dynamic Memory Management

As part of our full-stack philosophy, ZeroRISC supports a Rust-based embedded OS co-developed atop OpenTitan hardware – extending hardware security guarantees into the software. A key feature of this OS is its ability to load and run multiple applications simultaneously. Since OpenTitan does not provide a Memory Management Unit (MMU), the OS cannot assign a virtual address space to each process. While this is not a security concern due to the use of a Memory Protection Unit (MPU), it forces all processes to share the same address space. My internship this Summer focused on increasing flexibility around this memory bottleneck by integrating Position-Independent Code (PIC) and making other core OS enhancements.

Below is a simplified system overview showing all the components my project involved:

Position-Independent Code (PIC)

Compilers typically generate code assuming exclusive control over memory, placing code and data at fixed addresses. On MMU-based systems, the MMU handles virtual-to-physical address translation, but MMU-less systems must handle memory differently. Previously, applications were assigned fixed addresses at compile time, for example, Application A at 0x21000 and Application B at 0x22000. This static allocation leads to memory inefficiency, fragmentation, and difficulty accommodating applications that grow or are unknown at compile time.

I implemented a comprehensive pipeline from compiler to loader, producing binaries that do not assume specific RAM addresses. The focus was on making C applications built with the Clang LLVM-based toolchain position-independent.

Compiler

In my pipeline, the compiler now generates assembly that uses program counter (PC)-relative addressing for code instead of absolute addresses. Because these accesses are expressed relative to the PC rather than as absolute addresses, they remain valid regardless of where the binary is loaded.
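A minimal sketch of the idea in C, assuming a RISC-V-style `auipc`/`addi` pair (the form that `lla rd, sym` expands to). The type and function names are hypothetical and only model the arithmetic; they are not part of the ZeroRISC toolchain. The point is that the immediates depend only on the distance between the instruction and its target, so the same encoded pair resolves correctly at any load address.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the two immediates in
 *   auipc rd, hi20        ; rd = pc + (hi20 << 12)
 *   addi  rd, rd, lo12    ; rd = rd + sign_extend(lo12)
 */
typedef struct {
    uint32_t hi20; /* upper 20 bits, consumed by auipc */
    int32_t  lo12; /* signed 12-bit value, consumed by addi */
} pcrel_pair;

/* Split a link-time distance (target - pc) into the two immediates.
 * Adding 0x800 before the shift compensates for addi sign-extending lo12. */
pcrel_pair pcrel_split(int32_t offset)
{
    uint32_t u = (uint32_t)offset;
    int32_t low = (int32_t)(u & 0xFFFu); /* 0..4095 */
    pcrel_pair p;

    p.hi20 = ((u + 0x800u) >> 12) & 0xFFFFFu;
    p.lo12 = (low >= 0x800) ? low - 0x1000 : low;
    assert(p.lo12 >= -2048 && p.lo12 <= 2047);
    return p;
}

/* What the instruction pair computes at run time. The arithmetic is
 * modulo 2^32, matching a 32-bit register. */
uint32_t pcrel_resolve(uint32_t pc, pcrel_pair p)
{
    return pc + (p.hi20 << 12) + (uint32_t)p.lo12;
}
```

Resolving the same pair with the instruction at `0x21040` and again at `0x35040` yields targets that are `0x1900` bytes ahead in both cases, which is exactly the property the loader relies on.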

However, this isn’t the complete solution. Binaries consist of two independently relocatable segments: the code segment, which is backed by read-only memory, and the data segment, which is read-write and located in SRAM. PC-relative addressing works for code segments stored in read-only Flash, but the data segment is loaded dynamically into SRAM, so its addresses are unknown at compile time. This is solved using global pointer (GP)-relative addressing: data is accessed relative to the GP, which is set at load time.
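To illustrate GP-relative access, here is a small C sketch. It assumes a hypothetical data segment holding two 32-bit globals at fixed offsets; the offsets, the image contents, and every function name are invented for this example. On real hardware each access would be a single load or store of the form `lw a0, offset(gp)`, which the helpers below only imitate.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical link-time layout of the data segment. Only the segment
 * base (the GP value) changes from one load to the next. */
enum { COUNTER_OFF = 0, LIMIT_OFF = 4 };

/* Initial contents of the data segment: counter = 0, limit = 10. */
static const uint32_t data_image[2] = { 0u, 10u };

/* Copy the image to wherever free SRAM was found and return the GP. */
uint8_t *load_data_segment(uint8_t *sram_dest)
{
    assert(sram_dest != NULL);
    memcpy(sram_dest, data_image, sizeof data_image);
    return sram_dest; /* GP points at the start of the loaded segment */
}

/* Stand-ins for `lw rd, off(gp)` and `sw rs, off(gp)`. */
uint32_t gp_load32(const uint8_t *gp, size_t off)
{
    uint32_t value;
    memcpy(&value, gp + off, sizeof value);
    return value;
}

void gp_store32(uint8_t *gp, size_t off, uint32_t value)
{
    memcpy(gp + off, &value, sizeof value);
}

/* "Application" code: bump the counter until it reaches the limit.
 * It never mentions an absolute SRAM address. Returns 1 if it ticked. */
int app_tick(uint8_t *gp)
{
    uint32_t counter = gp_load32(gp, COUNTER_OFF);

    if (counter >= gp_load32(gp, LIMIT_OFF)) {
        return 0;
    }
    gp_store32(gp, COUNTER_OFF, counter + 1u);
    return 1;
}
```

Loading the same image twice at different SRAM locations gives two independent instances: calling `app_tick` with one GP leaves the other instance's counter untouched.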

Other ELF segments are mapped to PIC segments based on read/write permissions, with constants typically in the code segment and .bss in the data segment.

Linker

The linker finalizes the locations of objects whose placement is unknown at compile time. For example, in C code with externally defined constants, the linker rewrites the compiler’s pseudoinstructions to ensure that the correct relocation sequences are applied depending on whether objects reside in read-only or writable memory.

Loader and Final Relocation

Now that we have assembly code that doesn’t make assumptions about addresses, we need to appropriately set up the environment for the assembly code to run in. Specifically, I modified the C runtime (crt0) to perform runtime initialization and relocation. Code segments remain in Flash, while data segments are loaded into SRAM. The loader sets GP after loading the data segment and finalizes relocation for variables requiring absolute addresses, including constant pointers that are mapped into SRAM to allow updates. Finally, the PC is set to the application entry point, achieving fully position-independent execution.
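The load-time steps can be sketched roughly as follows. This is a simplified model in C, not the actual crt0 code: the `app_image` and `reloc_entry` structures are hypothetical, and the sketch assumes each relocation entry names a pointer-sized slot in the data segment that initially holds an offset into that same segment. The real loader also has to set the GP register and jump to the entry point, which portable C cannot express.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical relocation record: the pointer-sized slot at `slot_off`
 * holds a segment-relative offset that must become an absolute address. */
typedef struct {
    size_t slot_off;
} reloc_entry;

/* Hypothetical description of an application's data segment. */
typedef struct {
    const uint8_t     *data_image;  /* initial contents, stored with the code */
    size_t             data_size;
    const reloc_entry *relocs;
    size_t             reloc_count;
} app_image;

/* Copy the data segment into SRAM, patch every relocated slot by adding
 * the load address, and return the value the GP should take. */
uint8_t *load_and_relocate(const app_image *img, uint8_t *sram_dest)
{
    /* Step 1: place the data segment at its run-time location. */
    memcpy(sram_dest, img->data_image, img->data_size);

    /* Step 2: finalize slots that need absolute addresses. */
    for (size_t i = 0; i < img->reloc_count; i++) {
        size_t off = img->relocs[i].slot_off;
        uintptr_t value;

        assert(off + sizeof value <= img->data_size);
        memcpy(&value, sram_dest + off, sizeof value);
        value += (uintptr_t)sram_dest;
        memcpy(sram_dest + off, &value, sizeof value);
    }

    /* Step 3: the caller installs this as GP, then jumps to the entry point. */
    return sram_dest;
}
```

After the call, a pointer stored in the data segment refers to the object's actual SRAM location, even though the image itself only ever recorded an offset.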

Memory Management

With PIC implemented, memory can now be managed more flexibly. Previously, applications had fixed memory requirements, which limited dynamic allocation and overall system efficiency. I designed a variable-size-block memory allocator to optimize usage.

Each memory block contains a header with size and allocation status, forming a linked list. For PIC allocations, the allocator uses a best-fit strategy; for non-PIC allocations, it checks that the requested address and size fit within a free block. Blocks can be split to minimize wasted memory, and deallocation coalesces adjacent free blocks.
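The following C sketch shows one way such an allocator could look. It is an illustration under stated assumptions rather than the allocator used in the OS (which is Rust-based): the `block` layout, the function names, and the word-sized rounding are all hypothetical, and the fixed-address path for non-PIC applications is left out for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical block header; the payload follows it directly in memory. */
typedef struct block {
    size_t        size; /* payload bytes, excluding this header */
    int           used; /* allocation status */
    struct block *next; /* next block in address order, or NULL */
} block;

#define WORD sizeof(void *)

/* Turn a suitably aligned region into a single free block. */
block *heap_init(void *mem, size_t bytes)
{
    block *head = (block *)mem;

    assert(mem != NULL && bytes > sizeof(block));
    head->size = bytes - sizeof(block);
    head->used = 0;
    head->next = NULL;
    return head;
}

/* Best fit: pick the smallest free block that can hold the request. */
void *heap_alloc(block *head, size_t want)
{
    block *best = NULL;

    if (want > SIZE_MAX - WORD) {
        return NULL;
    }
    want = (want + WORD - 1) / WORD * WORD; /* keep headers aligned */

    for (block *b = head; b != NULL; b = b->next) {
        if (!b->used && b->size >= want &&
            (best == NULL || b->size < best->size)) {
            best = b;
        }
    }
    if (best == NULL) {
        return NULL;
    }

    /* Split when the leftover can hold a header plus at least one word. */
    if (best->size - want >= sizeof(block) + WORD) {
        block *rest = (block *)((uint8_t *)(best + 1) + want);

        rest->size = best->size - want - sizeof(block);
        rest->used = 0;
        rest->next = best->next;
        best->size = want;
        best->next = rest;
    }
    best->used = 1;
    return best + 1;
}

/* Mark the block free, then merge every run of adjacent free blocks. */
void heap_free(block *head, void *ptr)
{
    if (ptr == NULL) {
        return;
    }
    ((block *)ptr - 1)->used = 0;

    for (block *b = head; b != NULL; ) {
        if (!b->used && b->next != NULL && !b->next->used) {
            b->size += sizeof(block) + b->next->size;
            b->next = b->next->next;
        } else {
            b = b->next;
        }
    }
}
```

Because new blocks are only ever created by splitting, neighbours in the list are also neighbours in memory, so coalescing needs no address comparison. A request for 40 bytes made after freeing a 50-byte block between two live allocations reuses that hole instead of carving into the larger free tail, which is the behaviour best fit is chosen for.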

More complex allocation methods such as segregated lists were considered but rejected, as they would increase OS size and thus decrease memory efficiency. The chosen design balances simplicity, memory conservation, and fragmentation reduction.

Impact and Benefits

The combination of PIC and the new memory allocator enables dynamic loading, updating, and unloading of applications at runtime, without recompilation or manual memory management. Previously, memory had to be statically divided among applications, leaving little flexibility for runtime changes.

Now, the OS loader can manage application placement dynamically, optimize memory usage, and reduce downtime, significantly improving system flexibility and efficiency. These changes provide a new level of abstraction, allowing developers and users to deploy applications without concern for SRAM addresses or memory fragmentation.

I had a great Summer at ZeroRISC and look forward to diving more deeply into OS in the future!

Conclusion

It was our privilege to host Beshr and we’re grateful for all his outstanding work this Summer!

Beshr’s project demonstrates how system-level software engineering and compiler/runtime integration can enhance MMU-less embedded systems, providing both efficiency and flexibility for secure, multi-application execution.

At ZeroRISC, we believe strongly in the power of open source to fuel innovation and amortize maintenance effort over the long term. So, we are currently coordinating with the Tock team on integrating Beshr’s PIC efforts into the relevant upstream Tock repositories.

While Beshr explored enhancements to ZeroRISC’s Rust-based secure operating system, including position-independent code (PIC) and a flexible memory allocator, our second intern, Yeabsira, focused on hardware and interconnects. Look out for that blog post, coming out shortly!

ZeroRISC’s Summer Interns Expand Runtime Flexibility of our SecureOS