Bare metal coding on a Raspberry Pi 4 B


Raspberry Pi Bare Metal


  • Download the Arm GNU toolchain: x86_64 Linux hosted cross toolchains, AArch64 bare-metal target (aarch64-none-elf)

  • Download the latest Raspberry Pi firmware (use the Download ZIP button; a slightly smaller repo also exists)

  • Format an SD card with an MS-DOS (MBR) partition table and a FAT32 partition at least as large as the firmware boot/ contents plus your kernel8.img

  • Copy the contents of the firmware boot/ onto the SD card

  • Remove all kernel*.img files from the SD card


Put the following in config.txt:

  # Mode 82 = 1920x1080 60Hz


  • Run (only need to do this once or if fonts change)

  • Run

  • Copy kernel8.img to the SD card


Raspberry Pi:


objdump notes


Disassemble a single object file:

  toolchain/arm-gnu-toolchain-12.3.rel1-x86_64-aarch64-none-elf/bin/aarch64-none-elf-objdump --disassemble cakelisp_cache/RaspberryPiOS/boot.S.o

Disassemble entire kernel:

  toolchain/arm-gnu-toolchain-12.3.rel1-x86_64-aarch64-none-elf/bin/aarch64-none-elf-objdump --disassemble kernel8.elf

Show entire kernel elf (including data):

  toolchain/arm-gnu-toolchain-12.3.rel1-x86_64-aarch64-none-elf/bin/aarch64-none-elf-objdump --full-contents kernel8.elf

Show the addresses of all the sections:

  toolchain/arm-gnu-toolchain-12.3.rel1-x86_64-aarch64-none-elf/bin/aarch64-none-elf-objdump --all-headers kernel8.elf

My stumbling blocks

  • Not having all the firmware on the SD card

  • Not writing the LED blink code correctly, i.e. I wasn't waiting while the LED was off, so it was just solid on. Viewing the disassembly helped clue me in to it.

  • Frame buffer pointer needed to be converted from a GPU address to a CPU address
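The frame buffer conversion can be sketched as below. This is a minimal sketch, assuming the commonly used convention that the GPU reports a bus address whose top two bits select a cache alias, so masking them off yields the address the CPU should use. The function name and the mask constant are mine for illustration, not names from this repo.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the top two bits of the GPU bus address select a VideoCore
   cache alias; clearing them leaves the ARM-side physical address. */
#define GPU_BUS_ALIAS_MASK UINT32_C(0x3FFFFFFF)

/* Converts the frame buffer address returned by the mailbox (a GPU bus
   address) into an address the CPU can dereference. */
uint32_t gpu_bus_to_cpu_address(uint32_t bus_address)
{
    /* A zero address means the allocation failed; nothing to convert. */
    assert(bus_address != 0);
    return bus_address & GPU_BUS_ALIAS_MASK;
}
```

An address that is already in the CPU's range passes through unchanged, so applying the mask unconditionally is harmless.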

Unaligned memory access is not supported without the MMU enabled

At first I suspected that GCC had started using NEON/floating point registers to set integer values due to optimizations, and that I had not initialized NEON yet.

However, NEON/SIMD was accessible, and alignment checking was disabled. The actual problem became clear when I started playing with the alignment of e.g. this sequence:

  ldr x1, =__bss_start
  ldr d0, [x1, #8] // works
  ldr d0, [x1, #4] // exception.

This reddit thread finally cleared it up:

With the MMU off, the ARM core must treat all memory as device memory (which does not support unaligned access) since it doesn't actually know what's MMIO and what's DRAM. You have to provide that information to the core via page tables and MAIR bits if you want unaligned accesses to be allowed.

ARM DDI 0487J.a confirms this: section B2.5, Alignment support, in The AArch64 Application Level Memory Model, states that an unaligned access to Device memory generates an Alignment fault.
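To make the failing case concrete, here is a small host-side sketch of the natural alignment rule that Device memory enforces. It assumes __bss_start is itself 8-byte aligned; the function name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns true when an access of access_size bytes at address is naturally
   aligned. access_size must be a power of two (8 for an ldr into d0). */
bool is_naturally_aligned(uint64_t address, uint64_t access_size)
{
    assert(access_size != 0 && (access_size & (access_size - 1)) == 0);
    return (address & (access_size - 1)) == 0;
}
```

With a base of 0x80000 (an arbitrary stand-in for __bss_start), an 8-byte access at base + 8 passes the check and one at base + 4 does not, which matches the working and faulting ldr d0 lines in the sequence.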

Using the floating point/SIMD registers before initializing

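Hack around it by passing -mgeneral-regs-only to GCC. Without it, GCC may emit SIMD instructions such as the movi below even in integer-only code: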

  00000000000000e8 <initializeFramebuffer>:
  e8:	a9b37bfd 	stp	x29, x30, [sp, #-208]!
  ec:	910003fd 	mov	x29, sp
  f0:	f9000fe0 	str	x0, [sp, #24]
  f4:	910083e0 	add	x0, sp, #0x20
  f8:	4f000400 	movi	v0.4s, #0x0
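The alternative to avoiding these registers is to enable them. A minimal sketch, assuming the code runs at EL1: setting CPACR_EL1.FPEN (bits [21:20]) to 0b11 stops FP/SIMD instructions from trapping at EL0 and EL1. The helper below only computes the register value so it can be checked on the host; on the target it would sit between an mrs from cpacr_el1 and an msr back to it, followed by an isb. If the kernel starts at EL2 or EL3, the corresponding CPTR_EL2/CPTR_EL3 trap controls may also need attention, which this sketch does not cover.

```c
#include <assert.h>
#include <stdint.h>

/* CPACR_EL1.FPEN occupies bits [21:20]; 0b11 disables FP/SIMD trapping
   at EL0 and EL1. */
#define CPACR_EL1_FPEN_SHIFT 20
#define CPACR_EL1_FPEN_MASK (UINT64_C(3) << CPACR_EL1_FPEN_SHIFT)

static_assert(CPACR_EL1_FPEN_MASK == UINT64_C(0x300000),
              "FPEN must cover bits 21:20");

/* Returns the CPACR_EL1 value with FP/SIMD access enabled, leaving all
   other bits as they were. */
uint64_t cpacr_el1_with_fp_simd_enabled(uint64_t cpacr_el1)
{
    return cpacr_el1 | CPACR_EL1_FPEN_MASK;
}
```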

Initializing the stack pointer after exception level change

This was finally pointed out to me while reading ARM DAI 0527A (Bare-metal Boot Code for ARMv8-A Processors):

5.2.2 Initializing stack pointer registers

The stack pointer register is implicitly used in some instructions, for example, push and pop. You must initialize it with a proper value before using it. In an MPCore system, different stack pointers must point to different memory addresses to avoid overwriting the stack area. If SPs in different Exception levels are used, you must initialize all of them.
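One way to satisfy both requirements (a distinct stack per core and per Exception level) is to carve a single region into fixed-size slots. The layout, slot size, and function name below are hypothetical, chosen only to illustrate the idea; the boot code would load the returned value into the SP for the given core and level.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_SIZE_PER_SLOT UINT64_C(0x4000) /* hypothetical: 16 KiB each */
#define NUM_CORES 4u                         /* the Pi 4 B has four cores */
#define NUM_EXCEPTION_LEVELS 4u              /* EL0 through EL3 */

/* Returns the initial stack pointer for one core at one Exception level.
   Stacks grow downwards, so the value is the top of that slot. */
uint64_t initial_stack_pointer(uint64_t stack_region_base, unsigned core,
                               unsigned exception_level)
{
    assert(core < NUM_CORES);
    assert(exception_level < NUM_EXCEPTION_LEVELS);
    /* AArch64 expects SP to be 16-byte aligned when used for access. */
    assert((stack_region_base & 0xF) == 0);

    uint64_t slot = (uint64_t)core * NUM_EXCEPTION_LEVELS + exception_level;
    return stack_region_base + (slot + 1) * STACK_SIZE_PER_SLOT;
}
```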

Virtual memory broke mailbox interface

No one seems to have written about using the framebuffer with virtual memory enabled. I had thought I was doing virtual memory wrong, but actually the mailbox interface was failing. I didn't know this until I bought the Adafruit FT232H and debugged with OpenOCD. Once I changed my virtual memory mappings to be all device memory, the mailbox interface worked again. Without the debugger, it would have been nigh impossible for me to figure out that my virtual memory setup was fine.
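For reference, here is a sketch of what "mapping as device memory" means at the descriptor level. The MAIR encodings (0x00 for Device-nGnRnE, 0xFF for Normal write-back) and the bit positions are architectural; the attribute index assignment and the function names are my own assumptions, and access permission and shareability bits are left at zero for brevity. I have not verified why the mailbox failed with normal memory; a plausible reading is that the mailbox buffer was being cached.

```c
#include <assert.h>
#include <stdint.h>

/* MAIR_EL1 attribute encodings (architectural values). */
#define MAIR_ATTR_DEVICE_nGnRnE UINT64_C(0x00)
#define MAIR_ATTR_NORMAL_WRITE_BACK UINT64_C(0xFF)

/* Hypothetical index assignment: slot 0 is device, slot 1 is normal. */
#define ATTR_INDEX_DEVICE 0u
#define ATTR_INDEX_NORMAL 1u

#define DESCRIPTOR_BLOCK_VALID UINT64_C(0x1)          /* bits [1:0] = 0b01 */
#define DESCRIPTOR_ACCESS_FLAG (UINT64_C(1) << 10)    /* AF */

/* Builds a MAIR_EL1 value with the two attribute slots used here. */
uint64_t make_mair_el1(void)
{
    return (MAIR_ATTR_DEVICE_nGnRnE << (8 * ATTR_INDEX_DEVICE)) |
           (MAIR_ATTR_NORMAL_WRITE_BACK << (8 * ATTR_INDEX_NORMAL));
}

/* Builds a block descriptor whose AttrIndx field (bits [4:2]) selects one
   of the MAIR_EL1 slots. */
uint64_t make_block_descriptor(uint64_t physical_address, unsigned attr_index)
{
    assert(attr_index < 8);
    /* The low 12 bits hold attributes, so the address must not use them. */
    assert((physical_address & 0xFFF) == 0);
    return physical_address | DESCRIPTOR_ACCESS_FLAG |
           ((uint64_t)attr_index << 2) | DESCRIPTOR_BLOCK_VALID;
}
```

Switching a region between device and normal memory then comes down to which index is passed for that block.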

I also used the Arm Fast Models TARMAC trace to discover I had an invalid translation table due to misunderstanding this instruction:

  str x0, [x1, #8]

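I thought the #8 in [x1, #8] would be added to x1 before the access and that the new address would also be written back to x1, but it is only a temporary add: the store goes to x1 + 8 and x1 itself is left unchanged. Writeback needs the pre-indexed form, [x1, #8]!, or the post-indexed form, [x1], #8.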
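The difference between the three addressing forms can be modelled on the host. This is an illustrative model only, not code from the repo: each function returns the address the store would touch and the value left in the base register afterwards.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint64_t accessed_address; /* address the store writes to */
    uint64_t base_after;       /* value left in the base register (x1) */
} StoreEffect;

/* str x0, [x1, #imm]  : offset addressing, base register unchanged. */
StoreEffect store_offset(uint64_t base, int64_t imm)
{
    return (StoreEffect){ base + (uint64_t)imm, base };
}

/* str x0, [x1, #imm]! : pre-indexed, base updated before the access. */
StoreEffect store_pre_indexed(uint64_t base, int64_t imm)
{
    assert(imm >= -256 && imm <= 255); /* 9-bit signed immediate */
    return (StoreEffect){ base + (uint64_t)imm, base + (uint64_t)imm };
}

/* str x0, [x1], #imm  : post-indexed, base updated after the access. */
StoreEffect store_post_indexed(uint64_t base, int64_t imm)
{
    assert(imm >= -256 && imm <= 255); /* 9-bit signed immediate */
    return (StoreEffect){ base, base + (uint64_t)imm };
}
```

A loop that fills consecutive table entries with the plain offset form therefore keeps overwriting the same slot unless x1 is advanced separately.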

The Arm Learn the Architecture series on memory management and the Memory Management Examples documents were helpful in clearing up virtual memory for me. The amateur tutorials online use different terminology and assume I want the 4 KiB layout.

JTAG debugging with FT232H


These did not work:



Pin mapping as set by src/openocd_adafruit-ft232h.cfg, with the Pi GPIOs in Alt4 mode:

Function  FT232H Pin  Pi Pin         Pi GPIO
TCK       D0          22             25
TDI       D1          37             26
TDO       D2          18             24
TMS       D3          13             27
TRST      D4          15             22
SRST      D5          Not connected  Not connected
RTCK      D7          16             23

Pi config.txt

  # Disable pull downs

  # Enable jtag pins (i.e. GPIO22-GPIO27)



  git submodule add Dependencies/openocd
  cd Dependencies/openocd/
  # May also need capstone
  sudo apt install libusb-1.0-0 libusb-1.0-0-dev
  ./configure --enable-ftdi --prefix=/media/macoy/Preserve/dev/rpi-bare-metal/Dependencies/openocd/install
  make -j7
  make install
  cd install
  cd bin
  #cp ../../../../src/openocd_adafruit-ft232h.cfg ../share/openocd/scripts/interface/adafruit-ft232h.cfg
  #cp ../../../../src/openocd_raspi4.cfg ../share/openocd/scripts/target/raspi4.cfg
  #./openocd --file ../share/openocd/scripts/interface/adafruit-ft232h.cfg --file ../share/openocd/scripts/target/raspi4.cfg
  sudo ./openocd --file ../../../../src/openocd_adafruit-ft232h.cfg --file ../../../../src/openocd_raspi4.cfg

Connecting to OpenOCD CLI:

  telnet localhost 4444


  # Easy way:
  ./toolchain/arm-gnu-toolchain-12.3.rel1-x86_64-aarch64-none-elf/bin/aarch64-none-elf-gdb --command=src/ConnectJTAG.gdb

  # Manually:
  cd toolchain/arm-gnu-toolchain-12.3.rel1-x86_64-aarch64-none-elf/bin
  ./aarch64-none-elf-gdb
  # In gdb:
  target extended-remote :3333
  file /media/macoy/Preserve/dev/rpi-bare-metal/kernel8.elf

Set a startup "breakpoint" in Cakelisp:

  ;; Wait counter is volatile
  (while (< wait-counter -1)
    (incr wait-counter))

Escape the "breakpoint":

  set waitCounter=-1
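For readers who do not know Cakelisp, a rough C equivalent of the startup "breakpoint" is sketched below. It assumes the counter is an unsigned 64-bit integer, so that -1 is its maximum value; the actual type in the kernel may differ, and the function name is made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Writing -1 from GDB stores all ones, which is the maximum value for an
   unsigned counter. */
static_assert((uint64_t)-1 == UINT64_MAX, "-1 wraps to the maximum value");

/* volatile forces a fresh read on every iteration, so a write made by the
   debugger is noticed. */
volatile uint64_t waitCounter = 0;

/* Spins until the counter reaches its maximum. Counting up from zero would
   take far too long in practice, so the loop effectively waits for the
   debugger to set waitCounter to -1. */
void wait_for_debugger(void)
{
    while (waitCounter < UINT64_MAX)
        ++waitCounter;
}
```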

GDB print 10 instructions at $pc:

  x/10i $pc

Print the mailbox status:

  # Peripheral base + mailbox + status
  p/x *(unsigned int*)(0xfe000000 + 0x0000b880 + 0x18)
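The value GDB prints can be decoded with the two status flags commonly documented for the mailbox: bit 31 for full and bit 30 for empty. The sketch below reuses the offsets from the command above; the macro and function names are assumptions of mine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PERIPHERAL_BASE UINT32_C(0xFE000000)
#define MAILBOX_OFFSET UINT32_C(0x0000B880)
#define MAILBOX_STATUS_OFFSET UINT32_C(0x18)

/* The three terms in the GDB expression add up to this address. */
static_assert(PERIPHERAL_BASE + MAILBOX_OFFSET + MAILBOX_STATUS_OFFSET ==
                  UINT32_C(0xFE00B898),
              "mailbox status register address");

#define MAILBOX_STATUS_FULL UINT32_C(0x80000000)  /* cannot accept a write */
#define MAILBOX_STATUS_EMPTY UINT32_C(0x40000000) /* nothing to read */

/* True when the mailbox cannot accept another message yet. */
bool mailbox_is_full(uint32_t status)
{
    return (status & MAILBOX_STATUS_FULL) != 0;
}

/* True when there is no response waiting to be read. */
bool mailbox_is_empty(uint32_t status)
{
    return (status & MAILBOX_STATUS_EMPTY) != 0;
}
```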

Print 16 entries from the translation table (the address can be found by printing tt_l1_base):

  x/16xg 0x83000
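When reading the dumped entries, the low bits are the quickest sanity check. Below is a small decoder sketch for descriptors at lookup levels 0 to 2, where bits [1:0] of 0b01 mean a block and 0b11 mean a pointer to the next table. All names are hypothetical, and the attribute helpers only make sense for block entries.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef enum { ENTRY_INVALID, ENTRY_BLOCK, ENTRY_TABLE } EntryKind;

/* Classifies a descriptor from lookup levels 0 to 2 by bits [1:0]. */
EntryKind classify_entry(uint64_t descriptor)
{
    if ((descriptor & 1) == 0)
        return ENTRY_INVALID;
    return (descriptor & 2) ? ENTRY_TABLE : ENTRY_BLOCK;
}

/* Returns AttrIndx (bits [4:2]), the MAIR_EL1 slot a block entry uses. */
unsigned entry_attr_index(uint64_t descriptor)
{
    assert(classify_entry(descriptor) == ENTRY_BLOCK);
    return (unsigned)((descriptor >> 2) & 7);
}

/* Returns the Access Flag (bit 10) of a block entry; if it is clear, the
   first access through the entry raises an Access flag fault. */
bool entry_access_flag(uint64_t descriptor)
{
    assert(classify_entry(descriptor) == ENTRY_BLOCK);
    return ((descriptor >> 10) & 1) != 0;
}
```

An entry of all zeroes classifies as invalid, which is what an uninitialized or overwritten slot tends to look like in the dump.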