Creating an ESP32 based dice for chess

Posted on March 28, 2025 by Javier Martinez Canillas

A LILYGO T-Display-S3 board running a CircuitPython program that randomly chooses and shows a chess piece.

My kid loves chess and lately he has been playing the dice chess variant with his friends. For those unfamiliar, it is when a die roll determines which piece type can be moved (e.g., 1=Pawn, 2=Rook, 3=Knight, etc.). However, this is inconvenient because players have to remember which chess piece corresponds to each number, and the mapping has to be explained to every new player.

I’ve had an idea on the back burner for some time to do an embedded project using CircuitPython and an ESP32 based board. Improving his dice chess setup sounded like a great excuse for this.

So I wrote esp32-dice-chess, a simple program that randomly chooses a chess piece and shows its image on a screen. The code is meant to be used on a T-Display-S3, but it should be easy to port to a different board.
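The core of such a program fits in a few lines. The following is only a minimal sketch and not the actual esp32-dice-chess code: the piece list, the function names and the image file naming are assumptions made for illustration, and the display handling is left out entirely.

```python
import random

# Piece types a roll can select, one per face of a six-sided die.
PIECES = ("pawn", "knight", "bishop", "rook", "queen", "king")


def roll_piece(rng=random):
    """Return a randomly chosen chess piece name."""
    return rng.choice(PIECES)


def image_for(piece):
    """Map a piece name to a bitmap file name (hypothetical naming scheme)."""
    return "images/{}.bmp".format(piece)


if __name__ == "__main__":
    piece = roll_piece()
    print(piece, "->", image_for(piece))
```

On a real board the print() call would be replaced by whatever code loads the bitmap and shows it on the display.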

The ability to use Python for microcontroller firmware development is very convenient, especially for rapid prototyping and ease of development. I totally recommend it to folks doing hardware tinkering.

Happy hacking!

Using an SPI SSD1306 OLED on Fedora with a Raspberry Pi

Posted on September 27, 2024 by Javier Martinez Canillas

In a previous post I explained how to use a Solomon SSD1306 OLED display on Fedora with a Raspberry Pi, but that only covered displays that come with an I2C interface.

The SSD1306 display controller also supports an SPI interface (both 3-wire and 4-wire), and the ssd130x DRM driver has supported it since Linux 5.19.

This blog post explains how to set up Fedora to use an SSD1306 OLED connected through a 4-wire SPI interface.

First you need to connect the SSD1306 display to the RPi. There are different ways to do this, since one can choose which GPIOs to use for the reset and data/command pins.

But to simplify the configuration, one could use the default GPIO pins that are defined in the ssd1306-spi.dtbo provided by the RPi firmware package.

To do this, connect the SSD1306 display to the RPi as follows:

Then you need to install Fedora. This can be done by downloading an aarch64 Fedora raw image (e.g: Workstation Edition) and executing the following command:

sudo arm-image-installer --image $image --target=rpi4 \
--media=$device --addconsole --addkey=id_rsa.pub \
--norootpass --resizefs

where $device is the block device for your uSD (e.g: /dev/sda) and $image is the file name (e.g: Fedora-Workstation-40-1.14.aarch64.raw.xz) of the downloaded image.

Finally you need to configure the RPi firmware to enable the SPI pins and load the ssd1306-spi Device Tree Blob Overlay (dtbo) to register an SSD1306 SPI device, e.g:

sudo tee -a /boot/efi/config.txt << EOF
dtparam=spi=on
dtoverlay=ssd1306-spi,inverted
EOF

The ssd1306-spi.dtbo supports many parameters in case your display doesn’t match the defaults. Take a look at the RPi overlays README for more details about all these parameters and their possible values.

The ssd130x DRM driver registers an emulated fbdev device that can be bound with fbcon, so the OLED display can be used as a framebuffer console. If you want to do that, it’s convenient to change the virtual terminal console font to a smaller one, e.g:

sudo sed -i 's/FONT=.*/FONT="drdos8x8"/' /etc/vconsole.conf
sudo dracut -f

And that’s it. Happy hacking!

Some useful Linux kernel cmdline debug parameters to troubleshoot driver issues

Posted on June 2, 2024 by Javier Martinez Canillas

I have written before on how to troubleshoot drivers’ deferred probe issues in Linux. And while having a /sys/kernel/debug/devices_deferred debugfs entry is quite useful to know the list of devices whose drivers failed to probe due to being deferred [0], there are also some debug command line parameters that could help with the troubleshooting.

But first, an explanation of what these parameters do:

Devices require different resources in order to be operative, for example a display controller could need a set of clocks, power domains and regulators to be enabled.

Ideally, if these are managed by Linux, they should be declared in the hardware description that is handed to the kernel (e.g: Device Tree Blobs) but sometimes this isn’t the case. And devices may be working just because the firmware that booted Linux left these required resources enabled.

But of course this has the disadvantage that Linux can’t manage those, and can’t for example disable a clock or power domain when it is not used.

It is common though to add support for a platform incrementally, and so it could be that at the beginning this relies on the setup made by the firmware (e.g: some required clocks left enabled and Linux not knowing about them). But later, support for these clocks could be added, and the Common Clock Framework (CCF) will then be aware of them.

But it could be that the Device Tree was not updated to define the dependency between some device and the newly introduced clocks that it requires. If that is the case, the clocks may appear to be unused by the system and the CCF rightfully decides to disable them [1].

If that is the case, this message is printed in the kernel log buffer:

[    5.189930] clk: Disabling unused clocks

This will of course make the device stop working, because one of its required clocks has been gated.

To prevent this, there is a debug clk_ignore_unused command line parameter that stops the CCF from gating unused clocks. When this is used, the unused clocks won’t be disabled and the following message will be printed instead:

[    5.200758] clk: Not disabling unused clocks

The same applies to power domains and regulators: the frameworks will disable unused ones, and the pd_ignore_unused and regulator_ignore_unused command line parameters can be used to prevent this. When using them, the following messages are printed in the kernel log:

[    5.186559] PM: genpd: Not disabling unused power domains
[   35.298779] regulator: Not disabling unused regulators
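When comparing a working and a broken boot, it can help to confirm which of these parameters were actually passed to the kernel. The following is a small sketch under simple assumptions: the helper name is made up for this example, the command line is split on whitespace only (quoted values containing spaces are not handled), and on a running system the string would come from /proc/cmdline.

```python
# The three debug parameters discussed in this post.
IGNORE_UNUSED_PARAMS = (
    "clk_ignore_unused",
    "pd_ignore_unused",
    "regulator_ignore_unused",
)


def ignore_unused_params(cmdline):
    """Return the *_ignore_unused parameters present in a kernel command line."""
    tokens = cmdline.split()
    return [param for param in IGNORE_UNUSED_PARAMS if param in tokens]


if __name__ == "__main__":
    # Sample command line; on a real system read the text from /proc/cmdline.
    sample = "root=/dev/mmcblk0p3 ro clk_ignore_unused pd_ignore_unused"
    print(ignore_unused_params(sample))
```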

The {clk,pd}_ignore_unused parameters have been present in Linux for a long time, but we didn’t have one for regulators until recently.

Before, to prevent an unused regulator from being disabled by the regulator framework, it had to be marked as always-on in the Device Tree by using the regulator-always-on property. But this requires modifying the Device Trees, which is cumbersome when the goal is just to troubleshoot driver regressions.

So after talking with Mark Brown (the regulator framework maintainer), I proposed to add a regulator_ignore_unused debug parameter, which landed in Linux kernel v6.8.

I hope these debug parameters can be useful when facing regressions in drivers.

Happy hacking!

[0] Read this older post if you want to learn more about driver and device registration, matching and probe deferral.

[1] Brian Masney pointed out to me that the list of clocks being disabled can be shown by using the tp_printk trace_event=clk:clk_disable cmdline params, as explained in the kernel docs.

Atomically exchange vfat files in Linux

Posted on February 10, 2024 by Javier Martinez Canillas

The Linux kernel implements the POSIX renameat() system call that atomically renames a file, moving it between directories if required. If the newpath already exists, it is atomically replaced by the oldpath.

This system call function signature is as follows:

int renameat(int olddirfd, const char *oldpath,
             int newdirfd, const char *newpath);

While atomically renaming a file is useful, the system call was not designed for future extensibility and so a follow-up renameat2() system call was added. This new system call has a flags parameter that can be used to modify its behaviour:

int renameat2(int olddirfd, const char *oldpath,
              int newdirfd, const char *newpath, unsigned int flags);

The flags argument is a bit mask. If it is set to zero, renameat2() behaves just like the old renameat() system call.

RENAME_NOREPLACE is a flag that prevents newpath from being overwritten by oldpath. If newpath already exists, an error is returned instead.

RENAME_EXCHANGE is a flag that atomically exchanges oldpath and newpath.

Before Linux v6.0, the vfat filesystem implementation only had support for the renameat2() RENAME_NOREPLACE flag. This means that files couldn’t be atomically exchanged in a vfat filesystem using the RENAME_EXCHANGE flag.

But this feature is quite convenient to implement, for example, A/B update mechanisms, since two files can be swapped atomically using a single system call. This is particularly important for EFI platforms, because the EFI System Partition is always formatted using vfat.

If someone wanted to implement an update mechanism for EFI binaries stored in the ESP and provide a fallback, there was no way to do it safely in Linux. This is even more important for vfat, because the filesystem has no journal and so the EFI firmware cannot replay pending changes to make the filesystem consistent again. For this reason, the fewer operations done while swapping EFI binaries, the better.

Another example is OSTree, which uses symbolic link renames to atomically commit its deployment transactions. Having atomic exchange support in vfat could allow it to use RENAME_EXCHANGE instead on this filesystem, which doesn’t support symlinks.

With that use case in mind, we implemented [1,2] RENAME_EXCHANGE support in the Linux vfat filesystem driver. An example of how this can be used is in a Linux kernel rename_exchange selftest that was added along with the implementation.
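For a quick experiment outside of the selftest, the system call can also be reached from a scripting language. The following is a minimal sketch, written in Python with ctypes because, as far as I know, the standard os module has no wrapper for the flags argument. It assumes a C library that exports a renameat2() wrapper (glibc has one since 2.28); the exchange() helper name is made up for this example, while the two constants mirror the values in the kernel headers.

```python
import ctypes
import os

AT_FDCWD = -100           # <fcntl.h>: resolve relative paths against the CWD
RENAME_EXCHANGE = 1 << 1  # <linux/fs.h>: atomically swap oldpath and newpath

# dlopen(NULL): look symbols up in the C library already loaded by Python.
_libc = ctypes.CDLL(None, use_errno=True)


def exchange(path_a, path_b):
    """Atomically swap two existing paths using renameat2(RENAME_EXCHANGE).

    Raises OSError if the kernel or the filesystem rejects the request, and
    AttributeError if the C library does not export renameat2().
    """
    ret = _libc.renameat2(
        ctypes.c_int(AT_FDCWD), ctypes.c_char_p(os.fsencode(path_a)),
        ctypes.c_int(AT_FDCWD), ctypes.c_char_p(os.fsencode(path_b)),
        ctypes.c_uint(RENAME_EXCHANGE),
    )
    if ret != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
```

Both paths must exist and be on the same filesystem. On a vfat filesystem with a kernel older than v6.0 the call is expected to fail, since the flag was not supported there.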

Although this feature has been in the Linux kernel since 2022, I’ve had to mention it a few times recently, so I thought a post might help raise awareness in case anyone finds it useful too.

Happy hacking!

How to install Fedora on an HP X2 Chromebook

Posted on November 4, 2022 by Javier Martinez Canillas

We have been working lately with Enric Balletbo and Dorinda Bassey to improve the support for the HP X2 Chromebook in Fedora. This post explains how to install Fedora on that Chromebook.

The article ended up being longer than I thought, so for the impatient this is the summary:

Switch the Chromebook to Developer Mode.
Boot from the internal disk.
Go to a virtual terminal with Ctrl+Alt+F2 and login as root.
Enable developer mode boot from external disk (USB/microSD):
```
  $ crossystem dev_boot_usb=1
```

Install packages needed by the chromebook-setup.sh script:

  $ sudo dnf install bc curl util-linux gdisk lz4 \
    e2fsprogs uboot-tools udisks2 vboot-utils \
    guestfs-tools qemu-user-static

Clone Enric’s chromebooks scripts repo:

  $ git clone https://github.com/eballetbo/chromebooks.git
  $ pushd chromebooks

Flash a Fedora image to a storage media (replace /dev/sda with your block device):

  $ sudo ./chromebook-setup.sh deploy_fedora \
    --architecture=arm64 --storage=/dev/sda

UPDATE: Enric added some useful tools/image-installer scripts that allow installing released Fedora versions (rather than Rawhide, which is what the chromebook-setup.sh script installs by default). For example, to install Fedora 39 you can instead just execute the following command:

   $ sudo ./tools/image-installer/fedora-workstation-39-aarch64 /dev/sda

Plug the USB/microSD device into the Chromebook and choose to boot from an external device.

After the Fedora initial setup, install the following packages:

  $ sudo dnf install uboot-tools vboot-utils lz4 unzboot -y

Remove packages that expect the grub2 bootloader to be used:

  $ sudo rm /etc/dnf/protected.d/{grub,shim}*
  $ sudo dnf remove grub2-* shim-* grubby kexec-tools -y

UPDATE: The Fedora aarch64 Linux kernel image format changed: kernels are no longer shipped as a compressed vmlinuz but as an EFI binary, and this new format is not supported by Coreboot. Enric wrote an unzboot tool that can extract the vmlinuz to be used for kernel updates.

UPDATE: This new package has been added to Fedora 39 and there is no need to fetch the rpm manually anymore as it was indicated before.

Enjoy Fedora on the Chromebook 🙂

Now the longer version…

Challenges supporting the Chromebooks in Fedora

Supporting the Chromebooks is not trivial, because these laptops use a different firmware (Coreboot) and boot stack (Depthcharge) than what is used by all the other aarch64 machines supported by Fedora (UEFI and GRUB). There are good reasons why that is the case, but it poses some challenges and makes it harder to get these laptops working in Fedora out-of-the-box.

For this reason, a standard Fedora ISO image can’t just be booted on a Chromebook to start an installation process.

Current approach used to install Fedora on the Chromebooks

To overcome this, Enric wrote a set of chromebook scripts that can be used to set up a block device (e.g: USB drive or microSD card) and write a Fedora image that can be booted directly. The script also adds to the system a kernel-install plugin, written by Dorinda, that takes the Linux kernel image being installed and packages it in the format (FIT) expected by the Chromebook bootloader.

That way, the Fedora installation would look and behave exactly the same as on any other system.

In the future, support for Chromebooks might be added to the Anaconda OS installer used by Fedora, but in the meantime using this script allows us to experiment and make further customizations that wouldn’t be suitable for a general system installer.

Following are the instructions to install Fedora on an HP X2 Chromebook using the chromebook-setup.sh script.

Switch the Chromebook to Developer Mode

In the default mode the Chromebook can only boot binaries that are trusted by its firmware. This means that nothing besides ChromeOS can be installed. To boot a different OS, the mode should be changed to Developer. In this mode, any binary that is signed with Google’s developer key can be booted. That key is available to anyone, so it is what is used to sign the Linux images when generating the FIT images during Fedora kernel package installations.

The ChromiumOS project has excellent articles explaining the Developer Mode and how to enable it on Chromebooks without a physical keyboard, such as the HP X2.

After enabling developer mode, boot from the internal disk and switch to a virtual terminal with Ctrl+Alt+F2. Then login as root and execute the following to enable booting from an external disk (USB/microSD):

$ crossystem dev_boot_usb=1

Flashing a Fedora image

The chromebook-setup.sh script can be used to flash Fedora images to a block device and do all the needed setup to make it bootable.

It supports many different options but it also has reasonable defaults. The options can be listed with ./chromebook-setup.sh --help.

Following are the steps to flash a Fedora image.

Install packages needed by the chromebook-setup.sh script:

$ sudo dnf install bc curl util-linux gdisk lz4 \
  e2fsprogs uboot-tools udisks2 vboot-utils \
  guestfs-tools qemu-user-static

Clone Enric’s chromebooks scripts repository:

$ git clone https://github.com/eballetbo/chromebooks.git
$ pushd chromebooks

Execute the script, for example:

$ sudo ./chromebook-setup.sh deploy_fedora \
  --architecture=arm64 --storage=/dev/sda \
  --kparams "clk_ignore_unused deferred_probe_timeout=30"

The deploy_fedora option will install a Fedora image on the specified storage media. By default the latest Fedora Workstation Rawhide image will be downloaded and used, but a different image can be chosen using the --image=$image option.

The --architecture and --storage options specify the architecture and block device used respectively.

Finally, the --kparams option allows setting additional kernel command line parameters.

The clk_ignore_unused parameter is currently needed because there is a bug when the MSM/snapdragon DRM driver is built as a module. Some needed clocks are gated before the driver probe function is executed, causing it to fail.

And the deferred_probe_timeout=30 is needed because there are a lot of driver probe deferrals and the default 10 seconds expires, causing drivers to fail to probe due to timeouts.

Hopefully these two issues will be fixed soon and the parameters won’t be needed anymore.

UPDATE: the bugs were fixed and there is no need for these two parameters anymore (as of Fedora 39).

Once the script finishes flashing the image, plug the USB drive or insert the microSD in the Chromebook and choose “Boot from external media”.

The system should boot and start the Fedora initial setup program to configure the system and create a user. Once that is done, start a terminal and install the following packages needed by the kernel-install Chromebook plugin:

$ sudo dnf install uboot-tools vboot-utils lz4 unzboot -y

Remove packages that expect the grub2 bootloader to be used:

$ sudo rm /etc/dnf/protected.d/{grub,shim}*
$ sudo dnf remove grub2-* shim-* grubby kexec-tools -y

And that’s it. Now Fedora should behave like on any other system. If there are any bugs, please file issues in the chromebooks scripts repository.

Happy hacking!

Booting Fedora with sd-boot and Secure Boot enabled

Posted on September 4, 2022 by Javier Martinez Canillas

Every once in a while there is a discussion in the Fedora development list about replacing the default GRUB bootloader. The latest of these threads was "future of dual booting Windows and Fedora, redux" about a month ago, but searching for GRUB in the mailing list archives one can find many similar conversations.

One of the options to replace GRUB that is always mentioned is systemd-boot (sd-boot), since it is quite minimal and enough for simple booting needs such as those of Fedora Workstation.

But currently this can’t be done without additional work, because shim only supports using GRUB as a second stage loader. There is an RFE: add support for multiple second stage loaders #472 filed for shim, but there wasn’t a conclusion on that issue about how to move forward.

This means that to use sd-boot instead of GRUB and keep Secure Boot enabled, users need to create their own key pairs and enroll a cert with the public key into the UEFI firmware db database.

To have an idea of what that would entail, I installed sd-boot and removed shim and GRUB in one of my machines. Below are the steps I followed in case someone wants to replicate it.

To keep the example generic, I mention how a public key can be enrolled with the Open Virtual Machine Firmware (OVMF) UEFI firmware used to boot virtual machines. But a similar procedure can be followed to enroll keys using the UEFI settings of a physical machine. Refer to your hardware vendor documentation for how to do this.

1. Create a key pair to sign the binaries and enroll into the UEFI firmware db key database

$ cat << EOF > configuration_file.config
[ req ]
default_bits = 4096
distinguished_name = req_distinguished_name
prompt = no
string_mask = utf8only
x509_extensions = myexts

[ req_distinguished_name ]
O = Organization
CN = Organization signing key
emailAddress = E-mail address

[ myexts ]
basicConstraints=critical,CA:FALSE
keyUsage=digitalSignature
subjectKeyIdentifier=hash
authorityKeyIdentifier=keyid
EOF

$ openssl req -x509 -new -nodes -utf8 -sha256 -days 36500 -batch -config \
configuration_file.config -outform DER -out public_key.der -keyout private_key.priv

$ openssl x509 -in public_key.der -inform DER -outform PEM -out public_key.pem

2. Enroll the public cert in the UEFI firmware db database

$ sudo cp public_key.der /boot/efi/

On reboot, enter into the UEFI firmware settings and go to:

Device Manager
    Secure Boot Configuration
        Secure Boot Mode

That will allow enrolling a key by choosing:

Custom Secure Boot Options
    DB options
        Enroll Signature
            Enroll Signature Using File

After that, when booting the enrolled key should be present in db and added to the Linux kernel .platform keyring:

$ mokutil --test-key public_key.der 
public_key.der is already in db

$ keyctl list %:.platform
3 keys in keyring:
577585465: ---lswrv     0     0 asymmetric: Microsoft Windows Production PCA 2011: a92902398e16c49778cd90f99e4f9ae17c55af53
137367868: ---lswrv     0     0 asymmetric: Organization signing key: ae6f49657c238754eedc503e98cb97afdb064706
984007323: ---lswrv     0     0 asymmetric: Microsoft Corporation UEFI CA 2011: 13adbf4309bd82709c8cd54f316ed522988a1bd4

3. Install sd-boot and remove shim and grub2 packages

$ sudo bootctl install
$ sudo mkdir /boot/efi/$(cat /etc/machine-id)
$ sudo rm /etc/dnf/protected.d/{shim,grub2}*.conf
$ sudo dnf remove shim* grub2* -y
$ sudo rm -r /boot/loader
$ sudo dnf reinstall kernel-core -y

4. Sign the sd-boot and Linux kernel binaries

$ sudo dnf install -y sbsigntools pesign

$ sudo sbsign --key private_key.priv --cert public_key.pem  /usr/lib/systemd/boot/efi/systemd-bootx64.efi \
--output /boot/efi/EFI/systemd/systemd-bootx64.efi

$ sudo sbsign --key private_key.priv --cert public_key.pem /lib/modules/$(uname -r)/vmlinuz \
--output /boot/efi/$(cat /etc/machine-id)/$(uname -r)/linux
Image was already signed; adding additional signature

Check that the binaries have been signed correctly:

$ sudo pesign -S -i /boot/efi/EFI/systemd/systemd-bootx64.efi
---------------------------------------------
certificate address is 0x7f348c12b1c8
Content was not encrypted.
Content is detached; signature cannot be verified.
The signer's common name is Test signing key
The signer's email address is e-mail address
Signing time: Thu Sep 01, 2022
There were certs or crls included.
---------------------------------------------

$ sudo pesign -S -i /boot/efi/$(cat /etc/machine-id)/$(uname -r)/linux
---------------------------------------------
certificate address is 0x7fb81a4b11e8
Content was not encrypted.
Content is detached; signature cannot be verified.
The signer's common name is Fedora Secure Boot Signer
No signer email address.
Signing time: Thu Apr 28, 2022
There were certs or crls included.
---------------------------------------------
certificate address is 0x7fb81a4b1b30
Content was not encrypted.
Content is detached; signature cannot be verified.
The signer's common name is kernel-signer
No signer email address.
Signing time: Thu Apr 28, 2022
There were certs or crls included.
---------------------------------------------
certificate address is 0x7fb81a4b26f8
Content was not encrypted.
Content is detached; signature cannot be verified.
The signer's common name is Organization signing key
The signer's email address is e-mail address
Signing time: Thu Sep 01, 2022
There were certs or crls included.
---------------------------------------------

On reboot the machine should be able to boot the signed sd-boot and Linux kernel with Secure Boot enabled:

$ mokutil --sb-state
SecureBoot enabled

5. Sign an out-of-tree kernel module

This step is mentioned for completeness, since some users need to load out-of-tree kernel modules. And this is only possible for signed modules when Secure Boot is enabled.

If that’s needed, following is an example on how to do it:

$ sudo dnf install kernel-devel -y

$ git clone https://github.com/maK-/SimplestLKM.git

$ pushd SimplestLKM

$ make

$ /usr/src/kernels/$(uname -r)/scripts/sign-file sha256 private_key.priv \
public_key.der hello.ko

$ modinfo hello.ko | grep signature
signature: 91:F7:20:16:1C:85:AC:DC:54:25:C2:B2:E9:ED:02:93:79:43:1D:7F:

$ sudo insmod hello.ko

$ dmesg
[ 1653.929819] hello: loading out-of-tree module taints kernel.
[ 1653.930435] Hello world!

A pain point with this setup is that one needs to manually sign the sd-boot and Linux kernel binaries each time they are updated. This could be automated of course, but that would require keeping the private key in the filesystem, which would defeat the whole purpose of the Secure Boot protection: an attacker that gets root access would then be able to sign any binaries they want and make their attack persistent across reboots.

For this reason, what I’ve been doing is keeping the private key on external media and only attaching it when there is a need to sign a new binary, which makes updating packages more inconvenient.

Hopefully this won’t be a problem anymore once the mentioned shim issue #472 gets resolved. Then the sd-boot binary could get signed during package build with the Fedora key, just like GRUB, and there won’t be a need to sign either sd-boot or the Linux kernel binaries with a custom key.

And that’s all for today. Happy hacking!

Using an I2C SSD1306 OLED on Fedora with a Raspberry Pi

Posted on August 18, 2022 by Javier Martinez Canillas

Linux 5.18 landed in Fedora 36 and it includes a new ssd130x DRM driver for the Solomon OLED display controllers.

One of the supported devices is SSD1306, which seems to be a quite popular display controller for small and cheap (I bought a pack of 3 on Amazon for less than 15€) monochrome OLED panels.

I do a lot of development and testing on RPi boards and found that these small OLED panels are useful to get console output without the need to connect either an HDMI monitor or a serial console.

This blog post explains how to set up Fedora 36 to use an SSD1306 OLED connected through I2C.

First you need to connect the SSD1306 display to the RPi. Adafruit has an excellent article on how to do that.

Then you need to install Fedora 36. This can be done by downloading an aarch64 Fedora raw image (e.g: Workstation Edition) and executing the following command:

sudo arm-image-installer --image Fedora-Workstation-36-1.5.aarch64.raw.xz --target=rpi4 \
--media=$device --addconsole --addkey=id_rsa.pub \
--norootpass --resizefs

where $device is the block device for your uSD (e.g: /dev/sda).

Fedora 36 was released with Linux 5.17, but as mentioned the ssd130x DRM driver landed in 5.18, so you need to update the Fedora kernel:

sudo dnf update kernel -y

Finally you need to configure the RPi firmware to enable the I2C1 pins and load the ssd1306 Device Tree Blob Overlay (dtbo) to register an SSD1306 I2C device, e.g:

sudo tee -a /boot/efi/config.txt << EOF
dtparam=i2c1=on
dtoverlay=ssd1306,inverted
EOF

The ssd1306.dtbo supports many parameters in case your display doesn’t match the defaults. Take a look at the RPi overlays README for more details about all these parameters and their possible values.

sudo sed -i 's/FONT=.*/FONT="drdos8x8"/' /etc/vconsole.conf
sudo dracut -f

And that’s it. If your SSD1306 uses SPI instead, this is supported since Linux 5.19 but that would be the topic for another post.

Happy hacking!

How to troubleshoot deferred probe issues in Linux

Posted on June 21, 2022 by Javier Martinez Canillas

When working on the retro handheld console mentioned in a previous post, I had an issue where the LCD driver was not probed when booting a custom-built Linux kernel image.

To understand the problem, first some knowledge is needed about how devices and drivers are registered in the Linux kernel, how these two sets are matched (bound) and what a probe deferral is.

If you are not familiar with these concepts, please read this post where they are explained in detail.

The problem is that the st7735r driver (that’s needed for the SPI TFT LCD panel I was using) requires a GPIO based backlight device. To make it more clear, let’s look at the relevant bits in the adafruit-st7735r-overlay.dts that’s used as a Device Tree Blob (DTB) overlay to register the needed devices:

fragment@2 {
...
    af18_backlight: backlight {
        compatible = "gpio-backlight";
...
    };
};

fragment@3 {
...
    af18: adafruit18@0 {
        compatible = "jianda,jd-t18003-t01";
        backlight = <&af18_backlight>;
...
    };
};

We see that the adafruit18@0 node for the panel has a backlight property whose value is a phandle to the af18_backlight label used to refer to the backlight node.

The drivers/gpu/drm/tiny/st7735r.c probe callback then uses the information in the DTB to attempt getting a backlight device:

static int st7735r_probe(struct spi_device *spi)
{
...
	dbidev->backlight = devm_of_find_backlight(dev);
	if (IS_ERR(dbidev->backlight))
		return PTR_ERR(dbidev->backlight);
...
}

The devm_of_find_backlight() function returns a pointer to the backlight device if it can be found, or a -EPROBE_DEFER error pointer if there is a backlight property defined in the DTB but the device it refers to could not be found.

For example, this can happen if the driver that registers that expected backlight device was not yet probed.

If the probe callback returns -EPROBE_DEFER, the kernel will then put the device that matched the driver but failed to probe in a deferred probe list. The list is iterated each time that a new driver is probed (since it could be that the newly probed driver registered the missing devices that forced the probe deferral).

My problem then was that the needed driver (CONFIG_BACKLIGHT_GPIO, since the backlight node has compatible = gpio-backlight) was not enabled in my kernel, so the panel device remained in the deferred probe list indefinitely due to a missing backlight device that was never registered.

This is quite a common issue on Device Tree based systems, and something that used to take me a lot of time to root cause when I started working on Linux embedded platforms.

A few years ago I added a /sys/kernel/debug/devices_deferred debugfs entry that exposes the list of deferred devices to user-space, which makes it much easier to figure out what devices couldn’t be bound due to their driver probe being deferred.

Later, Andrzej Hajda improved that and added to the devices_deferred debugfs entry support to print the reason of the deferral.

So checking what devices were deferred and the reason is now quite trivial, e.g:

$ cat /sys/kernel/debug/devices_deferred 
spi0.0  spi: supplier backlight not ready

Much better than spending a lot of time looking at kernel logs and adding debug printouts to figure out what’s going on.
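If the list has to be checked on many boards, for example from a test script, the entry is easy to parse. The following is a rough sketch that assumes each line holds a device name followed by whitespace and an optional reason, as in the output shown above; the function name is made up for this example, and reading the real debugfs file requires root and a mounted debugfs.

```python
def parse_deferred(text):
    """Parse devices_deferred content into a {device: reason} dict.

    The reason is an empty string when the kernel did not record one.
    """
    deferred = {}
    for line in text.splitlines():
        parts = line.split(None, 1)  # device name, then the rest of the line
        if not parts:
            continue  # skip blank lines
        reason = parts[1].strip() if len(parts) > 1 else ""
        deferred[parts[0]] = reason
    return deferred


if __name__ == "__main__":
    # Sample taken from the output above; on a real system the text would be
    # read from /sys/kernel/debug/devices_deferred.
    sample = "spi0.0  spi: supplier backlight not ready\n"
    for device, reason in parse_deferred(sample).items():
        print(device, "deferred:", reason or "(no reason recorded)")
```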

Happy hacking!

Linux drivers and devices registration, matching, aliases and modules autoloading

Posted on June 10, 2022 by Javier Martinez Canillas

Every once in a while I have to explain these concepts to someone, so I thought that it could be something worth writing about.

Device drivers and devices registration

The Linux kernel documentation covers quite well how device drivers and devices are registered and how these two are bound. But the summary is that drivers and devices are registered independently and each of these specify their given bus type. The Linux kernel device model then uses that information to bind drivers with devices of the same bus type.

Drivers are registered using the driver_register() function, which is usually called from either the driver’s module_init() function or platform code.

Devices are registered using the device_register() function, which is usually called by subsystems that parse a list of devices from some hardware topology description, by some enumerable bus, or by platform code that hardcodes the devices to be registered.

Drivers and device matching (binding)

When a driver is registered for a given bus, the list of devices registered for that bus is iterated to find a match.

In the same manner, when a device is registered, the list of drivers registered for the same bus is iterated to find a match.

That way, the order in which drivers and devices are registered doesn’t matter. A device will be bound to a driver regardless of which one was registered first.

Drivers’ probe callback

If a match is found, the driver’s probe callback is executed. This function contains the driver-specific logic to bind the driver with a device and any setup needed for this. The probe function returns 0 if the driver could be bound to the device successfully, or a negative errno code if the driver was not able to bind to the device.

Probe deferral

A special errno code -EPROBE_DEFER is used to indicate that the bound failed because a driver could not provide all the resources needed by the device. When this happens, the device is put into a deferred probe list and the probe is retried again at a later time.

That later time is when a new driver probes successfully. When this happens, the deferred probe list is iterated again and the kernel tries once more to bind every device on it to its matched driver.

If the newly probed driver provides a resource that was missing for the drivers whose probe was deferred, then their probe will succeed this time and their bound devices will be removed from the deferred list.

If all required resources are provided at some point, then all drivers should probe correctly and the deferred list should become empty.

It is a simple and elegant (albeit inefficient) solution to the fact that the order in which drivers and devices are registered is non-deterministic. As a consequence, a driver has no way to know whether a resource will never be available or whether the driver that would provide it has just not probed yet.

But even if the kernel probed the drivers in a deterministic order (e.g. by using device dependency information), a driver would still have no way to know whether, for example, the missing resource would be provided by a driver that was built as a kernel module and would be loaded much later by user-space or even manually by an operator.
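The retry mechanism can be illustrated with another user-space toy model. Everything in it is an assumption made for the sketch: the ToyCore class, the dictionary layout of a device and the panel, backlight and regulator names are hypothetical. Only the EPROBE_DEFER value is taken from the kernel headers.

```python
EPROBE_DEFER = 517  # same value as in the kernel's include/linux/errno.h


class ToyCore:
    """Toy model of the deferred probe list (hypothetical, not kernel code)."""

    def __init__(self):
        self.resources = set()  # resources provided by drivers that already probed
        self.deferred = []      # devices whose probe returned -EPROBE_DEFER
        self.bound = []         # names of bound devices, in bind order

    def probe(self, dev):
        # Model of a probe callback: 0 on success, negative errno otherwise.
        if not dev["needs"] <= self.resources:
            return -EPROBE_DEFER
        self.resources |= dev["provides"]
        return 0

    def add_device(self, dev):
        ret = self.probe(dev)
        if ret == -EPROBE_DEFER:
            self.deferred.append(dev)
        elif ret == 0:
            self.bound.append(dev["name"])
            # A successful probe triggers a new pass over the deferred list.
            self.retry_deferred()

    def retry_deferred(self):
        pending, self.deferred = self.deferred, []
        for dev in pending:
            self.add_device(dev)


core = ToyCore()
# Registered in the "wrong" order on purpose: consumers before providers.
core.add_device({"name": "panel", "needs": {"regulator", "backlight"}, "provides": set()})
core.add_device({"name": "backlight", "needs": {"regulator"}, "provides": {"backlight"}})
core.add_device({"name": "regulator", "needs": set(), "provides": {"regulator"}})
print(core.bound, core.deferred)
```

In this model the panel and the backlight stay on the deferred list until the regulator probes, and a device whose provider never shows up would simply stay deferred forever, which is the limitation discussed above.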

Module device tables

Each driver provides information about which devices it can be matched against, and usually this information is provided per firmware interface. For example, a driver that supports devices registered using both ACPI and Device Tree hardware descriptions will contain separate ID tables for the ACPI and OpenFirmware (OF) devices that can be matched.

To illustrate this, the drivers/input/touchscreen/hideep.c driver has the following device ID tables:

static const struct i2c_device_id hideep_i2c_id[] = {
	{ HIDEEP_I2C_NAME, 0 },
	{ }
};
MODULE_DEVICE_TABLE(i2c, hideep_i2c_id);

#ifdef CONFIG_ACPI
static const struct acpi_device_id hideep_acpi_id[] = {
	{ "HIDP0001", 0 },
	{ }
};
MODULE_DEVICE_TABLE(acpi, hideep_acpi_id);
#endif

#ifdef CONFIG_OF
static const struct of_device_id hideep_match_table[] = {
	{ .compatible = "hideep,hideep-ts" },
	{ }
};
MODULE_DEVICE_TABLE(of, hideep_match_table);
#endif

Module aliases

If the information defined in these tables is exported using the MODULE_DEVICE_TABLE() macro, then it will be present in the driver’s kernel module as alias entries in the module information.

For example, one can check the aliases of a given module using the modinfo command:

$ modinfo drivers/input/touchscreen/hideep.ko | grep alias
alias:          i2c:hideep_ts
alias:          acpi*:HIDP0001:*
alias:          of:N*T*Chideep,hideep-tsC*
alias:          of:N*T*Chideep,hideep-ts

Listed here are the aliases for the legacy I2C platform, ACPI and OF devices, which are exported by MODULE_DEVICE_TABLE(i2c, hideep_i2c_id), MODULE_DEVICE_TABLE(acpi, hideep_acpi_id) and MODULE_DEVICE_TABLE(of, hideep_match_table) respectively.
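The alias entries are glob patterns, so the way a device MODALIAS string is resolved to a module can be approximated with Python’s fnmatch. This is only a sketch: the device strings used in the usage note are made-up examples of what a bus could report, and the matching_aliases() helper is hypothetical.

```python
from fnmatch import fnmatchcase

# Alias patterns copied from the modinfo output above.
HIDEEP_ALIASES = [
    "i2c:hideep_ts",
    "acpi*:HIDP0001:*",
    "of:N*T*Chideep,hideep-tsC*",
    "of:N*T*Chideep,hideep-ts",
]


def matching_aliases(modalias, aliases=HIDEEP_ALIASES):
    """Return the alias patterns that match a device MODALIAS string."""
    return [alias for alias in aliases if fnmatchcase(modalias, alias)]
```

For instance, matching_aliases("acpi:HIDP0001:") selects the ACPI pattern, while a hypothetical OF device reporting "of:NtouchscreenT(null)Chideep,hideep-ts" only matches the last OF pattern. The pattern ending in C* exists for devices that list more compatible strings after hideep,hideep-ts.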

Module autoloading

The module alias information is only used by user-space; the kernel uses the actual device tables to match the driver with the registered devices. In fact, MODULE_DEVICE_TABLE() is a no-op if the driver is built into the kernel image rather than built as a module.

The way this works is that when a device is registered for a bus type, the uevent callback of its struct bus_type is executed and the bus driver reports a uevent so that udev can take some action. The uevent contains key-value pairs and one of them is the device MODALIAS.

For example, on my laptop, when the PCI bus is enumerated and my GPU is registered, the following uevent MODALIAS is sent (as shown by udevadm monitor -p):

KERNEL[189823.929341] add   /devices/pci0000:00/0000:00:02.0 (pci)                                                  
ACTION=add                                              
DEVPATH=/devices/pci0000:00/0000:00:02.0                                                                               
SUBSYSTEM=pci                                                                                                          
...
MODALIAS=pci:v00008086d00003EA0sv000017AAsd00002292bc03sc00i00                                                         
...

This information is then used by udev and passed to kmod to load the module if needed. It will do something like:

$ modprobe pci:v00008086d00003EA0sv000017AAsd00002292bc03sc00i00

This works because mod{probe,info} can also take a module alias instead of the module name. For example, the following shows the module that matches this alias:

$ modinfo pci:v00008086d00003EA0sv000017AAsd00002292bc03sc00i00 | grep ^name
name:           i915

This information is also present in sysfs:

$ cat /sys/devices/pci0000\:00/0000\:00\:02.0/uevent 
DRIVER=i915
PCI_CLASS=30000
PCI_ID=8086:3EA0
PCI_SUBSYS_ID=17AA:2292
PCI_SLOT_NAME=0000:00:02.0
MODALIAS=pci:v00008086d00003EA0sv000017AAsd00002292bc03sc00i00

$ cat /sys/devices/pci0000\:00/0000\:00\:02.0/modalias
pci:v00008086d00003EA0sv000017AAsd00002292bc03sc00i00
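The PCI MODALIAS packs the same IDs that the uevent file lists separately. A small sketch of how the string could be decoded, assuming the fixed-width uppercase hex layout seen above (the parse_pci_modalias() helper and its field names are my own):

```python
import re

# pci:v<vendor>d<device>sv<subvendor>sd<subdevice>bc<base class>sc<subclass>i<interface>
PCI_MODALIAS = re.compile(
    r"pci:v(?P<vendor>[0-9A-F]{8})d(?P<device>[0-9A-F]{8})"
    r"sv(?P<subvendor>[0-9A-F]{8})sd(?P<subdevice>[0-9A-F]{8})"
    r"bc(?P<base_class>[0-9A-F]{2})sc(?P<subclass>[0-9A-F]{2})"
    r"i(?P<interface>[0-9A-F]{2})"
)


def parse_pci_modalias(modalias):
    """Split a PCI MODALIAS string into its numeric fields."""
    m = PCI_MODALIAS.fullmatch(modalias)
    if m is None:
        raise ValueError(f"not a PCI modalias: {modalias!r}")
    return {key: int(value, 16) for key, value in m.groupdict().items()}


gpu = parse_pci_modalias("pci:v00008086d00003EA0sv000017AAsd00002292bc03sc00i00")
print(f"{gpu['vendor']:04X}:{gpu['device']:04X}")  # compare with PCI_ID in the uevent file
```

The decoded vendor and device values correspond to PCI_ID=8086:3EA0, the subsystem pair to PCI_SUBSYS_ID=17AA:2292, and the three class bytes together to PCI_CLASS=30000.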

In theory, drivers should only define device ID tables for the firmware interfaces that they support. That is, a driver that supports devices registered through, let’s say, ACPI should only need a struct acpi_device_id table. The same table should also be used both to match the driver with a device and to send the MODALIAS information. For example, if a device was registered through OF, only the struct of_device_id table should be used for both matching and module alias reporting.

But in practice things are more complicated and there are exceptions in some subsystems, although that’s a topic for another time since this post is already too long.

If you are curious about the possible pitfalls though, I wrote about a bug that I chased some time ago in Fedora where the cause was a driver not reporting the MODALIAS that one would expect.

Happy hacking!

Building a retro handheld console with Fedora and a RPi zero

Posted on May 29, 2022 by Javier Martinez Canillas

I built a retro console for my kids some time ago and they asked me if they could have a portable one.

Now that I have finished the retro handheld console, I thought it could be useful to share how it was done in case others want to replicate it.

The retro console running the game Doom.

Hardware

I used a Waveshare 128×128, 1.44inch LCD display HAT, which is a good fit because it contains both an LCD display and GPIO keys that can be used as a gamepad.

The board is a HAT for the Raspberry Pi Zero 2W. HATs are expansion boards whose connectors are compatible with the RPi Zero 2W pinout.

And that is all the hardware needed if the console will just be powered with a micro USB cable. But it is of course more fun to make the handheld portable and for that I used the Waveshare UPS HAT.

Software

I just used a stock Fedora Server image for this project; no additional software was needed beyond what is already packaged in the distro.

The image can be flashed using the arm-image-installer tool. There is no support for the RPi Zero 2W, but since it is quite similar to the RPi3, that can be used as the target instead:

sudo arm-image-installer --image=Fedora-Server-36-1.5.aarch64.raw.xz \
--target=rpi3 --media=/dev/$device --addkey=id_rsa.pub --norootpass --resizefs

Where $device is the block device for the uSD card used to install the OS.

Then I followed these steps:

Boot the uSD card, go through Fedora initial setup, create a retroarch user and make it a member of the video and input groups.
Install a Libretro emulator (e.g. mGBA for Game Boy Advance) and the Retroarch frontend.

sudo dnf install libretro-mgba retroarch

Create a user service to run the game.

$ mkdir -p ~/.config/systemd/user/
$ cat > ~/.config/systemd/user/retroarch.service << EOF
[Unit]
Description=Start Doom
[Service]
ExecStart=retroarch -L /lib64/libretro/mgba_libretro.so /home/retroarch/roms/doom.zip
Restart=always
[Install]
WantedBy=default.target
EOF

Enable the service and lingering for the retroarch user. The latter is needed to allow the service to start at boot even when the user is not logged in.

$ systemctl --user enable retroarch.service
$ sudo loginctl enable-linger retroarch

Add RPi config snippets to support the LCD and GPIO keys.

Two Device Tree Blob Overlays (DTBO) are used to drive the SPI controller of the LCD panel and the GPIO keys. These are adafruit-st7735r.dtbo and gpio-key.dtbo.

The overlays support options that can be configured in the RPi config.txt file such as the pins used, display resolution, if the display has to be rotated, the input event code that has to be reported for each GPIO key and so on.

For the HAT mentioned above, the following has to be added to the /boot/efi/config.txt file:

# Enable SPI
dtparam=spi=on
# TFT LCD Hat
dtoverlay=adafruit-st7735r,128x128,dc_pin=25,reset_pin=27,led_pin=24,rotate=90
# GPIO keys configuration
# Directional pad (KEY_UP, KEY_LEFT, KEY_RIGHT, KEY_DOWN, KEY_ENTER)
dtoverlay=gpio-key,gpio=6,active_low=1,gpio_pull=up,label=UP,keycode=103
dtoverlay=gpio-key,gpio=5,active_low=1,gpio_pull=up,label=LEFT,keycode=105
dtoverlay=gpio-key,gpio=26,active_low=1,gpio_pull=up,label=RIGHT,keycode=106
dtoverlay=gpio-key,gpio=19,active_low=1,gpio_pull=up,label=DOWN,keycode=108
dtoverlay=gpio-key,gpio=13,active_low=1,gpio_pull=up,label=PRESS,keycode=28
# Buttons (KEY_X, KEY_Z, KEY_A)
dtoverlay=gpio-key,gpio=21,active_low=1,gpio_pull=up,label=KEY_1,keycode=45
dtoverlay=gpio-key,gpio=20,active_low=1,gpio_pull=up,label=KEY_2,keycode=44
dtoverlay=gpio-key,gpio=16,active_low=1,gpio_pull=up,label=KEY_3,keycode=30

Work around a bug in the st7735r driver that prevents the module from being autoloaded.

There is a bug in the st7735r driver which causes the module to not be loaded automatically. To work around this issue, create a modules-load.d config snippet to force the module to be loaded:

$ echo st7735r | sudo tee /etc/modules-load.d/st7735r.conf

Unfortunately this is quite a common bug in SPI drivers. For this particular driver the workaround should not be needed in the future, since the bug was already fixed by this commit, but at the time of this writing the Fedora version used (36) still does not contain the fix.

Prevent the simpledrm driver from being initialized.

The firmware seems to add a "simple-framebuffer" Device Tree node even when there is no monitor connected to its mini HDMI port. This leads to the simpledrm driver being probed, so its initialization needs to be deny-listed using a kernel command line parameter:

$ sudo grubby --update-kernel=DEFAULT \
--args=initcall_blacklist=simpledrm_platform_driver_init

Update the adafruit-st7735r.dtbo to the latest version.

This was the only change I needed in Fedora 36. The issue is that the adafruit-st7735r.dtbo overlay is for the legacy fb_st7735r driver instead of the DRM st7735r driver. The latter has a different Device Tree node property to specify the display rotation and so the rotate=90 option specified in the config.txt file will be ignored.

$ wget https://github.com/raspberrypi/firmware/raw/master/boot/overlays/adafruit-st7735r.dtbo
$ sudo mv adafruit-st7735r.dtbo /boot/efi/overlays/
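As a side note on the config.txt step, the gpio-key lines can also be generated from a small table, which makes the GPIO to key mapping easier to review. This is just a sketch: the helper and table names are hypothetical, the GPIO numbers and labels are copied from the snippet above, and the key codes are the values defined in linux/input-event-codes.h.

```python
# Key codes as defined in linux/input-event-codes.h
KEYCODES = {
    "KEY_ENTER": 28, "KEY_A": 30, "KEY_Z": 44, "KEY_X": 45,
    "KEY_UP": 103, "KEY_LEFT": 105, "KEY_RIGHT": 106, "KEY_DOWN": 108,
}

# (GPIO, label, key) tuples copied from the config.txt snippet above
BUTTONS = [
    (6, "UP", "KEY_UP"),
    (5, "LEFT", "KEY_LEFT"),
    (26, "RIGHT", "KEY_RIGHT"),
    (19, "DOWN", "KEY_DOWN"),
    (13, "PRESS", "KEY_ENTER"),
    (21, "KEY_1", "KEY_X"),
    (20, "KEY_2", "KEY_Z"),
    (16, "KEY_3", "KEY_A"),
]


def gpio_key_overlay(gpio, label, key):
    """Format one gpio-key dtoverlay line for an active-low key with a pull-up."""
    return (f"dtoverlay=gpio-key,gpio={gpio},active_low=1,gpio_pull=up,"
            f"label={label},keycode={KEYCODES[key]}")


config_lines = [gpio_key_overlay(*button) for button in BUTTONS]
print("\n".join(config_lines))
```

Changing which key a button reports then only requires editing the BUTTONS table and regenerating the lines.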

Happy gaming!
