How to troubleshoot deferred probe issues in Linux

When working on the retro handheld console mentioned in a previous post, I had an issue where the LCD driver was not probed when booting a custom Linux kernel image built.

To understand the problem, first some knowledge is needed about how devices and drivers are registered in the Linux kernel, how these two sets are matched (bound) and what is a probe deferral.

If you are not familiar with these concepts, please read this post where are explained in detail.

The problem is that the st7735r driver (that’s needed for the SPI TFT LCD panel I was using) requires a GPIO based backlight device. To make it more clear, let’s look at the relevant bits in the adafruit-st7735r-overlay.dts that’s used as a Device Tree Blob (DTB) overlay to register the needed devices:

fragment@2 {
...
    af18_backlight: backlight {
        compatible = "gpio-backlight";
...
    };
};

fragment@3 {
...
    af18: adafruit18@0 {
        compatible = "jianda,jd-t18003-t01";
        backlight = <&af18_backlight>;
...
    };
};

We see that the adafruit18@0 node for the panel has a backlight property whose value is a phandle to the af18_backlight label used to refer to the backlight node.

The drivers/gpu/drm/tiny/st7735r.c probe callback then uses the information in the DTB to attempt getting a backlight device:

static int st7735r_probe(struct spi_device *spi)
{
...
	dbidev->backlight = devm_of_find_backlight(dev);
	if (IS_ERR(dbidev->backlight))
		return PTR_ERR(dbidev->backlight);
...
}

The devm_of_find_backlight() function returns a pointer to the backlight device if this could be found or a -EPROBE_DEFER error pointer if there is a backlight property defined in the DTB but this could not be found.

For example, this can happen if the driver that registers that expected backlight device was not yet probed.

If the probe callback returns -EPROBE_DEFER, the kernel then will put the device that matched the driver but failed to probe in a deferred probe list. The list is iterated each time that a new driver is probed (since it could be that the newly probed driver registered the missing devices that forced the probe deferral).

My problem then was that the needed driver (CONFIG_BACKLIGHT_GPIO since the backlight node has compatible = gpio-backlight) was not enabled in my kernel, leading to the panel device to remain in the deferred probe list indefinitely due a missing backlight device that was never registered.

This is quite a common issue on Device Tree based systems and something that it used to take me a lot of time to root cause when I started working on Linux embedded platforms.

A few years ago I added a /sys/kernel/debug/devices_deferred debugfs entry that would expose the list of devices deferred to user-space, which makes much more easier to figure out what devices couldn’t be bound due their driver probe being deferred.

Later, Andrzej Hajda improved that and added to the devices_deferred debugfs entry support to print the reason of the deferral.

So checking what devices were deferred and the reason is now quite trivial, i.e:

$ cat /sys/kernel/debug/devices_deferred 
spi0.0  spi: supplier backlight not ready

Much better than spending a lot of time looking at kernel logs and adding debug printous to figure out what’s going on.

Happy hacking!

Linux drivers and devices registration, matching, aliases and modules autoloading

Every once in a while I have to explain these concepts to someone so I thought that it could be something worth to write about.

Device drivers and devices registration

The Linux kernel documentation covers quite well how device drivers and devices are registered and how these two are bound. But the summary is that drivers and devices are registered independently and each of these specify their given bus type. The Linux kernel device model then uses that information to bind drivers with devices of the same bus type.

Drivers and devices are registered using the driver_register() function which is usually called from either the drivers’ module_init() function or platform code.

Devices are registered using the register_device() function which is usually called by subsystems that parses a list of devices from some hardware topology description, some enumerable bus or platform code that hardcodes the devices to be registered.

Drivers and device matching (binding)

When a driver is registered for a given bus, the list of devices registered for that bus is iterated to find a match.

In the same manner, when a device is registered, the list of drivers registered for the same bus is iterated to find a match.

That way, it doesn’t matter the order in which drivers and devices are registered. They a device will be bound to a driver regardless of which one was registered first.

Drivers’ probe callback

If a match is found, the driver’s probe callback is executed. This function handler contains the driver-specific logic to bind the driver with a device and any setup needed for this. The probe function returns 0 if the driver could be bound to the device successfully or a negative errno code if the driver was not able to bound the device.

Probe deferral

A special errno code -EPROBE_DEFER is used to indicate that the bound failed because a driver could not provide all the resources needed by the device. When this happens, the device is put into a deferred probe list and the probe is retried again at a later time.

That later time is when a new driver probes successfully. When this happens, the device deferred probe list is iterated again and all devices are tried to bind again with their matched driver.

If the newly probed driver provides a resource that was missing by drivers whose probe was deferred, then their probe will succeed this time and their bound devices will be removed from the deferred list.

If all required resources are provided at some point, then all drivers should probe correctly and the deferred list should become empty.

It is a simple and elegant (albeit inefficient) solution to the fact that drivers and devices registration are non-deterministic. This leads to drivers not having a way to know if a resource won’t ever be available or is just that the driver that would provide the resource has just not probed yet.

But even if the kernel probed the drivers in a deterministic order (i.e: by using device dependency information), the driver would have no way to know if for example the missing resource would be provided by a driver that was built as a kernel module and would be loaded much later by user-space or even manually by an operator.

Module device tables

Each driver provides information about what devices can be matched against and usually this information is provided on a per firmware basis. For example, a driver that supports devices registered using both ACPI and Device Tree hardware descriptions, will contain separate ID tables for the ACPI and OpenFirmware (OF) devices that can be matched.

To illustrate this, the drivers/input/touchscreen/hideep.c has the following device ID tables:

static const struct i2c_device_id hideep_i2c_id[] = {
	{ HIDEEP_I2C_NAME, 0 },
	{ }
};
MODULE_DEVICE_TABLE(i2c, hideep_i2c_id);

#ifdef CONFIG_ACPI
static const struct acpi_device_id hideep_acpi_id[] = {
	{ "HIDP0001", 0 },
	{ }
};
MODULE_DEVICE_TABLE(acpi, hideep_acpi_id);
#endif

#ifdef CONFIG_OF
static const struct of_device_id hideep_match_table[] = {
	{ .compatible = "hideep,hideep-ts" },
	{ }
};
MODULE_DEVICE_TABLE(of, hideep_match_table);
#endif

Module aliases

If information defined in these tables are exported using the MODULE_DEVICE_TABLE() macro, then these will be in the drivers kernel modules as alias entries in the module information.

For example, one can check the module aliases for a given module using the modinfo command, i.e:

$ modinfo drivers/input/touchscreen/hideep.ko | grep alias
alias:          i2c:hideep_ts
alias:          acpi*:HIDP0001:*
alias:          of:N*T*Chideep,hideep-tsC*
alias:          of:N*T*Chideep,hideep-ts

Here are listed the legacy I2C platform, ACPI and OF devices that are exported by MODULE_DEVICE_TABLE(i2c, hideep_i2c_id), MODULE_DEVICE_TABLE(acpi, hideep_acpi_id) and MODULE_DEVICE_TABLE(of, hideep_match_table) respectively.

Module autoloading

The module aliases information is only used by user-space, the kernel uses the actual device tables to match the driver with the registered devices. In fact, the MODULE_DEVICE_TABLE() is a no-op if the driver is built-in the kernel image and not built as a module.

The way this work is that when a device is registered for a bus type, the struct bus_type.uevent callback is executed and the bus driver reports a uevent to udev to take some actions. The uevent contains key-value pairs and one of them is the device MODALIAS.

For example, on my laptop when the PCI bus is enumerated and my GPU registered the following uevent MODALIAS will be sent (as shown by udevadm monitor -p):

KERNEL[189823.929341] add   /devices/pci0000:00/0000:00:02.0 (pci)                                                  
ACTION=add                                              
DEVPATH=/devices/pci0000:00/0000:00:02.0                                                                               
SUBSYSTEM=pci                                                                                                          
...
MODALIAS=pci:v00008086d00003EA0sv000017AAsd00002292bc03sc00i00                                                         
...

This information is then used by udev and pass to kmod to load the module if needed. It will do something like:

$ modprobe pci:v00008086d00003EA0sv000017AAsd00002292bc03sc00i00

Since mod{probe,info} can also take a module alias besides the module name. For exapmle, the following should tell the module that matches this alias:

$ modinfo pci:v00008086d00003EA0sv000017AAsd00002292bc03sc00i00 | grep ^name
name:           i915

This information is also present in sysfs, i.e:

$ cat /sys/devices/pci0000\:00/0000\:00\:02.0/uevent 
DRIVER=i915
PCI_CLASS=30000
PCI_ID=8086:3EA0
PCI_SUBSYS_ID=17AA:2292
PCI_SLOT_NAME=0000:00:02.0
MODALIAS=pci:v00008086d00003EA0sv000017AAsd00002292bc03sc00i00

$ /sys/devices/pci0000\:00/0000\:00\:02.0/modalias 
pci:v00008086d00003EA0sv000017AAsd00002292bc03sc00i00

In theory, drivers should only define device ID tables for the firmware interfaces that they support. That is, a driver that supports devices registered through let’s say ACPI should only need a struct acpi_device_id. And also the same table should be used to match the driver with a device and send the MODALIAS information. For example if a device was registered through OF, only the struct of_device_id should be used for both matching and module alias reporting.

But in practice things are more complicated and there are exceptions in some subsystems, although that’s a topic for another time since this post got already too long.

If you are curious about the possible pitfalls though, I wrote about a bug chased some time ago in Fedora where the cause was a driver not reporting the MODALIAS that one would expect.

Happy hacking!