Pollable Character Driver

Overview

A pollable character driver allows userspace applications to monitor a device file with poll(), select(), or epoll().

This model is useful when a driver produces asynchronous events, such as:

  • GPIO interrupt events
  • sensor data-ready events
  • hardware FIFO watermark events
  • modem or UART receive events
  • custom kernel-to-userspace notifications

The key idea is:

kernel event source
push event into driver queue
wake_up_interruptible()
poll()/epoll() wakes up
userspace read()
driver pops one event

Poll Readiness Meaning

EPOLLIN does not mean that data must come from hardware or a remote peer.

It means:

read() can be performed without blocking

For a character driver, the driver defines what "readable" means.

In this lab:

event queue is not empty
read() can return one event
poll() returns EPOLLIN | EPOLLRDNORM
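
From userspace, this readiness rule can be probed with poll(). The helper below is a minimal sketch, not part of the lab driver: the name mypoll_wait_readable() is hypothetical, and it works on any file descriptor whose driver implements a poll callback.

```c
#include <errno.h>
#include <poll.h>

/*
 * Returns 1 if fd reports POLLIN within timeout_ms, 0 if it does not
 * (timeout, or only error/hangup bits were set), or a negative errno
 * value if poll() itself fails.
 */
int mypoll_wait_readable(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int ret;

    do {
        ret = poll(&pfd, 1, timeout_ms);
    } while (ret < 0 && errno == EINTR);   /* retry after a signal */

    if (ret < 0)
        return -errno;
    if (ret == 0)
        return 0;                          /* timeout: nothing reported */

    /* POLLIN is how poll() reports the driver's EPOLLIN bit. */
    return (pfd.revents & POLLIN) ? 1 : 0;
}
```

With timeout_ms set to 0 the call returns immediately, which makes it a pure readiness check: a result of 1 means a following read() should not block.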

Core Kernel Mechanisms

Wait Queue

The driver uses a wait queue to allow blocking readers and poll/epoll waiters to sleep until an event is available.

wait_queue_head_t wq;

The wait queue is initialized with:

init_waitqueue_head(&mydev->wq);

No explicit destroy operation is required for an embedded wait queue object.


Mutex

The event queue is shared between:

  • write()
  • read()
  • poll()

Therefore, queue state must be protected by a mutex.

struct mutex lock;

The mutex is initialized with:

mutex_init(&mydev->lock);

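No explicit destroy operation is strictly required for an embedded mutex object. Calling mutex_destroy() at teardown is still good practice: it compiles to nothing on non-debug builds, and with CONFIG_DEBUG_MUTEXES enabled it helps catch use of the mutex after teardown.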


Driver Event Queue

The driver uses a small ring buffer as an event queue.

#define MYPOLL_QUEUE_SIZE 16

struct mypoll_event {
    u32 counter;
};

struct mypoll_queue {
    struct mypoll_event events[MYPOLL_QUEUE_SIZE];
    unsigned int head;
    unsigned int tail;
    unsigned int count;
};

Queue state meaning:

Field      Meaning
head       Index of the next event to read
tail       Index of the next free slot for write
count      Number of queued events
events[]   Ring buffer storage
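
The field meanings above suggest how the queue helpers could work. The following is a sketch under stated assumptions rather than the lab's actual code: it is written so that it also compiles in userspace (uint32_t stands in for u32), mypoll_queue_pop() is a hypothetical name, and -EAGAIN for an empty queue is an assumed return value.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define MYPOLL_QUEUE_SIZE 16

struct mypoll_event {
    uint32_t counter;            /* u32 in kernel code */
};

struct mypoll_queue {
    struct mypoll_event events[MYPOLL_QUEUE_SIZE];
    unsigned int head;           /* index of the next event to read */
    unsigned int tail;           /* index of the next free slot */
    unsigned int count;          /* number of queued events */
};

static bool mypoll_queue_is_empty(const struct mypoll_queue *q)
{
    return q->count == 0;
}

/* Caller is expected to hold the device mutex. Returns 0 or -ENOBUFS. */
static int mypoll_queue_push(struct mypoll_queue *q,
                             const struct mypoll_event *ev)
{
    if (q->count == MYPOLL_QUEUE_SIZE)
        return -ENOBUFS;         /* full: the event is rejected */

    q->events[q->tail] = *ev;
    q->tail = (q->tail + 1) % MYPOLL_QUEUE_SIZE;   /* wrap around */
    q->count++;
    return 0;
}

/* Caller is expected to hold the device mutex. Returns 0 or -EAGAIN. */
static int mypoll_queue_pop(struct mypoll_queue *q, struct mypoll_event *ev)
{
    if (mypoll_queue_is_empty(q))
        return -EAGAIN;

    *ev = q->events[q->head];
    q->head = (q->head + 1) % MYPOLL_QUEUE_SIZE;   /* wrap around */
    q->count--;
    return 0;
}
```

Keeping an explicit count avoids the usual ambiguity between a full and an empty ring when head equals tail.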

Driver Context

The driver context contains the char device resources, synchronization objects, and event queue.

struct mypoll_dev {
    dev_t devt;
    struct cdev cdev;
    struct class *class;
    struct device *dev;

    wait_queue_head_t wq;
    struct mutex lock;

    u32 event_counter;

    struct mypoll_queue queue;
};

event_counter represents the last successfully queued event number.


Queue Helper Responsibility

Queue helper functions are internal helpers.

The driver design assumes that queue mutation helpers are called while holding the device mutex.

Example:

mutex_lock(&mydev->lock);
ret = mypoll_queue_push(&mydev->queue, &event);
mutex_unlock(&mydev->lock);

The only exception is the wait condition helper used by wait_event_interruptible(), because the wait condition is evaluated without holding the device mutex.

Therefore, that helper uses READ_ONCE().

static bool mypoll_is_event_enqueued(const struct mypoll_queue *q)
{
    return READ_ONCE(q->count) > 0;
}

Write Path

In this lab, write() is used as a fake event source.

Any userspace write generates one event.

userspace write()
create event
push event into queue
wake_up_interruptible()

The write path must update the queue before waking waiters.

mutex_lock(&mydev->lock);

event.counter = mydev->event_counter + 1;

ret = mypoll_queue_push(&mydev->queue, &event);
if (ret) {
    mutex_unlock(&mydev->lock);
    return ret;
}

mydev->event_counter = event.counter;

mutex_unlock(&mydev->lock);

wake_up_interruptible(&mydev->wq);

If the queue is full, write() returns -ENOBUFS.
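
The bookkeeping in the write path can be modeled in plain C to show why event_counter is committed only after a successful push. This is a single-threaded userspace model, not driver code: struct mypoll_dev_model and mypoll_model_emit() are hypothetical names, and the locking and wakeup steps appear only as comments.

```c
#include <errno.h>
#include <stdint.h>

#define MYPOLL_QUEUE_SIZE 16

struct mypoll_dev_model {
    uint32_t events[MYPOLL_QUEUE_SIZE];
    unsigned int tail;
    unsigned int count;
    uint32_t event_counter;      /* last successfully queued event number */
};

/* Returns 0 on success or -ENOBUFS when the queue is full. */
int mypoll_model_emit(struct mypoll_dev_model *d)
{
    uint32_t next = d->event_counter + 1;

    /* The driver takes the device mutex here. */
    if (d->count == MYPOLL_QUEUE_SIZE)
        return -ENOBUFS;         /* counter untouched, so no number is skipped */

    d->events[d->tail] = next;
    d->tail = (d->tail + 1) % MYPOLL_QUEUE_SIZE;
    d->count++;
    d->event_counter = next;     /* commit only after the push succeeded */

    /* The driver drops the mutex, then calls wake_up_interruptible(). */
    return 0;
}
```

Because a rejected write leaves event_counter unchanged, readers see consecutive event numbers even when some writes fail with -ENOBUFS.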


Poll Path

The poll callback has two jobs:

  1. Register the wait queue with poll_wait()
  2. Return the current readiness mask

__poll_t mask = 0;

poll_wait(file, &mydev->wq, wait);

mutex_lock(&mydev->lock);

if (!mypoll_queue_is_empty(&mydev->queue))
    mask |= EPOLLIN | EPOLLRDNORM;

mutex_unlock(&mydev->lock);

return mask;

poll_wait() does not directly block the process.

It tells the poll/epoll core:

If this file is not ready now, this is the wait queue that can wake it later.
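
On the userspace side, the matching epoll sequence might look like the sketch below. The helper name mypoll_epoll_wait_once() is hypothetical, and a real event loop would normally keep one epoll instance alive instead of creating one per call.

```c
#include <errno.h>
#include <sys/epoll.h>
#include <unistd.h>

/*
 * Registers fd with a fresh epoll instance and waits up to timeout_ms.
 * Returns 1 if EPOLLIN was reported, 0 if not, negative errno on error.
 */
int mypoll_epoll_wait_once(int fd, int timeout_ms)
{
    struct epoll_event ev = { .events = EPOLLIN, .data.fd = fd };
    struct epoll_event out;
    int epfd, n, ret;

    epfd = epoll_create1(0);
    if (epfd < 0)
        return -errno;

    /*
     * Registration is where the kernel first calls the driver's poll
     * callback, so this is when poll_wait() hooks up the wait queue.
     */
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) < 0) {
        ret = -errno;
        close(epfd);
        return ret;
    }

    do {
        n = epoll_wait(epfd, &out, 1, timeout_ms);
    } while (n < 0 && errno == EINTR);     /* retry after a signal */

    if (n < 0)
        ret = -errno;
    else if (n == 0)
        ret = 0;                           /* timeout */
    else
        ret = (out.events & EPOLLIN) ? 1 : 0;

    close(epfd);
    return ret;
}
```

The process sleeps inside epoll_wait(), not inside the driver's poll callback, which is consistent with poll_wait() being a registration step only.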

Read Path

The read path consumes one queued event.

Behavior:

Condition                  Behavior
Queue has event            Pop one event and return it
Queue empty + O_NONBLOCK   Return -EAGAIN
Queue empty + blocking fd  Sleep until an event is queued

The blocking read flow is:

while true:
    lock
    if queue not empty:
        pop event
        unlock
        break
    unlock

    if O_NONBLOCK:
        return -EAGAIN

    ret = wait_event_interruptible(wq, queue not empty)
    if ret != 0:
        return ret        (-ERESTARTSYS: interrupted by a signal)

It is important to re-check the queue after waking up.

A wakeup means that the condition may have become true, but it does not guarantee that the event is still available when this reader gets the mutex.

Another reader may have consumed it first.


File Offset Handling

This driver does not use *off.

That is intentional.

This device behaves like an event device, not a file-like device.

For file-like character drivers, read() often advances *off and returns EOF after all data has been read.

For event devices:

read() means consume one event

Therefore, this driver does not do:

*off += n;

and does not do:

if (*off > 0)
    return 0;

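Using the file offset in this event model can break long-running epoll readers. For example, if read() returned 0 whenever *off > 0, a reader would see EOF right after its first event, even though poll() keeps reporting new events.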


Event Reliability

This lab uses a simple queue model.

Events are popped before copy_to_user().

If copy_to_user() fails, the event has already been consumed.

This is acceptable for a learning driver and a simple notification-style device.

A production driver that provides reliable event records may choose a stricter design:

peek event
copy to userspace
only pop after successful copy

However, that design requires additional care for multi-reader ownership and synchronization.
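
The peek, copy, pop ordering can be illustrated with a small userspace model. It is not driver code: the copy step is passed in as a function pointer standing in for copy_to_user(), and all names (mypoll_queue_model, copy_fn, mypoll_model_read_reliable(), copy_ok(), copy_fail()) are hypothetical.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define MYPOLL_QUEUE_SIZE 16

struct mypoll_queue_model {
    uint32_t events[MYPOLL_QUEUE_SIZE];
    unsigned int head;
    unsigned int count;
};

/* Stand-in for copy_to_user(): returns 0 on success, nonzero on failure. */
typedef int (*copy_fn)(void *dst, const void *src, size_t len);

static int copy_ok(void *dst, const void *src, size_t len)
{
    memcpy(dst, src, len);
    return 0;
}

static int copy_fail(void *dst, const void *src, size_t len)
{
    (void)dst; (void)src; (void)len;
    return 1;                            /* simulate a faulting user buffer */
}

/*
 * Delivers the oldest event only if the copy succeeds. Returns 0,
 * -EAGAIN if the queue is empty, or -EFAULT if the copy failed, in
 * which case the event stays queued.
 */
int mypoll_model_read_reliable(struct mypoll_queue_model *q, void *dst,
                               copy_fn copy)
{
    const uint32_t *ev;

    if (q->count == 0)
        return -EAGAIN;

    ev = &q->events[q->head];            /* peek: queue not modified yet */
    if (copy(dst, ev, sizeof(*ev)))
        return -EFAULT;                  /* event is still at the head */

    q->head = (q->head + 1) % MYPOLL_QUEUE_SIZE;   /* pop after success */
    q->count--;
    return 0;
}
```

In a real driver with several readers, the peek and the pop would presumably need to happen under a single lock hold, or the event would need to be reserved for one reader, so that two readers cannot deliver the same event.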


Current Driver Semantics

This driver provides:

Feature            Behavior
Event source       Userspace write()
Event storage      FIFO queue
Read readiness     Queue is not empty
Blocking read      Waits for queued event
Non-blocking read  Returns -EAGAIN when queue is empty
Queue full         write() returns -ENOBUFS
Poll event         EPOLLIN | EPOLLRDNORM

Summary

The pollable character driver model connects kernel event generation with userspace event loops.

The essential pattern is:

event happens
update driver state
wake_up_interruptible()
poll()/epoll() reports readiness
read() consumes data

This is the same event-driven idea used by many Linux subsystems and fits naturally with userspace epoll event loops.