
Introduction to Linux Device Driver Development

Prepared by: Richard A. Sevenich, rsevenic@netscape.net

Chapter 8: Introduction to Block Drivers with an Example


References:
● Robert Love, Linux Kernel Development, Sams, 2004.
● Alessandro Rubini & Jonathan Corbet, Linux Device Drivers, 2nd Edition, O'Reilly and Associates, June 2001.
● Daniel P. Bovet & Marco Cesati, Understanding the Linux Kernel, O'Reilly and Associates, October 2000.
● Neil Matthew and Richard Stones, Beginning Linux Programming, 2nd Edition, Wrox Press Ltd. (1999).
● Linux source code via http://lxr.linux.no/

8.0 Introductory Comments


Compared to the classic character device driver, the block device driver is relatively complex. Two principal
reasons for this are:
• the buffer cache which adds a layer of necessary complexity between the user and the device
• the relative complexity of a disk drive's geometry specification

8.0.1 Some Important Kernel Data Structures

The block driver API has evolved and can be expected to continue to do so. Compared to the 2.2 series kernel, the
2.4 kernel has simplified the low level driver with a cleaner interface to the VFS. The 2.6 kernel has streamlined I/O
operations by adding a new data structure designed for I/O operations in progress. Although an in-depth
understanding of the buffer cache (and its interaction with the paging system) would be useful, it is currently
considered outside the scope of this course. However, we will introduce some of the kernel data structures that are
important. Even though they are not used directly by the driver, their description will give us a little deeper
understanding.

From a hardware point of view the sector is the smallest addressable unit on a block device. This is, in bytes, a
power of two. The typical sector size is 512 bytes. From the viewpoint of the software supporting the file system, the
smallest logically addressable unit is the block. We expect it to hold a number of sectors, which is a power of 2. The
ext2 file system allows this size to be specified when the file system is created and common choices include 1024,
2048, or 4096 bytes. Of course, when we bring a file into memory, we must put the blocks into pages; so we expect
a page to hold some number of blocks, again a power of 2. Page size is typically dictated by the architecture and for
the IA32, this is 4096 bytes (there is a very large page size available, as well).

Linux provides a data structure called the buffer_head to fully describe the file block brought into memory, one
buffer_head per block. This is shown below. Many of the fields would be what we might expect, others
less obvious. Perhaps the major feature to notice is its size, since there is one of these for each file block.

R.A. Sevenich © 2004 Introduction to Linux Device Driver Development 8 -1


struct buffer_head {
  /* First cache line: */
  struct buffer_head *b_next;       /* Hash queue list */
  unsigned long b_blocknr;          /* block number */
  unsigned short b_size;            /* block size */
  unsigned short b_list;            /* List that this buffer appears */
  kdev_t b_dev;                     /* device (B_FREE = free) */

  atomic_t b_count;                 /* users using this block */
  kdev_t b_rdev;                    /* Real device */
  unsigned long b_state;            /* buffer state bitmap (see above) */
  unsigned long b_flushtime;        /* Time when (dirty) buffer should be written */

  struct buffer_head *b_next_free;  /* lru/free list linkage */
  struct buffer_head *b_prev_free;  /* doubly linked list of buffers */
  struct buffer_head *b_this_page;  /* circular list of buffers in one page */
  struct buffer_head *b_reqnext;    /* request queue */

  struct buffer_head **b_pprev;     /* doubly linked list of hash-queue */
  char * b_data;                    /* pointer to data block */
  struct page *b_page;              /* the page this bh is mapped to */
  void (*b_end_io)(struct buffer_head *bh, int uptodate); /* I/O completion */
  void *b_private;                  /* reserved for b_end_io */

  unsigned long b_rsector;          /* Real buffer location on disk */
  wait_queue_head_t b_wait;

  struct list_head b_inode_buffers; /* doubly linked list of inode dirty buffers */
};

In the 2.4 kernel, this was also the data item used for I/O operations. It was realized that much of the information in
this struct was extraneous to the actual data transfer, and there was one of these for every block in
the transfer. Consequently, in the 2.6 kernel a new data structure, the bio struct, was devised to streamline block
I/O operations. There is one bio struct for every I/O operation, but such an operation could involve many blocks. In
our discussion we will focus on the bio_vec struct, a list of which is referenced from within the bio struct. This is
the new representation of the block for supporting I/O operations. Here is the small bio_vec struct:

struct bio_vec {
struct page *bv_page;
unsigned int bv_len;
unsigned int bv_offset;
};

where
● bv_page is a reference to the page on which the block is located
● bv_len is the number of bytes in the block
● bv_offset is the position of the block within the page

To see how this is used, we will need to reference three other fields in the bio struct:
bi_io_vec
bi_idx
bi_vcnt

Before embarking on that short discussion, we'll show the entire bio struct.


struct bio {
  sector_t bi_sector;
  struct bio *bi_next;              /* request queue link */
  struct block_device *bi_bdev;
  unsigned long bi_flags;           /* status, command, etc */
  unsigned long bi_rw;              /* bottom bits READ/WRITE,
                                     * top bits priority
                                     */

  unsigned short bi_vcnt;           /* how many bio_vec's */
  unsigned short bi_idx;            /* current index into bvl_vec */

  /* Number of segments in this BIO after
   * physical address coalescing is performed.
   */
  unsigned short bi_phys_segments;

  /* Number of segments after physical and DMA remapping
   * hardware coalescing is performed.
   */
  unsigned short bi_hw_segments;

  unsigned int bi_size;             /* residual I/O count */
  unsigned int bi_max_vecs;         /* max bvl_vecs we can hold */

  struct bio_vec *bi_io_vec;        /* the actual vec list */

  bio_end_io_t *bi_end_io;
  atomic_t bi_cnt;                  /* pin count */

  void *bi_private;

  bio_destructor_t *bi_destructor;  /* destructor */
};

When the kernel wants to conduct a block I/O operation it constructs a bio struct. The I/O transfer will involve a list
of bio_vec structures where bi_io_vec points to the start of the list. The number of bio_vec's in that list is bi_vcnt.
The bio_vec that is being currently handled is indexed from the start of the list by bi_idx.
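
To make the roles of bi_io_vec, bi_vcnt, and bi_idx concrete, here is a minimal user-space sketch. It is not kernel code: the two structs are cut down to the fields just discussed, and the helper bio_bytes_remaining() is a hypothetical name introduced only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* User-space stand-ins for the kernel structs; only the fields
 * discussed in the text are modeled. The page is left opaque. */
struct page;

struct bio_vec {
    struct page *bv_page;      /* page holding the block */
    unsigned int bv_len;       /* bytes in the block */
    unsigned int bv_offset;    /* position of the block within the page */
};

struct bio {
    unsigned short bi_vcnt;    /* how many bio_vecs are in the list */
    unsigned short bi_idx;     /* index of the bio_vec being handled */
    struct bio_vec *bi_io_vec; /* start of the bio_vec list */
};

/* Hypothetical helper: bytes still to be transferred, i.e. the sum of
 * bv_len over bi_io_vec[bi_idx] .. bi_io_vec[bi_vcnt - 1]. */
unsigned int bio_bytes_remaining(const struct bio *bio)
{
    unsigned int total = 0;
    unsigned short i;

    assert(bio != NULL && bio->bi_idx <= bio->bi_vcnt);
    for (i = bio->bi_idx; i < bio->bi_vcnt; i++)
        total += bio->bi_io_vec[i].bv_len;
    return total;
}
```

As bi_idx advances through the list, the remaining byte count shrinks, which is the bookkeeping a driver needs while it works through one I/O operation.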

8.0.2 Contrasting Character and Block drivers

From a userland programmer's point of view the block and character devices look similar because both use the VFS
layer to initiate access. Hence, to use either, one employs similar library calls to do such things as
• open the device
• read or write from it
• close the device

Of course, there are differences as well. From the viewpoint of the device driver developer, the API for both is
similar in intent, but different in the details. The similarities lie in what must be done. In particular, the device driver
programmer must provide:
• an initialization routine to probe for and allocate resources - the entry point corresponding to insmod
• a cleanup routine which frees resources and does any other cleanup - the entry point corresponding to rmmod
• other low level functions whose entry points correspond to using the device once installed
• the kernel with entry point information about these other low level functions by registering the information (a
data structure) with the kernel

The differences lie in the details of how things are done as well as in some differences in needed functionality.
In this chapter we will only look at the block driver API for kernels in the 2.4 version series. This API changed
somewhat dramatically from the 2.2 series - more so than did the character device API.


8.1 Overview
From a coarse viewpoint the block driver uses
• the block_device_operations struct, where the classic character driver uses the file_operations struct
• the register_blkdev() and unregister_blkdev() functions, where the classic character driver uses the
register_chrdev() and unregister_chrdev() functions

8.1.1 The block_device_operations struct for block devices

This struct is found in /usr/src/linux/include/linux/fs.h where it is defined as:


struct block_device_operations {
  int (*open) (struct inode *, struct file *);
  int (*release) (struct inode *, struct file *);
  int (*ioctl) (struct inode *, struct file *, unsigned, unsigned long);
  int (*check_media_change) (kdev_t);
  int (*revalidate) (kdev_t);
};

Clearly, there is a loose end. In particular, in contrast to the file_operations struct, we see no read or write
functionality. We will come back to that shortly. In the block_device_operations struct, we do see the expected
• open
• release
• ioctl
plus functions that deal with removable media.

8.1.2 The register_blkdev() and unregister_blkdev() functions

These functions are defined in /usr/src/linux/fs/block_dev.c and have prototypes:


int register_blkdev(unsigned int major, const char * name,
                    struct block_device_operations *bdops);
int unregister_blkdev(unsigned int major, const char * name);

Note that register_blkdev() registers the block_device_operations struct as expected. The conventions surrounding
the major number are the same as for character drivers i.e.
• the programmer may pass the major number as an argument
• the programmer may pass 0 instead and then the return value is an available major number
In passing we note that a block driver and a character driver may have the same major number. The kernel is not
confused, because they are in different driver classes.
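
The major number conventions can be modeled in a few lines of user-space C. This is only a sketch of the behavior described above, not the kernel's implementation: the table type, the limit MODEL_MAX_MAJOR, the function model_register(), and the choice to search downward for a free slot are all assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MODEL_MAX_MAJOR 255   /* hypothetical table size for this model */

/* One table per driver class: a slot holds the registered name or NULL. */
struct major_table {
    const char *name[MODEL_MAX_MAJOR];
};

/* Model of the major number convention:
 *   major != 0: claim that slot; return 0, or -EBUSY if it is taken
 *   major == 0: pick a free slot and return its (positive) number */
int model_register(struct major_table *t, unsigned int major, const char *name)
{
    assert(t != NULL && name != NULL);
    if (major == 0) {
        for (major = MODEL_MAX_MAJOR - 1; major > 0; major--) {
            if (t->name[major] == NULL) {
                t->name[major] = name;
                return (int)major;
            }
        }
        return -EBUSY;            /* no free slot left */
    }
    if (major >= MODEL_MAX_MAJOR)
        return -EINVAL;
    if (t->name[major] != NULL)
        return -EBUSY;
    t->name[major] = name;
    return 0;
}
```

Keeping one table for block drivers and a separate one for character drivers is what lets both classes use, say, major number 42 without conflict.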

8.1.3 Loose ends

read/write access to the block device

Typical block devices are the various kinds of disk drives. From the CPU's viewpoint, these are relatively slow
electromechanical devices. To enhance data transfer efficiency, we would expect some sort of caching/buffering
strategy. Linux employs a dynamic cache system using physical memory left over from what is required for the
kernel and user processes. This leads to a significant difference between character device drivers and block device
drivers:
● If a user program makes read/write library calls to access a character device, the VFS passes these requests on to
the low level read/write functions in the driver
● If a user program makes read/write library calls to access a block device, the VFS does not pass these requests
on to low level read/write functions in the driver. Instead, the block_read() and block_write() functions (see
/usr/src/linux/fs/block_dev.c) are used so that the user interaction is with the buffer, not with the driver.

Clearly, if a block device driver does not have a direct read/write interaction with the user program because of the
intervening buffer, then it must provide functions to keep the buffer appropriately up to date. In short, the device
driver must provide low level read/write functionality to interact with the buffer as needed. These low level
functions are not entry points triggered directly by user programs and hence do not belong in the
block_device_operations struct. Nevertheless, the kernel must know about the block driver's low level read/write
functionality.


If the block_read() or block_write() function finds that the buffer is not up to date, it will call the function
ll_rw_block() which interfaces to the device driver's low level read/write functionality. The ll_rw_block() function
can be found in a file with a somewhat similar name, i.e. /usr/src/linux/drivers/block/ll_rw_blk.c. The low
level read/write request may be deferred ('plugged') to allow the system to merge requests which may be adjacent on
the hard disk. Since disk seek and rotational latency are very slow (by the CPU's standards), any merging and sorting
possible yields much better efficiency.

A given device will have its own request queue (typically just one), so that merging and sorting that device's
requests makes sense.
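
The idea of sorting and merging can be sketched in user space as follows. This is a deliberately simplified model, not the kernel's algorithm: it ignores the read/write direction and any limit on request size, and the names struct sector_run and sort_and_merge() are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified request: a run of consecutive sectors. */
struct sector_run {
    unsigned long sector;      /* first sector */
    unsigned long nr_sectors;  /* length of the run */
};

/* qsort comparison: order runs by their first sector. */
static int by_sector(const void *a, const void *b)
{
    const struct sector_run *x = a, *y = b;
    return (x->sector > y->sector) - (x->sector < y->sector);
}

/* Sort the queue by starting sector, then merge each run that begins
 * exactly where its predecessor ends. Returns the new queue length. */
size_t sort_and_merge(struct sector_run *q, size_t n)
{
    size_t in, out = 0;

    assert(q != NULL || n == 0);
    if (n == 0)
        return 0;
    qsort(q, n, sizeof(q[0]), by_sector);
    for (in = 1; in < n; in++) {
        if (q[out].sector + q[out].nr_sectors == q[in].sector)
            q[out].nr_sectors += q[in].nr_sectors;  /* adjacent: merge */
        else
            q[++out] = q[in];                       /* gap: keep separate */
    }
    return out + 1;
}
```

Four requests for sectors 100-107, 8-15, 0-7, and 108-111 collapse into two: one covering sectors 0-15 and one covering sectors 100-111, so the disk head makes two visits instead of four.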

disk geometry

It is typically necessary to initialize certain disk geometry parameters; although, in some cases, there are default
values that may be appropriate. These parameters are array members organized by major (and perhaps, minor)
number e.g.
• blk_size[major][minor] - size of device in kbytes
• blksize_size[major][minor] - size of a block in bytes
• hardsect_size[major][minor] - sector size in bytes
The corresponding definitions along with other parameters can be found in /usr/src/linux/drivers/block/ll_rw_blk.c.
We'll see examples of initializing such parameters in Section 8.5.1.
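
Because the three parameters use different units (kbytes for the device size, bytes for the block and sector sizes), a small conversion sketch may help. The function names below are hypothetical, and the code is plain user-space C rather than anything taken from the kernel.

```c
#include <assert.h>

/* Device size is given in kbytes (as in blk_size[][]); the sector size
 * is in bytes (as in hardsect_size[][]). */
unsigned long sectors_on_device(unsigned long size_kb, unsigned int sector_bytes)
{
    assert(sector_bytes != 0);
    return size_kb * 1024UL / sector_bytes;
}

/* Same conversion for blocks, with the block size in bytes
 * (as in blksize_size[][]). */
unsigned long blocks_on_device(unsigned long size_kb, unsigned int block_bytes)
{
    assert(block_bytes != 0);
    return size_kb * 1024UL / block_bytes;
}
```

For a 2048 kbyte device with 512 byte sectors and 1024 byte blocks, this gives 4096 sectors and 2048 blocks.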

8.2 The Request Queue


In preparation for discussing a block driver's request queue, we will first develop some background, starting with a
discussion of blocks of bytes. Because disk access is so slow compared to CPU time, there is some effort expended
by the kernel to group requests for any necessary disk access into efficiently accessed physical regions on the disk.

8.2.1 Blocks, buffers, buffer heads

A block device driver transfers data grouped as a large number of adjacent bytes, called a block. Linux requires that
the number of bytes
• be a power of 2
• not exceed the page size
• include an integral number of disk sectors

It is somewhat typical to have a 512 byte sector and a 4096 byte page, implying that acceptable block sizes are 512,
1024, 2048, and 4096 bytes. For each block there is a buffer - an area of RAM that holds a copy of the block's data,
for efficient program access. A buffer is described by a data structure called a buffer head, which contains all salient
data about the buffer.
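
The three rules for a block size can be checked mechanically. The following user-space sketch uses a hypothetical function name, block_size_ok(), and takes the sector and page sizes as parameters rather than assuming particular hardware.

```c
#include <assert.h>

/* Returns 1 if block_size satisfies the three rules above, else 0. */
int block_size_ok(unsigned int block_size, unsigned int sector_size,
                  unsigned int page_size)
{
    assert(sector_size != 0);
    if (block_size == 0 || (block_size & (block_size - 1)) != 0)
        return 0;                   /* not a power of 2 */
    if (block_size > page_size)
        return 0;                   /* exceeds the page size */
    if (block_size % sector_size != 0)
        return 0;                   /* not a whole number of sectors */
    return 1;
}
```

With a 512 byte sector and a 4096 byte page, the function accepts exactly 512, 1024, 2048, and 4096, matching the list given in the text.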

8.2.2 Requests (2.4 kernel)

When the kernel needs to access a disk block, it creates a block device request. This is described in the request
struct (see /usr/src/linux/include/linux/blkdev.h). A request may contain a number of adjacent blocks, so that the
request struct includes references to the first and last buffer heads for those adjacent blocks. The buffer_head struct
itself points to the next buffer head in a simple linked list.

Requests are placed in a request_queue - in fact, the request struct has a pointer to the request_queue in which it is a
member. Among the fields in the request_queue are references to
• the request function (request_fn) for processing the request queue (see next section)
• a set of functions for merging requests (for disk access efficiency)
• a function for making a request

Each block device has a default request_queue designated by


BLK_DEFAULT_QUEUE(major)
It is the responsibility of the device driver to initialize this queue with the blk_init_queue function. We'll see an
example in the next subsection.

There is a blk_dev array indexed by major number. Each element of the array represents a particular block device as
a blk_dev_struct. This struct references
• the device's request_queue
• the (atomic) procedure for processing that queue


8.2.3 The request_fn

Here, it is useful to include examples that show some of the surrounding context. The following fragments are excerpted
from the ramdisk source code which will be covered starting in Section 8.4:

example 1 - from init_module

/* Here is the block_device_operations structure holding entry point
references to the driver's open, release, and ioctl. The removable media
functionality is not needed for a ramdisk.
*/
static struct block_device_operations radimo_fops =
{
  ioctl: radimo_ioctl,
  open: radimo_open,
  release: radimo_release,
};

/* The following is from the initialization and registration portion of the
code. We see that the register_blkdev() function registers the major number
(RADIMO_MAJOR) and the block_device_operations structure as discussed earlier.
*/
res = register_blkdev(RADIMO_MAJOR, "radimo", &radimo_fops);

/* The ramdisk device driver programmer has already encoded the low level
read/write functionality in a function radimo_request, which is the
request_fn. Recall that the driver is responsible for initializing the
request_queue - and also informing the kernel.
*/
blk_init_queue(BLK_DEFAULT_QUEUE(RADIMO_MAJOR), &radimo_request);

example 2 - from cleanup_module

/* the expected unregister_blkdev() pairing the register_blkdev() found in
the earlier init_module
*/ 
res = unregister_blkdev(RADIMO_MAJOR, "radimo");

/* similarly, blk_cleanup_queue() pairs with the blk_init_queue() found in
the earlier init_module()
*/
blk_cleanup_queue(BLK_DEFAULT_QUEUE(RADIMO_MAJOR));

Starting in Section 8.4 we will look at a simplified ramdisk driver, see an example of an actual request_fn, and
see how it traverses the request_queue.

8.3 Entry Point References in block_device_operations


Recall that the entry points referenced in the block_device_operations struct are:
int (*open) (struct inode *, struct file *);
int (*release) (struct inode *, struct file *);
int (*ioctl) (struct inode *, struct file *, unsigned, unsigned long);
int (*check_media_change) (kdev_t);
int (*revalidate) (kdev_t);


8.3.1 The open and release functions for block devices

These are triggered by user programs which use the library calls open and close, as usual. Typically when a file is
opened the open command passes a pointer to a file struct and that structure is then bound to the process which
performed the open. Mounting also uses the open command, but the file structure passed is quite different and does
not become bound to the mount process. This 'quite different' file struct has only one field that is relevant i.e. the
f_mode field. The other fields should not be used. The f_mode tells the driver to open in one of two modes:
• read-only via f_mode = FMODE_READ
• read-write via f_mode = FMODE_READ | FMODE_WRITE

After the mount command is finished, the mounted file system remains, of course. Once the file system is mounted,
the kernel manages the files using the low level read and write methods accessed via the driver's request_fn
discussed earlier. Any process which opens a file within the mounted file system will still use the block_read() and
block_write() via the VFS.

Unmounting, done with the umount command, flushes the buffer cache and calls the driver's release. The release
function will be passed NULL as its file pointer since the file struct is not meaningful.

8.3.2 ioctl for block devices

Recall that for character drivers, ioctl() was a catchall for any hardware commands the specific driver might need.
This is also true for the block drivers, but there is also a set of commonly used commands which usually suffice (see
<linux/fs.h>); for example:
• BLKROSET - set device read-only
• BLKGETSIZE - return device size
• BLKFLSBUF - flush buffer cache
• BLKRASET - set the read ahead value
• BLKRAGET - get the read ahead value

There are on the order of a dozen of these ioctl commands. The ioctl() function in the driver could handle these, as
well as any custom commands, in a switch structure as usual.
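
From the user side, one of these commands might be issued as in the sketch below. It assumes a Linux system whose headers provide BLKGETSIZE in <linux/fs.h>; the function name block_device_kbytes() is hypothetical. BLKGETSIZE reports the device size as a count of 512 byte sectors, so the sketch halves it to get kbytes.

```c
#include <assert.h>
#include <fcntl.h>
#include <linux/fs.h>      /* BLKGETSIZE */
#include <stddef.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Ask a block device for its size. On success returns 0 and stores the
 * size in kbytes; returns -1 if the device cannot be opened or queried. */
int block_device_kbytes(const char *path, unsigned long *kbytes)
{
    unsigned long sectors = 0;   /* BLKGETSIZE counts 512 byte sectors */
    int fd;

    assert(path != NULL && kbytes != NULL);
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    if (ioctl(fd, BLKGETSIZE, &sectors) < 0) {
        close(fd);
        return -1;
    }
    close(fd);
    *kbytes = sectors / 2;
    return 0;
}
```

A caller passes the path of a device node and checks the return value before using the size; on failure, errno still describes what went wrong.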

8.3.3 check_media_change and revalidate for block devices

These are intended for use with removable media. Their functionality is:
• check_media_change - returns 1 if the media has been changed since last access, but otherwise returns 0
• revalidate - typically updates internal status information to reflect new media; called after media change is
detected

The kernel automatically checks for media change on mounting a device. If the driver keeps status information
concerning a removable device, it should check for media change (and revalidate on change) in the open command
as well.

8.4 An Example - a RAM Disk Driver


For initial pedagogical success, it is best to have a simple example with most of the necessary features. This is rather
easier for character devices than for block devices. Perhaps the simplest, useful block driver example is a RAM disk
driver. The Linux source has such a driver and so does the book (highly recommended) by Matthew and Stones. Of
the two, we choose to discuss the latter because it is simpler (and shorter!). Like the RAM disk driver in the Linux
source, it is available under the GPL. We have modified it appropriately for the 2.4 kernel series. Further, we have
stripped some functionality out of even this simple driver to get down to bare essentials.

Of course, what is missing in RAM disk examples is a real hardware device - other than RAM. Hence, we see no
role for interrupts etc. Our suggestion for further study beyond this chapter is to:
• work through the Linux RAM disk driver (linux/drivers/block/rd.c)
• then work through a real driver such as linux/drivers/block/ide-floppy.c or one which pertains more directly to
your interests


8.5 Entry Points
We can start by looking for entry points, familiar from the character drivers i.e.
• init_module
• cleanup_module
• the driver's block_device_operations struct, referring to other entry points
These are all present in the RAM disk driver example. We'll discuss these three items in the next three subsections.
Additionally, there is the
• request_fn
which, you'll recall from Section 8.2, exists in block drivers, but not in character drivers. It is a new feature for
us and will take more explanation; so we'll give it a section of its own - Section 8.6.

8.5.1 init_module

We'll first give the code listing and then follow it with a description. The code is well commented, so there will be
some redundancy in the description.
int init_module(void)
{
  int res;

  /* block size must be a multiple of sector size */
  if (radimo_soft & ((1 << RADIMO_HARDS_BITS) - 1))
  {
    MSG(RADIMO_ERROR, "Block size not a multiple of sector size\n");
    return -EINVAL;
  }

  /* allocate room for data */
  radimo_storage = (char *) vmalloc(1024*radimo_size);
  if (radimo_storage == NULL)
  {
    MSG(RADIMO_ERROR, "Not enough memory. Try a smaller size.\n");
    return -ENOMEM;
  }
  memset(radimo_storage, 0, 1024*radimo_size);

  /* register block device */
  res = register_blkdev(RADIMO_MAJOR, "radimo", &radimo_fops);
  if (res)
  {
    MSG(RADIMO_ERROR, "couldn't register block device\n");
    vfree(radimo_storage);  /* do not leak the storage on failure */
    return res;
  }
  blk_init_queue(BLK_DEFAULT_QUEUE(RADIMO_MAJOR), &radimo_request);

  /* set hard- and soft blocksize */
  hardsect_size[RADIMO_MAJOR] = &radimo_hard;
  blksize_size[RADIMO_MAJOR] = &radimo_soft;
  blk_size[RADIMO_MAJOR] = &radimo_size;
  read_ahead[RADIMO_MAJOR] = radimo_readahead;
  MSG(RADIMO_INFO, "sector size = %d, block size = %d, "
      "total size = %dKb\n", radimo_hard, radimo_soft, radimo_size);
  return 0;
}


Recall that this entry point is used by the insmod command. A step-by-step description of what init_module does
follows:
• Checks the block size (radimo_soft) to assure that it is a multiple of the sector size
• Allocates 1024*radimo_size bytes (2 MB) of memory for the device
• Fills the allocated memory with zeroes (memset)
• Registers the block device with the kernel as major number 42 and with a reference to the jump table,
&radimo_fops
• Initializes the request queue and registers the request_fn for this driver (i.e. the address of radimo_request)
• Sets up arrays for hard and soft block sizes and the device size as expected by linux/drivers/block/ll_rw_blk.c
• Sets the read_ahead value at 4 sectors
• Sends out information to the log/console
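
The first step, the block size check, relies on a bit mask rather than a division. Here is a small user-space sketch of the same test. HARDS_BITS is assumed to be 9 (the base 2 logarithm of a 512 byte sector), standing in for the driver's RADIMO_HARDS_BITS, whose actual value is set in radimo.h; the function name is hypothetical.

```c
#include <assert.h>

#define HARDS_BITS 9   /* assumed: log2 of a 512 byte sector */

/* Returns nonzero if block_size is NOT a whole multiple of the sector
 * size. Because the sector size is a power of 2, (1 << HARDS_BITS) - 1
 * is a mask of its low bits; any of those bits set means a remainder. */
int not_sector_multiple(unsigned int block_size)
{
    int result = (block_size & ((1u << HARDS_BITS) - 1)) != 0;

    /* the mask test agrees with the more familiar modulo test */
    assert(result == (block_size % (1u << HARDS_BITS) != 0));
    return result;
}
```

A block size of 1024 passes the check (the function returns 0), while 768 fails it because 768 leaves a remainder of 256 when divided by 512.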

8.5.2 cleanup_module

Again, we'll first give the code listing and then follow it with a description.
void cleanup_module(void)
{
  int res;

  res = unregister_blkdev(RADIMO_MAJOR, "radimo");
  if (res)
  {
    MSG(RADIMO_ERROR, "couldn't unregister block device\n");
    return;
  }

  invalidate_buffers(MKDEV(RADIMO_MAJOR,0));
  blk_cleanup_queue(BLK_DEFAULT_QUEUE(RADIMO_MAJOR));
  vfree(radimo_storage);
  MSG(RADIMO_INFO, "unloaded\n");
}

Recall that this entry point is used by the rmmod command. A step-by-step description of what cleanup_module
does follows:
• Unregisters the block device
• Marks the buffer region associated with the device as invalid
• Cleans up the request queue
• Frees the memory allocated earlier for the device
• Sends a message to the log/console indicating that the device has been unloaded

8.5.3 The block_device_operations struct

Here is the jump table for other entry points, the block_device_operations struct:
static struct block_device_operations radimo_fops = {
  owner: THIS_MODULE,
  ioctl: radimo_ioctl,
  open: radimo_open,
  release: radimo_release,
};

The entry points referenced in the radimo_fops structure are used when a user program interfaces with the device via
the VFS. For example, the open system call when used for this device will ultimately make use of the function,
radimo_open. All functions referenced must be provided by this driver. Read and write functions are not among them. As
noted before, block_read and block_write use a block buffering system employed by VFS to provide data transfer
between the buffer and the user address space as triggered by the user read and write calls. The data transfer
between the buffering system and the actual device is handled by the request_fn described in the next section. The
request_fn must be provided by this driver.


8.6 The request_fn, radimo_request
Recall that this function handles data transfer between the VFS block buffering system and the device. As before,
we'll first give the code listing and then follow it with a description.

/* The 2.4 request_fn takes the queue as its argument. It is unused here
   because CURRENT already refers to the head of the default queue.
*/
void radimo_request(request_queue_t *q)
{
  unsigned long offset, total;

  radimo_begin:
  INIT_REQUEST;

  MSG(RADIMO_REQUEST, "%s sector %lu of %lu\n",
       CURRENT->cmd == READ ? "read" : "write", CURRENT->sector,
                                     CURRENT->current_nr_sectors);

  offset = CURRENT->sector * radimo_hard;
  total = CURRENT->current_nr_sectors * radimo_hard;

  /* access beyond end of the device (radimo_size is in Kb) */
  if (total + offset > radimo_size * 1024UL)
  {
    /* error in request */
    end_request(0);
    goto radimo_begin;
  }

  MSG(RADIMO_REQUEST, "offset = %lu, total = %lu\n", offset, total);

  if (CURRENT->cmd == READ)
  {
    memcpy(CURRENT->buffer, radimo_storage + offset, total);
  }
  else if (CURRENT->cmd == WRITE)
  {
    memcpy(radimo_storage + offset, CURRENT->buffer, total);
  }
  else
  {
    /* can't happen */
    MSG(RADIMO_ERROR, "cmd == %d is invalid\n", CURRENT->cmd);
    end_request(0);
    goto radimo_begin;
  }

  /* successful */
  end_request(1);

  /* let INIT_REQUEST return when we are done */
  goto radimo_begin;
}


Note that radimo_request looks like an infinite loop. It is not, because INIT_REQUEST decides when to terminate the
loop and handles the return. A step-by-step description of what radimo_request does follows:
• check the request validity or return
• perhaps send out a message detailing the request (see radimo.h)
• compute offset and size of transfer
• if offset plus size takes us beyond the end of the device abort the request, perform request cleanup, and return to
the top
• perhaps print out the offset and size (see radimo.h)
• if it's a READ, copy to the buffer from the device
• if it's a WRITE, copy to the device from the buffer
• if the command is neither READ nor WRITE, annunciate an error, abort the request, perform request cleanup,
and return to the top
• cleanup the request and return to the top
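
The core of these steps (compute offset and size, reject out-of-range requests, copy in the right direction) can be tried out in user space. The sketch below is a model only, not the driver: the request struct, the eight sector "device", and the names ram_request and model_handle_request() are hypothetical, and the return value stands in for the argument the driver would pass to end_request().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SECTOR_BYTES 512u   /* assumed hard sector size */
#define DISK_SECTORS 8u     /* tiny model device */

enum model_cmd { MODEL_READ, MODEL_WRITE };

struct ram_request {
    enum model_cmd cmd;
    unsigned long sector;       /* first sector of the transfer */
    unsigned long nr_sectors;   /* number of sectors to move */
    char *buffer;               /* the buffer side of the copy */
};

static char storage[DISK_SECTORS * SECTOR_BYTES];   /* the "device" */

/* Returns 1 on success and 0 on failure, mirroring the value the
 * driver passes to end_request(). */
int model_handle_request(const struct ram_request *req)
{
    unsigned long offset, total;

    assert(req != NULL && req->buffer != NULL);
    offset = req->sector * SECTOR_BYTES;
    total = req->nr_sectors * SECTOR_BYTES;

    if (offset + total > sizeof(storage))
        return 0;                       /* beyond the end of the device */

    if (req->cmd == MODEL_READ)
        memcpy(req->buffer, storage + offset, total);
    else if (req->cmd == MODEL_WRITE)
        memcpy(storage + offset, req->buffer, total);
    else
        return 0;                       /* neither READ nor WRITE */
    return 1;
}
```

Writing a sector and reading it back returns the same bytes, and a request that starts at sector 8 of this eight sector device is rejected, just as the driver aborts a request that runs past radimo_storage.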

8.7 Remaining Functions


We'll now look at the remaining functions. The description is perfunctory in the case of the open and close
functions, which merely annunciate their activity.

The open and close (release) functions

static int radimo_open(struct inode *inode, struct file *file)
{
  MSG(RADIMO_OPEN, "opened\n");
  return 0;
}

static int radimo_release(struct inode *inode, struct file *file)
{
  MSG(RADIMO_OPEN, "closed\n");
  return 0;
}


The ioctl function

static int radimo_ioctl(struct inode *inode, struct file *file,
                        unsigned int cmd, unsigned long arg)
{
  unsigned int minor;

  if (!inode || !inode->i_rdev)
    return -EINVAL;
  minor = MINOR(inode->i_rdev);

  switch (cmd)
  {
    case BLKFLSBUF:
    {
      /* flush buffers */
      MSG(RADIMO_IOCTL, "ioctl: BLKFLSBUF\n");
      /* deny all but root */
      if (!capable(CAP_SYS_ADMIN)) return -EACCES;
      fsync_dev(inode->i_rdev);
      invalidate_buffers(inode->i_rdev);
      break;
    }
    case BLKGETSIZE:
    {
      /* return device size */
      MSG(RADIMO_IOCTL, "ioctl: BLKGETSIZE\n");
      if (!arg) return -EINVAL;
      return put_user(radimo_size*2, (long *) arg);
    }
    case BLKRASET:
    { /* set read ahead value */
      long tmp;  /* matches the (long *) used to fetch it */
      MSG(RADIMO_IOCTL, "ioctl: BLKRASET\n");
      if (get_user(tmp, (long *)arg)) return -EINVAL;
      if (tmp > 0xff) return -EINVAL;
      read_ahead[RADIMO_MAJOR] = tmp;
      return 0;
    }
    case BLKRAGET:
    { /* return read ahead value */
      MSG(RADIMO_IOCTL, "ioctl: BLKRAGET\n");
      if (!arg) return -EINVAL;
      return put_user(read_ahead[RADIMO_MAJOR], (long *)arg);
    }
    case BLKSSZGET:
    { /* return block size */
      MSG(RADIMO_IOCTL, "ioctl: BLKSSZGET\n");
      if (!arg) return -EINVAL;
      return put_user(radimo_soft, (long *)arg);
    }
    default:
    {
      MSG(RADIMO_ERROR, "ioctl wanted %u\n", cmd);
      return -ENOTTY;
    }
  }
  return 0;
}


Here is a description of what radimo_ioctl will do:
• Check the device file validity.
• Get the minor number.
• Switch to appropriate command case
• If cmd == flush buffer, proceed only if requester has root permission. Then write any dirty buffers to the RAM
disk and invalidate the buffers.
• If cmd == get disk size, copy the device size to the user program.
• If cmd == set read_ahead, then get the requested size from the user program, assure that the value is valid, and
then set the requested read_ahead value.
• If cmd == get read_ahead, copy the read_ahead value to the user program.
• If cmd == get block size, copy the block size to the user program.
• If cmd is unrecognized, return an error.
• return success

8.8 Using your Ramdisk


Of course, when you power off, anything stored in the ramdisk is lost. However, while power is on, it is usable and
its virtue is that it is much faster than an electromechanical disk drive. To get it up and running, you would use the
same sorts of steps as for a new hard disk:
• install the driver - insmod radimo.o
• make a device node - mknod /dev/radimo b 42 0
• make a file system for the device - mke2fs /dev/radimo
• set permissions - chmod 666 /dev/radimo
• make a mount point - mkdir /ramd
• mount the device - mount -t ext2 /dev/radimo /ramd
• Then use the device as needed and, when done:
• unmount the device - umount /ramd

8.9 Activities
8.9.1 Activity 1

Download the simplified software, radimo_simp.tgz, and install it. Enable all levels of printing for the simplified
radimo driver. Set it up for use as in Section 8.8. Create some files on the ramdisk. Do those files persist when you
unmount the device and then remount? Do those files persist when you reboot the machine?

8.9.2 Activity 2

Write a user program that employs all of the ioctl commands for the simplified radimo driver. Trace the activity via
the messages available within the driver.

8.9.3 Activity 3

Modify the driver so it can support four minor devices, say,


• /dev/radimo1
• /dev/radimo2
• /dev/radimo3
• /dev/radimo4
Verify that these devices can exist concurrently and work properly.
