Saturday, April 19, 2008

Ext2 Gotchas (Directory Entry Size)

Recently in my Operating Systems Concepts class, one of our projects involved reading in an ext2 filesystem and listing the contents of the root directory. For anyone who is familiar with filesystem programming, yes, you can go ahead and laugh at the simplicity of this project.
Seriously, go ahead, I'll wait.

However, there is one aspect of the ext2 filesystem that was extremely difficult to track down: the size of a directory entry must be a multiple of four (4). This makes sense for efficiency reasons, however people familiar with this filesystem seems to take this for granted when writing descriptions and make no mention of it. The directory entry structure is defined as a struct in C with the following items:

  • Inode Number (4 bytes)
  • Length of this entry (2 bytes)
  • Length of file name (1 byte)
  • File Type (1 => File, 2=> Directory) (1 byte)
  • The file name (? Bytes)

The reason the size is variable in practice is due to the variable lengths of file names. If the length of the name is not a multiple of four, extra bytes are tacked on the end to even it out. For example, take the link present in every directory pointing to its parent. The name of this link is “..”, causing the size of the directory entry structure to be 10 (4 + 2 + 1 + 1 + 2). Two null bytes are added to the end of the name giving the structure an appropriate size of 12.

I finally figured out the answer after looking at different systems with a hex editor, and then got a hold of the book Understanding the Linux Kernel by Bovet and Cesati, which was invaluable. (It is also available online at Google books.) So if you ever find yourself writing a tool for dealing with ext2 or ext3 filesystems and cannot use the existing functions for whatever reason, keep this is mind.

On another note, here is another resource that was very helpful when doing this project. It is a great in-depth introduction to the ext2 filesystem for those who are not familiar with filesystems in general.