File Handling in Linux 2

2.1.3 Can I use SysV IPC at the same time as select or poll?

No. (Except on AIX, which has an incredibly ugly kluge to allow this.)
In general, trying to combine the use of select() or poll() with using SysV message queues is troublesome. SysV IPC objects are not handled by file descriptors, so they can't be passed to select() or poll(). There are a number of workarounds, of varying degrees of ugliness:
  • Abandon SysV IPC completely. :-)
  • fork(), and have the child process handle the SysV IPC, communicating with the parent process by a pipe or socket, which the parent process can select() on.
  • As above, but have the child process do the select(), and communicate with the parent by message queue.
  • Arrange for the process that sends messages to you to send a signal after each message. Warning: handling this right is non-trivial; it's very easy to write code that can potentially lose messages or deadlock using this method.
(Other methods exist.)
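The fork()-and-pipe workaround above can be sketched roughly as follows. This is only an illustration under stated assumptions: the names start_msgq_bridge(), wait_readable() and struct demo_msg are hypothetical, and the 128-byte message limit is an arbitrary choice.

```c
#define _XOPEN_SOURCE 700   /* request POSIX/XSI declarations */
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical message layout; the size limit is an arbitrary choice. */
struct demo_msg {
    long mtype;
    char mtext[128];
};

/* Fork a child that blocks in msgrcv() on queue `qid` and copies each
   message body to a pipe.  Returns the pipe's read end (usable with
   select()), or -1 on error.  The child's pid is stored in *child. */
int start_msgq_bridge(int qid, pid_t *child)
{
    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {                       /* child: queue -> pipe */
        struct demo_msg m;
        close(fds[0]);
        for (;;) {
            ssize_t n = msgrcv(qid, &m, sizeof m.mtext, 0, 0);
            if (n < 0)
                _exit(0);                 /* queue removed, or an error */
            if (write(fds[1], m.mtext, (size_t)n) != n)
                _exit(1);                 /* parent has gone away */
        }
    }
    close(fds[1]);                        /* parent keeps the read end */
    *child = pid;
    return fds[0];
}

/* Parent side: wait up to `seconds` for `fd` to become readable.
   Returns 1 if readable, 0 on timeout, -1 on error. */
int wait_readable(int fd, int seconds)
{
    fd_set rs;
    struct timeval tv = { seconds, 0 };
    FD_ZERO(&rs);
    FD_SET(fd, &rs);
    return select(fd + 1, &rs, NULL, NULL, &tv);
}
```

In a real program the descriptor returned by start_msgq_bridge() would simply be added to the fd_set the parent already passes to select(). Note that a pipe is a byte stream, so message boundaries are lost unless the child also writes a length prefix or delimiter.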

2.2 How can I tell when the other end of a connection shuts down?

If you try to read from a pipe, socket, FIFO etc. when the writing end of the connection has been closed, you get an end-of-file indication (read() returns 0 bytes read). If you try to write to a pipe, socket etc. when the reading end has closed, then a SIGPIPE signal will be delivered to the process, killing it unless the signal is caught. (If you ignore or block the signal, the write() call fails with EPIPE.)
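As a minimal sketch of both cases, the helpers below (hypothetical names, not part of any library) report end-of-file on read and EPIPE on write. The write-side check only works once SIGPIPE is ignored, which ignore_sigpipe() arranges.

```c
#define _XOPEN_SOURCE 700   /* request POSIX declarations */
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/* Ignore SIGPIPE so that write() fails with EPIPE instead of the
   process being killed.  Returns 0 on success, -1 on error. */
int ignore_sigpipe(void)
{
    struct sigaction sa;
    sa.sa_handler = SIG_IGN;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(SIGPIPE, &sa, NULL);
}

/* Blocking read: returns 1 on end-of-file (all writers have closed),
   0 if some data arrived, -1 on error. */
int peer_closed_for_read(int fd)
{
    char buf[256];
    ssize_t n = read(fd, buf, sizeof buf);
    if (n == 0)
        return 1;
    return n > 0 ? 0 : -1;
}

/* Returns 0 if write() accepted data, 1 if the reader has gone away
   (EPIPE), -1 on any other error.  Partial writes are not retried. */
int write_or_detect_close(int fd, const void *data, size_t len)
{
    if (write(fd, data, len) >= 0)
        return 0;
    return errno == EPIPE ? 1 : -1;
}
```

A real program would normally keep the data that peer_closed_for_read() discards; the point here is only how the two conditions show up.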

2.3 Best way to read directories?

While historically there have been several different interfaces for this, the only one that really matters these days is the POSIX.1 standard `<dirent.h>' functions.
The function opendir() opens a specified directory; readdir() reads directory entries from it in a standardised format; closedir() does the obvious. Also provided are rewinddir(), telldir() and seekdir() which should also be obvious.
If you are looking to expand a wildcard filename, then most systems have the glob() function; also check out fnmatch() to match filenames against a wildcard, or ftw() to traverse entire directory trees.
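A short sketch of opendir()/readdir()/closedir() combined with fnmatch() might look like this. The function name list_matching() is made up for illustration, and the use of FNM_PERIOD (so that `*' does not match `.' and `..') is one choice among several.

```c
#define _XOPEN_SOURCE 700   /* request POSIX declarations */
#include <dirent.h>
#include <errno.h>
#include <fnmatch.h>
#include <stdio.h>

/* Print the entries of directory `path` whose names match the shell
   pattern `pattern`.  Returns the number of matches, or -1 on error. */
int list_matching(const char *path, const char *pattern)
{
    DIR *dir = opendir(path);
    if (dir == NULL)
        return -1;

    int count = 0;
    struct dirent *ent;
    errno = 0;              /* readdir() returns NULL both at the end of
                               the directory and on error; errno tells
                               the two apart */
    while ((ent = readdir(dir)) != NULL) {
        if (fnmatch(pattern, ent->d_name, FNM_PERIOD) == 0) {
            printf("%s\n", ent->d_name);
            count++;
        }
        errno = 0;          /* reset before the next readdir() call */
    }
    int failed = (errno != 0);
    closedir(dir);
    return failed ? -1 : count;
}
```

For example, list_matching(".", "*.c") would print the C source files in the current directory. Entries come back in no particular order, so sort them yourself if order matters.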

2.4 How can I find out if someone else has a file open?

This is another candidate for `Frequently Unanswered Questions' because, in general, your program should never be interested in whether someone else has the file open. If you need to deal with concurrent access to the file, then you should be looking at advisory locking.
This is, in general, too hard to do anyway. Tools like fuser and lsof that find out about open files do so by grovelling through kernel data structures in a most unhealthy fashion. You can't usefully invoke them from a program, either, because by the time you've found out that the file is/isn't open, the information may already be out of date.

2.5 How do I `lock' a file?

There are three main file locking mechanisms available. All of them are `advisory'[*], which means that they rely on programs co-operating in order to work. It is therefore vital that all programs in an application should be consistent in their locking regime, and great care is required when your programs may be sharing files with third-party software.
[*] Well, actually some Unices permit mandatory locking via the sgid bit -- RTFM for this hack.
Some applications use lock files -- something like `FILENAME.lock'. Simply testing for the existence of such files is inadequate though, since a process may have been killed while holding the lock. The method used by UUCP (probably the most notable example: it uses lock files for controlling access to modems, remote systems etc.) is to store the PID in the lockfile, and test if that pid is still running. Even this isn't enough to be sure (since PIDs are recycled); it has to have a backstop check to see if the lockfile is old, which means that the process holding the lock must update the file regularly. Messy.
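To make the PID-in-lockfile idea concrete, here is a rough sketch. It is not UUCP's actual code: the name try_pid_lock() and its return codes are invented for this example, and it deliberately leaves out the "is the lockfile old?" backstop described above.

```c
#define _XOPEN_SOURCE 700   /* request POSIX declarations */
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Try to take a PID-style lock file.
   Returns 0 if we created the lock, 1 if it is held by a process that
   still exists, 2 if it looks stale (or is unreadable), -1 on error. */
int try_pid_lock(const char *lockpath)
{
    /* O_EXCL makes the create fail if the file already exists. */
    int fd = open(lockpath, O_WRONLY | O_CREAT | O_EXCL, 0644);
    if (fd != -1) {
        char buf[32];
        int n = snprintf(buf, sizeof buf, "%ld\n", (long)getpid());
        int ok = (write(fd, buf, (size_t)n) == n);
        close(fd);
        if (!ok) {
            unlink(lockpath);
            return -1;
        }
        return 0;
    }
    if (errno != EEXIST)
        return -1;

    /* Someone else's lock file: read the PID it records. */
    FILE *fp = fopen(lockpath, "r");
    if (fp == NULL)
        return -1;
    long pid = 0;
    int got = fscanf(fp, "%ld", &pid);
    fclose(fp);
    if (got != 1 || pid <= 0)
        return 2;   /* empty or garbled; may also be a writer mid-create */

    /* Signal 0 sends nothing; it only checks whether the PID exists.
       EPERM means it exists but belongs to another user. */
    if (kill((pid_t)pid, 0) == 0 || errno == EPERM)
        return 1;
    return 2;       /* no such process: probably stale */
}
```

What to do with a "stale" result is the hard part: removing the file and retrying is itself racy, and a recycled PID makes a dead lock look live, which is why the text calls the whole scheme messy.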
The locking functions are:
  • flock() originates with BSD, and is now available in most (but not all) Unices. It is simple and effective on a single host, but doesn't work at all with NFS. It locks an entire file. Perhaps rather deceptively, the popular Perl programming language implements its own flock() where necessary, conveying the illusion of true portability.
  • fcntl() is the only POSIX-compliant locking mechanism, and is therefore the only truly portable lock. It is also the most powerful, and the hardest to use. For NFS-mounted file systems, fcntl() requests are passed to a daemon (rpc.lockd), which communicates with the lockd on the server host. Unlike flock() it is capable of record-level locking.
  • lockf() is merely a simplified programming interface to the locking functions of fcntl().
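For comparison with the fcntl() recipes, a whole-file flock() lock can be taken as in the sketch below. The function name open_locked() and its return codes are assumptions made for this example; LOCK_NB is used so the caller can tell "busy" apart from other failures instead of blocking.

```c
#define _DEFAULT_SOURCE     /* flock() is a BSD interface, not POSIX */
#include <errno.h>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/* Open `path` (creating it if needed) and try to take an exclusive
   flock() on it without blocking.
   Returns a descriptor holding the lock, -2 if the lock is held
   elsewhere, or -1 on any other error. */
int open_locked(const char *path)
{
    int fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd == -1)
        return -1;
    if (flock(fd, LOCK_EX | LOCK_NB) == -1) {
        int busy = (errno == EWOULDBLOCK);
        close(fd);
        return busy ? -2 : -1;
    }
    return fd;  /* closing the descriptor releases the lock */
}
```

Dropping LOCK_NB makes the call wait until the lock becomes free, and flock(fd, LOCK_UN) releases it without closing the file.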
Whatever locking mechanism you use, it is important to sync all your file IO while the lock is active:
    lock(fd);
    flush_output_to(fd); /* NEVER unlock while output may be buffered */
    unlock(fd);
    do_something_else;   /* another process might update it */
    lock(fd);
    seek(fd, somewhere); /* because our old file pointer is not safe */
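With stdio the same discipline might look like the sketch below, which appends a line under an exclusive fcntl() lock. The names whole_file_lock() and locked_append() are hypothetical; the important parts are the seek after taking the lock and the fflush() before releasing it.

```c
#define _XOPEN_SOURCE 700   /* request POSIX declarations (fileno) */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Apply `type` (F_RDLCK, F_WRLCK or F_UNLCK) to the whole file,
   waiting if another process holds a conflicting lock. */
static int whole_file_lock(int fd, short type)
{
    struct flock fl = {0};
    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;           /* 0 means "through end of file" */
    return fcntl(fd, F_SETLKW, &fl);
}

/* Append `line` plus a newline to `fp` under an exclusive lock.
   Returns 0 on success, -1 on error. */
int locked_append(FILE *fp, const char *line)
{
    int fd = fileno(fp);
    if (whole_file_lock(fd, F_WRLCK) == -1)
        return -1;

    int ok = fseek(fp, 0L, SEEK_END) == 0  /* file may have grown while
                                              we did not hold the lock */
          && fputs(line, fp) != EOF
          && fputc('\n', fp) != EOF
          && fflush(fp) == 0;              /* flush BEFORE unlocking */

    if (whole_file_lock(fd, F_UNLCK) == -1)
        ok = 0;
    return ok ? 0 : -1;
}
```

fflush() only hands the data to the kernel, which is enough for other processes on the same host to see it; call fsync() as well if the data must reach the disk.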
A few useful fcntl() locking recipes (error handling omitted for simplicity) are:
#include <fcntl.h>
#include <unistd.h>
struct flock *file_lock(short type, short whence); /* defined below */
int read_lock(int fd)   /* a shared lock on an entire file */
{
    return fcntl(fd, F_SETLKW, file_lock(F_RDLCK, SEEK_SET));
}
int write_lock(int fd)  /* an exclusive lock on an entire file */
{
    return fcntl(fd, F_SETLKW, file_lock(F_WRLCK, SEEK_SET));
}
int append_lock(int fd) /* a lock on the _end_ of a file -- other
                           processes may access existing records */
{
    return fcntl(fd, F_SETLKW, file_lock(F_WRLCK, SEEK_END));
}
The function file_lock used by the above is
struct flock *file_lock(short type, short whence)
{
    static struct flock ret;  /* static storage: not thread-safe */
    ret.l_type = type;
    ret.l_start = 0;
    ret.l_whence = whence;
    ret.l_len = 0;            /* 0 means "through end of file" */
    ret.l_pid = getpid();
    return &ret;
}
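The recipes above all lock to end-of-file. Since the text mentions that fcntl() can also do record-level locking, here is a hedged sketch of locking and querying a byte range; lock_range() and range_is_locked() are illustrative names, and F_SETLK is used so the call fails instead of waiting.

```c
#define _XOPEN_SOURCE 700   /* request POSIX declarations */
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Lock or unlock `len` bytes starting at absolute offset `start`.
   `type` is F_RDLCK, F_WRLCK or F_UNLCK.  Does not block: on a
   conflict fcntl() returns -1 with errno EACCES or EAGAIN. */
int lock_range(int fd, short type, off_t start, off_t len)
{
    struct flock fl = {0};
    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = start;
    fl.l_len = len;
    return fcntl(fd, F_SETLK, &fl);
}

/* Ask whether ANOTHER process holds a lock that would block a write
   lock on the given range.  Returns 1 if so, 0 if not, -1 on error.
   A process's own locks never show up here. */
int range_is_locked(int fd, off_t start, off_t len)
{
    struct flock fl = {0};
    fl.l_type = F_WRLCK;
    fl.l_whence = SEEK_SET;
    fl.l_start = start;
    fl.l_len = len;
    if (fcntl(fd, F_GETLK, &fl) == -1)
        return -1;
    return fl.l_type != F_UNLCK;  /* F_UNLCK means "nothing in the way" */
}
```

Two caveats worth keeping in mind: the answer from range_is_locked() can be out of date by the time you act on it, and closing any descriptor for the file releases all of the process's fcntl() locks on it.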
