manual/llio.texi

   1 @node Low-Level I/O, File System Interface, I/O on Streams, Top
   2 @chapter Low-Level Input/Output
   3
   4 This chapter describes functions for performing low-level input/output
   5 operations on file descriptors.  These functions include the primitives
   6 for the higher-level I/O functions described in @ref{I/O on Streams}, as
   7 well as functions for performing low-level control operations for which
   8 there are no equivalents on streams.
   9
  10 Stream-level I/O is more flexible and usually more convenient;
  11 therefore, programmers generally use the descriptor-level functions only
  12 when necessary.  These are some of the usual reasons:
  13
  14 @itemize @bullet
  15 @item
  16 For reading binary files in large chunks.
  17
  18 @item
  19 For reading an entire file into core before parsing it.
  20
  21 @item
  22 To perform operations other than data transfer, which can only be done
  23 with a descriptor.  (You can use @code{fileno} to get the descriptor
  24 corresponding to a stream.)
  25
  26 @item
  27 To pass descriptors to a child process.  (The child can create its own
  28 stream to use a descriptor that it inherits, but cannot inherit a stream
  29 directly.)
  30 @end itemize
  31
  32 @menu
  33 * Opening and Closing Files::           How to open and close file
  34                                          descriptors.
  35 * Truncating Files::                    Change the size of a file.
  36 * I/O Primitives::                      Reading and writing data.
  37 * File Position Primitive::             Setting a descriptor's file
  38                                          position.
  39 * Descriptors and Streams::             Converting descriptor to stream
  40                                          or vice-versa.
  41 * Stream/Descriptor Precautions::       Precautions needed if you use both
  42                                          descriptors and streams.
  43 * Waiting for I/O::                     How to check for input or output
  44                                          on multiple file descriptors.
  45 * Synchronizing I/O::                   Making sure all I/O actions completed.
  46 * Asynchronous I/O::                    Perform I/O in parallel.
  47 * Control Operations::                  Various other operations on file
  48                                          descriptors.
  49 * Duplicating Descriptors::             Fcntl commands for duplicating
  50                                          file descriptors.
  51 * Descriptor Flags::                    Fcntl commands for manipulating
  52                                          flags associated with file
  53                                          descriptors.
  54 * File Status Flags::                   Fcntl commands for manipulating
  55                                          flags associated with open files.
  56 * File Locks::                          Fcntl commands for implementing
  57                                          file locking.
  58 * Interrupt Input::                     Getting an asynchronous signal when
  59                                          input arrives.
  60 @end menu
  61
  62
  63 @node Opening and Closing Files
  64 @section Opening and Closing Files
  65
  66 @cindex opening a file descriptor
  67 @cindex closing a file descriptor
  68 This section describes the primitives for opening and closing files
  69 using file descriptors.  The @code{open} and @code{creat} functions are
  70 declared in the header file @file{fcntl.h}, while @code{close} is
  71 declared in @file{unistd.h}.
  72 @pindex unistd.h
  73 @pindex fcntl.h
  74
  75 @comment fcntl.h
  76 @comment POSIX.1
  77 @deftypefun int open (const char *@var{filename}, int @var{flags}[, mode_t @var{mode}])
  78 The @code{open} function creates and returns a new file descriptor
  79 for the file named by @var{filename}.  Initially, the file position
  80 indicator for the file is at the beginning of the file.  The argument
  81 @var{mode} is used only when a file is created, but it doesn't hurt
  82 to supply the argument in any case.
  83
  84 The @var{flags} argument controls how the file is to be opened.  This is
  85 a bit mask; you create the value by the bitwise OR of the appropriate
  86 parameters (using the @samp{|} operator in C).
  87 @xref{File Status Flags}, for the parameters available.
  88
  89 The normal return value from @code{open} is a non-negative integer file
  90 descriptor.  In the case of an error, a value of @code{-1} is returned
  91 instead.  In addition to the usual file name errors (@pxref{File
  92 Name Errors}), the following @code{errno} error conditions are defined
  93 for this function:
  94
  95 @table @code
  96 @item EACCES
  97 The file exists but is not readable/writable as requested by the @var{flags}
  98 argument, or the file does not exist and the directory is unwritable so
  99 it cannot be created.
 100
 101 @item EEXIST
 102 Both @code{O_CREAT} and @code{O_EXCL} are set, and the named file already
 103 exists.
 104
 105 @item EINTR
 106 The @code{open} operation was interrupted by a signal.
 107 @xref{Interrupted Primitives}.
 108
 109 @item EISDIR
 110 The @var{flags} argument specified write access, and the file is a directory.
 111
 112 @item EMFILE
 113 The process has too many files open.
 114 The maximum number of file descriptors is controlled by the
 115 @code{RLIMIT_NOFILE} resource limit; @pxref{Limits on Resources}.
 116
 117 @item ENFILE
 118 The entire system, or perhaps the file system which contains the
 119 directory, cannot support any additional open files at the moment.
 120 (This problem cannot happen on the GNU system.)
 121
 122 @item ENOENT
 123 The named file does not exist, and @code{O_CREAT} is not specified.
 124
 125 @item ENOSPC
 126 The directory or file system that would contain the new file cannot be
 127 extended, because there is no disk space left.
 128
 129 @item ENXIO
 130 @code{O_NONBLOCK} and @code{O_WRONLY} are both set in the @var{flags}
 131 argument, the file named by @var{filename} is a FIFO (@pxref{Pipes and
 132 FIFOs}), and no process has the file open for reading.
 133
 134 @item EROFS
 135 The file resides on a read-only file system and any of @w{@code{O_WRONLY}},
 136 @code{O_RDWR}, and @code{O_TRUNC} are set in the @var{flags} argument,
 137 or @code{O_CREAT} is set and the file does not already exist.
 138 @end table
 139
 140 @c !!! umask
 141
 142 If on a 32-bit machine the sources are compiled with
 143 @code{_FILE_OFFSET_BITS == 64}, the function @code{open} returns a file
 144 descriptor opened in the large file mode, which enables the file handling
 145 functions to use files up to @math{2^63} bytes in size and offsets from
 146 @math{-2^63} to @math{2^63}.  This happens transparently for the user
 147 since all of the low-level file handling functions are equally replaced.
 148
 149 This function is a cancellation point in multi-threaded programs.  This
 150 is a problem if the thread has allocated some resources (like memory, file
 151 descriptors, or semaphores) at the time @code{open} is
 152 called.  If the thread gets canceled, these resources stay allocated
 153 until the program ends.  To avoid this, calls to @code{open} should be
 154 protected using cancellation handlers.
 155 @c ref pthread_cleanup_push / pthread_cleanup_pop
 156
 157 The @code{open} function is the underlying primitive for the @code{fopen}
 158 and @code{freopen} functions, which create streams.
 159 @end deftypefun
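
As a minimal sketch of the flag and error handling described above, the
following helper creates a file that must not already exist and retries
when @code{open} is interrupted by a signal.  The name
@code{create_exclusive} and the permission bits are assumptions made for
illustration; they are not part of the library.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: create PATH for writing, failing with EEXIST
   if the file already exists.  Returns the new descriptor, or -1
   with errno set.  */
int
create_exclusive (const char *path)
{
  int fd;

  do
    /* The mode argument matters here because O_CREAT is set.  */
    fd = open (path, O_WRONLY | O_CREAT | O_EXCL, S_IRUSR | S_IWUSR);
  while (fd == -1 && errno == EINTR);   /* Interrupted: try again.  */

  return fd;
}
```

A caller would typically test the result against @code{-1} and inspect
@code{errno} to distinguish @code{EEXIST} from the other conditions in
the table above.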
 160
 161 @comment fcntl.h
 162 @comment LFS
 163 @deftypefun int open64 (const char *@var{filename}, int @var{flags}[, mode_t @var{mode}])
 164 This function is similar to @code{open}.  It returns a file descriptor
 165 which can be used to access the file named by @var{filename}.  The only
 166 difference is that on 32-bit systems the file is opened in the
 167 large file mode, i.e., file length and file offsets can exceed 31 bits.
 168
 169 Operations on this descriptor that take or return a file offset should use
 170 the counterparts named @code{*64}, e.g., @code{lseek64} or @code{pread64}.
 171
 172 When the sources are compiled with @code{_FILE_OFFSET_BITS == 64} this
 173 function is actually available under the name @code{open}.  I.e., the
 174 new, extended API using 64-bit file sizes and offsets transparently
 175 replaces the old API.
 176 @end deftypefun
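
The transparent replacement can be sketched as follows.  This assumes
that the macro definition appears before every system header in the
source file (or is passed on the compiler command line); the helper name
@code{off_t_is_64_bit} is hypothetical.

```c
/* Assumption: this definition precedes every system header.  */
#define _FILE_OFFSET_BITS 64

#include <limits.h>
#include <sys/types.h>

/* Hypothetical helper: returns nonzero if off_t is wide enough to
   hold 64-bit file offsets in this translation unit.  */
int
off_t_is_64_bit (void)
{
  return sizeof (off_t) * CHAR_BIT >= 64;
}
```

With the definition in place, plain calls to @code{open} and the other
offset-aware functions in this file use the 64-bit interface without any
further source changes.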
 177
 178 @comment fcntl.h
 179 @comment POSIX.1
 180 @deftypefn {Obsolete function} int creat (const char *@var{filename}, mode_t @var{mode})
 181 This function is obsolete.  The call:
 182
 183 @smallexample
 184 creat (@var{filename}, @var{mode})
 185 @end smallexample
 186
 187 @noindent
 188 is equivalent to:
 189
 190 @smallexample
 191 open (@var{filename}, O_WRONLY | O_CREAT | O_TRUNC, @var{mode})
 192 @end smallexample
 193
 194 If on a 32-bit machine the sources are compiled with
 195 @code{_FILE_OFFSET_BITS == 64}, the function @code{creat} returns a file
 196 descriptor opened in the large file mode, which enables the file handling
 197 functions to use files up to @math{2^63} bytes in size and offsets from
 198 @math{-2^63} to @math{2^63}.  This happens transparently for the user
 199 since all of the low-level file handling functions are equally replaced.
 200 @end deftypefn
 201
 202 @comment fcntl.h
 203 @comment LFS
 204 @deftypefn {Obsolete function} int creat64 (const char *@var{filename}, mode_t @var{mode})
 205 This function is similar to @code{creat}.  It returns a file descriptor
 206 which can be used to access the file named by @var{filename}.  The only
 207 difference is that on 32-bit systems the file is opened in the
 208 large file mode, i.e., file length and file offsets can exceed 31 bits.
 209
 210 Operations on this descriptor that take or return a file offset should use
 211 the counterparts named @code{*64}, e.g., @code{lseek64} or @code{pwrite64}.
 212
 213 When the sources are compiled with @code{_FILE_OFFSET_BITS == 64} this
 214 function is actually available under the name @code{creat}.  I.e., the
 215 new, extended API using 64-bit file sizes and offsets transparently
 216 replaces the old API.
 217 @end deftypefn
 218
 219 @comment unistd.h
 220 @comment POSIX.1
 221 @deftypefun int close (int @var{filedes})
 222 The function @code{close} closes the file descriptor @var{filedes}.
 223 Closing a file has the following consequences:
 224
 225 @itemize @bullet
 226 @item
 227 The file descriptor is deallocated.
 228
 229 @item
 230 Any record locks owned by the process on the file are unlocked.
 231
 232 @item
 233 When all file descriptors associated with a pipe or FIFO have been closed,
 234 any unread data is discarded.
 235 @end itemize
 236
 237 This function is a cancellation point in multi-threaded programs.  This
 238 is a problem if the thread has allocated some resources (like memory, file
 239 descriptors, or semaphores) at the time @code{close} is
 240 called.  If the thread gets canceled, these resources stay allocated
 241 until the program ends.  To avoid this, calls to @code{close} should be
 242 protected using cancellation handlers.
 243 @c ref pthread_cleanup_push / pthread_cleanup_pop
 244
 245 The normal return value from @code{close} is @code{0}; a value of @code{-1}
 246 is returned in case of failure.  The following @code{errno} error
 247 conditions are defined for this function:
 248
 249 @table @code
 250 @item EBADF
 251 The @var{filedes} argument is not a valid file descriptor.
 252
 253 @item EINTR
 254 The @code{close} call was interrupted by a signal.
 255 @xref{Interrupted Primitives}.
 256 Here is an example of how to handle @code{EINTR} properly:
 257
 258 @smallexample
 259 TEMP_FAILURE_RETRY (close (desc));
 260 @end smallexample
 261
 262 @item ENOSPC
 263 @itemx EIO
 264 @itemx EDQUOT
 265 When the file is accessed by NFS, these errors from @code{write} can sometimes
 266 not be detected until @code{close}.  @xref{I/O Primitives}, for details
 267 on their meaning.
 268 @end table
 269
 270 Please note that there is @emph{no} separate @code{close64} function.
 271 This is not necessary since this function does not determine nor depend
 272 on the mode of the file.  The kernel which performs the @code{close}
 273 operation knows for which mode the descriptor is used and can handle
 274 this situation.
 275 @end deftypefun
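
The cancellation handlers mentioned above can be sketched with
@code{pthread_cleanup_push} and @code{pthread_cleanup_pop}.  The helper
name @code{close_and_release} is hypothetical, and the sketch assumes
that the only resource at risk is one heap buffer obtained from
@code{malloc}.

```c
#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: close FD while BUFFER is still allocated.
   If the thread is canceled inside close, the cleanup handler frees
   BUFFER so that it does not leak.  Returns the result of close.  */
int
close_and_release (int fd, void *buffer)
{
  int result;

  /* Push and pop must be paired in the same lexical scope.  */
  pthread_cleanup_push (free, buffer);
  result = close (fd);
  /* A nonzero argument runs the handler now, freeing BUFFER.  */
  pthread_cleanup_pop (1);

  return result;
}
```

The same pattern applies to the other cancellation points in this
chapter, such as @code{open}, @code{read} and @code{write}.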
 276
 277 To close a stream, call @code{fclose} (@pxref{Closing Streams}) instead
 278 of trying to close its underlying file descriptor with @code{close}.
 279 This flushes any buffered output and updates the stream object to
 280 indicate that it is closed.
 281
 282
 283 @node Truncating Files
 284 @section Change the size of a file
 285
 286 In some situations it is useful to explicitly set the size of a
 287 file.  Since the 4.2BSD days there has been a function to truncate a file to
 288 at most a given number of bytes, and POSIX defines one additional
 289 function.  The prototypes for these functions are in @file{unistd.h}.
 290
 291 @comment unistd.h
 292 @comment X/Open
 293 @deftypefun int truncate (const char *@var{name}, off_t @var{length})
 294 The @code{truncate} function truncates the file named by @var{name} to
 295 at most @var{length} bytes.  I.e., if the file was larger before, the
 296 extra bytes are stripped off.  If the file was smaller than or equal to
 297 @var{length} in size before, nothing is done.  The file must be writable
 298 by the user to perform this operation.
 299
 300 When the source file is compiled with @code{_FILE_OFFSET_BITS == 64} the
 301 @code{truncate} function is in fact @code{truncate64} and the type
 302 @code{off_t} has 64 bits which makes it possible to handle files up to
 303 @math{2^63} bytes.
 304
 305 The return value is zero if everything went OK.  Otherwise the return
 306 value is @math{-1} and the global variable @code{errno} is set to one of:
 307 @table @code
 308 @item EACCES
 309 The file is not accessible to the user.
 310 @item EINVAL
 311 The @var{length} value is illegal.
 312 @item EISDIR
 313 The object named by @var{name} is a directory.
 314 @item ENOENT
 315 The file named by @var{name} does not exist.
 316 @item ENOTDIR
 317 One part of the @var{name} is not a directory.
 318 @end table
 319
 320 This function was introduced in 4.2BSD but was also available in later
 321 @w{System V} systems.  It was not added to POSIX since the authors felt
 322 it was of only marginal additional utility.  See below.
 323 @end deftypefun
 324
 325 @comment unistd.h
 326 @comment LFS
 327 @deftypefun int truncate64 (const char *@var{name}, off64_t @var{length})
 328 This function is similar to the @code{truncate} function.  The
 329 difference is that the @var{length} argument is 64 bits wide even on
 330 32-bit machines, which allows handling files with a size up to @math{2^63}
 331 bytes.
 332
 333 When the sources are compiled with @code{_FILE_OFFSET_BITS == 64} on a
 334 32-bit machine this function is actually available under the name
 335 @code{truncate} and so transparently replaces the 32-bit interface.
 336 @end deftypefun
 337
 338 @comment unistd.h
 339 @comment POSIX
 340 @deftypefun int ftruncate (int @var{fd}, off_t @var{length})
 341 The @code{ftruncate} function is similar to the @code{truncate}
 342 function.  The main difference is that it takes a descriptor for an
 343 opened file instead of a file name to identify the object.  The file
 344 must be opened for writing to successfully carry out the operation.
 345
 346 The POSIX standard leaves it implementation-defined what happens if the
 347 specified new @var{length} of the file is bigger than the original size.
 348 The @code{ftruncate} function might simply leave the file alone and do
 349 nothing, or it can increase the size to the desired size.  In the latter
 350 case the extended area should be zero-filled.  So using @code{ftruncate}
 351 is not a reliable way to increase the file size, but if it is possible it is
 352 probably the fastest way.  The function also operates on POSIX shared
 353 memory segments if these are implemented by the system.
 354
 355 When the source file is compiled with @code{_FILE_OFFSET_BITS == 64} the
 356 @code{ftruncate} function is in fact @code{ftruncate64} and the type
 357 @code{off_t} has 64 bits which makes it possible to handle files up to
 358 @math{2^63} bytes.
 359
 360 On success the function returns zero.  Otherwise it returns @math{-1}
 361 and sets @code{errno} to one of these values:
 362 @table @code
 363 @item EBADF
 364 @var{fd} is not a valid file descriptor or is not opened for writing.
 365 @item EINVAL
 366 The object referred to by @var{fd} does not permit this operation.
 367 @item EROFS
 368 The file is on a read-only file system.
 369 @end table
 370 @end deftypefun
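
Because growing a file with @code{ftruncate} is not guaranteed to work,
a program that depends on the resulting size may want to verify it.  The
following is a minimal sketch under that assumption; the helper name
@code{resize_file} is hypothetical.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: set the size of the file NAME to LENGTH bytes.
   Returns 0 only if the file really has that size afterwards, and
   -1 otherwise.  */
int
resize_file (const char *name, off_t length)
{
  struct stat st;
  int ok;
  int fd = open (name, O_WRONLY);   /* ftruncate needs write access.  */

  if (fd == -1)
    return -1;

  /* ftruncate may leave a shorter file alone, so check the result.  */
  ok = (ftruncate (fd, length) == 0
        && fstat (fd, &st) == 0
        && st.st_size == length);

  if (close (fd) == -1)
    ok = 0;

  return ok ? 0 : -1;
}
```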
 371
 372 @comment unistd.h
 373 @comment LFS
 374 @deftypefun int ftruncate64 (int @var{fd}, off64_t @var{length})
 375 This function is similar to the @code{ftruncate} function.  The
 376 difference is that the @var{length} argument is 64 bits wide even on
 377 32-bit machines, which allows handling files with a size up to @math{2^63}
 378 bytes.
 379
 380 When the sources are compiled with @code{_FILE_OFFSET_BITS == 64} on a
 381 32-bit machine this function is actually available under the name
 382 @code{ftruncate} and so transparently replaces the 32-bit interface.
 383 @end deftypefun
 384
 385 @node I/O Primitives
 386 @section Input and Output Primitives
 387
 388 This section describes the functions for performing primitive input and
 389 output operations on file descriptors: @code{read}, @code{write}, and
 390 @code{lseek}.  These functions are declared in the header file
 391 @file{unistd.h}.
 392 @pindex unistd.h
 393
 394 @comment unistd.h
 395 @comment POSIX.1
 396 @deftp {Data Type} ssize_t
 397 This data type is used to represent the sizes of blocks that can be
 398 read or written in a single operation.  It is similar to @code{size_t},
 399 but must be a signed type.
 400 @end deftp
 401
 402 @cindex reading from a file descriptor
 403 @comment unistd.h
 404 @comment POSIX.1
 405 @deftypefun ssize_t read (int @var{filedes}, void *@var{buffer}, size_t @var{size})
 406 The @code{read} function reads up to @var{size} bytes from the file
 407 with descriptor @var{filedes}, storing the results in the @var{buffer}.
 408 (This is not necessarily a character string and there is no terminating
 409 null character added.)
 410
 411 @cindex end-of-file, on a file descriptor
 412 The return value is the number of bytes actually read.  This might be
 413 less than @var{size}; for example, if there aren't that many bytes left
 414 in the file or if there aren't that many bytes immediately available.
 415 The exact behavior depends on what kind of file it is.  Note that
 416 reading less than @var{size} bytes is not an error.
 417
 418 A value of zero indicates end-of-file (except if the value of the
 419 @var{size} argument is also zero).  This is not considered an error.
 420 If you keep calling @code{read} while at end-of-file, it will keep
 421 returning zero and doing nothing else.
 422
 423 If @code{read} returns at least one character, there is no way you can
 424 tell whether end-of-file was reached.  But if you did reach the end, the
 425 next read will return zero.
 426
 427 In case of an error, @code{read} returns @code{-1}.  The following
 428 @code{errno} error conditions are defined for this function:
 429
 430 @table @code
 431 @item EAGAIN
 432 Normally, when no input is immediately available, @code{read} waits for
 433 some input.  But if the @code{O_NONBLOCK} flag is set for the file
 434 (@pxref{File Status Flags}), @code{read} returns immediately without
 435 reading any data, and reports this error.
 436
 437 @strong{Compatibility Note:} Most versions of BSD Unix use a different
 438 error code for this: @code{EWOULDBLOCK}.  In the GNU library,
 439 @code{EWOULDBLOCK} is an alias for @code{EAGAIN}, so it doesn't matter
 440 which name you use.
 441
 442 On some systems, reading a large amount of data from a character special
 443 file can also fail with @code{EAGAIN} if the kernel cannot find enough
 444 physical memory to lock down the user's pages.  This is limited to
 445 devices that transfer with direct memory access into the user's memory,
 446 which means it does not include terminals, since they always use
 447 separate buffers inside the kernel.  This problem never happens in the
 448 GNU system.
 449
 450 Any condition that could result in @code{EAGAIN} can instead result in a
 451 successful @code{read} which returns fewer bytes than requested.
 452 Calling @code{read} again immediately would result in @code{EAGAIN}.
 453
 454 @item EBADF
 455 The @var{filedes} argument is not a valid file descriptor,
 456 or is not open for reading.
 457
 458 @item EINTR
 459 @code{read} was interrupted by a signal while it was waiting for input.
 460 @xref{Interrupted Primitives}.  A signal will not necessarily cause
 461 @code{read} to return @code{EINTR}; it may instead result in a
 462 successful @code{read} which returns fewer bytes than requested.
 463
 464 @item EIO
 465 For many devices, and for disk files, this error code indicates
 466 a hardware error.
 467
 468 @code{EIO} also occurs when a background process tries to read from the
 469 controlling terminal, and the normal action of stopping the process by
 470 sending it a @code{SIGTTIN} signal isn't working.  This might happen if the
 471 signal is being blocked or ignored, or because the process group is
 472 orphaned.  @xref{Job Control}, for more information about job control,
 473 and @ref{Signal Handling}, for information about signals.
 474 @end table
 475
 476 Please note that there is no function named @code{read64}.  This is not
 477 necessary since this function does not directly modify or handle the
 478 possibly wide file offset.  Since the kernel handles this state
 479 internally the @code{read} function can be used for all cases.
 480
 481 This function is a cancellation point in multi-threaded programs.  This
 482 is a problem if the thread has allocated some resources (like memory, file
 483 descriptors, or semaphores) at the time @code{read} is
 484 called.  If the thread gets canceled, these resources stay allocated
 485 until the program ends.  To avoid this, calls to @code{read} should be
 486 protected using cancellation handlers.
 487 @c ref pthread_cleanup_push / pthread_cleanup_pop
 488
 489 The @code{read} function is the underlying primitive for all of the
 490 functions that read from streams, such as @code{fgetc}.
 491 @end deftypefun
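
Since a short count from @code{read} is not an error and end-of-file is
reported only by a return value of zero, callers that need a full buffer
usually loop.  The following is a minimal sketch for a blocking
descriptor; the helper name @code{read_full} is hypothetical, and with
@code{O_NONBLOCK} set it would simply fail with @code{EAGAIN}.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: read up to SIZE bytes from FD into BUFFER,
   continuing after short reads.  Returns the number of bytes read,
   which is less than SIZE only at end-of-file, or -1 on error.  */
ssize_t
read_full (int fd, void *buffer, size_t size)
{
  char *p = buffer;
  size_t total = 0;

  while (total < size)
    {
      ssize_t n = read (fd, p + total, size - total);

      if (n == 0)
        break;                  /* End of file.  */
      if (n == -1)
        {
          if (errno == EINTR)
            continue;           /* Interrupted before any data: retry.  */
          return -1;
        }
      total += (size_t) n;
    }

  return (ssize_t) total;
}
```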
 492
 493 @comment unistd.h
 494 @comment Unix98
 495 @deftypefun ssize_t pread (int @var{filedes}, void *@var{buffer}, size_t @var{size}, off_t @var{offset})
 496 The @code{pread} function is similar to the @code{read} function.  The
 497 first three arguments are identical and also the return values and error
 498 codes correspond.
 499
 500 The difference is the fourth argument and its handling.  The data block
 501 is not read from the current position of the file descriptor
 502 @var{filedes}.  Instead the data is read from the file starting at
 503 position @var{offset}.  The position of the file descriptor itself is
 504 not affected by the operation.  The value is the same as before the call.
 505
 506 When the source file is compiled with @code{_FILE_OFFSET_BITS == 64} the
 507 @code{pread} function is in fact @code{pread64} and the type
 508 @code{off_t} has 64 bits which makes it possible to handle files up to
 509 @math{2^63} bytes.
 510
 511 The return value of @code{pread} describes the number of bytes read.
 512 In the error case it returns @math{-1} like @code{read} does and the
 513 error codes are also the same, with these additions:
 514 @table @code
 515 @item EINVAL
 516 The value given for @var{offset} is negative and therefore illegal.
 517
 518 @item ESPIPE
 519 The file descriptor @var{filedes} is associated with a pipe or a FIFO and
 520 this device does not allow positioning of the file pointer.
 521 @end table
 522
 523 The function is an extension defined in the Single Unix Specification
 524 version 2.
 525 @end deftypefun
 526
 527 @comment unistd.h
 528 @comment LFS
 529 @deftypefun ssize_t pread64 (int @var{filedes}, void *@var{buffer}, size_t @var{size}, off64_t @var{offset})
 530 This function is similar to the @code{pread} function.  The difference
 531 is that the @var{offset} parameter is of type @code{off64_t} instead of
 532 @code{off_t}, which makes it possible on 32-bit machines to address
 533 files larger than @math{2^31} bytes and up to @math{2^63} bytes.  The
 534 file descriptor @var{filedes} must be opened using @code{open64} since
 535 otherwise the large offsets possible with @code{off64_t} will lead to
 536 errors with a descriptor in small file mode.
 537
 538 When the sources are compiled with @code{_FILE_OFFSET_BITS == 64} on a
 539 32-bit machine this function is actually available under the name
 540 @code{pread} and so transparently replaces the 32-bit interface.
 541 @end deftypefun
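
A typical use of @code{pread} is fetching fixed-size records by index
without disturbing the descriptor's file position.  The following sketch
assumes such a record layout; the names @code{read_record} and
@code{position_survives_pread} and the four-byte record size are
hypothetical.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read record number INDEX, where every record
   is RECORD_SIZE bytes long.  The file position of FD is not moved.  */
ssize_t
read_record (int fd, void *record, size_t record_size, off_t index)
{
  return pread (fd, record, record_size, index * (off_t) record_size);
}

/* Returns 1 if the file position of a descriptor opened on NAME is
   the same before and after read_record, 0 if not, -1 on error.  */
int
position_survives_pread (const char *name)
{
  char record[4];
  off_t before, after;
  int fd = open (name, O_RDONLY);

  if (fd == -1)
    return -1;

  before = lseek (fd, 0, SEEK_CUR);
  if (read_record (fd, record, sizeof record, 1) == -1)
    {
      close (fd);
      return -1;
    }
  after = lseek (fd, 0, SEEK_CUR);

  close (fd);
  return before == after;
}
```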
 542
 543 @cindex writing to a file descriptor
 544 @comment unistd.h
 545 @comment POSIX.1
 546 @deftypefun ssize_t write (int @var{filedes}, const void *@var{buffer}, size_t @var{size})
 547 The @code{write} function writes up to @var{size} bytes from
 548 @var{buffer} to the file with descriptor @var{filedes}.  The data in
 549 @var{buffer} is not necessarily a character string and a null character is
 550 output like any other character.
 551
 552 The return value is the number of bytes actually written.  This may be
 553 @var{size}, but can always be smaller.  Your program should always call
 554 @code{write} in a loop, iterating until all the data is written.
 555
 556 Once @code{write} returns, the data is enqueued to be written and can be
 557 read back right away, but it is not necessarily written out to permanent
 558 storage immediately.  You can use @code{fsync} when you need to be sure
 559 your data has been permanently stored before continuing.  (It is more
 560 efficient for the system to batch up consecutive writes and do them all
 561 at once when convenient.  Normally they will always be written to disk
 562 within a minute or less.)  Modern systems provide another function
 563 @code{fdatasync} which guarantees integrity only for the file data and
 564 is therefore faster.
 565 @c !!! xref fsync, fdatasync
 566 You can use the @code{O_FSYNC} open mode to make @code{write} always
 567 store the data to disk before returning; @pxref{Operating Modes}.
 568
 569 In the case of an error, @code{write} returns @code{-1}.  The following
 570 @code{errno} error conditions are defined for this function:
 571
 572 @table @code
 573 @item EAGAIN
 574 Normally, @code{write} blocks until the write operation is complete.
 575 But if the @code{O_NONBLOCK} flag is set for the file (@pxref{Control
 576 Operations}), it returns immediately without writing any data, and
 577 reports this error.  An example of a situation that might cause the
 578 process to block on output is writing to a terminal device that supports
 579 flow control, where output has been suspended by receipt of a STOP
 580 character.
 581
 582 @strong{Compatibility Note:} Most versions of BSD Unix use a different
 583 error code for this: @code{EWOULDBLOCK}.  In the GNU library,
 584 @code{EWOULDBLOCK} is an alias for @code{EAGAIN}, so it doesn't matter
 585 which name you use.
 586
 587 On some systems, writing a large amount of data to a character special
 588 file can also fail with @code{EAGAIN} if the kernel cannot find enough
 589 physical memory to lock down the user's pages.  This is limited to
 590 devices that transfer with direct memory access into the user's memory,
 591 which means it does not include terminals, since they always use
 592 separate buffers inside the kernel.  This problem does not arise in the
 593 GNU system.
 594
 595 @item EBADF
 596 The @var{filedes} argument is not a valid file descriptor,
 597 or is not open for writing.
 598
 599 @item EFBIG
 600 The size of the file would become larger than the implementation can support.
 601
 602 @item EINTR
 603 The @code{write} operation was interrupted by a signal while it was
 604 blocked waiting for completion.  A signal will not necessarily cause
 605 @code{write} to return @code{EINTR}; it may instead result in a
 606 successful @code{write} which writes fewer bytes than requested.
 607 @xref{Interrupted Primitives}.
 608
 609 @item EIO
 610 For many devices, and for disk files, this error code indicates
 611 a hardware error.
 612
 613 @item ENOSPC
 614 The device containing the file is full.
 615
 616 @item EPIPE
 617 This error is returned when you try to write to a pipe or FIFO that
 618 isn't open for reading by any process.  When this happens, a @code{SIGPIPE}
 619 signal is also sent to the process; see @ref{Signal Handling}.
 620 @end table
 621
 622 Unless you have arranged to prevent @code{EINTR} failures, you should
 623 check @code{errno} after each failing call to @code{write}, and if the
 624 error was @code{EINTR}, you should simply repeat the call.
 625 @xref{Interrupted Primitives}.  The easy way to do this is with the
 626 macro @code{TEMP_FAILURE_RETRY}, as follows:
 627
 628 @smallexample
 629 nbytes = TEMP_FAILURE_RETRY (write (desc, buffer, count));
 630 @end smallexample
 631
 632 Please note that there is no function named @code{write64}.  This is not
 633 necessary since this function does not directly modify or handle the
 634 possibly wide file offset.  Since the kernel handles this state
 635 internally the @code{write} function can be used for all cases.
 636
 637 This function is a cancellation point in multi-threaded programs.  This
 638 is a problem if the thread has allocated some resources (like memory, file
 639 descriptors, or semaphores) at the time @code{write} is
 640 called.  If the thread gets canceled, these resources stay allocated
 641 until the program ends.  To avoid this, calls to @code{write} should be
 642 protected using cancellation handlers.
 643 @c ref pthread_cleanup_push / pthread_cleanup_pop
 644
 645 The @code{write} function is the underlying primitive for all of the
 646 functions that write to streams, such as @code{fputc}.
 647 @end deftypefun
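
The advice to call @code{write} in a loop can be sketched as follows for
a blocking descriptor.  The helper name @code{write_all} is hypothetical,
and the sketch retries explicitly on @code{EINTR} rather than using
@code{TEMP_FAILURE_RETRY}.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: write all SIZE bytes of BUFFER to FD.
   Returns 0 on success, or -1 with errno set on failure.  */
int
write_all (int fd, const void *buffer, size_t size)
{
  const char *p = buffer;

  while (size > 0)
    {
      ssize_t n = write (fd, p, size);

      if (n == -1)
        {
          if (errno == EINTR)
            continue;           /* Nothing was written: try again.  */
          return -1;
        }
      p += n;                   /* Skip the bytes already written.  */
      size -= (size_t) n;
    }

  return 0;
}
```

A successful return only means the data has been queued; a caller that
needs it on permanent storage would still call @code{fsync} afterwards.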
 648
 649 @comment unistd.h
 650 @comment Unix98
 651 @deftypefun ssize_t pwrite (int @var{filedes}, const void *@var{buffer}, size_t @var{size}, off_t @var{offset})
 652 The @code{pwrite} function is similar to the @code{write} function.  The
 653 first three arguments are identical and also the return values and error
 654 codes correspond.
 655
 656 The difference is the fourth argument and its handling.  The data block
 657 is not written to the current position of the file descriptor
 658 @var{filedes}.  Instead the data is written to the file starting at
 659 position @var{offset}.  The position of the file descriptor itself is
 660 not affected by the operation.  The value is the same as before the call.
 661
 662 When the source file is compiled with @code{_FILE_OFFSET_BITS == 64} the
 663 @code{pwrite} function is in fact @code{pwrite64} and the type
 664 @code{off_t} has 64 bits which makes it possible to handle files up to
 665 @math{2^63} bytes.
 666
 667 The return value of @code{pwrite} describes the number of written bytes.
 668 In the error case it returns @math{-1} like @code{write} does and the
 669 error codes are also the same, with these additions:
 670 @table @code
 671 @item EINVAL
 672 The value given for @var{offset} is negative and therefore illegal.
 673
 674 @item ESPIPE
 675 The file descriptor @var{filedes} is associated with a pipe or a FIFO and
 676 this device does not allow positioning of the file pointer.
 677 @end table
 678
 679 The function is an extension defined in the Single Unix Specification
 680 version 2.
 681 @end deftypefun
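
The following sketch illustrates the point about the file position.  The
helper name @code{patch_header} and the four-byte marker are
placeholders invented for this illustration, not part of any interface,
and error handling is kept to a minimum.

@smallexample
#include <unistd.h>

/* Overwrite the first four bytes of the file open on FD without
   disturbing the file position of FD.  Return 0 on success and
   -1 on failure.  (Hypothetical helper, for illustration only.)  */
int
patch_header (int fd)
@{
  off_t before, after;

  before = lseek (fd, 0, SEEK_CUR);
  if (before == (off_t) -1)
    return -1;

  /* Write at offset 0, regardless of the current position.  */
  if (pwrite (fd, "HEAD", 4, 0) != 4)
    return -1;

  /* The file position is the same as before the call.  */
  after = lseek (fd, 0, SEEK_CUR);
  return after == before ? 0 : -1;
@}
@end smallexample

@noindent
The descriptor passed to such a function must refer to a seekable file
opened for writing; on a pipe or FIFO both @code{lseek} and
@code{pwrite} would fail with @code{ESPIPE}.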
 682
 683 @comment unistd.h
 684 @comment LFS
 685 @deftypefun ssize_t pwrite64 (int @var{filedes}, const void *@var{buffer}, size_t @var{size}, off64_t @var{offset})
 686 This function is similar to the @code{pwrite} function.  The difference
 687 is that the @var{offset} parameter is of type @code{off64_t} instead of
 688 @code{off_t}, which makes it possible on 32-bit machines to address
 689 files larger than @math{2^31} bytes and up to @math{2^63} bytes.  The
 690 file descriptor @var{filedes} must be opened using @code{open64} since
 691 otherwise the large offsets possible with @code{off64_t} will lead to
 692 errors with a descriptor in small file mode.
 693
 694 When the source file is compiled with @code{_FILE_OFFSET_BITS == 64} on a
 695 32-bit machine this function is actually available under the name
 696 @code{pwrite} and so transparently replaces the 32-bit interface.
 697 @end deftypefun
 698
 699
 700 @node File Position Primitive
 701 @section Setting the File Position of a Descriptor
 702
 703 Just as you can set the file position of a stream with @code{fseek}, you
 704 can set the file position of a descriptor with @code{lseek}.  This
 705 specifies the position in the file for the next @code{read} or
 706 @code{write} operation.  @xref{File Positioning}, for more information
 707 on the file position and what it means.
 708
 709 To read the current file position value from a descriptor, use
 710 @code{lseek (@var{desc}, 0, SEEK_CUR)}.
 711
 712 @cindex file positioning on a file descriptor
 713 @cindex positioning a file descriptor
 714 @cindex seeking on a file descriptor
 715 @comment unistd.h
 716 @comment POSIX.1
 717 @deftypefun off_t lseek (int @var{filedes}, off_t @var{offset}, int @var{whence})
 718 The @code{lseek} function is used to change the file position of the
 719 file with descriptor @var{filedes}.
 720
 721 The @var{whence} argument specifies how the @var{offset} should be
 722 interpreted in the same way as for the @code{fseek} function, and must be
 723 one of the symbolic constants @code{SEEK_SET}, @code{SEEK_CUR}, or
 724 @code{SEEK_END}.
 725
 726 @table @code
 727 @item SEEK_SET
 728 Specifies that @var{offset} is a count of characters from the beginning
 729 of the file.
 730
 731 @item SEEK_CUR
 732 Specifies that @var{offset} is a count of characters from the current
 733 file position.  This count may be positive or negative.
 734
 735 @item SEEK_END
 736 Specifies that @var{offset} is a count of characters from the end of
 737 the file.  A negative count specifies a position within the current
 738 extent of the file; a positive count specifies a position past the
 739 current end.  If you set the position past the current end, and
 740 actually write data, you will extend the file with zeros up to that
 741 position.
     @end table
 742
 743 The return value from @code{lseek} is normally the resulting file
 744 position, measured in bytes from the beginning of the file.
 745 You can use this feature together with @code{SEEK_CUR} to read the
 746 current file position.
 747
 748 If you want to append to the file, setting the file position to the
 749 current end of file with @code{SEEK_END} is not sufficient.  Another
 750 process may write more data after you seek but before you write,
 751 extending the file so the position you write onto clobbers their data.
 752 Instead, use the @code{O_APPEND} operating mode; @pxref{Operating Modes}.
 753
 754 You can set the file position past the current end of the file.  This
 755 does not by itself make the file longer; @code{lseek} never changes the
 756 file.  But subsequent output at that position will extend the file.
 757 Characters between the previous end of file and the new position are
 758 filled with zeros.  Extending the file in this way can create a
 759 ``hole'': the blocks of zeros are not actually allocated on disk, so the
 760 file takes up less space than it appears to; it is then called a
 761 ``sparse file''.
 762 @cindex sparse files
 763 @cindex holes in files
 764
 765 If the file position cannot be changed, or the operation is in some way
 766 invalid, @code{lseek} returns a value of @code{-1}.  The following
 767 @code{errno} error conditions are defined for this function:
 768
 769 @table @code
 770 @item EBADF
 771 The @var{filedes} is not a valid file descriptor.
 772
 773 @item EINVAL
 774 The @var{whence} argument value is not valid, or the resulting
 775 file offset is not valid (for example, it would be negative).
 776
 777 @item ESPIPE
 778 The @var{filedes} corresponds to an object that cannot be positioned,
 779 such as a pipe, FIFO or terminal device.  (POSIX.1 specifies this error
 780 only for pipes and FIFOs, but in the GNU system, you always get
 781 @code{ESPIPE} if the object is not seekable.)
 782 @end table
 783
 784 When the source file is compiled with @code{_FILE_OFFSET_BITS == 64} the
 785 @code{lseek} function is in fact @code{lseek64} and the type
 786 @code{off_t} has 64 bits which makes it possible to handle files up to
 787 @math{2^63} bytes.
 788
 789 This function is a cancelation point in multi-threaded programs.  This
 790 is a problem if the thread allocates some resources (like memory, file
 791 descriptors, semaphores or whatever) at the time @code{lseek} is
 792 called.  If the thread gets canceled, these resources stay allocated
 793 until the program ends.  To avoid this, calls to @code{lseek} should be
 794 protected using cancelation handlers.
 795 @c ref pthread_cleanup_push / pthread_cleanup_pop
 796
 797 The @code{lseek} function is the underlying primitive for the
 798 @code{fseek}, @code{fseeko}, @code{ftell}, @code{ftello} and
 799 @code{rewind} functions, which operate on streams instead of file
 800 descriptors.
 801 @end deftypefun
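
As a rough illustration of the points above, the following sketch seeks
past the end of a newly created file, writes one byte there, and then
reads back the current file position.  The function name
@code{make_sparse} and the one-megabyte offset are arbitrary choices for
this example.  Whether the skipped range really occupies no disk blocks
depends on the file system.

@smallexample
#include <fcntl.h>
#include <unistd.h>

/* Create FILENAME with one byte of data placed one megabyte past
   the beginning.  Return the resulting file position, or -1 on
   error.  (Hypothetical helper, for illustration only.)  */
off_t
make_sparse (const char *filename)
@{
  off_t pos;
  int fd = open (filename, O_WRONLY | O_CREAT | O_TRUNC, 0644);

  if (fd < 0)
    return -1;

  /* Seeking past the end does not change the file by itself.  */
  pos = lseek (fd, 1024 * 1024, SEEK_SET);
  if (pos == (off_t) -1 || write (fd, "x", 1) != 1)
    @{
      close (fd);
      return -1;
    @}

  /* Read the current position; it is now 1024 * 1024 + 1.  */
  pos = lseek (fd, 0, SEEK_CUR);
  close (fd);
  return pos;
@}
@end smallexample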
 802
 803 @comment unistd.h
 804 @comment LFS
 805 @deftypefun off64_t lseek64 (int @var{filedes}, off64_t @var{offset}, int @var{whence})
 806 This function is similar to the @code{lseek} function.  The difference
 807 is that the @var{offset} parameter is of type @code{off64_t} instead of
 808 @code{off_t}, which makes it possible on 32-bit machines to address
 809 files larger than @math{2^31} bytes and up to @math{2^63} bytes.  The
 810 file descriptor @var{filedes} must be opened using @code{open64} since
 811 otherwise the large offsets possible with @code{off64_t} will lead to
 812 errors with a descriptor in small file mode.
 813
 814 When the source file is compiled with @code{_FILE_OFFSET_BITS == 64} on a
 815 32-bit machine this function is actually available under the name
 816 @code{lseek} and so transparently replaces the 32-bit interface.
 817 @end deftypefun
 818
 819 You can have multiple descriptors for the same file if you open the file
 820 more than once, or if you duplicate a descriptor with @code{dup}.
 821 Descriptors that come from separate calls to @code{open} have independent
 822 file positions; using @code{lseek} on one descriptor has no effect on the
 823 other.  For example,
 824
 825 @smallexample
 826 @group
 827 @{
 828   int d1, d2;
 829   char buf[4];
 830   d1 = open ("foo", O_RDONLY);
 831   d2 = open ("foo", O_RDONLY);
 832   lseek (d1, 1024, SEEK_SET);
 833   read (d2, buf, 4);
 834 @}
 835 @end group
 836 @end smallexample
 837
 838 @noindent
 839 will read the first four characters of the file @file{foo}.  (The
 840 error-checking code necessary for a real program has been omitted here
 841 for brevity.)
 842
 843 By contrast, descriptors made by duplication share a common file
 844 position with the original descriptor that was duplicated.  Anything
 845 which alters the file position of one of the duplicates, including
 846 reading or writing data, affects all of them alike.  Thus, for example,
 847
 848 @smallexample
 849 @{
 850   int d1, d2, d3;
 851   char buf1[4], buf2[4];
 852   d1 = open ("foo", O_RDONLY);
 853   d2 = dup (d1);
 854   d3 = dup (d2);
 855   lseek (d3, 1024, SEEK_SET);
 856   read (d1, buf1, 4);
 857   read (d2, buf2, 4);
 858 @}
 859 @end smallexample
 860
 861 @noindent
 862 will read four characters starting at byte offset 1024 of
 863 @file{foo}, and then four more characters starting at byte offset
 864 1028.
 865
 866 @comment sys/types.h
 867 @comment POSIX.1
 868 @deftp {Data Type} off_t
 869 This is an arithmetic data type used to represent file sizes.
 870 In the GNU system, this is equivalent to @code{fpos_t} or @code{long int}.
 871 @end deftp
 872
 873 @comment sys/types.h
 874 @comment LFS
 875 @deftp {Data Type} off64_t
 876 This type is used similarly to @code{off_t}.  The difference is that even
 877 on 32-bit machines, where the @code{off_t} type would have 32 bits,
 878 @code{off64_t} has 64 bits and so is able to address files up to
 879 @math{2^63} bytes in length.
 880 @end deftp
 881
 882 These aliases for the @samp{SEEK_@dots{}} constants exist for the sake
 883 of compatibility with older BSD systems.  They are defined in two
 884 different header files: @file{fcntl.h} and @file{sys/file.h}.
 885
 886 @table @code
 887 @item L_SET
 888 An alias for @code{SEEK_SET}.
 889
 890 @item L_INCR
 891 An alias for @code{SEEK_CUR}.
 892
 893 @item L_XTND
 894 An alias for @code{SEEK_END}.
 895 @end table
 896
 897 @node Descriptors and Streams
 898 @section Descriptors and Streams
 899 @cindex streams, and file descriptors
 900 @cindex converting file descriptor to stream
 901 @cindex extracting file descriptor from stream
 902
 903 Given an open file descriptor, you can create a stream for it with the
 904 @code{fdopen} function.  You can get the underlying file descriptor for
 905 an existing stream with the @code{fileno} function.  These functions are
 906 declared in the header file @file{stdio.h}.
 907 @pindex stdio.h
 908
 909 @comment stdio.h
 910 @comment POSIX.1
 911 @deftypefun {FILE *} fdopen (int @var{filedes}, const char *@var{opentype})
 912 The @code{fdopen} function returns a new stream for the file descriptor
 913 @var{filedes}.
 914
 915 The @var{opentype} argument is interpreted in the same way as for the
 916 @code{fopen} function (@pxref{Opening Streams}), except that
 917 the @samp{b} option is not permitted; this is because GNU makes no
 918 distinction between text and binary files.  Also, @code{"w"} and
 919 @code{"w+"} do not cause truncation of the file; these have an effect only
 920 when opening a file, and in this case the file has already been opened.
 921 You must make sure that the @var{opentype} argument matches the actual
 922 mode of the open file descriptor.
 923
 924 The return value is the new stream.  If the stream cannot be created
 925 (for example, if the modes for the file indicated by the file descriptor
 926 do not permit the access specified by the @var{opentype} argument), a
 927 null pointer is returned instead.
 928
 929 In some other systems, @code{fdopen} may fail to detect that the modes
 930 for the file descriptor do not permit the access specified by
 931 @var{opentype}.  The GNU C library always checks for this.
 932 @end deftypefun
 933
 934 For an example showing the use of the @code{fdopen} function,
 935 see @ref{Creating a Pipe}.
 936
 937 @comment stdio.h
 938 @comment POSIX.1
 939 @deftypefun int fileno (FILE *@var{stream})
 940 This function returns the file descriptor associated with the stream
 941 @var{stream}.  If an error is detected (for example, if the @var{stream}
 942 is not valid) or if @var{stream} does not do I/O to a file,
 943 @code{fileno} returns @code{-1}.
 944 @end deftypefun
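
Here is a minimal sketch of moving between the two levels.  It wraps a
duplicate of a descriptor in a stream, checks that @code{fileno} reports
the descriptor the stream was built on, and closes the stream again.
The function name @code{greet_on_descriptor} and the message text are
made up for this example.

@smallexample
#include <stdio.h>
#include <unistd.h>

/* Print a message to a stream created on a duplicate of FD.  The
   original descriptor FD stays open.  Return 0 on success, -1 on
   error.  (Hypothetical helper, for illustration only.)  */
int
greet_on_descriptor (int fd)
@{
  FILE *stream;
  int copy = dup (fd);

  if (copy < 0)
    return -1;

  stream = fdopen (copy, "w");
  if (stream == NULL)
    @{
      close (copy);
      return -1;
    @}

  /* fileno gives back the descriptor the stream was built on.  */
  if (fileno (stream) != copy)
    @{
      fclose (stream);
      return -1;
    @}

  fputs ("hello\n", stream);

  /* fclose flushes the buffer and also closes COPY.  */
  return fclose (stream) == 0 ? 0 : -1;
@}
@end smallexample

@noindent
Duplicating the descriptor first is a design choice of this sketch: it
lets the caller keep using @var{fd} after the stream has been closed.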
 945
 946 @cindex standard file descriptors
 947 @cindex file descriptors, standard
 948 There are also symbolic constants defined in @file{unistd.h} for the
 949 file descriptors belonging to the standard streams @code{stdin},
 950 @code{stdout}, and @code{stderr}; see @ref{Standard Streams}.
 951 @pindex unistd.h
 952
 953 @comment unistd.h
 954 @comment POSIX.1
 955 @table @code
 956 @item STDIN_FILENO
 957 @vindex STDIN_FILENO
 958 This macro has value @code{0}, which is the file descriptor for
 959 standard input.
 960 @cindex standard input file descriptor
 961
 962 @comment unistd.h
 963 @comment POSIX.1
 964 @item STDOUT_FILENO
 965 @vindex STDOUT_FILENO
 966 This macro has value @code{1}, which is the file descriptor for
 967 standard output.
 968 @cindex standard output file descriptor
 969
 970 @comment unistd.h
 971 @comment POSIX.1
 972 @item STDERR_FILENO
 973 @vindex STDERR_FILENO
 974 This macro has value @code{2}, which is the file descriptor for
 975 standard error output.
 976 @end table
 977 @cindex standard error file descriptor
 978
 979 @node Stream/Descriptor Precautions
 980 @section Dangers of Mixing Streams and Descriptors
 981 @cindex channels
 982 @cindex streams and descriptors
 983 @cindex descriptors and streams
 984 @cindex mixing descriptors and streams
 985
 986 You can have multiple file descriptors and streams (let's call both
 987 streams and descriptors ``channels'' for short) connected to the same
 988 file, but you must take care to avoid confusion between channels.  There
 989 are two cases to consider: @dfn{linked} channels that share a single
 990 file position value, and @dfn{independent} channels that have their own
 991 file positions.
 992
 993 It's best to use just one channel in your program for actual data
 994 transfer to any given file, except when all the access is for input.
 995 For example, if you open a pipe (something you can only do at the file
 996 descriptor level), either do all I/O with the descriptor, or construct a
 997 stream from the descriptor with @code{fdopen} and then do all I/O with
 998 the stream.
 999
1000 @menu
1001 * Linked Channels::        Dealing with channels sharing a file position.
1002 * Independent Channels::   Dealing with separately opened, unlinked channels.
1003 * Cleaning Streams::       Cleaning a stream makes it safe to use
1004                             another channel.
1005 @end menu
1006
1007 @node Linked Channels
1008 @subsection Linked Channels
1009 @cindex linked channels
1010
1011 Channels that come from a single opening share the same file position;
1012 we call them @dfn{linked} channels.  Linked channels result when you
1013 make a stream from a descriptor using @code{fdopen}, when you get a
1014 descriptor from a stream with @code{fileno}, when you copy a descriptor
1015 with @code{dup} or @code{dup2}, and when descriptors are inherited
1016 during @code{fork}.  For files that don't support random access, such as
1017 terminals and pipes, @emph{all} channels are effectively linked.  On
1018 random-access files, all append-type output streams are effectively
1019 linked to each other.
1020
1021 @cindex cleaning up a stream
1022 If you have been using a stream for I/O, and you want to do I/O using
1023 another channel (either a stream or a descriptor) that is linked to it,
1024 you must first @dfn{clean up} the stream that you have been using.
1025 @xref{Cleaning Streams}.
1026
1027 Terminating a process, or executing a new program in the process,
1028 destroys all the streams in the process.  If descriptors linked to these
1029 streams persist in other processes, their file positions become
1030 undefined as a result.  To prevent this, you must clean up the streams
1031 before destroying them.
1032
1033 @node Independent Channels
1034 @subsection Independent Channels
1035 @cindex independent channels
1036
1037 When you open channels (streams or descriptors) separately on a seekable
1038 file, each channel has its own file position.  These are called
1039 @dfn{independent channels}.
1040
1041 The system handles each channel independently.  Most of the time, this
1042 is quite predictable and natural (especially for input): each channel
1043 can read or write sequentially at its own place in the file.  However,
1044 if some of the channels are streams, you must take these precautions:
1045
1046 @itemize @bullet
1047 @item
1048 You should clean an output stream after use, before doing anything else
1049 that might read or write from the same part of the file.
1050
1051 @item
1052 You should clean an input stream before reading data that may have been
1053 modified using an independent channel.  Otherwise, you might read
1054 obsolete data that had been in the stream's buffer.
1055 @end itemize
1056
1057 If you do output to one channel at the end of the file, this will
1058 certainly leave the other independent channels positioned somewhere
1059 before the new end.  You cannot reliably set their file positions to the
1060 new end of file before writing, because the file can always be extended
1061 by another process between when you set the file position and when you
1062 write the data.  Instead, use an append-type descriptor or stream; they
1063 always output at the current end of the file.  In order to make the
1064 end-of-file position accurate, you must clean the output channel you
1065 were using, if it is a stream.
1066
1067 It's impossible for two channels to have separate file pointers for a
1068 file that doesn't support random access.  Thus, channels for reading or
1069 writing such files are always linked, never independent.  Append-type
1070 channels are also always linked.  For these channels, follow the rules
1071 for linked channels; see @ref{Linked Channels}.
1072
1073 @node Cleaning Streams
1074 @subsection Cleaning Streams
1075
1076 On the GNU system, you can clean up any stream with @code{fclean}:
1077
1078 @comment stdio.h
1079 @comment GNU
1080 @deftypefun int fclean (FILE *@var{stream})
1081 Clean up the stream @var{stream} so that its buffer is empty.  If
1082 @var{stream} is doing output, force it out.  If @var{stream} is doing
1083 input, give the data in the buffer back to the system, arranging to
1084 reread it.
1085 @end deftypefun
1086
1087 On other systems, you can use @code{fflush} to clean a stream in most
1088 cases.
1089
1090 You can skip the @code{fclean} or @code{fflush} if you know the stream
1091 is already clean.  A stream is clean whenever its buffer is empty.  For
1092 example, an unbuffered stream is always clean.  An input stream that is
1093 at end-of-file is clean.  A line-buffered stream is clean when the last
1094 character output was a newline.
1095
1096 There is one case in which cleaning a stream is impossible on most
1097 systems.  This is when the stream is doing input from a file that is not
1098 random-access.  Such streams typically read ahead, and when the file is
1099 not random access, there is no way to give back the excess data already
1100 read.  When an input stream reads from a random-access file,
1101 @code{fflush} does clean the stream, but leaves the file pointer at an
1102 unpredictable place; you must set the file pointer before doing any
1103 further I/O.  On the GNU system, using @code{fclean} avoids both of
1104 these problems.
1105
1106 Closing an output-only stream also does @code{fflush}, so this is a
1107 valid way of cleaning an output stream.  On the GNU system, closing an
1108 input stream does @code{fclean}.
1109
1110 You need not clean a stream before using its descriptor for control
1111 operations such as setting terminal modes; these operations don't affect
1112 the file position and are not affected by it.  You can use any
1113 descriptor for these operations, and all channels are affected
1114 simultaneously.  However, text already ``output'' to a stream but still
1115 buffered by the stream will be subject to the new terminal modes when
1116 subsequently flushed.  To make sure ``past'' output is covered by the
1117 terminal settings that were in effect at the time, flush the output
1118 streams for that terminal before setting the modes.  @xref{Terminal
1119 Modes}.
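
Returning to data transfer on linked channels, the cleaning rule can be
sketched as follows.  The sketch uses @code{fflush}, which is the
portable way to clean an output stream; the function name
@code{mixed_output} and the message texts are invented for this
example.

@smallexample
#include <stdio.h>
#include <unistd.h>

/* Write one line through STREAM and one line through its
   descriptor, keeping them in order.  Return 0 on success, -1 on
   error.  (Hypothetical helper, for illustration only.)  */
int
mixed_output (FILE *stream)
@{
  static const char raw[] = "written with write\n";

  fputs ("written with fputs\n", stream);

  /* Clean the stream before switching to the linked descriptor;
     otherwise the buffered text could appear after the raw text.  */
  if (fflush (stream) != 0)
    return -1;

  if (write (fileno (stream), raw, sizeof raw - 1) < 0)
    return -1;
  return 0;
@}
@end smallexample

@noindent
A call such as @code{mixed_output (stdout)} should then produce the two
lines in the order in which they were written.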
1120
1121 @node Waiting for I/O
1122 @section Waiting for Input or Output
1123 @cindex waiting for input or output
1124 @cindex multiplexing input
1125 @cindex input from multiple files
1126
1127 Sometimes a program needs to accept input on multiple input channels
1128 whenever input arrives.  For example, some workstations may have devices
1129 such as a digitizing tablet, function button box, or dial box that are
1130 connected via normal asynchronous serial interfaces; good user interface
1131 style requires responding immediately to input on any device.  Another
1132 example is a program that acts as a server to several other processes
1133 via pipes or sockets.
1134
1135 You cannot normally use @code{read} for this purpose, because this
1136 blocks the program until input is available on one particular file
1137 descriptor; input on other channels won't wake it up.  You could set
1138 nonblocking mode and poll each file descriptor in turn, but this is very
1139 inefficient.
1140
1141 A better solution is to use the @code{select} function.  This blocks the
1142 program until input or output is ready on a specified set of file
1143 descriptors, or until a timer expires, whichever comes first.  This
1144 facility is declared in the header file @file{sys/types.h}.
1145 @pindex sys/types.h
1146
1147 In the case of a server socket (@pxref{Listening}), we say that
1148 ``input'' is available when there are pending connections that could be
1149 accepted (@pxref{Accepting Connections}).  @code{accept} for server
1150 sockets blocks and interacts with @code{select} just as @code{read} does
1151 for normal input.
1152
1153 @cindex file descriptor sets, for @code{select}
1154 The file descriptor sets for the @code{select} function are specified
1155 as @code{fd_set} objects.  Here is the description of the data type
1156 and some macros for manipulating these objects.
1157
1158 @comment sys/types.h
1159 @comment BSD
1160 @deftp {Data Type} fd_set
1161 The @code{fd_set} data type represents file descriptor sets for the
1162 @code{select} function.  It is actually a bit array.
1163 @end deftp
1164
1165 @comment sys/types.h
1166 @comment BSD
1167 @deftypevr Macro int FD_SETSIZE
1168 The value of this macro is the maximum number of file descriptors that a
1169 @code{fd_set} object can hold information about.  On systems with a
1170 fixed maximum number, @code{FD_SETSIZE} is at least that number.  On
1171 some systems, including GNU, there is no absolute limit on the number of
1172 descriptors open, but this macro still has a constant value which
1173 controls the number of bits in an @code{fd_set}; if you get a file
1174 descriptor with a value as high as @code{FD_SETSIZE}, you cannot put
1175 that descriptor into an @code{fd_set}.
1176 @end deftypevr
1177
1178 @comment sys/types.h
1179 @comment BSD
1180 @deftypefn Macro void FD_ZERO (fd_set *@var{set})
1181 This macro initializes the file descriptor set @var{set} to be the
1182 empty set.
1183 @end deftypefn
1184
1185 @comment sys/types.h
1186 @comment BSD
1187 @deftypefn Macro void FD_SET (int @var{filedes}, fd_set *@var{set})
1188 This macro adds @var{filedes} to the file descriptor set @var{set}.
1189 @end deftypefn
1190
1191 @comment sys/types.h
1192 @comment BSD
1193 @deftypefn Macro void FD_CLR (int @var{filedes}, fd_set *@var{set})
1194 This macro removes @var{filedes} from the file descriptor set @var{set}.
1195 @end deftypefn
1196
1197 @comment sys/types.h
1198 @comment BSD
1199 @deftypefn Macro int FD_ISSET (int @var{filedes}, fd_set *@var{set})
1200 This macro returns a nonzero value (true) if @var{filedes} is a member
1201 of the file descriptor set @var{set}, and zero (false) otherwise.
1202 @end deftypefn
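
The macros above might be combined as in the following sketch.  The
function name @code{add_descriptor} is hypothetical, and including
@file{sys/select.h} is an assumption about the target system (current
POSIX systems declare the macros there).

@smallexample
#include <sys/types.h>
#include <sys/select.h>

/* Add FD to SET if it fits.  Descriptors numbered FD_SETSIZE or
   higher cannot be stored in an fd_set.  Return nonzero if FD is
   a member of SET afterwards.  (Hypothetical helper.)  */
int
add_descriptor (int fd, fd_set *set)
@{
  if (fd < 0 || fd >= FD_SETSIZE)
    return 0;
  FD_SET (fd, set);
  return FD_ISSET (fd, set) != 0;
@}
@end smallexample

@noindent
A caller would first initialize the set with @code{FD_ZERO (&set)} and
then add each descriptor of interest.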
1203
1204 Next, here is the description of the @code{select} function itself.
1205
1206 @comment sys/types.h
1207 @comment BSD
1208 @deftypefun int select (int @var{nfds}, fd_set *@var{read-fds}, fd_set *@var{write-fds}, fd_set *@var{except-fds}, struct timeval *@var{timeout})
1209 The @code{select} function blocks the calling process until there is
1210 activity on any of the specified sets of file descriptors, or until the
1211 timeout period has expired.
1212
1213 The file descriptors specified by the @var{read-fds} argument are
1214 checked to see if they are ready for reading; the @var{write-fds} file
1215 descriptors are checked to see if they are ready for writing; and the
1216 @var{except-fds} file descriptors are checked for exceptional
1217 conditions.  You can pass a null pointer for any of these arguments if
1218 you are not interested in checking for that kind of condition.
1219
1220 A file descriptor is considered ready for reading if a @code{read} call
1221 on it would not block, which includes being at end of file.  A server
1222 socket is considered ready for reading if there is a pending connection
1223 which can be accepted with @code{accept}; @pxref{Accepting Connections}.
1224 A client socket is ready for writing when its connection is fully established; @pxref{Connecting}.
1225
1226 ``Exceptional conditions'' does not mean errors---errors are reported
1227 immediately when an erroneous system call is executed, and do not
1228 constitute a state of the descriptor.  Rather, they include conditions
1229 such as the presence of an urgent message on a socket.  (@xref{Sockets},
1230 for information on urgent messages.)
1231
1232 The @code{select} function checks only the first @var{nfds} file
1233 descriptors.  The usual thing is to pass @code{FD_SETSIZE} as the value
1234 of this argument.
1235
1236 The @var{timeout} specifies the maximum time to wait.  If you pass a
1237 null pointer for this argument, it means to block indefinitely until one
1238 of the file descriptors is ready.  Otherwise, you should provide the
1239 time in @code{struct timeval} format; see @ref{High-Resolution
1240 Calendar}.  Specify zero as the time (a @code{struct timeval} containing
1241 all zeros) if you want to find out which descriptors are ready without
1242 waiting if none are ready.
1243
1244 The normal return value from @code{select} is the total number of ready file
1245 descriptors in all of the sets.  Each of the argument sets is overwritten
1246 with information about the descriptors that are ready for the corresponding
1247 operation.  Thus, to see if a particular descriptor @var{desc} has input,
1248 use @code{FD_ISSET (@var{desc}, @var{read-fds})} after @code{select} returns.
1249
1250 If @code{select} returns because the timeout period expires, it returns
1251 a value of zero.
1252
1253 Any signal will cause @code{select} to return immediately.  So if your
1254 program uses signals, you can't rely on @code{select} to keep waiting
1255 for the full time specified.  If you want to be sure of waiting for a
1256 particular amount of time, you must check for @code{EINTR} and repeat
1257 the @code{select} with a newly calculated timeout based on the current
1258 time.  See the example below.  See also @ref{Interrupted Primitives}.
1259
1260 If an error occurs, @code{select} returns @code{-1} and does not modify
1261 the argument file descriptor sets.  The following @code{errno} error
1262 conditions are defined for this function:
1263
1264 @table @code
1265 @item EBADF
1266 One of the file descriptor sets specified an invalid file descriptor.
1267
1268 @item EINTR
1269 The operation was interrupted by a signal.  @xref{Interrupted Primitives}.
1270
1271 @item EINVAL
1272 The @var{timeout} argument is invalid; one of the components is negative
1273 or too large.
1274 @end table
1275 @end deftypefun
1276
1277 @strong{Portability Note:}  The @code{select} function is a BSD Unix
1278 feature.
1279
1280 Here is an example showing how you can use @code{select} to establish a
1281 timeout period for reading from a file descriptor.  The @code{input_timeout}
1282 function blocks the calling process until input is available on the
1283 file descriptor, or until the timeout period expires.
1284
1285 @smallexample
1286 @include select.c.texi
1287 @end smallexample
1288
1289 There is another example showing the use of @code{select} to multiplex
1290 input from multiple sockets in @ref{Server Example}.
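
The zero-timeout form of the call described above can be sketched as
follows.  The function name @code{input_ready} is hypothetical, and
including @file{sys/select.h} is an assumption about the target system.

@smallexample
#include <stddef.h>
#include <sys/types.h>
#include <sys/time.h>
#include <sys/select.h>

/* Return 1 if FD has input ready right now, 0 if it does not,
   and -1 on error.  (Hypothetical helper, for illustration.)  */
int
input_ready (int fd)
@{
  fd_set set;
  struct timeval timeout;

  if (fd < 0 || fd >= FD_SETSIZE)
    return -1;

  FD_ZERO (&set);
  FD_SET (fd, &set);

  /* A timeout of all zeros means: check, but do not wait.  */
  timeout.tv_sec = 0;
  timeout.tv_usec = 0;

  /* Only descriptors 0 through FD need to be checked.  */
  if (select (fd + 1, &set, NULL, NULL, &timeout) < 0)
    return -1;
  return FD_ISSET (fd, &set) ? 1 : 0;
@}
@end smallexample

@noindent
Passing @code{fd + 1} instead of @code{FD_SETSIZE} as the first argument
is a choice made in this sketch; it limits the check to the descriptors
that can actually be in the set.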
1291
1292
1293 @node Synchronizing I/O
1294 @section Synchronizing I/O operations
1295
1296 @cindex synchronizing
1297 In most modern operating systems the normal I/O operations are not
1298 executed synchronously.  I.e., even if a @code{write} system call
1299 returns this does not mean the data is actually written to the media,
1300 e.g., the disk.
1301
1302 In situations where synchronization points are necessary the user can
1303 use special functions which ensure that all operations have finished before
1304 they return.
1305
1306 @comment unistd.h
1307 @comment X/Open
1308 @deftypefun int sync (void)
1309 A call to this function will not return as long as there is data
1310 that has not been written to the device.  All dirty buffers in the kernel will
1311 be written and so an overall consistent system can be achieved (if no
1312 other process in parallel writes data).
1313
1314 A prototype for @code{sync} can be found in @file{unistd.h}.
1315
1316 The return value is zero to indicate no error.
1317 @end deftypefun
1318
1319 More often, a program does not need all data in the system to be
1320 committed.  It only wants to ensure that the data written to a given
1321 file is committed, and in this situation @code{sync} is overkill.
1322
1323 @comment unistd.h
1324 @comment POSIX
1325 @deftypefun int fsync (int @var{fildes})
1326 The @code{fsync} function can be used to make sure all data associated with the
1327 open file @var{fildes} is written to the device associated with the
1328 descriptor.  The function call does not return unless all actions have
1329 finished.
1330
1331 A prototype for @code{fsync} can be found in @file{unistd.h}.
1332
1333 This function is a cancelation point in multi-threaded programs.  This
1334 is a problem if the thread allocates some resources (like memory, file
1335 descriptors, semaphores or whatever) at the time @code{fsync} is
1336 called.  If the thread gets canceled, these resources stay allocated
1337 until the program ends.  To avoid this, calls to @code{fsync} should be
1338 protected using cancelation handlers.
1339 @c ref pthread_cleanup_push / pthread_cleanup_pop
1340
1341 The return value of the function is zero if no error occurred.  Otherwise
1342 it is @math{-1} and the global variable @code{errno} is set to one of the
1343 following values:
1344 @table @code
1345 @item EBADF
1346 The descriptor @var{fildes} is not valid.
1347
1348 @item EINVAL
1349 No synchronization is possible since the system does not implement this.
1350 @end table
1351 @end deftypefun
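
A typical use of @code{fsync} might look like the following sketch,
which writes a buffer completely and then waits for it to reach the
device.  The function name @code{write_durably} is invented for this
example, and the sketch does not retry after @code{EINTR}.

@smallexample
#include <unistd.h>

/* Write SIZE bytes from BUFFER to FD and wait until they have
   been committed to the device.  Return 0 on success, -1 on
   error.  (Hypothetical helper, for illustration only.)  */
int
write_durably (int fd, const char *buffer, size_t size)
@{
  while (size > 0)
    @{
      /* write may transfer fewer bytes than requested.  */
      ssize_t n = write (fd, buffer, size);
      if (n < 0)
        return -1;
      buffer += n;
      size -= n;
    @}

  /* A successful write only hands the data to the system;
     fsync waits until it is on the device.  */
  return fsync (fd);
@}
@end smallexample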
1352
1353 Sometimes it is not even necessary to write all data associated with a
1354 file descriptor.  E.g., in database files which do not change in size it
1355 is enough to write all the file content data to the device.
1356 Meta-information like the modification time is not that important
1357 and leaving such information uncommitted does not prevent a successful
1358 recovery of the file in case of a problem.
1359
1360 @comment unistd.h
1361 @comment POSIX
1362 @deftypefun int fdatasync (int @var{fildes})
1363 When a call to the @code{fdatasync} function returns, it is ensured
1364 that all of the file data is written to the device.  For all pending I/O
1365 operations, the parts guaranteeing data integrity have finished.
1366
1367 Not all systems implement the @code{fdatasync} operation.  On systems
1368 missing this functionality @code{fdatasync} is emulated by a call to
1369 @code{fsync} since the performed actions are a superset of those
1370 required by @code{fdatasync}.
1371
1372 The prototype for @code{fdatasync} is in @file{unistd.h}.
1373
1374 The return value of the function is zero if no error occurred.  Otherwise
1375 it is @math{-1} and the global variable @code{errno} is set to one of the
1376 following values:
1377 @table @code
1378 @item EBADF
1379 The descriptor @var{fildes} is not valid.
1380
1381 @item EINVAL
1382 No synchronization is possible since the system does not implement this.
1383 @end table
1384 @end deftypefun
1385
1386
1387 @node Asynchronous I/O
1388 @section Perform I/O Operations in Parallel
1389
The POSIX.1b standard defines a new set of I/O operations which can
significantly reduce the time an application spends waiting for I/O.  The
new functions allow a program to initiate one or more I/O operations and
then immediately resume normal work while the I/O operations are
executed in parallel.
1395
These functions are part of the library with realtime functions named
@file{librt}.  They are not actually part of the @file{libc} binary.
The implementation of these functions can be done using support in the
kernel (if available) or using an implementation based on threads at
userlevel.  In the latter case it might be necessary to link applications
with the thread library @file{libpthread} in addition to @file{librt}.
1402
All AIO operations operate on files which were opened previously.  There
might be arbitrarily many operations running for one file.  The
asynchronous I/O operations are controlled using a data structure named
@code{struct aiocb} (@dfn{AIO control block}).  It is defined in
@file{aio.h} as follows.
1408
1409 @comment aio.h
1410 @comment POSIX.1b
1411 @deftp {Data Type} {struct aiocb}
1412 The POSIX.1b standard mandates that the @code{struct aiocb} structure
1413 contains at least the members described in the following table.  There
1414 might be more elements which are used by the implementation but
1415 depending on these elements is not portable and is highly deprecated.
1416
1417 @table @code
1418 @item int aio_fildes
This element specifies the file descriptor which is used for the
operation.  It must be a valid descriptor; otherwise the operation
fails.
1422
1423 The device on which the file is opened must allow the seek operation.
1424 I.e., it is not possible to use any of the AIO operations on devices
1425 like terminals where an @code{lseek} call would lead to an error.
1426
1427 @item off_t aio_offset
This element specifies at which offset in the file the operation (input
or output) is performed.  Since the operations are carried out in
arbitrary order and more than one operation for one file descriptor can
be started, one cannot expect a current read/write position of the file
descriptor.
1433
1434 @item volatile void *aio_buf
This is a pointer to the buffer with the data to be written or the place
where the read data is stored.
1437
1438 @item size_t aio_nbytes
1439 This element specifies the length of the buffer pointed to by @code{aio_buf}.
1440
1441 @item int aio_reqprio
If for the platform @code{_POSIX_PRIORITIZED_IO} and
@code{_POSIX_PRIORITY_SCHEDULING} are defined the AIO requests are
processed based on the current scheduling priority.  The
@code{aio_reqprio} element can then be used to lower the priority of the
AIO operation.
1447
1448 @item struct sigevent aio_sigevent
This element specifies how the calling process is notified once the
operation terminates.  If the @code{sigev_notify} element is
@code{SIGEV_NONE} no notification is sent.  If it is @code{SIGEV_SIGNAL}
the signal determined by @code{sigev_signo} is sent.  Otherwise
@code{sigev_notify} must be @code{SIGEV_THREAD}, in which case a thread
is created which starts executing the function pointed to by
@code{sigev_notify_function}.
1456
1457 @item int aio_lio_opcode
This element is only used by the @code{lio_listio} and
@code{lio_listio64} functions.  Since these functions allow starting an
arbitrary number of operations at once and since each operation can be
input or output (or nothing) the information must be stored in the
control block.  The possible values are:
1463
1464 @vtable @code
1465 @item LIO_READ
1466 Start a read operation.  Read from the file at position
1467 @code{aio_offset} and store the next @code{aio_nbytes} bytes in the
1468 buffer pointed to by @code{aio_buf}.
1469
1470 @item LIO_WRITE
1471 Start a write operation.  Write @code{aio_nbytes} bytes starting at
1472 @code{aio_buf} into the file starting at position @code{aio_offset}.
1473
1474 @item LIO_NOP
Do nothing for this control block.  This value is sometimes useful when
an array of @code{struct aiocb} values contains holes, i.e., some of the
values must not be handled although the whole array is presented to the
@code{lio_listio} function.
1479 @end vtable
1480 @end table
1481 @end deftp
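
The members listed above can be filled in as in the following sketch.
The helper name @code{init_read_cb} is an assumption made for this
illustration.  Clearing the whole structure first is a cautious choice,
since the implementation may have additional members.

```c
#include <aio.h>
#include <signal.h>
#include <string.h>

/* Prepare *CB for reading LEN bytes at file offset OFF of descriptor
   FD into BUF, without any completion notification.  init_read_cb is
   a hypothetical helper used only in this sketch.  */
void
init_read_cb (struct aiocb *cb, int fd, void *buf, size_t len, off_t off)
{
  /* Zero any implementation-specific members as well.  */
  memset (cb, 0, sizeof *cb);

  cb->aio_fildes = fd;
  cb->aio_offset = off;
  cb->aio_buf = buf;
  cb->aio_nbytes = len;
  cb->aio_reqprio = 0;
  cb->aio_sigevent.sigev_notify = SIGEV_NONE;

  /* Only consulted when the block is passed to lio_listio.  */
  cb->aio_lio_opcode = LIO_READ;
}
```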
1482
1483 @menu
1484 * Asynchronous Reads::           Asynchronous Read Operations.
1485 * Cancel AIO Operations::        Cancelation of AIO Operations.
1486 @end menu
1487
1488 @node Asynchronous Reads
1489 @subsection Asynchronous Read Operations
1490
1491 @comment aio.h
1492 @comment POSIX.1b
1493 @deftypefun int aio_read (struct aiocb *@var{aiocbp})
This function initiates an asynchronous read operation.  The function
call returns immediately after the operation was enqueued or when an
error was encountered before this could happen.

The first @code{aiocbp->aio_nbytes} bytes of the file for which
@code{aiocbp->aio_fildes} is a descriptor are written to the buffer
starting at @code{aiocbp->aio_buf}.  Reading starts at the absolute
position @code{aiocbp->aio_offset} in the file.
1502
1503 If prioritized I/O is supported by the platform the
1504 @code{aiocbp->aio_reqprio} value is used to adjust the priority before
1505 the request is actually enqueued.
1506
1507 The calling process is notified about the termination of the read
1508 request according to the @code{aiocbp->aio_sigevent} value.
1509
When @code{aio_read} returns, the return value is zero if no error
occurred that can be found before the request is enqueued.  If such an
early error is found the function returns @code{-1} and sets
@code{errno} to one of the following values.
1514
1515 @table @code
1516 @item EAGAIN
1517 The request was not enqueued due to (temporarily) exceeded resource
1518 limitations.
1519 @item ENOSYS
1520 The @code{aio_read} function is not implemented.
1521 @item EBADF
The @code{aiocbp->aio_fildes} descriptor is not valid.  This condition
need not be recognized before enqueueing the request and so this error
might also be signaled asynchronously.
@item EINVAL
The @code{aiocbp->aio_offset} or @code{aiocbp->aio_reqprio} value is
invalid.  This condition need not be recognized before enqueueing the
request and so this error might also be signaled asynchronously.
1529 @end table
1530
If @code{aio_read} returns zero, the current status of the
request can be queried using the @code{aio_error} and @code{aio_return}
functions.  As long as the value returned by @code{aio_error} is
@code{EINPROGRESS} the operation has not yet completed.  If
@code{aio_error} returns zero the operation terminated successfully;
otherwise the value is to be interpreted as an error code.  Once the
operation has terminated, its result can be obtained using a call
to @code{aio_return}.  The returned value is the same as an equivalent
call to @code{read} would have returned.  Possible error codes returned
by @code{aio_error} are:
1541
1542 @table @code
1543 @item EBADF
1544 The @code{aiocbp->aio_fildes} descriptor is not valid.
1545 @item ECANCELED
The operation was canceled before the operation was finished
(@pxref{Cancel AIO Operations}).
1548 @item EINVAL
1549 The @code{aiocbp->aio_offset} value is invalid.
1550 @end table
1551 @end deftypefun
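
The sequence of @code{aio_read}, @code{aio_error} and @code{aio_return}
described above might be used as in the following sketch.  The function
name @code{read_file_async} is an assumption made for this illustration.
To avoid a busy loop the sketch waits with @code{aio_suspend}, another
function declared in @file{aio.h}; a real program would normally do
other work at that point.  On systems where the AIO functions live in
@file{librt} the program must be linked with that library.

```c
#include <aio.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Read up to LEN bytes at offset OFF of the file PATH into BUF using
   an asynchronous request.  Return the number of bytes read, or -1
   with errno set.  read_file_async is a hypothetical helper.  */
ssize_t
read_file_async (const char *path, void *buf, size_t len, off_t off)
{
  struct aiocb cb;
  const struct aiocb *const list[1] = { &cb };
  ssize_t n;
  int err;
  int fd = open (path, O_RDONLY);

  if (fd < 0)
    return -1;

  memset (&cb, 0, sizeof cb);
  cb.aio_fildes = fd;
  cb.aio_buf = buf;
  cb.aio_nbytes = len;
  cb.aio_offset = off;
  cb.aio_sigevent.sigev_notify = SIGEV_NONE;

  /* Errors detected before the request is enqueued show up here.  */
  if (aio_read (&cb) < 0)
    {
      err = errno;
      close (fd);
      errno = err;
      return -1;
    }

  /* The caller could do useful work here.  This sketch just sleeps
     until the request is no longer in progress.  */
  while ((err = aio_error (&cb)) == EINPROGRESS)
    aio_suspend (list, 1, NULL);

  /* Fetch the final status once; it matches what read would return.  */
  n = aio_return (&cb);
  close (fd);
  if (err != 0)
    {
      if (err > 0)
        errno = err;
      return -1;
    }
  return n;
}
```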
1552
1553 @comment aio.h
1554 @comment POSIX.1b
1555 @deftypefun int aio_read64 (struct aiocb *@var{aiocbp})
This function is similar to the @code{aio_read} function.  The only
difference is that on @w{32 bit} machines the file descriptor should
be opened in the large file mode.  Internally @code{aio_read64} uses
functionality equivalent to @code{lseek64} to position the file
descriptor correctly for the reading, as opposed to the @code{lseek}
functionality used in @code{aio_read}.
1562 @end deftypefun
1563
1564
1565 @node Cancel AIO Operations
1566 @subsection Cancelation of AIO Operations
1567
1568
1569 @node Control Operations
1570 @section Control Operations on Files
1571
1572 @cindex control operations on files
1573 @cindex @code{fcntl} function
1574 This section describes how you can perform various other operations on
1575 file descriptors, such as inquiring about or setting flags describing
1576 the status of the file descriptor, manipulating record locks, and the
1577 like.  All of these operations are performed by the function @code{fcntl}.
1578
1579 The second argument to the @code{fcntl} function is a command that
1580 specifies which operation to perform.  The function and macros that name
1581 various flags that are used with it are declared in the header file
1582 @file{fcntl.h}.  Many of these flags are also used by the @code{open}
1583 function; see @ref{Opening and Closing Files}.
1584 @pindex fcntl.h
1585
1586 @comment fcntl.h
1587 @comment POSIX.1
1588 @deftypefun int fcntl (int @var{filedes}, int @var{command}, @dots{})
1589 The @code{fcntl} function performs the operation specified by
1590 @var{command} on the file descriptor @var{filedes}.  Some commands
1591 require additional arguments to be supplied.  These additional arguments
1592 and the return value and error conditions are given in the detailed
1593 descriptions of the individual commands.
1594
1595 Briefly, here is a list of what the various commands are.
1596
1597 @table @code
1598 @item F_DUPFD
1599 Duplicate the file descriptor (return another file descriptor pointing
1600 to the same open file).  @xref{Duplicating Descriptors}.
1601
1602 @item F_GETFD
1603 Get flags associated with the file descriptor.  @xref{Descriptor Flags}.
1604
1605 @item F_SETFD
1606 Set flags associated with the file descriptor.  @xref{Descriptor Flags}.
1607
1608 @item F_GETFL
1609 Get flags associated with the open file.  @xref{File Status Flags}.
1610
1611 @item F_SETFL
1612 Set flags associated with the open file.  @xref{File Status Flags}.
1613
1614 @item F_GETLK
Get information about a file lock.  @xref{File Locks}.
1616
1617 @item F_SETLK
1618 Set or clear a file lock.  @xref{File Locks}.
1619
1620 @item F_SETLKW
1621 Like @code{F_SETLK}, but wait for completion.  @xref{File Locks}.
1622
1623 @item F_GETOWN
1624 Get process or process group ID to receive @code{SIGIO} signals.
1625 @xref{Interrupt Input}.
1626
1627 @item F_SETOWN
1628 Set process or process group ID to receive @code{SIGIO} signals.
1629 @xref{Interrupt Input}.
1630 @end table
1631
1632 This function is a cancelation point in multi-threaded programs.  This
1633 is a problem if the thread allocates some resources (like memory, file
1634 descriptors, semaphores or whatever) at the time @code{fcntl} is
1635 called.  If the thread gets canceled these resources stay allocated
1636 until the program ends.  To avoid this calls to @code{fcntl} should be
1637 protected using cancelation handlers.
1638 @c ref pthread_cleanup_push / pthread_cleanup_pop
1639 @end deftypefun
1640
1641
1642 @node Duplicating Descriptors
1643 @section Duplicating Descriptors
1644
1645 @cindex duplicating file descriptors
1646 @cindex redirecting input and output
1647
1648 You can @dfn{duplicate} a file descriptor, or allocate another file
1649 descriptor that refers to the same open file as the original.  Duplicate
1650 descriptors share one file position and one set of file status flags
1651 (@pxref{File Status Flags}), but each has its own set of file descriptor
1652 flags (@pxref{Descriptor Flags}).
1653
1654 The major use of duplicating a file descriptor is to implement
1655 @dfn{redirection} of input or output:  that is, to change the
1656 file or pipe that a particular file descriptor corresponds to.
1657
1658 You can perform this operation using the @code{fcntl} function with the
1659 @code{F_DUPFD} command, but there are also convenient functions
1660 @code{dup} and @code{dup2} for duplicating descriptors.
1661
1662 @pindex unistd.h
1663 @pindex fcntl.h
1664 The @code{fcntl} function and flags are declared in @file{fcntl.h},
1665 while prototypes for @code{dup} and @code{dup2} are in the header file
1666 @file{unistd.h}.
1667
1668 @comment unistd.h
1669 @comment POSIX.1
1670 @deftypefun int dup (int @var{old})
1671 This function copies descriptor @var{old} to the first available
1672 descriptor number (the first number not currently open).  It is
1673 equivalent to @code{fcntl (@var{old}, F_DUPFD, 0)}.
1674 @end deftypefun
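
The following sketch checks the two properties of duplicate descriptors
stated at the start of this section: the file position is shared, while
the file descriptor flags are not.  The function name
@code{check_dup_sharing} is an assumption made for this illustration,
and the descriptor passed in is assumed to refer to a seekable file.

```c
#include <fcntl.h>
#include <unistd.h>

/* Return 1 if a duplicate of FD shares its file position but keeps
   separate descriptor flags, 0 if not, and -1 on error.
   check_dup_sharing is a hypothetical helper for this sketch.  */
int
check_dup_sharing (int fd)
{
  off_t before, moved, after;
  int flags_before, flags_after, set;
  int copy = dup (fd);

  if (copy < 0)
    return -1;

  /* Move the file position through the duplicate only.  */
  before = lseek (fd, 0, SEEK_CUR);
  moved = lseek (copy, 1, SEEK_CUR);
  after = lseek (fd, 0, SEEK_CUR);

  /* Set close-on-exec on the duplicate only.  */
  flags_before = fcntl (fd, F_GETFD);
  set = fcntl (copy, F_SETFD, FD_CLOEXEC);
  flags_after = fcntl (fd, F_GETFD);

  close (copy);

  if (before < 0 || moved < 0 || after < 0
      || flags_before < 0 || set < 0 || flags_after < 0)
    return -1;

  return after == before + 1 && flags_after == flags_before;
}
```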
1675
1676 @comment unistd.h
1677 @comment POSIX.1
1678 @deftypefun int dup2 (int @var{old}, int @var{new})
1679 This function copies the descriptor @var{old} to descriptor number
1680 @var{new}.
1681
1682 If @var{old} is an invalid descriptor, then @code{dup2} does nothing; it
1683 does not close @var{new}.  Otherwise, the new duplicate of @var{old}
1684 replaces any previous meaning of descriptor @var{new}, as if @var{new}
1685 were closed first.
1686
1687 If @var{old} and @var{new} are different numbers, and @var{old} is a
1688 valid descriptor number, then @code{dup2} is equivalent to:
1689
1690 @smallexample
1691 close (@var{new});
1692 fcntl (@var{old}, F_DUPFD, @var{new})
1693 @end smallexample
1694
1695 However, @code{dup2} does this atomically; there is no instant in the
1696 middle of calling @code{dup2} at which @var{new} is closed and not yet a
1697 duplicate of @var{old}.
1698 @end deftypefun
1699
1700 @comment fcntl.h
1701 @comment POSIX.1
1702 @deftypevr Macro int F_DUPFD
1703 This macro is used as the @var{command} argument to @code{fcntl}, to
1704 copy the file descriptor given as the first argument.
1705
1706 The form of the call in this case is:
1707
1708 @smallexample
1709 fcntl (@var{old}, F_DUPFD, @var{next-filedes})
1710 @end smallexample
1711
1712 The @var{next-filedes} argument is of type @code{int} and specifies that
1713 the file descriptor returned should be the next available one greater
1714 than or equal to this value.
1715
1716 The return value from @code{fcntl} with this command is normally the value
1717 of the new file descriptor.  A return value of @code{-1} indicates an
1718 error.  The following @code{errno} error conditions are defined for
1719 this command:
1720
1721 @table @code
1722 @item EBADF
1723 The @var{old} argument is invalid.
1724
1725 @item EINVAL
1726 The @var{next-filedes} argument is invalid.
1727
1728 @item EMFILE
1729 There are no more file descriptors available---your program is already
1730 using the maximum.  In BSD and GNU, the maximum is controlled by a
1731 resource limit that can be changed; @pxref{Limits on Resources}, for
1732 more information about the @code{RLIMIT_NOFILE} limit.
1733 @end table
1734
1735 @code{ENFILE} is not a possible error code for @code{dup2} because
1736 @code{dup2} does not create a new opening of a file; duplicate
1737 descriptors do not count toward the limit which @code{ENFILE}
1738 indicates.  @code{EMFILE} is possible because it refers to the limit on
1739 distinct descriptor numbers in use in one process.
1740 @end deftypevr
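
One use of the @var{next-filedes} argument is to move a descriptor out
of the range used by the standard streams.  The following is a minimal
sketch; the function name @code{move_above_stdio} is an assumption made
for this illustration.

```c
#include <fcntl.h>
#include <unistd.h>

/* If FD is one of the descriptors 0, 1 or 2, replace it by a duplicate
   numbered 3 or higher and close the original.  Return the descriptor
   to use from now on, or -1 on error.  move_above_stdio is a
   hypothetical helper for this sketch.  */
int
move_above_stdio (int fd)
{
  int newfd;

  if (fd > STDERR_FILENO)
    return fd;

  /* Ask for the lowest free descriptor that is at least 3.  */
  newfd = fcntl (fd, F_DUPFD, STDERR_FILENO + 1);
  if (newfd < 0)
    return -1;

  close (fd);
  return newfd;
}
```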
1741
1742 Here is an example showing how to use @code{dup2} to do redirection.
1743 Typically, redirection of the standard streams (like @code{stdin}) is
1744 done by a shell or shell-like program before calling one of the
1745 @code{exec} functions (@pxref{Executing a File}) to execute a new
1746 program in a child process.  When the new program is executed, it
1747 creates and initializes the standard streams to point to the
1748 corresponding file descriptors, before its @code{main} function is
1749 invoked.
1750
1751 So, to redirect standard input to a file, the shell could do something
1752 like:
1753
1754 @smallexample
1755 pid = fork ();
1756 if (pid == 0)
1757   @{
1758     char *filename;
1759     char *program;
1760     int file;
1761     @dots{}
1762     file = TEMP_FAILURE_RETRY (open (filename, O_RDONLY));
1763     dup2 (file, STDIN_FILENO);
1764     TEMP_FAILURE_RETRY (close (file));
1765     execv (program, NULL);
1766   @}
1767 @end smallexample
1768
1769 There is also a more detailed example showing how to implement redirection
1770 in the context of a pipeline of processes in @ref{Launching Jobs}.
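
For comparison, here is a self-contained sketch of the same redirection
with error checks and a parent that waits for the child.  The function
name @code{run_with_stdin} and the exit status 127 for a child that
could not start the program are assumptions made for this illustration.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run PROGRAM with argument vector ARGV, taking its standard input
   from FILENAME.  Return the wait status of the child, or -1 if the
   parent could not create or wait for it.  run_with_stdin is a
   hypothetical helper for this sketch.  */
int
run_with_stdin (const char *filename, const char *program,
                char *const argv[])
{
  int status;
  pid_t pid = fork ();

  if (pid < 0)
    return -1;

  if (pid == 0)
    {
      /* Child: make FILENAME the standard input, then run PROGRAM.  */
      int file = open (filename, O_RDONLY);
      if (file < 0)
        _exit (127);
      if (file != STDIN_FILENO)
        {
          if (dup2 (file, STDIN_FILENO) < 0)
            _exit (127);
          close (file);
        }
      execv (program, argv);
      /* execv returns only if it failed.  */
      _exit (127);
    }

  /* Parent: wait for the child, retrying if interrupted.  */
  while (waitpid (pid, &status, 0) < 0)
    if (errno != EINTR)
      return -1;
  return status;
}
```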
1771
1772
1773 @node Descriptor Flags
1774 @section File Descriptor Flags
1775 @cindex file descriptor flags
1776
1777 @dfn{File descriptor flags} are miscellaneous attributes of a file
1778 descriptor.  These flags are associated with particular file
1779 descriptors, so that if you have created duplicate file descriptors
1780 from a single opening of a file, each descriptor has its own set of flags.
1781
1782 Currently there is just one file descriptor flag: @code{FD_CLOEXEC},
1783 which causes the descriptor to be closed if you use any of the
1784 @code{exec@dots{}} functions (@pxref{Executing a File}).
1785
1786 The symbols in this section are defined in the header file
1787 @file{fcntl.h}.
1788 @pindex fcntl.h
1789
1790 @comment fcntl.h
1791 @comment POSIX.1
1792 @deftypevr Macro int F_GETFD
1793 This macro is used as the @var{command} argument to @code{fcntl}, to
1794 specify that it should return the file descriptor flags associated
1795 with the @var{filedes} argument.
1796
1797 The normal return value from @code{fcntl} with this command is a
1798 nonnegative number which can be interpreted as the bitwise OR of the
1799 individual flags (except that currently there is only one flag to use).
1800
1801 In case of an error, @code{fcntl} returns @code{-1}.  The following
1802 @code{errno} error conditions are defined for this command:
1803
1804 @table @code
1805 @item EBADF
1806 The @var{filedes} argument is invalid.
1807 @end table
1808 @end deftypevr
1809
1810
1811 @comment fcntl.h
1812 @comment POSIX.1
1813 @deftypevr Macro int F_SETFD
1814 This macro is used as the @var{command} argument to @code{fcntl}, to
1815 specify that it should set the file descriptor flags associated with the
1816 @var{filedes} argument.  This requires a third @code{int} argument to
1817 specify the new flags, so the form of the call is:
1818
1819 @smallexample
1820 fcntl (@var{filedes}, F_SETFD, @var{new-flags})
1821 @end smallexample
1822
1823 The normal return value from @code{fcntl} with this command is an
1824 unspecified value other than @code{-1}, which indicates an error.
1825 The flags and error conditions are the same as for the @code{F_GETFD}
1826 command.
1827 @end deftypevr
1828
1829 The following macro is defined for use as a file descriptor flag with
1830 the @code{fcntl} function.  The value is an integer constant usable
1831 as a bit mask value.
1832
1833 @comment fcntl.h
1834 @comment POSIX.1
1835 @deftypevr Macro int FD_CLOEXEC
1836 @cindex close-on-exec (file descriptor flag)
1837 This flag specifies that the file descriptor should be closed when
1838 an @code{exec} function is invoked; see @ref{Executing a File}.  When
1839 a file descriptor is allocated (as with @code{open} or @code{dup}),
1840 this bit is initially cleared on the new file descriptor, meaning that
1841 descriptor will survive into the new program after @code{exec}.
1842 @end deftypevr
1843
1844 If you want to modify the file descriptor flags, you should get the
1845 current flags with @code{F_GETFD} and modify the value.  Don't assume
1846 that the flags listed here are the only ones that are implemented; your
1847 program may be run years from now and more flags may exist then.  For
1848 example, here is a function to set or clear the flag @code{FD_CLOEXEC}
1849 without altering any other flags:
1850
1851 @smallexample
/* @r{Set the @code{FD_CLOEXEC} flag of @var{desc} if @var{value} is nonzero,}
   @r{or clear the flag if @var{value} is 0.}
   @r{Return 0 on success, or -1 on error with @code{errno} set.} */

int
set_cloexec_flag (int desc, int value)
@{
  int oldflags = fcntl (desc, F_GETFD, 0);
  /* @r{If reading the flags failed, return error indication now.} */
  if (oldflags < 0)
    return oldflags;
  /* @r{Set just the flag we want to set.} */
  if (value != 0)
    oldflags |= FD_CLOEXEC;
  else
    oldflags &= ~FD_CLOEXEC;
  /* @r{Store modified flag word in the descriptor.} */
  return fcntl (desc, F_SETFD, oldflags);
@}
1871 @end smallexample
1872
1873 @node File Status Flags
1874 @section File Status Flags
1875 @cindex file status flags
1876
1877 @dfn{File status flags} are used to specify attributes of the opening of a
1878 file.  Unlike the file descriptor flags discussed in @ref{Descriptor
1879 Flags}, the file status flags are shared by duplicated file descriptors
1880 resulting from a single opening of the file.  The file status flags are
1881 specified with the @var{flags} argument to @code{open};
1882 @pxref{Opening and Closing Files}.
1883
1884 File status flags fall into three categories, which are described in the
1885 following sections.
1886
1887 @itemize @bullet
1888 @item
1889 @ref{Access Modes}, specify what type of access is allowed to the
1890 file: reading, writing, or both.  They are set by @code{open} and are
1891 returned by @code{fcntl}, but cannot be changed.
1892
1893 @item
1894 @ref{Open-time Flags}, control details of what @code{open} will do.
1895 These flags are not preserved after the @code{open} call.
1896
1897 @item
1898 @ref{Operating Modes}, affect how operations such as @code{read} and
1899 @code{write} are done.  They are set by @code{open}, and can be fetched or
1900 changed with @code{fcntl}.
1901 @end itemize
1902
1903 The symbols in this section are defined in the header file
1904 @file{fcntl.h}.
1905 @pindex fcntl.h
1906
1907 @menu
1908 * Access Modes::                Whether the descriptor can read or write.
1909 * Open-time Flags::             Details of @code{open}.
1910 * Operating Modes::             Special modes to control I/O operations.
1911 * Getting File Status Flags::   Fetching and changing these flags.
1912 @end menu
1913
1914 @node Access Modes
1915 @subsection File Access Modes
1916
1917 The file access modes allow a file descriptor to be used for reading,
1918 writing, or both.  (In the GNU system, they can also allow none of these,
1919 and allow execution of the file as a program.)  The access modes are chosen
1920 when the file is opened, and never change.
1921
1922 @comment fcntl.h
1923 @comment POSIX.1
1924 @deftypevr Macro int O_RDONLY
1925 Open the file for read access.
1926 @end deftypevr
1927
1928 @comment fcntl.h
1929 @comment POSIX.1
1930 @deftypevr Macro int O_WRONLY
1931 Open the file for write access.
1932 @end deftypevr
1933
1934 @comment fcntl.h
1935 @comment POSIX.1
1936 @deftypevr Macro int O_RDWR
1937 Open the file for both reading and writing.
1938 @end deftypevr
1939
1940 In the GNU system (and not in other systems), @code{O_RDONLY} and
1941 @code{O_WRONLY} are independent bits that can be bitwise-ORed together,
1942 and it is valid for either bit to be set or clear.  This means that
1943 @code{O_RDWR} is the same as @code{O_RDONLY|O_WRONLY}.  A file access
1944 mode of zero is permissible; it allows no operations that do input or
1945 output to the file, but does allow other operations such as
1946 @code{fchmod}.  On the GNU system, since ``read-only'' or ``write-only''
1947 is a misnomer, @file{fcntl.h} defines additional names for the file
1948 access modes.  These names are preferred when writing GNU-specific code.
1949 But most programs will want to be portable to other POSIX.1 systems and
1950 should use the POSIX.1 names above instead.
1951
1952 @comment fcntl.h
1953 @comment GNU
1954 @deftypevr Macro int O_READ
Open the file for reading.  Same as @code{O_RDONLY}; only defined on GNU.
1956 @end deftypevr
1957
1958 @comment fcntl.h
1959 @comment GNU
1960 @deftypevr Macro int O_WRITE
Open the file for writing.  Same as @code{O_WRONLY}; only defined on GNU.
1962 @end deftypevr
1963
1964 @comment fcntl.h
1965 @comment GNU
1966 @deftypevr Macro int O_EXEC
1967 Open the file for executing.  Only defined on GNU.
1968 @end deftypevr
1969
1970 To determine the file access mode with @code{fcntl}, you must extract
1971 the access mode bits from the retrieved file status flags.  In the GNU
1972 system, you can just test the @code{O_READ} and @code{O_WRITE} bits in
1973 the flags word.  But in other POSIX.1 systems, reading and writing
1974 access modes are not stored as distinct bit flags.  The portable way to
1975 extract the file access mode bits is with @code{O_ACCMODE}.
1976
1977 @comment fcntl.h
1978 @comment POSIX.1
1979 @deftypevr Macro int O_ACCMODE
1980 This macro stands for a mask that can be bitwise-ANDed with the file
1981 status flag value to produce a value representing the file access mode.
1982 The mode will be @code{O_RDONLY}, @code{O_WRONLY}, or @code{O_RDWR}.
1983 (In the GNU system it could also be zero, and it never includes the
1984 @code{O_EXEC} bit.)
1985 @end deftypevr
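
As a minimal sketch of the portable method, the following function masks
the file status flags with @code{O_ACCMODE} and compares the result
against the access mode values.  The function name
@code{fd_is_writable} is an assumption made for this illustration.

```c
#include <fcntl.h>

/* Return 1 if FD was opened with write access, 0 if it was not, and
   -1 if the flags could not be read.  fd_is_writable is a
   hypothetical helper for this sketch.  */
int
fd_is_writable (int fd)
{
  int mode;
  int flags = fcntl (fd, F_GETFL);

  if (flags < 0)
    return -1;

  /* Compare the masked value; do not test individual bits, since
     the access modes need not be distinct bit flags.  */
  mode = flags & O_ACCMODE;
  return mode == O_WRONLY || mode == O_RDWR;
}
```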
1986
1987 @node Open-time Flags
1988 @subsection Open-time Flags
1989
1990 The open-time flags specify options affecting how @code{open} will behave.
1991 These options are not preserved once the file is open.  The exception to
1992 this is @code{O_NONBLOCK}, which is also an I/O operating mode and so it
1993 @emph{is} saved.  @xref{Opening and Closing Files}, for how to call
1994 @code{open}.
1995
1996 There are two sorts of options specified by open-time flags.
1997
1998 @itemize @bullet
1999 @item
2000 @dfn{File name translation flags} affect how @code{open} looks up the
2001 file name to locate the file, and whether the file can be created.
2002 @cindex file name translation flags
2003 @cindex flags, file name translation
2004
2005 @item
2006 @dfn{Open-time action flags} specify extra operations that @code{open} will
2007 perform on the file once it is open.
2008 @cindex open-time action flags
2009 @cindex flags, open-time action
2010 @end itemize
2011
2012 Here are the file name translation flags.
2013
2014 @comment fcntl.h
2015 @comment POSIX.1
2016 @deftypevr Macro int O_CREAT
2017 If set, the file will be created if it doesn't already exist.
2018 @c !!! mode arg, umask
2019 @cindex create on open (file status flag)
2020 @end deftypevr
2021
2022 @comment fcntl.h
2023 @comment POSIX.1
2024 @deftypevr Macro int O_EXCL
2025 If both @code{O_CREAT} and @code{O_EXCL} are set, then @code{open} fails
2026 if the specified file already exists.  This is guaranteed to never
2027 clobber an existing file.
2028 @end deftypevr
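
Combining @code{O_CREAT} and @code{O_EXCL} lets a program find out
whether it was the one that created a file.  The following is a minimal
sketch; the function name @code{create_if_absent} and the file mode are
assumptions made for this illustration.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Create the file PATH unless it already exists.  Return 0 if the
   file was created by this call, 1 if it existed already, and -1 on
   any other error.  create_if_absent is a hypothetical helper.  */
int
create_if_absent (const char *path)
{
  int fd = open (path, O_WRONLY | O_CREAT | O_EXCL, 0600);

  if (fd < 0)
    return errno == EEXIST ? 1 : -1;

  return close (fd) < 0 ? -1 : 0;
}
```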
2029
2030 @comment fcntl.h
2031 @comment POSIX.1
2032 @deftypevr Macro int O_NONBLOCK
2033 @cindex non-blocking open
2034 This prevents @code{open} from blocking for a ``long time'' to open the
2035 file.  This is only meaningful for some kinds of files, usually devices
2036 such as serial ports; when it is not meaningful, it is harmless and
2037 ignored.  Often opening a port to a modem blocks until the modem reports
2038 carrier detection; if @code{O_NONBLOCK} is specified, @code{open} will
2039 return immediately without a carrier.
2040
2041 Note that the @code{O_NONBLOCK} flag is overloaded as both an I/O operating
2042 mode and a file name translation flag.  This means that specifying
2043 @code{O_NONBLOCK} in @code{open} also sets nonblocking I/O mode;
2044 @pxref{Operating Modes}.  To open the file without blocking but do normal
2045 I/O that blocks, you must call @code{open} with @code{O_NONBLOCK} set and
2046 then call @code{fcntl} to turn the bit off.
2047 @end deftypevr
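
The two steps just described, opening with @code{O_NONBLOCK} and then
turning the bit off again, can be sketched as follows.  The function
name @code{open_then_block} is an assumption made for this illustration.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Open PATH with the given FLAGS without blocking in open itself,
   then switch the descriptor back to normal blocking I/O.  Return
   the descriptor, or -1 with errno set.  open_then_block is a
   hypothetical helper for this sketch.  */
int
open_then_block (const char *path, int flags)
{
  int status;
  int fd = open (path, flags | O_NONBLOCK);

  if (fd < 0)
    return -1;

  /* Clear only O_NONBLOCK and keep all other status flags.  */
  status = fcntl (fd, F_GETFL);
  if (status < 0 || fcntl (fd, F_SETFL, status & ~O_NONBLOCK) < 0)
    {
      int saved = errno;
      close (fd);
      errno = saved;
      return -1;
    }
  return fd;
}
```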
2048
2049 @comment fcntl.h
2050 @comment POSIX.1
2051 @deftypevr Macro int O_NOCTTY
2052 If the named file is a terminal device, don't make it the controlling
2053 terminal for the process.  @xref{Job Control}, for information about
2054 what it means to be the controlling terminal.
2055
2056 In the GNU system and 4.4 BSD, opening a file never makes it the
2057 controlling terminal and @code{O_NOCTTY} is zero.  However, other
2058 systems may use a nonzero value for @code{O_NOCTTY} and set the
2059 controlling terminal when you open a file that is a terminal device; so
2060 to be portable, use @code{O_NOCTTY} when it is important to avoid this.
2061 @cindex controlling terminal, setting
2062 @end deftypevr
2063
2064 The following three file name translation flags exist only in the GNU system.
2065
2066 @comment fcntl.h
2067 @comment GNU
2068 @deftypevr Macro int O_IGNORE_CTTY
2069 Do not recognize the named file as the controlling terminal, even if it
2070 refers to the process's existing controlling terminal device.  Operations
2071 on the new file descriptor will never induce job control signals.
2072 @xref{Job Control}.
2073 @end deftypevr
2074
2075 @comment fcntl.h
2076 @comment GNU
2077 @deftypevr Macro int O_NOLINK
2078 If the named file is a symbolic link, open the link itself instead of
2079 the file it refers to.  (@code{fstat} on the new file descriptor will
2080 return the information returned by @code{lstat} on the link's name.)
2081 @cindex symbolic link, opening
2082 @end deftypevr
2083
2084 @comment fcntl.h
2085 @comment GNU
2086 @deftypevr Macro int O_NOTRANS
2087 If the named file is specially translated, do not invoke the translator.
2088 Open the bare file the translator itself sees.
2089 @end deftypevr
2090
2091
2092 The open-time action flags tell @code{open} to do additional operations
2093 which are not really related to opening the file.  The reason to do them
2094 as part of @code{open} instead of in separate calls is that @code{open}
2095 can do them @i{atomically}.
2096
2097 @comment fcntl.h
2098 @comment POSIX.1
2099 @deftypevr Macro int O_TRUNC
2100 Truncate the file to zero length.  This option is only useful for
2101 regular files, not special files such as directories or FIFOs.  POSIX.1
2102 requires that you open the file for writing to use @code{O_TRUNC}.  In
2103 BSD and GNU you must have permission to write the file to truncate it,
2104 but you need not open for write access.
2105
2106 This is the only open-time action flag specified by POSIX.1.  There is
2107 no good reason for truncation to be done by @code{open}, instead of by
2108 calling @code{ftruncate} afterwards.  The @code{O_TRUNC} flag existed in
2109 Unix before @code{ftruncate} was invented, and is retained for backward
2110 compatibility.
2111 @end deftypevr
2112
2113 @comment fcntl.h
2114 @comment BSD
2115 @deftypevr Macro int O_SHLOCK
2116 Acquire a shared lock on the file, as with @code{flock}.
2117 @xref{File Locks}.
2118
2119 If @code{O_CREAT} is specified, the locking is done atomically when
2120 creating the file.  You are guaranteed that no other process will get
2121 the lock on the new file first.
2122 @end deftypevr
2123
2124 @comment fcntl.h
2125 @comment BSD
2126 @deftypevr Macro int O_EXLOCK
2127 Acquire an exclusive lock on the file, as with @code{flock}.
2128 @xref{File Locks}.  This is atomic like @code{O_SHLOCK}.
2129 @end deftypevr
2130
2131 @node Operating Modes
2132 @subsection I/O Operating Modes
2133
2134 The operating modes affect how input and output operations using a file
2135 descriptor work.  These flags are set by @code{open} and can be fetched
2136 and changed with @code{fcntl}.
2137
2138 @comment fcntl.h
2139 @comment POSIX.1
2140 @deftypevr Macro int O_APPEND
2141 The bit that enables append mode for the file.  If set, then all
2142 @code{write} operations write the data at the end of the file, extending
2143 it, regardless of the current file position.  This is the only reliable
2144 way to append to a file.  In append mode, you are guaranteed that the
2145 data you write will always go to the current end of the file, regardless
2146 of other processes writing to the file.  Conversely, if you simply set
2147 the file position to the end of file and write, then another process can
2148 extend the file after you set the file position but before you write,
2149 resulting in your data appearing someplace before the real end of file.
2150 @end deftypevr
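
As a minimal sketch of append mode, the following function opens a file
with @code{O_APPEND} and writes one record at its end.  The name
@code{append_record} and the permission bits @code{0644} are assumptions
chosen for this illustration; they are not part of the library.

@smallexample
@group
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Append TEXT to the file named PATH, creating the file if needed.
   Return 0 on success, or -1 on failure.  */
int
append_record (const char *path, const char *text)
@{
  size_t len = strlen (text);
  ssize_t written;
  int fd = open (path, O_WRONLY | O_CREAT | O_APPEND, 0644);
  if (fd == -1)
    return -1;
  /* In append mode the data goes to the current end of the file.  */
  written = write (fd, text, len);
  if (close (fd) == -1)
    return -1;
  /* Treat a short write as failure; a real program might retry.  */
  return written == (ssize_t) len ? 0 : -1;
@}
@end group
@end smallexample

Because the descriptor is in append mode, no call to @code{lseek} is
needed before the @code{write}.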
2151
2152 @comment fcntl.h
2153 @comment POSIX.1
2154 @deftypevr Macro int O_NONBLOCK
2155 The bit that enables nonblocking mode for the file.  If this bit is set,
2156 @code{read} requests on the file can return immediately with a failure
2157 status if there is no input immediately available, instead of blocking.
2158 Likewise, @code{write} requests can also return immediately with a
2159 failure status if the output can't be written immediately.
2160
2161 Note that the @code{O_NONBLOCK} flag is overloaded as both an I/O
2162 operating mode and a file name translation flag; @pxref{Open-time Flags}.
2163 @end deftypevr
2164
2165 @comment fcntl.h
2166 @comment BSD
2167 @deftypevr Macro int O_NDELAY
2168 This is an obsolete name for @code{O_NONBLOCK}, provided for
2169 compatibility with BSD.  It is not defined by the POSIX.1 standard.
2170 @end deftypevr
2171
2172 The remaining operating modes are BSD and GNU extensions.  They exist only
2173 on some systems.  On other systems, these macros are not defined.
2174
2175 @comment fcntl.h
2176 @comment BSD
2177 @deftypevr Macro int O_ASYNC
2178 The bit that enables asynchronous input mode.  If set, then @code{SIGIO}
2179 signals will be generated when input is available.  @xref{Interrupt Input}.
2180
2181 Asynchronous input mode is a BSD feature.
2182 @end deftypevr
2183
2184 @comment fcntl.h
2185 @comment BSD
2186 @deftypevr Macro int O_FSYNC
2187 The bit that enables synchronous writing for the file.  If set, each
2188 @code{write} call will make sure the data is reliably stored on disk before
2189 returning. @c !!! xref fsync
2190
2191 Synchronous writing is a BSD feature.
2192 @end deftypevr
2193
2194 @comment fcntl.h
2195 @comment BSD
2196 @deftypevr Macro int O_SYNC
2197 This is another name for @code{O_FSYNC}.  They have the same value.
2198 @end deftypevr
2199
2200 @comment fcntl.h
2201 @comment GNU
2202 @deftypevr Macro int O_NOATIME
2203 If this bit is set, @code{read} will not update the access time of the
2204 file.  @xref{File Times}.  This is used by programs that do backups, so
2205 that backing a file up does not count as reading it.
2206 Only the owner of the file or the superuser may use this bit.
2207
2208 This is a GNU extension.
2209 @end deftypevr
2210
2211 @node Getting File Status Flags
2212 @subsection Getting and Setting File Status Flags
2213
2214 The @code{fcntl} function can fetch or change file status flags.
2215
2216 @comment fcntl.h
2217 @comment POSIX.1
2218 @deftypevr Macro int F_GETFL
2219 This macro is used as the @var{command} argument to @code{fcntl}, to
2220 read the file status flags for the open file with descriptor
2221 @var{filedes}.
2222
2223 The normal return value from @code{fcntl} with this command is a
2224 nonnegative number which can be interpreted as the bitwise OR of the
2225 individual flags.  Since the file access modes are not single-bit values,
2226 you can mask off other bits in the returned flags with @code{O_ACCMODE}
2227 to compare them.
2228
2229 In case of an error, @code{fcntl} returns @code{-1}.  The following
2230 @code{errno} error conditions are defined for this command:
2231
2232 @table @code
2233 @item EBADF
2234 The @var{filedes} argument is invalid.
2235 @end table
2236 @end deftypevr
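
For instance, a check of the access mode might be sketched as follows.
The function name @code{is_writable} is hypothetical and is used here
only for illustration.

@smallexample
@group
#include <fcntl.h>

/* Return 1 if DESC is open for writing, 0 if it is not,
   or -1 on error with errno set.  */
int
is_writable (int desc)
@{
  int flags = fcntl (desc, F_GETFL, 0);
  int mode;
  if (flags == -1)
    return -1;
  /* Mask off everything except the access mode before comparing.  */
  mode = flags & O_ACCMODE;
  return mode == O_WRONLY || mode == O_RDWR;
@}
@end group
@end smallexample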
2237
2238 @comment fcntl.h
2239 @comment POSIX.1
2240 @deftypevr Macro int F_SETFL
2241 This macro is used as the @var{command} argument to @code{fcntl}, to set
2242 the file status flags for the open file corresponding to the
2243 @var{filedes} argument.  This command requires a third @code{int}
2244 argument to specify the new flags, so the call looks like this:
2245
2246 @smallexample
2247 fcntl (@var{filedes}, F_SETFL, @var{new-flags})
2248 @end smallexample
2249
2250 You can't change the access mode for the file in this way; that is,
2251 whether the file descriptor was opened for reading or writing.
2252
2253 The normal return value from @code{fcntl} with this command is an
2254 unspecified value other than @code{-1}, which indicates an error.  The
2255 error conditions are the same as for the @code{F_GETFL} command.
2256 @end deftypevr
2257
2258 If you want to modify the file status flags, you should get the current
2259 flags with @code{F_GETFL} and modify the value.  Don't assume that the
2260 flags listed here are the only ones that are implemented; your program
2261 may be run years from now and more flags may exist then.  For example,
2262 here is a function to set or clear the flag @code{O_NONBLOCK} without
2263 altering any other flags:
2264
2265 @smallexample
2266 @group
2267 /* @r{Set the @code{O_NONBLOCK} flag of @var{desc} if @var{value} is nonzero,}
2268    @r{or clear the flag if @var{value} is 0.}
2269    @r{Return 0 on success, or -1 on error with @code{errno} set.} */
2270
2271 int
2272 set_nonblock_flag (int desc, int value)
2273 @{
2274   int oldflags = fcntl (desc, F_GETFL, 0);
2275   /* @r{If reading the flags failed, return error indication now.} */
2276   if (oldflags == -1)
2277     return -1;
2278   /* @r{Set just the flag we want to set.} */
2279   if (value != 0)
2280     oldflags |= O_NONBLOCK;
2281   else
2282     oldflags &= ~O_NONBLOCK;
2283   /* @r{Store modified flag word in the descriptor.} */
2284   return fcntl (desc, F_SETFL, oldflags);
2285 @}
2286 @end group
2287 @end smallexample
2288
2289 @node File Locks
2290 @section File Locks
2291
2292 @cindex file locks
2293 @cindex record locking
2294 The remaining @code{fcntl} commands are used to support @dfn{record
2295 locking}, which permits multiple cooperating programs to prevent each
2296 other from simultaneously accessing parts of a file in error-prone
2297 ways.
2298
2299 @cindex exclusive lock
2300 @cindex write lock
2301 An @dfn{exclusive} or @dfn{write} lock gives a process exclusive access
2302 for writing to the specified part of the file.  While a write lock is in
2303 place, no other process can lock that part of the file.
2304
2305 @cindex shared lock
2306 @cindex read lock
A @dfn{shared} or @dfn{read} lock prevents any other process from
acquiring a write lock on the specified part of the file.  However,
other processes can still acquire read locks on it.
2310
2311 The @code{read} and @code{write} functions do not actually check to see
2312 whether there are any locks in place.  If you want to implement a
2313 locking protocol for a file shared by multiple processes, your application
2314 must do explicit @code{fcntl} calls to request and clear locks at the
2315 appropriate points.
2316
2317 Locks are associated with processes.  A process can only have one kind
2318 of lock set for each byte of a given file.  When any file descriptor for
2319 that file is closed by the process, all of the locks that process holds
2320 on that file are released, even if the locks were made using other
2321 descriptors that remain open.  Likewise, locks are released when a
2322 process exits, and are not inherited by child processes created using
2323 @code{fork} (@pxref{Creating a Process}).
2324
2325 When making a lock, use a @code{struct flock} to specify what kind of
2326 lock and where.  This data type and the associated macros for the
2327 @code{fcntl} function are declared in the header file @file{fcntl.h}.
2328 @pindex fcntl.h
2329
2330 @comment fcntl.h
2331 @comment POSIX.1
2332 @deftp {Data Type} {struct flock}
2333 This structure is used with the @code{fcntl} function to describe a file
2334 lock.  It has these members:
2335
2336 @table @code
2337 @item short int l_type
2338 Specifies the type of the lock; one of @code{F_RDLCK}, @code{F_WRLCK}, or
2339 @code{F_UNLCK}.
2340
2341 @item short int l_whence
2342 This corresponds to the @var{whence} argument to @code{fseek} or
2343 @code{lseek}, and specifies what the offset is relative to.  Its value
2344 can be one of @code{SEEK_SET}, @code{SEEK_CUR}, or @code{SEEK_END}.
2345
2346 @item off_t l_start
2347 This specifies the offset of the start of the region to which the lock
applies, and is given in bytes relative to the point specified by
the @code{l_whence} member.
2350
2351 @item off_t l_len
2352 This specifies the length of the region to be locked.  A value of
2353 @code{0} is treated specially; it means the region extends to the end of
2354 the file.
2355
2356 @item pid_t l_pid
2357 This field is the process ID (@pxref{Process Creation Concepts}) of the
2358 process holding the lock.  It is filled in by calling @code{fcntl} with
2359 the @code{F_GETLK} command, but is ignored when making a lock.
2360 @end table
2361 @end deftp
2362
2363 @comment fcntl.h
2364 @comment POSIX.1
2365 @deftypevr Macro int F_GETLK
2366 This macro is used as the @var{command} argument to @code{fcntl}, to
2367 specify that it should get information about a lock.  This command
2368 requires a third argument of type @w{@code{struct flock *}} to be passed
2369 to @code{fcntl}, so that the form of the call is:
2370
2371 @smallexample
2372 fcntl (@var{filedes}, F_GETLK, @var{lockp})
2373 @end smallexample
2374
2375 If there is a lock already in place that would block the lock described
2376 by the @var{lockp} argument, information about that lock overwrites
2377 @code{*@var{lockp}}.  Existing locks are not reported if they are
2378 compatible with making a new lock as specified.  Thus, you should
2379 specify a lock type of @code{F_WRLCK} if you want to find out about both
2380 read and write locks, or @code{F_RDLCK} if you want to find out about
2381 write locks only.
2382
2383 There might be more than one lock affecting the region specified by the
2384 @var{lockp} argument, but @code{fcntl} only returns information about
2385 one of them.  The @code{l_whence} member of the @var{lockp} structure is
2386 set to @code{SEEK_SET} and the @code{l_start} and @code{l_len} fields
2387 set to identify the locked region.
2388
2389 If no lock applies, the only change to the @var{lockp} structure is to
2390 update the @code{l_type} to a value of @code{F_UNLCK}.
2391
2392 The normal return value from @code{fcntl} with this command is an
2393 unspecified value other than @code{-1}, which is reserved to indicate an
2394 error.  The following @code{errno} error conditions are defined for
2395 this command:
2396
2397 @table @code
2398 @item EBADF
2399 The @var{filedes} argument is invalid.
2400
2401 @item EINVAL
2402 Either the @var{lockp} argument doesn't specify valid lock information,
2403 or the file associated with @var{filedes} doesn't support locks.
2404 @end table
2405 @end deftypevr
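
A query of this kind might be sketched as follows.  The function name
@code{lock_holder} is an assumption made for this illustration.  It
asks about a write lock on the whole file, so that both read locks and
write locks held by other processes are reported.

@smallexample
@group
#include <fcntl.h>
#include <unistd.h>

/* Return the ID of a process holding a lock that would block a
   write lock on the whole file open as DESC, 0 if there is no
   such lock, or -1 on error with errno set.  */
pid_t
lock_holder (int desc)
@{
  struct flock lock;
  lock.l_type = F_WRLCK;
  lock.l_whence = SEEK_SET;
  lock.l_start = 0;
  lock.l_len = 0;               /* Zero means up to end of file.  */
  lock.l_pid = 0;
  if (fcntl (desc, F_GETLK, &lock) == -1)
    return -1;
  return lock.l_type == F_UNLCK ? 0 : lock.l_pid;
@}
@end group
@end smallexample

The answer is only a snapshot: another process can set or release a
lock immediately after the call returns.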
2406
2407 @comment fcntl.h
2408 @comment POSIX.1
2409 @deftypevr Macro int F_SETLK
2410 This macro is used as the @var{command} argument to @code{fcntl}, to
2411 specify that it should set or clear a lock.  This command requires a
2412 third argument of type @w{@code{struct flock *}} to be passed to
2413 @code{fcntl}, so that the form of the call is:
2414
2415 @smallexample
2416 fcntl (@var{filedes}, F_SETLK, @var{lockp})
2417 @end smallexample
2418
2419 If the process already has a lock on any part of the region, the old lock
2420 on that part is replaced with the new lock.  You can remove a lock
2421 by specifying a lock type of @code{F_UNLCK}.
2422
2423 If the lock cannot be set, @code{fcntl} returns immediately with a value
2424 of @code{-1}.  This function does not block waiting for other processes
to release locks.  If @code{fcntl} succeeds, it returns a value other
2426 than @code{-1}.
2427
2428 The following @code{errno} error conditions are defined for this
2429 function:
2430
2431 @table @code
2432 @item EAGAIN
2433 @itemx EACCES
2434 The lock cannot be set because it is blocked by an existing lock on the
2435 file.  Some systems use @code{EAGAIN} in this case, and other systems
2436 use @code{EACCES}; your program should treat them alike, after
2437 @code{F_SETLK}.  (The GNU system always uses @code{EAGAIN}.)
2438
2439 @item EBADF
2440 Either: the @var{filedes} argument is invalid; you requested a read lock
2441 but the @var{filedes} is not open for read access; or, you requested a
2442 write lock but the @var{filedes} is not open for write access.
2443
2444 @item EINVAL
2445 Either the @var{lockp} argument doesn't specify valid lock information,
2446 or the file associated with @var{filedes} doesn't support locks.
2447
2448 @item ENOLCK
2449 The system has run out of file lock resources; there are already too
2450 many file locks in place.
2451
2452 Well-designed file systems never report this error, because they have no
2453 limitation on the number of locks.  However, you must still take account
2454 of the possibility of this error, as it could result from network access
2455 to a file system on another machine.
2456 @end table
2457 @end deftypevr
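
A nonblocking attempt to take a lock might be sketched as follows.  The
function name @code{try_write_lock} and its return convention are
assumptions made for this illustration.  Note that it treats
@code{EAGAIN} and @code{EACCES} alike.

@smallexample
@group
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Try to set a write lock on the whole file open as DESC without
   waiting.  Return 1 if the lock was set, 0 if a conflicting lock
   is in place, or -1 on any other error with errno set.  */
int
try_write_lock (int desc)
@{
  struct flock lock;
  lock.l_type = F_WRLCK;
  lock.l_whence = SEEK_SET;
  lock.l_start = 0;
  lock.l_len = 0;               /* Zero means up to end of file.  */
  lock.l_pid = 0;
  if (fcntl (desc, F_SETLK, &lock) != -1)
    return 1;
  if (errno == EAGAIN || errno == EACCES)
    return 0;
  return -1;
@}
@end group
@end smallexample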
2458
2459 @comment fcntl.h
2460 @comment POSIX.1
2461 @deftypevr Macro int F_SETLKW
2462 This macro is used as the @var{command} argument to @code{fcntl}, to
2463 specify that it should set or clear a lock.  It is just like the
2464 @code{F_SETLK} command, but causes the process to block (or wait)
until the request can be satisfied.
2466
2467 This command requires a third argument of type @code{struct flock *}, as
2468 for the @code{F_SETLK} command.
2469
2470 The @code{fcntl} return values and errors are the same as for the
2471 @code{F_SETLK} command, but these additional @code{errno} error conditions
2472 are defined for this command:
2473
2474 @table @code
2475 @item EINTR
2476 The function was interrupted by a signal while it was waiting.
2477 @xref{Interrupted Primitives}.
2478
2479 @item EDEADLK
2480 The specified region is being locked by another process.  But that
2481 process is waiting to lock a region which the current process has
2482 locked, so waiting for the lock would result in deadlock.  The system
2483 does not guarantee that it will detect all such conditions, but it lets
2484 you know if it notices one.
2485 @end table
2486 @end deftypevr
2487
2488
2489 The following macros are defined for use as values for the @code{l_type}
2490 member of the @code{flock} structure.  The values are integer constants.
2491
2492 @table @code
2493 @comment fcntl.h
2494 @comment POSIX.1
2495 @vindex F_RDLCK
2496 @item F_RDLCK
2497 This macro is used to specify a read (or shared) lock.
2498
2499 @comment fcntl.h
2500 @comment POSIX.1
2501 @vindex F_WRLCK
2502 @item F_WRLCK
2503 This macro is used to specify a write (or exclusive) lock.
2504
2505 @comment fcntl.h
2506 @comment POSIX.1
2507 @vindex F_UNLCK
2508 @item F_UNLCK
2509 This macro is used to specify that the region is unlocked.
2510 @end table
2511
As an example of a situation where file locking is useful, consider a
program that can be run simultaneously by several different users and
that logs status information to a common file.  One example of such a program
2515 might be a game that uses a file to keep track of high scores.  Another
2516 example might be a program that records usage or accounting information
2517 for billing purposes.
2518
2519 Having multiple copies of the program simultaneously writing to the
2520 file could cause the contents of the file to become mixed up.  But
2521 you can prevent this kind of problem by setting a write lock on the
2522 file before actually writing to the file.
2523
2524 If the program also needs to read the file and wants to make sure that
2525 the contents of the file are in a consistent state, then it can also use
2526 a read lock.  While the read lock is set, no other process can lock
2527 that part of the file for writing.
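
The high-score scenario might be handled along the following lines.
This is only a sketch: the function names @code{lock_whole_file} and
@code{log_score} are hypothetical, the whole file is locked rather than
a single record, and a real program would probably retry when
@code{fcntl} fails with @code{EINTR}.

@smallexample
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Set a lock of TYPE (F_RDLCK, F_WRLCK or F_UNLCK) on the whole
   file open as DESC, waiting if necessary.
   Return 0 on success, or -1 on error with errno set.  */
static int
lock_whole_file (int desc, short type)
@{
  struct flock lock;
  memset (&lock, 0, sizeof lock);
  lock.l_type = type;
  lock.l_whence = SEEK_SET;
  lock.l_start = 0;
  lock.l_len = 0;               /* Zero means up to end of file.  */
  return fcntl (desc, F_SETLKW, &lock) == -1 ? -1 : 0;
@}

/* Append LINE to the score file open as DESC while holding a
   write lock.  DESC must be open for writing.
   Return 0 on success, or -1 on failure.  */
int
log_score (int desc, const char *line)
@{
  size_t len = strlen (line);
  int result = 0;
  if (lock_whole_file (desc, F_WRLCK) == -1)
    return -1;
  /* No cooperating process can write while we hold the lock, so
     seeking to the end and then writing is safe here.  */
  if (lseek (desc, 0, SEEK_END) == -1
      || write (desc, line, len) != (ssize_t) len)
    result = -1;
  if (lock_whole_file (desc, F_UNLCK) == -1)
    result = -1;
  return result;
@}
@end smallexample

A reader that wants a consistent view of the file could call
@code{lock_whole_file} with @code{F_RDLCK} before reading and with
@code{F_UNLCK} afterwards.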
2528
2529 @c ??? This section could use an example program.
2530
2531 Remember that file locks are only a @emph{voluntary} protocol for
2532 controlling access to a file.  There is still potential for access to
2533 the file by programs that don't use the lock protocol.
2534
2535 @node Interrupt Input
2536 @section Interrupt-Driven Input
2537
2538 @cindex interrupt-driven input
2539 If you set the @code{O_ASYNC} status flag on a file descriptor
2540 (@pxref{File Status Flags}), a @code{SIGIO} signal is sent whenever
2541 input or output becomes possible on that file descriptor.  The process
2542 or process group to receive the signal can be selected by using the
2543 @code{F_SETOWN} command to the @code{fcntl} function.  If the file
2544 descriptor is a socket, this also selects the recipient of @code{SIGURG}
2545 signals that are delivered when out-of-band data arrives on that socket;
2546 see @ref{Out-of-Band Data}.  (@code{SIGURG} is sent in any situation
2547 where @code{select} would report the socket as having an ``exceptional
2548 condition''.  @xref{Waiting for I/O}.)
2549
2550 If the file descriptor corresponds to a terminal device, then @code{SIGIO}
2551 signals are sent to the foreground process group of the terminal.
2552 @xref{Job Control}.
2553
2554 @pindex fcntl.h
2555 The symbols in this section are defined in the header file
2556 @file{fcntl.h}.
2557
2558 @comment fcntl.h
2559 @comment BSD
2560 @deftypevr Macro int F_GETOWN
2561 This macro is used as the @var{command} argument to @code{fcntl}, to
2562 specify that it should get information about the process or process
2563 group to which @code{SIGIO} signals are sent.  (For a terminal, this is
2564 actually the foreground process group ID, which you can get using
2565 @code{tcgetpgrp}; see @ref{Terminal Access Functions}.)
2566
2567 The return value is interpreted as a process ID; if negative, its
2568 absolute value is the process group ID.
2569
2570 The following @code{errno} error condition is defined for this command:
2571
2572 @table @code
2573 @item EBADF
2574 The @var{filedes} argument is invalid.
2575 @end table
2576 @end deftypevr
2577
2578 @comment fcntl.h
2579 @comment BSD
2580 @deftypevr Macro int F_SETOWN
2581 This macro is used as the @var{command} argument to @code{fcntl}, to
2582 specify that it should set the process or process group to which
2583 @code{SIGIO} signals are sent.  This command requires a third argument
2584 of type @code{pid_t} to be passed to @code{fcntl}, so that the form of
2585 the call is:
2586
2587 @smallexample
2588 fcntl (@var{filedes}, F_SETOWN, @var{pid})
2589 @end smallexample
2590
2591 The @var{pid} argument should be a process ID.  You can also pass a
2592 negative number whose absolute value is a process group ID.
2593
2594 The return value from @code{fcntl} with this command is @code{-1}
2595 in case of error and some other value if successful.  The following
2596 @code{errno} error conditions are defined for this command:
2597
2598 @table @code
2599 @item EBADF
2600 The @var{filedes} argument is invalid.
2601
2602 @item ESRCH
2603 There is no process or process group corresponding to @var{pid}.
2604 @end table
2605 @end deftypevr
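
Enabling interrupt-driven input on a descriptor might be sketched as
follows.  The names @code{enable_sigio}, @code{handle_sigio} and
@code{input_ready} are assumptions made for this illustration, and
@code{_GNU_SOURCE} is defined because @code{O_ASYNC} is an extension
that may not be visible otherwise.

@smallexample
#define _GNU_SOURCE 1
#include <fcntl.h>
#include <signal.h>
#include <unistd.h>

/* Set by the handler; the main program can test and clear it.  */
static volatile sig_atomic_t input_ready;

static void
handle_sigio (int signum)
@{
  (void) signum;
  input_ready = 1;
@}

/* Arrange for the calling process to receive SIGIO when I/O
   becomes possible on DESC.
   Return 0 on success, or -1 on error with errno set.  */
int
enable_sigio (int desc)
@{
  struct sigaction action;
  int flags;

  action.sa_handler = handle_sigio;
  sigemptyset (&action.sa_mask);
  action.sa_flags = 0;
  if (sigaction (SIGIO, &action, NULL) == -1)
    return -1;

  /* Select the recipient before turning the signals on.  */
  if (fcntl (desc, F_SETOWN, getpid ()) == -1)
    return -1;

  flags = fcntl (desc, F_GETFL, 0);
  if (flags == -1)
    return -1;
  return fcntl (desc, F_SETFL, flags | O_ASYNC) == -1 ? -1 : 0;
@}
@end smallexample

The handler only records that the signal arrived; the main program is
expected to notice @code{input_ready} and do the actual reading.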
2606
2607 @c ??? This section could use an example program.