1 \input texinfo @c -*-texinfo-*-
2 @comment %**start of header (This is for running Texinfo on a region.)
4 @settitle Inter Process Communication.
6 @comment %**end of header (This is for running Texinfo on a region.)
9 This file documents the System V style inter process communication
primitives available under Linux.
12 Copyright @copyright{} 1992 krishna balasubramanian
14 Permission is granted to use this material and the accompanying
15 programs within the terms of the GNU GPL.
20 @center @titlefont{System V Inter Process Communication}
22 @center krishna balasubramanian,
24 @comment The following two commands start the copyright page.
26 @vskip 0pt plus 1filll
27 Copyright @copyright{} 1992 krishna balasubramanian
29 Permission is granted to use this material and the accompanying
30 programs within the terms of the GNU GPL.
33 @dircategory Miscellaneous
35 * ipc: (ipc). System V style inter process communication
38 @node Top, Overview, Notes, (dir)
39 @chapter System V IPC.
These facilities are provided to maintain compatibility with
programs developed on System V Unix systems and others
that rely on these System V mechanisms to accomplish inter
process communication (IPC).@refill
46 The specifics described here are applicable to the Linux implementation.
47 Other implementations may do things slightly differently.
* Overview:: What is System V IPC? Overall mechanisms.
51 * Messages:: System calls for message passing.
52 * Semaphores:: System calls for semaphores.
53 * Shared Memory:: System calls for shared memory access.
54 * Notes:: Miscellaneous notes.
57 @node Overview, example, Top, Top
60 @noindent System V IPC consists of three mechanisms:
64 Messages : exchange messages with any process or server.
66 Semaphores : allow unrelated processes to synchronize execution.
68 Shared memory : allow unrelated processes to share memory.
72 * example:: Using shared memory.
73 * perms:: Description of access permissions.
74 * syscalls:: Overview of ipc system calls.
77 Access to all resources is permitted on the basis of permissions
78 set up when the resource was created.@refill
A resource here is a message queue, a semaphore set (array)
or a shared memory segment.@refill
83 A resource must first be allocated by a creator before it is used.
84 The creator can assign a different owner. After use the resource
85 must be explicitly destroyed by the creator or owner.@refill
87 A resource is identified by a numeric @var{id}. Typically a creator
88 defines a @var{key} that may be used to access the resource. The user
89 process may then use this @var{key} in the @dfn{get} system call to obtain
90 the @var{id} for the corresponding resource. This @var{id} is then used for
91 all further access. A library call @dfn{ftok} is provided to translate
92 pathnames or strings to numeric keys.@refill
94 There are system and implementation defined limits on the number and
95 sizes of resources of any given type. Some of these are imposed by the
96 implementation and others by the system administrator
when configuring the kernel (@pxref{msglimits}, @pxref{semlimits},
@pxref{shmlimits}).@refill
100 There is an @code{msqid_ds}, @code{semid_ds} or @code{shmid_ds} struct
101 associated with each message queue, semaphore array or shared segment.
Each IPC resource also has an associated @code{ipc_perm} struct which defines
the creator, owner and access permissions for the resource.
These structures are detailed in the following sections.@refill
108 @node example, perms, Overview, Overview
111 Here is a code fragment with pointers on how to use shared memory. The
112 same methods are applicable to other resources.@refill
In a typical access sequence the creator allocates a new instance
of the resource with the @code{get} system call using the IPC_CREAT
flag.@refill
118 @noindent creator process:@*
125 int size = 0x5000; /* 20 K */
126 int flags = 0664 | IPC_CREAT; /* read-only for others */
128 key = ftok ("~creator/ipckey", proc_id);
129 id = shmget (key, size, flags);
130 exit (0); /* quit leaving resource allocated */
134 Users then gain access to the resource using the same key.@*
144 key = ftok ("~creator/ipckey", proc_id);
id = shmget (key, 0, 004); /* default size */
if (id == -1)
perror ("shmget ...");
150 shmaddr = shmat (id, 0, SHM_RDONLY); /* attach segment for reading */
151 if (shmaddr == (char *) -1)
152 perror ("shmat ...");
154 local_var = *(shmaddr + 3); /* read segment etc. */
156 shmdt (shmaddr); /* detach segment */
160 When the resource is no longer needed the creator should remove it.@*
162 Creator/owner process 2:
key = ftok ("~creator/ipckey", proc_id);
165 id = shmget (key, 0, 0);
166 shmctl (id, IPC_RMID, NULL);
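The three fragments above can be put together as a rough sketch. The function names @code{creator_setup}, @code{user_read} and @code{creator_cleanup} are hypothetical and not part of any library. The sketch assumes that @code{path} names an existing file, since @code{ftok} takes a real pathname and does not expand a leading tilde.

```c
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Creator: allocate a segment for (path, proj) and store a string in it.
   Returns the segment id, or -1 on error.  size must be greater than 0. */
int creator_setup(const char *path, int proj, size_t size, const char *text)
{
    key_t key = ftok(path, proj);           /* path must name an existing file */
    if (key == (key_t) -1)
        return -1;

    int id = shmget(key, size, 0664 | IPC_CREAT);   /* read-only for others */
    if (id == -1)
        return -1;

    char *addr = shmat(id, NULL, 0);        /* read-write attach */
    if (addr == (char *) -1)
        return -1;
    strncpy(addr, text, size - 1);
    addr[size - 1] = '\0';
    shmdt(addr);                            /* the segment stays allocated */
    return id;
}

/* User: find the segment through the same key and copy its text out. */
int user_read(const char *path, int proj, char *out, size_t outlen)
{
    key_t key = ftok(path, proj);
    if (key == (key_t) -1)
        return -1;

    int id = shmget(key, 0, 0444);          /* size 0: procure, do not create */
    if (id == -1)
        return -1;

    const char *addr = shmat(id, NULL, SHM_RDONLY);
    if (addr == (const char *) -1)
        return -1;
    strncpy(out, addr, outlen - 1);
    out[outlen - 1] = '\0';
    shmdt(addr);
    return 0;
}

/* Creator or owner: remove the segment once it is no longer needed. */
int creator_cleanup(const char *path, int proj)
{
    key_t key = ftok(path, proj);
    if (key == (key_t) -1)
        return -1;
    int id = shmget(key, 0, 0);
    if (id == -1)
        return -1;
    return shmctl(id, IPC_RMID, NULL);
}
```

A creator might call @code{creator_setup ("/tmp", 'K', 0x5000, "hello")}, and any process with read permission can then call @code{user_read} with the same path and project id.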
170 @node perms, syscalls, example, Overview
173 Each resource has an associated @code{ipc_perm} struct which defines the
174 creator, owner and access perms for the resource.@refill
key_t key; /* set by creator */
ushort uid; /* owner euid and egid */
ushort gid;
ushort cuid; /* creator euid and egid */
ushort cgid;
ushort mode; /* access modes in lower 9 bits */
ushort seq; /* sequence number */
187 The creating process is the default owner. The owner can be reassigned
188 by the creator and has creator perms. Only the owner, creator or super-user
189 can delete the resource.@refill
191 The lowest nine bits of the flags parameter supplied by the user to the
system call are compared with the values stored in @code{ipc_perm.mode}
193 to determine if the requested access is allowed. In the case
194 that the system call creates the resource, these bits are initialized
195 from the user supplied value.@refill
197 As for files, access permissions are specified as read, write and exec
198 for user, group or other (though the exec perms are unused). For example
199 0624 grants read-write to owner, write-only to group and read-only
200 access to others.@refill
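As an illustration of how the nine mode bits are read, here is a small sketch that decodes them. The names @code{mode_grants}, @code{MODE_READ}, @code{MODE_WRITE} and the @code{WHO_} constants are hypothetical helpers invented for this example. The kernel performs its own check internally.

```c
#include <stdbool.h>

/* Bit offsets of the three permission groups within the mode. */
enum who { WHO_OWNER = 6, WHO_GROUP = 3, WHO_OTHER = 0 };

#define MODE_READ  4u
#define MODE_WRITE 2u

/* True if mode grants every bit in want to the given class of user. */
bool mode_grants(unsigned mode, enum who who, unsigned want)
{
    unsigned granted = (mode >> who) & 07u;
    return (granted & want) == want;
}
```

With the mode 0624 from the text, @code{mode_grants (0624, WHO_GROUP, MODE_WRITE)} is true and @code{mode_grants (0624, WHO_GROUP, MODE_READ)} is false.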
202 For shared memory, note that read-write access for segments is determined
203 by a separate flag which is not stored in the @code{mode} field.
204 Shared memory segments attached with write access can be read.@refill
206 The @code{cuid}, @code{cgid}, @code{key} and @code{seq} fields
207 cannot be changed by the user.@refill
211 @node syscalls, Messages, perms, Overview
212 @section IPC system calls
214 This section provides an overview of the IPC system calls. See the
215 specific sections on each type of resource for details.@refill
217 Each type of mechanism provides a @dfn{get}, @dfn{ctl} and one or more
218 @dfn{op} system calls that allow the user to create or procure the
219 resource (get), define its behaviour or destroy it (ctl) and manipulate
220 the resources (op).@refill
224 @subsection The @dfn{get} system calls
226 The @code{get} call typically takes a @var{key} and returns a numeric
227 @var{id} that is used for further access.
228 The @var{id} is an index into the resource table. A sequence
229 number is maintained and incremented when a resource is
230 destroyed so that access using an obsolete @var{id} is likely to fail.@refill
The user also specifies the permissions and other behaviour
characteristics for the current access. The flags are or-ed with the
permissions when invoking system calls as in:@refill
236 msgflg = IPC_CREAT | IPC_EXCL | 0666;
237 id = msgget (key, msgflg);
241 @code{key} : IPC_PRIVATE => new instance of resource is initialized.
246 IPC_CREAT : resource created for @var{key} if it does not exist.
248 IPC_CREAT | IPC_EXCL : fail if resource exists for @var{key}.
251 returns : an identifier used for all further access to the resource.
Note that IPC_PRIVATE is not a flag but a special @code{key}
that ensures (when the call is successful) that a new resource is
created.@refill
258 Use of IPC_PRIVATE does not make the resource inaccessible to other
259 users. For this you must set the access permissions appropriately.@refill
261 There is currently no way for a process to ensure exclusive access to a
262 resource. IPC_CREAT | IPC_EXCL only ensures (on success) that a new
263 resource was initialized. It does not imply exclusive access.@refill
266 See Also : @xref{msgget}, @xref{semget}, @xref{shmget}.@refill
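A common use of IPC_CREAT | IPC_EXCL is to find out whether the caller created the resource or merely procured an existing one. The following is a minimal sketch for message queues. The helper name @code{get_or_create_queue} is hypothetical.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

/* Return the queue id for key, creating the queue if it does not exist.
   *created is set to 1 if this call created it and to 0 otherwise.
   Returns -1 on error. */
int get_or_create_queue(key_t key, int perms, int *created)
{
    int id = msgget(key, IPC_CREAT | IPC_EXCL | perms);
    if (id != -1) {
        *created = 1;               /* a new queue was initialized */
        return id;
    }
    if (errno != EEXIST)
        return -1;                  /* a real failure, not "already exists" */

    *created = 0;
    return msgget(key, perms);      /* procure the existing queue */
}
```

As the text notes, a result of @code{*created == 1} only says that the queue is new. It does not give the caller exclusive access.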
270 @subsection The @dfn{ctl} system calls
272 Provides or alters the information stored in the structure that describes
273 the resource indexed by @var{id}.@refill
278 err = msgctl (id, IPC_STAT, &buf);
282 printf ("creator uid = %d\n", buf.msg_perm.cuid);
287 Commands supported by all @code{ctl} calls:@*
290 IPC_STAT : read info on resource specified by id into user allocated
291 buffer. The user must have read access to the resource.@refill
293 IPC_SET : write info from buffer into resource data structure. The
user must be the owner, creator or super-user.@refill
IPC_RMID : remove resource. The user must be the owner, creator or
super-user.@refill
300 The IPC_RMID command results in immediate removal of a message
301 queue or semaphore array. Shared memory segments however, are
302 only destroyed upon the last detach after IPC_RMID is executed.@refill
304 The @code{semctl} call provides a number of command options that allow
305 the user to determine or set the values of the semaphores in an array.@refill
308 See Also: @xref{msgctl}, @xref{semctl}, @xref{shmctl}.@refill
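Because IPC_SET writes several fields at once, a cautious way to change one of them is to read the structure with IPC_STAT, modify it, and write it back. This is a sketch for message queues under that assumption. @code{queue_mode} and @code{queue_set_mode} are hypothetical names.

```c
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

/* Return the low nine permission bits of the queue, or -1 on error. */
int queue_mode(int msqid)
{
    struct msqid_ds buf;

    if (msgctl(msqid, IPC_STAT, &buf) == -1)
        return -1;
    return buf.msg_perm.mode & 0777;
}

/* Replace the permission bits, leaving the other settable fields
   (owner uid, gid and msg_qbytes) at their current values. */
int queue_set_mode(int msqid, int mode)
{
    struct msqid_ds buf;

    if (msgctl(msqid, IPC_STAT, &buf) == -1)    /* needs read access */
        return -1;
    buf.msg_perm.mode = (buf.msg_perm.mode & ~0777) | (mode & 0777);
    return msgctl(msqid, IPC_SET, &buf);        /* owner, creator or super-user */
}
```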
311 @subsection The @dfn{op} system calls
313 Used to send or receive messages, read or alter semaphore values,
314 attach or detach shared memory segments.
315 The IPC_NOWAIT flag will cause the operation to fail with error EAGAIN
316 if the process has to wait on the call.@refill
319 @code{flags} : IPC_NOWAIT => return with error if a wait is required.
322 See Also: @xref{msgsnd},@xref{msgrcv},@xref{semop},@xref{shmat},
327 @node Messages, msgget, syscalls, Top
330 A message resource is described by a struct @code{msqid_ds} which is
331 allocated and initialized when the resource is created. Some fields
332 in @code{msqid_ds} can then be altered (if desired) by invoking @code{msgctl}.
333 The memory used by the resource is released when it is destroyed by
334 a @code{msgctl} call.@refill
338 struct ipc_perm msg_perm;
339 struct msg *msg_first; /* first message on queue (internal) */
340 struct msg *msg_last; /* last message in queue (internal) */
341 time_t msg_stime; /* last msgsnd time */
342 time_t msg_rtime; /* last msgrcv time */
343 time_t msg_ctime; /* last change time */
344 struct wait_queue *wwait; /* writers waiting (internal) */
345 struct wait_queue *rwait; /* readers waiting (internal) */
346 ushort msg_cbytes; /* number of bytes used on queue */
347 ushort msg_qnum; /* number of messages in queue */
348 ushort msg_qbytes; /* max number of bytes on queue */
349 ushort msg_lspid; /* pid of last msgsnd */
350 ushort msg_lrpid; /* pid of last msgrcv */
353 To send or receive a message the user allocates a structure that looks
354 like a @code{msgbuf} but with an array @code{mtext} of the required size.
Messages have a type (positive integer) associated with them so that
(for example) a listener can choose to receive only messages of a
given type.@refill
long mtype; type of message (@pxref{msgrcv}).
char mtext[1]; message text, stored in-line after mtype (the user declares a larger array).
365 The user must have write permissions to send and read permissions
366 to receive messages on a queue.@refill
368 When @code{msgsnd} is invoked, the user's message is copied into
369 an internal struct @code{msg} and added to the queue. A @code{msgrcv}
370 will then read this message and free the associated struct @code{msg}.@refill
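The following sketch shows one way to declare a message structure with a larger @code{mtext} array and to send and receive through it. @code{struct text_msg}, @code{TEXT_MAX}, @code{send_text} and @code{recv_text} are hypothetical names. The sketch only tests for a return of -1, so it does not depend on the value a successful @code{msgsnd} returns.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

#define TEXT_MAX 64                 /* hypothetical payload limit */

struct text_msg {
    long mtype;                     /* must be positive */
    char mtext[TEXT_MAX];           /* payload, sized by the user */
};

/* Queue a NUL-terminated string without blocking. */
int send_text(int msqid, long type, const char *text)
{
    struct text_msg m;
    size_t len = strlen(text) + 1;  /* include the terminating NUL */

    if (type < 1 || len > TEXT_MAX)
        return -1;
    m.mtype = type;
    memcpy(m.mtext, text, len);
    return msgsnd(msqid, &m, len, IPC_NOWAIT) == -1 ? -1 : 0;
}

/* Receive one message selected by msgtyp without blocking.
   Returns the type of the message received, or -1 on error.
   outlen must be greater than 0. */
long recv_text(int msqid, long msgtyp, char *out, size_t outlen)
{
    struct text_msg m;
    ssize_t n = msgrcv(msqid, &m, sizeof m.mtext, msgtyp, IPC_NOWAIT);

    if (n == -1)
        return -1;
    if ((size_t) n >= outlen)
        n = (ssize_t) outlen - 1;   /* truncate to fit the caller's buffer */
    memcpy(out, m.mtext, (size_t) n);
    out[n] = '\0';
    return m.mtype;
}
```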
378 * msglimits:: Implementation defined limits.
382 @node msgget, msgsnd, Messages, Messages
386 A message queue is allocated by a msgget system call :
389 msqid = msgget (key_t key, int msgflg);
394 @code{key}: an integer usually got from @code{ftok()} or IPC_PRIVATE.@refill
399 IPC_CREAT : used to create a new resource if it does not already exist.
401 IPC_EXCL | IPC_CREAT : used to ensure failure of the call if the
402 resource already exists.@refill
404 rwxrwxrwx : access permissions.
407 returns: msqid (an integer used for all further access) on success.
408 -1 on failure.@refill
411 A message queue is allocated if there is no resource corresponding
412 to the given key. The access permissions specified are then copied
413 into the @code{msg_perm} struct and the fields in @code{msqid_ds}
414 initialized. The user must use the IPC_CREAT flag or key = IPC_PRIVATE,
415 if a new instance is to be allocated. If a resource corresponding to
416 @var{key} already exists, the access permissions are verified.@refill
421 EACCES : (procure) Do not have permission for requested access.@*
423 EEXIST : (allocate) IPC_CREAT | IPC_EXCL specified and resource exists.@*
425 EIDRM : (procure) The resource was removed.@*
427 ENOSPC : All id's are taken (max of MSGMNI id's system-wide).@*
429 ENOENT : Resource does not exist and IPC_CREAT not specified.@*
ENOMEM : A new @code{msqid_ds} was to be created but memory could not be allocated.
436 @node msgsnd, msgrcv, msgget, Messages
440 int msgsnd (int msqid, struct msgbuf *msgp, int msgsz, int msgflg);
445 @code{msqid} : id obtained by a call to msgget.
447 @code{msgsz} : size of msg text (@code{mtext}) in bytes.
449 @code{msgp} : message to be sent. (msgp->mtype must be positive).
451 @code{msgflg} : IPC_NOWAIT.
453 returns : msgsz on success. -1 on error.
456 The message text and type are stored in the internal @code{msg}
457 structure. @code{msg_cbytes}, @code{msg_qnum}, @code{msg_lspid},
458 and @code{msg_stime} fields are updated. Readers waiting on the
459 queue are awakened.@refill
464 EACCES : Do not have write permission on queue.@*
466 EAGAIN : IPC_NOWAIT specified and queue is full.@*
468 EFAULT : msgp not accessible.@*
470 EIDRM : The message queue was removed.@*
EINTR : The queue was full and the process was interrupted while waiting for space.@*
474 EINVAL : mtype < 1, msgsz > MSGMAX, msgsz < 0, msqid < 0 or unused.@*
476 ENOMEM : Could not allocate space for header and text.@*
480 @node msgrcv, msgctl, msgsnd, Messages
484 int msgrcv (int msqid, struct msgbuf *msgp, int msgsz, long msgtyp,
490 msqid : id obtained by a call to msgget.
492 msgsz : maximum size of message to receive.
494 msgp : allocated by user to store the message in.
499 0 => get first message on queue.
501 > 0 => get first message of matching type.
503 < 0 => get message with least type which is <= abs(msgtyp).
509 IPC_NOWAIT : Return immediately if message not found.
511 MSG_NOERROR : The message is truncated if it is larger than msgsz.
513 MSG_EXCEPT : Used with msgtyp > 0 to receive any msg except of specified
517 returns : size of message if found. -1 on error.
520 The first message that meets the @code{msgtyp} specification is
521 identified. For msgtyp < 0, the entire queue is searched for the
522 message with the smallest type.@refill
If its length is no larger than msgsz or if the user specified the
MSG_NOERROR flag, its text and type are copied to msgp->mtext and
msgp->mtype, and it is taken off the queue.@refill
528 The @code{msg_cbytes}, @code{msg_qnum}, @code{msg_lrpid},
529 and @code{msg_rtime} fields are updated. Writers waiting on the
530 queue are awakened.@refill
535 E2BIG : msg bigger than msgsz and MSG_NOERROR not specified.@*
537 EACCES : Do not have permission for reading the queue.@*
539 EFAULT : msgp not accessible.@*
541 EIDRM : msg queue was removed.@*
EINTR : No matching message was found and the process was interrupted while waiting.@*
EINVAL : msgsz > MSGMAX or msgsz < 0, msqid < 0 or unused.@*
547 ENOMSG : msg of requested type not found and IPC_NOWAIT specified.
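The three @code{msgtyp} cases described above can be exercised with a short sketch. @code{struct tag_msg}, @code{put_type} and @code{take_type} are hypothetical names. Each message carries a single dummy byte so that only the type matters.

```c
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

struct tag_msg {
    long mtype;
    char mtext[1];                  /* one dummy byte of payload */
};

/* Queue a message of the given type without blocking. */
int put_type(int msqid, long type)
{
    struct tag_msg m;

    m.mtype = type;
    m.mtext[0] = 0;
    return msgsnd(msqid, &m, sizeof m.mtext, IPC_NOWAIT) == -1 ? -1 : 0;
}

/* Receive according to msgtyp without blocking.
   Returns the type of the message taken off the queue, or -1
   (for example with errno set to ENOMSG when nothing matches). */
long take_type(int msqid, long msgtyp)
{
    struct tag_msg m;

    if (msgrcv(msqid, &m, sizeof m.mtext, msgtyp, IPC_NOWAIT) == -1)
        return -1;
    return m.mtype;
}
```

If messages of type 3, 1 and 2 are queued in that order, @code{take_type (id, -2)} selects the message of type 1, @code{take_type (id, 3)} selects type 3, and @code{take_type (id, 0)} then takes the remaining message of type 2.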
551 @node msgctl, msglimits, msgrcv, Messages
555 int msgctl (int msqid, int cmd, struct msqid_ds *buf);
560 msqid : id obtained by a call to msgget.
562 buf : allocated by user for reading/writing info.
564 cmd : IPC_STAT, IPC_SET, IPC_RMID (@xref{syscalls}).
567 IPC_STAT results in the copy of the queue data structure
568 into the user supplied buffer.@refill
570 In the case of IPC_SET, the queue size (@code{msg_qbytes})
571 and the @code{uid}, @code{gid}, @code{mode} (low 9 bits) fields
572 of the @code{msg_perm} struct are set from the user supplied values.
573 @code{msg_ctime} is updated.@refill
575 Note that only the super user may increase the limit on the size of a
576 message queue beyond MSGMNB.@refill
578 When the queue is destroyed (IPC_RMID), the sequence number is
579 incremented and all waiting readers and writers are awakened.
580 These processes will then return with @code{errno} set to EIDRM.@refill
585 EPERM : Insufficient privilege to increase the size of the queue (IPC_SET)
586 or remove it (IPC_RMID).@*
588 EACCES : Do not have permission for reading the queue (IPC_STAT).@*
590 EFAULT : buf not accessible (IPC_STAT, IPC_SET).@*
592 EIDRM : msg queue was removed.@*
594 EINVAL : invalid cmd, msqid < 0 or unused.
597 @node msglimits, Semaphores, msgctl, Messages
@subsection Limits on Message Resources
601 Sizeof various structures:
604 msqid_ds 52 /* 1 per message queue .. dynamic */
606 msg 16 /* 1 for each message in system .. dynamic */
608 msgbuf 8 /* allocated by user */
615 MSGMNI : number of message queue identifiers ... policy.
617 MSGMAX : max size of message.
618 Header and message space allocated on one page.
619 MSGMAX = (PAGE_SIZE - sizeof(struct msg)).
620 Implementation maximum MSGMAX = 4080.@refill
622 MSGMNB : default max size of a message queue ... policy.
623 The super-user can increase the size of a
624 queue beyond MSGMNB by a @code{msgctl} call.@refill
628 Unused or unimplemented:@*
629 MSGTQL max number of message headers system-wide.@*
630 MSGPOOL total size in bytes of msg pool.
634 @node Semaphores, semget, msglimits, Top
637 Each semaphore has a value >= 0. An id provides access to an array
638 of @code{nsems} semaphores. Operations such as read, increment or decrement
639 semaphores in a set are performed by the @code{semop} call which processes
640 @code{nsops} operations at a time. Each operation is specified in a struct
@code{sembuf} described below. The operations are applied only if all of
them succeed.@refill
644 If you do not have a need for such arrays, you are probably better off using
645 the @code{test_bit}, @code{set_bit} and @code{clear_bit} bit-operations
646 defined in <asm/bitops.h>.@refill
648 Semaphore operations may also be qualified by a SEM_UNDO flag which
649 results in the operation being undone when the process exits.@refill
651 If a decrement cannot go through, a process will be put to sleep
652 on a queue waiting for the @code{semval} to increase unless it specifies
653 IPC_NOWAIT. A read operation can similarly result in a sleep on a
654 queue waiting for @code{semval} to become 0. (Actually there are
655 two queues per semaphore array).@refill
658 A semaphore array is described by:
661 struct ipc_perm sem_perm;
662 time_t sem_otime; /* last semop time */
663 time_t sem_ctime; /* last change time */
664 struct wait_queue *eventn; /* wait for a semval to increase */
665 struct wait_queue *eventz; /* wait for a semval to become 0 */
666 struct sem_undo *undo; /* undo entries */
667 ushort sem_nsems; /* no. of semaphores in array */
671 Each semaphore is described internally by :
674 short sempid; /* pid of last semop() */
675 ushort semval; /* current value */
676 ushort semncnt; /* num procs awaiting increase in semval */
677 ushort semzcnt; /* num procs awaiting semval = 0 */
684 * semlimits:: Limits imposed by this implementation.
687 @node semget, semop, Semaphores, Semaphores
691 A semaphore array is allocated by a semget system call:
694 semid = semget (key_t key, int nsems, int semflg);
699 @code{key} : an integer usually got from @code{ftok} or IPC_PRIVATE
704 # of semaphores in array (0 <= nsems <= SEMMSL <= SEMMNS)
0 => don't care; can be used when not creating the resource.
707 If successful you always get access to the entire array anyway.@refill
713 IPC_CREAT used to create a new resource
715 IPC_EXCL used with IPC_CREAT to ensure failure if the resource exists.
717 rwxrwxrwx access permissions.
720 returns : semid on success. -1 on failure.
723 An array of nsems semaphores is allocated if there is no resource
724 corresponding to the given key. The access permissions specified are
725 then copied into the @code{sem_perm} struct for the array along with the
726 user-id etc. The user must use the IPC_CREAT flag or key = IPC_PRIVATE
727 if a new resource is to be created.@refill
732 EINVAL : nsems not in above range (allocate).@*
733 nsems greater than number in array (procure).@*
735 EEXIST : (allocate) IPC_CREAT | IPC_EXCL specified and resource exists.@*
737 EIDRM : (procure) The resource was removed.@*
739 ENOMEM : could not allocate space for semaphore array.@*
741 ENOSPC : No arrays available (SEMMNI), too few semaphores available (SEMMNS).@*
743 ENOENT : Resource does not exist and IPC_CREAT not specified.@*
745 EACCES : (procure) do not have permission for specified access.
748 @node semop, semctl, semget, Semaphores
752 Operations on semaphore arrays are performed by calling semop :
755 int semop (int semid, struct sembuf *sops, unsigned nsops);
759 semid : id obtained by a call to semget.
761 sops : array of semaphore operations.
nsops : number of operations in array (0 < nsops <= SEMOPM).
765 returns : semval for last operation. -1 on failure.
769 Operations are described by a structure sembuf:
772 ushort sem_num; /* semaphore index in array */
773 short sem_op; /* semaphore operation */
774 short sem_flg; /* operation flags */
777 The value @code{sem_op} is to be added (signed) to the current value semval
778 of the semaphore with index sem_num (0 .. nsems -1) in the set.
779 Flags recognized in sem_flg are IPC_NOWAIT and SEM_UNDO.@refill
782 Two kinds of operations can result in wait:
785 If sem_op is 0 (read operation) and semval is non-zero, the process
786 sleeps on a queue waiting for semval to become zero or returns with
error EAGAIN if (sem_flg & IPC_NOWAIT) is true.@refill
789 If (sem_op < 0) and (semval + sem_op < 0), the process either sleeps
790 on a queue waiting for semval to increase or returns with error EAGAIN if
791 (sem_flg & IPC_NOWAIT) is true.@refill
794 The array sops is first read in and preliminary checks performed on
795 the arguments. The operations are parsed to determine if any of
796 them needs write permissions or requests an undo operation.@refill
798 The operations are then tried and the process sleeps if any operation
799 that does not specify IPC_NOWAIT cannot go through. If a process sleeps
it repeats these checks on waking up. If any operation that requests
IPC_NOWAIT cannot go through at any stage, the call returns with errno
set to EAGAIN.@refill
804 Finally, operations are committed when all go through without an intervening
805 sleep. Processes waiting on the zero_queue or increment_queue are awakened
806 if any of the semval's becomes zero or is incremented respectively.@refill
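The all-or-nothing behaviour described above can be seen with a sketch that decrements two semaphores in one @code{semop} call. The helper names @code{sem_set}, @code{sem_get} and @code{sem_take_both} are hypothetical. The sketch assumes a C library that leaves the declaration of @code{union semun} to the caller, as many do. On a system whose headers already declare it, the declaration here must be removed.

```c
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* Assumption: the system headers do not declare this union. */
union semun {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

/* Set the value of one semaphore in the array. */
int sem_set(int semid, int idx, int value)
{
    union semun arg;

    arg.val = value;
    return semctl(semid, idx, SETVAL, arg);
}

/* Return the current semval of one semaphore, or -1 on error. */
int sem_get(int semid, int idx)
{
    return semctl(semid, idx, GETVAL);
}

/* Decrement semaphores a and b in a single semop call.
   With IPC_NOWAIT the call fails (EAGAIN) instead of sleeping,
   and in that case neither semaphore is changed. */
int sem_take_both(int semid, unsigned short a, unsigned short b)
{
    struct sembuf ops[2] = {
        { .sem_num = a, .sem_op = -1, .sem_flg = IPC_NOWAIT | SEM_UNDO },
        { .sem_num = b, .sem_op = -1, .sem_flg = IPC_NOWAIT | SEM_UNDO },
    };

    return semop(semid, ops, 2) == -1 ? -1 : 0;
}
```

If semaphore 0 holds 1 and semaphore 1 holds 0, @code{sem_take_both} fails and semaphore 0 still holds 1 afterwards.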
811 E2BIG : nsops > SEMOPM.@*
813 EACCES : Do not have permission for requested (read/alter) access.@*
815 EAGAIN : An operation with IPC_NOWAIT specified could not go through.@*
817 EFAULT : The array sops is not accessible.@*
EFBIG : An operation had sem_num >= nsems.@*
821 EIDRM : The resource was removed.@*
823 EINTR : The process was interrupted on its way to a wait queue.@*
825 EINVAL : nsops is 0, semid < 0 or unused.@*
827 ENOMEM : SEM_UNDO requested. Could not allocate space for undo structure.@*
829 ERANGE : sem_op + semval > SEMVMX for some operation.
832 @node semctl, semlimits, semop, Semaphores
836 int semctl (int semid, int semnum, int cmd, union semun arg);
841 semid : id obtained by a call to semget.
846 GETPID return pid for the process that executed the last semop.
848 GETVAL return semval of semaphore with index semnum.
850 GETNCNT return number of processes waiting for semval to increase.
852 GETZCNT return number of processes waiting for semval to become 0
854 SETVAL set semval = arg.val.
856 GETALL read all semval's into arg.array.
858 SETALL set all semval's with values given in arg.array.
861 returns : 0 on success or as given above. -1 on failure.
The first five operate on the semaphore with index semnum in the set.
865 The last two operate on all semaphores in the set.@refill
867 @code{arg} is a union :
870 int val; value for SETVAL.
871 struct semid_ds *buf; buffer for IPC_STAT and IPC_SET.
872 ushort *array; array for GETALL and SETALL
877 IPC_SET, SETVAL, SETALL : sem_ctime is updated.
879 SETVAL, SETALL : Undo entries are cleared for altered semaphores in
880 all processes. Processes sleeping on the wait queues are
881 awakened if a semval becomes 0 or increases.@refill
883 IPC_SET : sem_perm.uid, sem_perm.gid, sem_perm.mode are updated from
884 user supplied values.@refill
890 EACCES : do not have permission for specified access.@*
892 EFAULT : arg is not accessible.@*
894 EIDRM : The resource was removed.@*
896 EINVAL : semid < 0 or semnum < 0 or semnum >= nsems.@*
898 EPERM : IPC_RMID, IPC_SET ... not creator, owner or super-user.@*
ERANGE : arg.val or arg.array[i] > SEMVMX or < 0 for some i.
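As a rough illustration of the whole-array commands, the sketch below wraps SETALL, GETALL and IPC_STAT. The helper names are hypothetical, and @code{union semun} is declared by hand on the assumption that the system headers do not provide it.

```c
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* Assumption: the system headers do not declare this union. */
union semun {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

/* Set every semval in the array from values[0 .. nsems-1]. */
int sem_set_all(int semid, unsigned short *values)
{
    union semun arg;

    arg.array = values;
    return semctl(semid, 0, SETALL, arg);   /* semnum is ignored */
}

/* Read every semval in the array into values[0 .. nsems-1]. */
int sem_get_all(int semid, unsigned short *values)
{
    union semun arg;

    arg.array = values;
    return semctl(semid, 0, GETALL, arg);
}

/* Return the number of semaphores in the array, or -1 on error. */
long sem_count(int semid)
{
    struct semid_ds ds;
    union semun arg;

    arg.buf = &ds;
    if (semctl(semid, 0, IPC_STAT, arg) == -1)
        return -1;
    return (long) ds.sem_nsems;
}
```

The caller is responsible for passing an array with at least @code{sem_count (semid)} entries.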
905 @node semlimits, Shared Memory, semctl, Semaphores
906 @subsection Limits on Semaphore Resources
909 Sizeof various structures:
911 semid_ds 44 /* 1 per semaphore array .. dynamic */
912 sem 8 /* 1 for each semaphore in system .. dynamic */
913 sembuf 6 /* allocated by user */
914 sem_undo 20 /* 1 for each undo request .. dynamic */
921 SEMVMX 32767 semaphore maximum value (short).
923 SEMMNI number of semaphore identifiers (or arrays) system wide...policy.
925 SEMMSL maximum number of semaphores per id.
926 1 semid_ds per array, 1 struct sem per semaphore
927 => SEMMSL = (PAGE_SIZE - sizeof(semid_ds)) / sizeof(sem).
928 Implementation maximum SEMMSL = 500.@refill
930 SEMMNS maximum number of semaphores system wide ... policy.
Setting SEMMNS >= SEMMSL*SEMMNI makes it irrelevant.@refill
933 SEMOPM Maximum number of operations in one semop call...policy.
937 Unused or unimplemented:@*
939 SEMAEM adjust on exit max value.@*
941 SEMMNU number of undo structures system-wide.@*
943 SEMUME maximum number of undo entries per process.
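The page-based bound on SEMMSL given above can be checked with a few lines of arithmetic. This sketch assumes PAGE_SIZE = 4096, which this document does not state for semaphores but which is consistent with the MSGMAX figure of 4080 quoted for messages. @code{semmsl_bound} is a hypothetical name.

```c
/* Sizes quoted in this section; the page size is an assumption. */
enum {
    DOC_PAGE_SIZE       = 4096,
    DOC_SIZEOF_SEMID_DS = 44,
    DOC_SIZEOF_SEM      = 8
};

/* Largest number of struct sem that fit on one page after the semid_ds. */
int semmsl_bound(int page_size, int semid_ds_size, int sem_size)
{
    return (page_size - semid_ds_size) / sem_size;
}
```

With these numbers the bound is (4096 - 44) / 8 = 506, so the implementation maximum of 500 fits on one page.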
947 @node Shared Memory, shmget, semlimits, Top
948 @section Shared Memory
950 Shared memory is distinct from the sharing of read-only code pages or
951 the sharing of unaltered data pages that is available due to the
copy-on-write mechanism. The essential difference is that the
shared pages are dirty (in the case of shared memory) and can be
made to appear at a convenient location in the process' address space.@refill
957 A shared segment is described by :
960 struct ipc_perm shm_perm;
961 int shm_segsz; /* size of segment (bytes) */
962 time_t shm_atime; /* last attach time */
963 time_t shm_dtime; /* last detach time */
964 time_t shm_ctime; /* last change time */
965 ulong *shm_pages; /* internal page table */
966 ushort shm_cpid; /* pid, creator */
967 ushort shm_lpid; /* pid, last operation */
968 short shm_nattch; /* no. of current attaches */
971 A shmget allocates a shmid_ds and an internal page table. A shmat
972 maps the segment into the process' address space with pointers
973 into the internal page table and the actual pages are faulted in
974 as needed. The memory associated with the segment must be explicitly
975 destroyed by calling shmctl with IPC_RMID.@refill
982 * shmlimits:: Limits imposed by this implementation.
986 @node shmget, shmat, Shared Memory, Shared Memory
990 A shared memory segment is allocated by a shmget system call:
993 int shmget(key_t key, int size, int shmflg);
998 key : an integer usually got from @code{ftok} or IPC_PRIVATE
1000 size : size of the segment in bytes (SHMMIN <= size <= SHMMAX).
1005 IPC_CREAT used to create a new resource
1007 IPC_EXCL used with IPC_CREAT to ensure failure if the resource exists.
1009 rwxrwxrwx access permissions.
1012 returns : shmid on success. -1 on failure.
1015 A descriptor for a shared memory segment is allocated if there isn't one
1016 corresponding to the given key. The access permissions specified are
1017 then copied into the @code{shm_perm} struct for the segment along with the
1018 user-id etc. The user must use the IPC_CREAT flag or key = IPC_PRIVATE
1019 to allocate a new segment.@refill
1021 If the segment already exists, the access permissions are verified,
1022 and a check is made to see that it is not marked for destruction.@refill
1024 @code{size} is effectively rounded up to a multiple of PAGE_SIZE as shared
1025 memory is allocated in pages.@refill
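The rounding can be written out as follows. @code{round_up_to_pages} is a hypothetical helper, and the page size is passed in rather than assumed.

```c
#include <stddef.h>

/* Round size up to the next multiple of page_size (page_size > 0). */
size_t round_up_to_pages(size_t size, size_t page_size)
{
    return (size + page_size - 1) / page_size * page_size;
}
```

For a page size of 4096, a request for 1 byte occupies 4096 bytes, while the 0x5000 byte segment of the earlier example is already a whole number of pages.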
1030 EINVAL : (allocate) Size not in range specified above.@*
1031 (procure) Size greater than size of segment.@*
1033 EEXIST : (allocate) IPC_CREAT | IPC_EXCL specified and resource exists.@*
1035 EIDRM : (procure) The resource is marked destroyed or was removed.@*
1037 ENOSPC : (allocate) All id's are taken (max of SHMMNI id's system-wide).
1038 Allocating a segment of the requested size would exceed the
1039 system wide limit on total shared memory (SHMALL).@refill
1042 ENOENT : (procure) Resource does not exist and IPC_CREAT not specified.@*
1044 EACCES : (procure) Do not have permission for specified access.@*
1046 ENOMEM : (allocate) Could not allocate memory for shmid_ds or pg_table.
1050 @node shmat, shmdt, shmget, Shared Memory
1054 Maps a shared segment into the process' address space.
1058 virt_addr = shmat (int shmid, char *shmaddr, int shmflg);
1063 shmid : id got from call to shmget.
1065 shmaddr : requested attach address.@*
1066 If shmaddr is 0 the system finds an unmapped region.@*
1067 If a non-zero value is indicated the value must be page
1068 aligned or the user must specify the SHM_RND flag.@refill
1071 SHM_RDONLY : request read-only attach.@*
1072 SHM_RND : attach address is rounded DOWN to a multiple of SHMLBA.
1074 returns: virtual address of attached segment. -1 on failure.
1077 When shmaddr is 0, the attach address is determined by finding an
1078 unmapped region in the address range 1G to 1.5G, starting at 1.5G
1079 and coming down from there. The algorithm is very simple so you
1080 are encouraged to avoid non-specific attaches.
1085 Determine attach address as described above.
1086 Check region (shmaddr, shmaddr + size) is not mapped and allocate
1087 page tables (undocumented SHM_REMAP flag!).
1088 Map the region by setting up pointers into the internal page table.
1089 Add a descriptor for the attach to the task struct for the process.
1090 @code{shm_nattch}, @code{shm_lpid}, @code{shm_atime} are updated.
1095 The @code{brk} value is not altered.
1096 The segment is automatically detached when the process exits.
1097 The same segment may be attached as read-only or read-write and
1098 more than once in the process' address space.
1099 A shmat can succeed on a segment marked for destruction.
1100 The request for a particular type of attach is made using the SHM_RDONLY flag.
1101 There is no notion of a write-only attach. The requested attach
1102 permissions must fall within those allowed by @code{shm_perm.mode}.
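The note that one segment may be attached more than once, with different access, can be illustrated by a sketch that attaches a segment read-write and read-only at system-chosen addresses in the same process. @code{write_then_read_alias} is a hypothetical name.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Attach shmid twice, store value through the read-write mapping and
   return the byte seen through the read-only mapping, or -1 on error. */
int write_then_read_alias(int shmid, char value)
{
    char *rw = shmat(shmid, NULL, 0);               /* read-write attach */
    if (rw == (char *) -1)
        return -1;

    const char *ro = shmat(shmid, NULL, SHM_RDONLY);    /* second attach */
    if (ro == (const char *) -1) {
        shmdt(rw);
        return -1;
    }

    rw[0] = value;                      /* write through one mapping ... */
    int seen = (unsigned char) ro[0];   /* ... read it through the other */

    shmdt(ro);
    shmdt(rw);
    return seen;
}
```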
1107 EACCES : Do not have permission for requested access.@*
1109 EINVAL : shmid < 0 or unused, shmaddr not aligned, attach at brk failed.@*
1111 EIDRM : resource was removed.@*
1113 ENOMEM : Could not allocate memory for descriptor or page tables.
1116 @node shmdt, shmctl, shmat, Shared Memory
1120 int shmdt (char *shmaddr);
1125 shmaddr : attach address of segment (returned by shmat).
1127 returns : 0 on success. -1 on failure.
1130 An attached segment is detached and @code{shm_nattch} decremented. The
1131 occupied region in user space is unmapped. The segment is destroyed
1132 if it is marked for destruction and @code{shm_nattch} is 0.
1133 @code{shm_lpid} and @code{shm_dtime} are updated.@refill
1138 EINVAL : No shared memory segment attached at shmaddr.
1141 @node shmctl, shmlimits, shmdt, Shared Memory
1145 Destroys allocated segments. Reads/Writes the control structures.
1148 int shmctl (int shmid, int cmd, struct shmid_ds *buf);
1153 shmid : id got from call to shmget.
1155 cmd : IPC_STAT, IPC_SET, IPC_RMID (@pxref{syscalls}).
1158 IPC_SET : Used to set the owner uid, gid, and shm_perm.mode field.
1160 IPC_RMID : The segment is marked for destruction. It is only destroyed
1161 on the last detach.@refill
1163 IPC_STAT : The shmid_ds structure is copied into the user allocated buffer.
1166 buf : used to read (IPC_STAT) or write (IPC_SET) information.
1168 returns : 0 on success, -1 on failure.
1171 The user must execute an IPC_RMID shmctl call to free the memory
1172 allocated by the shared segment. Otherwise all the pages faulted in
1173 will continue to live in memory or swap.@refill
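A common pattern that follows from this is to issue @code{IPC_RMID} while the segment is still attached, so that it is freed on the last detach even if the program exits abnormally. The sketch below illustrates the idea. The name @code{rmid_demo} is hypothetical, and the final check assumes that the id becomes invalid once the segment is destroyed.

```c
#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Mark a segment for destruction while attached, keep using it,
 * then detach and confirm that the id is no longer usable.
 * Returns 0 if everything behaves as described, -1 otherwise. */
int rmid_demo(void)
{
    struct shmid_ds ds;
    int shmid = shmget(IPC_PRIVATE, 4096, IPC_CREAT | 0600);
    if (shmid < 0)
        return -1;

    char *p = shmat(shmid, NULL, 0);
    if (p == (char *) -1) {
        shmctl(shmid, IPC_RMID, NULL);
        return -1;
    }

    /* Only marks the segment: one attach is still outstanding. */
    if (shmctl(shmid, IPC_RMID, NULL) < 0) {
        shmdt(p);
        return -1;
    }

    /* The memory is still mapped and usable. */
    strcpy(p, "still here");
    int usable = strcmp(p, "still here") == 0;

    /* Last detach: the segment is destroyed now. */
    int detached = shmdt(p) == 0;

    /* The id should no longer refer to a segment. */
    int gone = shmctl(shmid, IPC_STAT, &ds) == -1;

    return (usable && detached && gone) ? 0 : -1;
}
```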
1178 EACCES : Do not have permission for requested access.@*
1180 EFAULT : buf is not accessible.@*
1182 EINVAL : shmid < 0 or unused.@*
1184 EIDRM : identifier destroyed.@*
1186 EPERM : not creator, owner or super-user (IPC_SET, IPC_RMID).
1189 @node shmlimits, Notes, shmctl, Shared Memory
1190 @subsection Limits on Shared Memory Resources
1196 SHMMNI max num of shared segments system wide ... 4096.
1198 SHMMAX max shared memory segment size (bytes) ... 4M
1200 SHMMIN min shared memory segment size (bytes).
1201 1 byte (though PAGE_SIZE is the effective minimum size).@refill
1203 SHMALL max shared mem system wide (in pages) ... policy.
1205 SHMLBA segment low boundary address multiple.
1206 Must be page aligned. SHMLBA = PAGE_SIZE.@refill
1209 Unused or unimplemented:@*
1210 SHMSEG : maximum number of shared segments per process.
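The values listed above are those of the implementation described here. On later Linux kernels the limits are tunable, and (this is an assumption about the running system, not something stated above) they can be read from the files @code{shmmax}, @code{shmmni} and @code{shmall} under @code{/proc/sys/kernel}. A minimal sketch, with the hypothetical helper @code{read_shm_limit}:

```c
#include <stdio.h>

/* Read one limit from /proc/sys/kernel/<name>, for example "shmmax".
 * Returns 0 and stores the value on success, -1 on failure.
 * (Hypothetical helper; assumes a Linux /proc file system.) */
int read_shm_limit(const char *name, unsigned long long *value)
{
    char path[64];
    int n = snprintf(path, sizeof path, "/proc/sys/kernel/%s", name);
    if (n < 0 || (size_t) n >= sizeof path)
        return -1;               /* name too long for the buffer */

    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;

    int ok = fscanf(f, "%llu", value) == 1;
    fclose(f);
    return ok ? 0 : -1;
}
```

For example, @code{read_shm_limit("shmmni", &n)} stores the system wide maximum number of segments in @code{n}.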
1214 @node Notes, Top, shmlimits, Top
1215 @section Miscellaneous Notes
1217 The system calls are mapped into one -- @code{sys_ipc}. This should be
1218 transparent to the user.@refill
1220 @subsection Semaphore @code{undo} requests
1222 There is one sem_undo structure associated with a process for
1223 each semaphore which was altered (with an undo request) by the process.
1224 @code{sem_undo} structures are freed only when the process exits.
1226 One major cause for unhappiness with the undo mechanism is that
1227 it does not fit in with the notion of having an atomic set of
1228 operations on an array. The undo requests for an array and each
1229 semaphore therein may have been accumulated over many @code{semop}
1230 calls. Thus use the undo mechanism with private semaphores only.@refill
1232 Should the process sleep in @code{exit} or should all undo
1233 operations be applied with the IPC_NOWAIT flag in effect?
1234 Currently those undo operations which go through immediately are
1235 applied and those that require a wait are ignored silently.@refill
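The undo mechanism on a private semaphore can be sketched as follows: a child decrements the semaphore with @code{SEM_UNDO} and exits without releasing it, and the parent then finds the original value restored. The name @code{undo_demo} is hypothetical, and the sketch assumes a Linux system where the caller must define @code{union semun} itself.

```c
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>
#include <sys/wait.h>
#include <unistd.h>

/* On Linux the calling program defines union semun. */
union semun {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

/* Returns 0 if the child's decrement is undone at its exit. */
int undo_demo(void)
{
    int result = -1;
    union semun arg;
    int semid = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);
    if (semid < 0)
        return -1;

    arg.val = 1;                         /* semaphore starts at 1 */
    if (semctl(semid, 0, SETVAL, arg) == 0) {
        pid_t pid = fork();
        if (pid == 0) {
            /* Child: take the semaphore and exit while holding it. */
            struct sembuf op = { .sem_num = 0, .sem_op = -1,
                                 .sem_flg = SEM_UNDO };
            semop(semid, &op, 1);
            _exit(0);
        }
        /* Parent: after the child exits the adjust value (+1) has
         * been added back, so the semaphore reads 1 again. */
        if (pid > 0 && waitpid(pid, NULL, 0) == pid &&
            semctl(semid, 0, GETVAL) == 1)
            result = 0;
    }

    semctl(semid, 0, IPC_RMID);          /* remove the array */
    return result;
}
```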
1237 @subsection Shared memory, @code{malloc} and the @code{brk}.
1238 Note that since this section was written the implementation was
1239 changed so that non-specific attaches are done in the region
1240 1G - 1.5G. However much of the following is still worth thinking
1241 about so I left it in.
1243 On many systems, the shared memory is allocated in a special region
1244 of the address space ... way up somewhere. As mentioned earlier,
1245 this implementation attaches shared segments at the lowest possible
1246 address. Thus if you plan to use @code{malloc}, it is wise to malloc a
1247 large space and then proceed to attach the shared segments. This way
1248 malloc sets the brk sufficiently above the region it will use.@refill
1250 Alternatively you can use @code{sbrk} to adjust the @code{brk} value
1251 as you make shared memory attaches. The implementation is not very
1252 smart about selecting attach addresses. Using the system default
1253 addresses will result in fragmentation if detaches do not occur
1254 in the reverse order of the attaches.@refill
1256 Taking control of the matter is probably best. The rule applied
1257 is that attaches are allowed in unmapped regions other than
1258 in the text space (see <a.out.h>). Also remember that attach addresses
1259 and segment sizes are multiples of PAGE_SIZE.@refill
1261 One more trap (I quote Bruno on this). If you use malloc() to get space
1262 for your shared memory (i.e.@: to fix the @code{brk}), you must ensure you
1263 get an unmapped address range. This means you must malloc more memory
1264 than you had ever allocated before. Memory returned by malloc(), used,
1265 then freed by free() and then returned again by malloc() is no good.
1266 Neither is calloced memory.@refill
1268 Note that a shared memory region remains a shared memory region until
1269 you unmap it. Attaching a segment at the @code{brk} and calling malloc
1270 after that will result in an overlap of what malloc thinks is its
1271 space with what is really a shared memory region. For example in the case
1272 of a read-only attach, you will not be able to write to the overlapped region.
1276 @subsection Fork, exec and exit
1278 On a fork, the child inherits attached shared memory segments but
1279 not the semaphore undo information.@refill
1281 In the case of an exec, the attached shared segments are detached.
1282 The sem undo information however remains intact.@refill
1284 Upon exit, all attached shared memory segments are detached.
1285 The adjust values in the undo structures are added to the relevant semvals
1286 if the operations are permitted. Disallowed operations are ignored.@refill
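The inheritance of attached segments across a fork can be sketched as below: the parent attaches before forking, the child writes through the inherited attach, and the parent reads the value after the child has exited. The name @code{fork_inherit_demo} and the value 42 are illustrative assumptions.

```c
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 0 if a value stored by the child through the inherited
 * attach is seen by the parent, -1 otherwise. */
int fork_inherit_demo(void)
{
    int result = -1;
    int shmid = shmget(IPC_PRIVATE, 4096, IPC_CREAT | 0600);
    if (shmid < 0)
        return -1;

    int *shared = shmat(shmid, NULL, 0);
    /* Mark for destruction now; the segment is freed on the last
     * detach (or at once if the attach failed). */
    shmctl(shmid, IPC_RMID, NULL);
    if (shared == (int *) -1)
        return -1;

    *shared = 0;
    pid_t pid = fork();
    if (pid == 0) {
        *shared = 42;   /* child writes via the inherited attach */
        _exit(0);       /* the child's attach is detached at exit */
    }

    if (pid > 0 && waitpid(pid, NULL, 0) == pid && *shared == 42)
        result = 0;

    shmdt(shared);      /* last detach destroys the segment */
    return result;
}
```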
1289 @subsection Other Features
1291 These features of the current implementation are
1292 likely to be modified in the future.
1294 The SHM_LOCK and SHM_UNLOCK flags are available (super-user) for use with the
1295 @code{shmctl} call to prevent swapping of a shared segment. The user
1296 must fault in any pages that are required to be present after locking
1299 The IPC_INFO, MSG_STAT, MSG_INFO, SHM_STAT, SHM_INFO, SEM_STAT, SEM_INFO
1300 @code{ctl} calls are used by the @code{ipcs} program to provide information
1301 on allocated resources. These can be modified as needed or moved to a proc
1302 file system interface.
1306 Thanks to Ove Ewerlid, Bruno Haible, Ulrich Pegelow and Linus Torvalds
1307 for ideas, tutorials, bug reports and fixes, and merriment. And more