.\"	$NetBSD: locking.9,v 1.5.8.1 2017/08/25 05:50:00 snj Exp $
.\"
.\" Copyright (c) 2015 The NetBSD Foundation, Inc.
.\" All rights reserved.
.\"
.\" This code is derived from software contributed to The NetBSD Foundation
.\" by Kamil Rytarowski.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE NETBSD FOUNDATION, INC. AND CONTRIBUTORS
.\" ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
.\" TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
.\" PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL THE FOUNDATION OR CONTRIBUTORS
.\" BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
.\" CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
.\" SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
.\" INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
.\" CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
.\" ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
.\" POSSIBILITY OF SUCH DAMAGE.
.\"
.Dd April 15, 2017
.Dt LOCKING 9
.Os
.Sh NAME
.Nm locking
.Nd introduction to kernel synchronization and interrupt control
.Sh DESCRIPTION
The
.Nx
kernel provides several synchronization and interrupt control primitives.
This man page aims to give an overview of these interfaces and their proper
application.
Also included are basic kernel thread control primitives and a rough
overview of the
.Nx
kernel design.
.Sh KERNEL OVERVIEW
The aim of synchronization, threads and interrupt control in the kernel is:
.Bl -bullet -offset indent
.It
To control concurrent access to shared resources (critical sections).
.It
Spawn tasks from an interrupt in the thread context.
.It
Mask interrupts from threads.
.It
Scale on multiple CPU system.
.El
.Pp
There are three types of contexts in the
.Nx
kernel:
.Bl -bullet -offset indent
.It
.Em Thread context
- running processes (represented by
.Dv struct proc )
and light-weight processes (represented by
.Dv struct lwp ,
also known as kernel threads).
Code in this context can sleep, block resources and own address-space.
.It
.Em Software interrupt context
- limited by thread context.
Code in this context must be processed shortly.
These interrupts don't own any address space context.
Software interrupts are a way of deferring hardware interrupts to do more
expensive processing at a lower interrupt priority.
.It
.Em Hard interrupt context
- Code in this context must be processed as quickly as possible.
It is forbidden for a piece of code to sleep or access long-awaited resources here.
.El
.Pp
The main differences between processes and kernel threads are:
.Bl -bullet -offset indent
.It
A single process can own multiple kernel threads (LWPs).
.It
A process owns address space context to map userland address space.
.It
Processes are designed for userland executables and kernel threads for
in-kernel tasks.
The only process running in the kernel-space is
.Dv proc0
(called swapper).
.El
.Sh INTERFACES
.Ss Atomic memory operations
The
.Nm atomic_ops
family of functions provide atomic memory operations.
There are 7 classes of atomic memory operations available:
addition, logical
.Dq and ,
compare-and-swap, decrement, increment, logical
.Dq or ,
swap.
.Pp
See
.Xr atomic_ops 3 .
.Ss Condition variables
Condition variables (CVs) are used in the kernel to synchronize access to
resources that are limited (for example, memory) and to wait for pending I/O
operations to complete.
.Pp
See
.Xr condvar 9 .
.Ss Memory access barrier operations
The
.Nm membar_ops
family of functions provide memory access barrier operations necessary for
synchronization in multiprocessor execution environments that have relaxed load
and store order.
.Pp
See
.Xr membar_ops 3 .
.Ss Memory barriers
The memory barriers can be used to control the order in which memory accesses
occur, and thus the order in which those accesses become visible to other
processors.
They can be used to implement
.Dq lockless
access to data structures where the necessary barrier conditions are well
understood.
.Ss Mutual exclusion primitives
Thread-base adaptive mutexes.
These are lightweight,
exclusive locks that use threads as the focus of synchronization activity.
Adaptive mutexes typically behave like spinlocks,
but under specific conditions an attempt to acquire an already held adaptive
mutex may cause the acquiring thread to sleep.
Sleep activity occurs rarely.
Busy-waiting is typically more efficient because mutex hold times are most
often short.
In contrast to pure spinlocks,
a thread holding an adaptive mutex may be pre-empted in the kernel,
which can allow for reduced latency where soft real-time application are in use
on the system.
.Pp
See
.Xr mutex 9 .
.Ss Restartable atomic sequences
Restartable atomic sequences are user code only sequences which are guaranteed
to execute without preemption.
This property is assured by checking the set of restartable atomic sequences
registered for a process during
.Xr cpu_switchto 9 .
If a process is found to have been preempted during a restartable sequence,
then its execution is rolled-back to the start of the sequence by resetting its
program counter which is saved in its process control block (PCB).
.Pp
See
.Xr ras 9 .
.Ss Reader / writer lock primitives
Reader / writer locks (RW locks) are used in the kernel to synchronize access
to an object among LWPs (lightweight processes) and soft interrupt handlers.
In addition to the capabilities provided by mutexes,
RW locks distinguish between read (shared) and write (exclusive) access.
.Pp
See
.Xr rwlock 9 .
.Ss Functions to modify system interrupt priority level
These functions raise and lower the interrupt priority level.
They are used by kernel code to block interrupts in critical sections,
in order to protect data structures.
.Pp
See
.Xr spl 9 .
.Ss Machine-independent software interrupt framework
The software interrupt framework is designed to provide a generic software
interrupt mechanism which can be used any time a low-priority callback is
required.
It allows dynamic registration of software interrupts for loadable drivers,
protocol stacks, software interrupt prioritization, software interrupt fair
queuing and allows machine-dependent optimizations to reduce cost.
.Pp
See
.Xr softint 9 .
.Ss Functions to raise the system priority level
The
.Nm splraiseipl
function raises the system priority level to the level specified by
.Dv icookie ,
which should be a value returned by
.Xr makeiplcookie 9 .
In general, device drivers should not make use of this interface.
To ensure correct synchronization,
device drivers should use the
.Xr condvar 9 ,
.Xr mutex 9 ,
and
.Xr rwlock 9
interfaces.
.Pp
See
.Xr splraiseipl 9 .
.Ss Passive serialization mechanism
Passive serialization is a reader / writer synchronization mechanism designed
for lock-less read operations.
The read operations may happen from software interrupt at
.Dv IPL_SOFTCLOCK .
.Pp
See
.Xr pserialize 9 .
.Ss Passive reference mechanism
Passive references allow CPUs to cheaply acquire and release passive
references to a resource, which guarantee the resource will not be
destroyed until the reference is released.  Acquiring and releasing
passive references requires no interprocessor synchronization, except
when the resource is pending destruction.
.Pp
See
.Xr psref 9 .
.Ss Localcount mechanism
Localcounts are used in the kernel to implement a medium-weight reference
counting mechanism.  During normal operations, localcounts do not need
the interprocessor synchronization associated with
.Xr atomic_ops 3
atomic memory operations, and (unlike
.Xr psref 9 )
localcount references can be held across sleeps and can migrate between
CPUs.  Draining a localcount requires more expensive interprocessor
synchronization than
.Xr atomic_ops 3
(similar to
.Xr psref 9 ) .
And localcount references require eight bytes of memory per object per-CPU,
significantly more than
.Xr atomic_ops 3
and almost always more than
.Xr psref 9 .
.Pp
See
.Xr localcount 9 .
.Ss Simple do-it-in-thread-context framework
The workqueue utility routines are provided to defer work which is needed to be
processed in a thread context.
.Pp
See
.Xr workqueue 9 .
.Sh USAGE
The following table describes in which contexts the use of the
.Nx
kernel interfaces are valid.
Synchronization primitives which are available in more than one context
can be used to protect shared resources between the contexts they overlap.
.Bl -column -offset indent \
"xxxxxxxxxxxx " "xxxxxxx " "xxxxxxx " "xxxxxxx "
.It Sy interface Ta Sy thread Ta Sy softirq Ta Sy hardirq
.It Xr atomic_ops 3 Ta yes Ta yes Ta yes
.It Xr condvar 9 Ta yes Ta partly Ta no
.It Xr membar_ops 3 Ta yes Ta yes Ta yes
.It Xr mutex 9 Ta yes Ta depends Ta depends
.It Xr rwlock 9 Ta yes Ta yes Ta no
.It Xr softint 9 Ta yes Ta yes Ta yes
.It Xr spl 9 Ta yes Ta no Ta no
.It Xr splraiseipl 9 Ta yes Ta no Ta no
.It Xr pserialize 9 Ta yes Ta yes Ta no
.It Xr psref 9 Ta yes Ta yes Ta no
.It Xr localcount 9 Ta yes Ta yes Ta no
.It Xr workqueue 9 Ta yes Ta yes Ta yes
.El
.Sh SEE ALSO
.Xr atomic_ops 3 ,
.Xr membar_ops 3 ,
.Xr condvar 9 ,
.Xr mutex 9 ,
.Xr ras 9 ,
.Xr rwlock 9 ,
.Xr softint 9 ,
.Xr spl 9 ,
.Xr splraiseipl 9 ,
.Xr workqueue 9
.Sh HISTORY
Initial SMP support was introduced in
.Nx 2.0
and was designed with a giant kernel lock.
Through
.Nx 4.0 ,
the kernel used spinlocks and a per-CPU interrupt priority level (the
.Xr spl 9
system).
These mechanisms did not lend themselves well to a multiprocessor environment
supporting kernel preemption.
The use of thread based (lock) synchronization was limited and the available
synchronization primitive (lockmgr) was inefficient and slow to execute.
.Nx 5.0
introduced massive performance improvements on multicore hardware
by Andrew Doran.
This work was sponsored by The
.Nx
Foundation.
.Pp
A
.Nm
manual first appeared in
.Nx 8.0
and was inspired by the corresponding
.Nm
manuals in
.Fx
and
.Dx .
.Sh AUTHORS
.An Kamil Rytarowski Aq Mt kamil@NetBSD.org .