Status: Ready for Engineering
Affects Version/s: PUP 4.9.4
Fix Version/s: None
- Puppet Server will not allow multiple simultaneous writes to the same filebucket entry
QA Risk Assessment:Needs Assessment
Currently, the FileBucket write code attempts to prevent multiple simultaneous writes to the same bucket entry by obtaining an exclusive lock (flock / LOCK_EX) on the entry's paths file before writing, and only attempting to write to the corresponding contents file after the paths file has been locked.
This logic was added in https://github.com/puppetlabs/puppet/commit/8b0a4ec9cecb0c2ff0e79cd26966d279fecaa9c4.
In writing the new method we discovered that the
exclusive lock ability is lost if the update is done using a tempfile
and atomic replace of the file that is locked. When that happens there
is an opportunity for two processes to attain locks and modify the file.
The solution to this was to only allow updates-in-place.
The specific semantics of file locking on *nix are muddled, at best.
Ruby uses flock, not fnctl which is the "default" implementation on linux.
Locks created by flock() are associated with an open file description
(see open(2)). This means that duplicate file descriptors (created
by, for example, fork(2) or dup(2)) refer to the same lock, and this
lock may be modified or released using any of these file descriptors.
Furthermore, the lock is released either by an explicit LOCK_UN
operation on any of these duplicate file descriptors, or when all
such file descriptors have been closed.
If a process uses open(2) (or similar) to obtain more than one file
descriptor for the same file, these file descriptors are treated
independently by flock(). An attempt to lock the file using one of
these file descriptors may be denied by a lock that the calling
process has already placed via another file descriptor.
This means since Puppet Server spawns ruby instances as threads of the same process, and the threads within a process share file descriptors, our ruby instances are sharing the same lock.
It gets worse... the behavior of flock varies (apparently dramatically) on systems, up to and including simply being an emulation of fnctl:
Since kernel 2.0, flock() is implemented as a system call in its own
right rather than being emulated in the GNU C library as a call to
fcntl(2). With this implementation, there is no interaction between
the types of lock placed by flock() and fcntl(2), and flock() does
not detect deadlock. (Note, however, that on some systems, such as
the modern BSDs, flock() and fcntl(2) locks do interact with one
flock() and fcntl(2) locks have different semantics with respect to
forked processes and dup(2). On systems that implement flock() using
fcntl(2), the semantics of flock() will be different from those
described in this manual page.
fnctl uses record locks, which are explicitly not thread safe:
On Record locks:
- The threads in a process share locks. In other words, a
multithreaded program can't use record locking to ensure that
threads don't simultaneously access the same region of a file.
- Investigate/confirm that FileBucket write atomicity is broken when running under Puppet Server
- Remediate this situation if so