This is eventfd-based synchronization, or 'esync' for short. Turn it on with
WINEESYNC=1 (note that it checks for the variable's presence, not its value); debug it
with +esync.
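
For example, a typical invocation might look like this (the program name is
just a placeholder):

    WINEESYNC=1 WINEDEBUG=+esync wine program.exe
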
The aim is to execute all synchronization operations in "user-space", that is,
without going through wineserver. We do this using Linux's eventfd
facility. The main impetus for using eventfd is that we can poll multiple
objects at once; in particular, we can't do this with futexes, pthread
semaphores, or the like. The only way I know of to wait on any of multiple
objects is to use select/poll/epoll to wait on multiple fds, and eventfd gives
us those fds in a quite usable way.
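
As a minimal illustration of why this matters (ordinary Linux C, not Wine
code): each object becomes a file descriptor, so waiting on any of several
objects is a single poll() call.

    #include <sys/eventfd.h>
    #include <poll.h>
    #include <unistd.h>
    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        /* Each eventfd stands in for one waitable object. */
        int obj1 = eventfd(0, EFD_CLOEXEC | EFD_NONBLOCK);
        int obj2 = eventfd(0, EFD_CLOEXEC | EFD_NONBLOCK);
        struct pollfd fds[2] = { { obj1, POLLIN, 0 }, { obj2, POLLIN, 0 } };
        uint64_t one = 1;

        write(obj2, &one, sizeof(one));   /* "signal" the second object */

        if (poll(fds, 2, -1) > 0)         /* wait for *any* of them */
        {
            if (fds[0].revents & POLLIN) printf("object 1 signaled\n");
            if (fds[1].revents & POLLIN) printf("object 2 signaled\n");
        }
        close(obj2);
        close(obj1);
        return 0;
    }
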
Whenever a semaphore, event, or mutex is created, we have the server create an
'esync' primitive instead of a traditional server-side event/semaphore/mutex.
These live in esync.c and are very slim objects; in fact, they don't even know
what type of primitive they are. The server is involved at all only because we
still need a way of creating named objects, passing handles to another
process, etc.

The server creates an eventfd file descriptor with the requested parameters
and passes it back to ntdll. ntdll creates an object of the appropriate type,
then caches it in a table. This table is copied almost wholesale from the fd
cache code in server.c.
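
Conceptually the cache is just a map from handle to fd (plus the object type);
something like the following hypothetical entry, with names that are
illustrative rather than ntdll's actual identifiers:

    struct esync_cache_entry
    {
        unsigned int handle;  /* NT handle value the object was created as */
        int          type;    /* semaphore, auto/manual event, mutex, ... */
        int          fd;      /* the eventfd the server passed back to us */
    };
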
Specific operations follow quite straightforwardly from eventfd (a code sketch
follows this list):
* To release an object, or set an event, we simply write() to it.
* An object is signalled if read() succeeds on it. Notably, we create all
eventfd descriptors with O_NONBLOCK, so that we can atomically check if an
object is signalled and grab it if it is. This also lets us reset events.
* For objects whose state should not be reset upon waiting—e.g. manual-reset
events—we simply check for the POLLIN flag instead of reading.
* Semaphores are handled by the EFD_SEMAPHORE flag. This matches up quite well
(although with some difficulties; see below).
* Mutexes store their owner thread locally. This isn't reliable information if
a different process's thread owns the mutex, but this doesn't matter—a
thread should only care whether it owns the mutex, so it knows whether to
try waiting on it or simply to increase the recursion count.
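
Put together, the operations above boil down to a few lines each. This is a
hedged sketch in plain C rather than the actual esync.c code; 'fd' is assumed
to be an eventfd created with EFD_NONBLOCK (plus EFD_SEMAPHORE for
semaphores), and the helper names are illustrative:

    #include <sys/eventfd.h>
    #include <poll.h>
    #include <unistd.h>
    #include <stdint.h>

    /* Release an object / set an event: bump the counter. */
    static void esync_signal(int fd)
    {
        uint64_t value = 1;
        write(fd, &value, sizeof(value));
    }

    /* Atomically check-and-consume: because the fd is nonblocking, read()
     * fails with EAGAIN instead of blocking when the object isn't
     * signaled. This is also how auto-reset events get reset. */
    static int esync_try_grab(int fd)
    {
        uint64_t value;
        return read(fd, &value, sizeof(value)) == sizeof(value);
    }

    /* Manual-reset case: peek with poll() so the state isn't consumed. */
    static int esync_is_signaled(int fd)
    {
        struct pollfd pfd = { fd, POLLIN, 0 };
        return poll(&pfd, 1, 0) > 0 && (pfd.revents & POLLIN);
    }
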
The interesting part about esync is that (almost) all waits happen in ntdll,
including those on server-bound objects. The idea here is that on the server
side, for any waitable object, we create an eventfd file descriptor (not an
esync primitive), and then pass it to ntdll if the program tries to wait on
it. These are cached too, so only the first wait will require a round trip to
the server. Then the server signals the file descriptor as appropriate, and
thereby wakes up the client. So far this is implemented for processes,
threads, message queues (difficult; see below), and device managers (necessary
for drivers to work). All of these are necessarily server-bound, so we
wouldn't really gain anything by signalling on the client side instead. Of
course, except possibly for message queues, it's not likely that any program
(cutting-edge D3D game or not) is going to be causing a great wineserver load
by waiting on any of these objects; the motivation was rather to provide a way
to wait on ntdll-bound and server-bound objects at the same time.

Some cases are still passed to the server, and there's probably no reason not
to keep them that way. Those that I noticed while testing include: async
objects, which are internal to the file APIs and never exposed to userspace;
startup_info objects, which are internal to the loader and signalled when a
process starts; and keyed events, which are exposed through an ntdll API
(although not through kernel32) but can't be mixed with other objects (you
have to use NtWaitForKeyedEvent()). Other cases include named pipes, debug
events, sockets, and timers. It's unlikely we'll want to optimize debug events
or sockets (or any of the other, rather rare, objects), but it is possible
we'll want to optimize named pipes or timers.

There were two sorts of complications in working out the above. The first one
was events. The trouble is that (1) the server actually creates some events by
itself and (2) the server sometimes manipulates events passed by the
client. Resolving the first case was easy enough, and merely entailed creating
eventfd descriptors for the events the same way as for processes and threads
(note that we don't really lose anything this way; the events include
"LowMemoryCondition" and the event that signals system processes to shut
down). For the second case I basically had to hook the server-side event
functions to redirect to esync versions if the event was actually an esync
primitive.

The second complication was message queues. The difficulty here is that X11
signals events by writing into a socket (the display connection), and as a
result wineserver has to poll on that descriptor. In theory we could just
let wineserver do so and then signal us as appropriate, except that wineserver
only polls on the pipe when the thread is waiting for events (otherwise we'd
get e.g. keyboard input while the thread is doing something else, and spin
forever trying to wake up a thread that doesn't care). The obvious solution is
just to poll on that fd ourselves, and that's what I did—it's just that
getting the fd from wineserver was kind of ugly, and the code for waiting was
also kind of ugly basically because we have to wait on both X11's fd and the
"normal" process/thread-style wineserver fd that we use to signal sent
messages. The upshot of the whole thing is that races are basically
impossible, since a thread can only wait on its own queue.
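
The shape of that wait, stripped of the ugly parts, is just a two-entry
poll(); the fd names here are hypothetical:

    #include <poll.h>

    /* Wait until either the X11 connection fd or the per-thread server
     * queue fd (the one that signals sent messages) becomes readable. */
    static int wait_message_queue(int x11_fd, int queue_fd, int timeout_ms)
    {
        struct pollfd fds[2] = { { x11_fd,   POLLIN, 0 },
                                 { queue_fd, POLLIN, 0 } };
        return poll(fds, 2, timeout_ms);
    }
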
I had kind of figured that APCs just wouldn't work, but then poll() spat EINTR
at me and I realized that this wasn't necessarily true. It seems that the
server will suspend a thread when trying to deliver a system APC to a thread
that's not waiting, and since the server has no idea that we're waiting it
just suspends us. This of course interrupts poll(), which complains at us, and
it turns out that just returning STATUS_USER_APC in that case is enough to
make rpcrt4 happy.
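
In code, the fix amounts to little more than this (a sketch with an
illustrative helper name; the NTSTATUS value is the real one):

    #include <poll.h>
    #include <errno.h>

    #define STATUS_USER_APC 0x000000C0

    static int do_poll(struct pollfd *fds, int count, int timeout)
    {
        int ret = poll(fds, count, timeout);
        /* The server suspended us to deliver a system APC; poll() reports
         * that as EINTR, and returning STATUS_USER_APC satisfies rpcrt4. */
        if (ret == -1 && errno == EINTR)
            return STATUS_USER_APC;
        return ret;
    }
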
There are a couple things that this infrastructure can't handle, although
surprisingly there aren't that many. In particular:
* We can't return the previous count on a semaphore, since we have no way to
query the count on a semaphore through eventfd. Currently the code lies and
returns 1 every time. We can make this work (in a single process, or [if
necessary] in multiple processes through shared memory) by keeping a count
locally. We can't guarantee that it's the exact count at the moment the
semaphore was released, but I guess any program that runs into that race
shouldn't be depending on that fact anyway.
* Similarly, we can't enforce the maximum count on a semaphore, since we have
no way to get the current count and subsequently compare it with the
maximum.
* We can't use NtQueryMutant to get the mutant's owner or count if it lives in
a different process. If necessary we can use shared memory to make this
work, I guess, but see below.
* User APCs don't work. However, it's not impossible to make them work; in
particular I think this could be relatively easily implemented by waiting on
another internal file descriptor when we execute an alertable wait.
* Implementing wait-all, i.e. WaitForMultipleObjects(..., TRUE, ...), is not
exactly possible the way we'd like it to be possible. In theory that
function should wait until it knows all objects are available, then grab
them all at once atomically. The server (like the kernel) can do this
because the server is single-threaded and can't race with itself. We can't
do this in ntdll, though. The approach I've taken is laid out in great
detail in the relevant patch, but as a quick summary: we poll on each object
until it's signaled (but don't grab it), check them all again, and if
they're all signaled we try to grab them all at once in a tight loop; if
we fail on any of them we reset the count on whatever we shouldn't have
consumed (see the sketch after this list). Such a blip would necessarily be
very quick.
* The whole patchset only works on Linux, where eventfd is available. However,
it should be possible to make it work on a Mac, since eventfd is just a
quicker, easier way to use pipes (i.e. instead of writing 1 to the fd you'd
write 1 byte; instead of reading a 64-bit value from the fd you'd read as
many bytes as you can carry, which is admittedly less than 2**64 but
can probably be something reasonable). It's also possible, although I
haven't yet looked, to use some different kind of synchronization
primitive, but pipes would be easiest to tack onto this framework.
* We might hit the maximum number of open fd's. On my system the soft limit is
1024 and the hard limit is 1048576. I'm inclined to hope this won't be an
issue, since a hypothetical Linux port of any application might just as well
use the same number of eventfds.
* PulseEvent() can't work the way it's supposed to work. Fortunately it's rare
and deprecated. It's also explicitly mentioned on MSDN that a thread can
miss the notification for a kernel APC, so in a sense we're not necessarily
doing anything wrong.
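
To make the wait-all strategy above concrete, here is a hedged sketch of the
grab-and-roll-back step; the helper is hypothetical, not the actual patch:

    #include <unistd.h>
    #include <stdint.h>

    /* Try to consume every object at once; on failure, put back whatever
     * we already took so other waiters aren't starved, and let the caller
     * go back to polling. The window where we hold a partial set is the
     * "blip" mentioned above. 'consumed' is a caller-provided scratch
     * array with room for 'count' entries. */
    static int try_grab_all(const int *fds, uint64_t *consumed, int count)
    {
        for (int i = 0; i < count; i++)
        {
            if (read(fds[i], &consumed[i], sizeof(consumed[i])) < 0)
            {
                for (int j = 0; j < i; j++)
                    write(fds[j], &consumed[j], sizeof(consumed[j]));
                return 0;
            }
        }
        return 1;  /* grabbed all of them */
    }
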
There are some things that are perfectly implementable but that I just haven't
done yet:
* NtOpen* (aka Open*). This is just a matter of adding another open_esync
request analogous to those for other server primitives.
* NtQuery*. This can be done to some degree (the difficulties are outlined
above). That said, these APIs aren't exposed through kernel32 in any way, so
I doubt anyone is going to be using them.
* SignalObjectAndWait(). The server combines this into a single operation, but
according to MSDN it doesn't need to be atomic, so we can just signal the
appropriate object and wait, and woe betide anyone who gets in the way of
those two operations.
* Other synchronizable server primitives. It's unlikely we'll need any of
these, except perhaps named pipes (which would honestly be rather difficult)
and (maybe) timers.

This patchset was inspired by Daniel Santos' "hybrid synchronization"
patchset. My idea was to create a framework whereby even contended waits could
be executed in userspace, eliminating a lot of the complexity that his
synchronization primitives used. I do however owe some significant gratitude
toward him for setting me on the right path.

I've tried to maximize code separation, both to make any potential rebases
easier and to ensure that esync is only active when configured. All code in
existing source files is guarded with "if (do_esync())", and generally that
condition is followed by "return esync_version_of_this_method(...);", where
the latter lives in esync.c and is declared in esync.h. I've also tried to
make the patchset very clear and readable—to write it as if I were going to
submit it upstream. (Some intermediate patches do break things, which Wine is
generally against, but I think it's for the better in this case.) I have cut
some corners, though; there is some error checking missing, or implicit
assumptions that the program is behaving correctly.
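
Concretely, the guard pattern looks something like this (illustrative
function names, assuming Wine's usual headers for NTSTATUS and friends):

    NTSTATUS NtReleaseSemaphore( HANDLE handle, ULONG count, ULONG *previous )
    {
        if (do_esync())
            return esync_release_semaphore( handle, count, previous );

        /* ...the original, server-based implementation follows... */
    }
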
I've tried to be careful about races. There are a lot of comments whose
purpose is basically to assure me that races are impossible. In most cases we
don't have to worry about races since all of the low-level synchronization is
done by the kernel.

Anyway, yeah, this is esync. Use it if you like.

--Zebediah Figura