Skip to content
GitLab
Explore
Sign in
Primary navigation
Search or go to…
Project
synapse
Manage
Activity
Members
Labels
Plan
Issues
Issue boards
Milestones
Code
Merge requests
Repository
Branches
Commits
Tags
Repository graph
Compare revisions
Build
Pipelines
Jobs
Pipeline schedules
Artifacts
Deploy
Releases
Container registry
Model registry
Monitor
Service Desk
Analyze
Value stream analytics
Contributor analytics
CI/CD analytics
Repository analytics
Model experiments
Help
Help
Support
GitLab documentation
Compare GitLab plans
Community forum
Contribute to GitLab
Provide feedback
Keyboard shortcuts
?
Snippets
Groups
Projects
Show more breadcrumbs
Maunium
synapse
Commits
5ecad4e7
Unverified
Commit
5ecad4e7
authored
3 years ago
by
Brendan Abolivier
Committed by
GitHub
3 years ago
Browse files
Options
Downloads
Patches
Plain Diff
Update the logcontext doc (#10353)
By referring to awaitables instead of deferreds.
parent
bf72d10d
Branches
Branches containing commit
Tags
Tags containing commit
No related merge requests found
Changes
2
Hide whitespace changes
Inline
Side-by-side
Showing
2 changed files
changelog.d/10353.doc
+1
-0
1 addition, 0 deletions
changelog.d/10353.doc
docs/log_contexts.md
+101
-230
101 additions, 230 deletions
docs/log_contexts.md
with
102 additions
and
230 deletions
changelog.d/10353.doc
0 → 100644
+
1
−
0
View file @
5ecad4e7
Refresh the logcontext dev documentation.
This diff is collapsed.
Click to expand it.
docs/log_contexts.md
+
101
−
230
View file @
5ecad4e7
...
...
@@ -14,12 +14,16 @@ The `synapse.logging.context` module provides a facilities for managing
the current log context (as well as providing the
`LoggingContextFilter`
class).
Deferred
s make the whole thing complicated, so this document describes
Asynchronous function
s make the whole thing complicated, so this document describes
how it all works, and how to write code which follows the rules.
## Logcontexts without Deferreds
In this document, "awaitable" refers to any object which can be
`await`
ed. In the context of
Synapse, that normally means either a coroutine or a Twisted
[
`Deferred`
](
https://twistedmatrix.com/documents/current/api/twisted.internet.defer.Deferred.html
)
.
In the absence of any Deferred voodoo, things are simple enough. As with
## Logcontexts without asynchronous code
In the absence of any asynchronous voodoo, things are simple enough. As with
any code of this nature, the rule is that our function should leave
things as it found them:
...
...
@@ -55,126 +59,109 @@ def do_request_handling():
logger
.
debug
(
"
phew
"
)
```
## Using logcontexts with
Deferred
s
## Using logcontexts with
awaitable
s
Deferreds --- and in particular,
`defer.inlineCallbacks`
--- break the
linear flow of code so that there is no longer a single entry point
where we should set the logcontext and a single exit point where we
should remove it.
Awaitables break the linear flow of code so that there is no longer a single entry point
where we should set the logcontext and a single exit point where we should remove it.
Consider the example above, where
`do_request_handling`
needs to do some
blocking operation, and returns a
deferred
:
blocking operation, and returns a
n awaitable
:
```
python
@defer.inlineCallbacks
def
handle_request
(
request_id
):
async
def
handle_request
(
request_id
):
with
context
.
LoggingContext
()
as
request_context
:
request_context
.
request
=
request_id
yield
do_request_handling
()
await
do_request_handling
()
logger
.
debug
(
"
finished
"
)
```
In the above flow:
-
The logcontext is set
-
`do_request_handling`
is called, and returns a
deferred
-
`handle_request`
yield
s the
deferred
-
The
`inlineCallbacks`
wrapper
of
`handle_request`
returns a deferr
ed
-
`do_request_handling`
is called, and returns a
n awaitable
-
`handle_request`
await
s the
awaitable
-
Execution
of
`handle_request`
is suspend
ed
So we have stopped processing the request (and will probably go on to
start processing the next), without clearing the logcontext.
To circumvent this problem, synapse code assumes that, wherever you have
a
deferred
, you will want to
yield on
it. To that end, whereever
functions return a
deferred
, we adopt the following conventions:
a
n awaitable
, you will want to
`await`
it. To that end, whereever
functions return a
waitables
, we adopt the following conventions:
**Rules for functions returning
deferred
s:**
**Rules for functions returning
awaitable
s:**
> - If the
deferred
is already complete, the function returns with the
> - If the
awaitable
is already complete, the function returns with the
> same logcontext it started with.
> - If the
deferred
is incomplete, the function clears the logcontext
> before returning; when the
deferred
completes, it restores the
> - If the
awaitable
is incomplete, the function clears the logcontext
> before returning; when the
awaitable
completes, it restores the
> logcontext before running any callbacks.
That sounds complicated, but actually it means a lot of code (including
the example above) "just works". There are two cases:
-
If
`do_request_handling`
returns a completed
deferred
, then the
-
If
`do_request_handling`
returns a completed
awaitable
, then the
logcontext will still be in place. In this case, execution will
continue immediately after the
`
yield
`
; the "finished" line will
continue immediately after the
`
await
`
; the "finished" line will
be logged against the right context, and the
`with`
block restores
the original context before we return to the caller.
-
If the returned
deferred
is incomplete,
`do_request_handling`
clears
-
If the returned
awaitable
is incomplete,
`do_request_handling`
clears
the logcontext before returning. The logcontext is therefore clear
when
`handle_request`
yields the deferred. At that point, the
`inlineCallbacks`
wrapper adds a callback to the deferred, and
returns another (incomplete) deferred to the caller, and it is safe
to begin processing the next request.
Once `do_request_handling`'s deferred completes, it will reinstate
the logcontext, before running the callback added by the
`inlineCallbacks` wrapper. That callback runs the second half of
`handle_request`, so again the "finished" line will be logged
against the right context, and the `with` block restores the
original context.
when
`handle_request`
`await`
s the awaitable.
Once `do_request_handling`'s awaitable completes, it will reinstate
the logcontext, before running the second half of `handle_request`,
so again the "finished" line will be logged against the right context,
and the `with` block restores the original context.
As an aside, it's worth noting that
`handle_request`
follows our rules
-though that only matters if the caller has its own logcontext which it
-
though that only matters if the caller has its own logcontext which it
cares about.
The following sections describe pitfalls and helpful patterns when
implementing these rules.
Always
yield
your
deferred
s
---------------------------
Always
await
your
awaitable
s
---------------------------
-
Whenever you get a deferred back from a function, you should
`yield`
on
it as soon as possible. (Returning it directly to your caller is ok too,
if you're not doing
`inlineCallbacks`
.) Do not pass go; do not do any
logging; do not call any other functions.
Whenever you get an awaitable back from a function, you should
`await`
on
it as soon as possible. Do not pass go; do not do any logging; do not
call any other functions.
```
python
@defer.inlineCallbacks
def
fun
():
async
def
fun
():
logger
.
debug
(
"
starting
"
)
yield
do_some_stuff
()
# just like this
await
do_some_stuff
()
# just like this
d
=
more_stuff
()
result
=
yield
d
# also fine, of course
coro
=
more_stuff
()
result
=
await
coro
# also fine, of course
return
result
def
nonInlineCallbacksFun
():
logger
.
debug
(
"
just a wrapper really
"
)
return
do_some_stuff
()
# this is ok too - the caller will yield on
# it anyway.
```
Provided this pattern is followed all the way back up to the callchain
to where the logcontext was set, this will make things work out ok:
provided
`do_some_stuff`
and
`more_stuff`
follow the rules above, then
so will
`fun`
(as wrapped by
`inlineCallbacks`
) and
`nonInlineCallbacksFun`
.
so will
`fun`
.
It's all too easy to forget to
`
yield
`
: for instance if we forgot that
`do_some_stuff`
returned a
deferred
, we might plough on regardless. This
It's all too easy to forget to
`
await
`
: for instance if we forgot that
`do_some_stuff`
returned a
n awaitable
, we might plough on regardless. This
leads to a mess; it will probably work itself out eventually, but not
before a load of stuff has been logged against the wrong context.
(Normally, other things will break, more obviously, if you forget to
`
yield
`
, so this tends not to be a major problem in practice.)
`
await
`
, so this tends not to be a major problem in practice.)
Of course sometimes you need to do something a bit fancier with your
Deferreds
- not all code follows the linear A-then-B-then-C pattern.
awaitable
- not all code follows the linear A-then-B-then-C pattern.
Notes on implementing more complex patterns are in later sections.
## Where you create a new
Deferred
, make it follow the rules
## Where you create a new
awaitable
, make it follow the rules
Most of the time, a
Deferred
comes from another synapse function.
Sometimes, though, we need to make up a new
Deferred
, or we get a
Deferred
back from external code. We need to make it follow our rules.
Most of the time, a
n awaitable
comes from another synapse function.
Sometimes, though, we need to make up a new
awaitable
, or we get a
n awaitable
back from external code. We need to make it follow our rules.
The easy way to do it is with a combination of
`defer.inlineCallbacks`
,
and
`context.PreserveLoggingContext`
. Suppose we want to implement
The easy way to do it is by using
`context.make_deferred_yieldable`
. Suppose we want to implement
`sleep`
, which returns a deferred which will run its callbacks after a
given number of seconds. That might look like:
...
...
@@ -186,25 +173,12 @@ def get_sleep_deferred(seconds):
return
d
```
That doesn't follow the rules, but we can fix it by
wrapp
ing it
wit
h
`
PreserveLoggingContext`
and
`yield`
ing on it
:
That doesn't follow the rules, but we can fix it by
call
ing it
throug
h
`
context.make_deferred_yieldable`
:
```
python
@defer.inlineCallbacks
def
sleep
(
seconds
):
with
PreserveLoggingContext
():
yield
get_sleep_deferred
(
seconds
)
```
This technique works equally for external functions which return
deferreds, or deferreds we have made ourselves.
You can also use
`context.make_deferred_yieldable`
, which just does the
boilerplate for you, so the above could be written:
```
python
def
sleep
(
seconds
):
return
context
.
make_deferred_yieldable
(
get_sleep_deferred
(
seconds
))
async
def
sleep
(
seconds
):
return
await
context
.
make_deferred_yieldable
(
get_sleep_deferred
(
seconds
))
```
## Fire-and-forget
...
...
@@ -213,20 +187,18 @@ Sometimes you want to fire off a chain of execution, but not wait for
its result. That might look a bit like this:
```
python
@defer.inlineCallbacks
def
do_request_handling
():
yield
foreground_operation
()
async
def
do_request_handling
():
await
foreground_operation
()
# *don't* do this
background_operation
()
logger
.
debug
(
"
Request handling complete
"
)
@defer.inlineCallbacks
def
background_operation
():
yield
first_background_step
()
async
def
background_operation
():
await
first_background_step
()
logger
.
debug
(
"
Completed first step
"
)
yield
second_background_step
()
await
second_background_step
()
logger
.
debug
(
"
Completed second step
"
)
```
...
...
@@ -235,13 +207,13 @@ The above code does a couple of steps in the background after
against the
`request_context`
logcontext, which may or may not be
desirable. There are two big problems with the above, however. The first
problem is that, if
`background_operation`
returns an incomplete
Deferred
, it will expect its caller to
`
yield
`
immediately, so will have
awaitable
, it will expect its caller to
`
await
`
immediately, so will have
cleared the logcontext. In this example, that means that 'Request
handling complete' will be logged without any context.
The second problem, which is potentially even worse, is that when the
Deferred
returned by
`background_operation`
completes, it will restore
the original logcontext. There is nothing waiting on that
Deferred
, so
awaitable
returned by
`background_operation`
completes, it will restore
the original logcontext. There is nothing waiting on that
awaitable
, so
the logcontext will leak into the reactor and possibly get attached to
some arbitrary future operation.
...
...
@@ -254,9 +226,8 @@ deferred completes will be the empty logcontext), and will restore the
current logcontext before continuing the foreground process:
```
python
@defer.inlineCallbacks
def
do_request_handling
():
yield
foreground_operation
()
async
def
do_request_handling
():
await
foreground_operation
()
# start background_operation off in the empty logcontext, to
# avoid leaking the current context into the reactor.
...
...
@@ -274,16 +245,15 @@ Obviously that option means that the operations done in
The second option is to use
`context.run_in_background`
, which wraps a
function so that it doesn't reset the logcontext even when it returns
an incomplete
deferred
, and adds a callback to the returned
deferred
to
an incomplete
awaitable
, and adds a callback to the returned
awaitable
to
reset the logcontext. In other words, it turns a function that follows
the Synapse rules about logcontexts and
Deferred
s into one which behaves
the Synapse rules about logcontexts and
awaitable
s into one which behaves
more like an external function --- the opposite operation to that
described in the previous section. It can be used like this:
```
python
@defer.inlineCallbacks
def
do_request_handling
():
yield
foreground_operation
()
async
def
do_request_handling
():
await
foreground_operation
()
context
.
run_in_background
(
background_operation
)
...
...
@@ -294,152 +264,53 @@ def do_request_handling():
## Passing synapse deferreds into third-party functions
A typical example of this is where we want to collect together two or
more
deferred
via
`defer.gatherResults`
:
more
awaitables
via
`defer.gatherResults`
:
```
python
d
1
=
operation1
()
d
2
=
operation2
()
d
3
=
defer
.
gatherResults
([
d
1
,
d
2
])
a
1
=
operation1
()
a
2
=
operation2
()
a
3
=
defer
.
gatherResults
([
a
1
,
a
2
])
```
This is really a variation of the fire-and-forget problem above, in that
we are firing off
`
d
1`
and
`
d
2`
without
yield
ing on them. The difference
we are firing off
`
a
1`
and
`
a
2`
without
await
ing on them. The difference
is that we now have third-party code attached to their callbacks. Anyway
either technique given in the
[
Fire-and-forget
](
#fire-and-forget
)
section will work.
Of course, the new
Deferred
returned by
`gather
Results
`
needs to be
Of course, the new
awaitable
returned by
`gather`
needs to be
wrapped in order to make it follow the logcontext rules before we can
yield it, as described in
[
Where you create a new
Deferred
, make it
yield it, as described in
[
Where you create a new
awaitable
, make it
follow the
rules
](
#where-you-create-a-new-
deferred
-make-it-follow-the-rules
)
.
rules
](
#where-you-create-a-new-
awaitable
-make-it-follow-the-rules
)
.
So, option one: reset the logcontext before starting the operations to
be gathered:
```
python
@defer.inlineCallbacks
def
do_request_handling
():
async
def
do_request_handling
():
with
PreserveLoggingContext
():
d
1
=
operation1
()
d
2
=
operation2
()
result
=
yield
defer
.
gatherResults
([
d
1
,
d
2
])
a
1
=
operation1
()
a
2
=
operation2
()
result
=
await
defer
.
gatherResults
([
a
1
,
a
2
])
```
In this case particularly, though, option two, of using
`context.
preserve_fn
`
almost certainly makes more sense, so that
`context.
run_in_background
`
almost certainly makes more sense, so that
`operation1`
and
`operation2`
are both logged against the original
logcontext. This looks like:
```
python
@defer.inlineCallbacks
def
do_request_handling
():
d1
=
context
.
preserve_fn
(
operation1
)()
d2
=
context
.
preserve_fn
(
operation2
)()
async
def
do_request_handling
():
a1
=
context
.
run_in_background
(
operation1
)
a2
=
context
.
run_in_background
(
operation2
)
with
PreserveLoggingContext
():
result
=
yield
defer
.
gatherResults
([
d1
,
d2
])
result
=
await
make_deferred_yieldable
(
defer
.
gatherResults
([
a1
,
a2
]))
```
## Was all this really necessary?
The conventions used work fine for a linear flow where everything
happens in series via
`defer.inlineCallbacks`
and
`yield`
, but are
certainly tricky to follow for any more exotic flows. It's hard not to
wonder if we could have done something else.
We're not going to rewrite Synapse now, so the following is entirely of
academic interest, but I'd like to record some thoughts on an
alternative approach.
I briefly prototyped some code following an alternative set of rules. I
think it would work, but I certainly didn't get as far as thinking how
it would interact with concepts as complicated as the cache descriptors.
My alternative rules were:
-
functions always preserve the logcontext of their caller, whether or
not they are returning a Deferred.
-
Deferreds returned by synapse functions run their callbacks in the
same context as the function was orignally called in.
The main point of this scheme is that everywhere that sets the
logcontext is responsible for clearing it before returning control to
the reactor.
So, for example, if you were the function which started a
`with LoggingContext`
block, you wouldn't
`yield`
within it --- instead
you'd start off the background process, and then leave the
`with`
block
to wait for it:
```
python
def
handle_request
(
request_id
):
with
context
.
LoggingContext
()
as
request_context
:
request_context
.
request
=
request_id
d
=
do_request_handling
()
def
cb
(
r
):
logger
.
debug
(
"
finished
"
)
d
.
addCallback
(
cb
)
return
d
```
(in general, mixing
`with LoggingContext`
blocks and
`defer.inlineCallbacks`
in the same function leads to slighly
counter-intuitive code, under this scheme).
Because we leave the original
`with`
block as soon as the Deferred is
returned (as opposed to waiting for it to be resolved, as we do today),
the logcontext is cleared before control passes back to the reactor; so
if there is some code within
`do_request_handling`
which needs to wait
for a Deferred to complete, there is no need for it to worry about
clearing the logcontext before doing so:
```
python
def
handle_request
():
r
=
do_some_stuff
()
r
.
addCallback
(
do_some_more_stuff
)
return
r
```
--- and provided
`do_some_stuff`
follows the rules of returning a
Deferred which runs its callbacks in the original logcontext, all is
happy.
The business of a Deferred which runs its callbacks in the original
logcontext isn't hard to achieve --- we have it today, in the shape of
`context._PreservingContextDeferred`
:
```
python
def
do_some_stuff
():
deferred
=
do_some_io
()
pcd
=
_PreservingContextDeferred
(
LoggingContext
.
current_context
())
deferred
.
chainDeferred
(
pcd
)
return
pcd
```
It turns out that, thanks to the way that Deferreds chain together, we
automatically get the property of a context-preserving deferred with
`defer.inlineCallbacks`
, provided the final Defered the function
`yields`
on has that property. So we can just write:
```
python
@defer.inlineCallbacks
def
handle_request
():
yield
do_some_stuff
()
yield
do_some_more_stuff
()
```
To conclude: I think this scheme would have worked equally well, with
less danger of messing it up, and probably made some more esoteric code
easier to write. But again --- changing the conventions of the entire
Synapse codebase is not a sensible option for the marginal improvement
offered.
## A note on garbage-collection of Deferred chains
## A note on garbage-collection of awaitable chains
It turns out that our logcontext rules do not play nicely with
Deferred
It turns out that our logcontext rules do not play nicely with
awaitable
chains which get orphaned and garbage-collected.
Imagine we have some code that looks like this:
...
...
@@ -451,13 +322,12 @@ def on_something_interesting():
for
d
in
listener_queue
:
d
.
callback
(
"
foo
"
)
@defer.inlineCallbacks
def
await_something_interesting
():
new_deferred
=
defer
.
Deferred
()
listener_queue
.
append
(
new_deferred
)
async
def
await_something_interesting
():
new_awaitable
=
defer
.
Deferred
()
listener_queue
.
append
(
new_awaitable
)
with
PreserveLoggingContext
():
yield
new_deferred
await
new_awaitable
```
Obviously, the idea here is that we have a bunch of things which are
...
...
@@ -476,18 +346,19 @@ def reset_listener_queue():
listener_queue
.
clear
()
```
So, both ends of the deferred chain have now dropped their references,
and the deferred chain is now orphaned, and will be garbage-collected at
some point. Note that
`await_something_interesting`
is a generator
function, and when Python garbage-collects generator functions, it gives
them a chance to clean up by making the
`yield`
raise a
`GeneratorExit`
So, both ends of the awaitable chain have now dropped their references,
and the awaitable chain is now orphaned, and will be garbage-collected at
some point. Note that
`await_something_interesting`
is a coroutine,
which Python implements as a generator function. When Python
garbage-collects generator functions, it gives them a chance to
clean up by making the
`async`
(or
`yield`
) raise a
`GeneratorExit`
exception. In our case, that means that the
`__exit__`
handler of
`PreserveLoggingContext`
will carefully restore the request context, but
there is now nothing waiting for its return, so the request context is
never cleared.
To reiterate, this problem only arises when
*both*
ends of a
deferred
chain are dropped. Dropping the the reference to a
deferred
you're
supposed to be
call
ing is
probably
bad practice, so this doesn't
To reiterate, this problem only arises when
*both*
ends of a
awaitable
chain are dropped. Dropping the the reference to a
n awaitable
you're
supposed to be
await
ing is bad practice, so this doesn't
actually happen too much. Unfortunately, when it does happen, it will
lead to leaked logcontexts which are incredibly hard to track down.
This diff is collapsed.
Click to expand it.
Preview
0%
Loading
Try again
or
attach a new file
.
Cancel
You are about to add
0
people
to the discussion. Proceed with caution.
Finish editing this message first!
Save comment
Cancel
Please
register
or
sign in
to comment