-
Erik Johnston authored
Fixes #15756
Erik Johnston authoredFixes #15756
How do faster joins work?
This is a work-in-progress set of notes with two goals:
- act as a reference, explaining how Synapse implements faster joins; and
- record the rationale behind our choices.
See also MSC3902.
The key idea is described by MSC3706. This allows servers to
request a lightweight response to the federation /send_join
endpoint.
This is called a faster join, also known as a partial join. In these
notes we'll usually use the word "partial" as it matches the database schema.
Overview: processing events in a partially-joined room
The response to a partial join consists of
- the requested join event
J
, - a list of the servers in the room (according to the state before
J
), - a subset of the state of the room before
J
, - the full auth chain of that state subset.
Synapse marks the room as partially joined by adding a row to the database table
partial_state_rooms
. It also marks the join event J
as "partially stated",
meaning that we have neither received nor computed the full state before/after
J
. This is done by adding a row to partial_state_events
.
DB schema
matrix=> \d partial_state_events
Table "matrix.partial_state_events"
Column │ Type │ Collation │ Nullable │ Default
══════════╪══════╪═══════════╪══════════╪═════════
room_id │ text │ │ not null │
event_id │ text │ │ not null │
matrix=> \d partial_state_rooms
Table "matrix.partial_state_rooms"
Column │ Type │ Collation │ Nullable │ Default
════════════════════════╪════════╪═══════════╪══════════╪═════════
room_id │ text │ │ not null │
device_lists_stream_id │ bigint │ │ not null │ 0
join_event_id │ text │ │ │
joined_via │ text │ │ │
matrix=> \d partial_state_rooms_servers
Table "matrix.partial_state_rooms_servers"
Column │ Type │ Collation │ Nullable │ Default
═════════════╪══════╪═══════════╪══════════╪═════════
room_id │ text │ │ not null │
server_name │ text │ │ not null │
Indices, foreign-keys and check constraints are omitted for brevity.
While partially joined to a room, Synapse receives events E
from remote
homeservers as normal, and can create events at the request of its local users.
However, we run into trouble when we enforce the checks on an event.
- Is a valid event, otherwise it is dropped. For an event to be valid, it must contain a room_id, and it must comply with the event format of that room version.
- Passes signature checks, otherwise it is dropped.
- Passes hash checks, otherwise it is redacted before being processed further.
- Passes authorization rules based on the event’s auth events, otherwise it is rejected.
- Passes authorization rules based on the state before the event, otherwise it is rejected.
- Passes authorization rules based on the current state of the room, otherwise it is “soft failed”.
We can enforce checks 1--4 without any problems.
But we cannot enforce checks 5 or 6 with complete certainty, since Synapse does
not know the full state before E
, nor that of the room.
Partial state
Instead, we make a best-effort approximation. While the room is considered partially joined, Synapse tracks the "partial state" before events. This works in a similar way as regular state:
- The partial state before
J
is that given to us by the partial join response. - The partial state before an event
E
is the resolution of the partial states after each ofE
'sprev_event
s. - If
E
is rejected or a message event, the partial state afterE
is the partial state beforeE
. - Otherwise, the partial state after
E
is the partial state beforeE
, plusE
itself.
More concisely, partial state propagates just like full state; the only
difference is that we "seed" it with an incomplete initial state.
Synapse records that we have only calculated partial state for this event with
a row in partial_state_events
.
While the room remains partially stated, check 5 on incoming events to that room becomes:
- Passes authorization rules based on the resolution between the partial state before
E
andE
's auth events. If the event fails to pass authorization rules, it is rejected.
Additionally, check 6 is deleted: no soft-failures are enforced.
While partially joined, the current partial state of the room is defined as the resolution across the partial states after all forward extremities in the room.
Remark. Events with partial state are not considered outliers.
Approximation error
Using partial state means the auth checks can fail in a few different ways1.
- We may erroneously accept an incoming event in check 5 based on partial state when it would have been rejected based on full state, or vice versa.
- This means that an event could erroneously be added to the current partial state of the room when it would not be present in the full state of the room, or vice versa.
- Additionally, we may have skipped soft-failing an event that would have been soft-failed based on full state.
(Note that the discrepancies described in the last two bullets are user-visible.)
This means that we have to be very careful when we want to lookup pieces of room state in a partially-joined room. Our approximation of the state may be incorrect or missing. But we can make some educated guesses. If
- our partial state is likely to be correct, or
- the consequences of our partial state being incorrect are minor,
then we proceed as normal, and let the resync process fix up any mistakes (see below).
When is our partial state likely to be correct?
- It's more accurate the closer we are to the partial join event. (So we should ideally complete the resync as soon as possible.)
- Non-member events: we will have received them as part of the partial join response, if they were part of the room state at that point. We may incorrectly accept or reject updates to that state (at first because we lack remote membership information; later because of compounding errors), so these can become incorrect over time.
- Local members' memberships: we are the only ones who can create join and knock events for our users. We can't be completely confident in the correctness of bans, invites and kicks from other homeservers, but the resync process should correct any mistakes.
- Remote members' memberships: we did not receive these in the /send_join response, so we have essentially no idea if these are correct or not.
In short, we deem it acceptable to trust the partial state for non-membership and local membership events. For remote membership events, we wait for the resync to complete, at which point we have the full state of the room and can proceed as normal.
Fixing the approximation with a resync
The partial-state approximation is only a temporary affair. In the background,
synapse beings a "resync" process. This is a continuous loop, starting at the
partial join event and proceeding downwards through the event graph. For each
E
seen in the room since partial join, Synapse will fetch
- the event ids in the state of the room before
E
, via/state_ids
; - the event ids in the full auth chain of
E
, included in the/state_ids
response; and - any events from the previous two bullets that Synapse hasn't persisted, via `/state.
This means Synapse has (or can compute) the full state before E
, which allows
Synapse to properly authorise or reject E
. At this point ,the event
is considered to have "full state" rather than "partial state". We record this
by removing E
from the partial_state_events
table.
[TODO: Does Synapse persist a new state group for the full state
before E
, or do we alter the (partial-)state group in-place? Are state groups
ever marked as partially-stated? ]
This scheme means it is possible for us to have accepted and sent an event to clients, only to reject it during the resync. From a client's perspective, the effect is similar to a retroactive state change due to state resolution---i.e. a "state reset".2
When all events since the join J
have been fully-stated, the room resync
process is complete. We record this by removing the room from
partial_state_rooms
.
Faster joins on workers
For the time being, the resync process happens on the master worker.
A new replication stream un_partial_stated_room
is added. Whenever a resync
completes and a partial-state room becomes fully stated, a new message is sent
into that stream containing the room ID.
Notes on specific cases
NB. The notes below are rough. Some of them are hidden under
<details>
disclosures because they have yet to be implemented in mainline Synapse.
Creating events during a partial join
When sending out messages during a partial join, we assume our partial state is accurate and proceed as normal. For this to have any hope of succeeding at all, our partial state must contain an entry for each of the (type, state key) pairs specified by the auth rules:
m.room.create
m.room.join_rules
m.room.power_levels
m.room.third_party_invite
m.room.member
The first four of these should be present in the state before J
that is given
to us in the partial join response; only membership events are omitted. In order
for us to consider the user joined, we must have their membership event. That
means the only possible omission is the target's membership in an invite, kick
or ban.
The worst possibility is that we locally invite someone who is banned according to the full state, because we lack their ban in our current partial state. The rest of the federation---at least, those who are fully joined---should correctly enforce the membership transition constraints. So any the erroneous invite should be ignored by fully-joined homeservers and resolved by the resync for partially-joined homeservers.
In more generality, there are two problems we're worrying about here:
- We might create an event that is valid under our partial state, only to later find out that is actually invalid according to the full state.
- Or: we might refuse to create an event that is invalid under our partial state, even though it would be perfectly valid under the full state.