(#5849) Convert rst to markdown (#6040)

Converting some of the rst documentation to markdown. Attempted to preserve whitespace and line breaks to minimize cosmetic change.

(#5849) Convert rst to markdown (#6040)
379d2a8c · dstipp · Richard van der Hoff · 70c52821 · 379d2a8c · 379d2a8c
Commit 379d2a8c authored 5 years ago by dstipp Committed by Richard van der Hoff 5 years ago
--- a/CONTRIBUTING.rst
+++ b/CONTRIBUTING.rst
@@ -56,7 +56,7 @@ Code style

 All Matrix projects have a well-defined code-style - and sometimes we've even
 got as far as documenting it... For instance, synapse's code style doc lives
-at https://github.com/matrix-org/synapse/tree/master/docs/code_style.rst.
+at https://github.com/matrix-org/synapse/tree/master/docs/code_style.md.

 Please ensure your changes match the cosmetic style of the existing project,
 and **never** mix cosmetic and functional changes in the same commit, as it

--- a/INSTALL.md
+++ b/INSTALL.md
@@ -373,7 +373,7 @@ is suitable for local testing, but for any practical use, you will either need
 to enable a reverse proxy, or configure Synapse to expose an HTTPS port.

 For information on using a reverse proxy, see
-[docs/reverse_proxy.rst](docs/reverse_proxy.rst).
+[docs/reverse_proxy.md](docs/reverse_proxy.md).

 To configure Synapse to expose an HTTPS port, you will need to edit
 `homeserver.yaml`, as follows:
@@ -446,7 +446,7 @@ on your server even if `enable_registration` is `false`.
 ## Setting up a TURN server

 For reliable VoIP calls to be routed via this homeserver, you MUST configure
-a TURN server.  See [docs/turn-howto.rst](docs/turn-howto.rst) for details.
+a TURN server.  See [docs/turn-howto.md](docs/turn-howto.md) for details.

 ## URL previews


--- a/README.rst
+++ b/README.rst
@@ -115,7 +115,7 @@ Registering a new user from a client

 By default, registration of new users via Matrix clients is disabled. To enable
 it, specify ``enable_registration: true`` in ``homeserver.yaml``. (It is then
-recommended to also set up CAPTCHA - see `<docs/CAPTCHA_SETUP.rst>`_.)
+recommended to also set up CAPTCHA - see `<docs/CAPTCHA_SETUP.md>`_.)

 Once ``enable_registration`` is set to ``true``, it is possible to register a
 user via `riot.im <https://riot.im/app/#/register>`_ or other Matrix clients.
@@ -186,7 +186,7 @@ Almost all installations should opt to use PostreSQL. Advantages include:
  synapse itself.

 For information on how to install and use PostgreSQL, please see
-`docs/postgres.rst <docs/postgres.rst>`_.
+`docs/postgres.md <docs/postgres.md>`_.

 .. _reverse-proxy:

@@ -201,7 +201,7 @@ It is recommended to put a reverse proxy such as
 doing so is that it means that you can expose the default https port (443) to
 Matrix clients without needing to run Synapse with root privileges.

-For information on configuring one, see `<docs/reverse_proxy.rst>`_.
+For information on configuring one, see `<docs/reverse_proxy.md>`_.

 Identity Servers
 ================

--- a/UPGRADE.rst
+++ b/UPGRADE.rst
@@ -103,7 +103,7 @@ Upgrading to v1.2.0
 ===================

 Some counter metrics have been renamed, with the old names deprecated. See
-`the metrics documentation <docs/metrics-howto.rst#renaming-of-metrics--deprecation-of-old-names-in-12>`_
+`the metrics documentation <docs/metrics-howto.md#renaming-of-metrics--deprecation-of-old-names-in-12>`_
 for details.

 Upgrading to v1.1.0

--- a/changelog.d/5849.doc
+++ b/changelog.d/5849.doc
+Convert documentation to markdown (from rst)
--- a/docs/CAPTCHA_SETUP.rst
+++ b/docs/CAPTCHA_SETUP.rst
+# Overview
 Captcha can be enabled for this home server. This file explains how to do that.
 The captcha mechanism used is Google's ReCaptcha. This requires API keys from Google.

-Getting keys
------------
+## Getting keys
+
 Requires a public/private key pair from:

-https://developers.google.com/recaptcha/
+<https://developers.google.com/recaptcha/>

 Must be a reCAPTCHA v2 key using the "I'm not a robot" Checkbox option

-Setting ReCaptcha Keys
----------------------
+## Setting ReCaptcha Keys
+
 The keys are a config option on the home server config. If they are not
-visible, you can generate them via --generate-config. Set the following value::
+visible, you can generate them via `--generate-config`. Set the following value:
+
+    recaptcha_public_key: YOUR_PUBLIC_KEY
+    recaptcha_private_key: YOUR_PRIVATE_KEY

-  recaptcha_public_key: YOUR_PUBLIC_KEY
-  recaptcha_private_key: YOUR_PRIVATE_KEY
+In addition, you MUST enable captchas via:

-In addition, you MUST enable captchas via::
+    enable_registration_captcha: true

-  enable_registration_captcha: true
+## Configuring IP used for auth

-Configuring IP used for auth
----------------------------
 The ReCaptcha API requires that the IP address of the user who solved the
 captcha is sent. If the client is connecting through a proxy or load balancer,
-it may be required to use the X-Forwarded-For (XFF) header instead of the origin
-IP address. This can be configured using the x_forwarded directive in the
+it may be required to use the `X-Forwarded-For` (XFF) header instead of the origin
+IP address. This can be configured using the `x_forwarded` directive in the
 listeners section of the homeserver.yaml configuration file.
--- a/docs/MSC1711_certificates_FAQ.md
+++ b/docs/MSC1711_certificates_FAQ.md
@@ -147,7 +147,7 @@ your domain, you can simply route all traffic through the reverse proxy by
 updating the SRV record appropriately (or removing it, if the proxy listens on
 8448).

-See [reverse_proxy.rst](reverse_proxy.rst) for information on setting up a
+See [reverse_proxy.md](reverse_proxy.md) for information on setting up a
 reverse proxy.

 #### Option 3: add a .well-known file to delegate your matrix traffic
@@ -319,7 +319,7 @@ We no longer actively recommend against using a reverse proxy. Many admins will
 find it easier to direct federation traffic to a reverse proxy and manage their
 own TLS certificates, and this is a supported configuration.

-See [reverse_proxy.rst](reverse_proxy.rst) for information on setting up a
+See [reverse_proxy.md](reverse_proxy.md) for information on setting up a
 reverse proxy.

 ### Do I still need to give my TLS certificates to Synapse if I am using a reverse proxy?

--- a/docs/README.md
+++ b/docs/README.md
+# Synapse Documentation
+
+This directory contains documentation specific to the `synapse` homeserver.
+
+All matrix-generic documentation now lives in its own project, located at [matrix-org/matrix-doc](https://github.com/matrix-org/matrix-doc)
+
+(Note:  some items here may be moved to [matrix-org/matrix-doc](https://github.com/matrix-org/matrix-doc) at some point in the future.)
--- a/docs/README.rst
+++ b/docs/README.rst
-All matrix-generic documentation now lives in its own project at
-
-github.com/matrix-org/matrix-doc.git
-
-Only Synapse implementation-specific documentation lives here now
-(together with some older stuff will be shortly migrated over to matrix-doc)
--- a/docs/ancient_architecture_notes.md
+++ b/docs/ancient_architecture_notes.md
+> **Warning**
+>  These architecture notes are spectacularly old, and date back
+> to when Synapse was just federation code in isolation. This should be
+> merged into the main spec.
+
+# Server to Server
+
+## Server to Server Stack
+
+To use the server to server stack, home servers should only need to
+interact with the Messaging layer.
+
+The server to server side of things is designed into 4 distinct layers:
+
+1.  Messaging Layer
+2.  Pdu Layer
+3.  Transaction Layer
+4.  Transport Layer
+
+Where the bottom (the transport layer) is what talks to the internet via
+HTTP, and the top (the messaging layer) talks to the rest of the Home
+Server with a domain specific API.
+
+1. **Messaging Layer**
+
+    This is what the rest of the Home Server hits to send messages, join rooms,
+    etc. It also allows you to register callbacks for when it get's notified by
+    lower levels that e.g. a new message has been received.
+
+    It is responsible for serializing requests to send to the data
+    layer, and to parse requests received from the data layer.
+
+2. **PDU Layer**
+
+    This layer handles:
+
+		- duplicate `pdu_id`'s - i.e., it makes sure we ignore them.
+		- responding to requests for a given `pdu_id`
+		- responding to requests for all metadata for a given context (i.e. room)
+		- handling incoming backfill requests
+
+		So it has to parse incoming messages to discover which are metadata and
+    which aren't, and has to correctly clobber existing metadata where
+    appropriate.
+
+    For incoming PDUs, it has to check the PDUs it references to see
+    if we have missed any. If we have go and ask someone (another
+    home server) for it.
+
+3. **Transaction Layer**
+
+		This layer makes incoming requests idempotent. i.e., it stores
+		which transaction id's we have seen and what our response were.
+		If we have already seen a message with the given transaction id,
+		we do not notify higher levels but simply respond with the
+		previous response.
+
+		`transaction_id` is from "`GET /send/<tx_id>/`"
+
+		It's also responsible for batching PDUs into single transaction for
+		sending to remote destinations, so that we only ever have one
+		transaction in flight to a given destination at any one time.
+
+		This is also responsible for answering requests for things after a
+		given set of transactions, i.e., ask for everything after 'ver' X.
+
+4. **Transport Layer**
+
+		This is responsible for starting a HTTP server and hitting the
+		correct callbacks on the Transaction layer, as well as sending
+		both data and requests for data.
+
+## Persistence
+
+We persist things in a single sqlite3 database. All database queries get
+run on a separate, dedicated thread. This that we only ever have one
+query running at a time, making it a lot easier to do things in a safe
+manner.
+
+The queries are located in the `synapse.persistence.transactions` module,
+and the table information in the `synapse.persistence.tables` module.
--- a/docs/ancient_architecture_notes.rst
+++ b/docs/ancient_architecture_notes.rst
-.. WARNING::
-  These architecture notes are spectacularly old, and date back to when Synapse 
-  was just federation code in isolation.  This should be merged into the main
-  spec.
-  
-
-= Server to Server =
-
-== Server to Server Stack ==
-
-To use the server to server stack, home servers should only need to interact with the Messaging layer.
-
-The server to server side of things is designed into 4 distinct layers:
-
-    1. Messaging Layer
-    2. Pdu Layer
-    3. Transaction Layer
-    4. Transport Layer
-
-Where the bottom (the transport layer) is what talks to the internet via HTTP, and the top (the messaging layer) talks to the rest of the Home Server with a domain specific API.
-
-1. Messaging Layer
-    This is what the rest of the Home Server hits to send messages, join rooms, etc. It also allows you to register callbacks for when it get's notified by lower levels that e.g. a new message has been received.
-
-    It is responsible for serializing requests to send to the data layer, and to parse requests received from the data layer.
-
-
-2. PDU Layer
-    This layer handles: 
-        * duplicate pdu_id's - i.e., it makes sure we ignore them. 
-        * responding to requests for a given pdu_id
-        * responding to requests for all metadata for a given context (i.e. room)
-        * handling incoming backfill requests
-
-    So it has to parse incoming messages to discover which are metadata and which aren't, and has to correctly clobber existing metadata where appropriate.
-
-    For incoming PDUs, it has to check the PDUs it references to see if we have missed any. If we have go and ask someone (another home server) for it.    
-
-
-3. Transaction Layer
-    This layer makes incoming requests idempotent. I.e., it stores which transaction id's we have seen and what our response were. If we have already seen a message with the given transaction id, we do not notify higher levels but simply respond with the previous response.
-
-transaction_id is from "GET /send/<tx_id>/"
-
-    It's also responsible for batching PDUs into single transaction for sending to remote destinations, so that we only ever have one transaction in flight to a given destination at any one time.
-
-    This is also responsible for answering requests for things after a given set of transactions, i.e., ask for everything after 'ver' X.
-
-
-4. Transport Layer
-    This is responsible for starting a HTTP server and hitting the correct callbacks on the Transaction layer, as well as sending both data and requests for data.
-
-
-== Persistence ==
-
-We persist things in a single sqlite3 database. All database queries get run on a separate, dedicated thread. This that we only ever have one query running at a time, making it a lot easier to do things in a safe manner.
-
-The queries are located in the synapse.persistence.transactions module, and the table information in the synapse.persistence.tables module.
-
--- a/docs/application_services.md
+++ b/docs/application_services.md
+# Registering an Application Service
+
+The registration of new application services depends on the homeserver used. 
+In synapse, you need to create a new configuration file for your AS and add it
+to the list specified under the `app_service_config_files` config
+option in your synapse config.
+
+For example:
+
+```yaml
+app_service_config_files:
+- /home/matrix/.synapse/<your-AS>.yaml
+```
+
+The format of the AS configuration file is as follows:
+
+```yaml
+url: <base url of AS>
+as_token: <token AS will add to requests to HS>
+hs_token: <token HS will add to requests to AS>
+sender_localpart: <localpart of AS user>
+namespaces:
+  users:  # List of users we're interested in
+    - exclusive: <bool>
+      regex: <regex>
+    - ...
+  aliases: []  # List of aliases we're interested in
+  rooms: [] # List of room ids we're interested in
+```
+
+See the [spec](https://matrix.org/docs/spec/application_service/unstable.html) for further details on how application services work.
--- a/docs/application_services.rst
+++ b/docs/application_services.rst
-Registering an Application Service
-==================================
-
-The registration of new application services depends on the homeserver used. 
-In synapse, you need to create a new configuration file for your AS and add it
-to the list specified under the ``app_service_config_files`` config
-option in your synapse config.
-
-For example:
-
-.. code-block:: yaml
-
-  app_service_config_files:
-  - /home/matrix/.synapse/<your-AS>.yaml
-
-
-The format of the AS configuration file is as follows:
-
-..  code-block:: yaml
-
-    url: <base url of AS>
-    as_token: <token AS will add to requests to HS>
-    hs_token: <token HS will add to requests to AS>
-    sender_localpart: <localpart of AS user>
-    namespaces:
-      users:  # List of users we're interested in
-        - exclusive: <bool>
-          regex: <regex>
-        - ...
-      aliases: []  # List of aliases we're interested in
-      rooms: [] # List of room ids we're interested in
-
-See the spec_ for further details on how application services work.
-
-.. _spec: https://matrix.org/docs/spec/application_service/unstable.html
--- a/docs/architecture.rst
+++ b/docs/architecture.rst
-Synapse Architecture
-====================
+# Synapse Architecture

-As of the end of Oct 2014, Synapse's overall architecture looks like::
+As of the end of Oct 2014, Synapse's overall architecture looks like:

        synapse
        .-----------------------------------------------------.
@@ -13,7 +12,7 @@ As of the end of Oct 2014, Synapse's overall architecture looks like::
        |                  |            v      |              |
        |                  | Event*Handler <--------> rest/* <=> Client
        |                  | Rooms*Handler     |              |
-  HSes <=> federation/* <==> FederationHandler |              |
+    HS <=> federation/* <==> FederationHandler |              |
        |      |           | PresenceHandler   |              |
        |      |           | TypingHandler     |              |
        |      |           '-------------------'              |
@@ -29,40 +28,38 @@ As of the end of Oct 2014, Synapse's overall architecture looks like::
                                | DB |
                                '----'

-* Handlers: business logic of synapse itself.  Follows a set contract of BaseHandler:
-
-  - BaseHandler gives us onNewRoomEvent which: (TODO: flesh this out and make it less cryptic):
- 
-    + handle_state(event)
-    + auth(event)
-    + persist_event(event)
-    + notify notifier or federation(event)
-   
-  - PresenceHandler: use distributor to get EDUs out of Federation.  Very
-    lightweight logic built on the distributor
-  - TypingHandler: use distributor to get EDUs out of Federation.  Very
-    lightweight logic built on the distributor
-  - EventsHandler: handles the events stream...
-  - FederationHandler: - gets PDU from Federation Layer; turns into an event;
-    follows basehandler functionality.
-  - RoomsHandler: does all the room logic, including members - lots of classes in
-    RoomsHandler.
-  - ProfileHandler: talks to the storage to store/retrieve profile info.
-
-* EventFactory: generates events of particular event types.
-* Notifier: Backs the events handler
-* REST: Interfaces handlers and events to the outside world via HTTP/JSON.
-  Converts events back and forth from JSON.
-* Federation: holds the HTTP client & server to talk to other servers.  Does
-  replication to make sure there's nothing missing in the graph.  Handles
-  reliability.  Handles txns.
-* Distributor: generic event bus. used for presence & typing only currently. 
-  Notifier could be implemented using Distributor - so far we are only using for
-  things which actually /require/ dynamic pluggability however as it can
-  obfuscate the actual flow of control.
-* Auth: helper singleton to say whether a given event is allowed to do a given
-  thing  (TODO: put this on the diagram)
-* State: helper singleton: does state conflict resolution. You give it an event
-  and it tells you if it actually updates the state or not, and annotates the
-  event up properly and handles merge conflict resolution.
-* Storage: abstracts the storage engine.
+-   Handlers: business logic of synapse itself. Follows a set contract of BaseHandler:
+    -   BaseHandler gives us onNewRoomEvent which: (TODO: flesh this out and make it less cryptic):
+        -   handle_state(event)
+        -   auth(event)
+        -   persist_event(event)
+        -   notify notifier or federation(event)
+    -   PresenceHandler: use distributor to get EDUs out of Federation.
+        Very lightweight logic built on the distributor
+    -   TypingHandler: use distributor to get EDUs out of Federation.
+        Very lightweight logic built on the distributor
+    -   EventsHandler: handles the events stream...
+    -   FederationHandler: - gets PDU from Federation Layer; turns into
+        an event; follows basehandler functionality.
+    -   RoomsHandler: does all the room logic, including members - lots
+        of classes in RoomsHandler.
+    -   ProfileHandler: talks to the storage to store/retrieve profile
+        info.
+-   EventFactory: generates events of particular event types.
+-   Notifier: Backs the events handler
+-   REST: Interfaces handlers and events to the outside world via
+    HTTP/JSON. Converts events back and forth from JSON.
+-   Federation: holds the HTTP client & server to talk to other servers.
+    Does replication to make sure there's nothing missing in the graph.
+    Handles reliability. Handles txns.
+-   Distributor: generic event bus. used for presence & typing only
+    currently. Notifier could be implemented using Distributor - so far
+    we are only using for things which actually /require/ dynamic
+    pluggability however as it can obfuscate the actual flow of control.
+-   Auth: helper singleton to say whether a given event is allowed to do
+    a given thing (TODO: put this on the diagram)
+-   State: helper singleton: does state conflict resolution. You give it
+    an event and it tells you if it actually updates the state or not,
+    and annotates the event up properly and handles merge conflict
+    resolution.
+-   Storage: abstracts the storage engine.
--- a/docs/code_style.md
+++ b/docs/code_style.md
+# Code Style
+
+## Formatting tools
+
+The Synapse codebase uses a number of code formatting tools in order to
+quickly and automatically check for formatting (and sometimes logical)
+errors in code.
+
+The necessary tools are detailed below.
+
+-   **black**
+
+    The Synapse codebase uses [black](https://pypi.org/project/black/)
+    as an opinionated code formatter, ensuring all comitted code is
+    properly formatted.
+
+    First install `black` with:
+
+        pip install --upgrade black
+
+    Have `black` auto-format your code (it shouldn't change any
+    functionality) with:
+
+        black . --exclude="\.tox|build|env"
+
+-   **flake8**
+
+    `flake8` is a code checking tool. We require code to pass `flake8`
+    before being merged into the codebase.
+
+    Install `flake8` with:
+
+        pip install --upgrade flake8
+
+    Check all application and test code with:
+
+        flake8 synapse tests
+
+-   **isort**
+
+    `isort` ensures imports are nicely formatted, and can suggest and
+    auto-fix issues such as double-importing.
+
+    Install `isort` with:
+
+        pip install --upgrade isort
+
+    Auto-fix imports with:
+
+        isort -rc synapse tests
+
+    `-rc` means to recursively search the given directories.
+
+It's worth noting that modern IDEs and text editors can run these tools
+automatically on save. It may be worth looking into whether this
+functionality is supported in your editor for a more convenient
+development workflow. It is not, however, recommended to run `flake8` on
+save as it takes a while and is very resource intensive.
+
+## General rules
+
+-   **Naming**:
+    -   Use camel case for class and type names
+    -   Use underscores for functions and variables.
+-   **Docstrings**: should follow the [google code
+    style](https://google.github.io/styleguide/pyguide.html#38-comments-and-docstrings).
+    This is so that we can generate documentation with
+    [sphinx](http://sphinxcontrib-napoleon.readthedocs.org/en/latest/).
+    See the
+    [examples](http://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_google.html)
+    in the sphinx documentation.
+-   **Imports**:
+    -   Imports should be sorted by `isort` as described above.
+    -   Prefer to import classes and functions rather than packages or
+        modules.
+
+        Example:
+
+            from synapse.types import UserID
+            ...
+            user_id = UserID(local, server)
+
+        is preferred over:
+
+            from synapse import types
+            ...
+            user_id = types.UserID(local, server)
+
+        (or any other variant).
+
+        This goes against the advice in the Google style guide, but it
+        means that errors in the name are caught early (at import time).
+
+    -   Avoid wildcard imports (`from synapse.types import *`) and
+        relative imports (`from .types import UserID`).
+
+## Configuration file format
+
+The [sample configuration file](./sample_config.yaml) acts as a
+reference to Synapse's configuration options for server administrators.
+Remember that many readers will be unfamiliar with YAML and server
+administration in general, so that it is important that the file be as
+easy to understand as possible, which includes following a consistent
+format.
+
+Some guidelines follow:
+
+-   Sections should be separated with a heading consisting of a single
+    line prefixed and suffixed with `##`. There should be **two** blank
+    lines before the section header, and **one** after.
+-   Each option should be listed in the file with the following format:
+    -   A comment describing the setting. Each line of this comment
+        should be prefixed with a hash (`#`) and a space.
+
+        The comment should describe the default behaviour (ie, what
+        happens if the setting is omitted), as well as what the effect
+        will be if the setting is changed.
+
+        Often, the comment end with something like "uncomment the
+        following to <do action>".
+
+    -   A line consisting of only `#`.
+    -   A commented-out example setting, prefixed with only `#`.
+
+        For boolean (on/off) options, convention is that this example
+        should be the *opposite* to the default (so the comment will end
+        with "Uncomment the following to enable [or disable]
+        <feature>." For other options, the example should give some
+        non-default value which is likely to be useful to the reader.
+
+-   There should be a blank line between each option.
+-   Where several settings are grouped into a single dict, *avoid* the
+    convention where the whole block is commented out, resulting in
+    comment lines starting `# #`, as this is hard to read and confusing
+    to edit. Instead, leave the top-level config option uncommented, and
+    follow the conventions above for sub-options. Ensure that your code
+    correctly handles the top-level option being set to `None` (as it
+    will be if no sub-options are enabled).
+-   Lines should be wrapped at 80 characters.
+
+Example:
+
+    ## Frobnication ##
+
+    # The frobnicator will ensure that all requests are fully frobnicated.
+    # To enable it, uncomment the following.
+    #
+    #frobnicator_enabled: true
+
+    # By default, the frobnicator will frobnicate with the default frobber.
+    # The following will make it use an alternative frobber.
+    #
+    #frobincator_frobber: special_frobber
+
+    # Settings for the frobber
+    #
+    frobber:
+       # frobbing speed. Defaults to 1.
+       #
+       #speed: 10
+
+       # frobbing distance. Defaults to 1000.
+       #
+       #distance: 100
+
+Note that the sample configuration is generated from the synapse code
+and is maintained by a script, `scripts-dev/generate_sample_config`.
+Making sure that the output from this script matches the desired format
+is left as an exercise for the reader!
--- a/docs/code_style.rst
+++ b/docs/code_style.rst
-Code Style
-==========
-
-Formatting tools
----------------
-
-The Synapse codebase uses a number of code formatting tools in order to
-quickly and automatically check for formatting (and sometimes logical) errors
-in code.
-
-The necessary tools are detailed below.
-
- **black**
-
-  The Synapse codebase uses `black <https://pypi.org/project/black/>`_ as an
-  opinionated code formatter, ensuring all comitted code is properly
-  formatted.
-
-  First install ``black`` with::
-
-    pip install --upgrade black
-
-  Have ``black`` auto-format your code (it shouldn't change any functionality)
-  with::
-
-    black . --exclude="\.tox|build|env"
-
- **flake8**
-
-  ``flake8`` is a code checking tool. We require code to pass ``flake8`` before being merged into the codebase.
-
-  Install ``flake8`` with::
-
-    pip install --upgrade flake8
-
-  Check all application and test code with::
-
-    flake8 synapse tests
-
- **isort**
-
-  ``isort`` ensures imports are nicely formatted, and can suggest and
-  auto-fix issues such as double-importing.
-
-  Install ``isort`` with::
-
-    pip install --upgrade isort
-
-  Auto-fix imports with::
-
-    isort -rc synapse tests
-
-  ``-rc`` means to recursively search the given directories.
-
-It's worth noting that modern IDEs and text editors can run these tools
-automatically on save. It may be worth looking into whether this
-functionality is supported in your editor for a more convenient development
-workflow. It is not, however, recommended to run ``flake8`` on save as it
-takes a while and is very resource intensive.
-
-General rules
-------------
-
- **Naming**:
-
-  - Use camel case for class and type names
-  - Use underscores for functions and variables.
-
- **Docstrings**: should follow the `google code style
-  <https://google.github.io/styleguide/pyguide.html#38-comments-and-docstrings>`_.
-  This is so that we can generate documentation with `sphinx
-  <http://sphinxcontrib-napoleon.readthedocs.org/en/latest/>`_. See the
-  `examples
-  <http://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_google.html>`_
-  in the sphinx documentation.
-
- **Imports**:
-
-  - Imports should be sorted by ``isort`` as described above.
-
-  - Prefer to import classes and functions rather than packages or modules.
-
-    Example::
-
-      from synapse.types import UserID
-      ...
-      user_id = UserID(local, server)
-
-    is preferred over::
-
-      from synapse import types
-      ...
-      user_id = types.UserID(local, server)
-
-    (or any other variant).
-
-    This goes against the advice in the Google style guide, but it means that
-    errors in the name are caught early (at import time).
-
-  - Avoid wildcard imports (``from synapse.types import *``) and relative
-    imports (``from .types import UserID``).
-
-Configuration file format
-------------------------
-
-The `sample configuration file <./sample_config.yaml>`_ acts as a reference to
-Synapse's configuration options for server administrators. Remember that many
-readers will be unfamiliar with YAML and server administration in general, so
-that it is important that the file be as easy to understand as possible, which
-includes following a consistent format.
-
-Some guidelines follow:
-
-* Sections should be separated with a heading consisting of a single line
-  prefixed and suffixed with ``##``. There should be **two** blank lines
-  before the section header, and **one** after.
-
-* Each option should be listed in the file with the following format:
-
-  * A comment describing the setting. Each line of this comment should be
-    prefixed with a hash (``#``) and a space.
-
-    The comment should describe the default behaviour (ie, what happens if
-    the setting is omitted), as well as what the effect will be if the
-    setting is changed.
-
-    Often, the comment end with something like "uncomment the
-    following to \<do action>".
-
-  * A line consisting of only ``#``.
-
-  * A commented-out example setting, prefixed with only ``#``.
-
-    For boolean (on/off) options, convention is that this example should be
-    the *opposite* to the default (so the comment will end with "Uncomment
-    the following to enable [or disable] \<feature\>." For other options,
-    the example should give some non-default value which is likely to be
-    useful to the reader.
-
-* There should be a blank line between each option.
-
-* Where several settings are grouped into a single dict, *avoid* the
-  convention where the whole block is commented out, resulting in comment
-  lines starting ``# #``, as this is hard to read and confusing to
-  edit. Instead, leave the top-level config option uncommented, and follow
-  the conventions above for sub-options. Ensure that your code correctly
-  handles the top-level option being set to ``None`` (as it will be if no
-  sub-options are enabled).
-
-* Lines should be wrapped at 80 characters.
-
-Example::
-
-    ## Frobnication ##
-
-    # The frobnicator will ensure that all requests are fully frobnicated.
-    # To enable it, uncomment the following.
-    #
-    #frobnicator_enabled: true
-
-    # By default, the frobnicator will frobnicate with the default frobber.
-    # The following will make it use an alternative frobber.
-    #
-    #frobincator_frobber: special_frobber
-
-    # Settings for the frobber
-    #
-    frobber:
-       # frobbing speed. Defaults to 1.
-       #
-       #speed: 10
-
-       # frobbing distance. Defaults to 1000.
-       #
-       #distance: 100
-
-Note that the sample configuration is generated from the synapse code and is
-maintained by a script, ``scripts-dev/generate_sample_config``. Making sure
-that the output from this script matches the desired format is left as an
-exercise for the reader!
--- a/docs/federate.md
+++ b/docs/federate.md
@@ -148,7 +148,7 @@ We no longer actively recommend against using a reverse proxy. Many admins will
 find it easier to direct federation traffic to a reverse proxy and manage their
 own TLS certificates, and this is a supported configuration.

-See [reverse_proxy.rst](reverse_proxy.rst) for information on setting up a
+See [reverse_proxy.md](reverse_proxy.md) for information on setting up a
 reverse proxy.

 #### Do I still need to give my TLS certificates to Synapse if I am using a reverse proxy?
@@ -184,7 +184,7 @@ a complicated dance which requires connections in both directions).

 Another common problem is that people on other servers can't join rooms that
 you invite them to. This can be caused by an incorrectly-configured reverse
-proxy: see [reverse_proxy.rst](<reverse_proxy.rst>) for instructions on how to correctly
+proxy: see [reverse_proxy.md](<reverse_proxy.md>) for instructions on how to correctly
 configure a reverse proxy.

 ## Running a Demo Federation of Synapses

--- a/docs/log_contexts.md
+++ b/docs/log_contexts.md
+# Log Contexts
+
+To help track the processing of individual requests, synapse uses a
+'`log context`' to track which request it is handling at any given
+moment. This is done via a thread-local variable; a `logging.Filter` is
+then used to fish the information back out of the thread-local variable
+and add it to each log record.
+
+Logcontexts are also used for CPU and database accounting, so that we
+can track which requests were responsible for high CPU use or database
+activity.
+
+The `synapse.logging.context` module provides a facilities for managing
+the current log context (as well as providing the `LoggingContextFilter`
+class).
+
+Deferreds make the whole thing complicated, so this document describes
+how it all works, and how to write code which follows the rules.
+
+##Logcontexts without Deferreds
+
+In the absence of any Deferred voodoo, things are simple enough. As with
+any code of this nature, the rule is that our function should leave
+things as it found them:
+
+```python
+from synapse.logging import context         # omitted from future snippets
+
+def handle_request(request_id):
+    request_context = context.LoggingContext()
+
+    calling_context = context.LoggingContext.current_context()
+    context.LoggingContext.set_current_context(request_context)
+    try:
+        request_context.request = request_id
+        do_request_handling()
+        logger.debug("finished")
+    finally:
+        context.LoggingContext.set_current_context(calling_context)
+
+def do_request_handling():
+    logger.debug("phew")  # this will be logged against request_id
+```
+
+LoggingContext implements the context management methods, so the above
+can be written much more succinctly as:
+
+```python
+def handle_request(request_id):
+    with context.LoggingContext() as request_context:
+        request_context.request = request_id
+        do_request_handling()
+        logger.debug("finished")
+
+def do_request_handling():
+    logger.debug("phew")
+```
+
+## Using logcontexts with Deferreds
+
+Deferreds --- and in particular, `defer.inlineCallbacks` --- break the
+linear flow of code so that there is no longer a single entry point
+where we should set the logcontext and a single exit point where we
+should remove it.
+
+Consider the example above, where `do_request_handling` needs to do some
+blocking operation, and returns a deferred:
+
+```python
+@defer.inlineCallbacks
+def handle_request(request_id):
+    with context.LoggingContext() as request_context:
+        request_context.request = request_id
+        yield do_request_handling()
+        logger.debug("finished")
+```
+
+In the above flow:
+
+-   The logcontext is set
+-   `do_request_handling` is called, and returns a deferred
+-   `handle_request` yields the deferred
+-   The `inlineCallbacks` wrapper of `handle_request` returns a deferred
+
+So we have stopped processing the request (and will probably go on to
+start processing the next), without clearing the logcontext.
+
+To circumvent this problem, synapse code assumes that, wherever you have
+a deferred, you will want to yield on it. To that end, whereever
+functions return a deferred, we adopt the following conventions:
+
+**Rules for functions returning deferreds:**
+
+> -   If the deferred is already complete, the function returns with the
+>     same logcontext it started with.
+> -   If the deferred is incomplete, the function clears the logcontext
+>     before returning; when the deferred completes, it restores the
+>     logcontext before running any callbacks.
+
+That sounds complicated, but actually it means a lot of code (including
+the example above) "just works". There are two cases:
+
+-   If `do_request_handling` returns a completed deferred, then the
+    logcontext will still be in place. In this case, execution will
+    continue immediately after the `yield`; the "finished" line will
+    be logged against the right context, and the `with` block restores
+    the original context before we return to the caller.
+-   If the returned deferred is incomplete, `do_request_handling` clears
+    the logcontext before returning. The logcontext is therefore clear
+    when `handle_request` yields the deferred. At that point, the
+    `inlineCallbacks` wrapper adds a callback to the deferred, and
+    returns another (incomplete) deferred to the caller, and it is safe
+    to begin processing the next request.
+
+    Once `do_request_handling`'s deferred completes, it will reinstate
+    the logcontext, before running the callback added by the
+    `inlineCallbacks` wrapper. That callback runs the second half of
+    `handle_request`, so again the "finished" line will be logged
+    against the right context, and the `with` block restores the
+    original context.
+
+As an aside, it's worth noting that `handle_request` follows our rules
+-though that only matters if the caller has its own logcontext which it
+cares about.
+
+The following sections describe pitfalls and helpful patterns when
+implementing these rules.
+
+Always yield your deferreds
+---------------------------
+
+Whenever you get a deferred back from a function, you should `yield` on
+it as soon as possible. (Returning it directly to your caller is ok too,
+if you're not doing `inlineCallbacks`.) Do not pass go; do not do any
+logging; do not call any other functions.
+
+```python
+@defer.inlineCallbacks
+def fun():
+    logger.debug("starting")
+    yield do_some_stuff()       # just like this
+
+    d = more_stuff()
+    result = yield d            # also fine, of course
+
+    return result
+
+def nonInlineCallbacksFun():
+    logger.debug("just a wrapper really")
+    return do_some_stuff()      # this is ok too - the caller will yield on
+                                # it anyway.
+```
+
+Provided this pattern is followed all the way back up to the callchain
+to where the logcontext was set, this will make things work out ok:
+provided `do_some_stuff` and `more_stuff` follow the rules above, then
+so will `fun` (as wrapped by `inlineCallbacks`) and
+`nonInlineCallbacksFun`.
+
+It's all too easy to forget to `yield`: for instance if we forgot that
+`do_some_stuff` returned a deferred, we might plough on regardless. This
+leads to a mess; it will probably work itself out eventually, but not
+before a load of stuff has been logged against the wrong context.
+(Normally, other things will break, more obviously, if you forget to
+`yield`, so this tends not to be a major problem in practice.)
+
+Of course sometimes you need to do something a bit fancier with your
+Deferreds - not all code follows the linear A-then-B-then-C pattern.
+Notes on implementing more complex patterns are in later sections.
+
+## Where you create a new Deferred, make it follow the rules
+
+Most of the time, a Deferred comes from another synapse function.
+Sometimes, though, we need to make up a new Deferred, or we get a
+Deferred back from external code. We need to make it follow our rules.
+
+The easy way to do it is with a combination of `defer.inlineCallbacks`,
+and `context.PreserveLoggingContext`. Suppose we want to implement
+`sleep`, which returns a deferred which will run its callbacks after a
+given number of seconds. That might look like:
+
+```python
+# not a logcontext-rules-compliant function
+def get_sleep_deferred(seconds):
+    d = defer.Deferred()
+    reactor.callLater(seconds, d.callback, None)
+    return d
+```
+
+That doesn't follow the rules, but we can fix it by wrapping it with
+`PreserveLoggingContext` and `yield` ing on it:
+
+```python
+@defer.inlineCallbacks
+def sleep(seconds):
+    with PreserveLoggingContext():
+        yield get_sleep_deferred(seconds)
+```
+
+This technique works equally for external functions which return
+deferreds, or deferreds we have made ourselves.
+
+You can also use `context.make_deferred_yieldable`, which just does the
+boilerplate for you, so the above could be written:
+
+```python
+def sleep(seconds):
+    return context.make_deferred_yieldable(get_sleep_deferred(seconds))
+```
+
+## Fire-and-forget
+
+Sometimes you want to fire off a chain of execution, but not wait for
+its result. That might look a bit like this:
+
+```python
+@defer.inlineCallbacks
+def do_request_handling():
+    yield foreground_operation()
+
+    # *don't* do this
+    background_operation()
+
+    logger.debug("Request handling complete")
+
+@defer.inlineCallbacks
+def background_operation():
+    yield first_background_step()
+    logger.debug("Completed first step")
+    yield second_background_step()
+    logger.debug("Completed second step")
+```
+
+The above code does a couple of steps in the background after
+`do_request_handling` has finished. The log lines are still logged
+against the `request_context` logcontext, which may or may not be
+desirable. There are two big problems with the above, however. The first
+problem is that, if `background_operation` returns an incomplete
+Deferred, it will expect its caller to `yield` immediately, so will have
+cleared the logcontext. In this example, that means that 'Request
+handling complete' will be logged without any context.
+
+The second problem, which is potentially even worse, is that when the
+Deferred returned by `background_operation` completes, it will restore
+the original logcontext. There is nothing waiting on that Deferred, so
+the logcontext will leak into the reactor and possibly get attached to
+some arbitrary future operation.
+
+There are two potential solutions to this.
+
+One option is to surround the call to `background_operation` with a
+`PreserveLoggingContext` call. That will reset the logcontext before
+starting `background_operation` (so the context restored when the
+deferred completes will be the empty logcontext), and will restore the
+current logcontext before continuing the foreground process:
+
+```python
+@defer.inlineCallbacks
+def do_request_handling():
+    yield foreground_operation()
+
+    # start background_operation off in the empty logcontext, to
+    # avoid leaking the current context into the reactor.
+    with PreserveLoggingContext():
+        background_operation()
+
+    # this will now be logged against the request context
+    logger.debug("Request handling complete")
+```
+
+Obviously that option means that the operations done in
+`background_operation` would be not be logged against a logcontext
+(though that might be fixed by setting a different logcontext via a
+`with LoggingContext(...)` in `background_operation`).
+
+The second option is to use `context.run_in_background`, which wraps a
+function so that it doesn't reset the logcontext even when it returns
+an incomplete deferred, and adds a callback to the returned deferred to
+reset the logcontext. In other words, it turns a function that follows
+the Synapse rules about logcontexts and Deferreds into one which behaves
+more like an external function --- the opposite operation to that
+described in the previous section. It can be used like this:
+
+```python
+@defer.inlineCallbacks
+def do_request_handling():
+    yield foreground_operation()
+
+    context.run_in_background(background_operation)
+
+    # this will now be logged against the request context
+    logger.debug("Request handling complete")
+```
+
+## Passing synapse deferreds into third-party functions
+
+A typical example of this is where we want to collect together two or
+more deferred via `defer.gatherResults`:
+
+```python
+d1 = operation1()
+d2 = operation2()
+d3 = defer.gatherResults([d1, d2])
+```
+
+This is really a variation of the fire-and-forget problem above, in that
+we are firing off `d1` and `d2` without yielding on them. The difference
+is that we now have third-party code attached to their callbacks. Anyway
+either technique given in the [Fire-and-forget](#fire-and-forget)
+section will work.
+
+Of course, the new Deferred returned by `gatherResults` needs to be
+wrapped in order to make it follow the logcontext rules before we can
+yield it, as described in [Where you create a new Deferred, make it
+follow the
+rules](#where-you-create-a-new-deferred-make-it-follow-the-rules).
+
+So, option one: reset the logcontext before starting the operations to
+be gathered:
+
+```python
+@defer.inlineCallbacks
+def do_request_handling():
+    with PreserveLoggingContext():
+        d1 = operation1()
+        d2 = operation2()
+        result = yield defer.gatherResults([d1, d2])
+```
+
+In this case particularly, though, option two, of using
+`context.preserve_fn` almost certainly makes more sense, so that
+`operation1` and `operation2` are both logged against the original
+logcontext. This looks like:
+
+```python
+@defer.inlineCallbacks
+def do_request_handling():
+    d1 = context.preserve_fn(operation1)()
+    d2 = context.preserve_fn(operation2)()
+
+    with PreserveLoggingContext():
+        result = yield defer.gatherResults([d1, d2])
+```
+
+## Was all this really necessary?
+
+The conventions used work fine for a linear flow where everything
+happens in series via `defer.inlineCallbacks` and `yield`, but are
+certainly tricky to follow for any more exotic flows. It's hard not to
+wonder if we could have done something else.
+
+We're not going to rewrite Synapse now, so the following is entirely of
+academic interest, but I'd like to record some thoughts on an
+alternative approach.
+
+I briefly prototyped some code following an alternative set of rules. I
+think it would work, but I certainly didn't get as far as thinking how
+it would interact with concepts as complicated as the cache descriptors.
+
+My alternative rules were:
+
+-   functions always preserve the logcontext of their caller, whether or
+    not they are returning a Deferred.
+-   Deferreds returned by synapse functions run their callbacks in the
+    same context as the function was orignally called in.
+
+The main point of this scheme is that everywhere that sets the
+logcontext is responsible for clearing it before returning control to
+the reactor.
+
+So, for example, if you were the function which started a
+`with LoggingContext` block, you wouldn't `yield` within it --- instead
+you'd start off the background process, and then leave the `with` block
+to wait for it:
+
+```python
+def handle_request(request_id):
+    with context.LoggingContext() as request_context:
+        request_context.request = request_id
+        d = do_request_handling()
+
+    def cb(r):
+        logger.debug("finished")
+
+    d.addCallback(cb)
+    return d
+```
+
+(in general, mixing `with LoggingContext` blocks and
+`defer.inlineCallbacks` in the same function leads to slighly
+counter-intuitive code, under this scheme).
+
+Because we leave the original `with` block as soon as the Deferred is
+returned (as opposed to waiting for it to be resolved, as we do today),
+the logcontext is cleared before control passes back to the reactor; so
+if there is some code within `do_request_handling` which needs to wait
+for a Deferred to complete, there is no need for it to worry about
+clearing the logcontext before doing so:
+
+```python
+def handle_request():
+    r = do_some_stuff()
+    r.addCallback(do_some_more_stuff)
+    return r
+```
+
+--- and provided `do_some_stuff` follows the rules of returning a
+Deferred which runs its callbacks in the original logcontext, all is
+happy.
+
+The business of a Deferred which runs its callbacks in the original
+logcontext isn't hard to achieve --- we have it today, in the shape of
+`context._PreservingContextDeferred`:
+
+```python
+def do_some_stuff():
+    deferred = do_some_io()
+    pcd = _PreservingContextDeferred(LoggingContext.current_context())
+    deferred.chainDeferred(pcd)
+    return pcd
+```
+
+It turns out that, thanks to the way that Deferreds chain together, we
+automatically get the property of a context-preserving deferred with
+`defer.inlineCallbacks`, provided the final Defered the function
+`yields` on has that property. So we can just write:
+
+```python
+@defer.inlineCallbacks
+def handle_request():
+    yield do_some_stuff()
+    yield do_some_more_stuff()
+```
+
+To conclude: I think this scheme would have worked equally well, with
+less danger of messing it up, and probably made some more esoteric code
+easier to write. But again --- changing the conventions of the entire
+Synapse codebase is not a sensible option for the marginal improvement
+offered.
+
+## A note on garbage-collection of Deferred chains
+
+It turns out that our logcontext rules do not play nicely with Deferred
+chains which get orphaned and garbage-collected.
+
+Imagine we have some code that looks like this:
+
+```python
+listener_queue = []
+
+def on_something_interesting():
+    for d in listener_queue:
+        d.callback("foo")
+
+@defer.inlineCallbacks
+def await_something_interesting():
+    new_deferred = defer.Deferred()
+    listener_queue.append(new_deferred)
+
+    with PreserveLoggingContext():
+        yield new_deferred
+```
+
+Obviously, the idea here is that we have a bunch of things which are
+waiting for an event. (It's just an example of the problem here, but a
+relatively common one.)
+
+Now let's imagine two further things happen. First of all, whatever was
+waiting for the interesting thing goes away. (Perhaps the request times
+out, or something *even more* interesting happens.)
+
+Secondly, let's suppose that we decide that the interesting thing is
+never going to happen, and we reset the listener queue:
+
+```python
+def reset_listener_queue():
+    listener_queue.clear()
+```
+
+So, both ends of the deferred chain have now dropped their references,
+and the deferred chain is now orphaned, and will be garbage-collected at
+some point. Note that `await_something_interesting` is a generator
+function, and when Python garbage-collects generator functions, it gives
+them a chance to clean up by making the `yield` raise a `GeneratorExit`
+exception. In our case, that means that the `__exit__` handler of
+`PreserveLoggingContext` will carefully restore the request context, but
+there is now nothing waiting for its return, so the request context is
+never cleared.
+
+To reiterate, this problem only arises when *both* ends of a deferred
+chain are dropped. Dropping the the reference to a deferred you're
+supposed to be calling is probably bad practice, so this doesn't
+actually happen too much. Unfortunately, when it does happen, it will
+lead to leaked logcontexts which are incredibly hard to track down.
--- a/docs/log_contexts.rst
+++ b/docs/log_contexts.rst
-Log Contexts
-============
-
-.. contents::
-
-To help track the processing of individual requests, synapse uses a
-'log context' to track which request it is handling at any given moment. This
-is done via a thread-local variable; a ``logging.Filter`` is then used to fish
-the information back out of the thread-local variable and add it to each log
-record.
-
-Logcontexts are also used for CPU and database accounting, so that we can track
-which requests were responsible for high CPU use or database activity.
-
-The ``synapse.logging.context`` module provides a facilities for managing the
-current log context (as well as providing the ``LoggingContextFilter`` class).
-
-Deferreds make the whole thing complicated, so this document describes how it
-all works, and how to write code which follows the rules.
-
-Logcontexts without Deferreds
-----------------------------
-
-In the absence of any Deferred voodoo, things are simple enough. As with any
-code of this nature, the rule is that our function should leave things as it
-found them:
-
-.. code:: python
-
-    from synapse.logging import context         # omitted from future snippets
-
-    def handle_request(request_id):
-        request_context = context.LoggingContext()
-
-        calling_context = context.LoggingContext.current_context()
-        context.LoggingContext.set_current_context(request_context)
-        try:
-            request_context.request = request_id
-            do_request_handling()
-            logger.debug("finished")
-        finally:
-            context.LoggingContext.set_current_context(calling_context)
-
-    def do_request_handling():
-        logger.debug("phew")  # this will be logged against request_id
-
-
-LoggingContext implements the context management methods, so the above can be
-written much more succinctly as:
-
-.. code:: python
-
-    def handle_request(request_id):
-        with context.LoggingContext() as request_context:
-            request_context.request = request_id
-            do_request_handling()
-            logger.debug("finished")
-
-    def do_request_handling():
-        logger.debug("phew")
-
-
-Using logcontexts with Deferreds
--------------------------------
-
-Deferreds — and in particular, ``defer.inlineCallbacks`` — break
-the linear flow of code so that there is no longer a single entry point where
-we should set the logcontext and a single exit point where we should remove it.
-
-Consider the example above, where ``do_request_handling`` needs to do some
-blocking operation, and returns a deferred:
-
-.. code:: python
-
-    @defer.inlineCallbacks
-    def handle_request(request_id):
-        with context.LoggingContext() as request_context:
-            request_context.request = request_id
-            yield do_request_handling()
-            logger.debug("finished")
-
-
-In the above flow:
-
-* The logcontext is set
-* ``do_request_handling`` is called, and returns a deferred
-* ``handle_request`` yields the deferred
-* The ``inlineCallbacks`` wrapper of ``handle_request`` returns a deferred
-
-So we have stopped processing the request (and will probably go on to start
-processing the next), without clearing the logcontext.
-
-To circumvent this problem, synapse code assumes that, wherever you have a
-deferred, you will want to yield on it. To that end, whereever functions return
-a deferred, we adopt the following conventions:
-
-**Rules for functions returning deferreds:**
-
-  * If the deferred is already complete, the function returns with the same
-    logcontext it started with.
-  * If the deferred is incomplete, the function clears the logcontext before
-    returning; when the deferred completes, it restores the logcontext before
-    running any callbacks.
-
-That sounds complicated, but actually it means a lot of code (including the
-example above) "just works". There are two cases:
-
-* If ``do_request_handling`` returns a completed deferred, then the logcontext
-  will still be in place. In this case, execution will continue immediately
-  after the ``yield``; the "finished" line will be logged against the right
-  context, and the ``with`` block restores the original context before we
-  return to the caller.
-
-* If the returned deferred is incomplete, ``do_request_handling`` clears the
-  logcontext before returning. The logcontext is therefore clear when
-  ``handle_request`` yields the deferred. At that point, the ``inlineCallbacks``
-  wrapper adds a callback to the deferred, and returns another (incomplete)
-  deferred to the caller, and it is safe to begin processing the next request.
-
-  Once ``do_request_handling``'s deferred completes, it will reinstate the
-  logcontext, before running the callback added by the ``inlineCallbacks``
-  wrapper. That callback runs the second half of ``handle_request``, so again
-  the "finished" line will be logged against the right
-  context, and the ``with`` block restores the original context.
-
-As an aside, it's worth noting that ``handle_request`` follows our rules -
-though that only matters if the caller has its own logcontext which it cares
-about.
-
-The following sections describe pitfalls and helpful patterns when implementing
-these rules.
-
-Always yield your deferreds
---------------------------
-
-Whenever you get a deferred back from a function, you should ``yield`` on it
-as soon as possible. (Returning it directly to your caller is ok too, if you're
-not doing ``inlineCallbacks``.) Do not pass go; do not do any logging; do not
-call any other functions.
-
-.. code:: python
-
-    @defer.inlineCallbacks
-    def fun():
-        logger.debug("starting")
-        yield do_some_stuff()       # just like this
-
-        d = more_stuff()
-        result = yield d            # also fine, of course
-
-        return result
-
-    def nonInlineCallbacksFun():
-        logger.debug("just a wrapper really")
-        return do_some_stuff()      # this is ok too - the caller will yield on
-                                    # it anyway.
-
-Provided this pattern is followed all the way back up to the callchain to where
-the logcontext was set, this will make things work out ok: provided
-``do_some_stuff`` and ``more_stuff`` follow the rules above, then so will
-``fun`` (as wrapped by ``inlineCallbacks``) and ``nonInlineCallbacksFun``.
-
-It's all too easy to forget to ``yield``: for instance if we forgot that
-``do_some_stuff`` returned a deferred, we might plough on regardless. This
-leads to a mess; it will probably work itself out eventually, but not before
-a load of stuff has been logged against the wrong context. (Normally, other
-things will break, more obviously, if you forget to ``yield``, so this tends
-not to be a major problem in practice.)
-
-Of course sometimes you need to do something a bit fancier with your Deferreds
- not all code follows the linear A-then-B-then-C pattern. Notes on
-implementing more complex patterns are in later sections.
-
-Where you create a new Deferred, make it follow the rules
---------------------------------------------------------
-
-Most of the time, a Deferred comes from another synapse function. Sometimes,
-though, we need to make up a new Deferred, or we get a Deferred back from
-external code. We need to make it follow our rules.
-
-The easy way to do it is with a combination of ``defer.inlineCallbacks``, and
-``context.PreserveLoggingContext``. Suppose we want to implement ``sleep``,
-which returns a deferred which will run its callbacks after a given number of
-seconds. That might look like:
-
-.. code:: python
-
-    # not a logcontext-rules-compliant function
-    def get_sleep_deferred(seconds):
-        d = defer.Deferred()
-        reactor.callLater(seconds, d.callback, None)
-        return d
-
-That doesn't follow the rules, but we can fix it by wrapping it with
-``PreserveLoggingContext`` and ``yield`` ing on it:
-
-.. code:: python
-
-    @defer.inlineCallbacks
-    def sleep(seconds):
-        with PreserveLoggingContext():
-            yield get_sleep_deferred(seconds)
-
-This technique works equally for external functions which return deferreds,
-or deferreds we have made ourselves.
-
-You can also use ``context.make_deferred_yieldable``, which just does the
-boilerplate for you, so the above could be written:
-
-.. code:: python
-
-    def sleep(seconds):
-        return context.make_deferred_yieldable(get_sleep_deferred(seconds))
-
-
-Fire-and-forget
---------------
-
-Sometimes you want to fire off a chain of execution, but not wait for its
-result. That might look a bit like this:
-
-.. code:: python
-
-    @defer.inlineCallbacks
-    def do_request_handling():
-        yield foreground_operation()
-
-        # *don't* do this
-        background_operation()
-
-        logger.debug("Request handling complete")
-
-    @defer.inlineCallbacks
-    def background_operation():
-        yield first_background_step()
-        logger.debug("Completed first step")
-        yield second_background_step()
-        logger.debug("Completed second step")
-
-The above code does a couple of steps in the background after
-``do_request_handling`` has finished. The log lines are still logged against
-the ``request_context`` logcontext, which may or may not be desirable. There
-are two big problems with the above, however. The first problem is that, if
-``background_operation`` returns an incomplete Deferred, it will expect its
-caller to ``yield`` immediately, so will have cleared the logcontext. In this
-example, that means that 'Request handling complete' will be logged without any
-context.
-
-The second problem, which is potentially even worse, is that when the Deferred
-returned by ``background_operation`` completes, it will restore the original
-logcontext. There is nothing waiting on that Deferred, so the logcontext will
-leak into the reactor and possibly get attached to some arbitrary future
-operation.
-
-There are two potential solutions to this.
-
-One option is to surround the call to ``background_operation`` with a
-``PreserveLoggingContext`` call. That will reset the logcontext before
-starting ``background_operation`` (so the context restored when the deferred
-completes will be the empty logcontext), and will restore the current
-logcontext before continuing the foreground process:
-
-.. code:: python
-
-    @defer.inlineCallbacks
-    def do_request_handling():
-        yield foreground_operation()
-
-        # start background_operation off in the empty logcontext, to
-        # avoid leaking the current context into the reactor.
-        with PreserveLoggingContext():
-            background_operation()
-
-        # this will now be logged against the request context
-        logger.debug("Request handling complete")
-
-Obviously that option means that the operations done in
-``background_operation`` would be not be logged against a logcontext (though
-that might be fixed by setting a different logcontext via a ``with
-LoggingContext(...)`` in ``background_operation``).
-
-The second option is to use ``context.run_in_background``, which wraps a
-function so that it doesn't reset the logcontext even when it returns an
-incomplete deferred, and adds a callback to the returned deferred to reset the
-logcontext. In other words, it turns a function that follows the Synapse rules
-about logcontexts and Deferreds into one which behaves more like an external
-function — the opposite operation to that described in the previous section.
-It can be used like this:
-
-.. code:: python
-
-    @defer.inlineCallbacks
-    def do_request_handling():
-        yield foreground_operation()
-
-        context.run_in_background(background_operation)
-
-        # this will now be logged against the request context
-        logger.debug("Request handling complete")
-
-Passing synapse deferreds into third-party functions
----------------------------------------------------
-
-A typical example of this is where we want to collect together two or more
-deferred via ``defer.gatherResults``:
-
-.. code:: python
-
-    d1 = operation1()
-    d2 = operation2()
-    d3 = defer.gatherResults([d1, d2])
-
-This is really a variation of the fire-and-forget problem above, in that we are
-firing off ``d1`` and ``d2`` without yielding on them. The difference
-is that we now have third-party code attached to their callbacks. Anyway either
-technique given in the `Fire-and-forget`_ section will work.
-
-Of course, the new Deferred returned by ``gatherResults`` needs to be wrapped
-in order to make it follow the logcontext rules before we can yield it, as
-described in `Where you create a new Deferred, make it follow the rules`_.
-
-So, option one: reset the logcontext before starting the operations to be
-gathered:
-
-.. code:: python
-
-    @defer.inlineCallbacks
-    def do_request_handling():
-        with PreserveLoggingContext():
-            d1 = operation1()
-            d2 = operation2()
-            result = yield defer.gatherResults([d1, d2])
-
-In this case particularly, though, option two, of using
-``context.preserve_fn`` almost certainly makes more sense, so that
-``operation1`` and ``operation2`` are both logged against the original
-logcontext. This looks like:
-
-.. code:: python
-
-    @defer.inlineCallbacks
-    def do_request_handling():
-        d1 = context.preserve_fn(operation1)()
-        d2 = context.preserve_fn(operation2)()
-
-        with PreserveLoggingContext():
-            result = yield defer.gatherResults([d1, d2])
-
-
-Was all this really necessary?
------------------------------
-
-The conventions used work fine for a linear flow where everything happens in
-series via ``defer.inlineCallbacks`` and ``yield``, but are certainly tricky to
-follow for any more exotic flows. It's hard not to wonder if we could have done
-something else.
-
-We're not going to rewrite Synapse now, so the following is entirely of
-academic interest, but I'd like to record some thoughts on an alternative
-approach.
-
-I briefly prototyped some code following an alternative set of rules. I think
-it would work, but I certainly didn't get as far as thinking how it would
-interact with concepts as complicated as the cache descriptors.
-
-My alternative rules were:
-
-* functions always preserve the logcontext of their caller, whether or not they
-  are returning a Deferred.
-
-* Deferreds returned by synapse functions run their callbacks in the same
-  context as the function was orignally called in.
-
-The main point of this scheme is that everywhere that sets the logcontext is
-responsible for clearing it before returning control to the reactor.
-
-So, for example, if you were the function which started a ``with
-LoggingContext`` block, you wouldn't ``yield`` within it — instead you'd start
-off the background process, and then leave the ``with`` block to wait for it:
-
-.. code:: python
-
-    def handle_request(request_id):
-        with context.LoggingContext() as request_context:
-            request_context.request = request_id
-            d = do_request_handling()
-
-        def cb(r):
-            logger.debug("finished")
-
-        d.addCallback(cb)
-        return d
-
-(in general, mixing ``with LoggingContext`` blocks and
-``defer.inlineCallbacks`` in the same function leads to slighly
-counter-intuitive code, under this scheme).
-
-Because we leave the original ``with`` block as soon as the Deferred is
-returned (as opposed to waiting for it to be resolved, as we do today), the
-logcontext is cleared before control passes back to the reactor; so if there is
-some code within ``do_request_handling`` which needs to wait for a Deferred to
-complete, there is no need for it to worry about clearing the logcontext before
-doing so:
-
-.. code:: python
-
-    def handle_request():
-        r = do_some_stuff()
-        r.addCallback(do_some_more_stuff)
-        return r
-
-— and provided ``do_some_stuff`` follows the rules of returning a Deferred which
-runs its callbacks in the original logcontext, all is happy.
-
-The business of a Deferred which runs its callbacks in the original logcontext
-isn't hard to achieve — we have it today, in the shape of
-``context._PreservingContextDeferred``:
-
-.. code:: python
-
-    def do_some_stuff():
-        deferred = do_some_io()
-        pcd = _PreservingContextDeferred(LoggingContext.current_context())
-        deferred.chainDeferred(pcd)
-        return pcd
-
-It turns out that, thanks to the way that Deferreds chain together, we
-automatically get the property of a context-preserving deferred with
-``defer.inlineCallbacks``, provided the final Defered the function ``yields``
-on has that property. So we can just write:
-
-.. code:: python
-
-    @defer.inlineCallbacks
-    def handle_request():
-        yield do_some_stuff()
-        yield do_some_more_stuff()
-
-To conclude: I think this scheme would have worked equally well, with less
-danger of messing it up, and probably made some more esoteric code easier to
-write. But again — changing the conventions of the entire Synapse codebase is
-not a sensible option for the marginal improvement offered.
-
-
-A note on garbage-collection of Deferred chains
-----------------------------------------------
-
-It turns out that our logcontext rules do not play nicely with Deferred
-chains which get orphaned and garbage-collected.
-
-Imagine we have some code that looks like this:
-
-.. code:: python
-
-    listener_queue = []
-
-    def on_something_interesting():
-        for d in listener_queue:
-            d.callback("foo")
-
-    @defer.inlineCallbacks
-    def await_something_interesting():
-        new_deferred = defer.Deferred()
-        listener_queue.append(new_deferred)
-
-        with PreserveLoggingContext():
-            yield new_deferred
-
-Obviously, the idea here is that we have a bunch of things which are waiting
-for an event. (It's just an example of the problem here, but a relatively
-common one.)
-
-Now let's imagine two further things happen. First of all, whatever was
-waiting for the interesting thing goes away. (Perhaps the request times out,
-or something *even more* interesting happens.)
-
-Secondly, let's suppose that we decide that the interesting thing is never
-going to happen, and we reset the listener queue:
-
-.. code:: python
-
-    def reset_listener_queue():
-        listener_queue.clear()
-
-So, both ends of the deferred chain have now dropped their references, and the
-deferred chain is now orphaned, and will be garbage-collected at some point.
-Note that ``await_something_interesting`` is a generator function, and when
-Python garbage-collects generator functions, it gives them a chance to clean
-up by making the ``yield`` raise a ``GeneratorExit`` exception. In our case,
-that means that the ``__exit__`` handler of ``PreserveLoggingContext`` will
-carefully restore the request context, but there is now nothing waiting for
-its return, so the request context is never cleared.
-
-To reiterate, this problem only arises when *both* ends of a deferred chain
-are dropped. Dropping the the reference to a deferred you're supposed to be
-calling is probably bad practice, so this doesn't actually happen too much.
-Unfortunately, when it does happen, it will lead to leaked logcontexts which
-are incredibly hard to track down.
--- a/docs/media_repository.rst
+++ b/docs/media_repository.rst
-Media Repository 
-================
+# Media Repository 

 *Synapse implementation-specific details for the media repository*

@@ -7,21 +6,25 @@ The media repository is where attachments and avatar photos are stored.
 It stores attachment content and thumbnails for media uploaded by local users.
 It caches attachment content and thumbnails for media uploaded by remote users.

-Storage
-------
+## Storage
+
+Each item of media is assigned a `media_id` when it is uploaded.
+The `media_id` is a randomly chosen, URL safe 24 character string.

-Each item of media is assigned a ``media_id`` when it is uploaded.
-The ``media_id`` is a randomly chosen, URL safe 24 character string.
 Metadata such as the MIME type, upload time and length are stored in the
-sqlite3 database indexed by ``media_id``.
-Content is stored on the filesystem under a ``"local_content"`` directory.
-Thumbnails are stored under a ``"local_thumbnails"`` directory.
-The item with ``media_id`` ``"aabbccccccccdddddddddddd"`` is stored under
-``"local_content/aa/bb/ccccccccdddddddddddd"``. Its thumbnail with width
-``128`` and height ``96`` and type ``"image/jpeg"`` is stored under
-``"local_thumbnails/aa/bb/ccccccccdddddddddddd/128-96-image-jpeg"``
-Remote content is cached under ``"remote_content"`` directory. Each item of
-remote content is assigned a local "``filesystem_id``" to ensure that the
-directory structure ``"remote_content/server_name/aa/bb/ccccccccdddddddddddd"``
+sqlite3 database indexed by `media_id`.
+
+Content is stored on the filesystem under a `"local_content"` directory.
+
+Thumbnails are stored under a `"local_thumbnails"` directory.
+
+The item with `media_id` `"aabbccccccccdddddddddddd"` is stored under
+`"local_content/aa/bb/ccccccccdddddddddddd"`. Its thumbnail with width
+`128` and height `96` and type `"image/jpeg"` is stored under
+`"local_thumbnails/aa/bb/ccccccccdddddddddddd/128-96-image-jpeg"`
+
+Remote content is cached under `"remote_content"` directory. Each item of
+remote content is assigned a local `"filesystem_id"` to ensure that the
+directory structure `"remote_content/server_name/aa/bb/ccccccccdddddddddddd"`
 is appropriate. Thumbnails for remote content are stored under
-``"remote_thumbnails/server_name/..."``
+`"remote_thumbnails/server_name/..."`