author: Gabriel Ivascu <ivascu.gabriel59@gmail.com> 2017-06-12 00:05:13 +0300
committer: Michael Catanzaro <mcatanzaro@igalia.com> 2017-08-06 09:28:09 -0500
commit: ff16aaff81495e7b0eca03fb358236f68bd7bb96
tree: 7e4f0082eeb89e97d4b3867b59bdeabc06dadab5
parent: 641d2c9e5ee605bf7cd1727a1af94aa5ffb6f745
sync: Add README
-rw-r--r-- lib/sync/README 454
1 files changed, 454 insertions, 0 deletions
diff --git a/lib/sync/README b/lib/sync/README
new file mode 100644
index 000000000..08117fdc1
--- /dev/null
+++ b/lib/sync/README
@@ -0,0 +1,454 @@
+
+ Firefox Sync Support in Epiphany
+
+ Overview
+ --------
+
+ Firefox Sync [0] is a browser synchronization feature developed by Mozilla,
+ and is currently used to synchronize data between Firefox browsers, both
+ desktop and mobile. The way it is designed ensures data security in such a
+ manner that not even Mozilla can read the user data on the storage servers.
+
+ To synchronize data via Firefox Sync, users must own a Firefox account.
+
+ Behind the Scenes
+ -----------------
+
+ Firefox Sync relies on the existence of a server to store the synchronized
+ records. This server is called the Sync Storage Server [1] and the current
+ version of its API is 1.5 [2], which has been in use since 2014. Each record
+ that is synchronized (a.k.a. BSO - Basic Storage Object) is grouped with
+ other related records into collections. Besides the default collections
+ (i.e. bookmarks, history, passwords, tabs), there are additional collections
+ (clients, crypto, meta) used for management purposes.
+
+ After a quick glance at the API, you may notice that clients access the stored
+ records on the server relatively easily, via HTTP requests. However, there are a
+ couple of aspects that need to be taken into consideration before being able
+ to communicate with the Sync Storage Server:
+
+ 1) What is the URL of the Sync Storage Server and how do you obtain it?
+ Moreover, since every request sent to the Sync Storage Server needs to
+ be authenticated with Hawk [3], how do you obtain the Hawk id and key?
+
+ 2) How do you decrypt the response so that you can actually "see" the stored
+ record? As mentioned in the Mozilla docs, the only job of the Sync Storage
+ Server is to store the records (it does not alter or delete them).
+ So it falls into the responsibility of the clients to encrypt the records
+ that are uploaded to the server.
+
+ These aspects will be covered in the following chapters.
+
+ Obtaining the Storage Credentials
+ ---------------------------------
+
+ To obtain the storage credentials (i.e. the URL of the Sync Storage Server and
+ the Hawk id and key), the sync clients have to interact with the Token Server
+ [4][5]. In the context of Firefox Sync, the role of the Token Server is to
+ share the URL of the Sync Storage Server together with a pair of Hawk id and
+ key in exchange for a BrowserID assertion [6]. However, the Hawk id and key
+ have a limited lifetime, so clients need to take that into consideration
+ and request new credentials when the previous ones have expired.
+
+ So the next question that arises is: how do you make the BrowserID assertion?
+ More exactly, since every BrowserID assertion requires a signed identity
+ certificate, where do you get the certificate from?
+
+ The certificate is obtained from the Firefox Accounts Server [7][8], more
+ specifically from the /certificate/sign endpoint. As you can see in the API,
+ requests to this endpoint have to be Hawk authenticated based on a so-called
+ sessionToken (a 32-byte token) that is obtained from the /account/login
+ endpoint (this endpoint does not require Hawk authentication). Details about
+ how the email and the password of the user's Firefox account are stretched to
+ obtain the request body of the login request can be found in [9]; however,
+ this is of little interest here since it is done automatically by the Firefox
+ Accounts Content Server [10] (how Epiphany uses the Firefox Accounts Content
+ Server to do the login will be explained later). What is important to know is
+ how the Hawk id and key that authorize the request to the certificate
+ endpoint are derived from the sessionToken. The process is explained in [11]
+ and it involves the sessionToken being fed into HKDF [12] to obtain the Hawk
+ id and key.
+
+ To summarize, the steps of obtaining the storage credentials are:
+
+ 1. Login with the Firefox account and obtain a sessionToken. This is a one
+ time step, since the sessionToken lasts forever until revoked by a password
+ change or explicit revocation command (via the /session/destroy endpoint of
+ the Firefox Accounts Server) and can be used an unlimited number of times.
+
+ 2. Based on the sessionToken, obtain a signed identity certificate from the
+ /certificate/sign endpoint of the Firefox Accounts Server. The certificate
+ has a limited lifetime of 24 hours.
+
+ 3. Create the BrowserID assertion from the previously obtained certificate.
+
+ 4. Send a request to the Token Server with the HTTP authorization header set
+ to the BrowserID assertion. The Token Server will respond with the URL of
+ the Sync Storage Server and the Hawk id and key together with the validity
+ duration.
+
+ 5. When the validity duration has expired, repeat from step 2.
+
+ Encrypting and Decrypting Records
+ ---------------------------------
+
+ Every collection on the Sync Storage Server has an associated key bundle formed
+ of two keys: a symmetric encryption key and an HMAC key. The former is used to
+ encrypt and decrypt the records with AES-256, while the latter is used to
+ verify the records using HMAC hashing. Both keys are 32 bytes. The hashing
+ algorithm used is SHA-256. Besides the bundles associated with each collection,
+ there is also a default key bundle which is used when handling records that
+ belong to a collection with no associated key bundle.
+
+ All the key bundles (including the default one) are stored in the crypto/keys
+ record [13] on the Sync Storage Server. This is a normal record, but with a
+ special meaning: being a record that holds information about the keys used to
+ encrypt/decrypt all the other records, it cannot be encrypted with any of
+ those keys for obvious reasons. Therefore it is encrypted and verified with a
+ different key bundle derived from the Master Sync Key (a.k.a. the Sync Key).
+
+ The Master Sync Key is a 32-byte token available only to sync clients and
+ is obtained from the Firefox Accounts Server via the /account/keys endpoint.
+ This endpoint requires requests to be Hawk authenticated based on a
+ keyFetchToken. The keyFetchToken is also a 32-byte token and is obtained at
+ login along with the sessionToken. How the Hawk id and key used to authorize
+ the request are derived from the keyFetchToken, and how the Master Sync Key
+ is extracted from the response bundle, is thoroughly explained in [14].
+ Note that the Master Sync Key is referred to there as kB. In short,
+ the keyFetchToken is used to derive four other tokens through two HKDF
+ processes. The first HKDF process derives the Hawk id and key together with
+ new keying material that serves as input for the second HKDF process.
+ The second HKDF process derives a response HMAC key and a response XOR key.
+ In response to the Hawk request, the server returns a bundle which holds a
+ cipher text and a pre-calculated HMAC value. Clients use the response HMAC
+ key to compute the HMAC value of the cipher text to validate it. After that,
+ the cipher text is XORed with the response XOR key to obtain a 64-byte
+ token. The first 32 bytes represent kA, which is left unused. The last 32
+ bytes are XORed with unwrapBKey to obtain kB (a.k.a. the Master Sync Key).
+ unwrapBKey is yet another 32-byte token returned at login together with
+ sessionToken and keyFetchToken. The hashing algorithm used is SHA-256.
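
The validate-then-XOR step described above can be sketched in a few lines of Python. This is only an illustration, not Epiphany's C code: the function names are hypothetical, and the bundle layout (64 bytes of cipher text followed by the 32-byte HMAC value) and the 64-byte length of the response XOR key are assumptions that should be checked against [14].

```python
import hashlib
import hmac


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings of equal length."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def unwrap_master_sync_key(bundle: bytes, resp_hmac_key: bytes,
                           resp_xor_key: bytes, unwrap_b_key: bytes) -> bytes:
    """Validate the response bundle and return kB (the Master Sync Key).

    Assumed layout (hypothetical, see lead-in): cipher text || HMAC value.
    """
    if len(bundle) != 96:
        raise ValueError("expected 64 bytes of cipher text plus a 32-byte HMAC")
    ciphertext, received_mac = bundle[:64], bundle[64:]

    # Recompute the HMAC of the cipher text and compare in constant time.
    expected_mac = hmac.new(resp_hmac_key, ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(expected_mac, received_mac):
        raise ValueError("HMAC mismatch: bundle is corrupt or the keys are wrong")

    # XOR with the response XOR key yields the 64-byte token.
    token = xor_bytes(ciphertext, resp_xor_key)
    wrapped_kb = token[32:]  # token[:32] is kA, which is left unused
    return xor_bytes(wrapped_kb, unwrap_b_key)
```

Checking the HMAC before touching the cipher text means a corrupted or forged bundle is rejected without ever producing a bogus key.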
+
+ Note that the Master Sync Key is an immutable token which is generated when
+ the Firefox account is created. Therefore, it should be considered a secret
+ and never be displayed in clear text as it could lead to the account's data
+ on the Sync Storage Server being compromised.
+
+ Having the Master Sync Key, deriving the key bundle that is used to encrypt
+ and verify the crypto/keys record is rather trivial: just perform a two-step
+ HKDF with an all-zeros salt. T(1) will represent the AES encryption key and
+ T(2) will represent the HMAC key. Having this key bundle, the crypto/keys
+ record can be decrypted to extract the default key bundle together with the
+ per-collection key bundles.
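
As a minimal sketch of that two-step HKDF (RFC 5869 [12]) using only the Python standard library: the helper names are hypothetical, and the info string below is an assumption, since this README does not state it; check it against [15] before relying on it.

```python
import hashlib
import hmac
from typing import Tuple

HASH_LEN = 32  # SHA-256 digest size in bytes

# Assumption: the HKDF info string is not given in this README.
SYNC_INFO = b"identity.mozilla.com/picl/v1/oldsync"


def hkdf_sha256(ikm: bytes, salt: bytes, info: bytes, length: int) -> bytes:
    """HKDF with SHA-256: extract a pseudorandom key, then expand it."""
    if length > 255 * HASH_LEN:
        raise ValueError("requested output is too long for HKDF")
    # Step 1 (extract): PRK = HMAC(salt, input keying material).
    prk = hmac.new(salt, ikm, hashlib.sha256).digest()
    # Step 2 (expand): T(n) = HMAC(PRK, T(n-1) || info || n), with T(0) empty.
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]),
                         hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_key_bundle(master_sync_key: bytes) -> Tuple[bytes, bytes]:
    """Return (AES key, HMAC key), i.e. T(1) and T(2), from the Master Sync Key."""
    salt = bytes(HASH_LEN)  # all-zeros salt
    okm = hkdf_sha256(master_sync_key, salt, SYNC_INFO, 2 * HASH_LEN)
    return okm[:HASH_LEN], okm[HASH_LEN:]
```

Requesting 64 bytes of output makes the expand step run exactly twice, so the first half of the output is T(1) and the second half is T(2).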
+
+ After that, clients are ready to upload/download records to/from the Sync
+ Storage Server. The flow when uploading a record is:
+
+ 1. Serialize the object representing a bookmark/history/password/tab into
+ a JSON object. The stringified JSON object will represent the clear text.
+
+ 2. Encrypt the clear text with AES-256 using the encryption key from the
+ corresponding collection's key bundle or from the default key bundle if
+ the collection does not have a key bundle associated. As an initialization
+ vector (IV) for AES-256, a random 16-byte token will be used. AES-256
+ will output the cipher text which will be base64 encoded afterwards.
+
+ 3. Compute the HMAC value of the base64 encoded cipher text using the HMAC
+ key from the corresponding collection's key bundle or from the default
+ key bundle if the collection does not have a key bundle associated.
+ The hashing algorithm used is SHA-256.
+
+ 4. Create a JSON object containing the base64 encoded cipher text, the
+ base64 encoded initialization vector and the hex encoded HMAC value.
+ The stringified JSON object will represent the payload of the BSO
+ that will be uploaded to the Sync Storage Server.
+
+ 5. Create a JSON object containing the id of the record and the previously
+ computed payload. The id of the record is a 12-character URL-safe base64
+ string that is randomly generated (however, when updating a record the id
+ must be preserved). The stringified JSON object will represent the body of
+ the request sent to the Sync Storage Server. Of course, the request will be
+ Hawk authorized with the Hawk id and key obtained from the Token Server.
+
+ The flow is reversed when downloading a record.
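
Steps 3 to 5 of the upload flow, and the matching check on download, might look roughly like the Python sketch below. It deliberately leaves out AES, which is not in the Python standard library: the cipher text is taken as an input and is assumed to have been produced by the AES-256 step. The function names are hypothetical, and the JSON member names ("ciphertext", "IV", "hmac", "id", "payload") are assumptions to verify against [15].

```python
import base64
import hashlib
import hmac
import json
import secrets


def make_record_id() -> str:
    """Random 12-character URL-safe base64 id (9 random bytes, no padding)."""
    return base64.urlsafe_b64encode(secrets.token_bytes(9)).decode("ascii")


def build_bso(record_id: str, ciphertext: bytes, iv: bytes,
              hmac_key: bytes) -> str:
    """Wrap already-encrypted bytes into the body of a storage request."""
    ciphertext_b64 = base64.b64encode(ciphertext).decode("ascii")
    # The HMAC is computed over the base64 text, not over the raw bytes.
    mac = hmac.new(hmac_key, ciphertext_b64.encode("ascii"),
                   hashlib.sha256).hexdigest()
    payload = json.dumps({
        "ciphertext": ciphertext_b64,
        "IV": base64.b64encode(iv).decode("ascii"),
        "hmac": mac,
    })
    # The payload is itself a string nested inside the outer JSON object.
    return json.dumps({"id": record_id, "payload": payload})


def verify_bso(body: str, hmac_key: bytes) -> bytes:
    """Download direction: check the HMAC, then return the raw cipher text."""
    payload = json.loads(json.loads(body)["payload"])
    expected = hmac.new(hmac_key, payload["ciphertext"].encode("ascii"),
                        hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, payload["hmac"]):
        raise ValueError("HMAC mismatch: record was altered or the key is wrong")
    return base64.b64decode(payload["ciphertext"])
```

The bytes returned by verify_bso() would then be decrypted with the encryption key of the same key bundle and the IV stored in the payload.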
+
+ More details about the cryptography of the Sync Storage Server can be found
+ here [15].
+
+ Firefox Object Formats
+ ----------------------
+
+ The Firefox format of the objects uploaded to the Sync Storage Server is
+ described in [16]. You will notice there are multiple versions described
+ for each collection; however, all collections currently use version 1, except
+ for bookmarks, which use version 2. All formats are JSON objects.
+
+ In Epiphany these formats are described by the GObject properties of the
+ objects that are synchronized:
+
+ * EphyBookmark for bookmarks
+ * EphyPasswordRecord for passwords
+ * EphyHistoryRecord for history
+ * EphyOpenTabsRecord for tabs
+
+ All these objects implement the JsonSerializable interface which allows
+ GObjects to be serialized into JSON strings and to be constructed from JSON
+ strings based on the properties of the object. They also implement the
+ EphySynchronizable interface which describes an object viable to become a BSO.
+
+ Signing in to Firefox Sync in Epiphany
+ --------------------------------------
+
+ As mentioned in the previous chapters, a vital part of Firefox Sync is
+ signing in with the email and password of the Firefox account and obtaining
+ a sessionToken, a keyFetchToken and an unwrapBKey that are further used to
+ access data on the Sync Storage Server.
+
+ In Epiphany, users can sign in with an existing Firefox account or create
+ a new one via the Sync tab in the preferences dialog. After the sign in has
+ completed successfully, users can customize their sync experience by choosing
+ what collections to sync (bookmarks, history, passwords and open tabs), whether
+ to sync with Firefox and the sync frequency.
+
+ However, it is important to understand what happens behind the scenes when
+ users sign in. The preferences dialog displays a Firefox iframe where users
+ type the email and password of their Firefox account and click Sign In.
+ The Firefox iframe is actually the web interface of the Firefox Accounts
+ Server, called the Firefox Accounts Content Server. This is the preferred
+ way of communicating sync related information from the Firefox Accounts
+ Server to Firefox Sync clients (the other way is by direct requests to the
+ login endpoint of the Firefox Accounts Server but that has been pretty much
+ restricted from public access). Communication with the Firefox Accounts
+ Content Server is made via the WebChannels flow [17] which is briefly
+ described further.
+
+ The Firefox iframe is loaded in a WebKitWebView inside the Sync tab of the
+ preferences dialog. A script that listens to WebChannelMessageToChrome
+ events is added to the WebKitUserContentManager of the WebKitWebView.
+ WebChannelMessageToChrome are messages that come from the Firefox Accounts
+ Content Server to the web browser. When such an event is received, the
+ JavaScript forwards it to the WebKitUserContentManager via the
+ "script-message-received" signal. The callback connected to this signal in
+ the preferences dialog will parse the message and respond with a
+ WebChannelMessageToContent event through webkit_web_view_run_javascript()
+ if necessary. WebChannelMessageToContent are messages that go from the web
+ browser to the Firefox Accounts Content Server. Both WebChannelMessageToChrome
+ and WebChannelMessageToContent events have a "detail" JSON object with two
+ members: the id of the WebChannel and a "message" JSON object. The latter has
+ the following members:
+ * command: string, one of fxaccounts:loaded, fxaccounts:can_link_account,
+ fxaccounts:login, fxaccounts:delete_account, fxaccounts:change_password,
+ profile:change.
+ * messageId: string, a unique identifier that should be kept the same when
+ responding to a message.
+ * data: JSON object, optional, carries the actual data of the message.
+
+ The WebChannelMessageToChrome/WebChannelMessageToContent sign in flow is:
+
+ 1. fxaccounts:loaded command is received via WebChannelMessageToChrome when
+ the Firefox iframe is loaded. No data is sent and no response is expected.
+
+ 2. fxaccounts:can_link_account command is received via
+ WebChannelMessageToChrome when the user has entered the credentials and
+ submitted the form. The data received contains the email of the user.
+ A response with an "ok" field is expected.
+
+ 3. A WebChannelMessageToContent message is sent in response to the previous
+ command. The fields detail.message.command, detail.message.messageId and
+ detail.id are kept the same. The field detail.message.data is set to
+ {ok: true}.
+
+ 4. fxaccounts:login command is received via WebChannelMessageToChrome.
+ No response is expected. The data field contains the sessionToken,
+ keyFetchToken and unwrapBKey tokens among others. Now the client has
+ everything it needs and the sync can begin.
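
Step 3 of this flow can be illustrated with a short Python sketch that builds the "detail" object of the response from the "detail" object of the incoming command. The function name is hypothetical, and the sketch assumes the detail arrives as a JSON string; Epiphany itself does this in C inside the "script-message-received" callback.

```python
import json


def build_can_link_response(incoming_detail: str) -> str:
    """Build the detail of the WebChannelMessageToContent reply.

    The command, messageId and WebChannel id are copied unchanged from the
    incoming detail; only the data member is replaced with {"ok": true}.
    """
    detail = json.loads(incoming_detail)
    message = detail["message"]
    if message["command"] != "fxaccounts:can_link_account":
        raise ValueError("only fxaccounts:can_link_account expects a reply")
    response = {
        "id": detail["id"],
        "message": {
            "command": message["command"],
            "messageId": message["messageId"],
            "data": {"ok": True},
        },
    }
    return json.dumps(response)
```

The resulting string would then be dispatched to the page as the detail of a WebChannelMessageToContent event.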
+
+ Sync Modules in Epiphany
+ ------------------------
+
+ I.
+
+ Synchronization in Epiphany is made via EphySyncService, which is a
+ singleton object residing in EphyShell, being accessible anywhere in
+ src/ via ephy_shell_get_sync_service(). However, EphySyncService is designed
+ to operate mostly on its own so you probably won't have to deal with it in
+ newly written code. EphySyncService handles all aspects of the communication
+ with the Sync Storage Server (uploads and downloads records), Token Server
+ (gets the storage credentials) and Firefox Accounts Server (gets account keys,
+ gets certificates, destroys the session). It also schedules and makes
+ periodical synchronizations via every collection's manager.
+
+ After the user clicks Sign In and the fxaccounts:login WebChannel message is
+ received, the sessionToken, keyFetchToken and unwrapBKey are passed from the
+ preferences dialog to EphySyncService via _sign_in(). The sync service
+ will then go through all the flow previously described to read the crypto/keys
+ record from the Sync Storage Server. In case the crypto/keys record does not
+ exist (this happens when the Firefox account is newly created), the sync
+ service will generate and upload a new crypto/keys record which contains a
+ randomly generated default key bundle. Note that EphySyncService currently
+ supports only v1.5 of the Sync Storage Server and the sign in will fail if it
+ detects a lower version. The version is kept on the Sync Storage Server in
+ the meta/global record [18]. This is a special record that is not encrypted
+ and contains general information about the state of the Sync Storage Server.
+
+ The sessionToken, the Master Sync Key and the crypto key bundles are then
+ stored in the sync SecretSchema. They are loaded in memory at every startup
+ and used whenever needed until the user signs out. At that point they are
+ freed and the SecretSchema is cleared.
+
+ The requests to the Sync Storage Server are sent internally via
+ ephy_sync_service_queue_storage_request(). This checks whether the storage
+ credentials have expired. If they have not, the request is sent directly
+ via ephy_sync_service_send_storage_request(); otherwise, the request is put
+ in a message queue and new storage credentials are obtained. When the new
+ storage credentials have been obtained, the queue will be emptied and all
+ requests will be sent.
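
The queueing behaviour described above can be modelled with a small Python class. This is a simplified sketch, not the actual EphySyncService code: the class, its methods and the callback-based credential refresh are all hypothetical, and error handling (for example a failed refresh) is left out.

```python
import time
from collections import deque


class StorageRequestQueue:
    """Send requests directly while credentials are valid, queue them otherwise."""

    def __init__(self, send, refresh_credentials, clock=time.monotonic):
        self._send = send                    # callable(request)
        self._refresh = refresh_credentials  # callable(done_callback)
        self._clock = clock
        self._expires_at = 0.0               # no credentials yet
        self._pending = deque()
        self._refreshing = False

    def queue_request(self, request):
        if self._clock() < self._expires_at:
            self._send(request)              # credentials are still valid
            return
        self._pending.append(request)
        if not self._refreshing:             # start at most one refresh
            self._refreshing = True
            self._refresh(self.on_credentials_obtained)

    def on_credentials_obtained(self, validity_seconds):
        """Called once new credentials arrive: empty the queue in order."""
        self._expires_at = self._clock() + validity_seconds
        self._refreshing = False
        while self._pending:
            self._send(self._pending.popleft())
```

Tracking whether a refresh is already in flight keeps several queued requests from each triggering their own round trip to the Token Server.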
+
+ The API of EphySyncService is rather simple. It contains functions to sign in,
+ sign out, start periodical synchronization and do a synchronization. Besides
+ these, there are four other functions:
+
+ * _new(). This is the constructor. It receives a boolean that says whether
+ this sync service should do periodical synchronizations. This is needed
+ because passwords are saved from EphyWebExtension which runs in another
+ process (the web process). For that, EphyWebExtension needs to have its
+ own EphySyncService that will upload/update/delete passwords once they are
+ saved/modified/deleted but will not do any periodical synchronizations.
+ Periodical synchronizations (which synchronize all records from all
+ collections) will only be made by the EphySyncService that belongs to the
+ UI process.
+
+ * _register_device(). This function adds the current device with the given
+ name to the clients collection on the Sync Storage Server. The clients
+ collection is a special collection which stores data about the devices
+ connected to the Firefox account. It is worth mentioning that every device
+ is identified by a device id and a device name. The device id is randomly
+ generated at login by EphySyncService and cannot be changed by users.
+ The device name can be edited by users from the preferences dialog.
+ If no device name is chosen, then the default is used: "@username's
+ Epiphany on @hostname".
+
+ * _register_manager() and _unregister_manager(). These will be explained in
+ the context of EphySynchronizableManager.
+
+ The sign out function is more of a cleanup function: it stops the periodical
+ synchronization, unregisters the device by deleting the associated record
+ from the clients collection, deletes the associated record from the tabs
+ collection, destroys the session, clears the message queue, unregisters
+ all managers and clears the SecretSchema that contains the sync secrets.
+
+ II.
+
+ EphySynchronizableManager is an interface that describes the common
+ functionality of every collection and is implemented by every collection
+ manager: EphyBookmarksManager, EphyPasswordManager, EphyHistoryManager and
+ EphyOpenTabsManager. All these managers are also singleton objects residing
+ in EphyShell and are accessible in src/ via ephy_shell_get_<manager>().
+ The main reason why such an interface is needed is that EphySyncService is
+ defined under lib/ which makes it unable to access objects from src/ (i.e.
+ EphyBookmark and EphyBookmarksManager) so it needs to access them through
+ a delegate interface. EphySyncService is defined under lib/ because it is
+ required by EphyWebExtension which is defined under embed/. (See section
+ LAYERING in HACKING). For the same reason, EphySyncService has functions to
+ register/unregister EphySynchronizableManagers. Managers are registered in
+ EphyShell when EphySyncService is created based on the user preferences stored
+ in the GSettings schema. Managers are also registered and unregistered when
+ users toggle the check buttons that say whether a collection should be
+ synchronized or not in the preferences dialog.
+
+ The API of EphySynchronizableManager is described in the source code via
+ documentation comments. EphySynchronizableManager provides two signals:
+ "synchronizable-modified" and "synchronizable-deleted". The implementations
+ of EphySynchronizableManager trigger the first signal when an object has
+ been added or modified so it needs to be uploaded to the Sync Storage Server
+ and the second signal when an object has been deleted locally so it needs to
+ be deleted from the Sync Storage Server too. EphySyncService connects a
+ callback to these signals for every manager that has been registered to it
+ and disconnects them when the manager is unregistered. This way any
+ local changes to the synchronized objects will be mirrored instantly on the
+ Sync Storage Server. However, in case EphySyncService finds a newer version
+ of the object on the server, it will download it.
+
+ Note that when a record is deleted from the Sync Storage Server, it does not
+ disappear from the server. It is only marked as deleted with a "deleted" flag
+ set to true. This way other sync clients will know that the record has been
+ deleted by another client and will delete it too from their local collection.
+ See ephy_sync_debug_delete_record() vs ephy_sync_debug_erase_record() for more
+ details about this.
+
+ The synchronization merge logic of every collection is described in the
+ source code comments of every manager's merge function.
+
+ III.
+
+ EphySynchronizable is another delegate interface that describes the
+ objects that are uploaded to and downloaded from the Sync Storage Server.
+ It is implemented by EphyBookmark, EphyPasswordRecord, EphyHistoryRecord and
+ EphyOpenTabsRecord which also implement the JsonSerializable interface so
+ that they can be converted to BSOs. EphySynchronizable objects are managed
+ by the associated EphySynchronizableManager. The API of EphySynchronizable is
+ described in the source code via documentation comments.
+
+ IV.
+
+ EphySyncCrypto is a helper module that handles all cryptographic operations.
+ Its API includes:
+
+ * _process_session_token(). Derives the Hawk id and key from a sessionToken.
+
+ * _process_key_fetch_token(). Derives the Hawk id and key, the response HMAC
+ key and the response XOR key from a keyFetchToken.
+
+ * _compute_sync_keys(). Derives the Master Sync Key from the unwrapBKey
+ token, the bundle returned by the /account/keys endpoint of the Firefox
+ Accounts Server, the response HMAC key and the response XOR key.
+
+ * _derive_key_bundle(). Derives the key bundle from the Master Sync Key.
+
+ * _generate_crypto_keys(). Generates a new crypto/keys record.
+
+ * _encrypt_record(). Encrypts a clear text into a BSO payload.
+
+ * _decrypt_record(). Decrypts a BSO payload into a clear text.
+
+ * _generate_rsa_key_pair(). Generates an RSA key pair. This is needed when
+ obtaining an identity certificate and creating the BrowserID assertion.
+
+ * _create_assertion(). Creates a BrowserID assertion from a certificate, an
+ audience and an RSA key pair.
+
+ * _compute_hawk_header(). Creates a Hawk header that is used to authorize
+ Hawk requests. Unfortunately, there isn't any C library for creating Hawk
+ headers so the code has been reproduced from a Python library [19].
+
+ Until Nettle adds support for HKDF, Epiphany will use its own implementation
+ of HKDF.
+
+ V.
+
+ EphySyncDebug is a helper module for debugging purposes. All its functions
+ use only synchronous API calls and should not be used in production code.
+ Its API is described in the source code via documentation comments.
+
+ References
+ ----------
+
+ [0] https://wiki.mozilla.org/Services/Sync
+ [1] https://github.com/mozilla-services/server-syncstorage
+ [2] https://mozilla-services.readthedocs.io/en/latest/storage/apis-1.5.html
+ [3] https://github.com/hueniverse/hawk/blob/master/README.md
+ [4] https://mozilla-services.readthedocs.io/en/latest/token/index.html
+ [5] https://github.com/mozilla-services/tokenserver
+ [6] https://fuller.li/posts/how-does-browserid-work/
+ [7] https://mozilla-services.readthedocs.io/en/latest/fxa/index.html
+ [8] https://github.com/mozilla/fxa-auth-server/
+ [9] https://github.com/mozilla/fxa-auth-server/wiki/onepw-protocol#login-obtaining-the-sessiontoken
+ [10] https://github.com/mozilla/fxa-content-server/
+ [11] https://github.com/mozilla/fxa-auth-server/wiki/onepw-protocol#signing-certificates
+ [12] https://tools.ietf.org/html/rfc5869
+ [13] https://mozilla-services.readthedocs.io/en/latest/sync/storageformat5.html#crypto-keys-record
+ [14] https://github.com/mozilla/fxa-auth-server/wiki/onepw-protocol#-fetching-sync-keys
+ [15] https://mozilla-services.readthedocs.io/en/latest/sync/storageformat5.html#cryptography
+ [16] https://mozilla-services.readthedocs.io/en/latest/sync/objectformats.html
+ [17] https://github.com/mozilla/fxa-content-server/blob/master/docs/relier-communication-protocols/fx-webchannel.md
+ [18] https://mozilla-services.readthedocs.io/en/latest/sync/storageformat5.html#metaglobal-record
+ [19] https://github.com/mozilla/PyHawk