Synchronisation

This section describes the basic aspects of synchronisation using Kinto.

Note

If you are looking for a ready-to-use synchronisation solution, jump to Implementations.

The basic idea is to keep a local database up to date with the Kinto server:

  • Remote changes are downloaded and applied on the local data.

  • Local changes are uploaded, with HTTP headers used to control concurrency and prevent overwrites.

In short:

  1. Poll for remote changes using ?_since=<timestamp>

  2. Apply changes locally

  3. Send local creations

  4. Use concurrency control to send local updates and deletes

Polling for remote changes

Kinto supports range queries on timestamps. Combining them with the sort parameter makes it possible to fetch changes in a particular order.

Depending on the context (latest first, readonly, etc.), there are several strategies to poll the server for changes.

Important

  • Timestamps are unique.

  • Deleted records have an attribute deleted: true.

  • Created/updated records are both returned in their new version.

  • Since Kinto does not keep any history, there is no diff for updates.

Strategy #1 — Oldest first

The simplest way to obtain the changes is to sort the records by timestamp ascending.

We will use _sort=last_modified and _since=<timestamp>:

  1. First sync: timestamp := 0

  2. Next sync: timestamp := MAX(local_records['last_modified'])

  3. Fetch GET /buckets/<bid>/collections/<cid>/records?_sort=last_modified&_since=<timestamp>

  4. If response is 200 OK, handle the list of remote changes.

  5. If response has Next-Page header, follow full URL in header.

  6. If list of changes is empty, done → up-to-date.

[Figure: ../_images/sync-oldest.svg]

If an error occurs during the retrieval of pages, the synchronisation can be resumed transparently, since the pages are obtained with ascending timestamps, and the next sync relies on the highest timestamp successfully stored locally.
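The oldest-first loop can be sketched as follows. This is a minimal illustration, not a reference client: `fetch` is a hypothetical transport callable returning `(status, headers, parsed_json)`, and the assumption that the list of changes sits under a `data` key of the response body should be checked against the API reference.

```python
from typing import Callable, Dict, Tuple

# Hypothetical transport: takes a full URL, returns (status, headers, parsed JSON body).
Fetch = Callable[[str], Tuple[int, Dict[str, str], dict]]


def poll_oldest_first(fetch: Fetch, records_url: str, local: Dict[str, dict]) -> int:
    """Pull remote changes in ascending timestamp order; return how many were seen."""
    # First sync starts at 0; later syncs start at the highest timestamp stored locally.
    since = max((r["last_modified"] for r in local.values()), default=0)
    url = f"{records_url}?_sort=last_modified&_since={since}"
    seen = 0
    while url:
        status, headers, body = fetch(url)
        if status != 200:
            # Everything stored so far stays valid, so the next attempt resumes from it.
            raise RuntimeError(f"unexpected status {status}")
        changes = body.get("data", [])  # assumption: changes are wrapped in "data"
        if not changes:
            break  # empty list: up to date
        for change in changes:
            # Stored as received (tombstones included) to keep the resume point accurate.
            local[change["id"]] = change
            seen += 1
        url = headers.get("Next-Page")  # full URL of the following page, if any
    return seen
```

Because each page is stored as soon as it arrives, an interrupted run leaves `local` with the highest timestamp received so far, which is exactly where the next call starts.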

Strategy #2 — Newest first

In order to populate a UI, it might be relevant to obtain the latest changes first.

Syncing newest records first is a bit more complex since changes can occur between the retrieval of the first and the last pages.

We will use _sort=-last_modified (desc), _before to omit later changes, and _since to include changes after last sync:

  1. First sync: timestamp := 0

  2. Next sync: use timestamp stored in last successful sync.

  3. Fetch the current collection timestamp with HEAD /buckets/<bid>/collections/<cid>/records, read it from the ETag response header, and store its value in start.

  4. Fetch GET /buckets/<bid>/collections/<cid>/records?_sort=-last_modified&_before=<start>&_since=<timestamp>

  5. If response is 200 OK, stack the obtained list of remote changes.

  6. If response has Next-Page header, follow full URL in header.

  7. If list of changes is empty, done → handle the stack of remote changes and update the timestamp: timestamp := MAX(local_records['last_modified'])

[Figure: ../_images/sync-newest.svg]

With this approach, the main algorithm is rather simple. However, since the last sync timestamp is only recorded once the last page has been processed, an error between the first and the last step forces the client to download again every page obtained since step 1, until a sync manages to fetch every page.

To avoid that, the algorithm can be made slightly more complex so that it tracks additional information obtained from the page that failed. The upper and lower timestamp bounds (_before and _since) can then be specified manually to resume the synchronisation.
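The basic newest-first loop (without the resume refinement) might look like the sketch below. `head` and `fetch` are hypothetical transport callables, the `data` wrapper in the response body is an assumption, and stripping the quotes from the ETag before passing it to `_before` is an illustrative choice rather than a documented requirement.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical transports: HEAD returns response headers; GET returns
# (status, headers, parsed JSON body).
Head = Callable[[str], Dict[str, str]]
Fetch = Callable[[str], Tuple[int, Dict[str, str], dict]]


def poll_newest_first(
    head: Head, fetch: Fetch, records_url: str, since: int
) -> Tuple[List[dict], int]:
    """Collect changes newest first; return (stacked changes, new sync timestamp)."""
    # The collection ETag is the quoted current timestamp; keep the bare value.
    start = head(records_url)["ETag"].strip('"')
    url = f"{records_url}?_sort=-last_modified&_before={start}&_since={since}"
    stack: List[dict] = []
    while url:
        status, headers, body = fetch(url)
        if status != 200:
            # Nothing is committed yet, so the caller has to retry the whole sync.
            raise RuntimeError(f"unexpected status {status}")
        changes = body.get("data", [])  # assumption: changes are wrapped in "data"
        if not changes:
            break
        stack.extend(changes)
        url = headers.get("Next-Page")
    # Only now, with every page fetched, is the timestamp moved forward.
    new_since = max((c["last_modified"] for c in stack), default=since)
    return stack, new_since
```

The caller applies the returned stack to the local database and persists the returned timestamp together, so that a failure part-way through never advances the stored timestamp.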

Strategy #3 — Newest first, partially

For very large collections, it can be useful to perform a first partial synchronisation, and then fetch old records in the background.

When a new client wants to sync, instead of syncing hundreds of pages on the first synchronisation, two distinct synchronisation processes can be combined.

For example, start with some recent records in order to populate a UI, and then fetch older records in the background.

  1. Obtain a few pages of recent records using the newest first strategy from above

  2. In the background, fetch old records using _sort=-last_modified and _before=MIN(local_records[last_modified])

  3. Recent changes can be obtained using _sort=-last_modified and _since=MAX(local_records[last_modified])

[Figure: ../_images/sync-both.svg]
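Once a few recent pages are stored locally, the two follow-up queries differ only in their bound. The helper below is a small sketch that derives both URLs from the local records; the function name and the returned dictionary keys are hypothetical.

```python
from typing import Dict, Iterable
from urllib.parse import urlencode


def partial_sync_urls(records_url: str, local_records: Iterable[dict]) -> Dict[str, str]:
    """Build the 'older' (background) and 'newer' (recent changes) query URLs."""
    timestamps = [r["last_modified"] for r in local_records]
    if not timestamps:
        # The partial strategy presupposes that step 1 already stored some records.
        raise ValueError("no local records yet; run the initial newest-first sync")
    older = urlencode({"_sort": "-last_modified", "_before": min(timestamps)})
    newer = urlencode({"_sort": "-last_modified", "_since": max(timestamps)})
    return {"older": f"{records_url}?{older}", "newer": f"{records_url}?{newer}"}
```

The "older" URL can be followed page by page in a background task, while the "newer" URL is polled as in a regular sync.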

Apply changes locally

Applying remote changes to the local database consists of adding new records, updating changed records and removing deleted records.

From the client perspective, Kinto does not distinguish creations from updates. In the polling for changes response, created records are simply the records unknown to the client (using the id field).

If the records to be updated or deleted have also been modified locally, the developer must choose a relevant strategy. For example, merge fields or ignore the deletion.
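A minimal sketch of this step, with the local database modelled as a dictionary keyed by record id. The name of the tombstone flag is passed as a parameter (defaulting to `deleted`, an assumption to verify against the server responses), and the conflict strategy shown here is simply "remote wins", which is only one of the possible choices.

```python
from typing import Dict, Iterable


def apply_remote_changes(
    local: Dict[str, dict], changes: Iterable[dict], deleted_key: str = "deleted"
) -> Dict[str, int]:
    """Apply a list of remote changes to `local` in place and count each kind."""
    counts = {"created": 0, "updated": 0, "deleted": 0}
    for change in changes:
        record_id = change["id"]
        if change.get(deleted_key):
            # Tombstone: drop the record if we have it, ignore it otherwise.
            if local.pop(record_id, None) is not None:
                counts["deleted"] += 1
        elif record_id in local:
            # Known id: the full new version replaces the local one (no diff exists).
            local[record_id] = change
            counts["updated"] += 1
        else:
            # Unknown id: from the client's point of view this is a creation.
            local[record_id] = change
            counts["created"] += 1
    return counts
```

A client that tracks local modifications would check for them before the `pop` and the overwrite, and merge fields or skip the change instead.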

Concurrency control

As described in Server timestamps, Kinto uses ETag for concurrency control.

ETags are provided in response headers, for the collection as well as individual records.

Even though it is recommended to consider them as opaque and abstract, it can still be useful to observe that ETags are the quoted last_modified value of the record: "<record.last_modified>".

Protected creation with PUT

Add an If-None-Match: * request header to the PUT to make sure no record exists on the server with this ID.

This can be useful to avoid overwrites when creating records with PUT instead of POST.
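As an illustration, the request can be prepared with the standard library as shown below. The function name is hypothetical, and wrapping the record in a `data` key of the JSON body is an assumption about the payload format; the request is only built here, not sent.

```python
import json
import urllib.request


def protected_create(record_url: str, record: dict) -> urllib.request.Request:
    """Build a PUT request that can only succeed if the record does not exist yet."""
    return urllib.request.Request(
        record_url,
        # Assumption: the server expects the record under a "data" key.
        data=json.dumps({"data": record}).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # "*" matches any existing version, so the PUT is refused if one exists.
            "If-None-Match": "*",
        },
        method="PUT",
    )
```

Passing the result to `urllib.request.urlopen` would send it; with a precondition header like this one, a server that already holds the record is expected to refuse the write rather than overwrite it.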

Protected update and delete

Add an If-Match: "<record.last_modified>" request header to the PUT, PATCH or DELETE request.

Kinto will reject the request with a 412 Precondition Failed response if the record was modified in the interim.

If the remote record was already deleted, a 404 Not Found response will be returned. The client can choose to ignore it.
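The header and the three outcomes described above can be captured in two small helpers. This is a sketch with hypothetical names; the outcome labels are illustrative and the transport is left out on purpose.

```python
from typing import Dict


def if_match_headers(record: dict) -> Dict[str, str]:
    """Precondition header for PUT, PATCH or DELETE on a known record version."""
    # The ETag is the quoted last_modified value of the version we last saw.
    return {"If-Match": f'"{record["last_modified"]}"'}


def interpret_write_status(status: int) -> str:
    """Map the HTTP status of a protected write to a sync outcome (labels are illustrative)."""
    if 200 <= status < 300:
        return "ok"
    if status == 412:
        # The record changed on the server since we last fetched it.
        return "conflict"
    if status == 404:
        # Already deleted remotely; the client may choose to ignore this.
        return "gone"
    return "error"
```

On "conflict", a client would typically fetch the remote version again and apply its chosen merge strategy before retrying with the new timestamp.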

Offline-first

Since the server won’t be available to assign record identifiers while offline, it is recommended to generate them on the client.

Record identifiers are UUIDs, a very common format for unique strings with an almost-zero collision probability [1].

When going back online, the set of changes can be sent to the server using a POST /batch request.
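A sketch of the offline flow: identifiers are generated locally with `uuid.uuid4()`, and pending creations are later grouped into a single batch body. The helper names are hypothetical, and the payload layout (a `requests` list whose entries carry `method`, `path`, `headers` and `body`) is an assumption that should be checked against the batch endpoint documentation.

```python
import uuid
from typing import Dict, List


def new_local_record(fields: dict) -> dict:
    """Create a record offline, with a client-generated UUID as its id."""
    return {"id": str(uuid.uuid4()), **fields}


def build_creation_batch(collection_path: str, created: List[dict]) -> Dict[str, list]:
    """Group pending creations into one body for POST /batch (assumed layout)."""
    return {
        "requests": [
            {
                "method": "PUT",
                "path": f"{collection_path}/records/{record['id']}",
                # Protected creation: never overwrite a record that already exists.
                "headers": {"If-None-Match": "*"},
                "body": {"data": record},
            }
            for record in created
        ]
    }
```

For example, `build_creation_batch("/buckets/<bid>/collections/<cid>", pending)` yields one PUT sub-request per pending record; local updates and deletes would be added in the same way with an If-Match header.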

Implementations

The current reference implementation for offline-first record synchronisation is Kinto.js.

Before that, some other clients were implemented in the context of the ReadingList project. That project was abandoned, but you can still see the implementation of the RL Web client (React.js), Android RL sync (Java) or Firefox RL client (asm.js).

[1] After generating 1 billion UUIDs every second for the next 100 years, the probability of creating just one duplicate would be about 50%.