Hi Patrick,
On 01/30/2012 10:32 AM, Patrick Ohly wrote:
On Mo, 2012-01-30 at 08:57 +0100, Mikel Astiz wrote:
> How would that be? I thought the sync engine relied on the backend LUID
> to be able to merge changes.
Without a reliable LUID, some things (kind of) already work:
* --print-items: passes through the raw data
* "refresh from phone" syncing: delete all local data, convert
  phone contacts and store locally - without merging
* "slow sync": compare data to find matches, merge - problem is
  that it may attempt to write to the phone
Creating the LUID from some contact properties is similar to what the
engine does in a slow sync. The difference is in the choice of
properties. The engine is configured to compare the following
properties:
src/syncevo/configs/datatypes/00vcard-fieldlist.xml
<field name="N_LAST" type="string" compare="always"/>
<field name="N_FIRST" type="string" compare="always"/>
<field name="N_MIDDLE" type="string" compare="always"/>
[...]
<field name="ORG_NAME" type="string" compare="slowsync" merge="fillempty"/>
That's actually pretty close to creating a LUID from FN. The engine will
also look at the first part of ORG to distinguish between "John Doe @
Intel" and "John Doe @ BMW".
Note that the engine distinguishes between an initial sync (where it
assumes that both sides have mostly items created independently) and a
slow sync (where it assumes that both sides have almost the same items,
copied by a previous sync):
• "slowsync": field is compared in conflict case (like "conflict"), but
  in addition, it is also compared during slow-sync to match client
  objects with existing server objects. Therefore, "slowsync" should be
  set only on data fields that are important for identifying objects
  (such as name, company, country, but probably not details that might
  differ in server and client like telephone numbers, notes etc.).
  Setting too many fields to "slowsync" carries the risk of creating
  duplicates during slow sync, because the matching criteria is too
  tight and small differences between client and server versions of a
  data record will prevent them to match.
• "always": field is always used in comparisons, not only in conflict
  and slow-sync cases, but also in "first time sync" case. This is the
  special case when a client and a server perform sync for the first
  time. This is different from slow-sync as in a first-time sync
  situation, it is often desirable to have relatively loose matching
  criteria (for example only compare first and last name) to match and
  union server and client objects. Use this only for fields that are
  absolutely essential for identifying an object.
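To make the difference between the two compare modes concrete, here is a minimal sketch. It is not how the engine is implemented; the COMPARE_MODE table and both helper functions are hypothetical, and only the field names and attribute values are taken from the XML excerpt above.

```python
# Hypothetical table mirroring the "compare" attributes from the excerpt above.
COMPARE_MODE = {
    "N_LAST": "always",
    "N_FIRST": "always",
    "N_MIDDLE": "always",
    "ORG_NAME": "slowsync",
}


def fields_for(sync_kind):
    """Return the field names compared for a 'first-time' or 'slow' sync."""
    if sync_kind == "first-time":
        wanted = {"always"}
    elif sync_kind == "slow":
        wanted = {"always", "slowsync"}
    else:
        raise ValueError("unknown sync kind: %r" % (sync_kind,))
    return [name for name, mode in COMPARE_MODE.items() if mode in wanted]


def same_item(a, b, sync_kind):
    """Treat two contacts (dicts) as the same item if all compared fields agree."""
    return all(a.get(f, "") == b.get(f, "") for f in fields_for(sync_kind))


intel = {"N_FIRST": "John", "N_LAST": "Doe", "ORG_NAME": "Intel"}
bmw = {"N_FIRST": "John", "N_LAST": "Doe", "ORG_NAME": "BMW"}
print(same_item(intel, bmw, "first-time"))  # loose matching: names only
print(same_item(intel, bmw, "slow"))        # tighter matching: ORG_NAME too
```

With this table, the two John Does are merged in a first-time sync but kept apart in a slow sync, which is the "John Doe @ Intel" versus "John Doe @ BMW" distinction mentioned above.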
So what do we get from doing some of the engine's work in advance by
creating a LUID from FN? It is a first step towards "one-way from phone"
sync. If you combine it with a revision string created from the data
(for example, a simple hash of the full vCard received from the phone),
then the engine can look at the list of items and only copy/merge those
which have changed since the last sync.
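A rough sketch of that idea, assuming the backend keeps a LUID-to-revision map from the previous sync. All function names here are made up for illustration, and SHA-256 is just one choice of "simple hash". Note that two contacts with the same FN would collapse into one LUID in this sketch.

```python
import hashlib


def extract_fn(vcard):
    """Return the value of the first FN property, or '' if there is none."""
    for line in vcard.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.split(";")[0].upper() == "FN":
            return value.strip()
    return ""


def luid_and_revision(vcard):
    """LUID is derived from FN, revision is a hash of the full vCard text."""
    luid = extract_fn(vcard)
    revision = hashlib.sha256(vcard.encode("utf-8")).hexdigest()
    return luid, revision


def detect_changes(previous, vcards):
    """Compare the phone's vCards against the LUID -> revision map of the last sync."""
    current = dict(luid_and_revision(v) for v in vcards)
    added = sorted(set(current) - set(previous))
    removed = sorted(set(previous) - set(current))
    modified = sorted(l for l in current
                      if l in previous and current[l] != previous[l])
    return current, added, removed, modified
```

The engine would then only need to look at the items reported as added, removed or modified, and the returned map becomes the "previous" state for the next sync.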
"one-way" from phone would definitely be the typical use-case inside the
car, so according to this, it seems that the LUID generation would be
required.
The downside of the "hash vCard" approach is that a slight change in the
data (like reordering properties or different folding, as it might
happen when the software creating the vCard changes) will lead to false
"data changed" events - not a big issue. It increases the risk of
conflicts (if local data was also changed in the meantime) and (more
likely) leads to unnecessary work for writing changed data.
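To illustrate the false "data changed" case: the sketch below hashes two vCards that differ only in folding and property order. The normalization step is one possible mitigation, not something proposed in this thread, and both helper names are hypothetical. Sorting lines is only meant for hashing; the result is not a valid vCard.

```python
import hashlib
import re


def normalize_vcard(vcard):
    """Unfold continuation lines and sort the lines, for hashing only."""
    text = vcard.replace("\r\n", "\n")
    # A line break followed by one space or tab marks a folded line.
    text = re.sub(r"\n[ \t]", "", text)
    lines = [line for line in text.split("\n") if line]
    return "\n".join(sorted(lines))


def revision(vcard, normalize=True):
    """Hash the vCard, optionally after normalizing folding and order."""
    data = normalize_vcard(vcard) if normalize else vcard
    return hashlib.sha256(data.encode("utf-8")).hexdigest()


folded = "BEGIN:VCARD\r\nFN:John Doe\r\nNOTE:a long\r\n  note\r\nEND:VCARD\r\n"
reordered = "BEGIN:VCARD\r\nNOTE:a long note\r\nFN:John Doe\r\nEND:VCARD\r\n"

# Raw hashes differ although the contact data is the same.
print(revision(folded, normalize=False) == revision(reordered, normalize=False))
# Hashes of the normalized text agree.
print(revision(folded) == revision(reordered))
```

This does not catch every cosmetic difference (parameter order or case, for example), so some unnecessary writes would remain.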
The "hash vCard" approach was exactly what I had in mind. Those
drawbacks do not seem major issues.
When I suggested throwing an error on writes and enumerating items, I was
only thinking of the --print-items and "refresh from phone" use cases.
Slow sync will require further thought/changes (configure engine to
never write back to the phone?!) and as I said above, "one-way" sync
will also need a revision string.
Makes sense.
While talking about use cases, let me point out the big (sic!) elephant
in the room: how should a PHOTO be handled in a "one-way" sync? The goal
has to be to make such a sync as fast as possible in the normal case
(all data already available locally).
We can't avoid retrieving the full address book, because we need the FN
to find matches. Can we ask for a subset of the data, most notably
without the big PHOTO property?
Yes, we can do that. It would be exactly like that, having a first pass
without the PHOTO property and later a second one for this purpose. I
still have to think how this two-pass approach would be possible using
SyncEvolution.
Technically PBAP supports that, if I'm not mistaken. Not sure whether
the obex-client API supports it.
Yes, it does.
Let's assume we can and will do that during a "one-way" sync. It'll
reduce memory and BT bandwidth usage. We'll be able to identify new and
removed contacts. The drawbacks:
1. A new contact will be stored without PHOTO.
2. A modified PHOTO will not be recognized as modified.
We could add some code which proactively reads all new contacts *within
the same PBAP session* (because we still have their valid PBAP IDs).
The second problem is harder. To find all modified PHOTOs, we
have to read all of them, which puts us back to the situation that we
wanted to avoid.
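The two-pass idea could look roughly like the sketch below. FakePbapSession is a stand-in written for this example and is not the obex-client API; its method names are invented. Matching is done on FN only, as discussed above.

```python
class FakePbapSession:
    """Stand-in for an open PBAP session; not a real obex-client API."""

    def __init__(self, phonebook):
        self._phonebook = phonebook  # PBAP ID -> contact dict
        self.photo_pulls = 0         # counts the expensive second-pass reads

    def pull_all_without_photo(self):
        """First pass: every contact, with the PHOTO property filtered out."""
        return {pid: {k: v for k, v in c.items() if k != "PHOTO"}
                for pid, c in self._phonebook.items()}

    def pull_one_with_photo(self, pbap_id):
        """Second pass: one full contact, valid only within this session."""
        self.photo_pulls += 1
        return dict(self._phonebook[pbap_id])


def one_way_sync(session, local):
    """Return the new local state (FN -> contact) after a one-way sync."""
    first_pass = session.pull_all_without_photo()
    remote_names = {c["FN"] for c in first_pass.values()}
    # Contacts that disappeared from the phone are removed locally.
    result = {fn: c for fn, c in local.items() if fn in remote_names}
    for pbap_id, contact in first_pass.items():
        fn = contact["FN"]
        if fn not in result:
            # New contact: fetch it again with PHOTO while the ID is valid.
            result[fn] = session.pull_one_with_photo(pbap_id)
    return result
```

New contacts get their PHOTO (drawback 1 is addressed), but a contact that already exists locally keeps its old PHOTO because it is never re-read (drawback 2 remains).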
Perhaps it would be acceptable to not refresh photos during sync?
Instead this could be done in the background when a local contact is
actually used. The drawback there is that the PBAP session and the
information about the contact's PBAP ID in that session is most likely
gone, and thus we would have to search for the contact via PBAP (full
name, phone number, ...)
As you mention, the first point should not be a problem as long as we
keep the session alive for the second pass. However, as I mentioned
before, I don't see how this could be integrated in SyncEvolution.
Regarding the photo modification detection, that's something the
protocol makes kind-of difficult anyway. It's a shame that the REV
property in PBAP is not more widely used.
Cheers,
Mikel