Archive

Archive for October, 2012

openSUSE Conference 2012

October 22, 2012 1 comment

I spent the last weekend in Prague at the openSUSE conference 2012. It was a great opportunity to visit the wonderful city of Prague and meet old friends from the openSUSE community and get a bit more involved into openSUSE again.

I gave a talk about ownCloud on Saturday where I tried to show some technical details. That could have gone even deeper was the feedback some people gave me so I promise to show more of for example app development next time. Apart from that (and apart from the fact that laptops, projectors and Linux still do not play well) the talk went ok and was well received.

The conference is a joint conference of actually four conferences: The openSUSE Conference, the Czech Linux Days, the Gentoo Summit and the SUSE Labs Conference. Over the weekend it was hosted at the Czech technical university of prague and that was a great venue. Very inspiring that over the whole foyer of the building intersting art sculptures were shown. Unusual for a technical faculty but a great thing. The building is modern and large enough for all the talks and workshops but as a result of that it was sometimes a bit difficult to keep the overview where happens what. Especially because the start- and end times of the talks were not kept in sync over the tracks.

I found it is a different atmosphere on the event compared to the openSUSE conferences of the last years which might have been more focussed around the openSUSE project. This one felt more like an general FOSS event. That is not bad, just different, and since change is obvious and good I look forward to seeing how this influences the openSUSE project in the near future.

I like to thank all involved on the organizing teams for their successful hard work, have a lot of fun for the next two days, I had a lot during the last two days 🙂

Csync for ownCloud Client 1.1.0 – A New Sync Engine

October 11, 2012 18 comments

Along with todays ownCloud 4.5 release we released the new ownCloud Client 1.1.0 with a new syncing concept.

This blog will shed some light on the details. I apologize, it’s a long read.

Time Issues

ownCloud Client versions 1.0.x worked with csyncs traditional way of using the file modification times to detect updates between the two repositories that should be synced to each other. That works fine and conforms to our idea to ideally not use any other metadata in syncing than what the file system has anyway.

Time flies

However, there is one drawback which we all know from daily life: If at least two parties sync on time its important that all clocks are set exactly the same way. Remember good crime movies where a bank robbery always starts with a clock adjustment of all gangsters? We have exactly the same in ownClouds syncing: All involved have to have the same time setting, otherwise modification times of files can not be compared reliably.

There are solutions for computers to set the exact time (like ntp) so in general that works. However, in real life scenarios these are not reliable because either people do not have them started on the system or because the daemon updates the time once in a while and in that time span the clock skews already too much.

Users all the time reported problems with that and other experts continued to advise that we never get around that problems if we don’t change something fundamental and go away from pure time based syncing.

Well, we did that with our csync version 0.60.0 which is the sync engine for ownCloud Client 1.1.0.

An Unique Id

Now, every file and directory inside a sync directory has an unique Id associated. The idea is that the Id changes if the file changes. So in the sync process the need for a file update in either direction can be computed by comparing the two Ids of the file. If the id has changed on one repository the file was changed there and needs to be synced to the other side.

The Ids are generated on the ownCloud server and one challenge for the client is to always download the correct Id of a file. The Ids are just random tags for a file version. It is not associated to the file content as MD5 sums would be. Actually it was a frequent advise to use MD5 sums or a similar approach which digests the files content to detect updates. That would have come very handy because that means comparing file contents directly and, more important, it’s reproducable on either side. Also the client would have been able to recalculate the MD5-Sum of the local files and would not have depended on a local database with Ids that were pulled from the server before.

But we decided against hashes. Calculating MD5-Sums is costly in terms of CPU and time, especially for large files. The CPU problem is small on clients, but not on servers where a lot of clients connect to. Even though the sums can be calculated during upload, the problems remain for the case where the server does not see the upload stream, think of the “mount my Dropbox” case.

For files on the ownCloud server, the Id is always updated when the file gets updated. On the client side the last Id of a file is in the client database. It is invalidated in case the files modification time changed meanwhile to detect local changes.

Change Propagation

Another remarkable change in the 1.1.0 client is that change events in the file tree propagate up to the top directory on the owncloud server, ie. if a file changes in a directory, the id of the directory changes as well as the one of its parent directory etc.

That means that to detect if a file tree has changed, it’s enough to check the top most directories Id. If that has changed, ok, than the client needs to dig deeper, but in the not so rare case that nothing has changed, the one call is enough to detect that. That dramatically lowers the server load with clients because instead of digging through the whole directory structure what we did with the 1.0.x series
it is a few requests now.

CSync and ownCloud for Success

These are very intrusive changes to csync. For example, we had to add two additional fields to the database, add code that is able to build a representation
of the local file tree from the database and make csync query for the file Ids from
the server if needed. Deep under the hood the updater, reconciler and propagator code needed changes to work with the Ids. All these changes did not go back to csync upstream yet.

To not conflict with the upstream version of csync we decided to rename our csync version to ocsync. But: This is a temporar solution for the time we need to catch up with upstream again. That will take a while until everything is sorted again but we will work on that.

I am are very excited about the new version of csync. But obviously there are other changes in the ownCloud Client 1.1.0 which will be subject of another blog post.