Archive

Posts Tagged ‘syncing’

ownCloud and CryFS

August 17, 2019 4 comments

It is a great idea to encrypt files on the client side before uploading them to an ownCloud server if that server is not running in a controlled environment, or if one just wants to act defensively and minimize risk.

Some people think it is a great idea to include the functionality in the sync client.

I don’t agree, because it combines two very complex topics in one code base and makes the code difficult to maintain. There is a high risk of ending up with a code base that nobody is able to maintain properly any more. So let’s avoid that for ownCloud and look for alternatives.

A good way is to use a so-called encrypted overlay filesystem and let ownCloud sync the encrypted files. The downside is that you cannot use the encrypted files in the web interface because it cannot easily decrypt them. To me, that is not overly important because I want to sync files between different clients, which is probably the most common use case.

Encrypted overlay filesystems put the encrypted data in one directory called the cipher directory. A decrypted representation of the data is mounted to a different directory, in which the user works.

That is easy to set up and use, and in principle it also works well with file sync software like ownCloud, because it does not store the files in one huge container file that needs to be synced whenever a single bit changes, as other solutions do.

To use it, the cipher directory must be configured as the local sync dir of the client. If a file is changed in the mounted dir, the overlay filesystem changes the encrypted files in the cipher dir. These are synced by the ownCloud client.

One of the solutions I tried is CryFS. It works nicely in general, but is unfortunately very slow together with ownCloud sync.

The reason for that is that CryFS chunks all files in the cipher dir into 16 kB blocks, which are spread over a set of directories. That is very beneficial because file names and sizes cannot be reconstructed from the cipher dir, but it hits one of the weak spots of the ownCloud sync: ownCloud is traditionally a bit slow with many small files spread over many directories. That shows dramatically in a test with CryFS: adding eleven new files with an overall size of around 45 MB to a CryFS filesystem directory keeps the ownCloud client uploading for 6:30 minutes.

Adding another four files with a total size of a bit over 1 MB results in an upload of 130 files and directories, with an overall size of 1.1 MB.

A typical change use case, like changing an existing office text document locally, is not that bad. CryFS splits an 8.2 kB LibreOffice text document into three 16 kB files in three directories here. When one word gets inserted, CryFS creates three new dirs in the cipher dir, and four new 16 kB blocks get uploaded.
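To get a feeling for these numbers, here is a minimal sketch that estimates how many blocks a given amount of data turns into. It assumes the 16 kB block size mentioned above and plain division; the function name is made up for illustration, and any additional blocks the overlay filesystem creates for its own bookkeeping are ignored, so this is only a lower bound.

```python
import math

# Block size as stated in the post (16 kB); treated as an assumption here.
BLOCK_SIZE = 16 * 1024


def min_blocks(file_size_bytes: int, block_size: int = BLOCK_SIZE) -> int:
    """Lower bound on the number of cipher blocks needed for one file.

    Extra blocks for filesystem metadata are not counted, so real numbers
    (like the three blocks seen for an 8.2 kB document) are higher.
    """
    return max(1, math.ceil(file_size_bytes / block_size))


# Roughly 45 MB of new data, as in the test described above.
print(min_blocks(45 * 1024 * 1024))  # 2880
```

Even this lower bound means the sync client has to handle thousands of small files for a handful of user files, which matches the slow upload seen in the test.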

My personal conclusion: CryFS is an interesting project. It has a nice integration in the KDE desktop with Plasma Vault. Splitting files into equally sized blocks is good because it does not allow guessing data based on names and sizes. However, for syncing with ownCloud, it is not the best partner.

If there is a way to improve the situation, I would be eager to learn about it. Maybe the size of the blocks can be increased, or the number of directories limited?
Also, the upcoming ownCloud sync client version 2.6.0 again has optimizations in the discovery and propagation of changes. I am sure that will improve the situation.

Let’s see what other alternatives can be found.

Categories: FOSS, KDE, ownCloud

ownCloud Chunking NG Part 2: Announcing an Upload

July 10, 2015 5 comments

The first part of this little blog series explained the basic operations of chunk file upload as we set it up for discussion. This part goes a bit beyond and talks about an addition to that, called announcing the upload.

With the processing described in the first part of the blog, the upload is done safely and with a clean approach, but it also has some drawbacks.

Most notably, the server does not know the target filename of the uploaded file upfront. It also does not know the final size or mimetype of the target file. That is not a problem in general, but imagine the following situation: a big file is to be uploaded which would exceed the user’s quota. That would only become an error for the user once all chunks have been uploaded and the final upload directory is about to be moved to the final file name.

To avoid useless file transfers like that, or to implement features like a file firewall, it would be good if the server knew this data at the start of the upload and could stop the upload in case it cannot be accepted.

To achieve that, the client creates a file called _meta in /uploads/ before the upload of the chunks starts. The file contains information such as overall size, target file name and other meta information.

The server’s reply to the PUT of the _meta file can be a failure result code and an error description to indicate that the upload will not be accepted due to certain server conditions. The client should check the result code to avoid unnecessarily uploading data for which the final MOVE would fail anyway.
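The idea can be sketched in a few lines. This is only an illustration under assumptions, not ownCloud code: the field names of the announcement, the function names and the example path are all made up, and the server check is simulated locally instead of being a real PUT request. The status code 507 (Insufficient Storage) is a regular WebDAV status code, but whether a server would use it here is an open question.

```python
import json
from dataclasses import dataclass, asdict
from typing import Tuple


@dataclass
class UploadAnnouncement:
    # Hypothetical field names; the post only says the file contains the
    # overall size, the target file name and other meta information.
    target_path: str
    total_size: int
    mimetype: str


def check_announcement(meta: UploadAnnouncement, free_quota: int) -> Tuple[int, str]:
    """Simulated server-side check of a _meta file."""
    if meta.total_size > free_quota:
        # 507 Insufficient Storage is defined for WebDAV (RFC 4918).
        return 507, "upload would exceed the user's quota"
    return 201, "announcement accepted"


def should_upload_chunks(status: int) -> bool:
    """Client side: only continue with the chunks after a 2xx reply."""
    return 200 <= status < 300


meta = UploadAnnouncement("/photos/video.mp4", 5 * 1024**3, "video/mp4")
body = json.dumps(asdict(meta))  # what a client could PUT as _meta
status, reason = check_announcement(meta, free_quota=1 * 1024**3)
print(status, reason, should_upload_chunks(status))
```

With a check like this, the client learns about the quota problem after one small request instead of after transferring all chunks.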

This is just a collection of ideas for an improved big file chunking protocol, nothing is decided yet. But now is the time to discuss. We’re looking forward to hearing your input.

The third and last part will describe how this plays into delta sync, which is especially interesting for big files, which are usually chunked.

ownCloud Client 1.8.0 Released

March 17, 2015 14 comments

Today, we’re happy to release the best ownCloud Desktop Client ever to our community and users! It is ownCloud Client 1.8.0 and it will push syncing with ownCloud to a new level of performance, stability and convenience.

The Share Dialog

This release brings a new integration into the operating system file manager. With 1.8.0, there is a new context menu that opens a dialog to allow the user to create a public link on a synced file. This link can be forwarded to other users who get access to the file via ownCloud.

Also, the client’s behavior when syncing files that are opened by other applications on Windows has been greatly improved. The file locking problems some users saw, for example with MS Office apps, were fixed.

Another area of improvement is, again, performance. With the latest ownCloud servers, the client uses even more parallelized requests, now for all kinds of operations. Depending on the synced data structure, this can make a huge difference.

All the other changes, improvements and bug fixes are too many to count. In total, this release received around 700 git commits compared to the previous release.

All this is only possible with the powerful and awesome community of ownClouders. We received a lot of very good contributions through the GitHub tracker, which helped us to nail down a lot of issues and improved the client tremendously.

But this time we’d like to specifically point out the code contributions of Alfie “Azelphur” Day and Roeland Jago Douma who contributed significant code bits to the sharing dialog on the client and also some server code.

A big thank you goes out to all of you who helped with this release. It was a great experience again and it is great fun working with you!

We hope you enjoy 1.8.0! Get it from https://owncloud.org/install/#desktop

ownCloud ETags and FileIDs

March 13, 2015 2 comments

Often questions come up about the meaning of FileIDs and ETags. Both values are metadata that the ownCloud Server stores for each of the files and directories in the server database. These values are fundamentally important for the integrity of data in the overall system.
Here are some thoughts about what they are and why they are so important. This is mainly from a client’s point of view, but there are other use cases as well.

ETags

ETags are strings that describe exactly one specific version of a file (example: 71a89a94b0846d53c17905a940b1581e).

Whenever the file changes, the ownCloud server will make sure that the ETag of that file changes as well. It is not important in which way the ETag changes, and it does not have to be strictly unique either. It is just important that it changes reliably if the file changes for whatever reason. However, ETags should not change if the file was not changed, otherwise the client will download that file again.

In addition to that, the ETags of the parent directories of the file have to change as well, up to the root directory. That way client systems can detect changes that happen somewhere in the file tree. This is in contrast to normal computer file systems, where only the modification time of the direct parent of a file changes.
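The propagation up to the root can be sketched as follows. This is a minimal in-memory model under assumptions, not the server implementation: the ETags live in a plain dictionary keyed by path, the helper names are invented, and a random hex string stands in for whatever value the server really generates.

```python
import uuid
from typing import Dict, Iterator


def new_etag() -> str:
    # Any value works as long as it changes; a random hex string is one option.
    return uuid.uuid4().hex


def parent_paths(path: str) -> Iterator[str]:
    """Yield all ancestors of a path, ending with the root "/"."""
    parts = path.strip("/").split("/")
    for i in range(len(parts) - 1, 0, -1):
        yield "/" + "/".join(parts[:i])
    yield "/"


def touch(etags: Dict[str, str], path: str) -> None:
    """Record a change to `path` and bump the ETags of all its parents."""
    etags[path] = new_etag()
    for parent in parent_paths(path):
        etags[parent] = new_etag()


etags = {p: new_etag() for p in ["/", "/docs", "/docs/a.txt", "/pics", "/pics/b.jpg"]}
before = dict(etags)
touch(etags, "/docs/a.txt")
print(sorted(p for p in etags if etags[p] != before[p]))
# ['/', '/docs', '/docs/a.txt']
```

A client that sees an unchanged ETag on "/pics" can skip that whole subtree, which is what makes change detection cheap.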

File IDs

FileIDs are also strings that are created once at the creation time of the file (example: 00003867ocobzus5kn6s).

But contrary to the ETags, the FileIDs should never ever change over the file’s lifetime. Not on an edit of the file, and also not if the file is renamed or moved. One of the important usages of the FileID is to detect renames and moves of a file on the server.

The FileID is used as a unique key to identify a file. FileIDs need to be unique within one ownCloud, and in inter-ownCloud connections, they must be compared together with the ownCloud server instance id.

Also, the FileIDs must never be recycled or reused.
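How a stable FileID helps to detect a move can be shown with a small sketch. It assumes the client keeps a snapshot that maps paths to FileIDs and compares it with a newer snapshot from the server; the function name is invented and the second FileID in the example is made up for illustration.

```python
from typing import Dict, List, Tuple


def detect_moves(old: Dict[str, str], new: Dict[str, str]) -> List[Tuple[str, str]]:
    """Compare two path -> FileID snapshots and return (old_path, new_path) pairs."""
    old_by_id = {file_id: path for path, file_id in old.items()}
    moves = []
    for path, file_id in new.items():
        old_path = old_by_id.get(file_id)
        # Same FileID under a different path means the file was renamed or moved.
        if old_path is not None and old_path != path:
            moves.append((old_path, path))
    return sorted(moves)


old = {"/a.txt": "00003867ocobzus5kn6s", "/b.txt": "00003868ocobzus5kn6s"}
new = {"/docs/a.txt": "00003867ocobzus5kn6s", "/b.txt": "00003868ocobzus5kn6s"}
print(detect_moves(old, new))  # [('/a.txt', '/docs/a.txt')]
```

Without the FileID, the same situation would look like a deleted file plus an unrelated new file, and the data would have to be transferred again.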

Checksums?

Often ETags and FileIDs are confused with checksums such as MD5 or SHA1 sums over the file content.

Neither ETags nor FileIDs are checksums, even if there are similarities: the ETag in particular can be seen as something like a checksum over the file content. However, file checksums are far more costly to compute than a value that only needs to change somehow.

What happens if…?

Let’s do a thought experiment and consider what it would mean, especially for sync clients, if either the FileID or the ETag gets lost from the server’s database.

If ETags are lost, clients lose the ability to decide whether files have changed since the last time the clients checked. So what happens is that the client will download the files again, compare them byte-wise to the local file and use the server file if the files differ. A conflict file will be created. Because the ETag was lost, the server will create new ETags on download. This could be improved by the server creating more predictable ETags based on the storage backend’s capabilities.

If the ETags are changed without reason, for example because a backup was restored on the server, the clients will consider the files with changed ETags as changed and download them again. Conflict handling will happen as described if there was a local change as well.

For the user, this means a lot of unnecessary downloads as well as potential conflicts. However, there will not be data loss.
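The effect can be illustrated with a simplified sketch of the client decision. It assumes that the client remembers the ETag of every file from its last sync run and compares it with the current server value; the function name and the ETag values are placeholders, and the real client does considerably more than this.

```python
from typing import Dict, List


def files_to_download(known: Dict[str, str], server: Dict[str, str]) -> List[str]:
    """Paths whose server ETag differs from the one stored at the last sync."""
    return sorted(path for path, etag in server.items() if known.get(path) != etag)


known = {"/a.txt": "e1", "/b.txt": "e2"}

# Normal case: only /b.txt was edited on the server.
print(files_to_download(known, {"/a.txt": "e1", "/b.txt": "e3"}))  # ['/b.txt']

# ETags regenerated, for example after a restore: every file looks changed.
print(files_to_download(known, {"/a.txt": "x1", "/b.txt": "x2"}))  # ['/a.txt', '/b.txt']
```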

If FileIDs get lost or changed, the problem is that renames or moves on the server side can no longer be detected. In the good case, that results in a new download of the files. If a FileID however changes to something that was used before, that can result in a rename that overwrites an unrelated file. That is because clients might still have the FileID associated with another file.

Hopefully this little post explains the importance of the additional metadata that we maintain in ownCloud.

Workshop at CERN

November 27, 2014 5 comments

Last week, Thomas, Christian and I attended a workshop at CERN, the European Organization for Nuclear Research in Geneva, Switzerland.

CERN is a very inspiring place, attracting intelligent people from all over the world to get behind the secrets of our being. I felt honored to be at the place where for example the world wide web was invented.

The event was called Workshop on Cloud Services for File Synchronisation and Sharing and was hosted by the CERN IT department. There were around 100 attendees.

I gave a talk called The File Sync Algorithm of the ownCloud Desktop Clients, which was very well received. If you happen to be interested in the sync algorithm we’re using, the slides are a nice starting point.

What amazed me most was the great atmosphere and the very positive attitude towards ownCloud. Many of the representatives of edu organizations using ownCloud that I talked to were very happy with the product from the technical POV (even though there are problems here and there). A lot of interesting setups and environments were explained, which also showcased ownCloud’s flexibility to integrate into existing structures.

What was also pointed out by the attendees of the workshop was the importance of the fact that ownCloud is open source. Non-free software does not have a chance at all in that market. That was the very clear statement in the final discussion session of the workshop.

The keynote was given by Prof. Benjamin Pierce from Pennsylvania with the title Principles of Synchronization. He is the lead author of the project Unison, which is another open source sync project. Its sync engine is of very high quality, but it is not “up-to-date software” any more, as he said.

I had the pleasure to spend quite some time with him to discuss syncing in general and our sync algorithms in particular, amongst other interesting things.

Atlas Detectors

As part of his work, he uses a tool called QuickCheck to do very advanced testing. One night we were sitting in the canteen there, hacking to adapt the testing to the ownCloud client and server. The first results were very promising. For example, we revealed a “problem” in our sync core that I knew of, which formally is a sync error, yet is very, very unlikely to happen and thus accepted for the sake of an easier algorithm. It was impressive how fast the testing method identified that problem. I would like to follow up on the testing method.

Furthermore, we met a whole variety of other interesting people: backend developers, operators of the huge datasets (100 petabytes), the director of CERN IT, a maintainer of Scientific Linux and others.

We also had the chance to visit the ATLAS experiment. It is 100 meters underneath the surface and huge. That is where the accelerated particles collide, and it was great to have the chance to visit it.

The trip was a great experience and very motivating for me, and I think it should be for all of us working on ownCloud. Frank really hit a nerve when he seeded the idea, and we all have made a nice product out of it so far.

Let’s do more of this cool stuff!

Categories: Event, FOSS, ownCloud

After the 1.4.0 ownCloud Client Release

September 11, 2013 10 comments

You might have heard that ownCloud Client 1.4.0 was released last week. It is available from our sync clients page for all major desktop platforms. See the Changelog for details.

Danimo’s Visual Guide has outlined the new stuff in the release already, so there is no need to repeat it here. You should install and try it; that seems to be the opinion of many people who have tried it.

Also, people who shared their critical view of the client very publicly in the past are much more pleased now with 1.4.0. One example is a recent blog post on BITBlokes. It is a blog about all kinds of topics around FOSS. I regularly read it and often share its opinions. The author’s conclusion about the 1.4.0 client is very positive.

It is good to see the positive feedback overall. That shows a couple of things from my engineering point of view: the concentrated work we continuously do on all parts of ownCloud pays off. That is obvious of course, but still nice to see. And our (also obvious) actions to improve code quality, such as the consistent use of continuous integration, code reviews and such, help to improve quality.

“People are always excited if releases come with GUI changes!” I heard people saying. Well, maybe, but that’s not the whole truth. It also proves to me again how important UI design and UX are. As a knee-deep developer, I have an interesting relationship to all UX topics: I always have an opinion. Often a strong opinion. But the results coming out of that have not always been, well, optimal. Very fortunately, on the client we work together with our UX guy Jan, and the positive feedback also shows how good that is for the software.

But enough of release pride. There is more work to do: The bug tracker is still not empty, the list of feature ideas is long. We will continue to focus on correctness, stability and robustness of syncing, performance and useful features and work on a version 1.5 for you.

These are a couple of concrete points we’re focussing on for 1.5:

  1. We already merged the client code onto the new upstream sync version in git.
  2. Performance improvements through further reduction of the number of requests and more efficient database operations on the client.
  3. We are working on a new propagator component that allows us to do the changes mentioned in 2 more easily.
  4. File manager integration, which means having icons in Explorer, Dolphin and friends.

A more detailed list can be found on GitHub.

Thank you for all your help and support. It’s big fun!

ownCloud Client 1.2.0 beta1

December 21, 2012 10 comments

2012 is slowly coming to an end and we are all looking forward to a few quiet days around Christmas. But we did not want to leave for the holidays without adding another thing to your vacation experience: I am happy to announce the first beta of the upcoming ownCloud Client release 1.2.0, ready now for you to test and enjoy under the tree.

This is the first build with the new things we did in Berlin a couple of weeks ago. You will

  • discover that there is much better error reporting if something goes wrong.
  • probably feel like it syncs faster, yes faster.
  • see that there are fewer HTTP requests to the server for a single sync run.
  • no longer see any issues with Mac OS X and funny characters in filenames.
  • recognize a new icon set, which is not finalized yet (actually not all sizes are there, that’s why the status dialog looks a bit funny), but we thought it’s nice to already add it to the beta. It should fit nicely into your operating system environment.
  • realize that this client comes with a cross platform file system watcher on clientside, so no polling any more.
  • have your password stored in a secure keychain on all platforms since we added qtkeychain to the client.
Maybe there is more, but we thought that’s already a nice beta release.

Please find packages for Mac OS X, Windows and Linux distributions. Note that not all packages are finished yet. If the one for your distro is missing, please come back later, or even better, speak up at packaging@owncloud.org and help fix it 🙂

Of course you should also note that this is an early beta, so you should not use it without a good backup of your data, and only on a test account without important data.

We would appreciate it if you let us know about your experience on the mailing list. If you find problems, please report them to the client’s bug tracker, mentioning client and server versions and ideally including useful logs.

With that we are happily vanishing to spend some time away from the computer, looking back on a very exciting and very busy year, working on an interesting topic with a lot of nice people.

Thanks and best Season’s Greetings!