[guardian-dev] debugging libsqlfs

Abel Luck abel at guardianproject.info
Thu Jan 10 06:49:02 EST 2013

Hans-Christoph Steiner:
> On 01/08/2013 09:44 AM, Hans of Guardian wrote:
>> On Jan 8, 2013, at 4:05 AM, Abel Luck wrote:
>>> Hans-Christoph Steiner:
>>>> It seems that sqlfs_t gets re-issued a lot, I added this to get_sqlfs() and
>>>> got it many many times:
>>>>    if (p)
>>>>        return p;
>>>>    else
>>>>        show_msg(stderr, "sqlfs was 0");
>>>> This lead me to examine the threading stuff, since it seems that when sqlfs ==
>>>> 0, its getting a new sqlfs issued from:
>>>>    sqlfs = (sqlfs_t *) (pthread_getspecific(sql_key));
>>>> Since the libsqlfs code only has pthread code relating to the pthread_key_t
>>>> creation and use, my guess is that FUSE itself is handling the rest of the
>>>> threading.  So the good news is that libsqlfs was made with threading in mind,
>>>> the bad news is that a lot of it is normally handled in libfuse, so we'll have
>>>> to figure that part out and implement it in IOCipher in order to support threads.
>>> Yes! This was a conclusion I had several months ago when looking at
>>> Aaron's threading bug. There are definitely discrepancies between the
>>> way IOCipher runs and the we test things with libsqlfs and fuse.
>> Hey Abel,
>> I think that first thing's first: let's getting fully passing long fsx tests, then we can check whether we need to handle threads different in IOCipher.
> Ok, today was a good day, much better than before.
> * we isolated the super mysterious crash/corruption issues in libsqlfs itself.
>  It only happens with mmap'ed writes in multi-threaded mode.  So we can easily
> avoid doing that in IOCipher.
> https://dev.guardianproject.info/issues/488
> * we now have a pretty solid grasp of the multi-threading issues in IOCipher's
> JNI C code. We have some working test cases, so by the end of tomorrow we hope
> to have something for you to test in your app.  We need to restructure things
> a little bit to make sure that we're using the pthread_key_t to keep track of
> the instances of sqlfs_t state struct.
> I pushed my libsqlfs changes to implement the sqlfs threaded mode with a key,
> i.e. using the key in get_sqlfs():
> https://github.com/eighthave/libsqlfs
> Abel, feel free to merge this to the official repo if you think they're good.

Alright, so things aren't all rosy quite yet. Your changes are more or
less good on the sqlfs side of things, but some dramatic changes are
necessary on the IOCipher side of things, at least I think so.

First: the way things are currently in IOCipher:

VirtualFileSystem is responsible for opening and holding onto the
sqlfs*. .mount() calls the _open(db_name, key) method to open the
database and get a handle on the sqlfs*. isMounted() and unmount() rely
on this pointer to sqlfs.

This central global sqlfs* we suspect is the cause of our
multithreading issues.

Yesterday, we changed Posix.cpp and File.cpp to not use the global
sqlfs* but to grab it from the pthread specific context. Of course this
context doesn't yet exist.

Second: your changes attempt to create this context, but breaks our VFS
model (I'll explain).

We're moving from a central sqlfs* to a created-on-the-fly per thread
access to sqlfs*. This destroys our notion of a "mounted" VFS, or at
least it breaks semantics of "a valid sqlfs* means we're mounted".

You've seen how this works with fuse, sqlfs gets created and freed quite
a bit, that is, per thread. The only state that really sticks around is
(1) the database file path and (2) the sqlcipher key.

My conclusion is that we will simple not handle sqlfs* structs in our
IOCipher code. That will be internal to sqlfs only. Therefore, the
notion of a VirtualFileSystem being "mounted" means (1) we have a valid
path and (2) a valid key.

Implementation wise, I propose:

(1) VirtualFileSystem.init() validates databaseFileName is writable
    (dispensing with the sqlfs_init call)
(2) VirtualFileSystem.mount() does two things:
    a. if a key is provided AND the database file exists, check if
       the key is valid by attempting an open on the database
    b. call the new sqlfs_init_key() function to store the db name
       and key for thread access.
    c. set a boolean isMounted = true
(3) VirtualFileSystem.isMounted() returns the boolean set by .mount()
(4) VirtualFileSystem.unmount() unsets the boolean and overwrites the
databaseFileName and key values to null.

Issues: 1,2 are fine I think, but 3,4 are quite hacky. unmount() doesn't
mean anything know.

For example, if you have 2 threads using an IOCipher database, and one
thread unmounts(), the other can continue to write, because "mounted"
merely means a valid sqlite db and password.

Should we get rid of the notion of mount() altogether?

Gonna get some lunch and ponder.

More information about the Guardian-dev mailing list