[guardian-dev] debugging libsqlfs

Abel Luck abel at guardianproject.info
Tue Jan 8 13:45:41 EST 2013


I can reproduce these.

70-80% of the time the truncate op appears in the backtrace, so I
definitely think there's something fishy going on there.

However, due to the nature of fsx (i.e., do random shit a whole bunch
until something breaks) the bug might not necessarily lie in truncate.
Perhaps something is being written incorrectly, and truncation merely
triggers the fault.

:|


Hans-Christoph Steiner:
> 
> Ok round 2 with valgrind, this looks like it caught something.
> 
> * as root, I ran:
>  valgrind --tool=memcheck --trace-children=no --leak-check=full
> --show-reachable=yes \
>    --max-stackframe=3000000 -v ./.libs/fuse_sqlfs -d -o allow_other /mnt/
> 
> * then the same fsx as with gdb:
>  ./libsqlfs/tests/fsx  -l 10485760 -o 1048576  /mnt/testfile8
> 
> And it quite quickly gave me a useful looking dump, but its late, and I'm
> going to bed soon, so hopefully, Abel, you can make something of this:
> 
> FUSE library version: 2.9.0
> nullpath_ok: 0
> nopath: 0
> utime_omit_ok: 0
> unique: 1, opcode: INIT (26), nodeid: 0, insize: 56, pid: 0
> INIT: 7.17
> flags=0x0000047b
> max_readahead=0x00020000
>    INIT: 7.18
>    flags=0x00000010
>    max_readahead=0x00020000
>    max_write=0x00020000
>    max_background=0
>    congestion_threshold=0
>    unique: 1, success, outsize: 40
> unique: 2, opcode: LOOKUP (1), nodeid: 1, insize: 50, pid: 9137
> LOOKUP /testfile8
> getattr /testfile8
> --9131-- REDIR: 0x4267890 (__memset_sse2) redirected to 0x402aba0 (memset)
> --9131-- REDIR: 0x42689a0 (__memcpy_ssse3) redirected to 0x4029c10 (memcpy)
> --9131-- REDIR: 0x4273a40 (__memcmp_ssse3) redirected to 0x402a780 (bcmp)
>    NODEID: 2
>    unique: 2, success, outsize: 144
> unique: 3, opcode: OPEN (14), nodeid: 2, insize: 48, pid: 9137
> open flags: 0x8002 /testfile8
>    open[0] flags: 0x8002 /testfile8
>    unique: 3, success, outsize: 32
> unique: 4, opcode: SETATTR (4), nodeid: 2, insize: 128, pid: 9137
> truncate /testfile8 0
> ==9131== Thread 4:
> ==9131== Invalid write of size 4
> ==9131==    at 0x4029E8A: memcpy (mc_replace_strmem.c:838)
> ==9131==    by 0x412999A: get_value_block (sqlfs.c:1314)
> ==9131==    by 0x412A6C7: get_value (sqlfs.c:1475)
> ==9131==    by 0x412E803: sqlfs_proc_truncate (sqlfs.c:2533)
> ==9131==    by 0x412F262: sqlfs_op_truncate (sqlfs.c:3421)
> ==9131==    by 0x40F9809: fuse_fs_truncate (in
> /lib/i386-linux-gnu/libfuse.so.2.9.0)
> ==9131==    by 0x40FE63F: ??? (in /lib/i386-linux-gnu/libfuse.so.2.9.0)
> ==9131==    by 0x4103E8B: ??? (in /lib/i386-linux-gnu/libfuse.so.2.9.0)
> ==9131==    by 0x4104560: ??? (in /lib/i386-linux-gnu/libfuse.so.2.9.0)
> ==9131==  Address 0x66b4ae0 is 8,321,720 bytes inside a block of size
> 8,321,722 alloc'd
> ==9131==    at 0x4028308: malloc (vg_replace_malloc.c:263)
> ==9131==    by 0x412A64D: get_value (sqlfs.c:1467)
> ==9131==    by 0x412E803: sqlfs_proc_truncate (sqlfs.c:2533)
> ==9131==    by 0x412F262: sqlfs_op_truncate (sqlfs.c:3421)
> ==9131==    by 0x40F9809: fuse_fs_truncate (in
> /lib/i386-linux-gnu/libfuse.so.2.9.0)
> ==9131==    by 0x40FE63F: ??? (in /lib/i386-linux-gnu/libfuse.so.2.9.0)
> ==9131==    by 0x4103E8B: ??? (in /lib/i386-linux-gnu/libfuse.so.2.9.0)
> ==9131==    by 0x4104560: ??? (in /lib/i386-linux-gnu/libfuse.so.2.9.0)
> ==9131==
> ==9131== Invalid read of size 4
> ==9131==    at 0x4049C32: ??? (in /usr/lib/i386-linux-gnu/libsqlite3.so.0.8.6)
> ==9131==    by 0x4079F4A: ??? (in /usr/lib/i386-linux-gnu/libsqlite3.so.0.8.6)
> ==9131==    by 0x404DA04: ??? (in /usr/lib/i386-linux-gnu/libsqlite3.so.0.8.6)
> ==9131==    by 0x407F63F: ??? (in /usr/lib/i386-linux-gnu/libsqlite3.so.0.8.6)
> ==9131==    by 0x4092B2A: ??? (in /usr/lib/i386-linux-gnu/libsqlite3.so.0.8.6)
> ==9131==    by 0x40939A2: ??? (in /usr/lib/i386-linux-gnu/libsqlite3.so.0.8.6)
> ==9131==    by 0x40C8665: ??? (in /usr/lib/i386-linux-gnu/libsqlite3.so.0.8.6)
> ==9131==    by 0x6E3C3EF: ???
> ==9131==  Address 0xc is not stack'd, malloc'd or (recently) free'd
> ==9131==
> 
> 
> .hc
> 
> 
> On 01/07/2013 09:50 PM, Hans-Christoph Steiner wrote:
>>
>> Hey Abel,
>>
>> I'm now diving in deep into debugging using gdb for now.  I've been catching
>> crashed in sqlfs with gdb and can get backtraces.  For the record, here's how
>> I'm doing it:
>>
>> * using a VM to avoid filesystem crashes and lock-ups :)
>>
>> * to mount the file system, run this as root:
>> 	gdb --args ./.libs/fuse_sqlfs -d -o allow_other /mnt/
>>
>> * to test it, run this as a normal user:
>> 	./libsqlfs/tests/fsx  -l 10485760 -o 1048576  /mnt/testfile8
>>
>> Most of the backtraces look like this, i.e. occuring in sqlfs_proc_trunctate()
>> with sqlfs=sqlfs at entry=0x0:
>>
>> #1  0xb7ede033 in get_value_block (sqlfs=sqlfs at entry=0x8072b10,
>>     key=0x808a7c8
>> "(\317\006\b\b\265\b\b\300\266\b\bx\270\b\b\230\250\b\b\260\267\b\b\b",
>>     key at entry=0x805a618 "/testfile8",
>>     data=0xb64d2008
>> "\024\177\024\305\024\303\024\241\024\231\024\372\024\006\024\334\024u\024\314\024\336\024\353\024\064\024\266\024x\024\277\024$\024\063\024\033\024\332\024\206\024\257\024\034\024y\024\a\024\340\024\061\024\202\024\a\024e\024\017\024\071\024?\024\221\024$\024\274\024\271\024U\024\336\024\030\024\254\024\302\024\065\024@\024\020\024\061\024\257\024\243\024\261\024\"\024\305\024\353\024T\024D\024@\024\357\024\214\024\064\024%\024\356\024A\024\062\024\272\024p\024y\024#\024\321\024\246\024\326\024K\024\250\024\022\024\372\024!\024:\024\264\024O\024\356\024\317\024\377\024Y\024}\024\246\024\306\024\343\024K\024\b\024\261\024m\024\314\024]\024\005\024\261\024\351\024\\\024@\024\210\024\267\024\331\024\243"...,
>> block_no=block_no at entry=63, size=size at entry=0x0) at sqlfs.c:1326
>> #2  0xb7edee28 in get_value (sqlfs=0x8072b10, key=key at entry=0x805a618
>> "/testfile8",
>>     value=value at entry=0xb7504fb4, begin=8257536, begin at entry=0, end=8321722,
>> end at entry=0)
>>     at sqlfs.c:1487
>> #3  0xb7ee36f9 in sqlfs_proc_truncate (sqlfs=sqlfs at entry=0x0,
>> path=path at entry=0x805a618 "/testfile8",
>>     size=0) at sqlfs.c:2545
>> #4  0xb7ee3fb3 in sqlfs_op_truncate (path=0x805a618 "/testfile8", size=0) at
>> sqlfs.c:3433
>> #5  0xb7ef980a in fuse_fs_truncate () from /lib/i386-linux-gnu/libfuse.so.2
>>
>>
>> It seems that sqlfs_t gets re-issued a lot, I added this to get_sqlfs() and
>> got it many many times:
>>
>>     if (p)
>>         return p;
>>     else
>>         show_msg(stderr, "sqlfs was 0");
>>
>> This lead me to examine the threading stuff, since it seems that when sqlfs ==
>> 0, its getting a new sqlfs issued from:
>>
>>     sqlfs = (sqlfs_t *) (pthread_getspecific(sql_key));
>>
>> Since the libsqlfs code only has pthread code relating to the pthread_key_t
>> creation and use, my guess is that FUSE itself is handling the rest of the
>> threading.  So the good news is that libsqlfs was made with threading in mind,
>> the bad news is that a lot of it is normally handled in libfuse, so we'll have
>> to figure that part out and implement it in IOCipher in order to support threads.
>>
>> Now, I'm going to see if I can get valgrind to run on fuse_sqlfs.
>>
>> .hc
>>
> _______________________________________________
> Guardian-dev mailing list
> 
> Post: Guardian-dev at lists.mayfirst.org
> List info: https://lists.mayfirst.org/mailman/listinfo/guardian-dev
> 
> To Unsubscribe
>         Send email to:  Guardian-dev-unsubscribe at lists.mayfirst.org
>         Or visit: https://lists.mayfirst.org/mailman/options/guardian-dev/abel%40guardianproject.info
> 
> You are subscribed as: abel at guardianproject.info
> 



More information about the Guardian-dev mailing list