[guardian-dev] [Suspected Junk Mail] Re: 81% of Tor users can be de-anonymised by analysing router information, research indicates

Mon Nov 24 19:09:55 EST 2014

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi all,

Just in case everyone has forgotten about BitMessage, some of these
concepts remind me of at least its general idea. Latency there is
very high, and a lot of the issues around dedicated storage for
messages are part of its development. I haven't looked in on the
project in a while, but I used to run it over Tor just fine. I know a
separate program running over an anonymizing network is not the idea
being discussed, but this is a protocol and application in
enthusiastic current development, and some of those devs might be
worth contacting.

gl

On Sun, 23 Nov 2014 17:23:00 +0000 "Michael Rogers"
<michael at briarproject.org> wrote:
>On 22/11/14 00:27, str4d wrote:
>>> I'm not suggesting that running over Tor or I2P would make any
>>> system *less* effective. What I'm saying is that if we assume an
>>> adversary who can break Tor or I2P's anonymity through traffic
>>> confirmation, then we either need to make Tor or I2P stronger, or
>>> build a separate system that's stronger on its own. If we build
>>> a separate system then it won't provide a large anonymity set
>>> until it becomes popular, so we could face a chicken-and-egg
>>> problem.
>>
>> That makes sense. The chicken-and-egg problem is lessened if we
>> restrict the adversary's abilities to breaking Tor or I2P through
>> traffic confirmation on a targeted scale, rather than being able
>> to completely break the anonymity of all users all the time -
>> then there is a real benefit of having the separate stronger
>> system running over Tor or I2P. Whether this is a realistic
>> restriction, however...
>
>This restriction seems realistic to me. We might also consider an
>adversary who can see some subset of internet traffic and thus carry
>out traffic confirmation attacks against some subset of users (e.g.
>an ISP). We know these adversaries exist, regardless of whether the
>global adversary also exists.
>
>In each case, would it make sense for a new high-latency anonymity
>system to use Tor or I2P for its connections rather than plain TCP?
>
>For the global adversary, no. Tor and I2P are transparent to that
>adversary.
>
>For the targeted adversary, yes. If the high-latency system has any
>connections between targeted and untargeted nodes, using Tor or I2P
>will prevent the adversary from identifying and therefore targeting
>the untargeted nodes, so the adversary's view of the high-latency
>system will remain partial.
>
>For the subset adversary, maybe. If the high-latency system has any
>connections between inside and outside nodes, using Tor or I2P will
>prevent the adversary from identifying the outside nodes. But whether
>identifying the outside nodes without being able to target them is
>useful depends on how the adversary's trying to attack the
>high-latency system.
>
>So overall it looks like it makes sense to use Tor or I2P instead of
>plain TCP, even if you're aiming to resist stronger adversaries than
>Tor and I2P can resist on their own.
>
>>> By the way, I've been meaning to ask you about I2P-Bote's
>>> architecture for a while; maybe this is a good opportunity. Is
>>> the DHT where the messages are stored specific to I2P-Bote, or is
>>> it part of I2P?
>>
>> The DHT is a Kademlia DHT specific to I2P-Bote, with a few
>> modifications from standard Kademlia. See section 2 of the
>> technical documentation [0] for details. I2P's netDb DHT is only
>> for storing network information; applications are expected to
>> handle their own data requirements.
>
>Thanks! I couldn't find the tech docs before, I'll give them a
>read.
>
>>>> If high-latency tunnels would actually be useful, we can
>>>> implement them and get most of the network supporting delays
>>>> relatively quickly (we usually have 80% of the network on the
>>>> latest release within six weeks).
>>
>>> Wow, that would be amazing!
>>
>> Re-reading my message, I want to clarify that my last sentence
>> needed an additional comma. I was saying that once implemented,
>> getting the network to support the changes would be relatively
>> quick. Actually implementing delays is a much trickier kettle of
>> fish, and the subject of the rest of this message :)
>
>Ah, OK. :-)
>
>>> The longer the delays, the bigger the storage requirements. At
>>> some point you have to think about writing the data to disk until
>>> it's time to forward it.
>>
>> This applies equally to any system with a delay between receiving
>> and sending data - I2P/Tor with delays, I2P-Bote, Freenet,
>> Tahoe-LAFS...
>
>Yes, absolutely. My point was just that Tor doesn't currently use
>much disk space or disk throughput, so adding a disk-based data cache
>to Tor would be a big change for relay operators.
>
>> The storage requirement issue raises another question: what
>> incentive is there for other routers to store delayed packets?
>> There is no guarantee that a participating router is going to honor
>> your request. The answer is clearer for a specific app like
>> I2P-Bote or Freenet than it is for a generic network transport like
>> I2P or Tor. As the required delay increases, so does the incentive
>> required. IIRC, previous study indicates that a 10 minute delay is
>> the minimum that would make any difference [1], and that can
>> quickly become non-trivial.
>
>I agree it's important to think about the resources we're asking
>people to contribute, but I prefer not to frame the issue in terms of
>incentives, because in the past that led me down a game theory rabbit
>hole and it took me years to escape. :-)
>
>People contribute resources to Tor, I2P and Freenet for a wide range
>of reasons apart from improving their own anonymity.
>
>> Let's say we turned on a 20 minute delay for all 5000 I2PSnark
>> (torrent) users each with 50 KBps of traffic. That's 60 MB of data
>> for each of the 5000 to be buffered somewhere. If you keep it to
>> Snark users, that's 60 MB each. If you spread it across all I2P
>> routers, maybe 6 MB each. More likely, the data will be spread
>> across the fast routers used by the Snark users in their tunnels;
>> say there are ~1000 of them (approx number of I2P FFs), that is
>> 300 MB each. Then throw in the fact that this data is constantly
>> churning, with complete turnover every 20 minutes. Things get
>> sticky, and routers need a good reason for the additional memory
>> and disk load.
>
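Inline note on the arithmetic above: a rough Python sketch, only to
make the numbers easy to replay with other delays or rates. The
function names are mine and hypothetical, KB is taken as 1000 bytes,
and the user and router counts are the estimates from the quoted
message, not measurements.

```python
# Back-of-the-envelope buffer sizes for delayed traffic.
# All inputs are the rough estimates quoted above, not measured values.

def buffered_mb(rate_kbps: float, delay_minutes: float) -> float:
    """MB one user keeps in flight at rate_kbps (KB/s) for delay_minutes."""
    return rate_kbps * delay_minutes * 60 / 1000


def per_router_mb(total_mb: float, routers: int) -> float:
    """MB each router holds if total_mb is spread evenly over routers."""
    return total_mb / routers


USERS = 5000                                  # I2PSnark users (estimate)
per_user = buffered_mb(50, 20)                # 60.0 MB per user
total = per_user * USERS                      # 300000.0 MB, about 300 GB

fast_routers = per_router_mb(total, 1000)     # 300.0 MB on each fast router
# "Maybe 6 MB each" across all routers implies roughly this many routers:
implied_routers = total / 6                   # 50000.0

print(per_user, total, fast_routers, implied_routers)
```

For an email-like workload the rate term would presumably be far
smaller, which is the main lever here.
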
>I wouldn't expect people to run BitTorrent-like workloads over a
>high-latency system - I'm thinking of email-like workloads. But yeah,
>we should definitely consider the disk space and disk throughput
>requirements, and we can't drop new requirements on relay operators
>without warning.
>
>> For I2P and Tor, the incentive is of course cover traffic. Defined
>> more carefully, the incentive a router has for delaying traffic is
>> that it can use that traffic to smooth out its own bandwidth
>> curve, hiding its own patterns. This is at odds with having
>> user-defined delays on traffic, but that is not a bad thing IMHO.
>> There is no point in having a deterministic delay because the
>> traffic confirmation attack can easily account for it; and if the
>> delay is random, there is no need for the user to specify it. At
>> most, the user could provide an indication of how long they would
>> ideally like the traffic to be delayed, but the router doing the
>> delaying would have the assumed right to send the data whenever it
>> desired / required, which might be immediately.
>
>That sounds good. We could also have user-specified minimum and
>maximum delays, allowing the relay some flexibility, a bit like a
>stop-and-go mix:
>
>http://freehaven.net/anonbib/#stop-and-go
>
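Inline note: one way to read "user-specified minimum and maximum
delays, allowing the relay some flexibility" is that the relay picks
its own forwarding time somewhere inside the sender's window. A
minimal sketch of that reading follows. pick_forward_time is a
hypothetical name, and the uniform draw is only a placeholder; I am
not claiming it is the right distribution.

```python
import random


def pick_forward_time(arrival: float, min_delay: float, max_delay: float,
                      rng: random.Random) -> float:
    """Return a forwarding time inside the sender's [min, max] window.

    Times and delays are in seconds. The relay, not the sender, picks the
    exact point in the window; uniform is an assumption for illustration.
    """
    if min_delay < 0 or max_delay < min_delay:
        raise ValueError("need 0 <= min_delay <= max_delay")
    return arrival + rng.uniform(min_delay, max_delay)


# Example: packet arrives at t=1000 s, sender allows 10 to 20 minutes.
# SystemRandom draws from the OS entropy source.
rng = random.SystemRandom()
send_at = pick_forward_time(1000.0, 600.0, 1200.0, rng)
print(send_at)
```

Setting min_delay equal to max_delay gives the deterministic case
that str4d argues against above.
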
>I have a slight preference for user-specified delays over random
>delays because they allow the endpoints to choose new delay
>distributions without upgrading the relays (end-to-end principle).
>But maybe there's a delay distribution that's provably optimal or
>something.
>
>> Other issues to consider:
>> - The effect of low volumes of high-latency traffic on the delaying
>>   router's incentive, and how this fits in with dummy traffic [2]
>> - Mixing strategies, e.g. [3]
>
>Yeah, I need to go back to the mix literature - it's possible that
>mixing provides stronger anonymity than independently delaying
>packets. George Danezis is the person to talk to about this.
>
>>> I imagine that would be a big architectural change, but I've
>>> never looked at the I2P code - what do you reckon?
>>
>> If delays existed in isolation of other network effects,
>> implementing them would be trivial: modify the hop processor or a
>> related handler [4] to store packets that have a delay, and have a
>> job that re-inserts them into the outbound message processor once
>> the delay has elapsed.
>>
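Inline note: the "store, then re-insert once the delay has elapsed"
idea above can be sketched with a priority queue keyed on release
time. This is only an illustration in Python; I2P itself is Java, and
DelayedStore, hold and release_due are hypothetical names, not
anything from the I2P code. It deliberately ignores the expiration and
session tag issues raised in the next paragraph.

```python
import heapq
import itertools


class DelayedStore:
    """Holds packets until their release time, earliest first."""

    def __init__(self):
        self._heap = []
        # Tie-breaker so two packets with equal release times are never
        # compared directly (payloads may not be orderable).
        self._seq = itertools.count()

    def hold(self, packet, release_at):
        """Store packet until the clock reaches release_at (seconds)."""
        heapq.heappush(self._heap, (release_at, next(self._seq), packet))

    def release_due(self, now, send):
        """Pass every packet due at or before now to send(); return count."""
        released = 0
        while self._heap and self._heap[0][0] <= now:
            _, _, packet = heapq.heappop(self._heap)
            send(packet)
            released += 1
        return released

    def __len__(self):
        return len(self._heap)


# Example: a periodic job would call release_due with the current time.
store = DelayedStore()
store.hold(b"second", 1200.0)
store.hold(b"first", 600.0)
outbound = []
store.release_due(900.0, outbound.append)   # only b"first" is due
print(outbound, len(store))
```

An in-memory heap is the simplest case; with long delays the same
interface would presumably need a disk-backed store behind it.
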
>> An actual implementation would need to play nice with other parts
>> of I2P. Over the years we've added more strict expiration
>> enforcement to prevent loops and DDoSing, and these would need to
>> be modified to handle the delays. We also have session tags that
>> enable use of faster crypto once a session is established (AES
>> instead of ElGamal) [5], and because these expire we would need to
>> force the use of the slower and more expensive crypto. Time-wise
>> it's not a problem (the packet is meant to be delayed anyway), but
>> the increased crypto processing load may have effects on router
>> performance (we've had similar issues before).
>
>I guess there's an architectural question here: should the system
>deal with streams of packets, as in Tor and I2P, or independent
>packets, as in mix networks? Not something I expect to answer at this
>stage, but something to bear in mind as we explore possible designs.
>
>Cheers,
>Michael
-----BEGIN PGP SIGNATURE-----
Charset: UTF8
Version: Hush 3.0
Note: This signature can be verified at https://www.hushtools.com/verify

wsBcBAEBAgAGBQJUc8jTAAoJEJRqj8F0y8k5PowIAMhLzXLuwzfCmKbpyYevhOk7Kois
B7Gq8FApaGvH5B+kXcGE1QFIHlS2GWmDknLwT6GhIrbG2ulnxYD7T0picvGgM/4L7oEM
rAEaDC9MLoFbUgRhHBQYFzSKstDJK9TxqU3lTTwuUdKZ8T6s+mNfxXqX2IlkFntvzxLO
26cJ4w0uZAg5NMqj8WFdIfivdXacOhfrBRSfgjiugs4+vsqSuu66BhbffR/DDvrtHirt
+KRQe/0X1V+g0/zZnEPvy8Xxk6rsguvAeFBty0oj3jZ/Zltkgvlmbs7HUpvKz8Gw2zY9
tZ6F8UT5OPqDUEmZ0GGbnExRjG5GrosPMrHU29WF5Ww=
=5TZ7
-----END PGP SIGNATURE-----