[guardian-dev] [Suspected Junk Mail] Re: 81% of Tor users can be de-anonymised by analysing router information, research indicates

Wed Nov 26 09:06:37 EST 2014

On Mon, Nov 24, 2014, at 07:09 PM, ghostlands at hush.com wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> Hi all,
> 
> Just in case everyone forgets about BitMessage, some of these
> concepts remind me of at least the general idea it has. There is a
> very high latency there and a lot of the issues of devoted storage
> for messages are part of development. I haven't looked in on the
> project in a while, but I used to run it over Tor just fine. I know a
> separate program over an anonymizing network is not the idea being
> discussed, but this is a protocol and application in enthusiastic
> current development, and some of those devs might be worth
> contacting.

I'm definitely keeping track of the BitMessage for Android work, and am
very interested in the protocol, especially how it could be used in a
non-Internet context.

> 
> gl
> 
> 
> On Sun, 23 Nov 2014 17:23:00 +0000 "Michael Rogers"
> <michael at briarproject.org> wrote:
> >On 22/11/14 00:27, str4d wrote:
> >>> I'm not suggesting that running over Tor or I2P would make any
> >>> system *less* effective. What I'm saying is that if we assume an
> >>> adversary who can break Tor or I2P's anonymity through traffic
> >>> confirmation, then we either need to make Tor or I2P stronger, or
> >>> build a separate system that's stronger on its own. If we build
> >>> a separate system then it won't provide a large anonymity set
> >>> until it becomes popular, so we could face a chicken-and-egg
> >>> problem.
> >>
> >> That makes sense. The chicken-and-egg problem is lessened if we
> >> restrict the adversary's abilities to breaking Tor or I2P through
> >> traffic confirmation on a targeted scale, rather than being able
> >> to completely break the anonymity of all users all the time -
> >> then there is a real benefit of having the separate stronger
> >> system running over Tor or I2P. Whether this is a realistic
> >> restriction, however...
> >
> >This restriction seems realistic to me. We might also consider an
> >adversary who can see some subset of internet traffic and thus carry
> >out traffic confirmation attacks against some subset of users (e.g.
> >an ISP). We know these adversaries exist, regardless of whether the
> >global adversary also exists.
> >
> >In each case, would it make sense for a new high-latency anonymity
> >system to use Tor or I2P for its connections rather than plain TCP?
> >
> >For the global adversary, no. Tor and I2P are transparent to that
> >adversary.
> >
> >For the targeted adversary, yes. If the high-latency system has any
> >connections between targeted and untargeted nodes, using Tor or I2P
> >will prevent the adversary from identifying and therefore targeting
> >the untargeted nodes, so the adversary's view of the high-latency
> >system will remain partial.
> >
> >For the subset adversary, maybe. If the high-latency system has any
> >connections between inside and outside nodes, using Tor or I2P will
> >prevent the adversary from identifying the outside nodes. But whether
> >identifying the outside nodes without being able to target them is
> >useful depends on how the adversary's trying to attack the
> >high-latency system.
> >
> >So overall it looks like it makes sense to use Tor or I2P instead of
> >plain TCP, even if you're aiming to resist stronger adversaries than
> >Tor and I2P can resist on their own.
> >
> >>> By the way, I've been meaning to ask you about I2P-Bote's
> >>> architecture for a while; maybe this is a good opportunity. Is
> >>> the DHT where the messages are stored specific to I2P-Bote, or is
> >>> it part of I2P?
> >>
> >> The DHT is a Kademlia DHT specific to I2P-Bote, with a few
> >> modifications from standard Kademlia. See section 2 of the
> >> technical documentation [0] for details. I2P's netDb DHT is only
> >> for storing network information; applications are expected to
> >> handle their own data requirements.
> >
> >Thanks! I couldn't find the tech docs before; I'll give them a read.
> >
> >>>> If high-latency tunnels would actually be useful, we can
> >>>> implement them and get most of the network supporting delays
> >>>> relatively quickly (we usually have 80% of the network on the
> >>>> latest release within six weeks).
> >>
> >>> Wow, that would be amazing!
> >>
> >> Re-reading my message, I want to clarify that my last sentence
> >> needed an additional comma. I was saying that once implemented,
> >> getting the network to support the changes would be relatively
> >> quick. Actually implementing delays is a much trickier kettle of
> >> fish, and the subject of the rest of this message :)
> >
> >Ah, OK. :-)
> >
> >>> The longer the delays, the bigger the storage requirements. At
> >>> some point you have to think about writing the data to disk until
> >>> it's time to forward it.
> >>
> >> This applies equally to any system with a delay between receiving
> >> and sending data - I2P/Tor with delays, I2P-Bote, Freenet,
> >> Tahoe-LAFS...
> >
> >Yes, absolutely. My point was just that Tor doesn't currently use much
> >disk space or disk throughput, so adding a disk-based data cache to
> >Tor would be a big change for relay operators.
> >
> >> The storage requirement issue raises another question: what
> >> incentive is there for other routers to store delayed packets?
> >> There is no guarantee that a participating router is going to honor
> >> your request. The answer is clearer for a specific app like
> >> I2P-Bote or Freenet than it is for a generic network transport like
> >> I2P or Tor. As the required delay increases, so does the incentive
> >> required. IIRC, previous study indicates that a 10 minute delay is
> >> the minimum that would make any difference [1], and that can
> >> quickly become non-trivial.
> >
> >I agree it's important to think about the resources we're asking
> >people to contribute, but I prefer not to frame the issue in terms of
> >incentives, because in the past that led me down a game theory rabbit
> >hole and it took me years to escape. :-)
> >
> >People contribute resources to Tor, I2P and Freenet for a wide range
> >of reasons apart from improving their own anonymity.
> >
> >> Let's say we turned on a 20 minute delay for all 5000 I2PSnark
> >> (torrent) users each with 50 KBps of traffic. That's 60 MB of data
> >> for each of the 5000 to be buffered somewhere. If you keep it to
> >> Snark users, that's 60 MB each. If you spread it across all I2P
> >> routers, maybe 6 MB each. More likely, the data will be spread
> >> across the fast routers used by the Snark users in their tunnels;
> >> say there are ~1000 of them (approx number of I2P FFs), that is 300
> >> MB each. Then throw in the fact that this data is constantly
> >> churning, with complete turnover every 20 minutes. Things get
> >> sticky, and routers need a good reason for the additional memory
> >> and disk load.
> >
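The buffering figures quoted above can be checked with a few lines of
Python. This is only a rough sketch: it assumes 1 MB = 1000 KB and an even
spread of data across the storing nodes. The function name is made up for
illustration, and the 50,000 total router count is an assumption inferred
from the "maybe 6 MB each" estimate, not a number given in the thread.

```python
def buffered_mb_per_node(users, rate_kb_per_s, delay_s, storing_nodes):
    """Average data (in MB) held by each storing node when every user's
    traffic is delayed by delay_s seconds and spread evenly."""
    per_user_mb = rate_kb_per_s * delay_s / 1000  # assumes 1 MB = 1000 KB
    return users * per_user_mb / storing_nodes

USERS = 5000        # I2PSnark users (figure from the thread)
RATE_KB_S = 50      # per-user traffic in KB/s (figure from the thread)
DELAY_S = 20 * 60   # 20 minute delay, in seconds

# Each Snark user buffers only their own share.
snark_only = buffered_mb_per_node(USERS, RATE_KB_S, DELAY_S, USERS)
# Data concentrated on ~1000 fast routers (figure from the thread).
fast_routers = buffered_mb_per_node(USERS, RATE_KB_S, DELAY_S, 1000)
# Data spread over all routers; 50,000 is an ASSUMED router count.
all_routers = buffered_mb_per_node(USERS, RATE_KB_S, DELAY_S, 50_000)

print(snark_only, fast_routers, all_routers)
```

Under those assumptions the three cases come out at 60, 300 and 6 MB per
node, matching the estimates in the quoted paragraph.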
> >I wouldn't expect people to run BitTorrent-like workloads over a
> >high-latency system - I'm thinking of email-like workloads. But yeah,
> >we should definitely consider the disk space and disk throughput
> >requirements, and we can't drop new requirements on relay operators
> >without warning.
> >
> >> For I2P and Tor, the incentive is of course cover traffic. Defined
> >> more carefully, the incentive a router has for delaying traffic is
> >> that it can use that traffic to smooth out its own bandwidth
> >> curve, hiding its own patterns. This is at odds with having
> >> user-defined delays on traffic, but that is not a bad thing IMHO.
> >> There is no point in having a deterministic delay because the
> >> traffic confirmation attack can easily account for it; and if the
> >> delay is random, there is no need for the user to specify it. At
> >> most, the user could provide an indication of how long they would
> >> ideally like the traffic to be delayed, but the router doing the
> >> delaying would have the assumed right to send the data whenever it
> >> desired / required, which might be immediately.
> >
> >That sounds good. We could also have user-specified minimum and
> >maximum delays, allowing the relay some flexibility, a bit like a
> >stop-and-go mix:
> >
> >http://freehaven.net/anonbib/#stop-and-go
> >
> >I have a slight preference for user-specified delays over random
> >delays because they allow the endpoints to choose new delay
> >distributions without upgrading the relays (end-to-end principle).
> >But maybe there's a delay distribution that's provably optimal or
> >something.
> >
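One way to picture the "user-specified minimum and maximum delays" idea
above is a relay that picks its own forwarding delay inside the window the
sender supplied. The sketch below is purely illustrative: the function name
is hypothetical, and the uniform distribution is an assumption chosen for
simplicity, not something proposed in the thread or taken from the
stop-and-go mix paper.

```python
import random

# OS-backed randomness, so the chosen delays are not predictable from a seed.
_rng = random.SystemRandom()

def choose_delay(min_delay_s, max_delay_s, rng=_rng):
    """Pick a forwarding delay (seconds) within the sender's window.

    The uniform choice is an assumption for illustration; the sender could
    equally narrow or shift the window to shape the distribution itself.
    """
    if not 0 <= min_delay_s <= max_delay_s:
        raise ValueError("need 0 <= min_delay_s <= max_delay_s")
    return rng.uniform(min_delay_s, max_delay_s)

# Example: the sender allows anything between 10 and 20 minutes.
delay = choose_delay(10 * 60, 20 * 60)
print(f"forward after {delay:.0f} seconds")
```

Setting both bounds to the same value collapses this to a deterministic
delay, which is the case the quoted text argues a traffic confirmation
attack can easily account for.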
> >> Other issues to consider:
> >> - The effect of low volumes of high-latency traffic on the
> >>   delaying router's incentive, and how this fits in with dummy
> >>   traffic [2]
> >> - Mixing strategies, e.g. [3]
> >
> >Yeah, I need to go back to the mix literature - it's possible that
> >mixing provides stronger anonymity than independently delaying
> >packets. George Danezis is the person to talk to about this.
> >
> >>> I imagine that would be a big architectural change, but I've
> >>> never looked at the I2P code - what do you reckon?
> >>
> >> If delays existed in isolation of other network effects,
> >> implementing them would be trivial: modify the hop processor or a
> >> related handler [4] to store packets that have a delay, and have a
> >> job that re-inserts them into the outbound message processor once
> >> the delay has elapsed.
> >>
> >> An actual implementation would need to play nice with other parts
> >> of I2P. Over the years we've added more strict expiration
> >> enforcement to prevent loops and DDoSing, and these would need to
> >> be modified to handle the delays. We also have session tags that
> >> enable use of faster crypto once a session is established (AES
> >> instead of ElGamal) [5], and because these expire we would need to
> >> force the use of the slower and more expensive crypto. Time-wise
> >> it's not a problem (the packet is meant to be delayed anyway), but
> >> the increased crypto processing load may have effects on router
> >> performance (we've had similar issues before).
> >
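The "store packets that have a delay, and have a job that re-inserts them"
idea described above can be sketched as a small priority queue keyed by
release time. This is not I2P code (I2P is written in Java), and the class
and method names are hypothetical. It also deliberately ignores the
expiration enforcement and session tag issues raised in the quoted text,
so it only covers the "in isolation" case.

```python
import heapq
import itertools

class DelayQueue:
    """Holds packets until their release time, then hands them back."""

    def __init__(self):
        self._heap = []                # entries: (release_time, seq, packet)
        self._seq = itertools.count()  # tie-breaker; packets are never compared

    def store(self, packet, now, delay_s):
        """Called by the (hypothetical) hop handler for a delayed packet."""
        heapq.heappush(self._heap, (now + delay_s, next(self._seq), packet))

    def pop_due(self, now):
        """Called by a periodic job; returns packets whose delay has elapsed,
        ready to be re-inserted into the outbound path."""
        due = []
        while self._heap and self._heap[0][0] <= now:
            due.append(heapq.heappop(self._heap)[2])
        return due

queue = DelayQueue()
queue.store(b"packet-a", now=0, delay_s=600)
queue.store(b"packet-b", now=0, delay_s=1200)

early = queue.pop_due(now=599)      # nothing is due yet
released = queue.pop_due(now=600)   # packet-a is released
later = queue.pop_due(now=1200)     # packet-b is released
print(early, released, later)
```

Passing the clock in as `now` keeps the sketch easy to test; a real router
would also have to decide when this queue spills from memory to disk, which
is the storage concern discussed earlier in the thread.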
> >I guess there's an architectural question here: should the system deal
> >with streams of packets, as in Tor and I2P, or independent packets, as
> >in mix networks? Not something I expect to answer at this stage, but
> >something to bear in mind as we explore possible designs.
> >
> >Cheers,
> >Michael
> -----BEGIN PGP SIGNATURE-----
> Charset: UTF8
> Version: Hush 3.0
> Note: This signature can be verified at https://www.hushtools.com/verify
> 
> wsBcBAEBAgAGBQJUc8jTAAoJEJRqj8F0y8k5PowIAMhLzXLuwzfCmKbpyYevhOk7Kois
> B7Gq8FApaGvH5B+kXcGE1QFIHlS2GWmDknLwT6GhIrbG2ulnxYD7T0picvGgM/4L7oEM
> rAEaDC9MLoFbUgRhHBQYFzSKstDJK9TxqU3lTTwuUdKZ8T6s+mNfxXqX2IlkFntvzxLO
> 26cJ4w0uZAg5NMqj8WFdIfivdXacOhfrBRSfgjiugs4+vsqSuu66BhbffR/DDvrtHirt
> +KRQe/0X1V+g0/zZnEPvy8Xxk6rsguvAeFBty0oj3jZ/Zltkgvlmbs7HUpvKz8Gw2zY9
> tZ6F8UT5OPqDUEmZ0GGbnExRjG5GrosPMrHU29WF5Ww=
> =5TZ7
> -----END PGP SIGNATURE-----
> 

-- 
  Nathan of Guardian
  nathan at guardianproject.info