[Tech] [Old frost] Lets talk data transfer
Matthew Toseland
toad at amphibian.dyndns.org
Wed Aug 8 14:47:05 UTC 2007
----- toad at zceUWxlSaHLmvEMnbr4RHnVfehA ----- 2007.06.21 - 19:30:25GMT -----
Lets talk about data transfer (not requests, for the moment). The basic
question: Do we want to have end to end flow control?
It's easy to illustrate this with an attack:
Attacker A sends a large number of requests to each of his peers. He then
pretends that most of his packets from the nodes are being dropped, keeps the
congestion window tiny, and receives packets really slowly. The result is
that he can force downstream nodes to shift far more data than he has to.
Output bandwidth liability will not detect this, because it is only focused
on the node's output bandwidth limit; it doesn't take into account how fast
the node can receive it at the moment. In fact, right now, the attack would
be limited only by the arbitrary hardcoded maximum number of simultaneous
requests! On opennet, the attacker can keep getting peers forever, changing
identity when necessary, and effectively flood out the network.
The proposed solution then is to receive data at the same speed as the fastest
requestor. Thus even on opennet, the attacker cannot use much more bandwidth
(per hop) than he personally has available. (Of course, downstream bandwidth
is cheap, just buy a botnet, but that's another discussion...). This does
solve the flooding attack, and it might solve similar issues which are
naturally occurring. It allows request level load limiting to be relatively
lenient.
One problem with this is that it will make traffic analysis a lot easier. On
several levels:
- With or without this mechanism, if you are the data sender, you can vary the
timing of sending the packets within a block. If you are also a receiver,
with this mechanism you can identify whether there are other receivers. You
may be able to send a covert signal via the block timing, and trace the data
all the way back to the requestor.
- With a sufficiently strict form of this mechanism, you may be able to do it
the other way around and identify the data source.
The sender-traces-requestor attack is clearly far more insidious, and must be
dealt with anyway, so the receiver-traces-sender attack is not terribly
important, as well as probably being harder. Both attacks should be difficult
on busy nodes with high bandwidth limits. One way to make them harder is to
reduce the packet overhead and block/bulk data packet size; we will hopefully
be able to do this when the NewPacketFormat is done.
As far as I understand it, the implementation would be roughly:
- New packet and congestion control layers, with a real congestion window
applying to all
- Instead of acknowledging packets, we'd acknowledge messages.
- We don't acknowledge the incoming data packet for a block transfer until at
least one of the peers or local clients waiting for it has taken the packet.
- Local clients accept the packet immediately.
- For nodes requesting the data, we accept the packet when we manage to send
it (so we may have to wait for a gap in the congestion window). We do NOT
wait for an acknowledgement, for two reasons: Firstly, it would provide a way
to get an accurate round trip all the way back to the requestor. This is very
bad for anonymity. Secondly, it would significantly slow down data transfer
on links with high bandwidth * latency. It would also make the above style of
attack easier; waiting for just the next hop should be adequate while not
compromising security or performance.
This is definitely a good idea, whether we implement other mechanisms at the
request level or not.
Discuss. What have I got wrong here?
----- toad at zceUWxlSaHLmvEMnbr4RHnVfehA ----- 2007.06.21 - 22:07:29GMT -----
I'm not convinced actually. If the requestor is accepting packets slowly, we
send them slowly; that's fine. But what about the next hop (call it A)? A
packet comes in, it sees it has to send it to the requestor, and doesn't ack
it until it's been able to send it on. It sends the packet, thus using up the
congestion window to the requestor (which is very small because of all the
packet losses). The requestor doesn't ack the packet, but does ack other
messages. The next time A receives a packet for the requestor, it doesn't ack
it because it can't send it. However there is still space in the congestion
window to A from B. So B sends some more packets. Which are also not acked. B
is then stuck with these packets, and its congestion window is significantly
shrunk - all traffic to B is affected, although at this point it's obvious
who's to blame. A similar effect happens on the next hop, and the next hop,
back to the data sender: they all slow down all of their traffic. Which
achieves pretty much what the flooder had wanted in the first place.
So a one-size-fits-all approach doesn't work: we need to acknowledge the
packets at link level, and not interfere with the congestion window, but
implement flow control at some higher level. How? It seems that we would have
to have an acknowledgement from the requestor, which is propagated back to
the sender. This would have to be unique to that specific block transfer
tunnel. Does that mean that we'd have to wait for a full round-trip to send
the next block? Not if we have a limited size buffer on each intermediary
node. Each node could have a buffer of a certain number of packets, and if it
is not full, it could accept packets even though they haven't been accepted
yet. This would however have much the same effect - the buffers get full,
everyone slows down. How do we deal with this?
----- toad at zceUWxlSaHLmvEMnbr4RHnVfehA ----- 2007.06.21 - 22:54:19GMT -----
Maybe we can:
- Fixed size buffer for each node - or size the same as the real congestion
window i.e. determined by real packet loss.
- When a data packet is acked, it is removed from the buffer (or "window"),
and we can send another one to that node.
- A-B-C-D, A starts lots of requests, then stops acking data packets. Window
is say 4 packets size. D sends first 4 packets to C, which buffers and
forwards to B, which buffers and tries to send one to A, which doesn't ack.
So B's buffer is now full. So it doesn't ack packet 5 when it comes from C.
C's buffer also fills up, and eventually D stops sending.
- Same with E connected only to B, making requests, well-behaved: Once all the
buffers are full, E won't get a look in. Even if the buffers are large, a
bunch of requests can still fill them all up.
Or maybe we can't. :(
--------------------------------------------------------------------------------------------------------------------------
[ alternate reply to second message ]
----- toad at zceUWxlSaHLmvEMnbr4RHnVfehA ----- 2007.06.21 - 23:38:01GMT -----
As far as I can see, it all comes down to requests. We can't really deal with
this on the timescales of single packets or single block transfers. How to
deal with it on the level of requests? AIMD!
- Node connects
- Initially we allow 1 request in flight simultaneously.
- If <window> requests complete successfully, without a timeout, increment the
window.
- If there is a timeout, halve the window (max once per <window> requests).
- This is what TCP does (minus slow start, which doesn't make sense here
IMHO).
- Added caveat: Don't increase it indefinitely if it isn't being used: Only
increase it if there were <window> requests running simultaneously at some
point during the window of requests.
- Maximum window size we can calculate from e.g. output liability limiting and
dividing it up appropriately (possibly extrapolating from the current
low-level AIMD for this node??).
- Obviously this is all limited by low level congestion control.
- And limits on our nodes will propagate: provided we have sufficient
misrouting protection, we can start a new request only when one completes.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: not available
Url : http://emu.freenetproject.org/pipermail/tech/attachments/20070808/4126cb47/attachment.pgp
More information about the Tech
mailing list