[freenet-dev] Load management proposal
Michael Rogers
m.rogers at cs.ucl.ac.uk
Mon Jun 18 16:05:19 UTC 2007
I'll answer this on Frost to avoid breaking the thread (unless Anonymous
is on the mailing list).
The short answer is "argh". ;-)
Cheers,
Michael
Matthew Toseland wrote:
> MRogers and Anonymous have been arguing that we should limit load only on the
> immediate bandwidth level. Some of the thread is below, and my response
> including a proposal for a bandwidth limiting mechanism:
>
> ----- mrogers at UU62+3E1vKT1k+7fR0Gx7ZN2IB0 ----- 2007.06.14 - 21:32:43GMT -----
>
> No, what I've been arguing is that we don't need to *explicitly* take the
> lifetime overhead into account, because the feedback loop should be slow
> enough that it will *implicitly* take it into account anyway. If the feedback
> loop takes a few minutes to adapt (which it must if we don't want to
> over-adapt to short-term conditions) then we're implicitly measuring the
> lifetime overhead of a search, and we don't need to make an additional
> explicit estimate of the lifetime overhead.
>
> ----- toad at zceUWxlSaHLmvEMnbr4RHnVfehA ----- 2007.06.15 - 17:12:52GMT -----
>
> What makes you think it would work at all? It seems to me that it would simply
> oscillate - we accept too many requests, most of them time out, the we accept
> too many requests again, and most of them timeout again. KISS is one thing,
> but not slitting your own throat with occam's razor is another!
>
> ----- mrogers at UU62+3E1vKT1k+7fR0Gx7ZN2IB0 ----- 2007.06.15 - 21:44:34GMT -----
>
> :) If the feedback loop is slow enough it shouldn't oscillate - that applies
> equally to bandwidth liability estimation, token passing or any other
> mechanism - we musn't get into the situation of saying "I accepted 100
> requests in the last 10 seconds and nothing bad has happened yet, therefore I
> can accept another 100". As far as I can see, the only way to avoid that
> situation is by adapting very slowly, ie on the order of the longest timeout.
> Luckily we expect the connections to be relatively long-lived (minutes or
> hours) so we can afford to take a few minutes to get up to speed.
>
> Assuming that we have a suitably slow feedback loop in place, the next
> question is how to tell our peers when we're ready to accept another search.
> We could use tokens, preemptive rejection or blocking sockets. I don't have
> any religious opinions on this issue, but I thought Anonymous made a fairly
> good case for handling it in the same way as any other network app: don't
> read from the socket until you're ready to process more data. I realise
> there's a highly variable relationship between bytes in and bytes out, but
> regardless of which mechanism we use we'll have to rely on the slow feedback
> loop to smooth that out, because so far nobody's suggested a way of dealing
> with it that doesn't involve favouring some kinds of traffic over others. (I
> hope we agree that we don't want the whole network to end up favouring SSK
> traffic over CHK traffic just because it happens to come in smaller quanta.
> Maybe there are reasons for giving SSK traffic priority, but that isn't one
> of them.)
>
> ----- toad at zceUWxlSaHLmvEMnbr4RHnVfehA ----- 2007.06.18 - 14:25:49GMT -----
>
> It's not possible using the mechanism you propose *on its own*, while it *is*
> interesting. Here's why: I have 48kB/sec output. With your proposed
> mechanism, the fact that I have 1MB/sec input won't matter. So a leaf node
> requests a splitfile through me, which is all in the network, but is not in
> my datastore. I route his requests. It takes 5 seconds for enough of the
> requests to start transferring to make a difference. Over that 5 seconds,
> he's sent me 240kB of requests. That's around 240k/50 = 4,915 requests. Which
> will yield 160MB of data. Unfortunately it will take me 160M/48k = nearly an
> hour to transfer all the data.
>
> This is not acceptable, because we want Freenet requests to complete in a
> reasonable amount of time - if only for fproxy support, which IMHO is an
> important application. And I don't like the idea of having different request
> priorities either; it gives away information on the traffic and allows for
> Bad Things.
>
> Imposing an arbitrary limit on the number of requests running is not the
> solution either. A limit may be imposed because of threads/memory usage, but
> this is likely to be heavily architecture dependant. The solution IMHO is to
> take into account likely future bandwidth usage. If we want to guarantee that
> there are no timeouts, this means bandwidth liability limiting. The current
> code will accept an SSK request only if it could also accept a CHK request,
> and vice versa, so I don't see any reason to think it is excessively biased
> in favour of CHKs. However if you like I will add code to collect the
> probability of rejection for SSKs and CHKs individually.
>
> For data blocks, clearly we cannot send one if there is no available space in
> the congestion window to the peer in question. However we may want the data
> for ourself, or for multiple peers, in which case the slowest peer should not
> necessarily dictate the transfer rate; AFAICS we must accept the data as fast
> as we can manage within memory/cpu/etc limits, and limit the incoming
> bandwidth at a higher level.
>
> So given that we must limit traffic on the request level (as well as the
> congestion control level), how can we do this? We can either:
>
> A) Wait for a request, and then either accept it or reject it, based on e.g.
> how many requests are running already.
> PRO:
> - Simple
> CON:
> - Wasteful of time and bandwidth when we have to ask multiple peers before one
> accepts
> - Difficult to adapt to propagate load back to source
>
> Queueing doesn't help much here because we still have to complete the
> request - including queueing it - in a reasonable time.
>
> or
>
> B) Tell our peers when they can send us a request. The obvious way to do this
> is to compute our overall quota of requests (or request-success-bytes), and
> allocate it to our peers (and tell them on every packet/response/etc),
> initially equally, but progressively allowing more to busier nodes and less
> to nodes that don't use their quota (but with diminishing returns: a node
> with a low quota that suddenly starts using more will take quota away from an
> established heavy requestor). Thus, initially, if most nodes aren't busy we
> won't run many requests, but as we identify those nodes which need a bigger
> quota, more requests run overall. Note that running fewer requests than we
> could isn't necessarily a problem anyway unless they complete really slowly.
>
> How to avoid excessive misrouting?
> Options:
> 1: We don't need to queue requests, as they are queued on the next node for us
> (running requests can be regarded as queued requests). When we get a request,
> we identify the set of nodes that we are willing to route it to, and if none
> of them have any free request slots, we reject it.
> 2: Queueing requests helps, because we can match requests with nodes. When we
> get a request, (if it's allowable by the node's current quota), we queue it.
> When we have a slot in an outgoing node's quota, we send the request which is
> closest to the node's location (which hasn't already been to that node),
> regardless of the time it's been queued for (if multiple nodes have space, we
> find the best request/node pair until they don't or we run out of requests).
> If after a certain period a request hasn't been sent, we reject it. This
> avoids excessive misrouting without requiring arbitrary parameters (as the
> first option does), and sends more specialised requests to slower nodes.
> 3: Simulated queueing: remember the locations of recent requests we could have
> sent to a node, and calculate a threshold, which decays over time if we don't
> accept a request (obviously this will result in fewer requests being sent,
> but lower latency).
>
> Does this prevent flooding? Yes: we only accept, and offer, as many requests
> as we can handle, and with the right fairness algorithms, a flood will be
> limited to what the other peers don't need, although if our overall quota
> calculation is too generous flooding may still be an effective attack.
> Obviously on opennet, connecting to several hundred peers and flooding to
> each of them will be fairly effective; opennet sucks, we all know this, it's
> a transition phase.
>
> Does this propagate load back to the originator? Not explicitly, as it's not
> an originator-throttles scheme (such schemes don't prevent flooding and may
> make it easier to identify the originator). However, the rate at which a node
> can send requests is quite clearly limited by the rate at which the rest of
> the network can handle them:
>
> O - A - B - C
>
> If B is a bottleneck, then O will only send requests at the rate that can be
> funneled through it (or satisfied on A).
>
>
> ------------------------------------------------------------------------
>
> _______________________________________________
> Devl mailing list
> Devl at freenetproject.org
> http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
More information about the Devl
mailing list