From toad at amphibian.dyndns.org Wed Aug 8 14:40:50 2007
From: toad at amphibian.dyndns.org (Matthew Toseland)
Date: Wed, 8 Aug 2007 15:40:50 +0100
Subject: [Tech] [Old frost] Posting some Frost posts
Message-ID: <200708081540.51860.toad@amphibian.dyndns.org>

I will post some Frost threads in which I, mrogers, and Anonymous were talking about load management and ULPRs, before they expire completely.

From toad at amphibian.dyndns.org Wed Aug 8 14:47:05 2007
From: toad at amphibian.dyndns.org (Matthew Toseland)
Date: Wed, 8 Aug 2007 15:47:05 +0100
Subject: [Tech] [Old frost] Lets talk data transfer
Message-ID: <200708081547.06788.toad@amphibian.dyndns.org>

----- toad at zceUWxlSaHLmvEMnbr4RHnVfehA ----- 2007.06.21 - 19:30:25GMT -----

Let's talk about data transfer (not requests, for the moment).

The basic question: Do we want to have end to end flow control?

It's easy to illustrate this with an attack: Attacker A sends a large number of requests to each of his peers. He then pretends that most of the packets from those nodes are being dropped, keeps the congestion window tiny, and receives packets really slowly. The result is that he can force downstream nodes to shift far more data than he has to. Output bandwidth liability will not detect this, because it is only focused on the node's output bandwidth limit; it doesn't take into account how fast the node can receive data at the moment. In fact, right now, the attack would be limited only by the arbitrary hardcoded maximum number of simultaneous requests! On opennet, the attacker can keep getting peers forever, changing identity when necessary, and effectively flood out the network.

The proposed solution then is to receive data at the same speed as the fastest requestor. Thus even on opennet, the attacker cannot use much more bandwidth (per hop) than he personally has available. (Of course, downstream bandwidth is cheap, just buy a botnet, but that's another discussion...). This does solve the flooding attack, and it might solve similar issues which are naturally occurring. It allows request level load limiting to be relatively lenient.

One problem with this is that it will make traffic analysis a lot easier. On several levels:
- With or without this mechanism, if you are the data sender, you can vary the timing of sending the packets within a block. If you are also a receiver, with this mechanism you can identify whether there are other receivers. You may be able to send a covert signal via the block timing, and trace the data all the way back to the requestor.
- With a sufficiently strict form of this mechanism, you may be able to do it the other way around and identify the data source.

The sender-traces-requestor attack is clearly far more insidious, and must be dealt with anyway, so the receiver-traces-sender attack is not terribly important, as well as probably being harder. Both attacks should be difficult on busy nodes with high bandwidth limits. One way to make them harder is to reduce the packet overhead and block/bulk data packet size; we will hopefully be able to do this when the NewPacketFormat is done.

As far as I understand it, the implementation would be roughly:
- New packet and congestion control layers, with a real congestion window applying to all
- Instead of acknowledging packets, we'd acknowledge messages.
- We don't acknowledge the incoming data packet for a block transfer until at least one of the peers or local clients waiting for it has taken the packet.
- Local clients accept the packet immediately.
- For nodes requesting the data, we accept the packet when we manage to send it (so we may have to wait for a gap in the congestion window). We do NOT wait for an acknowledgement, for two reasons: Firstly, it would provide a way to get an accurate round trip all the way back to the requestor. This is very bad for anonymity. Secondly, it would significantly slow down data transfer on links with high bandwidth * latency. It would also make the above style of attack easier; waiting for just the next hop should be adequate while not compromising security or performance.

This is definitely a good idea, whether we implement other mechanisms at the request level or not. Discuss. What have I got wrong here?

----- toad at zceUWxlSaHLmvEMnbr4RHnVfehA ----- 2007.06.21 - 22:07:29GMT -----

I'm not convinced actually.

If the requestor is accepting packets slowly, we send them slowly; that's fine. But what about the next hop (call it A)? A packet comes in, it sees it has to send it to the requestor, and doesn't ack it until it's been able to send it on. It sends the packet, thus using up the congestion window to the requestor (which is very small because of all the packet losses). The requestor doesn't ack the packet, but does ack other messages. The next time A receives a packet for the requestor, it doesn't ack it because it can't send it. However there is still space in the congestion window to A from B. So B sends some more packets. Which are also not acked. B is then stuck with these packets, and its congestion window is significantly shrunk - all traffic to B is affected, although at this point it's obvious who's to blame.

A similar effect happens on the next hop, and the next hop, back to the data sender: they all slow down all of their traffic. Which achieves pretty much what the flooder had wanted in the first place.

So a one-size-fits-all approach doesn't work: we need to acknowledge the packets at link level, and not interfere with the congestion window, but implement flow control at some higher level. How?

It seems that we would have to have an acknowledgement from the requestor, which is propagated back to the sender. This would have to be unique to that specific block transfer tunnel. Does that mean that we'd have to wait for a full round-trip to send the next block? Not if we have a limited size buffer on each intermediary node. Each node could have a buffer of a certain number of packets, and if it is not full, it could accept packets even though they haven't been accepted yet. This would however have much the same effect - the buffers get full, everyone slows down. How do we deal with this?

----- toad at zceUWxlSaHLmvEMnbr4RHnVfehA ----- 2007.06.21 - 22:54:19GMT -----

Maybe we can:
- Fixed size buffer for each node - or size the same as the real congestion window i.e. determined by real packet loss.
- When a data packet is acked, it is removed from the buffer (or "window"), and we can send another one to that node.
- A-B-C-D, A starts lots of requests, then stops acking data packets. Window is say 4 packets in size. D sends the first 4 packets to C, which buffers and forwards to B, which buffers and tries to send one to A, which doesn't ack. So B's buffer is now full. So it doesn't ack packet 5 when it comes from C. C's buffer also fills up, and eventually D stops sending.
- Same with E connected only to B, making requests, well-behaved: Once all the buffers are full, E won't get a look in. Even if the buffers are large, a bunch of requests can still fill them all up.

Or maybe we can't. :(

--------------------------------------------------------------------------------
[ alternate reply to second message ]

----- toad at zceUWxlSaHLmvEMnbr4RHnVfehA ----- 2007.06.21 - 23:38:01GMT -----

As far as I can see, it all comes down to requests.
We can't really deal with this on the timescales of single packets or single block transfers. How to deal with it on the level of requests? AIMD!
- Node connects
- Initially we allow 1 request in flight simultaneously.
- If requests complete successfully, without a timeout, increment the window.
- If there is a timeout, halve the window (max once per requests).
- This is what TCP does (minus slow start, which doesn't make sense here IMHO).
- Added caveat: Don't increase it indefinitely if it isn't being used: Only increase it if there were requests running simultaneously at some point during the window of requests.
- Maximum window size we can calculate from e.g. output liability limiting and dividing it up appropriately (possibly extrapolating from the current low-level AIMD for this node??).
- Obviously this is all limited by low level congestion control.
- And limits on our nodes will propagate: provided we have sufficient misrouting protection, we can start a new request only when one completes.

From toad at amphibian.dyndns.org Wed Aug 8 14:49:59 2007
From: toad at amphibian.dyndns.org (Matthew Toseland)
Date: Wed, 8 Aug 2007 15:49:59 +0100
Subject: [Tech] [Old frost] Delayed socket reads
Message-ID: <200708081550.00266.toad@amphibian.dyndns.org>

----- Anonymous at o9_0DTuZniSf_+oDmRsonByWxsI ----- 2007.06.17 - 13:53:13GMT -----

Sounds like the proposal should be expressed in a more formal way, to ensure that we are not all talking about different things. Maybe comparing it with the current implementation is a good starting point, so the proposed differences are:

- instead of using a single system UDP socket, each connection should be given a distinct UDP socket bound to a specific remote address.
The UDP socket not bound to a remote address is still used to notice new connections (and connections where the peer changed its IP address) - once a newly seen address is authenticated, a new bound UDP socket is created to handle the connection.

Potentially, a specific peer might be sending UDP packets from different IP addresses, where the address changes not on a daily/hourly basis (which is quite common) but because a remote NAT assigns a random address from a pool of allotted addresses for [virtually] every packet. I haven't seen such badly broken internet connections yet; most likely such a node would prefer using TCP or HTTP over any UDP based transport solution anyway.

The unbound UDP socket is given a low weight as far as the bandwidth sharing algorithm goes.

- Aside from the unbound UDP socket, the per-connection UDP sockets are given equal bandwidth sharing weight (with unused bandwidth somehow (not necessarily evenly) distributed among peers that could handle more traffic). Each peer/socket is handled by a separate thread, and so different peers never block each other on network I/O.

In other words, every peer is guaranteed to get at least a 1/(P+1) share of incoming bandwidth - never long delays with no reads, which might lead to oscillations (which, as mrogers pointed out, could be fought with very inert averages, but there is no need to provoke them either). But as all the momentarily unused bandwidth is made available to other peers/sockets, the token bucket is very small, and most likely should be no higher than zero (using the following semantics: upon request processing the bucket size becomes negative, and the next read from the socket may occur only when the bucket gets refilled back to zero level, at which point the continuous background refilling stops).

Using NIO the extra threads could be avoided, but it is not really needed seeing that a node is discouraged from having more than a dozen connected peers anyway.
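
The bucket semantics described above (level normally at or below zero, reads allowed only once the debt is refilled, refilling stops at zero) can be sketched roughly as follows. This is only an illustration under stated assumptions: the class name `DebtBucket`, its method names, the byte/second units and the explicit `now` argument are all hypothetical and are not taken from the Freenet code.

```python
class DebtBucket:
    """Hypothetical per-peer bucket whose level is a debt (<= 0) in bytes."""

    def __init__(self, refill_rate):
        self.refill_rate = refill_rate  # assumed unit: bytes per second
        self.level = 0.0                # background refill never goes above zero
        self.last = 0.0                 # time of the last update, in seconds

    def _refill(self, now):
        # Background refilling only runs while the bucket is in debt,
        # and it stops as soon as the level is back at zero.
        if self.level < 0:
            gained = (now - self.last) * self.refill_rate
            self.level = min(0.0, self.level + gained)
        self.last = now

    def can_read(self, now):
        """The next socket read is allowed only once the debt is repaid."""
        self._refill(now)
        return self.level >= 0

    def charge(self, now, nbytes):
        """Processing a request pushes the bucket negative."""
        self._refill(now)
        self.level -= nbytes

    def seconds_until_readable(self, now):
        """How long the reader thread would have to delay its next read."""
        self._refill(now)
        return max(0.0, -self.level / self.refill_rate)
```

With a refill rate of 1000 bytes/s, charging 500 bytes at time 0 means the peer's socket is not read again until 0.5 s have passed; a per-peer reader thread would simply sleep for `seconds_until_readable(now)` before its next read.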

The threads will of course be synchronizing on local token/bucket handling - which is expected to take under a microsecond - and on tasks like db access - which is very good as it keeps the node from spiralling down due to CPU/disk performance starvation. A simple implementation would also limit each peer to one simultaneous db access - and that is good, as it throttles peers that abuse our disk resources (by flooding requests with htl=1 for example).

- the current I/O bandwidth liability limiting mechanism is not deprecated at all; as toad pointed out multiple times, it is essential for handling tiny requests that tend to result in large responses often enough.

Once a packet is read from a bound UDP socket:

a) an insert or request: it is verified against the bandwidth liability limiters; if the checks pass, the insert/request is accepted as it currently is. Otherwise the insert/request gets explicitly rejected (FNPRejectOverload) as it currently is. Nothing really changes here.

At the moment of accepting the insert/request the peer's token bucket might be charged, in order to discourage floods. The charged tokens are then credited to the peer we later receive the requested data from.

b) data we are waiting for, either for our own node or for a forwarded request.
c) auxiliary traffic like acks or location swaps or keepalives.

These undergo no liability limiting, as the transfer already happened - exactly the same as now. Notice that if the received traffic is 'valuable' (explicitly requested earlier) we credit the peer with the tokens charged from the original requestor's bucket (the only mechanism that can raise the bucket size above zero), so this peer is not penalized (at the very least not as heavily as now) - and so requests of this peer are not postponed for long.

As the bonus tokens are not just thrown in, but taken from other peers who are interested in service, there should be no bursts happening.
At the least, charging buckets might be done at the moment the data response is sent, not at the moment of the request (or the charge can be somehow split between those timepoints).

Now, a question should be raised - if all the liability limiter stuff is expected to work as it already does, why bother? The trick is that delayed socket reads allow the remote side to notice in advance that the peer became slower. The remote will not know if that happens due to exceeding the 1/(P+1) share of bandwidth, or due to a CPU/disk/swap/whatever performance shortage: all these situations require slowing down to avoid rejects and/or timeouts. So the slowing down happens at 2 levels: first, the delayed socket reads smooth the load over a longer time period, and second, the requestor will start sending/queueing fewer requests here, making more use of alternative paths and less busy peers. And that's the whole point.

Additional note: this algorithm requires no modification if UDP sockets are mixed with TCP ones (provided that the TCP window size is comparable to the MTU, not the default 32/64/128KB).

Until last week I was under the impression that the timing based mechanism of routing around overloaded nodes was already implemented. After taking time to study the code closer it turns out to be false: AFAIS the routing around happens only upon an explicit FNPRejectOverload - the freenet.node.PeerManager._closerPeer() method does not take the peer roundtrip average into account at all.
Here is the [rather simple] improvement I propose: instead of making the routing decision based solely on location distance while skipping 'unsuitable' locations, I propose searching for a peer by choosing the best (numerically highest) request advance speed, computed specifically for each node:

- a disconnected peer is skipped (as it is now);
- the peers we already tried to search the key at are skipped (as it is now);
- the peer the original request came from is skipped (as it is now);
- once passive requests are implemented, other requestors of the key are to be skipped too (if they gain the key, the subscription will time out soon and then we will be able to try there);
- [the maxDiff and the old version logic are not too clear to me];

Now the real salt:
- the speed is calculated as signedLocationDiff(our_location, peer_location, key_location)/RTT(peer) where signedLocationDiff is positive if locationDiff(our_location, key_location)>locationDiff(peer_location, key_location), and negative otherwise (assuming our_location!=peer_location, which would give zero location difference but should never happen). That can be multiplied by nodeAveragePingTime if values of a predictable range (say -3..3) are preferred (but no need to provide guarantees).
- the speed of a backed-off node is reduced by 3 (or thereabouts; but maybe a smaller value, like 1 or even 0.5).

Using the speed for sorting connections is very natural: it basically answers the question 'which peer should the request be sent to in order to receive the response ASAP - preferring to use slightly more bandwidth on underloaded nodes (where bandwidth is idle i.e. virtually free) rather than waiting for slow/overloaded nodes to do the job, and provided that the solicited key already exists where it should'.

And so it gives the following advantages:
- (major) we start routing around busy nodes well before they start generating FNPRejectOverload packets.
- (minor) no longer need to scan the peer list twice, first skipping backed off nodes then including them.
- (noticeable) requests that result in successful data retrieval will deliver the answer to the original requestor faster (on average).
- (average) the routes will be more fluid and thus harder for an adversary to supervise or censor;
- (minor) very fast nodes, even with small storage/cache of their own, provide effective caching services (for popular keys) for their immediate peers which are slow but have huge well-specialized storage.

And note that this algorithm will be useful even for the current code/protocol, without delayed socket reads; delaying the socket reads will undoubtedly have a positive impact by smoothing the choice over time (and efficiently fighting any oscillations) and by making routing around busy nodes start occurring much earlier (starting from 'boundary' keys, then, if the node gets increasingly busy anyway, by further narrowing down the location space we try at the busy node).

From toad at amphibian.dyndns.org Wed Aug 8 14:50:46 2007
From: toad at amphibian.dyndns.org (Matthew Toseland)
Date: Wed, 8 Aug 2007 15:50:46 +0100
Subject: [Tech] [Old frost] Delayed socket reads
In-Reply-To: <200708081550.00266.toad@amphibian.dyndns.org>
References: <200708081550.00266.toad@amphibian.dyndns.org>
Message-ID: <200708081550.48721.toad@amphibian.dyndns.org>

----- toad at zceUWxlSaHLmvEMnbr4RHnVfehA ----- 2007.06.18 - 15:46:45GMT -----

One socket per peer is unrealistic given that 90%+ of nodes are NATed, often behind port restricted cones etc. Different behaviour for data we requested is bad in general because it might indicate to an attacker whether we requested it.
And I'm not sure what you use token buckets for - why not just use the congestion window?

For the second part of the mail, you've just re-invented NGR, well done. We had it in 0.5, it didn't work terribly well. Given that we have locations, misrouting is almost always bad.

From toad at amphibian.dyndns.org Wed Aug 8 14:51:05 2007
From: toad at amphibian.dyndns.org (Matthew Toseland)
Date: Wed, 8 Aug 2007 15:51:05 +0100
Subject: [Tech] [Old frost] Delayed socket reads
In-Reply-To: <200708081550.00266.toad@amphibian.dyndns.org>
References: <200708081550.00266.toad@amphibian.dyndns.org>
Message-ID: <200708081551.06851.toad@amphibian.dyndns.org>

----- mrogers at UU62+3E1vKT1k+7fR0Gx7ZN2IB0 ----- 2007.06.18 - 20:34:29GMT -----

> - instead of using a single system UDP socket, each connection should be given a distinct UDP socket bound to specific remote address.

I'm not sure it's necessary to use separate UDP sockets, but we should have some kind of socket-like abstraction for each peer (and ideally for each local client) which blocks the sender when the receiver doesn't process the incoming data.

> - the current I/O bandwidth liability limiting mechanism is not deprecated at all; as toad pointed out multiple times, it is essential for handling tiny requests that tend to result in large responses often enough.

He pointed it out multiple times but he's still wrong. ;-) Imagine putting water into a red-hot pipe. Each drop of water turns into a litre of steam after ten seconds. If you fill the pipe too quickly it will explode ten seconds later. But if you fill the pipe very slowly, you can keep adding water at one end at the same rate steam escapes from the other end. You don't need to know the size of the pipe, or how quickly the steam escapes, or the relative density of steam and water - you just add water slowly until the pipe is full, and after that you add a drop of water whenever there's space in the pipe.
The current system is equivalent to measuring how quickly the steam escapes, measuring the relative density of steam and water, and calculating how quickly we can add water.

> if all the liability limiter stuff is expected to work as it already does, why bother? The trick is that delayed socket reads will be allowing remote side to notice in advance that the peer became slower.

Agreed, this is very useful. Specifically, the peer can notice the queue getting longer and at some point stop sending us new searches until the queue shrinks.

> (provided that TCP window size is comparable to MTU, not the default 32/64/128KB)

TCP windows are larger than one MTU for a good reason - you can only send one window of traffic per round-trip time, so with an MTU of 1500 bytes and a round-trip time of 500 ms you'd be limiting the throughput to 3 kB/s.

> the freenet.node.PeerManager._closerPeer() method does not take the peer roundtrip average into account at all.

Not directly, but we back off when a peer sends us a locally generated RejectedOverload, and we only route to backed-off peers if there are no un-backed-off peers available.

> Using the speed for connections sorting is very natural: it basically allows to answer the question 'which peer the request should be sent to in order to receive response ASAP

It appears to answer that question, but only from a local point of view. It doesn't take into account the fact that a misrouted search could put extra load on the network, which might not be visible to us or our immediate peers. I'm not saying I have an answer to the question of "how much misrouting is too much", but I don't think ad hoc mechanisms like multiplying the routing distance by the expected delay are necessarily the answer either. As Toad pointed out, something similar was tried in 0.5 (next-generation routing) and it turned out to be pretty complex.
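
As a quick check of the window arithmetic earlier in this message: if at most one window can be in flight per round trip, throughput is bounded by window size divided by round-trip time. The helper name `max_throughput` below is made up for illustration, and the 64 KB case simply picks one of the default window sizes mentioned in the quoted text.

```python
def max_throughput(window_bytes, rtt_seconds):
    """Upper bound on throughput when one window is sent per round trip."""
    return window_bytes / rtt_seconds


# One MTU-sized window (1500 bytes) over a 500 ms round trip.
print(max_throughput(1500, 0.5))       # 3000.0 bytes/s, i.e. 3 kB/s

# A 64 KB window over the same round trip, for comparison.
print(max_throughput(64 * 1024, 0.5))  # 131072.0 bytes/s
```

The bound ignores packet loss and header overhead, so real throughput would be somewhat lower in both cases.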
On Wednesday 08 August 2007 15:49, Matthew Toseland wrote: > ----- Anonymous at o9_0DTuZniSf_+oDmRsonByWxsI ----- 2007.06.17 - > 13:53:13GMT ----- > > Sounds like the proposal should be expressed in more formal way, to ensure > that we all do not speak about different things. Maybe comparing it with > current implementation is a good starting point, so the proposed > differences are: > > > - instead of using a single system UDP socket, each connection should be > given a distinct UDP socket bound to specific remote address. The UDP > socket non-bound to remote is still used to notice new connections (and > connections where peer changed its IP address) - once newly seen address > authentificated, new bound UDP socket created to handle the connection. > > Potentially, specific peer might be sending UDP packets from different IP > address, where the address changes not on daily/hourly basis (which is > quite common) but a remote NAT assigns a random address from a pool of > alloted addresses for [virtually] every packet. I havn't seen so badly > broken internet connections yet; most likely such node would prefer using > TCP or HTTP over any UDP based transport solution anyway. > > The unbound UDP socket is given low weight as far as bandwidth sharing > algoritm goes. > > > - Aside the unbound UDP socket, the per-connection UDP sockets are given > equal bandwith sharing weight (with unused bandwidth somehow (not > neccessarily evenly) distributed among peers that could handle more > traffic). Each peer/socket is handled by a separate thread, and so > different peers do never block each other on network I/O. > > Other words, every peer is guaranteed to get at least 1/(P+1) share of > incoming bandwidth - never long delays of no read which might lead to > oscillations (which, as mroger pointed out, could be fought with very > inertious averages, but no need to provoking them also). 
But as all the > momentarily unused bandwidth is made available for other peers/sockets, the > token bucket is very small, and most likely should be not higher than zero > (using the following semantics: upon request processing the bucket size > becomes negative, and next read from the socket might occur only when the > buckets gets refilled back to zero level at which point the continuous > background refilling stops). > > Using NIO the extra threads could be avoided, but it is not really needed > seeing that a node is dicouraged having more than a dozen connected peers > anyway. > > The threads will of course be synchronizing on local tokens/bucket handling > - which is expected to be within a microsecond, and tasks like db access - > which is very good as it keeps node from spiralling down due to CPU/disk > performance starvation. Simple implementation would also limit each peer up > to one simultaneous db access - and that is good, as that throttles the > peers that abuse our disk resources (by flooding requests with htl=1 for > example). > > > - the current I/O bandwidth liability limiting mechanism is not deprecated > at all; as toad pointed out multiply times, it is essential for handling > tiny requests that tend to result in large responses often enough. > > Once a packet is read from a bound UDP socket: > > a) an insert or request: it is verified against the bandwidth liability > limiters; is the checks passed, the insert/request accepted as it currently > does. Otherwise the insert/request gets explicitly rejected > (FNPRejectOverload) as it currently does. Nothing really changes here. > > At the moment of accepting the insert/request the peer token bucket might > be charged, in order to discourage floods. Then credit the charged tokens > to the peer we later received the requested data from. > > b) data we are awaiting for, either for our node or as forwarded request. > c) auxilarly traffic like acks or location swaps or keepalives. 
> > These undergo no liability limiting, as the transfer already happened - > exactly the same as now. Notice that if the received traffic is 'valuable' > (explicitly requested earlier) we credit the peer with the tokens charged > from the original requestor bucket (the only mechanism to possibly raise > the bucket size above zero), so this peer is not penalized (at very least > not as heavily as now) - and so requests of this peer are not getting > postponed for long. > > As the bonus tokens are not just thrown in, but taken from other peers who > are interested in service, there should be no bursts happening. At least, > charging buckets might be done at the moment the data response sent, not at > the moment of requests (or the charge can be somehow split among those > timepoints). > > > > Now, a question should be raised - if all the liability limiter stuff is > expected to work as it already does, why bother? The trick is that delayed > socket reads will be allowing remote side to notice in advance that the > peer became slower. The remote will not know if that happens due to > exceeding the 1/(P+1) share of bandwidth, or due to CPU/disk/swap/whatever > performance shortage: all these situations require slowing down to avoid > rejects and/or timeouts. So the slowing down is made at 2 levels: first, > the delayed socket reads smooth the load over longer time period, and > second, the requestor will start sending/queueing fewer requests here, more > utilizing alternative paths and less busy peers. And that's the whole > point. > > > Additional note: this algorithm requires no modification if UDP sockets are > mixed with TCP ones (provided that TCP window size is comparable to MTU, > not the default 32/64/128KB). > > > > Until last week I was under impression that the timing based mechanism of > routeing around overloaded nodes is already implemented.
After taking time > to study the code closer it turns out to be false: AFAIS the routing > around happens only upon explicit FNPRejectOverload - the > freenet.node.PeerManager._closerPeer() method does not take the peer > roundtrip average into account at all. Here is the [rather simple] > improvement I propose: instead of making routing decision based solely on > location distance skipping 'unsuitable' locations, I propose searching a > peer by choosing the best (numerically highest) request advance speed, > specifically for each node: > > - disconnected peer is skipped (as it is now); > - the peers we already tried to search the key at are skipped (as it is > now); - the peer the original request came from is skipped (as it is now); > - once passive requests implemented, other requestors of the key are to be > skipped too (if they gain the key, the subscription will time out soon and > then we will be able to try there); > - [the maxDiff and the old version logic is not too clear for me]; > > Now the real salt: > - the speed is calculated as signedLocationDiff(our_location, > peer_location, key_location)/RTT(peer) where signedLocationDiff is positive > if > locationDiff(our_location, key_location)>locationDiff(peer_location, > key_location), and negative otherwise (assuming > our_location!=peer_location, which would give zero location difference but > should never happen). That can be multiplied by nodeAveragePingTime if > values of predictable range (say -3..3) are preferred (but no need to > provide guarantees). > - backed off node speed is reduced by 3 (or thereabout; but maybe > smaller value, like 1 or even 0.5). > > Using the speed for connections sorting is very natural: it basically > allows to answer the question 'which peer the request should be sent to in > order to receive response ASAP - preferring to use slightly more bandwidth > on underloaded nodes (where bandwidth is idle i.e.
virtually free) rather > than waiting for slow/overloaded nodes to do the job, and provided that the > solicited key already exists where it should'. > > And so it gives the following advantages: > - (major) we start routing around busy nodes well before they start > generating FNPRejectOverload packets. > - (minor) no longer need to scan the peer list twice, first skipping backed > off nodes then including them. > - (noticeable) requests that result in successful data retrieve will be > delivering answer to original requestor faster (on average). > - (average) the routes will be more fluid and thus harder for adversary to > supervise or censor; > - (minor) very fast nodes even with small own storage/cache provide > effective caching services (for popular keys) for their immediate peers > which are slow but have huge well-specialized storage. > > And note that this algorithm will be useful even for the current > code/protocol, without delayed socket reads; delaying the socket reads will > undoubtfully have positive impact by smoothing the choice over time (and > efficiently fighting any oscillations) and having routing around busy nodes > to start occuring much earlier (starting from 'boundary' keys, then if the > node anyway gets increasingly busy, by further narrowing down the location > space we try at the busy node). -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available Url : http://emu.freenetproject.org/pipermail/tech/attachments/20070808/19454f2e/attachment.pgp From toad at amphibian.dyndns.org Wed Aug 8 14:56:55 2007 From: toad at amphibian.dyndns.org (Matthew Toseland) Date: Wed, 8 Aug 2007 15:56:55 +0100 Subject: [Tech] [Old frost] network bandwidth usage and QoS Message-ID: <200708081556.56475.toad@amphibian.dyndns.org> ----- ET at mj+bSV4hxRMtCj9fcwy4Ww9_3mc ----- 2007.05.27 - 10:13:11GMT ----- I think Stochastic Fairness Queuing (SFQ) algorithm is appropriate for freenet output. http://www.opalsoft.net/qos/DS-25.htm - equalize bandwidth between each actif node link - minimize delay on each link - adapt link usage when bandwidth change (due to QoS on network or saturate network) ----- Anonymous at o9_0DTuZniSf_+oDmRsonByWxsI ----- 2007.05.27 - 20:08:46GMT ----- Node A and Node B connected: - Node A has bandwidth 1000 with 15 active connections - Node B has bandwidth 5000 with 8 active connections As result, node B feels ok to send to node A datastream with seemingly fair speed of 625, but from Node A standpoint the fair level would be 67. 10x difference. So unlikely a particular QoS algorithm will have noticeable impact on overall performance/fairness, there are much more critical bandwidth control tasks to optimize. (But it should be easy to check: SFQ among other algorithms is implemented in linux, so just configure host, raise fred outgoing bandwidth limit to the sky (and make sure fred has something extra to send - like large storage, and/and/or large inserts), and try to notice any difference, preferably expressed in numbers. I am serious, please report the results if any.) On the other hand, good QoS could be useful on a host/network with freenet node sharing relatively slow internet connection with other applications/users. Unfortunatelly for outgoing traffic only. 
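The Node A / Node B arithmetic above can be checked with a few lines of Java. This is only a sketch of the naive "bandwidth divided by active connections" notion of fairness used in that example; the class and method names are made up for illustration.

```java
final class FairShareSketch {

    /** Naive per-connection fair share: total bandwidth split evenly over active connections. */
    static double fairShare(double bandwidth, int activeConnections) {
        return bandwidth / activeConnections;
    }

    public static void main(String[] args) {
        double fromB = fairShare(5000, 8);   // what node B considers fair to send on one link
        double fromA = fairShare(1000, 15);  // what node A considers fair to receive on one link
        System.out.println("B's view: " + fromB);
        System.out.println("A's view: " + Math.round(fromA));
        System.out.println("ratio:    " + (fromB / fromA));
    }
}
```

The two ends of the same link disagree by a factor of a little over 9 (the "10x difference" in the message), which is the mismatch that a purely local queuing discipline on the sender cannot see.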
----- ET at mj+bSV4hxRMtCj9fcwy4Ww9_3mc ----- 2007.05.28 - 07:37:39GMT ----- I use already QoS on my computer, HTB and SFQ for freenet. When freenet's bandwidth is reduce by QoS, bwlimitDelayTime and nodeAveragePingTime increase and reject requests (SUB_MAX_PING_TIME, MAX_PING_TIME and Output bandwidth liability). Problem is long time to take again normal activity, about 30 minutes after QoS give back bandwidth. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available Url : http://emu.freenetproject.org/pipermail/tech/attachments/20070808/a01d5b2c/attachment.pgp From toad at amphibian.dyndns.org Wed Aug 8 14:57:27 2007 From: toad at amphibian.dyndns.org (Matthew Toseland) Date: Wed, 8 Aug 2007 15:57:27 +0100 Subject: [Tech] [Old frost] network bandwidth usage and QoS In-Reply-To: <200708081556.56475.toad@amphibian.dyndns.org> References: <200708081556.56475.toad@amphibian.dyndns.org> Message-ID: <200708081557.28601.toad@amphibian.dyndns.org> ----- mrogers at UU62+3E1vKT1k+7fR0Gx7ZN2IB0 ----- 2007.05.28 - 22:30:27GMT ----- Interesting - I wonder how quickly the running averages adapt. On Wednesday 08 August 2007 15:56, Matthew Toseland wrote: > ----- ET at mj+bSV4hxRMtCj9fcwy4Ww9_3mc ----- 2007.05.27 - 10:13:11GMT ----- > > I think Stochastic Fairness Queuing (SFQ) algorithm is appropriate for > freenet output. 
> > http://www.opalsoft.net/qos/DS-25.htm > > - equalize bandwidth between each actif node link > - minimize delay on each link > - adapt link usage when bandwidth change (due to QoS on network or saturate > network) > > ----- Anonymous at o9_0DTuZniSf_+oDmRsonByWxsI ----- 2007.05.27 - > 20:08:46GMT ----- > > Node A and Node B connected: > - Node A has bandwidth 1000 with 15 active connections > - Node B has bandwidth 5000 with 8 active connections > > As result, node B feels ok to send to node A datastream with seemingly fair > speed of 625, but from Node A standpoint the fair level would be 67. > > 10x difference. So unlikely a particular QoS algorithm will have noticeable > impact on overall performance/fairness, there are much more critical > bandwidth control tasks to optimize. > > (But it should be easy to check: SFQ among other algorithms is implemented > in linux, so just configure host, raise fred outgoing bandwidth limit to > the sky (and make sure fred has something extra to send - like large > storage, and/and/or large inserts), and try to notice any difference, > preferably expressed in numbers. I am serious, please report the results if > any.) > > On the other hand, good QoS could be useful on a host/network with freenet > node sharing relatively slow internet connection with other > applications/users. Unfortunatelly for outgoing traffic only. > > ----- ET at mj+bSV4hxRMtCj9fcwy4Ww9_3mc ----- 2007.05.28 - 07:37:39GMT ----- > > I use already QoS on my computer, HTB and SFQ for freenet. When freenet's > bandwidth is reduce by QoS, bwlimitDelayTime and nodeAveragePingTime > increase and reject requests (SUB_MAX_PING_TIME, MAX_PING_TIME and Output > bandwidth liability). Problem is long time to take again normal activity, > about 30 minutes after QoS give back bandwidth. -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available Url : http://emu.freenetproject.org/pipermail/tech/attachments/20070808/467a6f86/attachment.pgp From toad at amphibian.dyndns.org Wed Aug 8 14:57:43 2007 From: toad at amphibian.dyndns.org (Matthew Toseland) Date: Wed, 8 Aug 2007 15:57:43 +0100 Subject: [Tech] [Old frost] network bandwidth usage and QoS In-Reply-To: <200708081556.56475.toad@amphibian.dyndns.org> References: <200708081556.56475.toad@amphibian.dyndns.org> Message-ID: <200708081557.44492.toad@amphibian.dyndns.org> ----- Anonymous at o9_0DTuZniSf_+oDmRsonByWxsI ----- 2007.05.30 - 09:17:48GMT ----- This makes me think that the current values for MAX_PING_TIME are a little bit too strict and probably could benefit from lifting them a little. ----- toad at zceUWxlSaHLmvEMnbr4RHnVfehA ----- 2007.05.30 - 15:17:10GMT ----- They're pretty high IMHO. Maybe we're averaging too much though. On Wednesday 08 August 2007 15:56, Matthew Toseland wrote: > ----- ET at mj+bSV4hxRMtCj9fcwy4Ww9_3mc ----- 2007.05.27 - 10:13:11GMT ----- > > I think Stochastic Fairness Queuing (SFQ) algorithm is appropriate for > freenet output. > > http://www.opalsoft.net/qos/DS-25.htm > > - equalize bandwidth between each actif node link > - minimize delay on each link > - adapt link usage when bandwidth change (due to QoS on network or saturate > network) > > ----- Anonymous at o9_0DTuZniSf_+oDmRsonByWxsI ----- 2007.05.27 - > 20:08:46GMT ----- > > Node A and Node B connected: > - Node A has bandwidth 1000 with 15 active connections > - Node B has bandwidth 5000 with 8 active connections > > As result, node B feels ok to send to node A datastream with seemingly fair > speed of 625, but from Node A standpoint the fair level would be 67. > > 10x difference. So unlikely a particular QoS algorithm will have noticeable > impact on overall performance/fairness, there are much more critical > bandwidth control tasks to optimize. 
> > (But it should be easy to check: SFQ among other algorithms is implemented > in linux, so just configure host, raise fred outgoing bandwidth limit to > the sky (and make sure fred has something extra to send - like large > storage, and/and/or large inserts), and try to notice any difference, > preferably expressed in numbers. I am serious, please report the results if > any.) > > On the other hand, good QoS could be useful on a host/network with freenet > node sharing relatively slow internet connection with other > applications/users. Unfortunatelly for outgoing traffic only. > > ----- ET at mj+bSV4hxRMtCj9fcwy4Ww9_3mc ----- 2007.05.28 - 07:37:39GMT ----- > > I use already QoS on my computer, HTB and SFQ for freenet. When freenet's > bandwidth is reduce by QoS, bwlimitDelayTime and nodeAveragePingTime > increase and reject requests (SUB_MAX_PING_TIME, MAX_PING_TIME and Output > bandwidth liability). Problem is long time to take again normal activity, > about 30 minutes after QoS give back bandwidth. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available Url : http://emu.freenetproject.org/pipermail/tech/attachments/20070808/de9dfd9c/attachment.pgp From toad at amphibian.dyndns.org Wed Aug 8 14:59:50 2007 From: toad at amphibian.dyndns.org (Matthew Toseland) Date: Wed, 8 Aug 2007 15:59:50 +0100 Subject: [Tech] [old frost] bandwidth usage improvements for old nodes (and much more!) Message-ID: <200708081559.51360.toad@amphibian.dyndns.org> Unfortunately this thread is rather rambling, it includes lots of discussion on token passing as well as the original premise. 
----- Anonymous at o9_0DTuZniSf_+oDmRsonByWxsI ----- 2007.04.25 - 20:18:36GMT -----

I made some measurements on how a freenet node behaves if the bandwidth limit is set low: 10KBps and down to 6KBps (specifically, input bandwidth limit; output bandwidth limit was set to at least 15KBps but, as expected, factually used output bandwidth is comparable (just slightly above) with factually used input bandwidth). The node itself was running frost but no uploads/downloads, so the absolute majority of network traffic was forwarded CHK/SSK requests/inserts.

Results are interesting enough: CHK traffic becomes as low as 5% (in packets) of CHK+SSK, while at least 92% of SSK requests were not satisfied for assorted failures (plus quite some more certainly resulted in NotFound response due to missing the key in whole network, but I don't have the number). This makes a low traffic node work highly inefficiently and disproportionately slowly; this also slows down its peers with all the extra reject traffic. Worse, input bandwidth sometimes goes over the set limit, suggesting that on a hardware 33600/56000 bps modem and even ISDN things will just get worse due to increased delays.

Another thing to note: a low bandwidth node (LBN) almost exclusively rejects requests with "input bandwidth liability" reason, and extremely rarely other reasons.

Speculating a bit, the same picture will likely be observed for peers of a fast node (1Mbps or more) with many peers having a typical home connection of 256Kbps or less.

Not sure if simulations ever showed anything like this, but contributing to network mostly SSK service (and absolute majority of SSK requests fail!) is rather useless: an optimally working network is supposed to transfer at least one CHK block for each SSK key, and typically much much more (single 10MB file consists of 481 CHK blocks!), and even if you found SSK but not CHK the SSK points to, then you failed to find information you requested.
OK to make the long story short[er], at the end of this message you will find a small patch that noticably improves LBN situation. Idea is to reserve some bandwidth for CHK transfers (and SSK inserts, as those are too rare to penalize, and more valuable than requests). The line directly before the inserted one implicitly penalizes CHK transfers (as much smaller SSK requests tend to rereserve bandwidth the next moment it got released after CHK transfer finish, while much larger CHK requests do not have such good chance), so bandwidth should be reserved for 2 CHKs at least (and tests show that's enough to make a difference). Another thing I tried was increasing the 90 seconds period up to 120. That had some (no numbers here; just "noticeable but small") positive effect on making traffic smoother and staying closer to set limit, without jumping up and down too much. Where the 90 seconds number came from anyway, and how dangerous 120 could be? Some pros observed and/or thought out during tests of the patch: - I observe increase of output payload by approx. 15% (of total traffic), making LBN more useful for its peers. - the change is negligibly small for faster nodes so should not break anything globally. - entire network SSK flood traffic will be toned down a little bit (at temporary overloaded nodes only), additionally simplifying life for LBNs: after all, requesting the same SSK every 15 seconds for 35 hours, total 8100 times (factual numbers from one of the test before this patch applied; there are over 60 other SSKs that were requested more than 1000 times during the same period) is just way too much, SSKs are not inserted into network THAT fast. [does it worth to remember recently seen SSK requests, and do not forward them if same request was already forwarded within last 10 minutes and resulted in DNF/RNF? Table of recently requested SSKs that are closest to the node location should not be too big?]. 
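The bracketed suggestion above (remember recently seen SSK requests and do not forward a repeat if the same request failed within the last 10 minutes) could be sketched roughly as follows. This is a hypothetical illustration, not Freenet code: keys are plain strings, the class and method names are invented, the table is bounded by simply evicting the oldest entry rather than by closeness to the node's location, and the caller supplies the clock.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class RecentFailureTableSketch {
    private final long ttlMillis;
    private final LinkedHashMap<String, Long> failedAt;

    RecentFailureTableSketch(long ttlMillis, final int maxEntries) {
        this.ttlMillis = ttlMillis;
        // Insertion-ordered map that drops its oldest entry once it grows past maxEntries.
        this.failedAt = new LinkedHashMap<String, Long>(16, 0.75f, false) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Long> eldest) {
                return size() > maxEntries;
            }
        };
    }

    /** Record that a forwarded request for this key just ended in DNF/RNF. */
    void recordFailure(String key, long nowMillis) {
        failedAt.remove(key);          // re-insert so the entry counts as the newest
        failedAt.put(key, nowMillis);
    }

    /** True if the key has no recorded failure, or the recorded failure has expired. */
    boolean shouldForward(String key, long nowMillis) {
        Long t = failedAt.get(key);
        if (t == null) {
            return true;
        }
        if (nowMillis - t >= ttlMillis) {
            failedAt.remove(key);
            return true;
        }
        return false;
    }
}
```

For example, with a 10-minute TTL a request repeated every 15 seconds would be forwarded once per 10 minutes instead of 40 times. Whether suppressing requests this way is safe is a separate question that the replies in this thread take up.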
And contras: - in exceptional conditions (specificially, with less than 2 incoming CHK requests per 90 seconds; factually I observe 2-7 CHK requests per seconds, that's 180-630 per 90 seconds) notwithstanding node bandwidth speed, up to 800 Bps might end being unused. For high bandwidth node that's just way too small to notice, for LBN that's still acceptable (10% of 56Kbps) and will decrease roundtrip delays a bit which is always a good thing for so slow links. Other notes: - distribution of location closeness/number of SSK requests is very nice: only SSK requests with location very close to node location get repeated frequently; farther SSK location is, less requests the node sees, with those SSKs seen only once or two times per 1-2 days period are distributed evenly among location space. This suggests that routing is working fine. - As far as I understand, if input bandwidth limit/liability exceeded (but a packet already received anyway), CHK/SSK request gets instantly rejected (thus throwing out received bytes while input bandwidth has no spare volume!); only otherwise node checks if the requested key exists in the storage. Heh? This feels like a serious bug hurting overall network performance: better query storage and hopefully send back result (or still reject if the key not found locally) rather than wait for retry request to waste more input bandwidth. At least for SSK reject and reply are comparable in output bandwidth usage, so worth a little delay in response. Or do I miss something? 
===
diff --git a/freenet/src/freenet/node/NodeStats.java b/freenet/src/freenet/node/NodeStats.java
index 3b091b4..fb9f8b9 100644
--- a/freenet/src/freenet/node/NodeStats.java
+++ b/freenet/src/freenet/node/NodeStats.java
@@ -414,6 +414,7 @@ public class NodeStats implements Persistable {
         successfulChkInsertBytesReceivedAverage.currentValue() * node.getNumCHKInserts() +
         successfulSskInsertBytesReceivedAverage.currentValue() * node.getNumSSKInserts();
       bandwidthLiabilityInput += getSuccessfulBytes(isSSK, isInsert, true).currentValue();
+      if (isSSK && !isInsert) bandwidthLiabilityInput+=successfulChkFetchBytesReceivedAverage.currentValue()+successfulChkInsertBytesReceivedAverage.currentValue(); // slightly penalize SSK requests by reserving bandwidth for 2 additional CHK transfers (or SSK inserts if any)
       double bandwidthAvailableInput =
         node.getInputBandwidthLimit() * 90; // 90 seconds at full power
       if(bandwidthLiabilityInput > bandwidthAvailableInput) {
===

----- toad at zceUWxlSaHLmvEMnbr4RHnVfehA ----- 2007.04.26 - 16:56:59GMT -----

Most SSK requests fail. They DNF. The reason for this is most SSK requests are polling for data that has not yet been inserted.

Bandwidth liability is usually the main reason for rejection. If we reach most of the other reasons, there is a problem - usually a cyclical problem. The main reason for it is to ensure that we don't accept so many requests that some of them needlessly timeout even though they succeeded. The timeout is 120 seconds, so we need the actual transfer to take less than this; on a request, 30 seconds seems a reasonable upper bound for the search time. We don't throw out many bytes when we reject a request/insert because the bulk of it hasn't been sent yet, except with SSKs where typically a little under half of the total bytes will have been moved. Ideally we wouldn't send requests until we have a good idea that they will be accepted, but token passing load balancing is a long way off, not likely to happen for 0.7.0.
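As a rough illustration of the input bandwidth liability check that the patch above modifies, here is a self-contained Java sketch. It follows the shape visible in the diff (sum of average bytes per request type times the number running, plus the expected bytes of the new request, compared against the limit times 90 seconds), but the class, the array-based signature, and all byte counts are assumptions made up for the example; it is not the `NodeStats` implementation.

```java
final class InputLiabilitySketch {
    // From the thread: 120 s timeout, minus about 30 s allowed for the search.
    static final int TIMEOUT_SECONDS = 120;
    static final int SEARCH_SECONDS = 30;
    static final int WINDOW_SECONDS = TIMEOUT_SECONDS - SEARCH_SECONDS; // 90

    /**
     * Returns true if a new request of type newType would be accepted.
     * avgBytes[i] is the average bytes received for a successful request of type i,
     * running[i] is how many requests of type i are currently in flight.
     */
    static boolean accept(double[] avgBytes, int[] running, int newType,
                          double inputLimitBytesPerSecond) {
        double liability = 0;
        for (int i = 0; i < avgBytes.length; i++) {
            liability += avgBytes[i] * running[i];
        }
        liability += avgBytes[newType]; // the request being considered
        double available = inputLimitBytesPerSecond * WINDOW_SECONDS;
        return liability <= available;
    }

    public static void main(String[] args) {
        final int CHK = 0, SSK = 1;
        double[] avgBytes = {34000, 1500}; // hypothetical averages, in bytes
        int[] running = {27, 0};           // 27 CHK transfers already accepted
        double limit = 10 * 1024;          // 10 KB/s input limit
        System.out.println("accept CHK: " + accept(avgBytes, running, CHK, limit));
        System.out.println("accept SSK: " + accept(avgBytes, running, SSK, limit));
    }
}
```

With these invented numbers a nearly full node refuses one more CHK but still admits an SSK, since the SSK fits into the small remaining slack. That is consistent with the observation earlier in the thread that small SSK requests tend to grab bandwidth the moment it is released.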
We cannot control input bandwidth usage precisely.

Any more info on SSK flooding? Is it simply Frost?

We can add a failure table, we had one before, however a failure table which results in actually blocking keys can be extremely dangerous; what I had envisaged was "per node failure tables" i.e. reroute requests which have recently failed to a different node since we know it isn't where it's supposed to be.

On what do you base the assertion about key closeness? It would be nice to have a histogram or circle on the stats pages showing recent keys on the keyspace - can you write a patch?

As far as your patch goes, surely rejecting more SSK requests would be counterproductive as it wastes bandwidth? Shouldn't a slow node accept those requests it's likely to be able to handle? I can see an argument that we shouldn't prefer SSKs, and that on slow nodes we do prefer SSKs... I'm not sure the above is the right way to deal with it though. The effect of the patch would be to never accept any SSKs unless we have plenty of spare bandwidth, correct?

----- Anonymous at o9_0DTuZniSf_+oDmRsonByWxsI ----- 2007.04.26 - 18:41:32GMT -----

> Ideally we wouldn't send requests until we have a good idea that they will be accepted, but token passing load balancing is a long way off, not likely to happen for 0.7.0.

Well, even the current algorithm implementation has certain room for improvement. Here are the typical numbers I observe:

===
unclaimedFIFO Message Counts
* FNPRejectOverload: 89 (45.2%)
* FNPInsertTransfersCompleted: 59 (29.9%)
* FNPDataNotFound: 15 (7.6%)
* packetTransmit: 12 (6.1%)
* FNPRejectLoop: 7 (3.6%)
* FNPAccepted: 6 (3.0%)
* FNPSwapRejected: 4 (2.0%)
* FNPDataInsertRejected: 4 (2.0%)
* FNPRouteNotFound: 1 (0.5%)
* Unclaimed Messages Considered: 197
===

FNPRejectOverload always stays at the top, sometimes with hundreds of messages (for the last hour before unclaimed messages expire, that's a lot), and so indicates that there is some bug (or bugs) in how the bandwidth limits are obeyed.

> Any more info on SSK flooding? Is it simply Frost?

Not local frost for sure, it generates just several simultaneous SSK requests (by default max 8: 6 for boards plus 2 for filesharing, AFAIR; practically 2-4 simultaneous requests most of the time). Other 100 SSK requests (without proposed patch) are forwarded ones.

>We can add a failure table, we had one before, however a failure table which results in actually blocking keys can be extremely dangerous;

Is it, having timeout of max few minutes (i.e. at least few times less than SSK propagation time visible with frost messages)? Is it more dangerous than current wastage of bandwidth for same SSK key requests several times per minute? Had some simulations been done on that in the past?

BTW, doesn't the observed very low store hit rate result from prioritising the likely-to-fail SSKs?

BTW2 the failure table could also act as a targeted content propagation mechanism: if a node sees SSK insert for a temporarily blacklisted (non-existing) SSK, then forwarding the insert (more likely insert copy, for security reasons and routing sake) to the original requestor should speed up propagation of new SSKs toward the nodes that already anticipate/await them.

>what I had envisaged was "per node failure tables" i.e. reroute requests which have recently failed to a different node since we know it isn't where it's supposed to be.

At a glance, very nice idea. But LBNs typically answer with reject, not DNF... even with current code. Probably such rerouting will even further increase SSK traffic toward LBNs, and get sharply increased volume of SSK rejects as result. Hmm, some testing/simulation seems really needed here.

>On what do you base the assertion about key closeness? It would be nice to have a histogram or circle on the stats pages showing recent keys on the keyspace - can you write a patch?

Mmmm... in fact I just added custom logging, then a wild combination of grep/sed/sort/uniq to analyze the logs.
But let me think, maybe visualizing a couple of stats files I operate with will be rather trivial... But I would rather stay away from stats page graphics at this time, as the stats files I operate (filtered+sorted+uniqued) with are rather large, 20-50Mb each - too much memory for the toy. Unless your 'recent' means just 10-15 minutes at most? >As far as your patch goes, surely rejecting more SSK requests would be counterproductive as it wastes bandwidth? Tests show the opposite: without the patch payload output at stats page never exceeded 38%, with patch it becomes 53% or little more after several minutes upon node restart. So, with the patch SSK/CHK forwarding behaviour 'feels' logical: without patch: - just several CHKs, and over over 100 SSKs very typical. with patch: - most of the time (say, 75%) number of currently forwarded CHK requests+inserts approximately equals to the number of SSK requests+inserts (i.e. 10-25 each, depending on set bandwidth limit); - sometimes (say, 10%) CHK requests start to prevail, but current SSK requests+inserts seems never go below the amount which CHK get at max without patch (i.e. 6-10). This is very typical when number of CHK inserts gets several times higher than CHK requests (close fast peer inserts something really large?). - other times (say, 15%) CHK requests+inserts flow does not saturate bandwidth, and number of SSK requests quickly climbs to 50 or even over 100+ as it typically gets without the patch. That's for LBN. Raising input bandwidth allotment, number of SSKs quickly grows resembling the situation without the patch. So that's why I suggest reserving bandwidth for 2 CHK transfers; 3 would kill SSKs, 1 still makes SSKs to seriously prevail over CHKs (but nonetheless gives quite better ratio, so is a legal value to try if the value of 2 alarms you too much). 
Just, in case of reserving bandwidth for 1 extra CHK the proposed patch is not really needed: simply comment out the line starting with "bandwidthLiabilityInput +=" and decrease 90 seconds constant to 80 (10 seconds is roughly how much a 33.6 kbaud modem takes to transmit a single CHK - using anything noticeably slower than 28800/33600 baud for freenet will not ever work well anyway).

>Shouldn't a slow node accept those requests it's likely to be able to handle?

Considering the very high chance of SSK request failures (at least 92%), I would say the answer is no. But with sane SSK failure rate (say 75% or below) SSK requests would likely not waste the limited thus precious LBN bandwidth so fruitlessly.

The problem, in my belief, is too small size of UDP packets if SSK requests prevail: PPP(oE)/TCP/FNP overhead becomes too large while LBNs, unlike faster link nodes, almost never coalesce packets, obviously.

----- toad at zceUWxlSaHLmvEMnbr4RHnVfehA ----- 2007.04.27 - 17:19:24GMT -----

The current algorithm is working, on most nodes, far better than it has in *ages*. I'm at 62% of a 700MB ISO, I started inserting it yesterday morning, and only a few of my peers are backed off - frequently none are backed off, right now it's 11 connected, 6 backed off, which is more backed off than I've seen for quite a while.

Re failure tables: Yes it is extremely dangerous. It can result in self-reinforcing key censorship, either as an attack or just occurring naturally. This happened on 0.5. And the hit ratio is only for CHKs iirc.

Even LBNs don't often send local RejectedOverload's on SSKs *once they have accepted them*. They may relay downstream RO's but that is not fatal. And if they reject some requests, so what, it's a slow node, it's bound to reject some requests with the current load balancing system.

10-15 minutes would be interesting. We already show a circle and histogram of nearby nodes from swapping and of our peers, you'd just have to add another one.
It would be good to have a visual proof that routing is working on the level of adhering to node specialisations. I didn't expect it to be working given the load: I'm surprised that it does, it's an interesting result. Packet size has nothing to do with it, ethernet has a 1472 byte maximum. Dial-up has 576 bytes max, but we ignore it, and use fragmented packets (this sucks, obviously, as it greatly increases the chance of losing a packet and having to retransmit it). Please explain why the patch doesn't result in never accepting a single SSK? ----- Anonymous at o9_0DTuZniSf_+oDmRsonByWxsI ----- 2007.04.27 - 19:31:14GMT ----- >Packet size has nothing to do with it, ethernet has a 1472 byte maximum. Dial-up has 576 bytes max, but we ignore it, and use fragmented packets (this sucks, obviously, as it greatly increases the chance of losing a packet and having to retransmit it). I am talking about typical/average packet size, not MTU. LBNs, unlike faster nodes, rarely have a chance to coalesce reject responses (over max 100ms), and thus send improportionally more tiny packets resulting in much higher protocols overhead. Thus having LBNs to mostly cater SSKs not CHKs results in lowest imaginable usefulness of LBNs for network as a whole. BTW in my experience typical/default dialup/PPP MTU is 1500 minus link level headers, like ethernet. 576 is a reasonable adjustment for interactive traffic like ssh but I fail to remember if it was used as default since the time the super fast 28800 bod modems became common. :) 1400+ is the typical size of GPRS PPP packets too, and the same holds true for other popular wireless mediae like BlueTooth or WiFi; so I have no concerns regarding IP fragmentation. > Please explain why the patch doesn't result in never accepting a single SSK? I can not. 
:) Can you explain why the current code that penalizes CHKs still gives 5% for them, even if CHKs are 25 times larger and similarly less frequent, so have a really hard time arriving at the exact moment when bandwidth liability is not maxed out?

Seriously, I believe that goes with 2 facts:

- SSK requests are much more frequent, so any temporary drop in the level of CHK requests enables the node to quickly get a bunch of new SSKs accepted for processing;
- the large CHK requests (at times when they prevail over SSKs) tend to hit other limits too, like the "output bandwidth liability" and "Insufficient input/output bandwidth" throttles. Then the small SSK requests quickly pick up all the remaining bits of bandwidth.

But currently I do not have relevant statistics to prove that.

Anyway, please commit the following patch - it should even out bandwidth rights for CHKs and SSKs at least half way toward a fair/expected distribution (and the change will make a difference for high-/over-loaded nodes only). Once most of my peers (and their peers) update, I will study the new node traffic forwarding efficiency and behavior at different bandwidth limits and with different penalization levels again - and then will be in a better position to prove that the original proposal of reserving bandwidth for 2 CHKs is optimal (or maybe withdraw it).
===
diff --git a/freenet/src/freenet/node/NodeStats.java b/freenet/src/freenet/node/NodeStats.java
index 3b091b4..98c82c3 100644
--- a/freenet/src/freenet/node/NodeStats.java
+++ b/freenet/src/freenet/node/NodeStats.java
@@ -399,9 +399,8 @@ public class NodeStats implements Persistable {
 				successfulSskFetchBytesSentAverage.currentValue() * node.getNumSSKRequests() +
 				successfulChkInsertBytesSentAverage.currentValue() * node.getNumCHKInserts() +
 				successfulSskInsertBytesSentAverage.currentValue() * node.getNumSSKInserts();
-			bandwidthLiabilityOutput += getSuccessfulBytes(isSSK, isInsert, false).currentValue();
 			double bandwidthAvailableOutput =
-				node.getOutputBandwidthLimit() * 90; // 90 seconds at full power; we have to leave some time for the search as well
+				node.getOutputBandwidthLimit() * 80; // 80 seconds at full power; we have to leave some time for the search as well
 			bandwidthAvailableOutput *= NodeStats.FRACTION_OF_BANDWIDTH_USED_BY_REQUESTS;
 			if(bandwidthLiabilityOutput > bandwidthAvailableOutput) {
 				preemptiveRejectReasons.inc("Output bandwidth liability");
@@ -413,9 +412,8 @@ public class NodeStats implements Persistable {
 				successfulSskFetchBytesReceivedAverage.currentValue() * node.getNumSSKRequests() +
 				successfulChkInsertBytesReceivedAverage.currentValue() * node.getNumCHKInserts() +
 				successfulSskInsertBytesReceivedAverage.currentValue() * node.getNumSSKInserts();
-			bandwidthLiabilityInput += getSuccessfulBytes(isSSK, isInsert, true).currentValue();
 			double bandwidthAvailableInput =
-				node.getInputBandwidthLimit() * 90; // 90 seconds at full power
+				node.getInputBandwidthLimit() * 80; // 80 seconds at full power
 			if(bandwidthLiabilityInput > bandwidthAvailableInput) {
 				preemptiveRejectReasons.inc("Input bandwidth liability");
 				return "Input bandwidth liability";
===

----- toad at zceUWxlSaHLmvEMnbr4RHnVfehA ----- 2007.04.28 - 17:05:53GMT -----

Why does assuming 80 seconds instead of 90 help? I would have expected it to make the situation worse.
Isn't what you want to increment the value you are multiplying the CHK byte counters by, if the request is an SSK? In any case I'm not convinced - we accept 32x as many SSKs as CHKs precisely because they use 32x less bandwidth. As far as I can see incrementing the CHK counts but only on a CHK would just result in us never accepting an SSK...

But by all means continue to investigate.

----- mrogers at UU62+3E1vKT1k+7fR0Gx7ZN2IB0 ----- 2007.04.30 - 19:36:36GMT -----

> we accept 32x as many SSKs as CHKs precisely because they use 32x less bandwidth.

Sorry but I don't understand the rationale behind this. It seems to be based on the assumption that equal resources should be allocated to SSKs and CHKs, regardless of whether there's equal demand for resources. If we're only getting, say, 16 times as many SSK requests as CHK requests, would we reject CHK requests to keep things "fair"?

----- toad at zceUWxlSaHLmvEMnbr4RHnVfehA ----- 2007.05.02 - 16:13:52GMT -----

Why should CHKs be prioritised over SSKs?

What do you think of the patch I committed anyway?

From toad at amphibian.dyndns.org Wed Aug 8 15:00:37 2007
From: toad at amphibian.dyndns.org (Matthew Toseland)
Date: Wed, 8 Aug 2007 16:00:37 +0100
Subject: [Tech] [old frost] bandwidth usage improvements for old nodes (and much more!)
In-Reply-To: <200708081559.51360.toad@amphibian.dyndns.org>
References: <200708081559.51360.toad@amphibian.dyndns.org>
Message-ID: <200708081600.38189.toad@amphibian.dyndns.org>

----- mrogers at UU62+3E1vKT1k+7fR0Gx7ZN2IB0 ----- 2007.05.03 - 14:26:53GMT -----

I haven't looked at it in detail, but it looks like NodeStats.java is still keeping a separate liability estimator for each search type, correct?
Sorry but I really don't agree with the rationale behind that. It might seem like it makes sense to accept (for example) an SSK request but not a CHK insert if we're really tight on bandwidth, but the knock-on effects of that (especially with a separate throttle for each type of search) are undesirable. It would make more sense to use a single bandwidth liability estimator for all searches (CHK or SSK, insert or request). Yes, it will overestimate the bandwidth liability for SSK requests and underestimate it for CHK inserts, but *on average* it will still be correct, and it won't have any weird knock-on effects.

----- toad at zceUWxlSaHLmvEMnbr4RHnVfehA ----- 2007.05.04 - 12:48:20GMT -----

On average a lot of requests will time out if we implement it this way.

----- mrogers at UU62+3E1vKT1k+7fR0Gx7ZN2IB0 ----- 2007.05.06 - 15:19:43GMT -----

Sorry, can you explain why? On average we won't go over quota, so why will it cause timeouts? The bandwidth limiter can handle slightly bursty traffic.

----- toad at zceUWxlSaHLmvEMnbr4RHnVfehA ----- 2007.05.10 - 12:06:00GMT -----

No, we are not supposed to go much over the limit for more than a few seconds. Because it interferes with other usage of the connection.

----- mrogers at UU62+3E1vKT1k+7fR0Gx7ZN2IB0 ----- 2007.05.10 - 13:36:55GMT -----

That's easily dealt with. For example, track the variance as well as the average, and don't accept a search unless there's a 99% chance you have enough bandwidth to handle it. Or calculate the maximum bandwidth a search could ever use (32 KB * (numPeers - 1)) and don't accept any searches unless you have that much bandwidth. It's really not important how you do it - the important thing is that we shouldn't accept an SSK when we would have rejected a CHK, or accept a request when we would have rejected an insert. Otherwise we'll get dismal performance for CHKs and dismal performance for inserts, respectively.
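
[Editorial illustration, not part of the original thread.] The two admission rules mrogers describes in the post above could be sketched roughly as follows. This is only a minimal sketch under stated assumptions: the class `SingleLiabilityEstimator` and all of its method names are hypothetical and do not exist in Freenet's NodeStats.java; the 99% rule here assumes per-search costs are independent and that their sum is approximately normal, which the thread does not claim. The point it illustrates is that a single estimator makes the accept/reject decision independent of the search type.

```java
/** Hypothetical sketch: one liability estimator shared by all search types. */
final class SingleLiabilityEstimator {
    // One-sided 99% quantile of the standard normal distribution.
    private static final double Z_99 = 2.326;
    // Largest block a search can move, per the 32 KB figure in the post above.
    private static final long MAX_BLOCK_BYTES = 32 * 1024;

    private long samples;   // completed searches observed so far
    private double mean;    // running mean of bytes per search
    private double m2;      // running sum of squared deviations (Welford's method)

    /** Record the bytes actually moved by one completed search of any type. */
    void report(double bytes) {
        samples++;
        double delta = bytes - mean;
        mean += delta / samples;
        m2 += delta * (bytes - mean);
    }

    double mean() { return mean; }

    double variance() { return samples > 1 ? m2 / (samples - 1) : 0.0; }

    /**
     * Accept one more search only if (running + 1) searches fit in the budget
     * with roughly 99% confidence. Independence and normality are assumptions.
     */
    boolean acceptWithConfidence(int running, double bytesPerSecond, double windowSeconds) {
        int n = running + 1;
        double budget = bytesPerSecond * windowSeconds;
        double expected = n * mean;
        double stdDev = Math.sqrt(n * variance());
        return expected + Z_99 * stdDev <= budget;
    }

    /** Worst-case alternative: 32 KB * (numPeers - 1) must fit in what is left. */
    static boolean acceptWorstCase(double committedBytes, int numPeers,
                                   double bytesPerSecond, double windowSeconds) {
        double worstCase = (double) MAX_BLOCK_BYTES * (numPeers - 1);
        return committedBytes + worstCase <= bytesPerSecond * windowSeconds;
    }

    public static void main(String[] args) {
        SingleLiabilityEstimator est = new SingleLiabilityEstimator();
        for (int i = 0; i < 32; i++) est.report(1024);  // many small SSK-sized searches
        est.report(32 * 1024);                          // one CHK-sized search
        // 10 KB/s limit over a 90 second window, 20 searches already running.
        System.out.println(est.acceptWithConfidence(20, 10 * 1024, 90));
        System.out.println(acceptWorstCase(0, 11, 10 * 1024, 90));
    }
}
```

Neither method takes an isSSK or isInsert argument, so under this sketch an SSK can never be accepted at a moment when a CHK would have been rejected. The 90 second window in the example is simply the constant discussed earlier in the thread.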
----- toad at zceUWxlSaHLmvEMnbr4RHnVfehA ----- 2007.05.10 - 15:27:08GMT -----

Like I said, please have a look at the code.

----- mrogers at UU62+3E1vKT1k+7fR0Gx7ZN2IB0 ----- 2007.05.11 - 14:11:42GMT -----

I've looked, and the situation is still the same: we reject CHKs while accepting SSKs, and reject inserts while accepting requests, because there are four different running averages for bandwidth liability. Here's my understanding of the current situation, please correct me if I'm wrong:

* We keep four separate bandwidth liability estimators for incoming SSK requests, SSK inserts, CHK requests and CHK inserts, respectively
* When a request or insert comes in, we use the ping time and bandwidth delay to decide whether to reject it
* If it passes those tests, we use one of the four bandwidth liability estimators to decide whether to reject it
* We occasionally accept a CHK request or insert even if there isn't enough bandwidth, to keep the bandwidth delay estimator up to date

While I was looking at the code I came across some more things I don't understand in NodeDispatcher.java - would you mind giving me some more information?

// can accept 1 CHK request every so often, but not with SSKs because they aren't throttled so won't sort out bwlimitDelayTime, which was the whole reason for accepting them when overloaded...

// SSKs don't fix bwlimitDelayTime so shouldn't be accepted when overloaded.

The first seems to imply that incoming requests as well as local requests pass through the throttle - is that correct? But SSKs bypass the throttle - does that apply to local requests as well as incoming requests?

The second suggests that SSKs aren't considered when calculating bwlimitDelayTime - which messages are considered?

Thanks,
Michael

On Wednesday 08 August 2007 15:59, Matthew Toseland wrote:
> Unfortunately this thread is rather rambling, it includes lots of
> discussion on token passing as well as the original premise.
From toad at amphibian.dyndns.org Wed Aug 8 15:01:36 2007
From: toad at amphibian.dyndns.org (Matthew Toseland)
Date: Wed, 8 Aug 2007 16:01:36 +0100
Subject: [Tech] [old frost] bandwidth usage improvements for old nodes (and much more!)
In-Reply-To: <200708081600.38189.toad@amphibian.dyndns.org>
References: <200708081559.51360.toad@amphibian.dyndns.org> <200708081600.38189.toad@amphibian.dyndns.org>
Message-ID: <200708081601.37106.toad@amphibian.dyndns.org>

----- Anonymous at o9_0DTuZniSf_+oDmRsonByWxsI ----- 2007.05.14 - 17:07:13GMT -----

As far as I can see, there are 4 different PACKET SIZE estimators, but they all add up toward a single input liability limit, and, separately, toward a single output liability limit.
> > > > ----- Anonymous at o9_0DTuZniSf_+oDmRsonByWxsI ----- 2007.04.25 - > > 20:18:36GMT ----- > > > > I made some measurements on how a freenet node behaves if the bandwidth limit > > is set low: 10KBps and down to 6KBps (specifically, the input bandwidth > > limit; the output bandwidth limit was set to at least 15KBps, but as expected > > the output bandwidth actually used is comparable to (just slightly above) the > > input bandwidth actually used). The node itself was running frost but no > > uploads/downloads, so the absolute majority of network traffic was forwarded > > CHK/SSK > > requests/inserts. > > > > The results are interesting enough: CHK traffic becomes as low as 5% (in > > packets) of CHK+SSK, while at least 92% of SSK requests were not > > satisfied due to assorted failures (plus quite a few more certainly resulted > > in a NotFound response because the key is missing from the whole network, but I don't > > have the number). This makes a low traffic node highly inefficient > > and disproportionately slow; it also slows down its peers with all the > > extra reject traffic. Worse, input bandwidth sometimes goes over the set > > limit, suggesting that on a hardware 33600/56000 bps modem and even ISDN > > things will just get worse due to increased delays. > > > > Another thing to note: a low bandwidth node (LBN) almost exclusively rejects > > requests with the "input bandwidth liability" reason, and extremely rarely for > > other reasons. > > > > Speculating a bit, the same picture will likely be observed for peers of a > > fast node (1Mbps or more) with many peers having a typical home connection > > of 256Kbps or less. > > > > Not sure if simulations ever showed anything like this, but contributing > > to the network mostly SSK service (and the absolute majority of SSK requests > > fail!)
is rather useless: an optimally working network is supposed to > > transfer at least one CHK block for each SSK key, and typically much much > > more (a single 10MB file consists of 481 CHK blocks!), and even if you > > found the SSK but not the CHK the SSK points to, then you failed to find the > > information you requested. > > > > OK to make the long story short[er], at the end of this message you will > > find a small patch that noticeably improves the LBN situation. The idea is to > > reserve some bandwidth for CHK transfers (and SSK inserts, as those are > > too rare to penalize, and more valuable than requests). The line directly > > before the inserted one implicitly penalizes CHK transfers (as the much > > smaller SSK requests tend to re-reserve bandwidth the moment it is > > released after a CHK transfer finishes, while the much larger CHK requests do not > > have such a good chance), so bandwidth should be reserved for 2 CHKs at > > least (and tests show that's enough to make a difference). > > > > Another thing I tried was increasing the 90 seconds period up to 120. > > That had some (no numbers here; just "noticeable but small") positive > > effect on making traffic smoother and staying closer to the set limit, > > without jumping up and down too much. Where did the 90 seconds number come > > from anyway, and how dangerous could 120 be? > > > > Some pros observed and/or thought out during tests of the patch: > > - I observe an increase of output payload by approx. 15% (of total traffic), > > making the LBN more useful for its peers. > > - the change is negligibly small for faster nodes so should not break > > anything globally.
> > - entire network SSK flood traffic will be toned down a little bit (at > > temporarily overloaded nodes only), additionally simplifying life for LBNs: > > after all, requesting the same SSK every 15 seconds for 35 hours, a total of > > 8100 times (actual numbers from one of the tests before this patch was > > applied; there are over 60 other SSKs that were requested more than 1000 > > times during the same period) is just way too much, SSKs are not inserted > > into the network THAT fast. [Is it worth remembering recently seen SSK > > requests, and not forwarding them if the same request was already forwarded > > within the last 10 minutes and resulted in DNF/RNF? A table of recently > > requested SSKs that are closest to the node location should not be too > > big?]. > > > > And cons: > > - in exceptional conditions (specifically, with fewer than 2 incoming CHK > > requests per 90 seconds; in practice I observe 2-7 CHK requests per > > second, that's 180-630 per 90 seconds) regardless of node bandwidth > > speed, up to 800 Bps might end up being unused. For a high bandwidth node > > that's just way too small to notice, for an LBN that's still acceptable (10% > > of 56Kbps) and will decrease roundtrip delays a bit which is always a > > good thing for such slow links. > > > > Other notes: > > - the distribution of location closeness/number of SSK requests is very nice: > > only SSK requests with a location very close to the node location get repeated > > frequently; the farther the SSK location is, the fewer requests the node sees, with > > those SSKs seen only once or twice per 1-2 day period being > > distributed evenly across the location space. This suggests that routing is > > working fine. > > - As far as I understand, if the input bandwidth > > limit/liability is exceeded (but a packet has already been received anyway), the CHK/SSK > > request gets instantly rejected (thus throwing out the received bytes while > > input bandwidth has no spare volume!); only otherwise does the node check if the > > requested key exists in the storage. Heh?
This feels like a serious bug > > hurting overall network performance: better to query the storage and hopefully > > send back the result (or still reject if the key is not found locally) rather > > than wait for a retry request to waste more input bandwidth. At least for > > SSKs, reject and reply are comparable in output bandwidth usage, so it is worth a > > little delay in response. Or am I missing something?
> >
> > ===
> > diff --git a/freenet/src/freenet/node/NodeStats.java b/freenet/src/freenet/node/NodeStats.java
> > index 3b091b4..fb9f8b9 100644
> > --- a/freenet/src/freenet/node/NodeStats.java
> > +++ b/freenet/src/freenet/node/NodeStats.java
> > @@ -414,6 +414,7 @@ public class NodeStats implements Persistable {
> >      successfulChkInsertBytesReceivedAverage.currentValue() * node.getNumCHKInserts() +
> >      successfulSskInsertBytesReceivedAverage.currentValue() * node.getNumSSKInserts();
> >      bandwidthLiabilityInput += getSuccessfulBytes(isSSK, isInsert, true).currentValue();
> > +    if (isSSK && !isInsert) bandwidthLiabilityInput+=successfulChkFetchBytesReceivedAverage.currentValue()+successfulChkInsertBytesReceivedAverage.currentValue(); // slightly penalize SSK requests by reserving bandwidth for 2 additional CHK transfers (or SSK inserts if any)
> >      double bandwidthAvailableInput =
> >          node.getInputBandwidthLimit() * 90; // 90 seconds at full power
> >      if(bandwidthLiabilityInput > bandwidthAvailableInput) {
> > ===
> > > > ----- toad at zceUWxlSaHLmvEMnbr4RHnVfehA ----- 2007.04.26 - 16:56:59GMT > > ----- > > > > Most SSK requests fail. They DNF. The reason for this is most SSK > > requests are polling for data that has not yet been inserted. > > > > Bandwidth liability is usually the main reason for rejection. If we reach > > most of the other reasons, there is a problem - usually a cyclical > > problem.
The main reason for it is to ensure that we don't accept so many > > requests that some of them needlessly timeout even though they succeeded. > > The timeout is 120 seconds, so we need the actual transfer to take less > > than this; on a request, 30 seconds seems a reasonable upper bound for > > the search time. We don't throw out many bytes when we reject a > > request/insert because the bulk of it hasn't been sent yet, except with > > SSKs where typically a little under half of the total bytes will have > > been moved. Ideally we wouldn't send requests until we have a good idea > > that they will be accepted, but token passing load balancing is a long > > way off, not likely to happen for 0.7.0. > > > > We cannot control input bandwidth usage precisely. > > > > Any more info on SSK flooding? Is it simply Frost? > > > > We can add a failure table, we had one before, however a failure table > > which results in actually blocking keys can be extremely dangerous; what > > I had envisaged was "per node failure tables" i.e. reroute requests which > > have recently failed to a different node since we know it isn't where > > it's supposed to be. > > > > On what do you base the assertion about key closeness? It would be nice > > to have a histogram or circle on the stats pages showing recent keys on > > the keyspace - can you write a patch? > > > > As far as your patch goes, surely rejecting more SSK requests would be > > counterproductive as it wastes bandwidth? Shouldn't a slow node accept > > those requests it's likely to be able to handle? > > > > I can see an argument that we shouldn't prefer SSKs, and that on slow > > nodes we do prefer SSKs... I'm not sure the above is the right way to > > deal with it though. The effect of the patch would be to never accept any > > SSKs unless we have plenty of spare bandwidth, correct? 
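To make the timing argument above concrete, here is a small Java sketch of the byte budget it implies. It assumes the 90-second constant in NodeStats.java is the 120-second timeout minus the 30-second search allowance (the thread suggests this but does not state it outright). The class name, method names, the reading of 10KBps as 10 * 1024 bytes per second, and the 32 KB CHK size are illustrative assumptions.

```java
/** Hypothetical sketch of the transfer-window arithmetic; not Freenet code. */
final class TransferBudget {
    static final int REQUEST_TIMEOUT_SECONDS = 120; // timeout quoted in the thread
    static final int SEARCH_ALLOWANCE_SECONDS = 30; // search-time upper bound quoted in the thread

    /** Seconds left for the actual data transfer. */
    static int transferWindowSeconds() {
        return REQUEST_TIMEOUT_SECONDS - SEARCH_ALLOWANCE_SECONDS;
    }

    /** Bytes a node can move within the window at its configured limit. */
    static long budgetBytes(long bandwidthLimitBytesPerSecond) {
        return bandwidthLimitBytesPerSecond * transferWindowSeconds();
    }

    /** How many whole transfers of the given size fit into the budget. */
    static long transfersThatFit(long bandwidthLimitBytesPerSecond, long bytesPerTransfer) {
        return budgetBytes(bandwidthLimitBytesPerSecond) / bytesPerTransfer;
    }

    public static void main(String[] args) {
        long limit = 10 * 1024; // assumed: 10 KB/s, the low-bandwidth setting from the measurements
        long chk = 32 * 1024;   // assumed bytes per CHK transfer
        System.out.println(transferWindowSeconds());      // 90
        System.out.println(budgetBytes(limit));           // 921600
        System.out.println(transfersThatFit(limit, chk)); // 28
    }
}
```

Under these assumptions a 10 KB/s node can commit to roughly 28 CHK-sized transfers at once before the liability check starts rejecting, which gives a feel for how quickly many small SSK requests can fill the same budget.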
> > > > ----- Anonymous at o9_0DTuZniSf_+oDmRsonByWxsI ----- 2007.04.26 - > > 18:41:32GMT ----- > > > > > Ideally we wouldn't send requests until we have a good idea that they > > > will be accepted, but token passing load balancing is a long way off, not > > > likely to happen for 0.7.0. > > > > Well, even the current algorithm implementation has certain room for > > improvement. Here are the typical numbers I observe:
> >
> > ===
> > unclaimedFIFO Message Counts
> > * FNPRejectOverload: 89 (45.2%)
> > * FNPInsertTransfersCompleted: 59 (29.9%)
> > * FNPDataNotFound: 15 (7.6%)
> > * packetTransmit: 12 (6.1%)
> > * FNPRejectLoop: 7 (3.6%)
> > * FNPAccepted: 6 (3.0%)
> > * FNPSwapRejected: 4 (2.0%)
> > * FNPDataInsertRejected: 4 (2.0%)
> > * FNPRouteNotFound: 1 (0.5%)
> > * Unclaimed Messages Considered: 197
> > ===
> > > > FNPRejectOverload always stays at the top, sometimes with hundreds of messages > > (for the last hour before unclaimed messages expire, that's a lot), and so > > indicates that there is some bug (or bugs) in obeying the bandwidth limit. > > > > > Any more info on SSK flooding? Is it simply Frost? > > > > Not local frost for sure, it generates just several simultaneous SSK > > requests (by default max 8: 6 for boards plus 2 for filesharing, AFAIR; > > practically 2-4 simultaneous requests most of the time). The other 100 SSK > > requests (without the proposed patch) are forwarded ones. > > > > > We can add a failure table, we had one before, however a failure table > > > which results in actually blocking keys can be extremely dangerous; > > > > Is it, with a timeout of at most a few minutes (i.e. at least a few times less > > than the SSK propagation time visible with frost messages)? Is it more > > dangerous than the current waste of bandwidth on requests for the same SSK key > > several times per minute? Have any simulations been done on that in the > > past? > > > > BTW, isn't the observed very low store hit rate a result of prioritising > > the likely-to-fail SSKs?
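The short-lived failure table asked about above could look roughly like the following Java sketch. It is purely illustrative: the class, its methods, the string keys and the 10-minute expiry (the figure floated earlier in the thread) are all assumptions, and it does nothing to address the danger of blocking keys that toad warns about.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of a time-limited table of recently failed keys. */
final class RecentFailureTable {
    private final long expiryMillis;
    private final Map<String, Long> failedAt = new HashMap<>();

    RecentFailureTable(long expiryMillis) {
        this.expiryMillis = expiryMillis;
    }

    /** Remember that a request for this key just failed (e.g. with DNF/RNF). */
    void recordFailure(String key, long nowMillis) {
        failedAt.put(key, nowMillis);
    }

    /** True if the key failed recently enough that it would not be forwarded again. */
    boolean shouldSuppress(String key, long nowMillis) {
        Long when = failedAt.get(key);
        if (when == null) {
            return false;
        }
        if (nowMillis - when >= expiryMillis) {
            failedAt.remove(key); // entry expired; forget it
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        RecentFailureTable table = new RecentFailureTable(10 * 60 * 1000L);
        table.recordFailure("SSK@example", 0L);
        System.out.println(table.shouldSuppress("SSK@example", 15_000L));      // true
        System.out.println(table.shouldSuppress("SSK@example", 11 * 60_000L)); // false
    }
}
```

A real table would also need a size bound, for instance keeping only keys close to the node's own location as suggested in the thread; this sketch grows without limit.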
> > > > BTW2 the failure table could also act as a targeted content propagation > > mechanism: if a node sees an SSK insert for a temporarily blacklisted > > (non-existing) SSK, then forwarding the insert (more likely a copy of the insert, > > for security reasons and for routing's sake) to the original requestor should > > speed up propagation of new SSKs toward the nodes that already > > anticipate/await them. > > > > > what I had envisaged was "per node failure tables" i.e. reroute requests > > > which have recently failed to a different node since we know it isn't > > > where it's supposed to be. > > > > At a glance, a very nice idea. But LBNs typically answer with a reject, not > > DNF... even with the current code. Probably such rerouting will even further > > increase SSK traffic toward LBNs, and produce a sharply increased volume of SSK > > rejects as a result. Hmm, some testing/simulation seems really needed here. > > > > > On what do you base the assertion about key closeness? It would be nice > > > to have a histogram or circle on the stats pages showing recent keys on the > > > keyspace - can you write a patch? > > > > Mmmm... in fact I just added custom logging, then a wild combination of > > grep/sed/sort/uniq to analyze the logs. But let me think, maybe > > visualizing a couple of the stats files I work with will be rather > > trivial... > > > > But I would rather stay away from stats page graphics at this time, as > > the stats files I work with (filtered+sorted+uniqued) are rather > > large, 20-50Mb each - too much memory for the toy. Unless your 'recent' > > means just 10-15 minutes at most? > > > > > As far as your patch goes, surely rejecting more SSK requests would be > > > counterproductive as it wastes bandwidth? > > > > Tests show the opposite: without the patch the payload output on the stats page > > never exceeded 38%, with the patch it becomes 53% or a little more > > several minutes after node restart.
So, with the patch the SSK/CHK forwarding > > behaviour 'feels' logical: > > > > without patch: > > - just several CHKs, and over 100 SSKs is very typical. > > > > with patch: > > - most of the time (say, 75%) the number of currently forwarded CHK > > requests+inserts approximately equals the number of SSK > > requests+inserts (i.e. 10-25 each, depending on the set bandwidth limit); > > - sometimes (say, 10%) CHK requests start to prevail, but current SSK > > requests+inserts never seem to go below the amount which CHKs get at most > > without the patch (i.e. 6-10). This is very typical when the number of CHK > > inserts gets several times higher than CHK requests (a close fast peer > > inserts something really large?). > > - other times (say, 15%) the CHK requests+inserts flow does not saturate > > bandwidth, and the number of SSK requests quickly climbs to 50 or even over > > 100 as it typically does without the patch. > > > > That's for an LBN. When raising the input bandwidth allotment, the number of SSKs quickly > > grows, resembling the situation without the patch. > > > > So that's why I suggest reserving bandwidth for 2 CHK transfers; 3 would > > kill SSKs, 1 still lets SSKs seriously prevail over CHKs (but > > nonetheless gives a considerably better ratio, so is a legitimate value to try if the > > value of 2 alarms you too much). Just, in the case of reserving bandwidth for > > 1 extra CHK the proposed patch is not really needed: simply comment out > > the line starting with "bandwidthLiabilityInput +=" and decrease the 90 > > seconds constant to 80 (10 seconds is roughly how long a 33.6 kbaud modem > > takes to transmit a single CHK - using anything noticeably slower than > > 28800/33600 baud for freenet will never work well anyway). > > > > > Shouldn't a slow node accept those requests it's likely to be able to > > > handle? > > > > Considering the very high chance of SSK request failures (at least 92%), I > > would say the answer is no.
But with sane SSK failure rate (say 75% or > > below) SSK requests would likely not waste the limited thus precious LBN > > bandwidth so fruitlessly. > > > > The problem, in my belief, is too small size of UDP packets if SSK > > requests prevail: PPP(oE)/TCP/FNP overhead becomes too large while LBNs, > > unlike faster link nodes, almost never coalesce packets, obviously. > > > > ----- toad at zceUWxlSaHLmvEMnbr4RHnVfehA ----- 2007.04.27 - 17:19:24GMT > > ----- > > > > The current algorithm is working, on most nodes, far better than it has > > in *ages*. I'm at 62% of a 700MB ISO, I started inserting it yesterday > > morning, and only a few of my peers are backed off - frequently none are > > backed off, right now it's 11 connected, 6 backed off, which is more > > backed off than I've seen for quite a while. > > > > Re failure tables: Yes it is extremely dangerous. It can result in > > self-reinforcing key censorship, either as an attack or just occurring > > naturally. This happened on 0.5. And the hit ratio is only for CHKs iirc. > > > > Even LBNs don't often send local RejectedOverload's on SSKs *once they > > have accepted them*. They may relay downstream RO's but that is not > > fatal. And if they reject some requests, so what, it's a slow node, it's > > bound to reject some requests with the current load balancing system. > > > > 10-15 minutes would be interesting. We already show a circle and > > histogram of nearby nodes from swapping and of our peers, you'd just have > > to add another one. It would be good to have a visual proof that routing > > is working on the level of adhering to node specialisations. I didn't > > expect it to be working given the load: I'm surprised that it does, it's > > an interesting result. > > > > Packet size has nothing to do with it, ethernet has a 1472 byte maximum. 
> > Dial-up has 576 bytes max, but we ignore it, and use fragmented packets > > (this sucks, obviously, as it greatly increases the chance of losing a > > packet and having to retransmit it). > > > > Please explain why the patch doesn't result in never accepting a single > > SSK? > > > > ----- Anonymous at o9_0DTuZniSf_+oDmRsonByWxsI ----- 2007.04.27 - > > 19:31:14GMT ----- > > > > > Packet size has nothing to do with it, ethernet has a 1472 byte maximum. > > > Dial-up has 576 bytes max, but we ignore it, and use fragmented packets > > > (this sucks, obviously, as it greatly increases the chance of losing a > > > packet and having to retransmit it). > > > > I am talking about typical/average packet size, not MTU. LBNs, unlike > > faster nodes, rarely have a chance to coalesce reject responses (over max > > 100ms), and thus send disproportionately more tiny packets, resulting in > > much higher protocol overhead. Thus having LBNs mostly cater to SSKs, not > > CHKs, results in the lowest imaginable usefulness of LBNs for the network as a > > whole. > > > > BTW in my experience the typical/default dialup/PPP MTU is 1500 minus link > > level headers, like ethernet. 576 is a reasonable adjustment for > > interactive traffic like ssh but I fail to remember if it has been used as a > > default since the time the super fast 28800 baud modems became common. :) > > 1400+ is the typical size of GPRS PPP packets too, and the same holds > > true for other popular wireless media like Bluetooth or WiFi; so I have > > no concerns regarding IP fragmentation. > > > > > Please explain why the patch doesn't result in never accepting a single > > > SSK? > > > > I cannot. :) Can you explain why the current code that penalizes CHKs > > still gives 5% for them, even if CHKs are 25 times larger and similarly > > less frequent so have a really hard time arriving at the exact moment when > > bandwidth liability is not maxed out?
> > > > Seriously, I believe that goes with 2 facts: > > > > - SSK requests are much more frequent, so any temporary drop of CHK > > requests level enables node to quickly get a bunch of new SSKs accepted > > for processing; > > - the large CHK requests (at times while they prevail over SSKs) tend to > > hit other limits too, like "output bandwidth liability", "Insufficient > > input/output bandwidth" throttles. Then the small SSK requests quickly > > pick up all the remaining bandwidth bits. > > > > But currently I do not have relevant statistics to prove that. > > > > Anyway, please commit the following patch - it should equal out bandwidth > > rights for CHKs and SSKs at least half way toward fair/expected > > distribution (and the change will make any difference for > > high-/over-loaded nodes only). Once most of my peers (and their peers) > > update, I will study the new node traffic forwarding efficiency and > > behavior at different bandwidth limits and with different penalization > > levels again - and then will be in better position to prove the original > > proposal of reserving bandwidth for 2 CHKs is optimal (or maybe withdraw > > it). 
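The effect of the patch that follows can be illustrated with a simplified Java model of the input liability check. This is a sketch, not the NodeStats.java code: the class and method names are hypothetical, and the per-request sizes (32 KB for a CHK, 1 KB for an SSK), the 10,000 bytes/s limit and the in-flight liability figures are made-up inputs chosen to show the difference.

```java
/** Hypothetical, simplified model of the input-liability check before and after the patch. */
final class LiabilityCheckSketch {
    /** Before: the new request's own expected bytes are added; the window is 90 seconds. */
    static boolean acceptBefore(double inFlightBytes, double requestBytes, double limitBytesPerSec) {
        return inFlightBytes + requestBytes <= limitBytesPerSec * 90;
    }

    /** After: the request's own size is no longer added; the window shrinks to 80 seconds. */
    static boolean acceptAfter(double inFlightBytes, double requestBytes, double limitBytesPerSec) {
        // requestBytes is intentionally unused: the patch removes that term from the check.
        return inFlightBytes <= limitBytesPerSec * 80;
    }

    public static void main(String[] args) {
        double limit = 10_000;  // assumed input limit in bytes per second
        double ssk = 1024;      // assumed expected bytes for an SSK request
        double chk = 32 * 1024; // assumed expected bytes for a CHK request

        // Before the patch, at 880,000 bytes already committed:
        System.out.println(acceptBefore(880_000, ssk, limit)); // true
        System.out.println(acceptBefore(880_000, chk, limit)); // false

        // After the patch, the answer is the same for both types:
        System.out.println(acceptAfter(790_000, ssk, limit));  // true
        System.out.println(acceptAfter(790_000, chk, limit));  // true
        System.out.println(acceptAfter(810_000, ssk, limit));  // false
        System.out.println(acceptAfter(810_000, chk, limit));  // false
    }
}
```

In the "before" model there is a band of load in which an SSK is still accepted while a CHK is rejected; in the "after" model the decision no longer depends on the request type, at the cost of a somewhat smaller window.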
> >
> > ===
> > diff --git a/freenet/src/freenet/node/NodeStats.java b/freenet/src/freenet/node/NodeStats.java
> > index 3b091b4..98c82c3 100644
> > --- a/freenet/src/freenet/node/NodeStats.java
> > +++ b/freenet/src/freenet/node/NodeStats.java
> > @@ -399,9 +399,8 @@ public class NodeStats implements Persistable {
> >      successfulSskFetchBytesSentAverage.currentValue() * node.getNumSSKRequests() +
> >      successfulChkInsertBytesSentAverage.currentValue() * node.getNumCHKInserts() +
> >      successfulSskInsertBytesSentAverage.currentValue() * node.getNumSSKInserts();
> > -    bandwidthLiabilityOutput += getSuccessfulBytes(isSSK, isInsert, false).currentValue();
> >      double bandwidthAvailableOutput =
> > -        node.getOutputBandwidthLimit() * 90; // 90 seconds at full power; we have to leave some time for the search as well
> > +        node.getOutputBandwidthLimit() * 80; // 80 seconds at full power; we have to leave some time for the search as well
> >      bandwidthAvailableOutput *= NodeStats.FRACTION_OF_BANDWIDTH_USED_BY_REQUESTS;
> >      if(bandwidthLiabilityOutput > bandwidthAvailableOutput) {
> >          preemptiveRejectReasons.inc("Output bandwidth liability");
> > @@ -413,9 +412,8 @@ public class NodeStats implements Persistable {
> >      successfulSskFetchBytesReceivedAverage.currentValue() * node.getNumSSKRequests() +
> >      successfulChkInsertBytesReceivedAverage.currentValue() * node.getNumCHKInserts() +
> >      successfulSskInsertBytesReceivedAverage.currentValue() * node.getNumSSKInserts();
> > -    bandwidthLiabilityInput += getSuccessfulBytes(isSSK, isInsert, true).currentValue();
> >      double bandwidthAvailableInput =
> > -        node.getInputBandwidthLimit() * 90; // 90 seconds at full power
> > +        node.getInputBandwidthLimit() * 80; // 80 seconds at full power
> >      if(bandwidthLiabilityInput > bandwidthAvailableInput) {
> >          preemptiveRejectReasons.inc("Input bandwidth liability");
> >          return "Input bandwidth liability";
=== > > > > ----- toad at zceUWxlSaHLmvEMnbr4RHnVfehA ----- 2007.04.28 - 17:05:53GMT > > ----- > > > > Why does assuming 80 seconds instead of 90 help? I would have expected it > > to make the situation worse. > > > > Isn't what you want to increment the value you are multiplying the CHK > > byte counters by if the request is an SSK? In any case I'm not convinced > > - we accept 32x as many SSKs as CHKs precisely because they use 32x less > > bandwidth. As far as I can see incrementing the CHK counts but only on a > > CHK would just result in us never accepting an SSK... > > > > But by all means continue to investigate. > > > > ----- mrogers at UU62+3E1vKT1k+7fR0Gx7ZN2IB0 ----- 2007.04.30 - 19:36:36GMT > > ----- > > > > > we accept 32x as many SSKs as CHKs precisely because they use 32x less > > > > bandwidth. > > > > Sorry but I don't understand the rationale behind this. It seems to be > > based on the assumption that equal resources should be allocated to SSKs > > and CHKs, regardless of whether there's equal demand for resources. If > > we're only getting, say, 16 times as many SSK requests as CHK requests, > > would we reject CHK requests to keep things "fair"? > > > > ----- toad at zceUWxlSaHLmvEMnbr4RHnVfehA ----- 2007.05.02 - 16:13:52GMT > > ----- > > > > Why should CHKs be prioritised over SSKs? > > > > What do you think of the patch I committed anyway?
From toad at amphibian.dyndns.org Wed Aug 8 15:02:39 2007 From: toad at amphibian.dyndns.org (Matthew Toseland) Date: Wed, 8 Aug 2007 16:02:39 +0100 Subject: [Tech] [old frost] bandwidth usage improvements for old nodes (and much more!)
In-Reply-To: <200708081600.38189.toad@amphibian.dyndns.org> References: <200708081559.51360.toad@amphibian.dyndns.org> <200708081600.38189.toad@amphibian.dyndns.org> Message-ID: <200708081602.39638.toad@amphibian.dyndns.org> ----- mrogers at UU62+3E1vKT1k+7fR0Gx7ZN2IB0 ----- 2007.05.03 - 14:26:53GMT ----- I haven't looked at it in detail, but it looks like NodeStats.java is still keeping a separate liability estimator for each search type, correct? Sorry but I really don't agree with the rationale behind that. It might seem like it makes sense to accept (for example) an SSK request but not a CHK insert if we're really tight on bandwidth, but the knock-on effects of that (especially with a separate throttle for each type of search) are undesirable. It would make more sense to use a single bandwidth liability estimator for all searches (CHK or SSK, insert or request). Yes, it will overestimate the bandwidth liability for SSK requests and underestimate it for CHK inserts, but *on average* it will still be correct, and it won't have any weird knock-on effects. ----- toad at zceUWxlSaHLmvEMnbr4RHnVfehA ----- 2007.05.04 - 12:48:20GMT ----- On average a lot of requests will time out if we implement it this way. ----- mrogers at UU