Conversation

@danielmgit
No description provided.

edumazet and others added 9 commits June 15, 2016 15:59
We want to get rid of generic qdisc throttled management,
so this qdisc has to use a private flag.

Signed-off-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
So far no qdisc ever unset the throttled bit at enqueue() time,
so CBQ usage of qdisc_is_throttled() was flaky.

Since __QDISC_STATE_THROTTLED set/unset is way too expensive
considering that only CBQ was eventually caring for this status,
it would make sense to implement a Qdisc ops ->is_throttled()
if we find that this is needed.

Signed-off-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Looks like it is only there as some optimization attempt.

Since __QDISC_STATE_THROTTLED set/unset is way too expensive,
and netem is the last user, just remove this check.

Signed-off-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
__QDISC_STATE_THROTTLED bit manipulation is rather expensive
for HTB and a few others.

I already removed it for sch_fq in commit f2600cf
("net: sched: avoid costly atomic operation in fq_dequeue()")
and so far nobody complained.

When one or more packets are stuck in one or more throttled
HTB classes, an HTB dequeue() performs two atomic operations
to clear/set the __QDISC_STATE_THROTTLED bit while the root qdisc
lock is held.

Removing this pair of atomic operations brings me an 8% performance
increase on 200 TCP_RR tests in the presence of throttled classes.

This patch has no side effect, since nothing actually uses
qdisc_is_throttled() anymore.
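
The cost difference discussed in these commits can be illustrated with a small userspace sketch. This is not kernel code: `struct toy_qdisc`, `TOY_STATE_THROTTLED` and the helper names are hypothetical. The shared word stands in for the qdisc state bits, which need atomic read-modify-write operations, and the plain `bool` stands in for a private flag that is only touched while the caller already holds the qdisc lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a qdisc; not the kernel's struct Qdisc. */
struct toy_qdisc {
	atomic_ulong state;     /* shared state word: updates need atomic RMW ops */
	bool throttled_private; /* private flag: caller must already hold the lock */
};

#define TOY_STATE_THROTTLED (1UL << 0)

/* Shared-bit variant: every set or clear is an atomic read-modify-write. */
static void toy_set_throttled_atomic(struct toy_qdisc *q, bool on)
{
	assert(q != NULL);
	if (on)
		atomic_fetch_or(&q->state, TOY_STATE_THROTTLED);
	else
		atomic_fetch_and(&q->state, ~TOY_STATE_THROTTLED);
}

static bool toy_is_throttled_atomic(struct toy_qdisc *q)
{
	return (atomic_load(&q->state) & TOY_STATE_THROTTLED) != 0;
}

/* Private-flag variant: a plain store, no atomic operation involved. */
static void toy_set_throttled_private(struct toy_qdisc *q, bool on)
{
	assert(q != NULL);
	q->throttled_private = on;
}
```

A dequeue path that clears and then sets the flag pays for two atomic operations in the first variant and for two plain stores in the second.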

Signed-off-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
In IPv6 the ToS values are part of the flowlabel in flowi6 and get
extracted during fib rule lookup, but we forgot to correctly initialize
the flowlabel before the routing lookup.
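
For reference, the first 32 bits of an IPv6 header carry the version (4 bits), the traffic class (8 bits) and the flow label (20 bits), which is why the ToS value travels inside the flow-label field. The sketch below shows that packing in host byte order. The `toy_` names are hypothetical, and the kernel itself keeps this value in network byte order, so this is an illustration of the bit layout only.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_TCLASS_SHIFT   20
#define TOY_FLOWLABEL_MASK 0x000FFFFFu

/* Pack traffic class and 20-bit flow label into one word (host byte order). */
static uint32_t toy_make_flowinfo(uint8_t tclass, uint32_t flowlabel)
{
	assert((flowlabel & ~TOY_FLOWLABEL_MASK) == 0);
	return ((uint32_t)tclass << TOY_TCLASS_SHIFT) | flowlabel;
}

/* Recover the traffic class, as a rule lookup matching on ToS would need. */
static uint8_t toy_tclass(uint32_t flowinfo)
{
	return (uint8_t)((flowinfo >> TOY_TCLASS_SHIFT) & 0xFFu);
}

/* Recover the 20-bit flow label. */
static uint32_t toy_flowlabel(uint32_t flowinfo)
{
	return flowinfo & TOY_FLOWLABEL_MASK;
}
```

If the packed word is left at zero before the lookup, `toy_tclass()` returns 0 regardless of the packet's real traffic class, which is the kind of mismatch the fix addresses.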

Reported-by: <[email protected]>
Signed-off-by: Hannes Frederic Sowa <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Since msleep is based on jiffies, the PHY reset could take longer
than expected. So use msleep for values greater than 20 msec and
usleep_range otherwise.
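
The selection rule can be sketched in userspace as follows. This is not the driver code: `phy_reset_delay_choice()` and its types are hypothetical, and the 1000 us of slack on the upper bound of the range is an assumption chosen for illustration.

```c
#include <assert.h>

/* Which kernel delay primitive the rule would pick. */
enum delay_kind { DELAY_USLEEP_RANGE, DELAY_MSLEEP };

struct delay_choice {
	enum delay_kind kind;
	unsigned long min_us; /* lower bound of the sleep, in microseconds */
	unsigned long max_us; /* upper bound of the sleep, in microseconds */
};

/* Hypothetical helper: pick the primitive for a reset delay of msec ms. */
static struct delay_choice phy_reset_delay_choice(unsigned int msec)
{
	struct delay_choice c;

	assert(msec > 0);
	c.min_us = (unsigned long)msec * 1000UL;
	if (msec > 20) {
		/* Long delays: jiffies granularity is acceptable, use msleep(). */
		c.kind = DELAY_MSLEEP;
		c.max_us = c.min_us;
	} else {
		/* Short delays: usleep_range() with assumed 1000 us of slack. */
		c.kind = DELAY_USLEEP_RANGE;
		c.max_us = c.min_us + 1000UL;
	}
	return c;
}
```

With this rule a 1 msec reset maps to a 1000..2000 us range instead of a jiffies-based sleep that may last considerably longer.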

Signed-off-by: Stefan Wahren <[email protected]>
Acked-by: Fugang Duan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
sch_atm returns NET_XMIT_POLICED when TC_ACT_SHOT classification occurs.

But all other schedulers that use tc_classify
(htb, hfsc, drr, fq_codel ...) return NET_XMIT_SUCCESS | __BYPASS
in this case so just do that in atm.

BATMAN uses it as an intermediate return value to signal
forwarding vs. buffering, but it did not return POLICED to
callers outside of BATMAN.

Reviewed-by: Sven Eckelmann <[email protected]>
Signed-off-by: Florian Westphal <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
The commit e858fae ("virtio_net: use common code for virtio_net_hdr
and skb GSO conversion") replaced the tun code for header manipulation
with the generic helpers. While doing so, it implicitly moved the
skb_partial_csum_set() invocation after eth_type_trans(), which
invalidates the current gso start/offset values.
Fix it by moving the helper invocation before the mac pulling.

Fixes: e858fae ("virtio_net: use common code for virtio_net_hdr and
skb GSO conversion")

Reported-by: David Ahern <[email protected]>
Signed-off-by: Mike Rapoport <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
This patch adjusts Linux RTO calculation to be RFC6298 Standard
compliant with two exceptions:
- Instead of using a MinRTO of 1 second use Linux MinRTO
  -> RTO is not rounded up to MinRTO if it is less anyways
- RTTVAR flooring: RTTVAR = max(RTTVAR, MinRTO)
  -> >= MinRTO will always be added to computed RTO

The first version of this patch was without RTTVAR flooring.
However, this seems to be too aggressive and cause too many (spurious)
retransmissions. This version now uses RTTVAR flooring and seems to
be advantageous compared to the historic Linux RTO computation. As a
disadvantage of RTTVAR flooring, sender-limited flows no longer
benefit from decreased response time on packet loss.
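
The computation described above can be sketched in userspace as follows. This is not the patch itself: `struct rto_state` and `rto_update()` are hypothetical names, times are in microseconds, and the sketch assumes the floor is applied to the variance term 4 * RTTVAR (taking the place of the clock granularity G in RFC 6298), so that at least MinRTO is always added to SRTT. The smoothing gains alpha = 1/8 and beta = 1/4 are the RFC 6298 values.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-connection estimator state, in microseconds. */
struct rto_state {
	int64_t srtt_us;   /* smoothed RTT */
	int64_t rttvar_us; /* RTT variation */
	int has_sample;    /* 0 until the first RTT measurement arrives */
};

/* Feed one RTT sample and return the resulting RTO in microseconds. */
static int64_t rto_update(struct rto_state *s, int64_t rtt_us,
			  int64_t min_rto_us)
{
	int64_t err, var_term;

	assert(rtt_us > 0 && min_rto_us > 0);
	if (!s->has_sample) {
		/* RFC 6298 (2.2): first measurement */
		s->srtt_us = rtt_us;
		s->rttvar_us = rtt_us / 2;
		s->has_sample = 1;
	} else {
		/* RFC 6298 (2.3): update RTTVAR using the old SRTT, then SRTT */
		err = s->srtt_us - rtt_us;
		if (err < 0)
			err = -err;
		s->rttvar_us = (3 * s->rttvar_us + err) / 4; /* beta = 1/4 */
		s->srtt_us = (7 * s->srtt_us + rtt_us) / 8;  /* alpha = 1/8 */
	}
	/* Assumed flooring: the variance term never drops below MinRTO. */
	var_term = 4 * s->rttvar_us;
	if (var_term < min_rto_us)
		var_term = min_rto_us;
	return s->srtt_us + var_term;
}
```

Note that the result is SRTT plus at least MinRTO, and the sum itself is not rounded up to MinRTO afterwards, matching the two exceptions listed above.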

A side effect of using this implementation is tcp_sock struct variables
u32 mdev_max_us and u32 mdev_us become obsolete and consequently are
being removed.

Analysis of first patch version compared to Linux implementation:
https://docs.google.com/document/d/1pKmPfnQb6fDK4qpiNVkN8cQyGE4wYDZukcuZfR-BnnM/

Reasoning for historic design:
Sarolahti, P.; Kuznetsov, A. (2002). Congestion Control in Linux TCP.
Conference Paper. Proceedings of the FREENIX Track. 2002 USENIX Annual
https://www.cs.helsinki.fi/research/iwtcp/papers/linuxtcp.pdf

Signed-off-by: Hagen Paul Pfeifer <[email protected]>
Signed-off-by: Daniel Metz <[email protected]>
Cc: Eric Dumazet <[email protected]>
Cc: Yuchung Cheng <[email protected]>
hgn pushed a commit that referenced this pull request Jan 29, 2020
Gautam Ramakrishnan says:

====================
net: sched: add Flow Queue PIE packet scheduler

Flow Queue PIE packet scheduler

This patch series implements the Flow Queue Proportional
Integral controller Enhanced (FQ-PIE) Active Queue
Management (AQM) algorithm. It is an enhancement over the PIE
algorithm. It integrates the PIE AQM with a deficit round robin
scheme.

FQ-PIE is implemented over the latest version of PIE which
uses timestamps to calculate queue delay with an additional
option of using average dequeue rate to calculate the queue
delay. This patch also adds a memory limit for all the packets
across all queues, with a default value of 32 MB.
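
As background, the core PIE control step from RFC 8033 can be sketched as below. This is not the sch_pie.c or sch_fq_pie.c code: `struct toy_pie` and `toy_pie_update()` are hypothetical, delays are in seconds as floating point, and the scaling of the adjustment that RFC 8033 applies at small drop probabilities is left out for brevity. With timestamps, the queue delay fed in here would be the dequeue time minus the packet's enqueue timestamp.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical PIE state: drop probability and previous queue delay. */
struct toy_pie {
	double drop_prob;    /* current drop probability, in [0, 1] */
	double qdelay_old_s; /* queue delay seen at the previous update */
};

/*
 * One periodic update of the drop probability:
 *   delta = alpha * (qdelay - target) + beta * (qdelay - qdelay_old)
 * alpha and beta are the controller gains in Hz.
 */
static double toy_pie_update(struct toy_pie *p, double qdelay_s,
			     double target_s, double alpha, double beta)
{
	double delta;

	assert(p != NULL && qdelay_s >= 0.0 && target_s > 0.0);
	delta = alpha * (qdelay_s - target_s) +
		beta * (qdelay_s - p->qdelay_old_s);
	p->drop_prob += delta;
	/* Keep the probability inside [0, 1]. */
	if (p->drop_prob < 0.0)
		p->drop_prob = 0.0;
	if (p->drop_prob > 1.0)
		p->drop_prob = 1.0;
	p->qdelay_old_s = qdelay_s;
	return p->drop_prob;
}
```

For example, with the RFC 8033 default gains (alpha = 0.125, beta = 1.25) and a 15 ms target, a queue delay rising from 20 ms to 30 ms raises the drop probability by about 0.0144.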

 - Patch #1
   - Creates pie.h and moves all small functions and structures
     common to PIE and FQ-PIE here. The functions are all made
     inline.
 - Patch #2 - #8
   - Addresses code formatting, indentation, comment changes
     and rearrangement of structure members.
 - Patch #9
   - Refactors sch_pie.c by changing arguments to
     calculate_probability(), [pie_]drop_early() and
     pie_process_dequeue() to make it generic enough to
     be used by sch_fq_pie.c. These functions are exported
     to be used by sch_fq_pie.c.
 - Patch #10
   - Adds the FQ-PIE Qdisc.

For more information:
https://tools.ietf.org/html/rfc8033

Changes from v6 to v7
 - Call tcf_block_put() when destroying the Qdisc as suggested
   by Jakub Kicinski.

Changes from v5 to v6
 - Rearranged struct members according to their access pattern
   and to remove holes.

Changes from v4 to v5
 - This patch series breaks down patch 1 of v4 into
   separate logical commits as suggested by David Miller.

Changes from v3 to v4
 - Used non-deprecated version of nla_parse_nested
 - Used SZ_32M macro
 - Removed an unused variable
 - Code cleanup
 All suggested by Jakub and Toke.

Changes from v2 to v3
 - Exported drop_early, pie_process_dequeue and
   calculate_probability functions from sch_pie as
   suggested by Stephen Hemminger.

Changes from v1 (and RFC patch) to v2
 - Added timestamp to calculate queue delay as recommended
   by Dave Taht
 - Packet memory limit implemented as recommended by Toke.
 - Added external classifier as recommended by Toke.
 - Used NET_XMIT_CN instead of NET_XMIT_DROP as the return
   value in the fq_pie_qdisc_enqueue function.
====================

Signed-off-by: David S. Miller <[email protected]>

5 participants