I watched a nice presentation about how Cloudflare protects itself against DoS attacks. Most of us cannot do exactly what they do, but some of the tips are general enough to be used on a typical front-end web server.

I took notes from this presentation and am sharing them here. With Marek's permission I have also reposted all the examples (in a form that is easier to copy and paste).

Howto prepare against ACK/FIN/RST/X-mas flood

Use a conntrack rule:

iptables -A INPUT --dst 1.2.3.4 -m conntrack --ctstate INVALID -j DROP

This only works with the tcp_loose setting disabled (it is enabled by default), so also set the following sysctl:

sysctl -w net.netfilter.nf_conntrack_tcp_loose=0
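
Since tcp_loose is enabled by default, it is worth confirming which value is actually in effect before relying on the INVALID rule. Here is a minimal sketch, assuming the usual mapping of sysctl names to files under /proc/sys. The helper name sysctl_check and its optional third argument (an alternative root directory, handy for dry runs) are made up for illustration:

```shell
#!/bin/sh
# sysctl_check KEY WANT [ROOT]
# Hypothetical helper: compare a sysctl against the value we expect.
# The key is mapped to a file by replacing dots with slashes under ROOT
# (default /proc/sys). Returns 0 on match, 1 on mismatch, 2 if unreadable.
sysctl_check() {
    key=$1
    want=$2
    root=${3:-/proc/sys}
    path="$root/$(printf '%s' "$key" | tr . /)"
    if ! have=$(cat "$path" 2>/dev/null); then
        echo "$key: not available"
        return 2
    fi
    if [ "$have" = "$want" ]; then
        echo "$key: ok ($have)"
    else
        echo "$key: $have (expected $want)"
        return 1
    fi
}
```

Calling `sysctl_check net.netfilter.nf_conntrack_tcp_loose 0` should report "ok (0)" once the setting has been applied. A "not available" answer most likely means the nf_conntrack module is not loaded yet.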

Howto prepare against SYN floods

A SYN flood is a hard case, because using conntrack makes performance worse: state has to be validated for every single new packet.

The only way to get around this is to enable syncookies:

sysctl -w net.ipv4.tcp_syncookies=1
sysctl -w net.ipv4.tcp_timestamps=1

Enabling syncookies causes the loss of some connection information that is pretty useful, such as:

  • window scaling factor
  • ECN bit (Explicit Congestion Notification)

That is why we also enable the tcp_timestamps option: a few bits of the timestamp field are used to store some of this information.

This may still not be efficient enough, but kernel 4.4 is expected to bring an update to how syncookies are served that should make them a few times faster than in older kernels.
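
To judge whether syncookies are actually being used during an attack, the kernel's TcpExt counters can be inspected. Here is a minimal sketch, assuming the "line of names, then line of values" layout of /proc/net/netstat. The function name is made up for illustration, and the file argument exists only so the parser can be tried on a saved copy:

```shell
#!/bin/sh
# syncookie_counters [FILE]
# Print every TcpExt counter whose name starts with "Syncookies" from a file
# laid out like /proc/net/netstat: one "TcpExt:" line with the counter names,
# followed by one "TcpExt:" line with the matching values.
syncookie_counters() {
    awk '
        $1 == "TcpExt:" && !have_names {
            # first TcpExt line: remember the column names
            for (i = 2; i <= NF; i++) name[i] = $i
            have_names = 1
            next
        }
        $1 == "TcpExt:" && have_names {
            # second TcpExt line: print the values of the syncookie columns
            for (i = 2; i <= NF; i++)
                if (name[i] ~ /^Syncookies/) print name[i] "=" $i
            exit
        }
    ' "${1:-/proc/net/netstat}"
}
```

Running `syncookie_counters` twice a few seconds apart and comparing the numbers gives a rough idea of how many cookies are being sent and how many come back valid.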

Related docs: https://www.kernel.org/doc/Documentation/networking/ip-sysctl.txt

Howto prepare against botnet attack

Symptoms:

  • concurrent connection count going up
  • many sockets in orphaned state
  • many sockets in time wait state
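
The socket-related symptoms above can be watched from /proc/net/sockstat. Here is a minimal sketch, assuming the "TCP:" line there is a sequence of name/value pairs (inuse, orphan, tw, ...). The function name is made up for illustration:

```shell
#!/bin/sh
# tcp_symptoms [FILE]
# Print the in-use, orphaned and TIME_WAIT socket counts taken from the
# "TCP:" line of a file laid out like /proc/net/sockstat.
tcp_symptoms() {
    awk '$1 == "TCP:" {
        # the rest of the line is "name value name value ..."
        for (i = 2; i < NF; i += 2) v[$i] = $(i + 1)
        printf "inuse=%s orphan=%s tw=%s\n", v["inuse"], v["orphan"], v["tw"]
    }' "${1:-/proc/net/sockstat}"
}
```

Something like `watch -n1 cat /proc/net/sockstat` shows the same numbers raw. The function is only convenient when the values need to be logged or compared against a threshold.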

Solutions:

  1. Enable the connlimit feature on conntrack to limit the number of concurrent connections to our service
  2. Use hashlimits to rate limit SYN packets per IP
  3. Use ipsets to efficiently block many IP/subnet addresses
    • manual blacklisting - feed an IP blacklist from HTTP server logs
    • supports subnets and timeouts
    • automatic blacklisting with hashlimits
  4. Disable HTTP keep-alives to make this attack look more like a SYN flood

This may still not work against a DDoS, because with a huge number of bots you cannot block them efficiently enough.
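
The manual blacklisting idea (feeding an IP blacklist from HTTP server logs) could look roughly like this. It is a sketch under assumptions, not something from the presentation: the access log is assumed to have the client IP as its first field (as in the common and combined log formats), and the function name, threshold and set name are made up for illustration:

```shell
#!/bin/sh
# offenders LOG MIN_HITS SET
# Count requests per client IP (first field of each log line) and print an
# "add SET IP" line for every IP seen at least MIN_HITS times. The output is
# sorted so that repeated runs are easy to compare.
offenders() {
    awk -v min="$2" -v set="$3" '
        { hits[$1]++ }
        END {
            for (ip in hits)
                if (hits[ip] >= min) print "add " set " " ip
        }
    ' "$1" | sort
}
```

The "add SET IP" lines are in the format that `ipset restore` reads from standard input, so the output can be reviewed first and then piped into `ipset -exist restore`, provided the set has already been created.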

Some exciting system tweaks and examples from this presentation

I hope to find some time to merge them into a template or script that would be much easier to use, but first I have to play with these rules a little and test which ones are most useful.

NIC: Discard with flow steering

ethtool -N eth3 flow-type udp4 dst-ip 129.168.254.30 dst-port 53 action -1

Flow steering for priority

ethtool -X eth3 weight 0 1 1 1 1 1 1 1 1 1 1
ethtool -N eth3 flow-type tcp4 dst-port 22 action 0

SYN backlog size

sysctl -w net.core.somaxconn=65535
sysctl -w net.ipv4.tcp_max_syn_backlog=65535

The value is rounded up to the next power of two (in this case to 65536).
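
As a quick illustration of that rounding, here is a sketch that assumes "next power of two" means the smallest power of two that is not below the configured value. The exact rule inside the kernel may differ between versions, and the function name is made up:

```shell
#!/bin/sh
# next_pow2 N
# Print the smallest power of two that is greater than or equal to N.
next_pow2() {
    n=$1
    p=1
    while [ "$p" -lt "$n" ]; do
        p=$((p * 2))
    done
    echo "$p"
}
```

With this definition `next_pow2 65535` gives 65536, matching the figure above.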

SYN backlog decay

sysctl -w net.ipv4.tcp_synack_retries=1

L7 connection count

sysctl -w net.ipv4.tcp_max_orphans=262144
sysctl -w net.ipv4.tcp_orphan_retries=1

sysctl -w net.ipv4.tcp_max_tw_buckets=360000
sysctl -w net.ipv4.tcp_tw_reuse=1
sysctl -w net.ipv4.tcp_fin_timeout=5

L3: u32

iptables -A INPUT \
  --dst 1.2.3.4 \
  -p udp -m udp --dport 53 \
  -m u32 --u32 "6&0xFF=0x6 && 4&0x1FFF=0 && 0>>22&0x3C@4=0x29" \
  -j DROP

L4: Conntrack

iptables -t raw -A PREROUTING \
  -i eth2 \
  --dst 1.2.3.4 \
  -j ACCEPT

iptables -t raw -A PREROUTING \
  -i eth2 \
  -j NOTRACK

iptables -A INPUT \
  --dst 1.2.3.4 \
  -m conntrack --ctstate INVALID \
  -j DROP

Tuning conntrack

sysctl -w net.netfilter.nf_conntrack_tcp_loose=0
sysctl -w net.netfilter.nf_conntrack_helper=0
sysctl -w net.nf_conntrack_max=2000000
echo 2500000 > /sys/module/nf_conntrack/parameters/hashsize

More info about conntrack sysctl options: https://www.kernel.org/doc/Documentation/networking/nf_conntrack-sysctl.txt
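
When choosing a value for nf_conntrack_max it helps to know how full the table currently is. Here is a minimal sketch, assuming the nf_conntrack_count and nf_conntrack_max files under /proc/sys/net/netfilter. The function name and the optional directory argument (for trying it on copies of those files) are made up for illustration:

```shell
#!/bin/sh
# conntrack_usage [DIR]
# Print the current number of tracked connections, the configured maximum
# and the fill ratio in whole percent (rounded down).
conntrack_usage() {
    dir=${1:-/proc/sys/net/netfilter}
    count=$(cat "$dir/nf_conntrack_count") || return 1
    max=$(cat "$dir/nf_conntrack_max") || return 1
    echo "$count/$max ($((count * 100 / max))%)"
}
```

If the ratio gets close to 100% during normal traffic, there is little headroom left for an attack, and new connections start being dropped once the table is full.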

L7: Connlimit

iptables -t raw -A PREROUTING \
  -i eth2 \
  --dst 1.2.3.4 \
  -j ACCEPT

iptables -A INPUT \
  --dst 1.2.3.4 \
  -p tcp -m tcp --dport 80 \
  --tcp-flags FIN,SYN,RST,ACK SYN \
  -m connlimit \
  --connlimit-above 10 \
  --connlimit-mask 32 \
  --connlimit-saddr \
  -j DROP

L7: ipset for blacklisting

ipset -exist create ta_d335c5 hash:net family inet

ipset add ta_d335c5 192.168.0.0/16
ipset add ta_d335c5 10.0.0.0/8

iptables -A INPUT \
  -m set --match-set ta_d335c5 src \
  -j DROP

L7: being evil - TARPIT

iptables -A INPUT \
  -m set --match-set ta_d335c5 src \
  -j TARPIT

The TARPIT target imitates a successful connection for the client (a bot in this case) but never responds to its queries. It costs the bot a lot more resources and time to time out and drop this connection than when DROP or REJECT is used here.

L7: hashlimit for rate limiting

iptables -A  INPUT \
  --dst 1.2.3.4 -p tcp -m tcp --dport 80 \
  --tcp-flags FIN,SYN,RST,PSH,ACK,URG SYN \
  -m hashlimit \
  --hashlimit-above 123/sec \
  --hashlimit-burst 5 \
  --hashlimit-mode srcip \
  --hashlimit-srcmask 24 \
  --hashlimit-name 341654b1d4af9bf \
  -j DROP

L7: auto-blacklist

ipset -exist create blacklist hash:net timeout 60

iptables -A INPUT \
  --dst 1.2.3.4 \
  -m set --match-set blacklist src \
  -j DROP

iptables -A INPUT \
  --dst 1.2.3.4 -p tcp -m tcp --dport 80 \
  --tcp-flags FIN,SYN,RST,PSH,ACK,URG SYN \
  -m hashlimit \
  --hashlimit-above 100/sec \
  --hashlimit-mode srcip \
  --hashlimit-srcmask 24 \
  --hashlimit-name hl_blacklist \
  -j SET --add-set blacklist src

L7+: payload in TCP - string

iptables -A INPUT \
  --dst 1.2.3.4 \
  -p tcp --dport 80 \
  -m string \
  --hex-string 486f737777777777... \
  --from 231 --to 300 \
  -j DROP

For more information and explanations, watch this great presentation.

The whole slide deck, with additional examples, is here:
https://speakerdeck.com/majek04/lessons-from-defending-the-indefensible