A quick post about the TCP Offload Engine (TOE) present these days in nearly all NICs. When enabled, TCP/IP processing of packets is performed on the NIC without interrupting the CPU or consuming PCI bandwidth. On Linux systems, TOE can be configured through the standard utility ethtool. Remember, all these parameters are interface specific: a process requesting access to the network stack to send a packet inherits the properties of the interface, and consequently knows which operations it must perform itself and which will be offloaded.
To check the status of the offload operations supported by the current system, use the following switch in ethtool.
ethtool -k ethX
server1 root [tmp] > ethtool -k eth1
Offload parameters for eth1:
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp-segmentation-offload: on
udp-fragmentation-offload: off
generic-segmentation-offload: on
generic-receive-offload: off
large-receive-offload: off
A detailed description of all offload operations is out of scope for this post; refer to online resources for that. Here are one-line descriptions of the important fields before we proceed.
- Scatter-Gather I/O - Rather than passing one large contiguous buffer, several small buffers that together make up the large buffer are passed. This is more efficient than passing a single large buffer.
- TCP Segmentation Offload - The ability to segment a large chunk of data into MTU-sized frames sharing the same IP header. Useful when the buffer is much larger than the MTU of the link; the segmentation into smaller frames is offloaded to the NIC.
- Generic Segmentation Offload - Used to postpone segmentation as long as possible, performing it just before entry into the driver's xmit routine. GSO & TSO are only significantly effective when the MTU is much smaller than the buffer size.
- Generic Receive Offload - GSO only works for transmitted packets; GRO is its receive-side counterpart. Unlike LRO, which merges every packet, GRO merges with restrictions, keeping important fields in the packet intact so that packets can be re-segmented at output (e.g. when forwarding). The NAPI API polls for new packets and processes them in batches before passing them up to the OS.
- Large Receive Offload - Combines multiple incoming packets into a single buffer before passing them up to the OS stack. The benefit is that the OS sees fewer packets & uses less CPU time.
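As a quick sketch of working with these fields, the `feature: on/off` lines printed by `ethtool -k` are easy to filter with awk. The snippet below uses a pasted sample of the output shown earlier so it runs anywhere; on a real system you would pipe `ethtool -k eth1` into the same awk command instead.

```shell
# List only the offloads that are enabled, assuming the
# "feature: on/off" output format of `ethtool -k` shown above.
# Sample output is inlined here; replace the here-doc with
# `ethtool -k eth1` on a real system.
awk -F': ' '$2 == "on" {print $1}' <<'EOF'
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp-segmentation-offload: on
udp-fragmentation-offload: off
generic-segmentation-offload: on
generic-receive-offload: off
large-receive-offload: off
EOF
```

This prints only the five enabled features, which is handy when comparing the server and client configurations side by side.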
Depending upon your NIC vendor, the names of these operations may vary, and some vendors provide additional offloads. My test hardware has the above-mentioned features. For the sake of the test, I have disabled GRO & LRO. UFO is generally off on all NICs; the reason is that UDP packet acknowledgments, if they are used at all, are implemented at the application layer, so the CPU needs visibility into all packets & replies. For TSO to work, RX, TX & SG need to be enabled. To enable/disable these operations, the ethtool usage is as follows; customize it according to your requirement. Note that I kept the same TOE configuration on the FTP server & client.
ethtool -K ethX rx on/off tx on/off sg on/off tso on/off gso on/off gro on/off lro on/off
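For example, to reproduce my test setup (GRO & LRO disabled, everything else left on), the commands would look like the following. eth1 is the interface used throughout this post; adjust it for your system, and note that changing offload settings requires root.

```shell
# Disable generic-receive-offload and large-receive-offload on eth1.
# Run as root, and repeat on both FTP server and client so the
# TOE configuration stays identical on both ends.
ethtool -K eth1 gro off
ethtool -K eth1 lro off

# Verify the resulting offload parameters
ethtool -k eth1
```

Some drivers reject changes to features the hardware does not support, so checking the result with `ethtool -k` afterward is a good habit.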
I have generated a 4KB random data file using the dd utility to transfer over FTP.
The MTU of the interface under test on both server & client is 1500 bytes. Packet captures are performed with tcpdump on the client and later analyzed in Wireshark.
dd if=/dev/urandom of=dat.file bs=1k count=4
tcpdump -i eth1 -w TOE_test.pcap -n not port 22
The 4KB file is segmented into chunks of data, and multiple packets flow through the link. This TCP segmentation is usually done by the CPU in the absence of TOE, but with TOE enabled, packets leave the OS layer encapsulated directly as 2920 bytes. This looks very weird if you don't know about TOE and start wondering how 2920 bytes can be sent over a 1500-byte-MTU Ethernet frame. This is the difference between practical observation & theoretical understanding.

In the screenshot, the FTP PUT operation from client to server sends the 4KB file in two packets; the highlighted one carries 2920 bytes of data, followed by a packet with the remaining bytes. Here the TCP stack is in the NIC's domain: these packets are handed over to the NIC's TOE, which performs the actual segmentation and sequencing. The packet capture hook sits exactly at the boundary of the OS stack, hence we cannot see the actual TCP segmentation happening inside the NIC. Acks of data depend on the size of the data.

Disabling GSO/TSO restores normal operation of the OS TCP/IP stack: data packets become 1460 bytes, which is the regular segment size for 1500-byte-MTU links. Responsibility for TCP operations shifts to the OS stack when TOE is disabled.
|TCP Segmentation Offload & Generic Segmentation Offload Enabled|
|TCP Segmentation Offload & Generic Segmentation Offload Disabled|
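The sizes seen in the captures follow directly from the MTU arithmetic. A minimal sketch, assuming the standard 20-byte IP and 20-byte TCP headers with no options:

```shell
# MSS = MTU - IP header - TCP header (no TCP options assumed)
MTU=1500
MSS=$((MTU - 20 - 20))
echo "MSS: $MSS"                       # 1460

# With TSO on, the capture shows a super-segment of 2 x MSS
echo "Super-segment: $((2 * MSS))"     # 2920

# The 4KB (4096-byte) file = one 2920-byte super-segment + remainder
FILE=4096
echo "Remainder: $((FILE - 2 * MSS))"  # 1176
```

So the 2920-byte packet in the capture is simply two MSS-sized segments still glued together at the OS boundary, and the second packet carries the remaining 1176 bytes.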
Performance improvements from TOE are observed on servers handling a large number of concurrent sessions that serve homogeneous large data files. PCI bandwidth is conserved because of the reduced management overhead of TCP segmentation, which otherwise involves continuous communication between the NIC & CPU. I am studying the performance implications of TOE & will post about it soon. This TOE behavior also needs to be understood because it affects IDS/IPS systems like Snort, which perform threat-signature matching on packet captures. That's it for now. :)