Optimizing TCP Configurations: Throughput vs Response Speeds

A TCP connection can be tuned toward one of two competing goals: high bulk data throughput or fast interactive response time. Throughput-oriented settings maximize the amount of data sent in each packet, while response-oriented settings send a packet as soon as data is available, even if it is only a single byte of information.

The default settings for TCP on your NetBurner platform configure the connection as a “medium” setting, providing good data throughput as well as fairly fast response time, which is fine for the vast majority of applications. However, if your application has specific needs in one direction or the other, you can adjust the settings with just a few function calls. It is important to note that TCP sockets are bidirectional, and the adjustments for maximizing transmit and receive are different and independent of each other.

Buffers

Buffers are a key component in network communications, and their setup can be changed to favor throughput or response time. Before we discuss the buffer settings, though, let’s take a look at what the buffers are and how they are used. The NetBurner TCP/IP system creates a pool of buffers that are used for network and serial communication. The default number of total buffers is 256. These buffers are used by the network stack to send and receive data. For example, when a network operation needs to send data, it requests a buffer from the pool and writes the data it wants to send into that buffer. When the operation is done with the buffer, it is freed back to the pool.

The size of each buffer is 1548 bytes. Subtracting the TCP header and some other options provides 1500 bytes of data payload per buffer. The field in the TCP header that tells the other side of the connection how much receive room we have is called the window size. It is 16 bits wide, making the maximum window size 65535 bytes. Therefore, the maximum number of buffers allocated to a socket should be no more than 65535/1500 = 43 (rounded down). Any more than that will have little or no effect.
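The arithmetic behind that limit can be checked in a few lines of C. This is only a sketch: the helper name max_useful_buffers is made up for illustration, and the 1500 byte payload is the figure quoted above.

```c
#include <stdint.h>

/* Largest value a 16-bit TCP window field can advertise. */
#define TCP_MAX_WINDOW UINT16_MAX /* 65535 */

/* Hypothetical helper: how many whole buffers fit inside one window.
 * Integer division rounds down, so a partial buffer is not counted. */
unsigned max_useful_buffers(unsigned window_bytes, unsigned payload_bytes)
{
    if (payload_bytes == 0) {
        return 0; /* avoid dividing by zero on bad input */
    }
    return window_bytes / payload_bytes;
}
```

Calling max_useful_buffers(TCP_MAX_WINDOW, 1500) gives 43, the ceiling used for the settings in the rest of this article.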

Note that if an application has a problem which causes all buffers to be in use, the system will be unable to communicate until some are freed up. The number of free buffers can be determined at any time with the GetFreeCount() function in buffers.h.
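To picture how a shared pool behaves when it runs dry, here is a toy model in C. It is not NetBurner's implementation: the buffer_pool type and the pool_* functions are hypothetical names invented for this sketch, and the model tracks only whether each slot is in use.

```c
#define POOL_SIZE 256 /* default total buffer count from the text */

/* Toy pool: it records only whether each slot is in use. */
typedef struct {
    unsigned char in_use[POOL_SIZE];
    int free_count;
} buffer_pool;

void pool_init(buffer_pool *p)
{
    for (int i = 0; i < POOL_SIZE; i++) {
        p->in_use[i] = 0;
    }
    p->free_count = POOL_SIZE;
}

/* Returns a slot index, or -1 when every buffer is already in use. */
int pool_alloc(buffer_pool *p)
{
    for (int i = 0; i < POOL_SIZE; i++) {
        if (!p->in_use[i]) {
            p->in_use[i] = 1;
            p->free_count--;
            return i;
        }
    }
    return -1;
}

/* Hands a slot back to the pool; invalid or already-free slots are ignored. */
void pool_free(buffer_pool *p, int slot)
{
    if (slot >= 0 && slot < POOL_SIZE && p->in_use[slot]) {
        p->in_use[slot] = 0;
        p->free_count++;
    }
}
```

Once free_count reaches zero, pool_alloc() fails until something calls pool_free(). That is the situation the note above warns about, and it is why checking the free count is a useful first diagnostic.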

Receive Settings

When packets are received by your NetBurner device, they go into the TCP receive system. Each packet has a sequence number associated with it. Packets that are received in sequence/order are put in a receive buffer. When your application reads from a TCP socket file descriptor, it is actually reading from the receive buffers associated with the TCP socket. The TCP system also tells the other side of the connection how much unused buffer space is available (window size). Our first adjustable parameter specifies how many buffers the receive system will use to accumulate data for the specified TCP socket:

SetSocketRxBuffers(fd, n);

Here, fd is the file descriptor of the socket, and n is the number of buffers, which can range from the default value of 5 up to 43, as discussed in the Buffers section. Note that throughout this document we will use fd to refer to the file descriptor for the TCP connection. This is the value returned from functions such as connect() or accept().
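If the buffer count comes from a configuration value or user input, it can help to pull it into the useful range first. The helper below is a sketch with a made-up name (clamp_buffer_count). The limits of 5 and 43 are the ones quoted in the text.

```c
/* Limits taken from the text: default of 5, ceiling of 43. */
#define BUFFERS_DEFAULT 5
#define BUFFERS_MAX 43

/* Hypothetical helper: pull a requested count into the useful range. */
int clamp_buffer_count(int requested)
{
    if (requested < BUFFERS_DEFAULT) {
        return BUFFERS_DEFAULT;
    }
    if (requested > BUFFERS_MAX) {
        return BUFFERS_MAX;
    }
    return requested;
}
```

The clamped result would then be passed as the n argument, for example SetSocketRxBuffers(fd, clamp_buffer_count(requested)).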

We mentioned that packets in sequence are put into the receive buffer, but what about packets that are not received in sequence? The underlying IP network can be unpredictable: packets can be lost, received out of order, or even duplicated on the way to their destination by other network equipment. If we are expecting packet number 99 and instead receive packet number 101 what do we do? The NetBurner TCP stack saves these out of order packets in buffers so the other side does not have to resend them. The question is, how much space should be set aside for these out of order packets? This brings us to our second TCP Receive performance setting:

SetOutOfOrderBuffers(fd, max);

This setting specifies the maximum number of buffers to allocate to store out of sequence packets. If an out of order packet cannot be stored, the sender will have to retransmit the packet. If you want to maximize receive performance on a lossy network, you may want to set this to a value 1 less than the value passed to SetSocketRxBuffers(). The default value is 5.
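The effect of this limit is easier to see in a small model. The sketch below is not the NetBurner stack. It uses whole packet numbers (as in the 99 and 101 example above) rather than real TCP byte sequence numbers, it ignores sequence wrap-around, and every name in it is hypothetical.

```c
#define STASH_CAPACITY 43 /* largest setting discussed in the text */

typedef enum { PKT_DELIVERED, PKT_STASHED, PKT_DUPLICATE, PKT_DROPPED } pkt_result;

typedef struct {
    unsigned next_expected;         /* next in-order packet number */
    unsigned stash[STASH_CAPACITY]; /* out-of-order packet numbers being held */
    int stash_count;
    int stash_limit;                /* plays the role of the max setting */
} receiver;

void receiver_init(receiver *r, unsigned first_expected, int stash_limit)
{
    if (stash_limit < 0) stash_limit = 0;
    if (stash_limit > STASH_CAPACITY) stash_limit = STASH_CAPACITY;
    r->next_expected = first_expected;
    r->stash_count = 0;
    r->stash_limit = stash_limit;
}

/* Returns the index of seq in the stash, or -1 if it is not there. */
int stash_find(const receiver *r, unsigned seq)
{
    for (int i = 0; i < r->stash_count; i++) {
        if (r->stash[i] == seq) return i;
    }
    return -1;
}

pkt_result receiver_accept(receiver *r, unsigned seq)
{
    if (seq < r->next_expected || stash_find(r, seq) >= 0) {
        return PKT_DUPLICATE; /* we already have this packet */
    }
    if (seq > r->next_expected) {
        if (r->stash_count >= r->stash_limit) {
            return PKT_DROPPED; /* no room: the sender must retransmit */
        }
        r->stash[r->stash_count++] = seq;
        return PKT_STASHED;
    }
    /* In-order packet: deliver it, then release any stashed successors. */
    r->next_expected++;
    int i;
    while ((i = stash_find(r, r->next_expected)) >= 0) {
        r->stash[i] = r->stash[--r->stash_count]; /* swap the last entry into the gap */
        r->next_expected++;
    }
    return PKT_DELIVERED;
}
```

With a limit of 2 and packet 99 expected, packets 101 and 102 are stashed but 103 is dropped and has to be sent again. When 99 and 100 finally arrive, 101 and 102 are released from the stash without any retransmission.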

Transmit Settings

Sending TCP data has additional options to consider. The first decision to make is whether you want to maximize data throughput or minimize latency. For data throughput, the system is most efficient when it sends full-size frames with a 1500 byte payload (the maximum payload of a standard Ethernet frame). While this is best for sending bulk data, the consequence is added latency for an interactive system, such as a Telnet session, which typically sends just 1 byte of data in a packet.

NetBurner provides several individual settings that can be enabled and disabled, depending on your network considerations. These are listed below.

TCP Configuration to Minimize Latency

Disable the Nagle algorithm so every byte is sent as soon as it is available:

setsockoption(fd, SO_NONAGLE);

Turn on the TCP PUSH flag by clearing SO_NOPUSH. When you write to a TCP socket, you are actually writing to a buffer, which allows for more efficient transfers when sending more than one maximum segment size (MSS) of data. Enabling the PUSH flag tells the system to send the buffer immediately rather than waiting for additional data.

clrsockoption(fd, SO_NOPUSH);

These settings will reduce bulk throughput, but they ensure the most responsive interactive connection. By default, both PUSH and the Nagle algorithm are enabled.
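To see why disabling the Nagle algorithm helps latency, it can be useful to look at the decision the algorithm makes. The function below is a simplified model of the classic Nagle rule, not NetBurner's code. It ignores the receiver's window, and the function and parameter names are made up for this sketch.

```c
#include <stdbool.h>

/* Simplified Nagle decision: may the pending data be sent right now?
 * pending_bytes: data waiting in the transmit buffer
 * mss:           largest payload that fits in one segment
 * unacked_bytes: data already sent but not yet acknowledged */
bool should_send_now(unsigned pending_bytes, unsigned mss,
                     unsigned unacked_bytes, bool nagle_enabled)
{
    if (pending_bytes == 0) {
        return false; /* nothing to send */
    }
    if (!nagle_enabled) {
        return true;  /* every write goes out immediately */
    }
    if (pending_bytes >= mss) {
        return true;  /* a full segment is ready */
    }
    /* Small data may go out only when nothing is awaiting an ACK. */
    return unacked_bytes == 0;
}
```

With the algorithm enabled, a single keystroke typed while earlier data is still unacknowledged waits for that acknowledgment. With it disabled, the keystroke is sent at once, at the cost of more small packets on the wire.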

TCP Configuration to Maximize Bulk Data Throughput

These settings cause the system to send maximum sized packets whenever possible.

Enable the Nagle algorithm so data can accumulate before sending:

clrsockoption(fd, SO_NONAGLE);

Disable the PUSH option:

setsockoption(fd, SO_NOPUSH);

Then either just prior to or just after your last bulk data write, enable the PUSH option:

clrsockoption(fd, SO_NOPUSH);
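A rough packet count shows what is at stake. The sketch below assumes, for illustration only, that without coalescing each small write leaves as its own packet, and that with coalescing the data is packed into full 1500 byte payloads. The helper name packets_needed is hypothetical.

```c
/* Number of packets needed to carry total_bytes when each packet holds
 * up to bytes_per_packet. Rounds up so a final partial packet is counted. */
unsigned packets_needed(unsigned total_bytes, unsigned bytes_per_packet)
{
    if (bytes_per_packet == 0) {
        return 0; /* avoid dividing by zero on bad input */
    }
    return (total_bytes + bytes_per_packet - 1) / bytes_per_packet;
}
```

Under those assumptions, 10000 bytes written in 100 byte chunks would take packets_needed(10000, 100) = 100 packets if each write were sent on its own, but only packets_needed(10000, 1500) = 7 packets when the data is allowed to accumulate into full payloads.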

Putting It All Together

Now let’s talk about the path through the system. When you call a write function on a TCP socket, the system writes the data requested into a TX buffer. You should always check the return value of a write function. If the TX buffer has less space than what you requested to write, the write return value will be a number lower than what you requested, and your application will need to handle that and call the write operation enough times to send all your data.
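A loop that handles short writes might look like the sketch below. To keep it self-contained, the write call is passed in as a function pointer. The assumed shape (file descriptor, buffer, byte count, returning the number of bytes accepted or a negative error) is an assumption, so check it against the write function you actually use. The names write_fully and demo_limited_write are made up for this example.

```c
/* Assumed shape of a write-style call: returns the number of bytes
 * accepted, or a negative value on error. */
typedef int (*write_fn)(int fd, const char *buf, int nbytes);

/* Keeps calling write_func until all of buf has been accepted.
 * Returns the number of bytes written, which is less than nbytes only if
 * a call accepted nothing, or a negative error code from write_func. */
int write_fully(int fd, const char *buf, int nbytes, write_fn write_func)
{
    int sent = 0;
    while (sent < nbytes) {
        int rv = write_func(fd, buf + sent, nbytes - sent);
        if (rv < 0) {
            return rv;   /* pass the error up to the caller */
        }
        if (rv == 0) {
            break;       /* no progress: stop instead of spinning */
        }
        sent += rv;
    }
    return sent;
}

/* Demonstration stand-in for a socket with very little TX space:
 * it accepts at most 3 bytes per call and counts how often it is called. */
int demo_calls = 0;

int demo_limited_write(int fd, const char *buf, int nbytes)
{
    (void)fd;
    (void)buf;
    demo_calls++;
    return nbytes < 3 ? nbytes : 3;
}
```

Writing 10 bytes through demo_limited_write takes four calls (3 + 3 + 3 + 1), which is exactly the bookkeeping the loop hides from the rest of the application.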

Tip: If you do not want to track how much data is sent each time, you can use the writeall() function instead of write(). The downside to using writeall() is that it will take slightly longer for errors to propagate back to you, and your writing task will be asleep or pending until all the data has made it into the TX buffer.

This brings us to specifying the number of TX buffers, which controls how much data can be written/buffered:

SetSocketTxBuffers(fd, n);

Note that n is the number of buffers, not the number of bytes.

Once data is in the TX buffer, the TCP system attempts to send it out to the other side. Assuming the window on the other side says we are allowed to send a packet, the TCP system takes a packet’s worth of data off the TX buffer, packages it up, and sends it across the network. It also keeps a copy of that packet in an unacknowledged buffers list, where it stays until the other side of the TCP connection acknowledges receipt. If the packet is lost, the TCP system retransmits it automatically. Controlling how big this list can get is our next TX tuning parameter:

SetSocketUnackBuffers(fd, n);

The default is 5. Because the list can never usefully hold more than the maximum window size of 65535 bytes, values from 5 to 43 buffers are appropriate.
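The interaction between the peer's window and the unacknowledged list can be sketched as a small calculation. This is a simplified model with hypothetical names. It treats every packet as carrying a full payload and is not taken from the NetBurner source.

```c
/* How many more packets may be sent right now.
 * peer_window_bytes: receive window most recently advertised by the peer
 * in_flight:         packets sent but not yet acknowledged
 * unack_limit:       cap on the unacknowledged list (the n discussed above)
 * payload_bytes:     data bytes carried per packet */
int sendable_packets(unsigned peer_window_bytes, int in_flight,
                     int unack_limit, unsigned payload_bytes)
{
    if (payload_bytes == 0 || in_flight < 0) {
        return 0; /* reject bad input */
    }
    int window_packets = (int)(peer_window_bytes / payload_bytes);
    /* The tighter of the two limits decides how much may be outstanding. */
    int limit = window_packets < unack_limit ? window_packets : unack_limit;
    int room = limit - in_flight;
    return room > 0 ? room : 0;
}
```

With a full 65535 byte window and the default limit of 5, only 5 packets can be outstanding at once. Raising the limit past 43 changes nothing, because the window itself then becomes the tighter constraint.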

Real World Examples

*Lossy Networks Might Consume Substantial Networking Resources*

We have to remember that out in the real world TCP traffic can get messy. If an application is in an environment where devices get rebooted or power cycled frequently, crashes occur, or it is running on a lossy network (such as a satellite link) in which packets are lost, you may be using more resources than you think. If a TCP socket is left hanging by the client it was connected to, all those buffers will remain allocated until the socket is cleaned up. If a high percentage of packets are lost, it will take more time for retransmission to occur.

In general, testing has shown that the maximum bulk data speeds on a LAN can be achieved with 20 buffers. For traffic that is routed over the Internet, a value of 40 provided the best result. Results will vary depending on the amount of network traffic, link speeds, and client response times.
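One general way to reason about why a routed connection benefits from more buffers is the standard relationship between data in flight and round-trip time: a sender can have at most one window's worth of data outstanding per round trip. The sketch below applies that rule of thumb. The function name is hypothetical, and the round-trip times in the usage note are illustrative assumptions, not measurements from the tests described in this article.

```c
/* Upper bound on throughput, in bits per second, when at most `buffers`
 * packets of `payload_bytes` each may be unacknowledged and one round
 * trip takes `rtt_ms` milliseconds. */
long long throughput_ceiling_bps(int buffers, int payload_bytes, int rtt_ms)
{
    if (buffers <= 0 || payload_bytes <= 0 || rtt_ms <= 0) {
        return 0; /* reject bad input */
    }
    long long bits_in_flight = (long long)buffers * payload_bytes * 8;
    return bits_in_flight * 1000 / rtt_ms;
}
```

For example, with an assumed 10 ms round trip, 5 buffers of 1500 bytes cap the connection at 6,000,000 bits per second, while 20 buffers raise the cap to 24,000,000. With an assumed 1 ms round trip, 5 buffers already allow 60,000,000 bits per second, which fits the general observation that short-latency links need fewer buffers.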

Remember, this is only one side of the connection; the settings on the other end are just as important. The tests mentioned were done on a NetBurner MOD54415 with a 4 MB file transfer:

On a 10/100 LAN, increasing the buffers to 20 cut the transfer time from 0.907 to 0.658 seconds, which is about 1.38 times as fast. Increasing beyond that did not improve performance.
On an Internet connection limited to 25M/25M, setting the buffers to 20 cut the transfer time from 5.66 to 2.12 seconds, which is about 2.67 times as fast. Setting the buffers to 40 brought the time down to 1.72 seconds, about 3.29 times as fast as the original.
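If you run similar tests on your own hardware, two small helpers are enough to turn a pair of transfer times into comparison figures. The function names are made up for this sketch.

```c
/* Ratio of the old transfer time to the new one: 2.0 means twice as fast. */
double speedup_factor(double seconds_before, double seconds_after)
{
    if (seconds_after <= 0.0) {
        return 0.0; /* reject bad input */
    }
    return seconds_before / seconds_after;
}

/* Percentage by which the transfer time shrank. */
double time_saved_percent(double seconds_before, double seconds_after)
{
    if (seconds_before <= 0.0) {
        return 0.0; /* reject bad input */
    }
    return 100.0 * (seconds_before - seconds_after) / seconds_before;
}
```

Using the times listed above, speedup_factor(5.66, 2.12) is roughly 2.67 and speedup_factor(5.66, 1.72) is roughly 3.29. Keeping the ratio and the percentage of time saved as separate figures avoids mixing up the two.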

Final Thoughts

In conclusion, the system has a pool of 256 buffers, and by default each socket is allocated 5 buffers each for unacknowledged (ACK), RX and TX data, providing the best compromise of performance, response time and resource usage. You can customize each socket’s allocation using the SetSocket functions. If you do not achieve the expected performance, a good place to start is the GetFreeCount() function, which lets you confirm that you are not running out of buffers.
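Before raising the per-socket limits, it is worth estimating how far the shared pool will stretch. The sketch below is a worst-case budget under two assumptions that are ours, not NetBurner's: that every limit on a socket can be reached at the same time, and that out of order buffers count separately from RX buffers. The function names are hypothetical.

```c
/* Worst-case buffers one socket could hold if every per-socket limit
 * were reached at the same moment. */
int buffers_per_socket(int rx, int tx, int unack, int out_of_order)
{
    return rx + tx + unack + out_of_order;
}

/* Number of sockets that could all hit their limits at once without
 * exhausting a pool of pool_size buffers. */
int sockets_within_pool(int pool_size, int per_socket)
{
    if (per_socket <= 0) {
        return 0; /* reject bad input */
    }
    return pool_size / per_socket;
}
```

With the defaults of 5 for each setting, the worst case is 20 buffers per socket, so 12 fully loaded sockets fit in a pool of 256. With every setting at its maximum (43, 43, 43 and 42), a single socket could claim 171 buffers, leaving room for only one such socket.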

We hope you found this article helpful. We’d love to hear any thoughts or questions you have in the comments below. If you’d rather talk to us directly, feel free to email us at [email protected].

Tags: response time, TCP
