Using 10Gbit Ethernet Adapters with PowerVM SEA: Virtual Ethernet Considerations
Considerations and recommendations for SEA with 10Gbit Ethernet adapters as they’ve become conventional with Power Systems shops.
By Jose Ortega | 08/01/2018
10Gb Ethernet adapters have become mainstream in IBM Power Systems shops. The first thought when you have these adapters is to use them with PowerVM SEA, since the SEA device has come a long way on the IBM Power Systems platform and provides the most advanced functions, such as Live Partition Mobility. However, it's necessary to keep a couple of considerations in mind before using these adapters with PowerVM.
When I researched this topic, I found a lot of documentation using POWER7, and it's important to keep in mind that the results on POWER7 with a PCI Express Gen2 bus are much lower than on POWER8 with Gen3 technology. In this article, I'll cover the general considerations when using these adapters, the recommendations for virtual Ethernet, and virtual Ethernet performance statistics.
In a future article, I'll discuss the considerations and recommendations for PowerVM SEA with a link aggregation based on two 10Gbit adapters. The idea is to get a clear picture of what to expect when using SEA with 10Gbit Ethernet adapters.
The reason this topic is broken up into two articles is that, to get the maximum performance from the SEA, it's important to ensure that all recommendations are implemented across the whole PowerVM network stack, since any bottleneck will prevent a 10Gb adapter from providing maximum throughput.
The test results that I'll show were taken on two S824 servers (8286-42A) with 24 cores and a 3,525 MHz processor clock speed, VIOS 126.96.36.199 with a 10Gbit RoCE Converged Network Adapter (feature code EC2N), and AIX LPARs at level 7200-02-01-1731.
The information in this article is based on Virtualization and the World of 10Gbit Ethernet by Gareth Coates, Ethernet on Power by Steven Knudson, the AIX information center, Redbooks, and test results taken in my lab environment.
- First of all, when using PowerVM SEA, don't expect 10Gb to be 10 times faster than 1Gb. You need to supply enough data to drive the adapter to its maximum capacity.
- Using PowerVM SEA with a virtual Ethernet adapter on the client side is going to be more expensive in terms of CPU and latency. This is because PowerVM SEA does not use a hardware-direct connection, and the extra context switches increase the processing workload.
- To drive 10Gbit adapters at maximum speed, you have to guarantee enough CPU resources in both the VIOS and the client LPAR.
- AIX Version 7.2 comes with mtu_bypass enabled by default, which makes a huge improvement in virtual Ethernet performance. More detail appears in the virtual Ethernet considerations below.
- Consider using dedicated physical adapters when you have LPARs with very demanding throughput or low-latency requirements.
- It can be difficult to achieve the line speed of 10Gbit Ethernet through an SEA with a single trunk adapter on POWER7, but it's possible with two trunk adapters. Take a look at Gareth Coates' presentation. The maximum throughput per trunk adapter is considered to be about 13Gbits/sec on POWER7.
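On releases where mtu_bypass is not already on by default (such as AIX 7.1), it can be checked and enabled per interface. A minimal sketch; enX is a placeholder interface name, not one from my test systems:

```shell
# Check whether largesend (mtu_bypass) is currently enabled on the interface
lsattr -El enX -a mtu_bypass

# Enable it if it is off
chdev -l enX -a mtu_bypass=on
```

These are standard AIX interface attributes; verify the setting with lsattr after the change.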
A Word of Caution About Network Benchmark Tools
The network benchmark tool you choose is very important. Typical choices are FTP, dd, netperf and iperf. A word of caution: The results from these tools will vary. Generally, iperf will report more throughput than netperf, and netperf will report more traffic than FTP, since the latter is single-threaded. In addition, iperf has the advantage of being able to run several processes in parallel, so the reported throughput will be greater. For this reason, I'll be using iperf to run the tests. Finally, note that iperf V2 will report more throughput than iperf V3.
To illustrate, the following are the results from using these tools to stress LPAR-to-LPAR internal LAN communication with 1 CPU/1 VP capped and the default Ethernet adapter configuration in AIX 7.2:
Using the right tool is important in order to know where the system's limits are.
Finally, FTP can be used when you want to know the maximum traffic for a single TCP connection.
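As a sketch of how such an iperf V2 test can be run between two LPARs (the host name and stream count below are placeholders, not the values used in the tests above):

```shell
# On the receiving LPAR: start iperf in server mode
iperf -s

# On the sending LPAR: 8 parallel TCP streams for 60 seconds
# (lpar2 is a hypothetical host name)
iperf -c lpar2 -P 8 -t 60
```

The -P flag is what lets iperf open multiple parallel connections, which is why it can report higher aggregate throughput than single-threaded tools such as FTP.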
With these main considerations in mind, I'd like to explain the recommendations for virtual Ethernet adapters. As I mentioned earlier, any bottleneck in the virtual environment will prevent you from reaching the maximum speed of the 10Gbit Ethernet adapter.
Considerations and Recommendations for Virtual Ethernet
In order to get maximum performance when using a Shared Ethernet Adapter, it's necessary to guarantee a proper configuration for the LPAR and the virtual Ethernet device. The virtual switch communication is fast and reliable because the TCP/IP packets are copied using the partitions' memory.
Thanks to its default configuration, virtual Ethernet performance on AIX 7.2 is improved dramatically compared to AIX 7.1. For instance, virtual Ethernet adapters in AIX 7.2 come with the mtu_bypass parameter enabled by default. This means largesend is enabled and, as a result, virtual communication is greatly improved. The following table shows a performance comparison between the default configurations of AIX V7.1 and V7.2:
CPU Resources Impact Over Virtual Ethernet Performance
The most important factor for virtual Ethernet performance is processing resources. The idea in the following graph is simple: If the system is CPU constrained under a high virtual Ethernet workload, near-linear improvements in throughput will likely be seen as more CPU resources become available. The reason is that virtual Ethernet adapters do not have an additional processor to offload the calculation tasks for Ethernet activity. In addition, out-of-the-box internal LAN performance on AIX 7.2 with POWER8 is quite good.
Despite the fact that these results were taken on POWER8 while POWER9 is now being rolled out, the important point isn't the actual throughput; it's that CPU allocation on the LPAR has a significant impact on throughput rates with virtual Ethernet adapters.
Recommendations for Virtual Ethernet Adapters
In addition to the default AIX 7.2 configuration and CPU resources, it's suggested to enable the following parameters for virtual Ethernet devices:
- Data Cache Block Flush (dcbflush): This allows the virtual Ethernet device driver to flush the processor's data cache after data has been received. It increases CPU utilization but also increases throughput. Run the following command:
# chdev -l entX -a dcbflush_local=yes -P (a reboot is needed to take effect)
- Dog threads (thread): By enabling the dog threads feature, the driver queues the incoming packet to a thread, and the thread handles calling the IP, TCP, and socket code. Enable this parameter as your LPAR grows in CPU resources. Run the following commands:
# ifconfig enX thread
# chdev -l enX -a thread=on
- Jumbo frames: Increasing the MTU to 9000 bytes can help you reduce CPU consumption and increase throughput. All devices involved in the communication must use the same MTU.
# chdev -l enX -a mtu=9000
For this part of the series I didn't run a jumbo frame test. However, in the second article I'll show the performance impact of using this parameter.
- Increase virtual Ethernet buffers: When the load on a virtual Ethernet adapter is very high, packets can be retransmitted and throughput decreased. Use entstat -d and look for dropped packets and errors. There are two reasons this could be happening: lack of CPU resources or virtual buffer exhaustion. For the latter, consider increasing the virtual Ethernet buffers to avoid their exhaustion, provided you have enough CPU resources. In my case I ran the following command:
# chdev -l entX -a max_buf_tiny=4096 -a min_buf_tiny=2048 \
    -a max_buf_small=4096 -a min_buf_small=2048 \
    -a max_buf_medium=512 -a min_buf_medium=256 \
    -a max_buf_large=128 -a min_buf_large=64 \
    -a max_buf_huge=128 -a min_buf_huge=64 -P (a reboot is needed to take effect)
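To spot buffer exhaustion, look for the "No Resource Errors" counter in the entstat -d output for the virtual adapter. A minimal sketch, using a saved sample of the output; the counter value here is made up for illustration:

```shell
# Hypothetical excerpt of `entstat -d entX` output, saved for illustration;
# on a live system you would pipe entstat directly into grep.
cat > /tmp/entstat_sample.txt <<'EOF'
Virtual I/O Ethernet Adapter (l-lan) Specific Statistics:
No Resource Errors: 1532
Hypervisor Receive Failures: 1532
EOF

# A steadily growing "No Resource Errors" count points to virtual
# buffer exhaustion rather than a CPU shortage.
grep "No Resource Errors" /tmp/entstat_sample.txt
```

If this counter keeps increasing under load, raising the min/max buffer attributes as shown above is the appropriate fix; if CPU is the constraint, the counter stays low while throughput still suffers.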
- Send and receive TCP/IP buffers: These values specify how much data can be buffered when sending or receiving. For most workloads the default value of 256KB is sufficient. However, in my case I found that doubling the defaults to 512KB increased my throughput and reduced CPU consumption. Example:
# chdev -l enX -a tcp_recvspace=524288 -a tcp_sendspace=524288
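A buffer in this range also matches a rough bandwidth-delay product estimate. The sketch below assumes a hypothetical 0.4 ms round-trip time, which is not a measured value from these tests:

```shell
# bandwidth-delay product (bytes) = (bits per second / 8) * RTT in seconds
# 10Gbit/s with an assumed 0.4 ms RTT, expressed as 4/10000 s:
echo $(( 10000000000 / 8 * 4 / 10000 ))
```

This prints 500000 bytes, close to the 524288-byte (512KB) buffers set above; a link that cannot buffer a full bandwidth-delay product of in-flight data cannot keep a single TCP connection at line rate.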
Tuning Attributes' Impact on Virtual Ethernet Performance
Tuning these attributes benefits both throughput and CPU utilization. The results shown in the visual suggest a 1.7X throughput increase with the appropriate parameters.
Virtual switch performance comparison between POWER7 and POWER8:
It’s worth noting that with POWER7, I wasn’t able to achieve 20Gbits/sec with only one virtual Ethernet adapter and one trunk device on the SEA. However, with virtual Ethernet devices on POWER8, it’s much easier to get more than 20Gbits/sec with only one virtual Ethernet device.
Final Thoughts on Virtual Ethernet Performance
The default configuration on AIX Version 7.2 is more than appropriate to get decent transfer rates from 10Gbit Ethernet adapters. The most important consideration is having the CPU resources to drive maximum throughput. There are a couple of attribute changes that make sense, such as enabling dog threads and increasing the virtual and TCP/IP buffers. In this scenario, it wouldn't be necessary to create several trunk adapters to drive the maximum capacity of a Shared Ethernet Adapter mounted over a link aggregation of two 10Gbit ports. Finally, keep in mind the word of caution about choosing the right tool to stress the network. In the second part of this series, I will discuss the considerations for the Virtual I/O Server, the Shared Ethernet Adapter and link aggregation.
Jose Ortega is an IBM Power Systems and database consultant. He's been working on the platform since 2005.