







# Towards Low Latency Interrupt Mode DPDK

David Su, **Yunhong Jiang**, Wei Wang





























# Legal Disclaimer

- INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL® PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL'S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER, AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL® PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT. INTEL PRODUCTS ARE NOT INTENDED FOR USE IN MEDICAL, LIFE SAVING, OR LIFE SUSTAINING APPLICATIONS.
- Intel may make changes to specifications and product descriptions at any time, without notice.
- All products, dates, and figures specified are preliminary based on current expectations, and are subject to change without notice.
- Intel processors, chipsets, and desktop boards may contain design defects or errors known as errata, which may cause the product to deviate from published specifications. Current characterized errata are available on request.
- Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries.
- \*Other names and brands may be claimed as the property of others.
- Copyright © 2017 Intel Corporation.
















- DPDK Working Mode Transition
- Problems and Optimizations
- Performance Evaluation
- Next Step Plan









### Working Model Transition



- Polling mode:
  - 100% CPU usage even without inbound packets
- Interrupt mode DPDK on a dedicated CPU:
  - Enter CPU idle state when no packet is received
- Interrupt mode DPDK sharing a CPU with other processes:
  - Run with the highest priority
  - Yield the CPU to other processes when no packet is received
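The difference between the polling and interrupt models can be illustrated with a small sketch. This is not DPDK code: a socket pair stands in for the NIC's Rx interrupt event, and `wait_for_packet` and `demo` are hypothetical names chosen for illustration. In a real application, DPDK's l3fwd-power example plays a comparable role by blocking on an event file descriptor instead of spinning on the Rx queue.

```python
import selectors
import socket
import threading
import time


def wait_for_packet(rx_sock, timeout_s):
    """Block until rx_sock is readable; return the payload, or None on timeout."""
    sel = selectors.DefaultSelector()
    sel.register(rx_sock, selectors.EVENT_READ)
    try:
        # The calling thread sleeps in the kernel here instead of busy-polling,
        # so the CPU can idle or run other processes.
        ready = sel.select(timeout=timeout_s)
    finally:
        sel.close()
    if not ready:
        return None
    return rx_sock.recv(2048)


def demo():
    """Deliver one 'packet' after 50 ms and measure CPU time spent waiting."""
    nic_side, app_side = socket.socketpair()

    def fake_nic():
        # Hypothetical stand-in for a NIC raising an Rx interrupt.
        time.sleep(0.05)
        nic_side.send(b"pkt")

    sender = threading.Thread(target=fake_nic)
    sender.start()
    cpu_before = time.process_time()
    payload = wait_for_packet(app_side, timeout_s=2.0)
    cpu_used = time.process_time() - cpu_before
    sender.join()
    nic_side.close()
    app_side.close()
    return payload, cpu_used
```

Calling `demo()` returns the payload together with the CPU time consumed while waiting. A busy-polling loop over the same interval would instead consume CPU time for the whole wait.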







#### Working Model Transition with Virtualization

Figure (stack diagram): PMD DPDK, Guest OS, Host OS with VMM, logical CPU.

- Polling mode:
  - 100% CPU usage even without inbound packets

Figure (stack diagram): interrupt mode DPDK, Guest OS, Host OS with VMM, logical CPU.

- Interrupt mode DPDK inside a VM on a dedicated CPU:
  - Enter the CPU idle state when there are no inbound packets
- Interrupt mode DPDK inside a VM sharing a CPU with other processes:
  - Run with the highest priority
  - Yield the CPU to other processes on the host OS when there are no inbound packets
  - Sharing the CPU with other processes inside the VM is possible but currently not encouraged
















#### Performance Issues on a Native OS







# Optimizations on a Native OS

- Interrupt Handling Optimization
  - Handle the interrupt immediately to avoid scheduling the ISR thread
    - igb\_uio driver: http://dpdk.org/dev/patchwork/patch/19855/ (merged)
    - vfio-pci driver: https://patchwork.kernel.org/patch/7493081/ (WIP)




- Scheduling Optimization
  - RT Linux is helpful to reduce the scheduling delay


- Interrupt Latency Optimization
  - Set the interrupt affinity to avoid one IPI (inter-processor interrupt). Ideally, the DPDK library would set the affinity itself.
  - Remove the timer throttling so that interrupts are delivered promptly: http://dpdk.org/dev/patchwork/patch/19856/ (WIP)
- Wakeup Latency Optimization
  - Limit the maximum C-state via a kernel boot parameter






#### Performance Issues on a VM

- Latency as described for the native environment, plus the extra latency from the virtualization layer
  - The ISR in the guest kernel
  - Host/guest context switch for interrupt injection
- Potential bugs in the VMM layer may cause longer latency
  - https://www.spinics.net/lists/kvm/msg144469.html

Further optimizations to the VMM layer are in our next step plan.




















#### Test Environment

- Host
  - CPU: Intel Xeon E5-2699 v3 @ 2.30 GHz
  - OS: KVM4NFV D release (RT Kernel 4.4)
  - NIC: Intel Ethernet Controller X540-AT2, 10 Gb/s
- Guest
  - vCPUs bound to isolated pCPUs
  - OS: same as host
- Test applications
  - DPDK basicfwd
    - Modified based on the DPDK l3fwd-power example
    - Sleeps if no packets arrive for more than 300 us
- Packet generator (MoonGen)
  - 1 packet every 350 us









### CPU Idle Optimization: Current Situation

| Max C-state                          | C0 | C1   | C3   | C6   |
|--------------------------------------|----|------|------|------|
| Interrupt mode Basicfwd Latency (us) | 14 | 14.9 | 60.9 | 87.7 |
| C-state Exit Latency (us) \*         | 0  | 2    | 33   | 133  |

\* Output from "cpupower idle-info" on Intel Xeon E5-2699 v3 @ 2.30 GHz
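The table can be read as a budget problem: given a tolerable wakeup cost, which is the deepest C-state that may still be allowed? The sketch below uses only the numbers from the table, and the function name is hypothetical. On Intel systems such a cap is commonly applied with boot parameters such as `intel_idle.max_cstate=` or `processor.max_cstate=`, but the slides do not say which parameter was used.

```python
# Values copied from the table above (microseconds).
EXIT_LATENCY_US = {"C0": 0, "C1": 2, "C3": 33, "C6": 133}
MEASURED_LATENCY_US = {"C0": 14.0, "C1": 14.9, "C3": 60.9, "C6": 87.7}


def deepest_state_within(budget_us, exit_latency_us=EXIT_LATENCY_US):
    """Return the deepest C-state whose exit latency fits the budget."""
    allowed = [(lat, name) for name, lat in exit_latency_us.items()
               if lat <= budget_us]
    if not allowed:
        raise ValueError("no C-state fits the latency budget")
    # The largest exit latency corresponds to the deepest state.
    return max(allowed)[1]


# Extra forwarding latency measured at each cap, relative to C0.
added_latency_us = {
    state: round(value - MEASURED_LATENCY_US["C0"], 1)
    for state, value in MEASURED_LATENCY_US.items()
}
```

With a 10 us budget, `deepest_state_within(10)` selects C1, which matches the measurement: capping at C1 adds under 1 us over C0, while C3 and C6 add tens of microseconds.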





# Latency Improvement

| Latency                                              | Minimum (μs) | Average (μs) | Maximum (μs) |
|------------------------------------------------------|--------------|--------------|--------------|
| Interrupt mode Basicfwd (Host, before optimization)  | 19           | 105          | 418          |
| Interrupt mode Basicfwd (Host, after optimization)   | 9            | 14           | 21           |
| Interrupt mode Basicfwd (in-VM, before optimization) | 9            | 112          | 7210         |
| Interrupt mode Basicfwd (in-VM, after optimization)  | 9            | 20           | 35           |
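To make the size of the improvement easier to compare across rows, the before/after ratios can be computed directly from the table. The sketch below uses only the table's values, and the names are chosen for illustration.

```python
# (minimum, average, maximum) latency in microseconds, from the table above.
LATENCY_US = {
    ("host", "before"): (19, 105, 418),
    ("host", "after"): (9, 14, 21),
    ("vm", "before"): (9, 112, 7210),
    ("vm", "after"): (9, 20, 35),
}


def improvement(env):
    """Return before/after ratios for (minimum, average, maximum) latency."""
    before = LATENCY_US[(env, "before")]
    after = LATENCY_US[(env, "after")]
    return tuple(round(b / a, 1) for b, a in zip(before, after))


host_ratio = improvement("host")
vm_ratio = improvement("vm")
```

The ratios show that the largest gain is in the tail: the maximum latency improves by roughly 20x on the host and by about 206x in the VM, while the average improves by 7.5x and 5.6x respectively.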


















#### Next Step Plan: Optimizations to DPDK inside the VM



- DPDK App starts to run once the vCPU is woken up by the Host ISR
- No need to inject virtual interrupts
- No need to signal eventfd inside the VM









End of Presentation

Thank you!

