Ostinato and Intel® DPDK Data Rates Report


Ostinato performance report for 1G link: comparison between libpcap and DPDK performed on both Intel and AMD hardware platforms.

Published in: Technology

  1. OSTINATO AND INTEL® DPDK DATA RATES REPORT: INTEL AND AMD CPU ARCHITECTURES. plvision.eu
  2. WHAT IS OSTINATO?
     Ostinato is a software traffic generator and analyzer based on libpcap. Main features:
     • ‘Transmit’, ‘capture’ and ‘statistics’ functions are implemented using a thread model
     • Open-source and cross-platform
     • Shows real-time port receive/transmit statistics and rates
  3. WHAT IS INTEL® DPDK?
     Intel® DPDK (Data Plane Development Kit) is a user-space software library for high-speed packet processing:
     - Polling instead of interrupt handling
     - Lock-free structures to avoid kernel synchronization
     - Packet ring buffer located in user space to avoid kernel-to-user-space copying
     - Large pages to reduce TLB misses
     - SSE for copying data efficiently
     - CPU affinity to avoid the Linux scheduler
     - NUMA awareness to speed up memory access
     - Sending packets in bursts to use the bus efficiently
     - Memory pre-allocation for fixed-size packets
  4. OSTINATO AND INTEL® DPDK
     • Ostinato supports Intel® DPDK with limited functionality
     • PLVision’s engineers have enabled support of Intel® DPDK for Ostinato, with all its functionality preserved*
     * PLVision is working on making it available in the Ostinato repository
  5. TESTBED
     The 2 ports are connected with a loopback; the DUT sends, captures, and gathers statistics.
     INTEL (DUT in a KVM VM*)
     • CPU: Intel Xeon E5-2603, 1.80 GHz, 4 cores
     • RAM: 8 GB
     • NIC: Intel I350-T4, four 1G ports (only two are used), attached using PCI Express bus
     • Host: Ubuntu 14 Server, kernel 3.13.0; KVM hypervisor, QEMU ver. 2.0.0
     • VM guest: Ubuntu 14.1 Server, kernel 3.13.0; NIC used directly via PCI passthrough
     * VM used due to limited hardware availability
     AMD (DUT on bare metal)
     • CPU: AMD Phenom II X4 965, 3.4 GHz, 4 cores
     • RAM: 4 GB
     • NIC: Intel I350-T2, two 1G ports, attached using PCI Express bus
     • Host: Ubuntu 14.0 Desktop, kernel 3.13.0
     • VM guest: no VM used (bare metal configuration)
  6. HOW INTEL® DPDK IS USED: BINDING THREADS TO CPU CORES
     • Main core (0): gathers statistics, etc.
     • First core (1): TX / RX
     • Second core (2): TX / RX
     • Third core (3): idle
     Ports: GE Port 1, GE Port 2
  7. DATA RATE CALCULATION: 64 B PACKETS ON 1G LINK
     • Packet body size: 64 bytes
     • Layer 1 frame size: packet body (64 bytes) + preamble (8 bytes) + inter-packet gap (12 bytes) = 84 bytes = 672 bits
     • Packet rate on 1G link: 1 billion / 672 bits = 1488095 packets/s
     • Maximum data rate: 1488095 * (64 * 8) / 1 million = 761.9 Mbit/s
  8. TEST 1: SINGLE PORT TX
     Test PC with Intel® NIC: Port 1 TX, Port 2 RX
  9. LIBPCAP TX (1 PORT): STATS
     Data rate, Mbit/s, by packet size:
     Packet size (bytes)             64      128     256     512
     Intel CPU                       158     328     510     944
     AMD CPU                         352     700     912     952
     Theoretical highest data rate   761.9   864.9   927.5   962.4
     • Theoretical max throughput is not reached. Possible reason: lack of CPU performance
     • For small packets (64 and 128 bytes), the data rate on the Intel CPU is less than half of that on the AMD CPU, likely because the Intel CPU frequency is almost twice lower (1.8 vs 3.4 GHz).
  10. INTEL® DPDK TX (1 PORT): STATS
      Data rate, Mbit/s, by packet size:
      Packet size (bytes)             64      128     256     512
      Intel CPU                       761     864     927     962
      AMD CPU                         761     864     927     962
      Theoretical highest data rate   761.9   864.9   927.5   962.4
      • Theoretical max throughput is reached on both CPUs
      • The data rate is the same for Intel and AMD, so DPDK processes packets much more efficiently
      • The Intel CPU speed is sufficient with DPDK, but not with libpcap.
  11. INTEL® DPDK VS LIBPCAP TX (1 PORT): IMPROVEMENTS
      Byte speed improvement:
      Packet size (B)   Intel    AMD
      64                x4.8     +54%
      128               x2.6     +18%
      256               +81%     +2%
      512               same     same
  12. TEST 1: CONCLUSION
      DPDK on Intel CPU shows considerable improvement
      • x4.8 improvement, due to the low Intel CPU frequency (1.8 vs 3.4 GHz)
      • DPDK processes packets more efficiently.
      For large packets, there is almost no improvement
      • A larger number of small packets is needed to fill the line rate
      • The CPU effort for creating a small packet and a big packet is almost the same, but filling the line rate takes many more small packets, so more CPU effort is required.
      DPDK allows using a CPU with a lower frequency.
  13. TEST 2: PORTS 1 AND 2, BOTH TX / RX
      Test PC with Intel® NIC: Port 1 TX/RX, Port 2 TX/RX
  14. LIBPCAP TX (2 PORTS): STATS
      [Chart: data rate, Mbit/s, vs. packet size (64, 128, 256, 512 bytes). Legend as extracted: Intel CPU (Port1), AMD CPU. Data labels as extracted: 112 320 360 176 352 360 952 761.9 864.9 927.5 962.4 240 488 760]
      • Different speeds between port 1 and port 2 on Intel CPU
        o The Linux scheduler is beyond our control, and load balancing between cores might not be efficient
      • Traffic degradation is observed compared to single-port traffic in Ostinato
        o RX processing is apparently affecting performance.
  15. DPDK TX (2 PORTS): STATS
      Data rate, Mbit/s, by packet size:
      Packet size (bytes)             64      128     256     512
      Intel CPU                       761     864     927     962
      AMD CPU                         761     864     927     962
      Theoretical highest data rate   761.9   864.9   927.5   962.4
      • Theoretical max throughput is reached on both CPUs and for both ports
      • DPDK bypasses the Linux scheduler by binding each thread to a CPU core. Therefore, each port has a dedicated core for transmitting traffic.
  16. DPDK VS LIBPCAP TX (2 PORTS): STATS
      Speed improvement:
      Packet size (B)   Intel    AMD
      64                x6.8     x4.3
      128               x2.7     x2.4
      256               x2.5     x2.5
      512               +26%     same
  17. CONCLUSIONS
      Achievements
      • Achieved 1G full-rate TX/RX on a "loop" topology for every packet size tested.
      Observations
      • No difference in performance between the AMD and Intel CPU set-ups
      • DPDK is almost insensitive to CPU frequency on the given configurations
      • VM usage doesn’t cause performance degradation.
      Why DPDK provides considerable speed improvements
      • Avoids the latency of copying through kernel space
      • Avoids the Linux scheduler by binding threads to CPU cores
      • No thread synchronization.
      Notes
      • Real CPU core usage cannot be observed: it is always 100% regardless of traffic rate and packet size (due to peculiarities of the DPDK architecture).
  18. REFERENCES
      • DPDK overview from Intel
      • Ostinato website
  19. FIND OUT MORE
      Visit our blog for more tech stuff: plvision.eu
      See also: similar comparison for 10G configuration.
  20. THANK YOU FOR WATCHING!
      Email: [email protected]
      Skype: sales.plvision
      http://plvision.eu/
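
The core binding on slide 6 can be illustrated with the generic Linux affinity call. This is a minimal sketch only: it uses Python's standard `os.sched_setaffinity` rather than DPDK's own core-binding mechanism, it runs on Linux only, and the `CORE_ROLES` table and `pin_current_thread` helper are hypothetical names introduced for the example.

```python
import os

# Core roles as listed on slide 6 (core index -> role); illustrative only.
CORE_ROLES = {0: "main: gather statistics", 1: "TX/RX", 2: "TX/RX", 3: "idle"}


def pin_current_thread(core):
    """Restrict the calling thread to a single CPU core and return the new mask.

    Passing 0 as the first argument means "the calling thread" on Linux.
    """
    os.sched_setaffinity(0, {core})
    return os.sched_getaffinity(0)
```

A worker that handles one port would call `pin_current_thread(1)` or `pin_current_thread(2)` before entering its polling loop, so the scheduler can no longer move it to another core.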
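
The calculation on slide 7 generalizes to the other packet sizes used in the charts. The sketch below assumes the same per-packet overhead as the slide (8-byte preamble, 12-byte inter-packet gap) on a 1 Gbit/s link; the function and constant names are made up for the example.

```python
PREAMBLE_BYTES = 8
INTER_PACKET_GAP_BYTES = 12
LINK_BITS_PER_SECOND = 1_000_000_000


def packets_per_second(packet_bytes):
    """Packets per second that fit on the link, counting preamble and gap."""
    wire_bits = (packet_bytes + PREAMBLE_BYTES + INTER_PACKET_GAP_BYTES) * 8
    return LINK_BITS_PER_SECOND / wire_bits


def max_data_rate_mbps(packet_bytes):
    """Highest data rate in Mbit/s, counting only the packet body bits."""
    return packets_per_second(packet_bytes) * packet_bytes * 8 / 1_000_000


for size in (64, 128, 256, 512):
    print(size, int(packets_per_second(size)), round(max_data_rate_mbps(size), 1))
```

For 64, 128, 256 and 512 bytes this gives 761.9, 864.9, 927.5 and 962.4 Mbit/s, the "theoretical highest data rate" values shown on the charts.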
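
The improvement figures on slide 11 can be cross-checked against the data rates on slides 9 and 10. The sketch below computes the DPDK rate divided by the libpcap rate; the formatting rule (a factor for ratios of 2 or more, a percentage otherwise) is an assumption made for readability, not something the slides state.

```python
# Data rates in Mbit/s from slides 9 and 10, keyed by packet size in bytes.
LIBPCAP = {
    "Intel": {64: 158, 128: 328, 256: 510, 512: 944},
    "AMD": {64: 352, 128: 700, 256: 912, 512: 952},
}
DPDK = {64: 761, 128: 864, 256: 927, 512: 962}  # same on both CPUs


def speedup(cpu, packet_bytes):
    """DPDK data rate divided by the libpcap data rate for one CPU and size."""
    return DPDK[packet_bytes] / LIBPCAP[cpu][packet_bytes]


def describe(ratio):
    """Format a ratio as 'xN.N' when at least 2, otherwise as a percentage gain."""
    if ratio >= 2:
        return f"x{ratio:.1f}"
    return f"+{(ratio - 1) * 100:.0f}%"


for cpu in ("Intel", "AMD"):
    print(cpu, [describe(speedup(cpu, size)) for size in (64, 128, 256, 512)])
```

Relative to libpcap, the Intel ratios (about 4.8, 2.6 and 1.8) agree with the slide. The AMD ratios for 64 and 128 bytes come out at about 2.2 and 1.2, whereas the slide's +54% and +18% appear to express the gain as a share of the DPDK rate instead, for example (761 - 352) / 761 ≈ 54%.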
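
The point on slide 12 that small packets need more CPU effort can be made concrete as a per-packet cycle budget. This is a rough sketch under stated assumptions: a single core drives the port at full line rate, and the only inputs are the clock frequencies from slide 5 and the 20 bytes of per-packet overhead from slide 7. `cycles_per_packet` is a hypothetical helper.

```python
def cycles_per_packet(cpu_hz, packet_bytes,
                      link_bps=1_000_000_000, overhead_bytes=20):
    """CPU cycles that elapse while one packet occupies the wire at line rate."""
    wire_bits = (packet_bytes + overhead_bytes) * 8
    seconds_per_packet = wire_bits / link_bps
    return cpu_hz * seconds_per_packet


for name, hz in (("Intel 1.8 GHz", 1.8e9), ("AMD 3.4 GHz", 3.4e9)):
    for size in (64, 512):
        print(name, size, round(cycles_per_packet(hz, size)))
```

At 64 bytes the 1.8 GHz core has roughly 1,210 cycles per packet against roughly 2,285 on the 3.4 GHz core; at 512 bytes the budget on the 1.8 GHz core grows to roughly 7,661 cycles, which fits the observation that large packets reach close to line rate even with libpcap.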
