DPDK Summit - 08 Sept 2014 - Futurewei - Jun Xu - Revisit the IP Stack in Linux with Network Virtualization

Jun Xu of Futurewei (Huawei) presents at the DPDK Summit on IP stack implementations using DPDK in network virtualization models.

Transcript

  • 1. Jun (Jim) Xu [email protected] Principal Engineer, Futurewei Technologies, Inc.
  • 2. Linux KVM/QEMU Switch/Router NFV
  • 3. Linux IP stack in the kernel
      ◦ All applications communicate via sockets
      ◦ Limited raw-socket applications
      ◦ A perfect world (really?)
      ◦ Diagram labels: L2, L3, L4, Socket Interface, User Space, tcpdump, apache
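To make the "everything goes through the socket interface" point concrete, here is a minimal sketch of a user-space application exchanging data through the kernel IP stack over loopback. The function names `serve_once` and `echo_roundtrip` are illustrative and do not come from the slides.

```python
import socket
import threading


def serve_once(listener: socket.socket) -> None:
    """Accept one TCP connection and echo data back until the peer stops sending."""
    conn, _addr = listener.accept()
    with conn:
        while True:
            data = conn.recv(4096)
            if not data:  # empty read: peer has shut down its sending side
                break
            conn.sendall(data)


def echo_roundtrip(payload: bytes) -> bytes:
    """Send payload to a loopback echo server and return everything echoed back."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))  # port 0 lets the kernel pick a free port
        listener.listen(1)
        port = listener.getsockname()[1]
        server = threading.Thread(target=serve_once, args=(listener,))
        server.start()
        chunks = []
        with socket.create_connection(("127.0.0.1", port)) as client:
            client.sendall(payload)
            client.shutdown(socket.SHUT_WR)  # signal end of data to the server
            while True:
                chunk = client.recv(4096)
                if not chunk:
                    break
                chunks.append(chunk)
        server.join()
    return b"".join(chunks)
```

Every byte here crosses the user/kernel boundary through socket system calls, and the kernel's L4/L3 code handles the rest. That is the model the later slides revisit.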
  • 4. "KVM (Kernel-based Virtual Machine) is a virtualization infrastructure for the Linux kernel that turns it into a hypervisor, which was merged into the Linux kernel mainline in February 2007" (from Wikipedia)
      ◦ Supports multiple architectures
      ◦ Commonly used in networking and by ISPs/SPs
      ◦ KVM inherits the majority of Linux kernel functions, including its IP stack
      ◦ Diagram labels: KVM/Linux, Bare Metal, APP, VM, Hypervisor, Memory Management, Process Scheduling, TCP/IP Stack, IO Driver, More…
  • 5. Traffic patterns in the DC: (1) traffic between VMs within a host; (2) traffic across hosts; (3) traffic aggregated to the core and going to the edge
      ◦ Most traffic is of the first two types (east-west)
      ◦ To handle the first two cases, the virtual switch is introduced
      ◦ Diagram labels: Intel Server, vSwitch, VM 11.1.1.1/24, VM 11.1.1.2/24, Huawei CE12800, Huawei CE5800
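The three traffic types can be told apart with a simple placement lookup. The sketch below is illustrative only: the `placement` table, the host names, and the address 11.1.1.3 are hypothetical, and a real virtual switch would decide this from its forwarding tables rather than a dictionary.

```python
from typing import Dict


def classify_flow(src_vm: str, dst: str, placement: Dict[str, str]) -> str:
    """Classify a flow by where its endpoints live.

    placement maps a VM address to the (hypothetical) host it runs on.
    A destination missing from the table is treated as outside the data center.
    """
    if src_vm not in placement:
        raise KeyError(f"unknown source VM: {src_vm}")
    dst_host = placement.get(dst)
    if dst_host is None:
        return "north-south (to core/edge)"
    if dst_host == placement[src_vm]:
        return "east-west (within host)"
    return "east-west (across hosts)"


# Hypothetical placement: two VMs share server-a, a third runs on server-b.
PLACEMENT = {"11.1.1.1": "server-a", "11.1.1.2": "server-a", "11.1.1.3": "server-b"}
```

With this table, a flow from 11.1.1.1 to 11.1.1.2 never needs to leave the host, which is the case the virtual switch is meant to serve.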
  • 6. VXLAN, STT, and NVGRE can be used in the virtual switch
      ◦ A distributed router can be deployed in the host as well
      ◦ Diagram labels: Intel Server, vSwitch, VM 11.1.1.1/24, VM 11.1.1.2/24, Huawei CE12800, Huawei CE5800
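As an illustration of what overlay encapsulation in a virtual switch involves, here is a minimal sketch of the 8-byte VXLAN header defined in RFC 7348 (flags byte with the I bit set, 24-bit VNI, reserved fields zero). Only the VXLAN header is built; the outer Ethernet/IP/UDP headers are omitted, and the function names are illustrative rather than taken from any vSwitch implementation.

```python
import struct
from typing import Tuple

VXLAN_UDP_PORT = 4789        # IANA-assigned destination port for VXLAN
VXLAN_FLAG_VNI_VALID = 0x08  # "I" flag: the VNI field is valid


def vxlan_encap(vni: int, inner_frame: bytes) -> bytes:
    """Prefix an inner Ethernet frame with a VXLAN header carrying the given VNI."""
    if not 0 <= vni < (1 << 24):
        raise ValueError("VNI must fit in 24 bits")
    # Word 0: flags (8 bits) + 24 reserved bits. Word 1: VNI (24 bits) + 8 reserved bits.
    header = struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)
    return header + inner_frame


def vxlan_decap(packet: bytes) -> Tuple[int, bytes]:
    """Return (vni, inner_frame) from a VXLAN payload, validating the I flag."""
    if len(packet) < 8:
        raise ValueError("packet shorter than VXLAN header")
    word0, word1 = struct.unpack("!II", packet[:8])
    if not (word0 >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI-valid flag not set")
    return word1 >> 8, packet[8:]
```

For example, `vxlan_decap(vxlan_encap(5000, frame))` returns `(5000, frame)`; the VNI value 5000 is arbitrary.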
  • 7. One open-source virtual switch is OVS (Open vSwitch)
      ◦ The current OVS is suitable for endpoint virtualization
      ◦ A perfect world! (really?)
      ◦ Diagram labels: Hypervisor, Linux Kernel, VM 11.1.1.1/24, VM 11.1.1.2/24, Virtual switch (e.g. OVS kernel module), Huawei CE5800 Switch, NIC
  • 8. VNFs will be executed in VMs (for now)
      ◦ The packet performance and the functions of the virtual switches will have a significant impact on the success of NFV
      ◦ These introduce new challenges…
      ◦ For NFV, refer to www.etsi.org
      ◦ Diagram labels: End Node VM1, End Node VM2, Bare Metal, Virtualization Layer, VNF1 LB, VNF2 Firewall, VNF3 NAT, VNF4 IPsec, VNF5 Router
  • 9. Performance Challenges
      ◦ Diagram labels: Hypervisor, Linux Kernel, VM 11.1.1.1/24, VM 11.1.1.2/24, Virtual switch (e.g. OVS kernel module), Physical Switch, NIC
  • 10. From: Lothar Braun, Alexander Didebulidze, et al., "Comparing and Improving Current Packet Capturing Solutions based on Commodity Hardware," in Internet Measurement Conference, 2010. http://conferences.sigcomm.org/imc/2010/papers/p206.pdf
  • 11. Test result: netmap with 60 B packets costs ~1000 cycles/packet
      ◦ 1.488 Mpps Tx rate
      ◦ 50% CPU utilization
      ◦ Intel dual core @ 1600 MHz
      ◦ Netmap is open source, available at http://info.iet.unipi.it/~luigi/netmap/
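The figures on this slide can be related with a little arithmetic. The sketch below rests on assumptions that the slide does not state: that the 60 B packet becomes a 64 B Ethernet frame once the 4 B FCS is added, that 1.488 Mpps corresponds to 1 Gb/s line rate, and that the 50% utilization applies across both cores. Under those assumptions the numbers land close to the quoted ~1000 cycles/packet.

```python
def line_rate_pps(link_bps: float, frame_bytes: int) -> float:
    """Maximum Ethernet packets per second for a given frame size.

    Each frame also occupies 8 bytes of preamble/SFD and a 12-byte
    inter-frame gap on the wire.
    """
    wire_bytes = frame_bytes + 8 + 12
    return link_bps / (wire_bytes * 8)


def cycles_per_packet(clock_hz: float, cores: int, utilization: float, pps: float) -> float:
    """CPU cycles spent per packet, assuming utilization is averaged over all cores."""
    return clock_hz * cores * utilization / pps


# Assumed inputs: 1 Gb/s link, 64-byte frames (60 B packet + 4 B FCS).
rate = line_rate_pps(1e9, 64)                          # about 1.488 million packets/s
cost = cycles_per_packet(1.6e9, 2, 0.5, 1.488e6)       # roughly 1075 cycles/packet
```

If the 50% figure instead referred to a single core, the same formula would give about half that cost, so the interpretation of "CPU utilization" matters.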
  • 12. SR-IOV is a possible solution, but it introduces other problems
      ◦ Diagram labels: VM (IP, App), vNIC, Virtual Switch/Virtual Router, IP stack, KVM/Linux Kernel, NIC
  • 13. In NFV, the VM may not be the endpoint anymore
      ◦ Diagram labels: Intel Server, VM 11.1.1.1/24, VM 11.1.1.2/24, Huawei CE12800, Huawei CE5800, VM (IP, App), vNIC, Virtual Switch/Virtual Router, KVM/Linux Kernel
  • 14. Add (distributed) routing, MPLS, (distributed) firewall, QoS, IP filter, packet classifier, redirect, and load balancing, plus the existing L2 functions…
      ◦ All into the (Linux) kernel
      ◦ The OS/hypervisor/network becomes one monolithic piece (sounds familiar?)
      ◦ Diagram labels: VXLAN, DHCP, Socket Interface, User Space, tcpdump, apache, IPv6, GRE, NVGRE, VLAN, ARP, MPLS, QoS, Redirect, Mirror, LB, Classifier, IPv4, NAT
  • 15. Use an ASIC to offload the switching function
      ◦ A common approach from network vendors
      ◦ Challenges: addressing portability and feature velocity in ASICs
      ◦ Diagram labels: Switch ASIC, NIC, VM (IP, App), vNIC, Hypervisor/Linux Kernel
  • 16. SR-IOV provides separated access to a network adapter among various PCIe hardware functions
      ◦ It bypasses the virtual switch function in the kernel, allowing network traffic to go directly between the VF and the VM
      ◦ Combined with UIO, QEMU can access the VF and provide the I/O to the guest kernel
      ◦ SR-IOV and UIO are orthogonal technologies; other I/O solutions are also feasible
      ◦ Diagram labels: Network Service Appliance, QEMU-KVM, KVM, Linux Kernel, Guest Kernel, Guest Userspace Processes, vNIC, NIC, Huawei CE5800 Switch
  • 17. DPDK demonstrates the desired network packet performance for NFV
      ◦ DPDK's userspace design points to a better software approach
      ◦ Packet processing enhancements provide further opportunities: integration of high-bandwidth PCIe Gen3; new AVX extensions; Intel® Virtualization Technology (Intel® VT); Intel® Data Direct I/O Technology (Intel® DDIO)
      ◦ From Intel DPDK; for DPDK, refer to http://www.intel.com/go/dpdk
      ◦ Platform listed on the slide:
          ◦ Quad Core Intel® Core™ i7-3610QE Processor 2.30 GHz (E1), 6 MB L3 cache
          ◦ Mobile Intel® QM77 Express Chipset (A1)
          ◦ Emerald Lake 2 Platform (CRB)
          ◦ DDR3 1600 MHz, 2 x dual rank 4 GB (total 8 GB), dual-channel configuration
          ◦ 2 x Intel® 82599 Dual Port PCI-Express x8 10 Gigabit Ethernet NIC
  • 18. The KVM/Linux kernel goes back to what it was designed for, and remains robust
      ◦ The userspace virtual switch/router uses UIO to access the physical NIC directly, and benefits from DDIO, PCIe Gen3, IOTLB, etc.
      ◦ The userspace virtual switch/router provides vNICs to the VMs for inter-VM communication
      ◦ Elastic performance with multi-core support
      ◦ Diagram labels: QEMU-KVM, KVM, Linux Kernel, Guest Kernel, Guest Userspace Processes, Userspace Virtual Switch/Router, Network Service Appliance, vNIC, NIC, Huawei CE5800 Switch
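The idea of a userspace switch that polls its ports and forwards frames between vNICs can be sketched as a toy L2 learning switch. This is an in-memory model only: the class name `ToyUserspaceSwitch`, the deque-based port queues, and the burst size are assumptions made for illustration, and nothing here uses DPDK, UIO, or real NIC access.

```python
from collections import deque
from typing import Deque, Dict, List


class ToyUserspaceSwitch:
    """In-memory L2 learning switch with a poll-mode forwarding loop."""

    def __init__(self, num_ports: int) -> None:
        # One receive and one transmit queue per port (stand-ins for vNIC rings).
        self.rx: List[Deque[bytes]] = [deque() for _ in range(num_ports)]
        self.tx: List[Deque[bytes]] = [deque() for _ in range(num_ports)]
        self.mac_table: Dict[bytes, int] = {}

    def poll_once(self, burst: int = 32) -> int:
        """Drain up to `burst` frames from every port; return the number handled."""
        handled = 0
        for in_port, queue in enumerate(self.rx):
            for _ in range(min(burst, len(queue))):
                frame = queue.popleft()
                dst_mac, src_mac = frame[0:6], frame[6:12]
                self.mac_table[src_mac] = in_port      # learn where the sender lives
                out_port = self.mac_table.get(dst_mac)
                if out_port is None:
                    # Unknown destination: flood to every port except the ingress one.
                    targets = [p for p in range(len(self.tx)) if p != in_port]
                elif out_port == in_port:
                    targets = []                       # destination is behind the ingress port
                else:
                    targets = [out_port]
                for port in targets:
                    self.tx[port].append(frame)        # same bytes object, not a copy
                handled += 1
        return handled
```

In a multi-core design, several such loops would typically each own a disjoint set of ports or queues. This sketch runs a single loop for clarity.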
  • 19. The userspace packet path can go directly from VM to VM
      ◦ Zero copy is possible for inter-VM packets on the same host when optimized with guest-kernel UIO and the frontend driver in QEMU
      ◦ Diagram labels: QEMU-KVM, KVM, Linux Kernel, Guest Kernel, Guest Userspace Processes, Userspace Virtual Switch/Router, Network Service Appliance, vNIC, NIC, Huawei CE5800 Switch
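The zero-copy idea, passing a descriptor to shared memory instead of copying packet bytes, can be illustrated in a single process. The sketch below is a conceptual analogy only: `SharedPacketPool` is a hypothetical name, and a `bytearray` stands in for memory that, in the slide's design, would be shared between guests and the userspace switch.

```python
from collections import deque
from typing import Deque, Tuple


class SharedPacketPool:
    """Fixed-size packet slots in one buffer; consumers get views, not copies."""

    def __init__(self, slots: int, slot_size: int) -> None:
        self.slot_size = slot_size
        self.memory = bytearray(slots * slot_size)
        self.free: Deque[int] = deque(range(slots))

    def write(self, payload: bytes) -> Tuple[int, int]:
        """Producer side: place a packet in a free slot, return (slot, length)."""
        if len(payload) > self.slot_size:
            raise ValueError("payload larger than slot")
        slot = self.free.popleft()  # raises IndexError when the pool is exhausted
        start = slot * self.slot_size
        self.memory[start:start + len(payload)] = payload
        return slot, len(payload)

    def view(self, slot: int, length: int) -> memoryview:
        """Consumer side: a window onto the packet bytes without copying them."""
        start = slot * self.slot_size
        return memoryview(self.memory)[start:start + length]

    def release(self, slot: int) -> None:
        """Return a slot to the free list once the consumer is done with it."""
        self.free.append(slot)
```

Only the small `(slot, length)` descriptor travels between producer and consumer; the payload is written once and then read in place.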
  • 20. Support more I/O types
      ◦ Less intrusive I/O
      ◦ Feature-rich userspace switch/router
      ◦ Multi-core expansion
      ◦ Multiple instances to support multi-tenancy
  • 21. Monolithic hypervisors may not be extensible enough for NFV deployment
      ◦ Linux/KVM/hypervisor can focus on its main tasks, e.g. process scheduling, resource management, and virtualization
      ◦ A userspace IP stack provides another approach, offering an easy development environment, high packet performance, and compatible VM communication support
  • 22. References
      ◦ DPDK: https://01.org/packet-processing/overview/dpdk-detail
      ◦ NETMAP: http://info.iet.unipi.it/~luigi/netmap/
      ◦ PF_RING: http://www.ntop.org/products/pf_ring/
      ◦ KVM: http://www.linux-kvm.org/page/Main_Page
      ◦ Contact me at [email protected]