Wednesday, March 5, 2014

Openflow switches - flow setup & termination rate

I attended ONS yesterday.  This is the 1st time, I see many OF hardware switch vendors (either based on NPUs,  FPGAs or ASICs) advertising the performance numbers.  Though the throughput numbers are impressive, flow setup/termination rates are, in my view, are disappointing.  I see the flow setup rate claims are any where between 100 to 2000/sec.

Software based virtual switches support flow setup rate, up to10K/sec for every core. If 4 core part is used, one can get easily 40K flow setups/sec.

In my view,  unless the flow setup rate is improved, it is very unlikely the hardware based OF switches would be popular as the market addressability is limited.

By talking to few, I understand that poor flow setup rate is mainly due to the way TCAMs are used.  As I understand, every flow update (add/delete) requires rearranging the existing entries and that leads to bad performance. I also understand from many of these vendors that they intend to support algorithmic search accelerators to improve the performance of flow insertions and deletions. I also understand from them that this could improve the performance to hundreds of thousands of flow setups/sec.

Unless following are taken care,  hardware based OF switch adoption would be limited.
  • Higher flow setup rate. (Must be better than software based switches)
  • Ability to maintain millions of flows.
  • Support for multiple tables (All 256 tables)
  • VxLAN Support
  • VxLAN with IPSec
Throughput performance is very important and hardware based OF switches are very good at that.  In my view, all of above are also important for any deployments to seriously consider hardware based OF switches in place of software based switches.

Tuesday, March 4, 2014

OVS Acceleration - Are you reading between the lines?

OVS, part of Linux distributions, is openflow based software switch implementation.  OVS has two components - User space component and kernel space component.  User space component consists of   OVSDB n(configuration) sub-component,  Openflow agent that communicates with openflow controllers and OF normal path processing logic.  Kernel component implements the fastpath.  OVS calls the kernel component 'datapath'.  OVS datapath maintains one table.  When the packet comes into the datapath logic, it determines the matching flow in the table.  If no entry, then the packet is sent to the user space 'normal path processing' module.  OVS, in user space, finds out if there are all matching entries across all OF tables in that pipeline. If there are all matching entries,  then it pushes the aggregate flow to the kernel datapath.  Further packets of the aggregate flow gets processed in the kernel itself.  Just to complete this discussion,  if OVS does not find any matching flow in the OF pipeline, then it, based on the miss entry,  sends the packet to the controller. Controller, then pushes the flow mod entries in the OVS.

It is believed that Linux Kernel based datapath is not efficient for following reasons.
  • Linux Kernel overhead before packet is handed over to the datapath.  Before packet is handed over to the OVS datapath, following processing occurs on packets.
    • Interrupt processing
    • Softirq processing
    • Dev Layer processing
  •  Interrupt overhead 
  • Fast Path to Normal path overhead for first few packets of a flow.

Few software/networking vendors have implemented user space datapath using OVS 'dpif' provider interface. Many of these vendors have implemented user space  datapath using Intel DPDK in poll mode.  Poll mode dedicates few cores to receive packets from the Ethernet controller directly, eliminating interrupts and  thereby avoiding overhead associated with interrupt processing. Also, since the Ethernet ports are owned by this custom user space process, there is no need for any complex hook processing as needed in Linux kernel.  This user space datapath can start OF processing on the packets immediately upon reception  from Ethernet controllers, thereby avoiding any intermediate processing overhead. There is no doubt that these two features by themselves provide good performance boost.

Many of the OVS acceleration implementations, as I know as on today, don't perform well  in Openstack environment.  Openstack orchestration with KVM not only uses OVS to implement virtual switch, but also it uses OVS to realize network virtualization (using VxLAN, NVGRE). In addition, it also uses Linux IPTables to implement isolation among VMs using security groups configuration.  Moreover, some deployments even use IPSec Over VxLAN to protect the traffic from eaves dropping.  Based on some literature I have read,  user space datapath packages also don't perform well if VMs are connected to OVS via tuntap interfaces.   They work well if there is no need for  isolation among VMs  and where there is no need for overlay based network virtualization.  That is, these acceleration packages work if VMs are trusted and when VLAN interfaces are good enough.  But, we know that Openstack integrated OVS requires VxLANs,  isolation among VMs,  usage of tuntap interfaces and even VxLAN over IPSec.

Note that isolation using firewall,  tuntap interfaces & VxLAN-over-IPsec today use Linux Kernel capabilities.  If OVS datapath is in user space, then packets anyway have to traverse through the Kernel (some times twice) to avail these capabilities, which even may have lot more performance issues over having OVS datapath in Kernel.  I will not be surprised, if the performance is lot lower than native OVS kernel datapath.

Hence, while evaluating OVS acceleration, one should ask for following questions:

  • What features of OVS applied in measuring the performance?
  • How many OF tables are used are hitting in this performance test?  Make sure that at least 5 tables are used in the OVS as this is very realistic number. 
  • Does it use VxLAN?
  • Does it implement firewall?
  • Does it implement VxLAN over IPSec?
  • Does it assume that VMs are untrusted?
If there is a performance boost with all features enabled, then that implementation is good. Otherwise, one should be very careful.

SDN Controllers - Be aware of Some critical interoperability & Robustness issues

Are you evaluating robustness of a SDN solution?  There are few tests, I believe one should run to ensure that SDN solution is robust.  Since SDN controller is brain behind traffic orchestration,  it should and must be robust enough to handle various openflow switches from various vendors.  Also, SDN controller must not assume that all openflow switches behave well.

Connectivity tests

Though initial connections to OF controllers are normally successful,  my experience is that connectivity fails in very simple non-standard cases.  Following tests would bring out any issues associated with connectivity. Ensure that the following tests are successful.

  • Controller restart :  Once switch is connected successfully,  restart the controller and ensure that switch connects successfully with the controller. It is observed few times that either switch is not initiating the connection or controller loses some important configuration during restart and does not honor connections from some switches.  In some cases, it is also found that the controller gets the dynamic IP address, but has a fixed domain  name. But switches are incapable of taking the FQDN as the controller address. Hence it is important to ensure that this test suite is successful.
  • Controller restart & controller creating pro-active flows in the switches :  Once the switch is connected successfully and traffic is flowing normally, restart the controller and ensure that switch connects to the controller successfully and traffic continues to flow.   It is observed that many switch implementations remove all the flows when it loses the main connection to the master controller. When the controller comes back up,  it is, normally, responsibility of controller to establish the basic flows (pro-active flows).  Reactive flows are okay as they can get established again upon the packet-in.  To ensure that the controller is working properly across restarts,  it is important to ensure that this test is successful.
  • Controller application restarts, but not the entire controller:  This is very critical as applications typically restart more often due to upgrades and due to bug fixes.  One must ensure that there is no issue with traffic related to other applications on the controller. Also,  one must ensure that there are no issues when the application is restarted either with new image or with existing image.  Also, one should ensure that there are no memory leaks.
  • Switch restart :  In this test, one should ensure that switch reconnects back successfully with the controller once it is restarted.  Also, one must ensure that the pro-active flows are programmed successfully by observing that the traffic continues to flow even after the switch restarts.
  • Physical wire disconnection :  One must also ensure that the temporary discontinuity does not affect the traffic flow.  It is observed that in few cases switch realizes TCP termination and controller does not. Some switches seems to be removing the flows immediately after the main connection breaks and when connected again,  either it does not get the flow states immediately as yet times controller itself may not have known the connection termination. It is observed that any new connection coming from the switch is assumed to be duplicate and hence does not initiate flow setup process. It is required that the controller must need to assume any new main connection from the switch is connecting back after some blackout.  It should treat it as two events - Disconnection with the switch followed by new connection.  
  • Change the IP address of the switch after successful connection with the controller :  Similar to above.  One must ensure that connectivity is restored and traffic flows smoothly.
  • Run above tests while traffic is flowing :  Just to ensure the system is stable even when the controller restarts while traffic is flowing through the switches.  And also ensure that controller is good when the switch restarts while packet-ins are pending in the controller and while controller is waiting for responses from the switch, especially during multi-part message response.  One black-box way of doing this is to wait until the application creates large number of flows in a switch (Say 100K flows).  Then issue "flow list" command (typically from CLI - many controllers provide mechanism to list out the flows) and immediately restart the switch and observe the behavior of controller. Once the switch is reconnected, let the flows be created and issue "flow list" command and ensure that this list is successful.
  • Check for memory leaks :  Run 24 hour tests by restarting the switch continuously.  Ensure that the script restarts the switch software only after it connects to the controller successful and traffic flows.  You should be surprised number of problems you could find with this simple test.  After 24 hours of test, ensure that there are no memory or fd (file descriptor) leaks by observing the process statistics.
  • Check for robustness of the connectivity by running 24 hour test with flow listing from the controller.  Let the controller application create large number of flows in a switch (say 100K) and in a loop execute a script which issues various commands including "flow list".  Ensure that this test is successful and there are no memory or fd leaks.

Keep Alive tests

Controllers share the memory across all openflow switch it controls.  Yet times, one misbehaving openflow switch might consume lot of OF message buffers or memory in the controller leaving controller not respond to keep alive messages from other switches.  SDN controllers are expected to reserve some message buffers and memory to receive keep alive messages and respond to them. This reservation not only required for keep alive messages, but also for each openflow switch.  Some tests that can be run to ensure that proper fairness by the controllers are:
  • Overload the controller by running cbench, which sends lot of packet-in messages and expect controller to crate flows.  While cbench is running, ensure that ta normal switch connectivity does not suffer.  Keep this test for 1/2 hour to ensure that controller is robust enough.
  • Same test, but tweak the cbench to generate and flood keep alive messages towards the controller.  While cbench is running,  ensure a normal switch connectivity to the controller does not suffer. Keep running this test continuously for 1/2 hour to ensure that the controller is robust enough.

DoS Attack Checks

There are few tests,  I believe need to be performed to ensure that the controller code is robust enough.  Buggy switches might send truncated packets or very corrupted packets which wrong (very big) length value.
  • Check all the messages & fields in the OF specifications that have length field.  Generate message (using cbench?) with wrong length value (higher value than the message size) and ensure that the controller does not crash or controller does not stop misbehaving.  It is expected that the controller eventually terminates the connection with the switch with no memory leaks.
  • Run above test continuously to ensure that there are no leaks.
  • Messages with maximum length value in the header:  It is possible that controllers might be allocating big chunk of memory and leading to memory exhaustion.  Right thing for controller is to do is to have maximum message length and drop the message (drain) without storing it in the memory.   
I would like to hear from others on what kind of tests you believe must be performed to ensure that robustness of controllers.

Monday, March 3, 2014

SDN Controller - What to look for?


ONF is making Openflow specification as one of the standards enabling non-proprietary communication between central control plane entity & distribute data plane entities. SDN Controllers are the ones which implement control plane for various data path entities.  OVS, being part of the Linux distributions,  is becoming a defacto virtual switch entity in data centers and service provider market segments.  OVS virtual switch, sitting in the Linux host acts as a switch (data path entity) between virtual machines on the Linux host and  rest of the network.

As with virtualization of compute and storage,  networks are also being virtualized. VLAN used to be the one of the techniques  to realize virtual networks. With the limitations of number of VLANs and inability of extending virtual networks using VLANs over L3 networks,  overlay based virtual network technology is replacing VLAN technology.   VxLAN overlay protocol is becoming a choice of virtual network technology.  Today virtual switches (such as OVS) are supporting VxLAN and becoming defactor overlay protocol in data center and service provider networks.

Another important technology that is becoming popular is Openstack.  Openstack is virtual resource orchestration technology to manage virtualization of compute, storage and network resources.  Neutron component of openstack takes care of configuration & management of virtual networks,  network services such as router,  DHCP, Firewall, IPSec VPN and Load balancers.  Neutron provides API to configure these network resources.  Horizon, which is GUI of openstack provides user interface for these services.

Network Nodes (A term used by Openstack community) are the ones which normally sit at the edge of the data centers. They provide firewall capability between Internet & data center networks,  IPSec capability to terminate IPSec tunnels with the  peer networks, SSL offload capability and load balancing capability to distribute the incoming connections to various servers.  Network nodes also acts as routers between external networks & internal virtual networks.  In today networks,  network nodes are self-contained devices.  They have both control plane and data plane  in each node.  Increasingly, it is being felt that SDN concepts can be used to separate out control plane & normal path software from data plane & fast path software.

Network nodes are also being used as routers across virtual networks within data centers for east-west traffic.  Some even  use them as firewall and load balancers for east-west traffic.  Increasingly,  it is being realized that network nodes should not be burdened with the east-west traffic and rather use virtual switches within each compute node to do this job.  That is, virtual switches are being thought to be used as distributed router, firewall and load balancer for east-west traffic.

Advanced network services, which do deep inspection of packets and data, such as Intrusion Prevention,  Web application firewalls,  SSL offload are being deployed in L2 transparent mode to avoid reconfiguration of networks and also to enable vmotion easily.  When deployed as virtual appliances, it also provides agility and scale-out functions.  It requires traffic steering capability to steer the traffic across required virtual appliances.  Though most of the network services are required for north-south traffic, some of them (such as IPS) are equally needed for east-west traffic.


As one see from above introduction,  operators would like to see following supported by centralized control plane entity (SDN Controllers)
  • Realization of virtual networks
  • Control plane for network nodes 
  • Normal path software for network nodes.
  • Traffic Steering capability to steer the traffic across advanced network services
  • Distributed routing, firewall & Load balancing capability for east-west traffic.
  • Integration with Openstack Neutron

At no time, centralized entity should  become a bottleneck, hence following additional requirements come in play.

  • Scale-out of control plane entity (Clustered Controllers) - Controller Manager.
  • Performance of each control plane entity.
  • Capacity of each control plane entity.
  • Security of control plane entity

Let us dig through each one of the above.

Realization of Virtual Networks:

SDN Controller  is expected to provide following:


  • Ability to program the virtual switches in compute nodes.
  • No special agent in compute nodes.
  • Ability to use OVS  using Openflow 1.3+ 
  • Ability to realize VxLAN based virtual networks using flow  based tunneling mechanism provided by OVS.
  • Ability to realize broadcast & unicast traffic using OF groups.
  • Ability to  integrate with Openstack to come to know about VM MAC addresses and the compute nodes on which they are present.
  • Ability to use above repository to program the flow entries in virtual switches without resorting broadcasting the traffic to all peer compute nodes.
  • Ability to auto-learn VTEP entries.
  • Ability to avoid multiple data path entities in a compute nodes - One single data path for each compute node.
  • Ability to honor security groups configured in Openstack Nova. That is, ability to program flows based on security groups configuration without using 'IP tables" in the compute node. 
  • Ability to use 'Connection tracking" feature to enable stateful firewall functionality.
  • Ability to support IPSec in virtual networks across compute nodes.


Capacity is entirely based on deployment scenario.  Based on best of my knowledge, I believe these parameters are realistic from deployment perspective and also based on capability of hardware.
  • Ability to support 256 compute nodes by one controller entity.  if there are more  256 compute nodes, then more controllers in the cluster should be able to take care of rest.
  • Ability to support multiple controllers - Ability to distribute controllers across the virtual switches.
  • Support for 16K Virtual networks.
  • Support for 128K Virtual ports
  • Support for 256K VTEP entries.
  • Support for 16K IPSec transport mode tunnels


  • 100K Connections/sec per SDN Controller node (Due to firewall being taken care in the controllers).  With new feature, that is being thought in ONF, called connection template,  this requirement of 100K connections/sec can go down dramatically.  I think 50K connections/sec or connection templates/sec would be good enough.
  • 512 IPSec tunnels/sec.

Control Plane & Normal Path software for network nodes

Functionality such as router control plane,  Firewall normal path,  Load balancer normal path & control plane for IPSec (IKE) are the requirements to implement control plane for network nodes.


  • Ability to integrate with Neutron configuration of routers,  firewalls,  load balancers & IPSec.
  • Support for IPv4 & IPv6 unicast routing protocols - OSPF, BGP, RIP and IS-IS.
  • Support for IPv4 & IPv6 Multicast routing protocols - PIM-SM
  • Support for IP-tables kind of firewall normal path software.
  • Support for IKE with public key based authentication.
  • Support for LVS kind of L4 load balancing software.
  • Ability to support multiple routes, firewall, load balancer instances.
  • Ability to support multiple Openflow switches that implement datapath/fastpath functionality of network nodes.
  • Ability to receive exception packets from Openflow switches, process them through control plane/normal-path software & programming the resulting flows in openflow switches.
  • Ability to support various extensions to Openflow specifications such as
    • Bind Objects 
      • To bind client-to-server & Server-to-client flows.
      • To realize IPSec SAs
      • To bind multiple flow together for easy revalidation in case of firewalls.
    • Multiple actions/instructions to support:
      • IPSec outbound/inbound SA processing.
      • Attack checks - Sequence number checks.
      • TCP sequence number NAT with delta history table.
      • Generation of ICMP error messages.
      • Big Metadata
      • LPM table support
      • IP Fragmentation
      • IP Reassembly on per table basis.
      • Ability to go back to the tables whose ID is less than the current table ID.
      • Ability to receive all pipe line fields via packet-in and sending them back via packet-out.
      • Ability for controller to set the starting table ID along with the packet-out.
      • Ability to define actions when the flow is created or bind object is created.
      • Ability to define actions when the flow is  being deleted or bind object is being deleted.
      • Connection template support to auto-create the flows within the virtual switches.


  • Ability to support multiple network node switches - Minimum 32.
  • Ability to support multiple routers -  256 per controller node,  that is, 256 name spaces per controller node.
  • Ability to support 10K Firewall rules on per router.
  • Ability to support 256 IPSec policy rules on per router.
  • Ability to support 1K pools in LVS on per router basis.
  • Ability to support 4M firewall/Load balancer sessions.
  • Ability to support 100K IPSec SAs. (If you need to support mobile users coming in via from IPSec)


  • 100K Connections or Connection templates/sec on per controller node basis.
  • 10K IPSec SAs/sec on per controller node basis.

Traffic Steering 


  • Ability to support network service chains
  • Ability to define multiple network services in a chain.
  • Ability to define bypass rules - to bypass some services for various traffic types.
  • Ability to associate multiple network service chains to a virtual network.
  • Ability to define service chain selection rules - Select a service chain based on the the type of traffic.
  • Ability to support multiple virtual networks.
  • Ability to establish rules in virtual switches that are part of the chain.
  • Ability to support scale-out of network services.


  • Support for 4K virtual networks.
  • Support for 8 network services in each chain.
  • Support for 4K chains.
  • Support for 32M flows.


  • 256K Connections Or connection templates/sec.

Distributed Routing/Firewall/Load balancing for East-West traffic

As indicated before, virtual switches in the compute nodes should be used as data plane entity for these functions. As a controller, in addition to programming the flows to realize virtual networks and traffic steering capabilities,  it should also program flows to control the traffic based on firewall rules and direct the east-west traffic based on the routing information and load balancing decisions.


  • Ability to integrate with Openstack to get to know the routers, firewall & LB configurations.
  • Ability to act as control plane/normal-path entity for firewall & LB (Similar to network node except that it programs the virtual switches of compute nodes).
  • Ability to add routes in multiple virtual switches (Unlike in network node where the routes are added to only corresponding data plane switch).
  • Ability to support many extensions (as specified in network node section).
  • Ability to collect internal server load (For load balancing decisions).


  •  Support for 512 virtual switches.
  •  8M+ firewall/SLB entries.


  • 100K Connections/sec by one SDN controller node.

SDN Controller Manager

When there are multiple controller nodes in a cluster or multiple clusters of controllers,  I believe there is a need for a manager to manage these controller nodes.


  • Ability to on-board new clusters 
  • Ability to on-board new controller nodes and assigning them to clusters.
  • Ability to recognize virtual switches - Automatically wherever possible.  Where not possible, via on-boarding.
  • Ability to associate virtual switches to controller nodes and ability  to inform controller nodes on the virtual switches that would be connected to it.
  • Ability to schedule virtual switches to controller nodes based on controller node capabilities to take in more virtual switches.
  • Ability to act as a bridge between Openstack Neutron & SDN controller nodes in synchronizing the resources & configuration of Neutron with all SDN controller nodes.  Configuration includes:
    • Ports & Networks.
    • Routers
    • Firewall, SLB & IPSec VPN configuration.
  • Ensuring that configuration in appropriate controller node is set to avoid any race conditions.
  • Ability to set backup relations.

Securing the SDN Controller

Since SDN Controller is brain behind realization of virtual networks and network services, it is required that it is highly available and not prone to attacks. Some of the security features it should implement in my view are:
  • Support SSL/TLS based OF connections.
  • Accept connections only from authorized virtual switches.
  • Always work with  backup controller.
  • Synchronize state information with backup controller.
  • DDOS Prevention 
    • Enable Syn-Cookie mechanism.
    • Enable host based Firewall
    • Allow traffic that is of interest to SDN Controller. Drop all other traffic.
    • Enable rate limiting of the traffic.
    • Enable rate  limiting on the exception packets from virtual switches.
    • Control number of flow setups/sec.
  • Constant vulnerability asssesment.
  • Running fragroute tools and isic tools to ensure that no known vulnerabilities are present.
  • Always authenticate the configuration users.
  • Provide higher priority to the configuration traffic.
Note: If one SDN controller node is implementing all the functions listed above,  it is required to combine all performance and capacity requirements.

Check SDN controller from Freescale consisting of comprehensive feature set,  takes advantage of multiple cores to provide very high performance system. Check the details here: