Sunday, December 26, 2010

User space Packet processing applications - Execution Engine differences with processors

Please read this post to understand the Execution Engine (EE).

On many processors, each descriptor-based IO device has its own interrupt, and for each device there is a corresponding UIO device. Hence a software-poll-based EE provides a file-descriptor-based interface to register, deregister, and receive indications through callbacks. EE applications are expected to read the packets from the hardware themselves and do the rest of the processing.
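
To make the file-descriptor-based interface concrete, here is a minimal sketch of how such an EE could be built on poll(). All of the names (ee_core, ee_register_fd, ee_deregister_fd, ee_poll_once, demo_app_cb) are made up for illustration and are not from any real EE code. The demo callback reads a 32-bit value because that is what a read() on a UIO device returns (the interrupt count); a real application would then go on to process its descriptor rings.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

#define EE_MAX_FDS 16

/* Hypothetical callback type: the EE only signals that the fd is ready;
 * the application reads its own hardware descriptors. */
typedef void (*ee_fd_cb)(int fd, void *ctx);

struct ee_core {
    struct pollfd pfds[EE_MAX_FDS];
    ee_fd_cb      cbs[EE_MAX_FDS];
    void         *ctxs[EE_MAX_FDS];
    int           count;
};

static void ee_init(struct ee_core *ee) { ee->count = 0; }

/* Register a UIO (or any other) file descriptor. Returns 0 on success. */
static int ee_register_fd(struct ee_core *ee, int fd, ee_fd_cb cb, void *ctx)
{
    if (ee->count >= EE_MAX_FDS || cb == NULL)
        return -1;
    ee->pfds[ee->count].fd = fd;
    ee->pfds[ee->count].events = POLLIN;
    ee->pfds[ee->count].revents = 0;
    ee->cbs[ee->count] = cb;
    ee->ctxs[ee->count] = ctx;
    ee->count++;
    return 0;
}

/* Deregister by moving the last entry into the freed slot. */
static int ee_deregister_fd(struct ee_core *ee, int fd)
{
    for (int i = 0; i < ee->count; i++) {
        if (ee->pfds[i].fd == fd) {
            int last = --ee->count;
            ee->pfds[i] = ee->pfds[last];
            ee->cbs[i] = ee->cbs[last];
            ee->ctxs[i] = ee->ctxs[last];
            return 0;
        }
    }
    return -1;
}

/* One iteration of the software poll loop.
 * Returns the number of callbacks invoked, or -1 if poll() fails. */
static int ee_poll_once(struct ee_core *ee, int timeout_ms)
{
    int called = 0;
    int n = poll(ee->pfds, (nfds_t)ee->count, timeout_ms);
    if (n < 0)
        return -1;
    for (int i = 0; i < ee->count && n > 0; i++) {
        if (ee->pfds[i].revents & POLLIN) {
            ee->cbs[i](ee->pfds[i].fd, ee->ctxs[i]);
            called++;
            n--;
        }
    }
    return called;
}

/* Demo application callback: consume the 32-bit count from the fd and
 * bump a counter passed in through ctx. */
static void demo_app_cb(int fd, void *ctx)
{
    uint32_t irq_count;
    if (read(fd, &irq_count, sizeof irq_count) == (ssize_t)sizeof irq_count)
        (*(int *)ctx)++;
}
```

This sketch does not handle a callback that deregisters an fd while ee_poll_once is still iterating; a real EE would need to defer such removals until the loop finishes.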

The UIO-related posts discussed ways to share an interrupt across devices. As long as the application's UIO kernel driver knows the type of event for which the interrupt was generated, it can wake up the appropriate UIO FD and things work fine.

Non-descriptor-based IO is becoming quite common in recent Multicore processors. Hardware events (packets from the Ethernet controllers, results from the acceleration engines) are given to the software through a set of HW interfaces. The hardware selects the HW interface based on some load-balancing algorithm or on some software inputs. The point is that the events given to the software through one HW interface come from multiple hardware IO sources. Each HW interface is normally associated with one interrupt. One might say that this can be treated as an interrupt shared across multiple devices. But some Multicore processors provide no facility to find the reason for the HW interrupt, nor do they provide a facility to find the type of the first pending event in the HW interface. Unless the event is dequeued from the HW interface, it is impossible to know its type. Also, due to interrupt coalescing, a given interrupt instance might represent multiple events from different IO source devices.

Due to this behavior, there may be only one UIO device for multiple IO devices. Hence the responsibility of demultiplexing these events to the right EE application falls on the EE itself. The EE needs to read the event, find the right application, and call the callback function that application registered. Let us call this functionality in the EE 'EE Event DeMux'.

In descriptor-based systems, each EE application is expected to read its own HW events (packets and acceleration results). The callback invocation only indicates to the EE application that it should read the events from the associated hardware descriptors. In the case of 'EE Event DeMux', the event has already been read by the EE itself. Hence the event is expected to be passed to the callback function.
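
The difference in callback shape can be sketched as follows. This is only an illustration under my own assumptions: the event layout, the source identifiers, and the names ee_event, ee_demux, and ee_demux_dispatch are all hypothetical. The point it shows is that the DeMux callback receives the already-dequeued event, whereas a descriptor-style callback receives only an indication.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical IO sources that share one HW interface. */
enum ee_src { EE_SRC_ETH = 0, EE_SRC_CRYPTO = 1, EE_SRC_MAX };

/* Hypothetical event: its source is known only once it has been dequeued. */
struct ee_event {
    enum ee_src src;
    void       *payload;
};

/* DeMux callbacks are handed the event itself. */
typedef void (*ee_event_cb)(const struct ee_event *ev, void *ctx);

struct ee_demux {
    ee_event_cb cbs[EE_SRC_MAX];
    void       *ctxs[EE_SRC_MAX];
};

/* Find the application registered for the event's source and pass the
 * event to it. Returns -1 if nobody is registered for that source. */
static int ee_demux_dispatch(const struct ee_demux *dm,
                             const struct ee_event *ev)
{
    int s = (int)ev->src;
    if (s < 0 || s >= EE_SRC_MAX || dm->cbs[s] == NULL)
        return -1;
    dm->cbs[s](ev, dm->ctxs[s]);
    return 0;
}

/* Demo application: just counts the events it is given. */
static void demo_count_cb(const struct ee_event *ev, void *ctx)
{
    (void)ev;
    (*(int *)ctx)++;
}
```

What to do with an event that has no registered application (drop it, count it, free its buffers) is left to the caller here, since that depends on the hardware.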

With the software poll method, the 'EE Event DeMux' submodule registers itself with the rest of the EE module to get the UIO indication. With the hardware poll method, 'EE Event DeMux' is invoked by the hardware poll function.

Multicore processors normally provide the HW interface only for the IO devices that are part of the Multicore processor itself. External devices, such as PCI and other HW-bus-based IO devices, are still implemented using the descriptor-based mechanism. A software-poll-based EE should not assume that all devices are served through 'EE Event DeMux'. As far as the core software poll system is concerned, 'EE Event DeMux' is just another EE application. With the hardware poll method, if descriptor-based HW interfaces are in use, then the hardware poll function should poll those interfaces as well.
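
For the hardware poll case, one simple way to cover both kinds of interfaces is to treat each of them as a poll source and visit them all in every iteration. The sketch below assumes a hypothetical hp_source table and hp_poll_all function; demo_drain is only a stand-in for a real poll routine such as the DeMux or a PCI descriptor ring handler.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical poll routine: handles pending events, returns how many. */
typedef unsigned (*hp_poll_fn)(void *ctx);

struct hp_source {
    hp_poll_fn poll;
    void      *ctx;
};

/* One iteration of the hardware poll loop: visit the DeMux HW interface
 * and every descriptor-based interface in turn. */
static unsigned hp_poll_all(const struct hp_source *srcs, size_t n)
{
    unsigned total = 0;
    for (size_t i = 0; i < n; i++)
        total += srcs[i].poll(srcs[i].ctx);
    return total;
}

/* Demo stand-in: "handles" whatever count is pending and clears it. */
static unsigned demo_drain(void *ctx)
{
    unsigned *pending = ctx;
    unsigned n = *pending;
    *pending = 0;
    return n;
}
```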

When 'EE Event DeMux' is used by EE applications (such as the Ethernet driver and accelerator drivers), it needs to meet the following requirements.
  • It should have its own 'Quota': the maximum number of events it reads from the HW interface in one callback invocation by the EE core. Once it has read 'Quota' events, or if there are no more events, it should return to the 'Core EE' module.
  • Since this is the module that demuxes events to the EE applications, it should provide its own register/deregister functions.
  • When the 'Core EE' module invokes this module's callback function, due to an interrupt or due to hardware poll, it is expected to read at most 'Quota' events, as described above. Before giving control back to the 'Core EE', it is expected to notify the EE applications that there are no more events in this iteration. Some EE applications might register to get this indication. For example, the Ethernet driver application might register for it in order to do 'Generic Receive Offload' (GRO). GRO needs to know when to give up while coalescing TCP segments. With descriptor-based drivers this issue does not arise, because each Ethernet driver reads the events itself when the EE invokes its callback, and so knows when to give up.
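
The three requirements above can be sketched together as follows. Everything here is an assumption made for illustration: the dm_* names, the event layout, and the small ring that stands in for the HW interface. The demo application mimics a GRO-like user that accumulates work per event and flushes it when told the iteration is over.

```c
#include <assert.h>
#include <stddef.h>

#define DM_SRC_MAX   4
#define DM_QUEUE_LEN 64u

struct dm_event { int src; int data; };

/* Stand-in for the HW interface: a simple ring (hypothetical). */
struct dm_hw_if {
    struct dm_event q[DM_QUEUE_LEN];
    unsigned head, tail;
};

static int dm_hw_enqueue(struct dm_hw_if *hw, struct dm_event ev)
{
    if (hw->tail - hw->head == DM_QUEUE_LEN)
        return -1;                      /* ring full */
    hw->q[hw->tail++ % DM_QUEUE_LEN] = ev;
    return 0;
}

static int dm_hw_dequeue(struct dm_hw_if *hw, struct dm_event *out)
{
    if (hw->head == hw->tail)
        return -1;                      /* ring empty */
    *out = hw->q[hw->head++ % DM_QUEUE_LEN];
    return 0;
}

typedef void (*dm_event_cb)(const struct dm_event *ev, void *ctx);
typedef void (*dm_batch_end_cb)(void *ctx);

struct dm_app {
    dm_event_cb     on_event;
    dm_batch_end_cb on_batch_end;       /* optional, may be NULL */
    void           *ctx;
};

struct dm {
    struct dm_app apps[DM_SRC_MAX];
    unsigned      quota;                /* max events per invocation */
};

/* The DeMux's own register/deregister functions, keyed by event source. */
static int dm_register(struct dm *d, int src, dm_event_cb on_event,
                       dm_batch_end_cb on_batch_end, void *ctx)
{
    if (src < 0 || src >= DM_SRC_MAX || on_event == NULL)
        return -1;
    d->apps[src].on_event = on_event;
    d->apps[src].on_batch_end = on_batch_end;
    d->apps[src].ctx = ctx;
    return 0;
}

static int dm_deregister(struct dm *d, int src)
{
    if (src < 0 || src >= DM_SRC_MAX || d->apps[src].on_event == NULL)
        return -1;
    d->apps[src].on_event = NULL;
    d->apps[src].on_batch_end = NULL;
    d->apps[src].ctx = NULL;
    return 0;
}

/* Called by the Core EE on interrupt or hardware poll. Reads at most
 * 'quota' events, then tells every application that asked for it that
 * this iteration is over. Returns the number of events read. */
static unsigned dm_run(struct dm *d, struct dm_hw_if *hw)
{
    unsigned n = 0;
    struct dm_event ev;
    while (n < d->quota && dm_hw_dequeue(hw, &ev) == 0) {
        n++;
        if (ev.src >= 0 && ev.src < DM_SRC_MAX && d->apps[ev.src].on_event)
            d->apps[ev.src].on_event(&ev, d->apps[ev.src].ctx);
    }
    for (int s = 0; s < DM_SRC_MAX; s++)
        if (d->apps[s].on_batch_end)
            d->apps[s].on_batch_end(d->apps[s].ctx);
    return n;
}

/* Demo GRO-like application: hold packets, flush at end of iteration. */
struct demo_gro { int pending; int flushed; int flushes; };

static void demo_on_event(const struct dm_event *ev, void *ctx)
{
    (void)ev;
    ((struct demo_gro *)ctx)->pending++;
}

static void demo_on_batch_end(void *ctx)
{
    struct demo_gro *g = ctx;
    g->flushed += g->pending;
    g->pending = 0;
    g->flushes++;
}
```

In this sketch the end-of-iteration callback goes to every application that registered one, even if it received no events in that iteration. Notifying only the applications that actually received events would be an equally reasonable choice.
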
Thanks for reading my earlier post.  I hope this helps.
