Sunday, April 4, 2010

Data Center Switch requirements for new Data Center Architectures

Traditionally, data centers have had three tiers of switches: Core switches, Aggregation switches, and Access switches.
  • Core Switches:  These switches connect to the networks that carry the WAN links. This is the switch tier farthest from the servers.
  • Access Switches:  These switches are also called top-of-rack switches.  Servers (web servers, email servers, application servers, database servers, and the others for which the data center is built) connect to the ports of these switches.
  • Aggregation Switches:  The aggregation switches form an intermediate layer sandwiched between the core and access layers, aggregating the traffic between them.  Note that there can be a lot of traffic among servers (specifically among application, web, and database servers). This traffic does not need to be seen by the core switches; it only needs to flow between access layer switches.  The aggregation layer keeps every switch from having to see all of this traffic: core switches see only the traffic going to or coming from the WAN/corporate network, and the aggregation layer also reduces the traffic among access layer switches.
It was necessary to have three tiers in earlier data center architectures because:
  •  A large number of physical machines serving content requires a large number of Ethernet ports.  Due to the poor port density of switches, multiple access layer switches were necessary.  More switches means much more traffic crossing between access layer switches.  An additional tier of switches provides good throughput by eliminating the need for a full mesh of access layer switches to carry inter-switch traffic.
What are some of the changes in data centers? One big change is the collapse of three tiers into two: the aggregation layer is disappearing.  Let us see what is driving this change.
  • Virtualization technology is reducing the number of physical machines:  This implies that fewer ports are needed.
  • Traffic on each port is increasing:  Virtualization and multicore processors are enabling multiple applications on one physical machine.  It is not uncommon to see a requirement for multiple gigabits of traffic on a single port.
  • 10G ports, and in the future 40G/100G ports, are enabling a unified fabric for both kinds of traffic - application traffic and SAN traffic - thus reducing the number of ports and interconnects.
These technologies reduce cost by reducing equipment, interconnects, and the amount of power and cooling required. They also reduce maintenance, and hence cost further.

What kind of features would one expect in the switches of new data centers?
  • Traffic latency should be very low:  Eliminating the aggregation layer itself reduces latency, but that is not good enough for SAN traffic or video and voice workloads.  Non-blocking or cut-through switching is expected to support real-time traffic such as video and voice.  Traditionally, switches oversubscribe the bandwidth; that is, they are not capable of receiving and transmitting traffic on all ports at the same time at full port bandwidth, so packets get blocked.  Non-blocking switches are expected to send and receive traffic equal to the number of ports times each port's bandwidth.  If there are ten 1G ports, the switch is expected to receive 10G of traffic and send 10G of traffic.
    • 802.1Qbb (Priority-based Flow Control):  When there is congestion at the receiving node, an 802.3x pause frame is normally generated. This pauses all traffic for some time. This standard allows pause frame generation per 802.1p priority level, which lets the high priority traffic keep flowing. Switches are expected to honor and generate these kinds of frames.
    • 802.1Qaz (Enhanced Transmission Selection):  This standard allows bandwidth allocation for individual priority levels or groups of priority levels.  It lets bandwidth reserved for higher priority traffic be consumed by lower priority traffic when there is no higher priority traffic.  SAN traffic would need to go at the higher priority levels. This feature is also expected to be supported by data center switches.
    • 802.1Qau (Congestion Notification):  This standard allows end nodes to communicate congestion notifications.  It lets the end node receiving a congestion notification apply rate limiting on its outgoing traffic.  This feature is also expected to be supported by data center switches.
  • Port Density should be high.
  • Multi-path support is required - I am not sure whether there is any standard at this time, but spanning tree is not used in these cases as it provides only one path.
  • VEPA Support would be required eventually. Due to VEPA,  it may need to support C-VLAN and P-VLANs.
  • Support for a large number of VLANs is required to work with other network services such as ADCs, WAN optimization, and network security (firewall, IPS, IPSec VPN, etc.).
  • Ability to redirect traffic based not only on L2 and L3 fields, but also on L4 fields such as TCP and UDP source and destination ports.
  • The switch architecture should work with VM migration from one physical server to another.
  • Public data center networks require a virtual-instance kind of concept within the switches to reuse VLANs across different subscribers, due to the limited number of VLAN IDs.
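The non-blocking arithmetic in the latency bullet above is easy to sanity-check. Here is a back-of-the-envelope sketch (the function names are mine, purely illustrative):

```python
def required_fabric_capacity_gbps(port_count, port_speed_gbps):
    """A non-blocking switch must forward all ports at line rate at once,
    so the required capacity is ports * speed (in each direction)."""
    return port_count * port_speed_gbps

def oversubscription_ratio(port_count, port_speed_gbps, fabric_capacity_gbps):
    """A ratio greater than 1 means the fabric can block under full load."""
    return (port_count * port_speed_gbps) / fabric_capacity_gbps

# The ten 1G port example from the text: the switch must handle 10G
# in each direction to be non-blocking.
print(required_fabric_capacity_gbps(10, 1))  # 10
# A fabric that can only switch 4G behind those same ten ports is
# 2.5:1 oversubscribed.
print(oversubscription_ratio(10, 1, 4))  # 2.5
```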
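The difference between 802.3x pause and 802.1Qbb per-priority pause described above can be shown with a toy model. The eight 802.1p priority values are real; everything else here is an illustrative sketch, not switch code:

```python
PRIORITIES = range(8)  # the eight 802.1p priority values, 0-7

def pause_802_3x(congested_priorities):
    """Classic 802.3x PAUSE stops the whole link, no matter which
    priority's queue actually filled up (the argument is ignored)."""
    return {p: "paused" for p in PRIORITIES}

def pause_802_1qbb(congested_priorities):
    """PFC pauses only the congested priorities; the rest keep flowing."""
    return {p: ("paused" if p in congested_priorities else "flowing")
            for p in PRIORITIES}

# Suppose the queue for priority 3 (a common choice for storage) backs up:
print(pause_802_3x({3}))    # every priority ends up paused
print(pause_802_1qbb({3}))  # only priority 3 is paused
```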
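The 802.1Qaz borrowing behavior (idle high priority bandwidth consumed by lower priority traffic) can be modeled in a few lines. This is a toy fluid model under my own assumptions, not the standard's scheduler; the names and the redistribution loop are illustrative:

```python
def ets_share(link_gbps, weights, demand):
    """Toy 802.1Qaz-style sharing: each priority group gets its configured
    fraction of the link, and any share a group does not use is handed to
    groups that still have unmet demand."""
    alloc = {g: link_gbps * w for g, w in weights.items()}
    served = {g: min(alloc[g], demand[g]) for g in weights}
    spare = link_gbps - sum(served.values())
    while spare > 1e-9:
        hungry = [g for g in weights if demand[g] > served[g]]
        if not hungry:
            break
        piece = spare / len(hungry)
        spare = 0.0
        for g in hungry:
            extra = min(piece, demand[g] - served[g])
            served[g] += extra
            spare += piece - extra  # return what this group could not use
    return served

# SAN group configured for 60% of a 10G link but only offering 2G:
# the LAN group borrows the idle 4G on top of its own 4G share.
print(ets_share(10, {"san": 0.6, "lan": 0.4}, {"san": 2, "lan": 9}))
```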
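The L4 redirection bullet above amounts to classifying on fields beyond the L2/L3 headers. A minimal sketch of such a rule table, with field and key names of my own invention:

```python
def matches(rule, pkt):
    """A rule matches when every field it specifies equals the packet's."""
    return all(pkt.get(k) == v for k, v in rule["match"].items())

rules = [
    # Redirect anything destined to TCP port 80 toward (say) an ADC on port 12.
    {"match": {"ip_proto": "tcp", "dst_port": 80}, "redirect_port": 12},
]

pkt = {"src_mac": "aa:bb:cc:00:00:01", "ip_proto": "tcp", "dst_port": 80}
for rule in rules:
    if matches(rule, pkt):
        print(rule["redirect_port"])  # 12
```

A switch that can only match L2/L3 fields cannot express the rule above at all, which is the point of the requirement.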
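The virtual-instance idea in the last bullet can be sketched too: the 12-bit VLAN ID space (1-4094 usable values) is real, and the trick is simply to key forwarding state on (instance, VLAN) rather than on the VLAN ID alone. The class and method names below are illustrative:

```python
class VirtualInstanceSwitch:
    """Toy forwarding table keyed on (instance, vlan, mac), so different
    subscribers can reuse the same VLAN ID without colliding."""

    def __init__(self):
        self.fdb = {}  # (instance, vlan, mac) -> egress port

    def learn(self, instance, vlan, mac, port):
        if not 1 <= vlan <= 4094:
            raise ValueError("VLAN ID must fit in 12 bits (1-4094)")
        self.fdb[(instance, vlan, mac)] = port

    def lookup(self, instance, vlan, mac):
        return self.fdb.get((instance, vlan, mac))

sw = VirtualInstanceSwitch()
# Two subscribers both use VLAN 100; the instance keeps them apart.
sw.learn("subscriberA", 100, "aa:bb:cc:00:00:01", port=1)
sw.learn("subscriberB", 100, "aa:bb:cc:00:00:01", port=7)
print(sw.lookup("subscriberA", 100, "aa:bb:cc:00:00:01"))  # 1
print(sw.lookup("subscriberB", 100, "aa:bb:cc:00:00:01"))  # 7
```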
