There is plenty of talk about bonding and Multipath I/O (MPIO), but it is very difficult to find solid information about either one. The documentation that can be found is typically very bulky, and the most important practical questions go unanswered.
As a result, the following questions are often heard:
- When should I use bonding and when should I use multipath?
- I was expecting better throughput with bonding; why am I not seeing it?
- My RAID array shows 400 MB/sec in a local test; how can I get 400 MB/sec outside?
Before we answer the above questions, let's first understand how MPIO and bonding work.
MPIO allows a server with multiple NICs to transmit and receive I/O across all available interfaces to a corresponding MPIO-enabled server. If a server has two 1Gb NICs and the storage server has two 1Gb NICs, the theoretical maximum throughput would be about 200 MB/s.
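To make the single-stream behaviour concrete, here is a minimal Python sketch of an MPIO-style round-robin path selector. The class and path names are hypothetical and only illustrate the policy; in practice this logic lives inside the OS initiator (for example Windows MPIO or Linux dm-multipath), not in application code.

```python
# Minimal sketch of an MPIO-style round-robin path selector.
# Names are illustrative only; real MPIO is implemented by the OS initiator.
from itertools import cycle


class MultipathSession:
    def __init__(self, paths):
        # paths: list of NIC/path identifiers, e.g. ["nic0", "nic1"]
        self.paths = list(paths)
        self._next_path = cycle(self.paths)

    def submit_io(self, request):
        # Successive I/Os of the SAME logical stream rotate over the paths,
        # which is why one stream can fill both 1 Gb links (~200 MB/s combined).
        path = next(self._next_path)
        return path, request


session = MultipathSession(["nic0", "nic1"])
for block in range(6):
    path, _ = session.submit_io(f"block-{block}")
    print(block, "->", path)  # blocks alternate: nic0, nic1, nic0, ...
```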
Link aggregation (LACP, 802.3ad, etc.) via NIC teaming does not work the same way as MPIO. Link aggregation does not improve the throughput of a single I/O flow; a single flow will always traverse only one path. The benefit of link aggregation is seen when several unique flows exist, each from a different source. Each individual flow is sent down its own NIC interface, which is determined by a hash algorithm. Thus, with more unique flows spread over more NICs, aggregate throughput is greater. Link aggregation will not provide improved throughput for iSCSI, although it does provide a degree of redundancy.
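A small sketch may help show why a single flow never exceeds one link's speed under link aggregation: the bonding driver derives a hash from the flow's addresses, and that hash always maps the same flow to the same NIC. Real drivers typically XOR MAC or IP/port bits; the MD5 hash and interface names below are only stand-ins for that policy.

```python
# Sketch of 802.3ad-style transmit link selection: a hash of the flow's
# addresses picks the NIC, so one flow always lands on the same link.
import hashlib

NICS = ["eth0", "eth1"]  # two bonded 1 Gb links (illustrative names)


def pick_link(src_mac, dst_mac):
    digest = hashlib.md5(f"{src_mac}-{dst_mac}".encode()).digest()
    return NICS[digest[0] % len(NICS)]


# The same flow (same source and destination) always hashes to one link:
print(pick_link("aa:aa:aa:aa:aa:01", "bb:bb:bb:bb:bb:01"))
print(pick_link("aa:aa:aa:aa:aa:01", "bb:bb:bb:bb:bb:01"))  # same NIC again

# Different workstations (different source MACs) spread across the links:
for ws in range(4):
    src = f"aa:aa:aa:aa:aa:{ws:02x}"
    print(src, "->", pick_link(src, "bb:bb:bb:bb:bb:01"))
```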
Bonding works between a server and a switch. Numerous workstations, each using a single NIC connected to the switch, will benefit from bonded connections between the switch and the storage server.
MPIO works between a storage server and the client server, whether or not there is a switch in the path.
With these basic facts established, it will now be easier to answer our questions.
Q: When do I need bonding, and when is multipath appropriate?
A: Bonding works for a NAS server with multiple workstations connected.
MPIO works between initiators and targets over FC or iSCSI. An example of an MPIO configuration with a performance test showing 200 MB/sec using dual Gb NICs is demonstrated step-by-step at: How to configure DSS V6 MPIO with Windows 2008 Server.
In short:
- Bonding works for NAS
- MPIO works for SAN
Q: I was expecting better throughput with bonding; why does it not work in my case?
A: Let us consider 4 workstations and a storage server. Bonding will increase performance only if the single 1 Gb connection between the storage server and the switch is a bottleneck. This can happen when the total aggregate benchmark of all four workstations is a bit above 100 MB/sec. In such a case it is possible that the RAID array can move more data and the bottleneck is the single 1 Gb NIC. If we add a second NIC and create a bond, the total aggregate performance may improve. This will happen if the RAID array is fast enough to move more than 100 MB/sec with four streams. However, as more streams become active, RAID array performance may drop, because the access pattern moves closer to random than sequential. A sequential test pattern is valid for one single data stream only. In a sequential-pattern performance test we can observe a very high data rate, but with random patterns it can be very low, because the hard disks need to make frequent seeks. In practical applications it is very difficult to make effective use of more than 2 NICs for bonding.
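The arithmetic behind this answer can be sketched as follows. The RAID throughput figures are assumptions chosen only to show where the bottleneck moves as streams and NICs are added; they are not measurements.

```python
# Back-of-the-envelope bottleneck check for the 4-workstation example.
# All throughput numbers are illustrative assumptions, not measurements.

GBE_LINK_MB_PER_S = 110  # rough usable payload of one 1 Gb link, in MB/s


def raid_throughput(streams):
    # Assumed RAID behaviour: fast for a single sequential stream,
    # dropping as more concurrent streams make the pattern random.
    profile = {1: 400, 2: 250, 4: 150, 8: 80}
    return profile.get(streams, 60)


def effective_throughput(streams, bonded_nics):
    network_limit = bonded_nics * GBE_LINK_MB_PER_S
    return min(raid_throughput(streams), network_limit)


for nics in (1, 2):
    total = effective_throughput(4, nics)
    print(f"{nics} NIC(s): ~{total} MB/s aggregate for 4 workstations")
# With 1 NIC the network (~110 MB/s) is the bottleneck; with 2 NICs the
# RAID array (~150 MB/s at 4 streams) becomes the limit instead.
```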
Q: My RAID array shows 400 MB/sec in a local sequential benchmark test; how can I get 400 MB/sec outside?
A: It is practically impossible to get these kinds of results with NAS and bonding, because bonding works with multiple workstations and this results in predominantly random patterns. The same local test with random patterns may drop well below 100 MB/s, and then even a single NIC will not be the bottleneck; instead, the RAID array (the hard disks) will be. With MPIO we can keep sequential-pattern performance, because it can operate with a single data flow. Thus, with 2 NICs in MPIO we can observe 200 MB/sec, and with 4 NICs, 400 MB/sec.
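Assuming the 400 MB/sec local figure really is a sequential result and MPIO preserves that pattern, the expected external rate is simply the smaller of the RAID's sequential rate and the combined NIC speed. A small sketch with illustrative numbers:

```python
# Sketch: expected external throughput of a single sequential stream over
# MPIO. The figures below are illustrative, not measured values.

LOCAL_SEQUENTIAL_MB_PER_S = 400  # local RAID sequential benchmark from the question
GBE_LINK_MB_PER_S = 100          # rough payload of one 1 Gb link, in MB/s


def mpio_throughput(nics):
    return min(LOCAL_SEQUENTIAL_MB_PER_S, nics * GBE_LINK_MB_PER_S)


for nics in (1, 2, 4):
    print(f"{nics} NIC(s): ~{mpio_throughput(nics)} MB/s")
# 1 NIC -> ~100, 2 NICs -> ~200, 4 NICs -> ~400 MB/s, matching the answer above.
```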
10 Comments
dan pugh / 14, 12 2010 09:57:45
interesting, surely what you describe is different depending on circumstances
lacp bonds together 2 x 1gb links and (assuming it is switch assisted) uses a single virtual mac address so 1 flow can = 200
mpio can = 200 when 2 flows but otherwise hits maximum with 1 flow of 100
network load balancing (adaptor teaming) is maxed out like mpio at 1-in and 1-out so 100
although there are several options for load balancing, overall throughput is not increased unless the switch helps
that's how I understood it anyway
http://en.wikipedia.org/wiki/Link_aggregation
“Network interface cards (NICs) trunked together can also provide network links beyond the throughput of any one single NIC”
Microsoft doesn't support adaptor teaming for iSCSI, but you can bond using lacp at the storage end.
the testing (lacp bonded iscsi with a cisco switch in port-channel) I've just done seems to be slower with mpio, but the failover cluster validation seems happier with mpio
could be wrong of course 😉
Bruno PELZER – CELEM / 16, 03 2011 01:25:59
If I use MPIO (3 gigabit NICs) and cluster failover for my iSCSI connection, my gigabit replication link is the bottleneck and performance is about 100 Mbps instead of 300 Mbps… If I disable cluster failover my performance is better (about 300 Mbps). How can I get very good performance (300 Mbps) WITH cluster failover enabled? I have a solution: using a 10 Gigabit Ethernet link between 2 Open-E SANs (for the replication link), but it's a very expensive solution… Do you have another idea?
TooMeeK / 04, 07 2011 11:03:34
eee… what? with a good switch and bonding mode round-robin (just like iSCSI MCS does) you can get raw performance at twice the Gigabit link speed (or more), for NFS for example. Check this with iperf; you should get 1.8-1.9 Gbit/s even with a single connection. It utilises all slaves.
But a better solution is to always have a redundant link (2 switches at least) and use a different bonding mode (balance-alb) or many paths for iSCSI. This means a separate IP for each network card.
NeltStevy / 28, 11 2012 12:49:35
Thanks for the info, I was looking for this kind of explanation.