Real World Deployment Scenarios with VMware Networking Solutions: Scaled up Virtualization at Medtronic
Heath Reynolds
Outline
• Large Workload vMotion Challenges
• Enabling Multiple-NIC vMotion
• Traffic Flow Considerations
• QOS
  • NIOC
  • Class Based QOS on the 1000v
• Real World Design Discussion
  • Quad 10Gb CNA / 1000v / FCoE on UCS
• Questions
Challenges with Large Workloads
At Medtronic:
• 78% of servers are virtualized; the low-hanging fruit is already gone
• Remaining physical servers are 64GB and larger – Exchange, Oracle, SQL, SAP middleware
• Experienced vMotion failures with large workloads on ESX 4.1
• Aging VMware hosts (3+ years)
Requirements for a new environment:
• Reduced physical footprint
• Support for a few guests up to 256GB (current requests are for 128GB)
• High consolidation ratio – 100+ VMs per host
• Network cable consolidation and operationalized support
Large Workload vMotion
Two key features of ESX 5 provide better support for vMotion of larger workloads than previous versions:
• Multiple-NIC vMotion provides more bandwidth to the vMotion process
  • More bandwidth is always better: the faster the pre-copy phase completes, the less time the guest has to dirty pages
  • Reduced time to evacuate a host going into maintenance mode
• Stun During Page-Send (SDPS)
  • SDPS can induce small delays in processor scheduling, reducing the rate at which the guest is "dirtying" memory pages
  • Guest performance is only reduced if the guest is "dirtying" memory pages faster than vMotion can pre-copy them
Multiple-NIC vMotion Performance
With QOS + FCoE, without jumbo frames
[Chart: vMotion throughput (Gb per second) on the 1000v / UCS 6248 FI, scaling from one to two to four 10G CNAs]
Enabling Multiple-NIC vMotion
• Follow best practices and use dedicated VMkernel interfaces for management, vMotion, storage, etc.
• Create a vMotion VMkernel interface for each physical NIC you would like to use for vMotion traffic (see the sketch below)
• For all practical purposes, the vMotion VMkernel interfaces need to be backed by the same VLAN and addressed within the same subnet
• All VMkernel interfaces enabled for vMotion will be used for both single and multiple concurrent vMotions
• Supports up to 16 interfaces with 1Gb NICs, or 4 interfaces with 10Gb
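A minimal sketch of this from the ESXi shell, assuming standard vSwitch port groups named vMotion-A / vMotion-B and example vmk numbers and addresses (none of these names come from the deck); the same result can be reached through the vSphere Client:

  # Create one vMotion VMkernel interface per uplink (example names/addresses)
  esxcli network ip interface add --interface-name=vmk1 --portgroup-name=vMotion-A
  esxcli network ip interface ipv4 set --interface-name=vmk1 --ipv4=192.168.50.11 --netmask=255.255.255.0 --type=static
  esxcli network ip interface add --interface-name=vmk2 --portgroup-name=vMotion-B
  esxcli network ip interface ipv4 set --interface-name=vmk2 --ipv4=192.168.50.12 --netmask=255.255.255.0 --type=static
  # Tag both interfaces for vMotion (ESXi 5.1 and later; on earlier builds enable vMotion in the vSphere Client)
  esxcli network ip interface tag add --interface-name=vmk1 --tagname=VMotion
  esxcli network ip interface tag add --interface-name=vmk2 --tagname=VMotion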
VMkernel to NIC Association – vSwitch
VMkernel to NIC Association – vDS
• Create dvPortGroups before creating VMkernel adapters
• Create one dvPortGroup for each physical NIC you want to carry vMotion traffic (a PowerCLI sketch follows this list)
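A minimal PowerCLI sketch of that layout, assuming a vDS named dvSwitch0, uplinks named dvUplink1 / dvUplink2, and VLAN 50 for vMotion (all example values, not from the deck); each port group keeps only one uplink active so each VMkernel interface maps to a single physical NIC:

  # One dvPortGroup per physical NIC carrying vMotion (example names/VLAN)
  $vds = Get-VDSwitch -Name "dvSwitch0"
  New-VDPortgroup -VDSwitch $vds -Name "vMotion-A" -VlanId 50
  New-VDPortgroup -VDSwitch $vds -Name "vMotion-B" -VlanId 50

  # Pin each port group to a different active uplink; leave the other unused
  Get-VDPortgroup -Name "vMotion-A" | Get-VDUplinkTeamingPolicy |
    Set-VDUplinkTeamingPolicy -ActiveUplinkPort "dvUplink1" -UnusedUplinkPort "dvUplink2"
  Get-VDPortgroup -Name "vMotion-B" | Get-VDUplinkTeamingPolicy |
    Set-VDUplinkTeamingPolicy -ActiveUplinkPort "dvUplink2" -UnusedUplinkPort "dvUplink1"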
VMkernel to NIC Association – 1000v
vPC-HM with MAC pinning:
• Operates in a similar manner to the vSwitch and vDS default options: each VMkernel interface is pinned to a physical NIC
• "channel-group auto mode on mac-pinning" enables it in the Ethernet (uplink) port-profile on the 1000v
• "show port-channel internal info all" shows the pinning IDs
• Apply the "pinning id" command to pin a VMkernel interface's port-profile to a NIC (see the sketch after this list):
  • port-profile type vethernet vMotionA / pinning id 1
  • port-profile type vethernet vMotionB / pinning id 2
• Verify: module vem <module#> execute vemcmd show pinning
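Putting those commands together, a minimal port-profile sketch; the profile names, vMotion VLAN 50, and trunk settings are assumptions, not from the deck:

  port-profile type ethernet system-uplink
    vmware port-group
    switchport mode trunk
    channel-group auto mode on mac-pinning
    no shutdown
    state enabled

  port-profile type vethernet vMotionA
    vmware port-group
    switchport mode access
    switchport access vlan 50
    pinning id 1
    no shutdown
    state enabled

  port-profile type vethernet vMotionB
    vmware port-group
    switchport mode access
    switchport access vlan 50
    pinning id 2
    no shutdown
    state enabled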
VMkernel to NIC Association – 1000v
vPC LACP:
• Traditional LACP-based EtherChannel (active or passive)
• The upstream switch needs to support multi-chassis EtherChannel
• "channel-group auto mode active" enables it in the Ethernet (uplink) port-profile on the 1000v
• vMotion VMkernel traffic is distributed among the member interfaces based on the selected load-balancing algorithm
• "port-channel load-balance ethernet" changes the algorithm
• If the default isn't distributing vMotion traffic evenly, try "source-ip-vlan" and use consecutive IP addresses for the vMotion-enabled VMkernel interfaces on a host
• Use increments of 2, 4, or 8 ports for even distribution
(See the configuration sketch after this list.)
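The corresponding uplink configuration, again as a sketch with an assumed profile name:

  port-channel load-balance ethernet source-ip-vlan

  port-profile type ethernet system-uplink
    vmware port-group
    switchport mode trunk
    channel-group auto mode active
    no shutdown
    state enabled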
Traffic Flow During a vMotion
• vCenter steps through the list of vMotion VMkernel adapters on each host in the order they were presented to vCenter and pairs them off
• Speed mismatches are handled by bandwidth – multiple 1Gb NICs can be paired to a single 10Gb NIC
• There isn't a way to reliably control which interfaces are paired up; this could lead to vMotion traffic overwhelming switch interconnects
  • A dedicated vMotion switch avoids switch interconnect issues
  • Multi-chassis EtherChannel eliminates switch interconnect issues
  • If the NICs aren't dedicated to vMotion, use QOS
vMotion Traffic Flow
Possibility of Inter-switch Traffic
Network IO Control
• Only available on the vDS (requires Enterprise Plus)
• Has built-in resource pools for classes of system traffic such as vMotion, management, iSCSI, and NFS
• Traffic shares assign a relative importance to traffic, which is used to create minimum bandwidth reservations on a per-dvUplink basis (a worked example follows this list)
• Only applies to outbound traffic
• Limits are used to cap traffic on a per-vDS basis
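As a worked example with assumed share values (not from the deck): if vMotion has 50 shares, NFS 100, and management 50 contending on a saturated 10Gb dvUplink, vMotion's minimum works out to 50 / (50 + 100 + 50) × 10Gb = 2.5Gb/s. When the other classes are idle, vMotion can still consume the full link, since shares only take effect under contention.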
Class Based WFQ on the 1000v
• CBWFQ QOS provides minimum bandwidth reservations on a per-physical-port basis
• Provides built-in protocol matches to classify N1KV, vMotion, management, and storage traffic
• Only applies to outbound traffic
• QOS on the 1000v is a three-step process:
  1. Define traffic classes using the class-map command
  2. Create a traffic policy with the policy-map command
  3. Attach the traffic policy to an interface or port-profile with the service-policy command
• See Appendix 1 for a CBWFQ QOS configuration example
Real World Design Discussion – Four 10Gb FCoE CNAs on UCS
Design goals:
• Support large workloads with up to 256GB of RAM
• Operationalize support into existing frameworks
• Support both FC and NFS storage to consolidate existing farms
• UCS C460 with 1TB of RAM, two P81E dual-port FCoE VICs
• UCS 6248 Fabric Interconnect with 2232 FEX
• 40Gb vPC uplink from each fabric interconnect
• Nexus 1010X / 1000v
• 4 vNICs and 4 vHBAs presented to ESX
• Four vMotion VMkernel interfaces per host
• Currently running at a consolidation ratio of 157:1 – replaced 96 ESX hosts with 8
• Successful vMotion of an 8-way, 256GB VM running SQLIOSim
Four 10Gb FCoE CNAs on UCS – QOS Marking Policy

Define a marking policy per traffic class:

  policy-map type qos class-cos1
    description vMotion
    class class-default
      set cos 1
  policy-map type qos class-cos2
    description NFS
    class class-default
      set cos 2
  policy-map type qos class-cos4
    description Gold-Data
    class class-default
      set cos 4
  policy-map type qos class-cos6
    description ESX-Management
    class class-default
      set cos 6

Apply to the vEthernet port-profiles:

  port-profile vMotionA
    service-policy input class-cos1
  port-profile vMotionB
    service-policy input class-cos1
  port-profile NFS
    service-policy input class-cos2
  port-profile v174
    service-policy input class-cos4
  port-profile ESX-Management
    service-policy input class-cos6
Four 10Gb FCoE CNAs on UCS
[Diagram: traffic classes MGMT, SAP, NFS, and vMotion across the four CNAs]
• The vNIC QOS policy must be set to "host control full" to trust CoS markings
• The 1000v has no visibility into the vHBAs' utilization of the link
• Instead of queuing on the 1000v, UCS queues on the adapter and fabric interconnect
• The "Palo" adapters are reduced to three queues when placed in host control full
• The UCS fabric interconnect leverages the advanced QOS functions of the Nexus 5K hardware, such as virtual output queues, to provide effective ingress queuing
Key Takeaways
• ESX 5 can vMotion larger guests than 4.1 with the addition of SDPS, but more bandwidth reduces the impact of vMotion on the guest
• Give consideration to traffic flow when implementing multiple-NIC vMotion; switch interconnects can be easily overwhelmed
• Dedicated vMotion adapters are best and should always be used in 1G environments, but aren't always practical in 10G environments
• Without dedicated adapters, QOS on both the virtual and physical switch becomes important
Questions?

Relevant sessions remaining:
• INF-VSP1549 – Insight Into vMotion: Architectures, Performance, Best Practices, and Futures
• INF-NET2161 – VMware Networking 2012: Enabling the Software Defined Network
• INF-NET1590 – What's New in vSphere – Networking
• INF-NET2207 – VMware vSphere Distributed Switch – Technical Deep Dive

Please complete your session surveys.
heath@heathreynolds.com
Appendix 1 – CBWFQ QOS Example
Class Based WFQ on the 1000v
Step 1 – Classify the traffic:

  class-map type queuing match-any n1kv_control_packet_class
    match protocol n1k_control
    match protocol n1k_packet
    match protocol n1k_mgmt
  class-map type queuing match-all nfs_class
    match protocol vmw_nfs
  class-map type queuing match-all vmotion_class
    match protocol vmw_vmotion
  class-map type queuing match-all vmw_mgmt_class
    match protocol vmw_mgmt
Class Based WFQ on the 1000v
Step 2 – Create a policy:

  policy-map type queuing uplink_queue_policy
    class type queuing n1kv_control_packet_class
      bandwidth percent 10
    class type queuing nfs_class
      bandwidth percent 25
    class type queuing vmotion_class
      bandwidth percent 20
    class type queuing vmw_mgmt_class
      bandwidth percent 10

Step 3 – Attach the policy to an interface or port-profile:

  port-profile uplink
    service-policy type queuing output uplink_queue_policy
