Friday, September 4, 2020

vSphere Network Performance Troubleshooting - Part IV

 

We have now come to the end of our vSphere Network Performance Troubleshooting series. This last post covers some useful background on NIC drivers and ring buffers. To understand ring buffers, it helps to first understand DMA.

DMA (Direct Memory Access) is a hardware mechanism that allows peripheral components to transfer their I/O data directly to and from main memory without involving the system processor. Using this mechanism can greatly increase throughput to and from a device, because a great deal of computational overhead is eliminated. It requires hardware support in the form of DMA controllers, and DMA "steals" memory-bus cycles from the processor. Synchronization mechanisms must also be provided to avoid reading stale, not-yet-updated data from RAM.

The DMA ring allows the NIC to directly access the memory used by the software. The software (the NIC driver, in the kernel's case) allocates memory for the rings and then maps it as DMA memory, so the NIC knows it may access it. TX packets are placed in this memory by the software and are read and transmitted by the NIC (usually after the software signals the NIC to start transmitting). RX packets are written to this memory by the NIC and are read and processed by the software (usually after the NIC issues an interrupt to signal there is work to do).

A ring buffer contains the start and end addresses of buffers in RAM. The TX ring contains addresses of buffers in RAM that hold data to be transmitted. The RX ring contains addresses of buffers in RAM where the NIC will place incoming data. NIC ring buffer sizes vary per NIC vendor and NIC grade.

These rings reside in RAM, and the TX and RX buffers they point to are in RAM as well. A network card register holds the location of the rings in RAM. Because they are DMA buffers, they are also called the DMA TX/RX rings and DMA TX/RX buffers.

Basically, the DMA ring buffers and the TX/RX rings are the same thing. DMA uses two types of ring buffers:

TX ring buffer - used for transmitting data from the kernel (NIC driver/software) to the device.

RX ring buffer - used for receiving data from the device into the kernel (NIC driver/software).
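
Before we get to the full script below, here is a minimal sketch of how to inspect the ring sizes of a single NIC from the ESXi shell (the NIC name vmnic0 is an assumption; substitute your own):

# Maximum ring sizes the NIC/driver supports (vmnic0 is a placeholder)
esxcli network nic ring preset get -n vmnic0
# Ring sizes currently configured on that NIC
esxcli network nic ring current get -n vmnic0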

 

The following script captures the NICs in use, their driver information, the advanced driver parameters that can be used to modify/set properties, the Rx/Tx ring buffer stats, Rx Mini (for undersized frames), Rx Jumbo (for oversized frames), and a packet summary with stats.

 

for nic in $(esxcli network nic list | grep -i vmnic | awk '{print $1}'); do
  echo ""
  echo "NIC $nic:"
  echo "Max supported ring buffer:"
  esxcli network nic ring preset get -n $nic
  echo ""
  echo "Currently set ring buffer:"
  esxcli network nic ring current get -n $nic
  echo ""
  echo "Current ring stats of $nic:"
  vsish -e cat /net/pNics/$nic/stats | grep -iE "rxq[0-9]|txq[0-9]"
  echo "========================="
done
echo ""
echo "List of NIC drivers:"
for drvr in $(esxcli network nic list | awk '{print $3}' | grep -viE "device|------" | uniq); do
  echo "===================="
  echo -n "Driver name: "
  esxcli system module list | grep $drvr | awk '{print $1}'
  esxcli system module get -m $drvr | sed 's/^ *//'
  echo ""
  echo "Parameters list:"
  esxcli system module parameters list -m $drvr
  echo ""
done

 

Output:

[Screenshot: output of the script above]
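
If the parameters list shows a setting you need to change, it can be set per driver module. Below is a rough sketch with placeholder driver and parameter names (take the real names from the parameters list output; such changes typically need a host reboot to take effect):

# Hypothetical example - <driver> and SomeParam=Value are placeholders
esxcli system module parameters set -m <driver> -p "SomeParam=Value"
# Verify what is configured for the module
esxcli system module parameters list -m <driver>
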
You can increase the size of the Ethernet device's RX ring buffer if the packet drop rate causes applications to report loss of data, slow performance, or timeouts.

Exhaustion of the RX ring buffer increments counters such as "discard" or "drop" in the NIC stats output. Discarded packets indicate that the available buffer is filling up faster than the ESXi kernel or NIC driver can process the packets. Increasing the RX ring buffer can reduce a high packet drop rate. Depending on the driver your network interface card uses, changing the ring buffer size can briefly interrupt the network connection.
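
As a sketch of the actual change (assuming a NIC named vmnic0 and that the preset output reports a supported maximum of 4096):

# Check the maximum RX/TX ring sizes the NIC supports first
esxcli network nic ring preset get -n vmnic0
# Raise the RX ring to 4096 (the TX ring can be set alongside with -t)
esxcli network nic ring current set -n vmnic0 -r 4096
# Confirm the new values
esxcli network nic ring current get -n vmnic0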

By this point you should be able to detect the bottleneck in the network, or the main issue in the complete network path (vSphere), and identify whether your physical switches need attention for parameters such as MTU settings, or whether there are traffic or bandwidth constraints related to the vSphere Distributed Switch.
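
For example, one quick way to validate end-to-end MTU from the ESXi shell is a don't-fragment vmkping (the target IP 10.0.0.2 below is a placeholder, and 8972 assumes a 9000-byte jumbo MTU):

# 8972 = 9000 bytes minus 28 bytes of IP/ICMP headers; -d sets don't-fragment
vmkping -d -s 8972 10.0.0.2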

There may be more to dig into in order to identify the culprits behind sluggish network performance or latency issues. I am leaving it at this point and will take the topic up again if I find additional actions worth investigating.

This has been a pretty straightforward series for me, as all I had to do was transform the manual steps into Bash scripts that can then be used to troubleshoot possible network latency, network performance, and packet drop issues in a vSphere/VMware SDDC environment.

I know a few of you may wonder about the hefty theory in this last post, unlike my previous ones which got straight to what you have to do in the ESXi shell, but it's important to understand the basic concepts before you dive into something.

I hope you all had a nice learning experience with this series. 

Gonna be posting soon on other topics too... stay tuned till then.


Thanks for reading, be social and share it in your circle.


Link to Page - vSphere




Tuesday, September 1, 2020

vSphere Network Performance Troubleshooting - Part III


As I stated in my last post about utilizing net-stats and vsish (the VMkernel sys info shell) to gather useful network-related information, here we will get:

·        Port number

·        Switch name

·        MAC Address

·        Client port status of ports associated with switch

·        Client port stats of ports associated with switch

·        Client name

·        Client port type, e.g. 3 (VMkernel), 4 (PNIC) or 5 (virtual NIC)

·        Tx and Rx packet stats for ports associated with individual switches

·        Dropped packets associated with them.


With the following script you can get all of this information in one go:

net-stats -l
echo ""
for switch in $(net-stats -l | awk '{print $4}' | grep -vi switchname | uniq); do
  echo "For this switch: $switch ========================"
  for port in $(net-stats -l | awk '{print $1}' | grep -vi portnum); do
    echo ""
    echo "Switch $switch and port $port:"
    echo "Status:"
    vsish -e cat /net/portsets/$switch/ports/$port/status 2>/dev/null | grep -i client
    echo ""
    echo "Stats:"
    vsish -e cat /net/portsets/$switch/ports/$port/stats 2>/dev/null | grep -iv "packet stats"
  done
done

Output:

[Screenshot: output of the script above]
Ports that are not associated with a given switch will show no info for port status and port stats (the vsish errors are suppressed by 2>/dev/null).
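
If you want to zero in on drops for one specific port, here is a minimal sketch (the switch name vSwitch0 and the port number are placeholders; take real values from the net-stats -l output):

# Hypothetical switch name and port number - use real values from net-stats -l
vsish -e cat /net/portsets/vSwitch0/ports/50331650/stats | grep -i drop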

The last part of this series will contain another useful script that digs into NIC driver information, the associated ring buffers, the default ring configuration, and the advanced module parameters for the given NIC drivers. 

Stay tuned....till then. 

Thanks for reading; be social and share it in your circle if you found it useful. 


Link to Page - vSphere


