Hey Checkyourlogs,
Connectivity issues can sometimes arise when working with Azure Virtual Network Gateway connections from an on-premises environment to an Azure VM. For SportCo, identifying and resolving these issues requires an understanding of the network components involved. This blog post explores common networking terms, components, and tools you can use to diagnose and resolve such problems effectively.
Understanding the Network Routing Domains
At a high level, three major network routing domains are involved in connectivity between on-premises and Azure (the diagram for this example is borrowed from Microsoft):
- Azure Network (Blue Cloud): The virtual network hosting your Azure resources.
- Internet or WAN (Green Cloud): The external routing domain connecting Azure to your on-prem environment.
- Corporate Network (Orange Cloud): Your internal network environment.
You can systematically isolate issues by breaking the connectivity path into these domains.
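As a rough illustration of isolating by domain, here is a minimal Python sketch that maps the hop addresses from a traceroute onto the three routing domains. The address prefixes and the function names classify_hop and last_responding_domain are hypothetical, so substitute your own VNet and corporate ranges.

```python
import ipaddress

# Hypothetical address ranges; replace with your own VNet and corporate prefixes.
DOMAINS = {
    "Azure network": [ipaddress.ip_network("172.16.0.0/16")],
    "Corporate network": [ipaddress.ip_network("10.0.0.0/8")],
}


def classify_hop(ip):
    """Return the routing domain that a hop address belongs to."""
    addr = ipaddress.ip_address(ip)
    for domain, networks in DOMAINS.items():
        if any(addr in net for net in networks):
            return domain
    # Anything outside the known prefixes is treated as the Internet/WAN domain.
    return "Internet or WAN"


def last_responding_domain(hops):
    """Given traceroute hops in order (None = no reply), return the domain
    of the last hop that answered, or None if nothing answered."""
    responding = [hop for hop in hops if hop is not None]
    return classify_hop(responding[-1]) if responding else None
```

If the last hop that replies sits in the Internet or WAN domain, that is a hint (not proof) that the break is at or beyond the provider edge rather than inside your corporate network.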
Key Network Components in Azure
Before diving into troubleshooting, it’s essential to understand each network component and its potential role in connectivity issues.
1. Virtual Machine
- Ensure the VM SKU has sufficient bandwidth. For testing, consider a larger SKU such as Standard_DS5_v2.
- Verify the VM’s NIC configurations, such as static routes and OS-level settings.
2. Network Interface Card (NIC)
- Confirm the private IP address assigned to the NIC.
- Verify the NIC Network Security Group (NSG) rules allow necessary traffic (e.g., ports for RDP, SSH, or iPerf).
3. Subnet
- Ensure the NIC is correctly associated with the subnet and that the Subnet NSG rules align with traffic requirements.
- Check User-Defined Routes (UDRs) in the subnet to ensure they direct traffic to the correct next hop.
4. Gateway Subnet
- Confirm the gateway subnet’s NSGs and UDRs do not block traffic.
- For ExpressRoute or VPN connections, review settings like connection weights for multiple tunnels.
5. VNet Gateway
- While VNet Gateways require minimal configuration for routing, ensure they are properly connected to ExpressRoute circuits or VPN tunnels.
6. Route Filters
- If using Microsoft Peering in ExpressRoute, verify that the route filters are correctly configured and applied.
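To make the UDR check above more concrete, here is a small Python sketch of how Azure picks the effective route for a destination: the longest matching prefix wins, and when several routes share the same prefix, user-defined routes are preferred over BGP routes, which are preferred over system routes. The route table, the source labels, and the function name select_route are illustrative assumptions, not output from any Azure tool.

```python
import ipaddress

# Tie-break order when several routes share the same address prefix:
# user-defined route, then BGP route, then system (default) route.
SOURCE_PRIORITY = {"User": 0, "BGP": 1, "Default": 2}


def select_route(destination, routes):
    """Pick the effective route: longest prefix match, then source priority."""
    addr = ipaddress.ip_address(destination)
    candidates = [r for r in routes if addr in ipaddress.ip_network(r["prefix"])]
    if not candidates:
        return None
    return min(
        candidates,
        key=lambda r: (
            -ipaddress.ip_network(r["prefix"]).prefixlen,  # longest prefix first
            SOURCE_PRIORITY[r["source"]],                  # then route source
        ),
    )


# Hypothetical route table for illustration only.
routes = [
    {"prefix": "0.0.0.0/0", "source": "Default", "next_hop": "Internet"},
    {"prefix": "10.0.0.0/8", "source": "BGP", "next_hop": "VirtualNetworkGateway"},
    {"prefix": "10.0.0.0/8", "source": "User", "next_hop": "VirtualAppliance"},
    {"prefix": "10.1.0.0/16", "source": "BGP", "next_hop": "VirtualNetworkGateway"},
]
```

In this example a packet to 10.1.2.3 follows the more specific /16 BGP route, while a packet to 10.2.0.1 follows the UDR because it ties with the BGP route on prefix length. A UDR that is broader than a learned route does not override it, which is a common source of surprise.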
Troubleshooting Connectivity
Given the complexity of these interconnected environments, the best approach is to start at the edges and work inward:
- Ping and Traceroute: Use these tools to determine where the connectivity breaks.
- Inspect Logs: Check Azure NSG flow logs and VPN Gateway logs for anomalies.
- Isolate Domains: Identify whether the issue lies in the Azure network, corporate network, or WAN.
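ICMP is often filtered along the path, so a failed ping does not always mean the host is unreachable. As a complement, the following Python sketch tests whether a TCP port accepts connections. The function name tcp_port_open is hypothetical, and the example host below is simply the address used later in this post.

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

For example, tcp_port_open("172.16.0.4", 3389) checks RDP reachability. A timeout usually points at something silently dropping traffic (NSG or firewall), while an immediate refusal suggests the host was reached but nothing is listening on that port.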
Azure Connectivity Toolkit (AzureCT)
AzureCT is a PowerShell module that simplifies connectivity testing and performance troubleshooting. It uses tools like iPerf and PSPing for bandwidth and latency analysis.
Installing AzureCT
Run the following command to download and install AzureCT:
(New-Object Net.WebClient).DownloadString("https://aka.ms/AzureCT") | Invoke-Expression
Installing Supporting Applications
Install tools like iPerf and PSPing using this command:
Install-LinkPerformance
This command also configures Windows Firewall rules to allow ICMP and iPerf traffic on port 5201.
Running Performance Tests
1. Prepare the Remote Host
- Install and run iPerf in server mode.
- Ensure RDP (3389) or SSH (22) and iPerf (5201) ports are open.
- Start the iPerf server by running C:\ACTTools\iperf3.exe -s
2. Run Tests from the Local Machine
Use the following command to test link performance:
Get-LinkPerformance -RemoteHost 172.16.0.4 -TestSeconds 10 -DetailedOutput
This command performs a series of concurrent load and latency tests.
3. Review Test Results
Results are displayed in PowerShell and saved as text files in the C:\ACTTools directory.
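The format of the text files written by Get-LinkPerformance is not covered here. If you instead run iperf3 directly with its --json flag (an assumption, separate from the AzureCT workflow above), a short Python sketch like the following can pull the headline throughput figures out of the report. The function name summarize_iperf3 is hypothetical, and the field names are those of a TCP test report.

```python
import json


def summarize_iperf3(report_json):
    """Extract sender/receiver throughput in Mbit/s from an iperf3 --json TCP report."""
    end = json.loads(report_json)["end"]
    return {
        "sent_mbps": round(end["sum_sent"]["bits_per_second"] / 1e6, 1),
        "received_mbps": round(end["sum_received"]["bits_per_second"] / 1e6, 1),
        # Retransmit counts are not reported on every platform, so this may be None.
        "retransmits": end["sum_sent"].get("retransmits"),
    }
```

A large gap between the sent and received figures, or a high retransmit count where it is reported, is worth comparing across the three routing domains.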
Common Scenarios and Resolutions
- Blocked Ports: Verify NSG rules at the NIC and subnet levels.
- Incorrect Routing: Inspect UDRs and ensure they direct traffic appropriately.
- Low Bandwidth: Upgrade the VM SKU or optimize network paths.
- Firewall Issues: Check intermediate devices for blocked traffic.
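For the blocked-ports scenario, it helps to remember that inbound traffic must be allowed by both the subnet NSG and the NIC NSG, and that within each NSG the matching rule with the lowest priority number wins. The Python sketch below models that logic in a deliberately simplified form: rules match on destination port only, and the built-in default rules are reduced to a final deny. The Rule class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Rule:
    priority: int          # lower number = evaluated first
    port: Optional[int]    # None means "any port"
    access: str            # "Allow" or "Deny"


def evaluate(rules: List[Rule], port: int) -> str:
    """Return the access decision of the first matching rule by priority."""
    for rule in sorted(rules, key=lambda r: r.priority):
        if rule.port is None or rule.port == port:
            return rule.access
    # Simplification: treat "no matching rule" as the default deny.
    return "Deny"


def inbound_allowed(subnet_nsg: List[Rule], nic_nsg: List[Rule], port: int) -> bool:
    """Inbound traffic must pass the subnet NSG first, then the NIC NSG."""
    return evaluate(subnet_nsg, port) == "Allow" and evaluate(nic_nsg, port) == "Allow"
```

In this model, opening port 5201 on the subnet NSG alone is not enough for an iPerf test if the NIC NSG only allows 3389, which matches the advice above to verify both levels.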
Conclusion
AzureCT is not limited to Azure testing. You can also run it on-prem, between any of your WAN sites, or to bandwidth-test a new set of core switches.
Connectivity issues between on-premises environments and Azure VMs can be complex, but breaking down the problem into components simplifies troubleshooting. By leveraging tools like AzureCT and systematically verifying configurations, you can quickly identify and resolve issues, ensuring seamless operations for SportCo.
Stay tuned for more insights on Azure networking and troubleshooting tips!
Thanks,
Dave