Troubleshooting CDN Connectivity Issues Using dnsyo

by Todd Troutman

A while back, one of our remote offices had trouble reaching our web application. The DNS answer for our CDN was sending our users down a terrible route with heavy packet loss.

Performing DNS lookups for our CDN provider against different resolvers around the world produced different -- better! -- results than Google's resolvers did.

I wanted an easy way to identify every response that our CDN could possibly return from DNS. (I'm also passionately obsessed with finding one-line solutions to problems!) Searching the internets revealed a pip-installable Python module called dnsyo, which resolves a hostname against thousands of resolvers all over the world. Pairing it with fping lets me determine the best paths to my endpoint.
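The idea behind that pairing can be sketched in a few lines of Python. This is only an illustration of the two steps (collect every distinct answer, then sort the answers by measured latency); it is not dnsyo's or fping's actual API or output. The function names are made up for this sketch, and the sample addresses and latencies are hypothetical values from the documentation address ranges.

```python
from collections import Counter


def distinct_answers(answers_by_resolver):
    """Count how many resolvers returned each IP address."""
    counts = Counter()
    for ips in answers_by_resolver.values():
        # A resolver may return several A records; count each IP once per resolver.
        counts.update(set(ips))
    return counts


def rank_by_latency(latency_ms):
    """Return reachable IPs sorted fastest first; a latency of None means unreachable."""
    reachable = {ip: ms for ip, ms in latency_ms.items() if ms is not None}
    return sorted(reachable, key=reachable.get)


# Hypothetical sample data: resolver -> answers, and answer -> round-trip time in ms.
answers = {
    "8.8.8.8": ["192.0.2.10"],
    "9.9.9.9": ["198.51.100.20"],
    "203.0.113.53": ["192.0.2.10", "198.51.100.20"],
}
latency = {"192.0.2.10": 48.2, "198.51.100.20": 11.7}

print(distinct_answers(answers))
print(rank_by_latency(latency))
```

In practice the first dictionary would come from the resolver sweep and the second from the ping run; the sketch only shows how the two fit together.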

Example setup and use:

This is a boring result. Since I'm on Google Fiber and using Google's resolvers, it should work great. Though, as you may have noticed, we still didn't get sent to the absolute most responsive endpoint.

At that Qualpay office, we got a good-quality CDN result by changing from Google's resolvers (8.8.8.8 and 8.8.4.4) to either Quad9's resolver (9.9.9.9) or the ISP's default assigned resolvers. I suspect that this suboptimal CDN+DNS condition is much more common than people know.
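A comparison like that one can be made mechanical: for each candidate resolver, look at the endpoints it hands out and score the resolver by the fastest of them. The sketch below assumes you already have the answers and latencies in hand; rank_resolvers is a hypothetical helper, and every number in the sample data is invented for illustration rather than taken from the office in question.

```python
import math


def rank_resolvers(answers_by_resolver, latency_ms):
    """Score each resolver by the fastest endpoint it returned, best resolver first."""
    scored = []
    for resolver, ips in answers_by_resolver.items():
        # Ignore answers with no measurement; a resolver with none scores infinity.
        measured = [latency_ms[ip] for ip in ips if latency_ms.get(ip) is not None]
        scored.append((resolver, min(measured, default=math.inf)))
    return sorted(scored, key=lambda pair: pair[1])


# Hypothetical data: 203.0.113.53 stands in for an ISP-assigned resolver.
answers = {
    "8.8.8.8": ["192.0.2.10"],
    "9.9.9.9": ["198.51.100.20"],
    "203.0.113.53": ["198.51.100.20"],
}
latency = {"192.0.2.10": 48.2, "198.51.100.20": 11.7}

for resolver, best_ms in rank_resolvers(answers, latency):
    print(resolver, best_ms)
```

Scoring by the fastest answer is a best-case view. If your clients always connect to the first record returned, scoring by the first answer instead may be closer to what users experience.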

Let's take a look at some possible ways you could find yourself with a poorly defined set of DNS servers.

Example scenario 1:

An administrator based in the USA uses centralized configuration management to set the DNS resolvers throughout their organization to the correct resolvers for the ISP serving the office where the admin is stationed. The administrator thinks they are very clever now, as the DNS resolvers are defined in the config as 23.23.23.23 (fictional for this illustration) and rolled out across multiple offices. The problem is that this ISP provides connectivity to many, but not all, of these offices.

Example scenario 2:

Your ISP's tech thinks they're smarter than you, and will truck roll into a remote location and change all resolvers to "The Right Thing" if they can.

Example scenario 3:

Users will manually override their resolvers, and fail to use location-based network settings for a variety of reasons.
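All three scenarios end the same way: a machine is using resolvers that nobody chose for its location. A rough way to spot that is to read the nameserver lines from a resolv.conf-style file and compare them with what you expect for that office. This is a minimal sketch; the helper names and the expected resolver 198.51.100.53 are assumptions made for the example, and 23.23.23.23 is the fictional address from scenario 1.

```python
def parse_nameservers(resolv_conf_text):
    """Extract resolver addresses from resolv.conf-style text."""
    servers = []
    for line in resolv_conf_text.splitlines():
        line = line.strip()
        # Skip blank lines and comment lines (resolv.conf allows '#' and ';').
        if not line or line[0] in "#;":
            continue
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "nameserver":
            servers.append(parts[1])
    return servers


def unexpected_resolvers(configured, expected):
    """Return configured resolvers that are not on the office's expected list."""
    return [server for server in configured if server not in expected]


# Hypothetical file contents pushed by central configuration management.
sample_conf = """\
# managed centrally, do not edit
search example.com
nameserver 23.23.23.23
nameserver 198.51.100.53
"""
# Hypothetical: the resolver this office's own ISP assigns.
office_expected = {"198.51.100.53"}

print(unexpected_resolvers(parse_nameservers(sample_conf), office_expected))
```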

Let's simulate a ridiculous but not implausible case:

With my local DNS set to an open resolver in Poland, I can resolve the address, but the performance is suboptimal:

Of 27 possible results, we received the 20th worst option. (You don't even want to know the traceroute results!)
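To put a number like that on your own situation, measure every candidate endpoint and find where the one you were actually given lands in the sorted list. The sketch below times a TCP handshake as a stand-in for fping's ICMP probes (a deliberate substitution that needs no special privileges), and then computes the position. Both function names, the port choice, and the sample measurements are assumptions for illustration.

```python
import socket
import time


def tcp_connect_ms(ip, port=443, timeout=2.0):
    """Time a TCP handshake to ip:port in milliseconds; return None on failure."""
    start = time.perf_counter()
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return (time.perf_counter() - start) * 1000.0
    except OSError:
        return None


def rank_of(received_ip, latency_ms):
    """Return (position, total) among reachable IPs, where position 1 is fastest.

    Raises ValueError if received_ip has no successful measurement.
    """
    reachable = {ip: ms for ip, ms in latency_ms.items() if ms is not None}
    ordered = sorted(reachable, key=reachable.get)
    return ordered.index(received_ip) + 1, len(ordered)


# Hypothetical measurements; real ones could come from
# {ip: tcp_connect_ms(ip) for ip in candidate_ips}.
sample = {"192.0.2.1": 12.0, "192.0.2.2": 95.0, "192.0.2.3": 30.0, "192.0.2.4": None}
print(rank_of("192.0.2.2", sample))
```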

The most common challenge I've encountered while explaining this comes from getting people to understand that the internet -- or services operating on the fabric of the internet -- can make bad choices that cause access to a service to be slower than expected.

It happens. One of These Days™, this will not be an issue. But today it is an occasional, little-noticed issue that can suck tiny amounts of productivity out of an entire site, and those tiny amounts can add up to days or weeks by the end of a year.
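A quick back-of-the-envelope calculation shows how small delays add up. Every input below is a made-up assumption (50 users, 200 requests per user per day, 0.3 extra seconds per request, 250 working days), so substitute your own numbers.

```python
def yearly_hours_lost(users, requests_per_user_per_day, extra_seconds, workdays=250):
    """Total hours per year a site spends waiting on the extra latency."""
    total_seconds = users * requests_per_user_per_day * extra_seconds * workdays
    return total_seconds / 3600.0


# Hypothetical inputs, not measurements from any real office.
hours = yearly_hours_lost(users=50, requests_per_user_per_day=200, extra_seconds=0.3)
print(f"{hours:.1f} hours per year, about {hours / 8:.1f} eight-hour working days")
```

With these assumed inputs the total comes to roughly 208 hours a year across the site, which is several working weeks.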

About Qualpay

Qualpay is a leading provider of integrated, omnichannel payment solutions. The company's cloud-based payments platform enables businesses to modernize strategically through the use of reporting intelligence to streamline the payment process. Qualpay addresses and resolves the payment challenges businesses face and ensures a stronger, more robust infrastructure for a business, developer, and partner. Simply, Qualpay enables a better way to manage payments. For more information on how Qualpay is reinventing a new era of payment processing, visit www.qualpay.com.
