Advanced Network Troubleshooting: Using My Traceroute (MTR)

This article is part of our Network Troubleshooting series. You can refer to our other two articles here and here. In this article, we focus on examining MTR output when troubleshooting network connectivity issues.

MTR stands for My Traceroute but it was originally called Matt’s traceroute. It is a diagnostic tool that combines the functionality of the commands traceroute and ping so you can do repeated network traces in real time. MTR can show the round-trip time it takes to reach each hop along the path, which gives you at-a-glance data on which hops are problematic and at what times they are slow.

MTR also supports IPv6, where it works just like the usual traceroute but uses ICMPv6 messages instead. Use the command for your OS to install MTR:

  • Debian and Ubuntu: apt-get install mtr-tiny

  • CentOS: yum install mtr

  • Windows: download from here

In Linux, use the syntax below to display MTR output. It will display live interactive data as it becomes available:

# mtr [flag] [--report] [--report-cycles COUNT] HOSTNAME

 Where:

  • flag:
    • -4 uses IPv4 only
    • -6 uses IPv6 only
  • --report puts mtr in report mode. In this mode, mtr runs for the number of cycles specified by --report-cycles (short form -c), prints statistics, then exits.
  • --report-cycles COUNT sets the number of pings sent by mtr to identify the machines on the network and the connection reliability of those machines. Each ping cycle lasts one (1) second.

Note: It is good practice to send at least 100 packets and to trace in both directions of the route (i.e. externally to the server and from the server back) so you have a more accurate picture of your traffic status.
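As a minimal sketch of the note above, the helper below wraps the report-mode command with a default of 100 cycles. The function name mtr_report and the MTR_BIN override are illustrative assumptions, not part of mtr itself; only the --report and --report-cycles flags come from the syntax described earlier.

```shell
# Hypothetical helper: run mtr in report mode, defaulting to 100 cycles.
# MTR_BIN is an assumed override so another binary (or a dry run) can be used.
mtr_report() {
  local host=$1
  local cycles=${2:-100}
  "${MTR_BIN:-mtr}" --report --report-cycles "$cycles" "$host"
}

# Usage (needs mtr installed, often with root privileges):
#   mtr_report google.com        # 100 cycles
#   mtr_report google.com 200    # 200 cycles
```

Running the same helper on the remote server, pointed back at your own address, gives the report for the opposite direction.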

Below are examples of MTR outputs:

Example 1

# mtr -4 --report --report-cycles 24 google.com

HOST: localhost           Loss%   Snt   Last   Avg  Best  Wrst StDev
1. 208.69.X.X              0.0%    24    0.3   1.2   0.3  14.6   3.0
2. 67.23.161.132           0.0%    24    0.4   0.3   0.2   0.5   0.1
3. 67.23.161.142           0.0%    24    6.5   7.5   6.4  19.2   3.3
4. aix.pr1.atl.google.com  0.0%    24    6.7   7.3   6.6  21.9   3.1
5. 72.14.233.54            0.0%    24    6.8   7.2   6.8   7.9   0.3
6. 216.239.51.243          0.0%    24    7.7   7.6   7.5   7.7   0.1
7. 216.239.48.41           0.0%    24   20.9  20.9  20.7  22.0   0.2
8. 72.14.234.55            0.0%    24   20.2  20.3  20.1  23.0   0.6
9. ???                    100.0    24    0.0   0.0   0.0   0.0   0.0
10. qg-in-f113.1e100.net   0.0%    24   20.2  20.2  20.1  20.4   0.1

Example 2

# mtr -6 --report google.com

HOST: localhost           Loss%   Snt   Last   Avg  Best Wrst StDev
1. 2607:9800::Х::1          0.0%    10   0.8  10.4  0.6  97.7  30.7
2. 2607:9800:1842:132::1    0.0%    10   0.6   0.6  0.5   0.7   0.0
3. 2607:9800:1842:142::2    0.0%    10   6.7   6.7  6.7   6.9   0.1
4. ???                     100.0    10   0.0   0.0  0.0   0.0   0.0
5. 2001:4860::1:0:8bd5      0.0%    10   7.1   7.6  7.0  12.3   1.6
6. 2001:4860::8:0:52bb      0.0%    10  17.2  10.6  7.1  19.7   5.5
7. 2001:4860::2:0:878c      0.0%    10   7.5   8.7  7.5  18.9   3.6
8. ???                     100.0    10   0.0   0.0  0.0   0.0   0.0
9. yk-in-x66.1e100.net      0.0%    10   7.4   7.4  7.4   7.5   0.1

MTR gives you the best, worst, and average round-trip times of the probe packets for each hop on the way to the target IP. This helps you monitor communication quality if you let MTR run for an extended period of time. Because the display refreshes constantly, you can also spot changes quickly, which makes MTR more convenient than a conventional traceroute.

How to verify packet loss

When you are analyzing an MTR output, you are on the lookout for packet loss and network latency. This section talks about verifying packet loss.

Note, though, that Internet service providers commonly rate-limit the ICMP traffic that MTR uses, which can make it seem that packet loss occurred. To verify whether you are seeing actual packet loss, look at the hop immediately after the one that reports the loss. If that subsequent hop shows “0.0%” loss, then the ISP is most likely just rate limiting.

In the example below, the 50% loss reported at hop 2 is likely due to rate limiting on that router, because hop 3 and every later hop report 0.0% loss. Keep in mind that if the loss continues for more than one hop, there may be a real problem with packet loss or routing. Rate limiting and loss may also occur at the same time. In that case, take the lowest percentage of loss in the sequence as the actual loss. Consider the following report:

# mtr -4 --report www.google.com

HOST: localhost            Loss%   Snt   Last   Avg  Best  Wrst StDev
1. 208.69.X.X              0.0%    10    0.3  19.9   0.3 160.5  49.9
2. 67.23.161.132          50.0%    10    0.3   0.3   0.2   0.3   0.0
3. 67.23.161.142           0.0%    10    6.6   6.6   6.5   6.7   0.1
4. aix.pr1.atl.google.com  0.0%    10    6.8   6.7   6.6   6.8   0.1
5. 72.14.233.56            0.0%    10    6.7   8.4   6.7  22.8   5.1
6. 66.249.94.20            0.0%    10    7.3   7.4   7.3   7.9   0.2
7. 216.239.46.186          0.0%    10    7.4   7.4   7.3   7.5   0.1
8. yk-in-f103.1e100.net    0.0%    10    7.4   7.4   7.3   7.5   0.1
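The next-hop check described above can be sketched as a small awk filter. It is an illustration only: the function name flag_rate_limited_hops is hypothetical, and it assumes report lines end with the seven columns Loss%, Snt, Last, Avg, Best, Wrst and StDev as in the examples in this article.

```shell
# Hypothetical helper: read an mtr report on stdin and comment on lossy hops.
flag_rate_limited_hops() {
  awk '
    # Hop lines start with a hop number followed by a dot.
    /^[ \t]*[0-9]+\./ {
      loss = $(NF-6); sub(/%/, "", loss)   # Loss% is the 7th field from the end
      n++; hop[n] = $1 + 0; pct[n] = loss + 0
    }
    END {
      for (i = 1; i <= n; i++) {
        if (pct[i] == 0) continue
        if (i < n && pct[i+1] == 0)
          printf "hop %d: %.1f%% loss, next hop reports 0%% (likely ICMP rate limiting)\n", hop[i], pct[i]
        else
          printf "hop %d: %.1f%% loss, check following hops for real loss\n", hop[i], pct[i]
      }
    }'
}

# Usage: mtr -4 --report www.google.com | flag_rate_limited_hops
```

For the report above, only hop 2 would be flagged, and it would be flagged as likely rate limiting.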

Keep in mind that rate limiting and packet loss can happen concurrently. See the example below:

# mtr -4 --report www.google.com

HOST: localhost           Loss%   Snt   Last   Avg  Best  Wrst StDev  
1. 208.69.X.X             0.0%    10    0.3   0.6   0.3   1.2   0.3
2. 67.23.161.132          0.0%    10    0.4   1.0   0.4   6.1   1.8
3. 67.23.161.142          60.0%   10    0.8   2.7   0.8  19.0   5.7
4. aix.pr1.atl.google.com 60.0%   10    6.7   6.8   6.7   6.9   0.1
5. 72.14.233.56           50.0%   10    7.2   8.3   7.1  16.4   2.9
6. 66.249.94.20           40.0%   10   39.1  39.4  39.1  39.7   0.2
7. 216.239.46.186         40.0%   10   39.6  40.4  39.4  46.9   2.3
8. yk-in-f103.1e100.net   40.0%   10   39.6  40.5  39.5  46.7   2.2

Hops 3 and 4 both report 60% loss. In this scenario, you can assume that some packets really are being lost around hops 3 and 4, since no subsequent hop reports zero loss. Notice, though, that part of the reported loss is due to rate limiting, as hops 6 to 8 register only 40% loss. When different amounts of loss are reported, trust the later hops. As mentioned earlier, always trace in both directions, as some loss can be explained by problems on the return route.
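The rule of taking the lowest loss in a sequence can be sketched as follows. This is a rough illustration, not a definitive diagnostic: the function name estimate_real_loss is hypothetical, the column layout is assumed to match the reports shown here, and a destination that drops ICMP entirely would skew the result.

```shell
# Hypothetical helper: estimate end-to-end loss from an mtr report on stdin.
estimate_real_loss() {
  awk '
    /^[ \t]*[0-9]+\./ {
      loss = $(NF-6); sub(/%/, "", loss)
      n++; hop[n] = $1 + 0; pct[n] = loss + 0
    }
    END {
      if (n == 0 || pct[n] == 0) { print "no loss at the final hop"; exit }
      # Walk back from the final hop over the unbroken run of lossy hops
      # and keep the lowest percentage seen.
      min = pct[n]; start = n
      for (i = n; i >= 1 && pct[i] > 0; i--) {
        if (pct[i] < min) min = pct[i]
        start = i
      }
      printf "estimated loss %.1f%% over hops %d-%d\n", min, hop[start], hop[n]
    }'
}

# Usage: mtr -4 --report www.google.com | estimate_real_loss
```

For the report above, the lossy run covers hops 3 to 8 and the lowest value in it is 40%.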

Keep in mind that there is no need to investigate each and every instance of packet loss. Internet protocols such as Transmission Control Protocol (TCP) are designed to be resilient to some network degradation. TCP, for example, allows reliable two-way communication even if links are imperfect or overloaded. It does this by having the endpoints expect and recover from minor packet loss, duplication, reordering, and corruption, so that data integrity is preserved and the only cost is reduced throughput.

Reading network latency in MTR reports

Network latency is the time it takes the initial packet to reach its destination, for the destination machine to reply, and for that reply to reach the requestor. In short, it is the round-trip time the initial packet takes travelling between the two endpoints of a route. Latency varies widely depending on network conditions.

Latency generally increases with physical distance, because the signal has farther to travel and the number of hops en route to the target IP also grows. These increases should ideally be linear and consistent. In practice, however, latency is always relative to both hosts’ connections and to how far apart they physically are from each other. When you use MTR output to assess a possibly problematic connection, use earlier reports from a fully functional connection as a reference, so you know what speeds are normal in an area.

Aside from physical distance, the quality of the connection can also affect latency. For example, dial-up connections give much higher latency than cable modem connections to the same target IP. Let us analyze the MTR report below, which shows high latency:

# mtr -4 --report www.google.com

HOST: localhost            Loss%   Snt   Last   Avg  Best  Wrst StDev
1. 208.69.X.X              0.0%    10    0.3   0.6   0.3   1.2   0.3
2. 67.23.161.132           0.0%    10    0.4   1.0   0.4   6.1   1.8
3. 67.23.161.142           0.0%    10    0.8   2.7   0.8  19.0   5.7
4. aix.pr1.atl.google.com  0.0%    10  388.0 360.4 342.1 396.7   0.2
5. 72.14.233.56            0.0%    10  390.6 360.4 342.1 396.7   0.2
6. 66.249.94.20            0.0%    10  391.6 360.4 342.1 396.7   0.4
7. 216.239.46.186          0.0%    10  391.8 360.4 342.1 396.7   2.1
8. yk-in-f103.1e100.net    0.0%    10  392.0 360.4 342.1 396.7   1.2

Though latency increased between hops 3 and 4, there is no notable increase in the subsequent hops and traffic still seems to reach the target IP. In this scenario, there may be an issue with the fourth router. However, high latency does not always mean that there is a problem with the current route, as the return path could also be causing the issue. For this reason, it is best to collect MTR reports in both directions.
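To locate where latency starts to climb, one option is to compare the Avg column of consecutive hops. The sketch below does that with awk. The function name largest_latency_jump is hypothetical, and the script assumes the same column layout as the reports in this article.

```shell
# Hypothetical helper: find the hop with the largest rise in average latency.
largest_latency_jump() {
  awk '
    /^[ \t]*[0-9]+\./ {
      loss = $(NF-6); sub(/%/, "", loss)
      if (loss + 0 >= 100) next          # skip hops that never replied
      avg = $(NF-3) + 0                  # Avg is the 4th field from the end
      if (seen && avg - prev > jump) { jump = avg - prev; at = $1 + 0 }
      prev = avg; seen = 1
    }
    END {
      if (at) printf "largest jump: +%.1f ms at hop %d\n", jump, at
      else print "no latency increase found"
    }'
}

# Usage: mtr -4 --report www.google.com | largest_latency_jump
```

The result only points at a hop to look at. As noted above, the cause may still lie on the return path.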

As with packet loss, ICMP rate limiting can also create the appearance of latency as shown in the example below:

# mtr -4 --report www.google.com

HOST: localhost               Loss%   Snt   Last   Avg  Best  Wrst StDev 
1. 208.69.X.X                   0.0%    10    0.3   0.6   0.3   1.2   0.3
2. 67.23.161.132                0.0%    10    0.4   1.0   0.4   6.1   1.8
3. 67.23.161.142                0.0%    10    0.8   2.7   0.8  19.0   5.7
4. aix.pr1.atl.google.com       0.0%    10    6.7   6.8   6.7   6.9   0.1
5. 72.14.233.56                 0.0%    10  254.2 250.3 230.1 263.4   2.9
6. 66.249.94.20                 0.0%    10   39.1  39.4  39.1  39.7   0.2
7. 216.239.46.186               0.0%    10   39.6  40.4  39.4  46.9   2.3
8. yk-in-f103.1e100.net         0.0%    10   39.6  40.5  39.5  46.7   2.2

Even though the latency between hops 4 and 5 looks worrying, it drops immediately after hop 5 and stays consistent. In cases like this, MTR draws attention to an issue that does not affect the service. As with packet loss, trust the latency at the final hop when reading your MTR output.
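A spike that does not carry through to the destination can be picked out by comparing each hop with the final hop. The sketch below flags hops whose average latency is more than a chosen multiple of the final hop's average. The function name flag_latency_spikes and the default factor of 2 are arbitrary assumptions for illustration.

```shell
# Hypothetical helper: flag hops whose Avg latency far exceeds the final hop.
# $1 is an assumed multiplier (default 2) over the final-hop average.
flag_latency_spikes() {
  awk -v factor="${1:-2}" '
    /^[ \t]*[0-9]+\./ { n++; hop[n] = $1 + 0; avg[n] = $(NF-3) + 0 }
    END {
      for (i = 1; i < n; i++)
        if (avg[n] > 0 && avg[i] > factor * avg[n])
          printf "hop %d: avg %.1f ms vs %.1f ms at the final hop (spike not carried through)\n", hop[i], avg[i], avg[n]
    }'
}

# Usage: mtr -4 --report www.google.com | flag_latency_spikes
```

For the report above, only hop 5 would be flagged.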

Common MTR Reports

Some networking issues require escalation to the network team of the upstream networks. However, some MTR report patterns point to common networking issues. Let’s look at them.

Destination Host Networking Improperly Configured

In the next example, it appears that there is 100% loss to the destination host because of an incorrectly configured router.

# mtr -4 --report www.google.com

HOST: localhost                  Loss%   Snt   Last   Avg  Best  Wrst StDev 
1. 208.69.X.X                    0.0%    10    0.3   0.4   0.3   1.0   0.2
2. 67.23.161.132                 0.0%    10    0.4   0.4   0.3   0.5   0.1
3. 67.23.161.142                 0.0%    10    6.5   6.6   6.4   6.8   0.1
4. aix.pr1.atl.google.com        0.0%    10    6.7   6.7   6.6   6.9   0.1
5. 72.14.233.54                  0.0%    10    6.9  17.7   6.9  26.5   8.1
6. 66.249.94.6                   0.0%    10    7.3   8.7   7.3  13.7   2.4
7. 209.85.243.26                 0.0%    10    7.3   9.6   7.3  29.7   7.1
8. yk-in-f99.1e100.net          100.0    10    0.0   0.0   0.0   0.0   0.0

Despite appearances, traffic is reaching the destination host. The MTR report shows loss because the target IP is not sending a reply. This may be the result of improperly configured networking or firewall rules that cause the host to drop ICMP packets.

Improperly configured ISP router

An improperly configured router can stop your packets from reaching the target IP as shown in the example below:

# mtr -4 --report www.google.com

HOST: localhost                  Loss%   Snt   Last   Avg  Best  Wrst StDev 
1. 208.69.X.X                    0.0%    10    0.3   0.4   0.3   1.0   0.2
2. 67.23.161.132                 0.0%    10    0.4   0.4   0.3   0.5   0.1
3. 67.23.161.142                 0.0%    10    6.5   6.6   6.4   6.8   0.1
4. aix.pr1.atl.google.com        0.0%    10    6.7   6.7   6.6   6.9   0.1
5. ???                           0.0%    10    0.0   0.0   0.0   0.0   0.0
6. ???                           0.0%    10    0.0   0.0   0.0   0.0   0.0
7. ???                           0.0%    10    0.0   0.0   0.0   0.0   0.0
8. ???                           0.0%    10    0.0   0.0   0.0   0.0   0.0
9. ???                           0.0%    10    0.0   0.0   0.0   0.0   0.0
10.???                           0.0%    10    0.0   0.0   0.0   0.0   0.0

The question marks indicate that no route information is available. Here, this suggests that the router at hop 4 is not properly configured. To resolve the issue, contact the network administrator team at the source host.

Timeouts

Timeouts happen for a variety of reasons. They are usually shown as question marks (???) in one or more hops before the final one:

# mtr -4 --report www.google.com

HOST: localhost                  Loss%   Snt   Last   Avg  Best  Wrst StDev
1. 208.69.X.X                    0.0%    10    0.3   0.4   0.3   1.0   0.2
2. 67.23.161.132                 0.0%    10    0.4   0.4   0.3   0.5   0.1
3. 67.23.161.142                 0.0%    10    6.5   6.6   6.4   6.8   0.1
4. aix.pr1.atl.google.com        0.0%    10    6.7   6.7   6.6   6.9   0.1
5. ???                           0.0%    10    6.9  17.7   6.9  26.5   8.1
6. ???                           0.0%    10    7.3   8.7   7.3  13.7   2.4
7. 209.85.243.26                 0.0%    10    7.3   9.6   7.3  29.7   7.1
8. yk-in-f99.1e100.net           0.0%    10    7.4   7.4   7.4   7.6   0.1

Timeouts are not necessarily a symptom of packet loss. More often than not, packets still reach their destination without significant loss or latency. Timeouts are usually caused by routers dropping packets to improve quality of service, or by an issue on the return route.
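The three patterns in this section can be told apart mechanically, at least roughly. The sketch below is a heuristic under stated assumptions, not a definitive diagnosis: the function name classify_mtr_report and the message wording are hypothetical, unknown hops are detected by the ??? marker, and the column layout is assumed to match the reports above.

```shell
# Hypothetical helper: match an mtr report on stdin against the common patterns.
classify_mtr_report() {
  awk '
    /^[ \t]*[0-9]+\./ {
      n++; hop[n] = $1 + 0
      unknown[n] = ($0 ~ /\?\?\?/)       # hop printed as ???
      loss = $(NF-6); sub(/%/, "", loss); pct[n] = loss + 0
      if (unknown[n]) gaps++
    }
    END {
      if (n == 0) { print "no hops parsed"; exit }
      if (unknown[n]) {
        # Trailing ??? hops: find the last hop that still identified itself.
        k = n; while (k > 0 && unknown[k]) k--
        printf "no route information after hop %d: possible misconfigured router\n", hop[k]
      } else if (pct[n] >= 100) {
        print "final hop reports 100% loss: destination may be dropping ICMP"
      } else if (gaps > 0) {
        print "intermediate timeouts but destination replies: usually harmless"
      } else {
        print "no common pattern matched"
      }
    }'
}

# Usage: mtr -4 --report www.google.com | classify_mtr_report
```

Treat the output as a starting point for reading the report, since real reports can combine several of these patterns.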

Resolving issues found in MTR report

The majority of issues that come up when viewing MTR reports are temporary. Most of them clear up by themselves within 24 hours. If you experience extended periods of degraded service, you can alert the ISP to the issue. When contacting a service provider, best practice is to send MTR reports and any other helpful data you have gathered.

See also Basic Network Troubleshooting.
See also Advanced Network Troubleshooting: Using traceroute.
See our Knowledgebase for more How-To articles.
