Open-Sourcing Our Netdiag Crate

Will Glozer

Kentik Labs is happy to announce the release of our first open-source Rust crate, netdiag, which provides scalable, asynchronous implementations of a number of low-level network diagnostics, including ping, trace, and a custom diagnostic we call knock. Netdiag is at the core of our synthetics product, which monitors application and network performance using a global network of public and private agents.

Netdiag is built on top of tokio and the raw-socket crate. Diagnostic instances are Send and Sync, so concurrent tasks can share the same underlying open sockets and state, which lets many diagnostic tasks run concurrently and efficiently. Both IPv4 and IPv6 are supported for all diagnostics.

Example

The following async code snippet from netdiag’s ping example demonstrates using Pinger to stream ping results:

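use futures::{pin_mut, StreamExt};
use netdiag::{Bind, Ping, Pinger};
use tokio::time::sleep;

// addr, count, expiry, and delay are assumed to be defined earlier in the example.
let pinger = Pinger::new(&Bind::default()).await?;
let ping   = Ping { addr, count, expiry };
let stream = pinger.ping(&ping).enumerate();
pin_mut!(stream);

// Each item is a Result wrapping Some(rtt) for a reply or None for a timeout.
while let Some((n, item)) = stream.next().await {
    match item? {
        Some(d) => println!("seq {} RTT {:0.2?} ", n, d),
        None    => println!("seq {} timeout", n),
    }
    sleep(delay).await;
}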

Pinger::new takes a Bind reference that allows using a specific IPv4 and/or IPv6 source address. This can be useful on a host with multiple routable interfaces.

let pinger = Pinger::new(&Bind::default()).await?;

Pinger::ping returns a Stream whose items are Some(Duration) when a reply arrives, or None when no response was received before the expiry time elapsed. Each item is wrapped in a Result, which the example unwraps with the ? operator. StreamExt::enumerate is a convenient extension to Stream that returns the current iteration count in addition to the next value.

let stream = pinger.ping(&ping).enumerate();

Writing async code in Rust is quite pleasant in general, but it isn’t uncommon to run into ergonomic issues around pinning. Here, stream must be pinned via futures::pin_mut before calling stream.next().

pin_mut!(stream);

Ping

Pinger and the ping module implement the classic ping diagnostic, which uses ICMP echo request and reply packets to estimate round-trip time (RTT) and packet loss to a host.

Ping is a simple and well-known diagnostic tool that gives a reasonably good view of network latency and packet loss in the general case. However, it may not accurately reflect application performance, since ICMP traffic can be handled differently from typical application protocols such as TCP and UDP.
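To make the RTT and packet loss estimate concrete, the sketch below reduces a run of ping results to a small summary. It uses only the standard library and is not part of netdiag; the Summary type and the summarize function are hypothetical names chosen for illustration, and the input is assumed to be the Option<Duration> values collected from the stream shown earlier.

```rust
use std::time::Duration;

/// Hypothetical summary of a run of ping results (not a netdiag type).
#[derive(Debug, PartialEq)]
pub struct Summary {
    pub sent: usize,
    pub lost: usize,
    pub min: Option<Duration>,
    pub max: Option<Duration>,
    pub mean: Option<Duration>,
}

impl Summary {
    /// Packet loss as a percentage of probes sent (0.0 when nothing was sent).
    pub fn loss_percent(&self) -> f64 {
        if self.sent == 0 {
            0.0
        } else {
            100.0 * self.lost as f64 / self.sent as f64
        }
    }
}

/// Summarize results where `Some(rtt)` is a reply and `None` is a timeout.
pub fn summarize(results: &[Option<Duration>]) -> Summary {
    // Keep only the probes that received a reply.
    let rtts: Vec<Duration> = results.iter().flatten().copied().collect();
    let sent = results.len();
    let lost = sent - rtts.len();
    let mean = if rtts.is_empty() {
        None
    } else {
        Some(rtts.iter().sum::<Duration>() / rtts.len() as u32)
    };
    Summary {
        sent,
        lost,
        min: rtts.iter().min().copied(),
        max: rtts.iter().max().copied(),
        mean,
    }
}
```

Timeouts count toward loss but are excluded from the RTT statistics, so a run with no replies reports 100% loss and no minimum, maximum, or mean.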

Trace

Tracer and the trace module implement the classic traceroute diagnostic, which sends UDP or TCP packets with an increasing time-to-live (TTL) to determine each hop in the route between source and destination. Examining latency and packet loss for each node in a route can help pinpoint network performance issues.

The netdiag trace implementation discovers multiple paths between source and destination by sending multiple probes per hop while varying the packet header. UDP probes increment the destination port for each probe, while TCP probes increment the sequence number instead, which allows targeting a specific destination port.
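The header variation described above can be sketched as a simple probe plan. This is an illustration only, not netdiag's implementation: the Protocol and Probe types and the plan function are hypothetical, and whether the real crate keeps one counter across hops or restarts it per hop is an assumption made here for simplicity.

```rust
/// Hypothetical probe configuration (not a netdiag type).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Protocol {
    /// UDP probes start at `base_port` and vary the destination port.
    Udp { base_port: u16 },
    /// TCP probes keep `dst_port` fixed and vary the sequence number.
    Tcp { dst_port: u16, base_seq: u32 },
}

/// A single probe to send, described by the header fields that vary.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Probe {
    Udp { ttl: u8, dst_port: u16 },
    Tcp { ttl: u8, dst_port: u16, seq: u32 },
}

/// Plan `per_hop` probes for each TTL from 1 to `max_ttl`.
/// Assumption: a single counter runs across all hops.
pub fn plan(proto: Protocol, max_ttl: u8, per_hop: u8) -> Vec<Probe> {
    let mut probes = Vec::new();
    let mut n: u32 = 0;
    for ttl in 1..=max_ttl {
        for _ in 0..per_hop {
            let probe = match proto {
                Protocol::Udp { base_port } => Probe::Udp {
                    ttl,
                    // Truncating the counter wraps within the 16-bit port range.
                    dst_port: base_port.wrapping_add(n as u16),
                },
                Protocol::Tcp { dst_port, base_seq } => Probe::Tcp {
                    ttl,
                    dst_port,
                    seq: base_seq.wrapping_add(n),
                },
            };
            probes.push(probe);
            n += 1;
        }
    }
    probes
}
```

For example, plan(Protocol::Udp { base_port: 33434 }, 2, 2) yields four probes with destination ports 33434 through 33437, where 33434 is the conventional traceroute base port. The TCP variant leaves the destination port untouched, which is what lets a trace target a specific service port such as 443.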

Unix implementations of traceroute usually default to sending UDP probes. However, networks may block, rate-limit, or otherwise alter the profile of UDP traffic, and TCP probes can bypass some or all of these issues.

Knock

Knocker and the knock module implement a custom diagnostic which performs a partial TCP handshake to estimate RTT and packet loss. Knock can provide a more accurate view of application performance since most application traffic uses TCP and networks often block or rate-limit ICMP traffic.
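As a rough illustration of what a partial handshake can observe, the sketch below classifies a TCP segment received in response to a SYN probe by its flag bits. It is not netdiag's code: the Reply type and the classify function are hypothetical, and treating a reset as a response that still yields an RTT sample is an assumption about how such a diagnostic might behave.

```rust
// TCP flag bits, found in byte 13 of the TCP header.
pub const SYN: u8 = 0x02;
pub const RST: u8 = 0x04;
pub const ACK: u8 = 0x10;

/// Hypothetical classification of a reply to a SYN probe.
#[derive(Debug, PartialEq)]
pub enum Reply {
    /// SYN+ACK: the port is open and the peer answered the handshake.
    SynAck,
    /// RST: the peer answered but refused the connection.
    Reset,
    /// Anything else is not a direct answer to the probe.
    Other,
}

/// Classify a raw TCP segment, or return None if it is too short
/// to contain the flags byte.
pub fn classify(segment: &[u8]) -> Option<Reply> {
    let flags = *segment.get(13)?;
    let reply = if flags & RST != 0 {
        Reply::Reset
    } else if (flags & (SYN | ACK)) == (SYN | ACK) {
        Reply::SynAck
    } else {
        Reply::Other
    };
    Some(reply)
}
```

Either a SynAck or a Reset shows that a packet made the round trip, so the time between sending the SYN and receiving the reply can serve as an RTT sample, while a probe with no reply before its expiry counts as lost.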

IPv4 & IPv6

IPv4 and IPv6 support for ping is relatively simple. ICMP raw sockets do not need the IP header when sending, and the ICMPv4 and ICMPv6 echo packet formats are identical. However, the ICMP type differs, the ICMP checksum must be calculated for ICMPv4, and an IPv4 raw socket receives the IP header while an IPv6 raw socket does not.
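The differences listed above can be seen in a minimal echo request encoder. This is a standard-library sketch rather than netdiag's own code: the Version enum and the echo_request function are hypothetical names, and leaving the ICMPv6 checksum at zero assumes the kernel fills it in for ICMPv6 raw sockets.

```rust
/// IP version for the echo request (hypothetical helper type).
#[derive(Debug, Clone, Copy)]
pub enum Version {
    V4,
    V6,
}

/// Internet checksum (RFC 1071): one's complement of the one's
/// complement sum of the data taken as big-endian 16-bit words.
pub fn checksum(data: &[u8]) -> u16 {
    let mut sum: u32 = 0;
    for chunk in data.chunks(2) {
        // A trailing odd byte is padded with a zero byte.
        let low = if chunk.len() == 2 { chunk[1] } else { 0 };
        sum += u32::from(u16::from_be_bytes([chunk[0], low]));
    }
    // Fold the carries back into the low 16 bits.
    while sum >> 16 != 0 {
        sum = (sum & 0xffff) + (sum >> 16);
    }
    !(sum as u16)
}

/// Encode an ICMP echo request: type, code, checksum, id, seq, payload.
pub fn echo_request(version: Version, id: u16, seq: u16, payload: &[u8]) -> Vec<u8> {
    // Echo request is type 8 in ICMPv4 and type 128 in ICMPv6.
    let kind: u8 = match version {
        Version::V4 => 8,
        Version::V6 => 128,
    };
    let mut pkt = Vec::with_capacity(8 + payload.len());
    pkt.push(kind);
    pkt.push(0); // code
    pkt.extend_from_slice(&[0, 0]); // checksum placeholder
    pkt.extend_from_slice(&id.to_be_bytes());
    pkt.extend_from_slice(&seq.to_be_bytes());
    pkt.extend_from_slice(payload);
    // Only ICMPv4 needs the checksum computed here (assumption: the
    // kernel computes it for ICMPv6 raw sockets).
    if let Version::V4 = version {
        let sum = checksum(&pkt);
        pkt[2..4].copy_from_slice(&sum.to_be_bytes());
    }
    pkt
}
```

A useful sanity check is that recomputing the checksum over a finished ICMPv4 packet gives zero, which is also how a receiver can validate it.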

IPv4 and IPv6 support for knock and trace is more complicated. Sending and receiving TCP and UDP packets with an IPv4 raw socket requires encoding and decoding the IP header. With IPv6 raw sockets, the kernel manages the IP header, and ancillary data is used to send and receive IP fields such as hop limit and destination address.
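On the receive side, the IPv4 case means stripping the header before the TCP, UDP, or ICMP payload can be examined. The following is a minimal sketch of that decoding step, assuming a buffer that begins with the IPv4 header; Ipv4Header and split_ipv4 are hypothetical names, not netdiag or raw-socket APIs.

```rust
use std::net::Ipv4Addr;

/// The IPv4 header fields a diagnostic typically cares about
/// (hypothetical type, not from netdiag).
#[derive(Debug, PartialEq)]
pub struct Ipv4Header {
    pub ttl: u8,
    pub protocol: u8,
    pub src: Ipv4Addr,
    pub dst: Ipv4Addr,
}

/// Split a packet read from an IPv4 raw socket into its header and
/// payload. Returns None if the buffer is not a plausible IPv4 packet.
pub fn split_ipv4(packet: &[u8]) -> Option<(Ipv4Header, &[u8])> {
    let first = *packet.first()?;
    // The high nibble is the IP version.
    if first >> 4 != 4 {
        return None;
    }
    // The low nibble is the header length in 32-bit words, so options
    // can make the header longer than the 20-byte minimum.
    let header_len = usize::from(first & 0x0f) * 4;
    if header_len < 20 || packet.len() < header_len {
        return None;
    }
    let header = Ipv4Header {
        ttl: packet[8],
        protocol: packet[9],
        src: Ipv4Addr::new(packet[12], packet[13], packet[14], packet[15]),
        dst: Ipv4Addr::new(packet[16], packet[17], packet[18], packet[19]),
    };
    Some((header, &packet[header_len..]))
}
```

With IPv6 raw sockets this step disappears, since the payload arrives without the header and fields such as the hop limit come from ancillary data instead.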

The raw-socket crate supports ancillary data via CMsg and the control buffer passed to the send_msg & recv_msg methods of RawSocket.

Rust at Kentik

Rust is one of the core languages used for systems development at Kentik. We rely on a significant amount of open-source software and are happy to be able to contribute back to that community. Our netdiag crate is in active use in hundreds of locations worldwide, powering tens of thousands of active diagnostic tasks, and we hope others find it useful too. And if you are a programmer interested in Rust or any of the topics covered in this post, we’re hiring!