The Reality of NDN Video Streaming

As of 2019, video accounted for over 60% of downstream traffic on the Internet. It is believed that video streaming could benefit from the in-network caching feature of Named Data Networking (NDN), which would reduce the total traffic volume and bring cost savings for Internet service providers and content publishers. Far Cry: Will CDNs Hear NDN's Call?, a paper published at the ACM-ICN 2020 conference, is the latest attempt at NDN video streaming.

How iViSA Works

In Far Cry, the authors implemented iViSA, a browser-based video streaming application that runs on the global NDN testbed, and then compared it with a similar HTTP video streaming application deployed on commercial CDN services.

It's said that if you want reproducible science, the software needs to be open source. The authors released most of their source code, and we can get a peek into how iViSA actually works.

The backend server repository contains:

  • An NDN producer that serves Data packets from a MongoDB database, with a simple version discovery logic.

  • Scripts to encode and prepare videos using ffmpeg and Shaka Packager.

    • An input video is encoded into five different resolutions from 240p to 1080p using the x264 codec. Each resolution has a fixed bitrate, ranging from 300 Kbps to 6000 Kbps.
    • Then, the encoded videos are packaged in HTTP Live Streaming (HLS) format for adaptive streaming.
  • A small program that converts the files generated by Shaka Packager into NDN Data packets and inserts them into the MongoDB database.

The web application is built with:

  • ndn-js, the original NDN JavaScript library.

  • Shaka Player, a JavaScript library to play back HLS content.

  • A plugin that integrates with Shaka Player's NetworkEngine component.

    • When Shaka Player wants to retrieve a file, the plugin intercepts the file request so that Shaka Player does not send an HTTP request.
    • The plugin invokes ndn-js SegmentedFetcher to retrieve a segmented object from the NDN testbed.
    • Browser's Cache API is used to store recently retrieved files, but the plugin does not properly handle cache eviction.

Additionally, the web application can collect statistics regarding file retrieval and video playback. The statistics are sent as NDN Interests, and there is a producer program that answers such Interests with empty Data packets, and writes a log file.

Video Streaming with NDNts

When I saw an early demo of iViSA and read the code, I was both impressed by the simplicity of this solution (compared to NDN-RTC that only works on macOS) and baffled by the implementation details. As I'm developing NDNts - NDN libraries for the modern web, I started my edition of a frontend web application that can play videos from iViSA servers, which provided a way to test and improve the congestion control algorithm in the @ndn/segmented-object package.

I made a few different design decisions:

  • File-level caching:

    • iViSA: browser Cache API.
    • NDNts-video: none.
  • Congestion control algorithm:

    • iViSA: each file is retrieved with a separate instance of RTT estimator and CUBIC congestion control state.
    • NDNts-video: all files of the same video are retrieved with the same instance of RTT estimator and CUBIC congestion control state.
  • Segment count estimation:

    • iViSA: number of Data segments in a file does not affect subsequent retrievals.
    • NDNts-video: number of Data segments in each file is estimated from recent retrievals of similar files (e.g. video clip of the same resolution), which is used to set the initial fetcher pipeline size.
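The segment count estimation idea can be sketched in a few lines of Python. This is a simplified illustration, not the actual NDNts-video code: the class name, the history length, the fallback value, and the choice of a median are all assumptions made for the example.

```python
from collections import defaultdict, deque
from statistics import median


class SegmentCountEstimator:
    """Remember recent segment counts per file category (hypothetical helper)."""

    def __init__(self, history: int = 5, default: int = 1) -> None:
        # 'history' and 'default' are illustrative values, not taken from NDNts.
        self.default = default
        self._recent = defaultdict(lambda: deque(maxlen=history))

    def record(self, category: str, segment_count: int) -> None:
        """Store the segment count of a file that was just retrieved."""
        self._recent[category].append(segment_count)

    def estimate(self, category: str) -> int:
        """Guess the segment count of the next file in the same category."""
        samples = self._recent.get(category)
        if not samples:
            return self.default  # nothing known yet, start conservatively
        return int(median(samples))


estimator = SegmentCountEstimator()
for count in (40, 44, 42):
    estimator.record("video-720p", count)
print(estimator.estimate("video-720p"))  # estimate for the next 720p clip
print(estimator.estimate("video-240p"))  # no history, falls back to default
```

The estimate would then be used as the initial fetcher pipeline size, so that a large file does not have to ramp up from a tiny window every time.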

Over a period of several months, I tested my video player in various locations including my apartment, my office, restaurants, and cellular networks. I made many adjustments to the congestion control parameters, and fixed several bugs in the algorithm.

"NDN push-ups"

It's boring to watch the educational videos on iViSA over and over, so I wanted to publish my own content. I made a producer program with NDNts, using the "embedded repo" functionality of the @ndn/repo package. Then, I figured out how to use ffmpeg and Shaka Packager to encode and prepare my own videos. I also made scripts to download iViSA content and replicate their videos into my repository server, so that I could continue testing my web application when they were experiencing downtime.

The original videos that I publish on my repository servers are me doing push-ups. I fell in love with push-ups after completing the #22pushups challenge in 2016 on my YouTube channel. Moreover, inspired by Smosh's KISS CURRENCY, I would take "payment" in the form of push-ups. I recorded a few new push-up videos, along with some push-ups I received in place of a hackathon prize, and launched the "NDN push-ups" website.

Then, I posted links to this website in various forums. For example, on LowEndTalk, a forum about VPS hosting, I wrote:

Push-ups solve all the problems. Serve push-up videos from all your VPS. Never leave a VPS idle again!


Google Analytics on the website logged 184 Unique PageViews within the first week. However, the initial version of the website didn't have statistics collection functionality like iViSA does, so I had no insight into how many viewers watched the video and how well the application was performing, apart from occasional compliments and complaints from forum members. After all, "NDN push-ups" was meant to be a meme, not a science project.

Seeing the traffic numbers from Google Analytics, I saw an opportunity: I could let my viewers help with testing my web application. Thus, I started adding statistics collection functionality, and launched a new version just before New Year's Day.

My implementation differs from iViSA in a few ways:

  • Video format and codec:

    • iViSA: HLS; x264; five resolutions (up to 1080p).
    • NDNts-video: DASH; I prefer VP9, but had to use x264 for some videos due to audio problems; four resolutions (up to 720p).
  • Server deployment:

    • iViSA: one dedicated server in Arizona, connected to a testbed router with a low-latency link.
    • NDNts-video: two KVM servers in Tokyo (Oracle Cloud) and Buffalo (VirMach), connected to nearby testbed routers with ~10ms ping RTT.
    • iViSA mirror: if the main iViSA producer is down, I would activate my mirror in Buffalo.
  • Statistics report timing:

    • iViSA: after every file retrieval.
    • NDNts-video: after every file retrieval, and every 5 seconds during playback.
  • Statistics submission method:

    • iViSA: NDN Interests; those Interests use a different routing announcement from video content, to reduce impact on forwarding strategy behavior.
    • NDNts-video: HTTP requests using browser's Beacon API.

I Have Viewers Around the World

I continued trolling on various forums with "NDN push-ups" in January. I didn't receive as much traffic: Google Analytics logged 179 Unique PageViews on this website, plus 41 on another page that plays iViSA content using my frontend web application. Nevertheless, I got a log file about video retrieval and playback performance throughout January.

There are 38209 entries in this log file, consisting of the following types:

  • successful file retrieval: 33967
  • failed file retrieval: 860
  • video playback (every 5 seconds): 3382

My frontend web application generates an 8-octet session ID on every page load, which is reported as part of every log entry. I wrote a Python script to collect log entries with the same session ID together.
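Grouping log entries by session ID could look roughly like the sketch below. It is not the actual script used for this article: it assumes the log is stored as one JSON object per line, and the field names "sess" and "type" are hypothetical.

```python
import json
from collections import defaultdict


def group_by_session(lines):
    """Group JSON log lines by session ID (field name 'sess' is assumed)."""
    sessions = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        entry = json.loads(line)
        sessions[entry["sess"]].append(entry)
    return dict(sessions)


# Made-up sample entries, for illustration only.
sample_log = [
    '{"sess": "a1b2c3d4e5f60718", "type": "fetch"}',
    '{"sess": "a1b2c3d4e5f60718", "type": "playback"}',
    '{"sess": "0011223344556677", "type": "fetch"}',
]
for session_id, entries in group_by_session(sample_log).items():
    print(session_id, len(entries))
```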

There are 433 total sessions. Among these sessions:

  • 26 sessions didn't play any video for at least 5 seconds
  • 301 sessions played 1 video
  • 78 sessions played 2 videos
  • 28 sessions played 3 or more videos

My beacon server stores the client IPv4 address (anonymized to /24 subnet) with each log entry. I cross-referenced these IP addresses in the MaxMind GeoLite2 database to determine viewer locations. I have viewers all around the world:

continent     | session count | top country (sessions)
Africa        | 5             | Morocco (4)
Antarctica    | 0             | 🤪
Asia          | 145           | China (26)
Europe        | 174           | Germany (38)
North America | 85            | USA (65)
Oceania       | 11            | Australia (9)
South America | 13            | Brazil (6)
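The /24 anonymization of client addresses mentioned above can be illustrated with Python's standard ipaddress module. This is only a sketch of the idea, not the beacon server's actual code. The function name is hypothetical and the sample address comes from a documentation-only range.

```python
import ipaddress


def anonymize_ipv4(addr: str) -> str:
    """Zero the last octet of an IPv4 address, keeping only its /24 subnet."""
    network = ipaddress.ip_network(f"{addr}/24", strict=False)
    return str(network.network_address)


print(anonymize_ipv4("203.0.113.77"))  # prints "203.0.113.0"
```

The anonymized address remains usable for an approximate geolocation lookup while no longer identifying an individual host.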

Playback Duration and Video Resolution

Every 5 seconds, my web application retrieves several counters from Shaka Player, including the current video resolution, and reports this information to the beacon server. There were 3382 video playback log entries, representing 16910 seconds (4.7 hours) of total playback time.

There were 554 total playbacks (i.e. videos played during sessions). I analyzed the duration of these playbacks:

  • 420 playbacks lasted up to 30 seconds.
  • 99 playbacks lasted between 30 and 60 seconds.
  • 23 playbacks lasted between 60 and 120 seconds.
  • 12 playbacks lasted more than 120 seconds.
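Since each playback report covers 5 seconds, both the total playback time and the duration buckets can be derived from report counts; for example, 3382 reports × 5 s = 16910 s, or about 4.7 hours. The sketch below shows one way to do this. It assumes a playback's duration is the number of reports times 5 seconds and that bucket boundaries are inclusive on the upper end; neither detail is taken from the actual analysis scripts.

```python
from collections import Counter

REPORT_INTERVAL_S = 5  # the player reports once every 5 seconds


def duration_bucket(report_count: int) -> str:
    """Map a playback's number of reports to a duration bucket."""
    seconds = report_count * REPORT_INTERVAL_S
    if seconds <= 30:
        return "up to 30 s"
    if seconds <= 60:
        return "30 to 60 s"
    if seconds <= 120:
        return "60 to 120 s"
    return "more than 120 s"


def summarize(report_counts):
    """Return (total playback seconds, Counter of duration buckets)."""
    total = sum(report_counts) * REPORT_INTERVAL_S
    return total, Counter(duration_bucket(n) for n in report_counts)


# Made-up report counts for five playbacks, for illustration only.
total_seconds, buckets = summarize([2, 6, 9, 20, 30])
print(total_seconds, dict(buckets))
```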

Then, I looked at the video resolution selected by Shaka Player in every 5-second interval:

From the chart, we can see that, while some users were able to watch videos in HD 720p resolution, the majority of users were stuck with the lowest 240p resolution. Moreover, the percentage of HD resolution decreases as playback progresses, which indicates that Shaka Player switched to a lower resolution after failing to keep up with the initially selected HD resolution.

Note that my "NDN push-ups" site does not have 1080p content, so the highest available resolution is 720p. There are a small number of 1080p log entries in the chart, because the viewer could select an educational title from iViSA video repository that publishes 1080p content.

The resolution null is recorded when the viewer has pressed the play button, but content has not been retrieved. In this case, Shaka Player would remain in "buffering" state. Typically, file retrieval should not last more than 5 seconds (when my web application sends the first playback report). Thus, this represents either a testbed connectivity issue or a server downtime.

Cross-referencing with MaxMind data, I can count the video resolution for viewers from each continent. In this chart, the line represents the percentage of time waiting for content (null resolution); the stacked bars represent the percentage of time playing each resolution (excluding null).

We can see that:

  • European viewers were able to get 720p for 37% of the time, followed by North American viewers at 21%.
  • In both Europe and North America, videos were playing at 240p for 36% of the time.
  • Asian viewers were experiencing failures (null resolution) more than half of the time.
  • More than half of successful playing times in Asia were at 240p resolution.
  • Although the chart includes viewers from Africa, Oceania, and South America, there isn't enough data to draw meaningful conclusions.

iViSA has Better Video Resolution?

The Far Cry paper reported unbelievably high video resolution numbers for iViSA. According to figure 2 in their paper, 100% of North American viewers, as well as more than 95% of European and South American viewers, were able to play 1080p content, while 80% of Asian viewers could play at 1080p resolution.

The video resolution numbers of my application are significantly worse than what's reported in their paper. Initially, I thought something was wrong with my implementation of the congestion control algorithm. However, after fixing several bugs (done before data collection), the difference persisted.

Then, I tested the iViSA website on my computers and phones, connected to various networks. Although iViSA does not display the current video resolution, I can tell from picture quality that the videos are playing at 360p or lower resolution most of the time, and I can rarely get 720p or better content. I am located in North America, but I am definitely not getting 100% full HD as the paper claims.

A careful read of the Far Cry paper reveals the problem. Their experiment setup was described as:

End-user virtual machines are placed in five different locations.

We run a headless video player on the end-users for two weeks.

When I described my differing user experience to the iViSA authors, I was told that they were using "Amazon data centers" to emulate end users.

As we know, data centers generally have better Internet connectivity than residential and business connections. However, almost no real user would be watching videos via a data center connection. Therefore, evaluating video streaming quality with emulated viewers deployed in data centers would yield unrealistic results. Instead, quality of experience evaluation must be conducted where the eyeballs are - on residential and business Internet connections.

Since my "NDN push-ups" website is being visited by real Internet users around the world, I believe that the data collected from my application is a better representation of video streaming quality over the NDN testbed.

Router Distance

As I have argued, video streaming quality evaluation should be conducted on residential and business Internet connections. Nevertheless, despite the unrealistic numbers, several conclusions in the Far Cry paper are still valid, such as:

A clear example of this observation is a high startup delay in South America due to the lack of testbed infrastructure in that area. The closest NDN surrogate to the end users in South America was 148ms away.

Both iViSA and NDNts-video utilize the NDN-FCH service to establish a connection to the NDN testbed network. In both applications, the browser asks NDN-FCH service for a list of nearby NDN routers (determined by IP geolocation) that accept WebSocket connections, connects to these routers, sends an Interest to each router to measure its roundtrip time, and then selects the fastest connection.
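The "probe each candidate and keep the fastest" step can be sketched as follows. This is a deliberately simplified illustration: the real applications probe routers over WebSocket connections in the browser, whereas here the probe is an abstract callback and routers are tested one by one. The function name, hostnames, and RTT values are all made up.

```python
from typing import Callable, Iterable, Optional


def select_fastest_router(
    routers: Iterable[str],
    probe: Callable[[str], Optional[float]],
) -> Optional[str]:
    """Return the router with the lowest measured RTT, or None if all fail.

    'probe' returns the round-trip time in milliseconds, or None when the
    router did not answer the test Interest.
    """
    best_router = None
    best_rtt = float("inf")
    for router in routers:
        rtt = probe(router)
        if rtt is None:
            continue  # unreachable router, ignore it
        if rtt < best_rtt:
            best_router, best_rtt = router, rtt
    return best_router


# Usage with made-up hostnames and RTT values (milliseconds).
fake_rtts = {
    "router-a.example": 48.0,
    "router-b.example": 21.5,
    "router-c.example": None,  # did not respond
}
print(select_fastest_router(fake_rtts, fake_rtts.get))  # prints "router-b.example"
```

Note that the outcome depends entirely on which candidates are in the list: a router that is not suggested by NDN-FCH is never probed, no matter how fast it would be.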

Using the data I have, I can visualize the correlation between router distance and video resolution. In this chart, each symbol represents a session that played at least one video.

  • Horizontal position indicates the geographical distance between the viewer and the router.

    • Viewer locations are obtained from MaxMind GeoLite2 database.
      • 36 of 433 sessions do not have city-level geolocation, but only have country-level accuracy. Most of these are cellular networks, according to their Autonomous System (AS) numbers.
    • Router locations are extracted from testbed-nodes.json.
  • Vertical position indicates the average estimated bandwidth calculated by Shaka Player.

  • Symbol color indicates the median video resolution.

    • If playback could not start successfully, the resolution would be null and the estimated bandwidth would be zero.
  • Symbol shape indicates the viewer's continent.
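Given two latitude/longitude pairs, the geographical distance can be computed with the haversine (great-circle) formula. The sketch below assumes a spherical Earth with a mean radius of 6371 km; the article does not state which distance formula the actual analysis uses, so treat this as one reasonable choice rather than the exact method.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometers between two points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# A quarter of the way around the equator, roughly 10008 km.
print(round(haversine_km(0.0, 0.0, 0.0, 90.0)))
```

Keep in mind that the viewer's position comes from IP geolocation, so the computed distance is only as accurate as the geolocation database entry.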

From the chart, we can see that:

  • Having higher estimated bandwidth allows higher video resolutions.

    • This is a direct result of how adaptive video playback works in Shaka Player.
    • Typically, Shaka Player would select 720p if estimated bandwidth is above 3500 Kbps.
  • Most viewers were using a router near them.

    • 194 sessions (45%) used a router no more than 1000 KM away.
    • 279 sessions (64%) used a router no more than 2000 KM away.
  • Higher bandwidth and resolution can be achieved only if the router is near the viewer.

  • However, connecting to a nearby router does not imply a high resolution.

    • Among the 173 sessions that had a successful playback using a router within 1000 KM, 78 sessions (45%) were playing at 360p or worse.

Startup Delay

Startup delay is another quality of experience metric in video streaming. Before Shaka Player can start playing a video, it must retrieve a metadata file that describes the video format as well as a few initial audio and video segments. During this period, the viewer would be looking at a blank screen, (im)patiently waiting for my awesome push-ups to appear. Startup delay measures how long it takes to retrieve those critical files in order to start video playback, and is reported by Shaka Player only if the retrieval was successful.

This chart shows the cumulative distribution function (CDF) of startup delay in each continent. We can see that:

  • European and North American viewers experienced similar startup delays, with the median around 900ms.
  • Asian viewers experienced significantly higher startup delays, with the median around 1600ms.
  • Looking at the 80th percentile, North America has lower startup delay than Europe. One explanation is that the initial retrieval includes a version discovery step via the RDR protocol, which requires a fresh response from the producer, and I have a producer in North America but none in Europe.
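For readers unfamiliar with CDFs, the median and 80th percentile are simply points read off the sorted samples. The sketch below computes an empirical CDF and nearest-rank percentiles. The sample delays are made up for illustration, and the nearest-rank method is an assumption; the charts in this article may have been produced differently.

```python
import math


def empirical_cdf(samples):
    """Return (value, cumulative fraction) pairs for the sorted samples."""
    ordered = sorted(samples)
    n = len(ordered)
    return [(value, (i + 1) / n) for i, value in enumerate(ordered)]


def percentile(samples, p: int):
    """Nearest-rank percentile: smallest value with at least p% of samples at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


# Made-up startup delays in milliseconds, not real measurements.
delays_ms = [900, 700, 1600, 1200, 800]
print(percentile(delays_ms, 50))  # prints 900
print(percentile(delays_ms, 80))  # prints 1200
```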

Asian Optimization?

From the previous charts, we can see that the video playback experience is much better in Europe and North America than in Asia. Correlating with this "router distance" chart, we can see that Asian viewers tend to use a router farther away (average 2948 KM) than European and North American viewers (average 1071 KM and 1073 KM).

There are two reasons:

  • There are fewer NDN routers in Asia than in Europe and North America.
  • Regional connectivity between countries in Asia is worse than in Europe.

Bandwidth pricing is much higher in Asia than in Europe and North America. In the case of mainland China, connecting to a nearby country is often worse than connecting to USA or Germany. However, the geo-nearest logic of NDN-FCH service could not capture such information. It works well in Europe and North America where ISPs have multiple peering points, so that geographical distance is a good estimation of network latency. It does not work well in China, and would break down in Africa where local peering is non-existent.

Although I have speed tests in the web application, they only consider the top 4 results returned by NDN-FCH, and thus provide limited benefits. It is infeasible to do many more speed tests due to the browser's limit on the number of concurrent WebSockets.

A potential Asian optimization is to select a random set of routers and perform speed tests, and then report the results to an improved version of NDN-FCH server. After that, the server could use machine learning techniques to analyze this crowd-sourced data, and give better recommendations to future clients.

Conclusion

This article describes my edition of NDN video streaming in the "NDN push-ups" website. Using the real world data collected during January 2021, I analyzed quality of experience metrics such as video resolution and startup delay. I noted that my results are significantly worse than what's reported in the Far Cry paper about the iViSA application, which is likely due to a different experimental methodology. I also proposed a potential optimization to improve video streaming quality for Asian viewers.

Although this is not a scientific publication, raw data and scripts in this article are available as a GitHub Gist. If you find this article interesting, please do a few push-ups in my honor, cheers!