This is another blog post that is long overdue. I’ve written a lot about Live Smooth Streaming in my past blog posts but so far very little about On-Demand Smooth Streaming. That is partly because getting On-Demand Smooth Streaming to work is quite simple: you just copy your content to the server and everything should just work. For low-volume usage, that’s pretty much all you need to care about. But if you are running a serious production system with a large content library and significant client traffic, you will want to tune the system for the best performance, in which case this blog post should be helpful to you.
First let’s take a quick look at the typical server logic for serving On-Demand Smooth Streaming requests:
Figure 1. On-Demand Smooth Streaming Server Data Flow
- Client issues a request to get the client manifest (.ISMC file in the diagram). From the manifest file, client figures out which bitrate and timestamp it wants to request from the server.
- Client issues a fragment request as shown in the diagram (first video fragment at bitrate 300kbps).
- Using the server manifest file, server maps the incoming request to a particular ISMV file that contains the requested bitrate.
- If it’s the first time that the server serves any fragment request from this file, server would need to first locate the “index” data at the end of the file which is based on ISO Fragmented MP4 file format.
- From the index, server locates the corresponding fragment in the file and sends it back to the client.
Note: This describes a typical data flow for Smooth Streaming requests. Apple HTTP Live Streaming (HLS) request handling in IIS Media Services follows a similar data flow but is based on a different file format (MPEG2-TS).
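The steps above can be sketched in a few lines of Python. This is only an illustration of the mapping logic, not the actual IIS Media Services code: the function names, the `SERVER_MANIFEST` and `INDEX_CACHE` dictionaries, and the file name, offset, and size in them are all made up for the example. The only thing taken from Smooth Streaming itself is the `QualityLevels(...)/Fragments(...)` URL pattern.

```python
import re
from typing import Dict, NamedTuple, Tuple


class FragmentRequest(NamedTuple):
    bitrate: int    # bits per second, from QualityLevels(...)
    track: str      # track name, e.g. "video"
    timestamp: int  # fragment start time, from Fragments(track=...)


# Fragment URLs look like:
#   /<name>.ism/QualityLevels(<bitrate>)/Fragments(<track>=<timestamp>)
_FRAGMENT_URL = re.compile(
    r"/(?P<presentation>[^/]+\.ism)/QualityLevels\((?P<bitrate>\d+)\)"
    r"/Fragments\((?P<track>\w+)=(?P<timestamp>\d+)\)$",
    re.IGNORECASE,
)

# Hypothetical stand-ins for the parsed server manifest and the cached index.
SERVER_MANIFEST: Dict[Tuple[str, int], str] = {
    ("video", 300000): "movie_300.ismv",
}
INDEX_CACHE: Dict[str, Dict[Tuple[str, int], Tuple[int, int]]] = {
    "movie_300.ismv": {("video", 0): (1024, 75000)},  # (byte offset, size)
}


def parse_fragment_url(path: str) -> Tuple[str, FragmentRequest]:
    """Split a fragment URL into the presentation name and the request."""
    match = _FRAGMENT_URL.search(path)
    if match is None:
        raise ValueError(f"not a fragment request: {path}")
    request = FragmentRequest(
        int(match["bitrate"]), match["track"], int(match["timestamp"])
    )
    return match["presentation"], request


def locate_fragment(request, server_manifest, index_cache):
    """Map a request to (ISMV file, byte offset, size) using manifest + index."""
    ismv = server_manifest[(request.track, request.bitrate)]
    offset, size = index_cache[ismv][(request.track, request.timestamp)]
    return ismv, offset, size


name, req = parse_fragment_url(
    "/movie.ism/QualityLevels(300000)/Fragments(video=0)"
)
print(name, locate_fragment(req, SERVER_MANIFEST, INDEX_CACHE))
```

Once the index is in memory, serving a fragment is a dictionary lookup followed by a single read at a known offset, which is why the disk read ends up dominating the cost.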
As you can see, the server logic is quite simple. For the most part, all it does is map incoming requests to sections of the media file and then serve them out. For most On-Demand Smooth Streaming deployments, the performance bottleneck is disk I/O, because the server needs to read a lot of media fragments and its logic is mostly disk reads with minimal processing overhead. If we drill down into the disk I/O, in most cases the majority of disk time is spent moving the disk head to locate the next fragment. This is especially true if you have a CDN that absorbs most of the redundant requests, which means the requests reaching the IIS Media server are mostly random. As you may know, random reads are much slower than sequential reads on spindle-based disk storage.
So naturally, the performance tuning of On-Demand Smooth Streaming server is going to be mostly disk I/O related.
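To get a feel for why seeking dominates, here is a back-of-the-envelope estimate in Python. The numbers are illustrative assumptions, not measurements: a 2-second fragment at 300 kbps, a 10 ms average seek, and 100 MB/s of sequential throughput. The function name is made up for this example.

```python
def seek_share(bitrate_bps, fragment_seconds, seek_ms, sequential_mb_per_s):
    """Fraction of per-fragment disk time spent seeking rather than reading."""
    fragment_bytes = bitrate_bps * fragment_seconds / 8
    transfer_ms = fragment_bytes / (sequential_mb_per_s * 1_000_000) * 1000
    return seek_ms / (seek_ms + transfer_ms)


# Assumed values: 300 kbps, 2 s fragments, 10 ms seek, 100 MB/s sequential.
share = seek_share(300_000, 2.0, 10.0, 100.0)
print(f"{share:.0%} of disk time is spent seeking")
```

With these assumed values the fragment itself takes well under a millisecond to read, so almost all of the disk time goes to the seek. Higher bitrates shift the balance somewhat, but the seek cost stays significant for small random reads.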
2. Caching, Caching, Caching
How do we improve the performance for On-Demand Smooth Streaming? – By doing lots of caching. We try to cache everything that could be cached to reduce disk I/O cost and improve performance. Here are the things that the server tries to cache:
- Manifest Files – These XML based files are typically small so the server would try to cache the entire file in memory after reading it.
- File Handles – Opening/closing file handles could be an expensive operation when the system is under load. The server can optionally cache all the opened file handles so that it can reuse them for future I/O reads. There are a couple of configuration settings that are related to file caching.
- Critical information in ISMV Files – This includes the “Index” information at the end of each ISMV file as well as some specific information about each fragment. Note that it does NOT include the actual media data in all the fragments which is much bigger in size.
- Media Content – This is the actual media data in all the fragments. Server tries to leverage Windows OS’s file cache to effectively cache those data and manage the lifetime of the caches.
So for the most part, performance tuning for On-Demand Smooth Streaming is all about tweaking these caching behaviors through the various configuration settings discussed below.
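As an illustration of the file handle caching idea, here is a minimal Python sketch of a handle cache with an optional count limit and LRU (Least Recently Used) eviction, the policy the settings table below describes for a limited handle cache. The class and method names are hypothetical, and `io.BytesIO` stands in for real file handles so the example runs without any media files.

```python
import io
from collections import OrderedDict


class HandleCache:
    """Keeps opened handles for reuse; evicts the least recently used one."""

    def __init__(self, limit=0, opener=open):
        self.limit = limit            # 0 means no limit
        self._opener = opener
        self._handles = OrderedDict() # oldest entry first

    def get(self, path):
        if path in self._handles:
            # Cache hit: mark this handle as most recently used.
            self._handles.move_to_end(path)
            return self._handles[path]
        handle = self._opener(path)
        self._handles[path] = handle
        if self.limit and len(self._handles) > self.limit:
            # Over the limit: close and drop the least recently used handle.
            _, oldest = self._handles.popitem(last=False)
            oldest.close()
        return handle

    def cached_paths(self):
        return list(self._handles)


# Stand-in opener so the sketch runs without touching the file system.
cache = HandleCache(limit=2, opener=lambda path: io.BytesIO())
first = cache.get("a.ismv")
second = cache.get("b.ismv")
cache.get("a.ismv")   # "a.ismv" becomes the most recently used
cache.get("c.ismv")   # evicts "b.ismv", the least recently used
print(cache.cached_paths())
```

With a limit of zero the cache simply grows, which matches the "no limit" default; the eviction path only matters when the storage device caps the number of open handles.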
3. On-Demand Smooth Streaming Configuration Settings
Below is a table listing the available configuration settings for On-Demand Smooth Streaming in IIS Media Services 4.0 and 4.1 (“*” denotes new in 4.1). The configuration path to this section is “system.webServer/media/smoothStreaming”.
|enableServerCache||This setting controls whether HTTP.sys kernel cache is enabled for media responses.||If you are running Windows Server 2008 SP1 or later versions of Windows server (including Windows Server 2008 R2), you can safely turn this option on. It is disabled by default due to an issue in IIS7 in Windows Server 2008 RTM which was fixed in SP1. However due to the default size limit of HTTP.sys cache and the fact that most likely you’ve already turned on other caching mechanisms (like file cache), you may not see significant performance gain by turning on this option.||False|
|serverCacheTimeout||TTL for HTTP.sys kernel cache entries||Only used if enableServerCache is TRUE.||5 sec|
|Enable/disable file handle caching||Turning this option on allows the server to cache file handles. This option should be turned on in most cases as it usually gives a significant performance boost.||True|
|fileHandleCacheCountLimit||Sets a limit on how many file handles the server should cache.||The default value of zero means no limit on file handle caching, which is the recommended setting for most deployments. This option is useful in certain scenarios where the storage device (e.g. some NAS devices) only supports a limited number of concurrent file handles/descriptors. When the number of cached file handles reaches the limit, server would use an LRU (Least Recently Used) algorithm to retire the least “popular” files to make space for the new ones.||0|
|fileHandleType||Specifies what type of file handle the server should open: CacheOnly (one file-cache-enabled handle for each file), NonCacheOnly (one file-cache-disabled handle for each file), or Both (two handles for each file, one file-cache-enabled and one file-cache-disabled).||With a file-cache-enabled handle, the operating system’s file cache can help cache the media content in physical memory to increase read performance. This is especially useful for frequently requested fragments. However, for “unpopular” fragments, enabling file cache actually wastes some cycles as the chance of reusing the data is small. “Both” is the recommended value for most scenarios. The other two options could be useful when dealing with certain storage devices. “Both” is also the built-in behavior in the IIS Media Services 4.0 release.||Both|
|frequentHitThreshold||Server uses a combination of frequently hit threshold and time period (see below) to determine if a particular fragment should be deemed “popular” and thus cached by file cache. For example, with the default values, fragments that were requested at least two times in the last eight seconds would be considered “frequentHit” and enabled for file cache.||This setting is only effective if the fileHandleType (see above) is set to “Both”.||2|
|frequentHitTimePeriod||See above||See above||8 sec|
|enableFileMonitoring*||When a file handle is cached, this setting controls whether the server monitors the file for changes.||When this option is turned on, server keeps monitoring any changes to the file whose handle is being cached by the server. If the file gets updated, server would trigger refresh logic to reopen and re-parse the file. This is also useful in the NAS case when the file handle becomes invalid due to either a timeout or temporary network issues, in which case server would try to re-open the file handle. You should only turn this option off if you don’t expect any file changes or storage device errors.||True|
|fileMonitoringInterval*||How often the server should check the file for potential changes as discussed above.||This is the minimum interval for file monitoring. The timing of the file checking logic is also driven by client requests. So the actual monitoring interval could be longer than the specified value if the file is not being frequently requested.||5 sec|
|indexCacheSizeLimit||How much memory the server should use for caching index data in ISMV files.||Even though the index data in ISMV file is pretty small, it could still occupy a significant amount of memory if you have a very large content library. If you see the worker process (w3wp.exe) is using too much memory by itself, you could set a limit on the index cache size. The default value of zero means no limit which should be ok for most cases. When the memory limit is reached, server would use the same LRU algorithm (as discussed in fileHandleCacheCountLimit) to delete index data from the least “popular” content to make room for the newer content while always keeping the total memory usage below the limit.||0|
|enableFragmentValidation*||Specifies whether the server should perform high level data validation for each fragment.||Each fragment should comply with the ISO MP4 spec by having the right MP4 box structure. With this feature turned on, server would perform some simple checks before sending the fragment out to the client. This check does incur some additional I/O cost due to an extra I/O read. So if your content is coming from trusted encoder products, you can consider turning this option off.||True|
|enableClientCache||Instructs the client whether it should cache the response.||This affects the HTTP cache control header that server sets in the response. It has no impact on local server performance. However, turning it off could mean more client requests coming back to the server because no HTTP caching can/should happen downstream.||True|
|cacheControlHeader||Set the TTL for cache control in HTTP response.||Only effective if enableClientCache is set to true. It has no impact on local server performance with the same caveat as above.||max-age=7200|
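To make the frequentHitThreshold / frequentHitTimePeriod pair more concrete, here is a minimal Python sketch of a sliding-window hit counter using the default values from the table (2 hits within 8 seconds). It only illustrates the rule as described above; the class name, the fragment key shape, and the injectable clock are assumptions for the example, and the real server logic may differ.

```python
import time
from collections import defaultdict, deque


class FrequentHitTracker:
    """Flags a fragment as "popular" once it is hit often enough recently."""

    def __init__(self, threshold=2, period_seconds=8.0, clock=time.monotonic):
        self.threshold = threshold
        self.period = period_seconds
        self._clock = clock
        self._hits = defaultdict(deque)  # fragment key -> recent hit times

    def record(self, fragment_key):
        """Record one request; return True if the fragment is now popular."""
        now = self._clock()
        hits = self._hits[fragment_key]
        hits.append(now)
        # Forget hits that fall outside the time period.
        while hits and now - hits[0] > self.period:
            hits.popleft()
        return len(hits) >= self.threshold


# A fake clock keeps the example deterministic.
fake_now = [0.0]
tracker = FrequentHitTracker(clock=lambda: fake_now[0])
key = ("video", 300000, 0)   # hypothetical (track, bitrate, timestamp) key

first = tracker.record(key)   # one hit so far: not popular yet
fake_now[0] = 3.0
second = tracker.record(key)  # two hits within 8 seconds: popular
fake_now[0] = 20.0
third = tracker.record(key)   # earlier hits have aged out: not popular
print(first, second, third)
```

In terms of the table, a fragment for which this returns True would be read through the file-cache-enabled handle, and everything else through the file-cache-disabled one. A production version would also need to prune keys that are no longer requested.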
There is one more configuration setting that’s related to On-Demand Smooth Streaming server’s performance which lives under "system.webServer/media/common". It’s called "systemFileCacheRAMPercentageLimit" which is for setting the upper limit on how much physical memory (RAM), percentage wise, the operating system should use for file cache. The default value is 50% which could be a bit conservative for high end machines with more RAM. You can try to tweak this number higher to leverage more physical memory for file cache. This is a global setting for the machine.
4. Dynamic UNC
If you compare the schema file of IIS Media Services with the table above, you might notice that I omitted a configuration element called “dynamicUNCResolution” (and its child settings) in the “system.webServer/media/smoothStreaming” section. This is a new feature we added in IIS Media Services 4.1 specifically for NAS (Network Attached Storage) scenarios where media content is accessed through UNC paths. The diagram below shows a typical configuration that uses this feature.
Figure 2 – Dynamic UNC Configuration
The idea of Dynamic UNC is that it allows the server to simultaneously leverage multiple NAS nodes for better I/O throughput while making it completely transparent to the client and IIS core pipeline (in terms of virtual directory settings).
Here is how to set it up:
- Set up a virtual directory on IIS as you normally would with a single UNC path, say \\NAS1\content .
- Enable Dynamic UNC feature in IIS Media Services by setting “enabled” to True (under “dynamicUNCResolution”).
- Pick a routing algorithm via the “routingAlgorithm” configuration setting. There are two options:
- CARP (Cache Array Routing Protocol) – CARP is usually used in CDNs to efficiently route traffic to upstream cache nodes. It creates an affinity between a URL and an upstream node so that each upstream node only needs to deal with a subset of the content library. With a sizeable content library, it is also able to evenly distribute the load among the upstream nodes while maintaining the URL-to-node affinity.
- Round-robin – This is a simple routing logic that’s based on round-robin logic.
- Edit the "UNCMappingList" configuration element to add entries for all the UNC nodes. You can use the built-in Configuration Editor in IIS management UI to do that:
- The “hostName” property should be the same as the virtual directory path (\\NAS1\content in this case).
- Add one entry for each NAS device you have. In this example, you would add four of them (NAS1 to NAS4). The “address” property for each entry can be either a host name or an IP address. If you use CARP, a list of similarly constructed host names or IP addresses would work better in terms of distributing the load more evenly, e.g. NAS1, NAS2,…. or 192.168.1.2, 192.168.1.3, … etc.
After setting it up, you need to make sure that each NAS node has access to the same content library (either by copying the content to each NAS machine or by having them be part of the same storage cluster).
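To show the difference between the two routing options, here is a minimal Python sketch. The affinity router uses highest-score hashing in the spirit of CARP: every (node, URL) pair gets a score and the highest score wins. It is not the actual CARP hash function nor the IIS Media Services implementation; SHA-256 and all the names here are stand-ins chosen for the example.

```python
import hashlib
from itertools import cycle


def _score(url, node):
    """Deterministic score for a (node, URL) pair; SHA-256 is a stand-in."""
    digest = hashlib.sha256(f"{node}|{url}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def route_by_affinity(url, nodes):
    """Pick the node with the highest score, so a URL sticks to one node."""
    return max(nodes, key=lambda node: _score(url, node))


class RoundRobinRouter:
    """Hands out nodes in order, ignoring the URL entirely."""

    def __init__(self, nodes):
        self._next = cycle(list(nodes))

    def route(self, url):
        return next(self._next)


nodes = ["NAS1", "NAS2", "NAS3", "NAS4"]
url = "/movie.ism/QualityLevels(300000)/Fragments(video=0)"

chosen = route_by_affinity(url, nodes)
print("affinity:", chosen, route_by_affinity(url, nodes))  # same node twice

round_robin = RoundRobinRouter(nodes)
print("round-robin:", [round_robin.route(url) for _ in range(5)])
```

A useful property of the highest-score approach is that removing one node only moves the URLs that were mapped to that node; every other URL keeps its affinity. Round-robin spreads requests evenly but gives up that affinity, so each node ends up touching the whole content library.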
This Dynamic UNC feature can be useful in many different ways:
- It enables IIS Media Services to work really well with some high-end NAS systems which expose multiple nodes from a single cluster. It allows the server to use a single “mount point” (configured as the virtual directory) to target multiple NAS nodes at the backend without the need to fiddle with multiple virtual directories and URL rewriting rules.
- It allows the IIS Media server to use more SMB sessions to communicate with the NAS backend due to the usage of multiple UNC host names. This can give the system a significant boost in SMB throughput on SMB 2.0 (on Windows Server 2008) and SMB 2.1 (on Windows Server 2008 R2).
- For low end scenarios, it allows you to build a high-performance storage backend with cheap machines and disks (no RAID, no cluster, etc.). It gives you an easy way to assemble more CPU power and, more importantly, disk throughput for media streaming. The built-in health monitoring feature (discussed below) completes the story by giving it a reasonably good fault tolerance capability.
The Dynamic UNC feature also has the capability to monitor the health of the NAS nodes and seamlessly re-route the traffic in case of node failure. Any node failure will also be reported through the Windows Event Log. The “errorRetryIntervalMS” configuration setting controls how frequently the server checks the status of each failed NAS node. When a node failure happens, server would first mark the failed node “unhealthy” and immediately re-route the traffic to other nodes (supported in both CARP and Round-robin mode). Server also has logic to periodically retry the failed node to see if it has come back up, in which case it would put the node back into the workgroup again. So everything you need for a simple failover/recovery scenario should be automatically handled.
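The failover behavior described above can be sketched roughly as follows. This is a simplified illustration, not the server's actual logic: the class, the `read` and `probe` callbacks, and the 5000 ms default are assumptions made for the example (the post does not state the default value of errorRetryIntervalMS).

```python
import time


class NodeHealth:
    """Tracks which NAS nodes are currently marked unhealthy."""

    def __init__(self, nodes, retry_interval_ms=5000, clock=time.monotonic):
        self._nodes = list(nodes)
        self._retry_seconds = retry_interval_ms / 1000  # assumed default
        self._clock = clock
        self._failed_at = {}  # node -> time it was marked unhealthy

    def mark_failed(self, node):
        self._failed_at[node] = self._clock()

    def mark_recovered(self, node):
        self._failed_at.pop(node, None)

    def healthy_nodes(self):
        return [n for n in self._nodes if n not in self._failed_at]

    def due_for_retry(self):
        now = self._clock()
        return [n for n, failed in self._failed_at.items()
                if now - failed >= self._retry_seconds]


def read_with_failover(health, read, probe):
    """Read from the first healthy node, re-routing on failure."""
    # Give failed nodes whose retry interval has elapsed a chance to rejoin.
    for node in health.due_for_retry():
        if probe(node):
            health.mark_recovered(node)
        else:
            health.mark_failed(node)  # restart the retry timer
    for node in health.healthy_nodes():
        try:
            return read(node)
        except OSError:
            health.mark_failed(node)  # mark unhealthy, try the next node
    raise OSError("no healthy NAS node available")
```

In a real deployment the list of healthy nodes would be fed into the routing algorithm (CARP or round-robin) rather than tried in order, and a failure would also be written to the event log.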
As you can see, getting an On-Demand Smooth Streaming server to work is not difficult, but making it work really well with the best possible performance requires a lot more thinking and tuning. In IIS Media Services 4.1, we did a lot of work to push the server performance further with more configuration settings for fine tuning. We also put a lot of effort into better NAS integration, which allows IIS Media Services to run much more efficiently in those environments. In some real-world deployments, we saw the overall server throughput increase severalfold with NAS clusters.
As usual, I hope this blog is helpful to you. Please go ahead and download the latest IIS Media Services 4.1 release and let us know what you think.