URL Rewrite v2.1
The IIS team just released URL Rewrite v2.1. The blog post below details the changes introduced in this release. You can download the latest version from https://www.iis.net/downloads/microsoft/url-rewrite or from WebPI.
Control response cacheability of URL Rewrite Rules
URL Rewrite v7.1.1909 removed `HTTP_HOST` from the set of server variables that are cacheable. As a result, any URL Rewrite rule that referred to `HTTP_HOST` in a condition, or whose action was a rewrite/redirect that set `HTTP_HOST`, was no longer kernel cacheable. The objective of this fix was to prevent customers from getting stuck in rewrite loops due to caching, because URL Rewrite had no way to detect loops. However, the update also removed the ability for customers to keep their responses kernel cacheable when they knew they had no redirect loops.
Introduction of a responseCacheDirective
URL Rewrite rules can now be explicitly marked as cacheable through a new directive on the rule element: responseCacheDirective.
The responseCacheDirective accepts four possible values:
1. Always: The response is always cacheable.
2. Never: The response is never cacheable.
3. NotIfRuleMatched: The response is not cacheable if the rule matched.
4. Auto (default): URL Rewrite determines the cache friendliness of the rule based on the server variables used in the rule.
The risk of entering a redirect loop has not been mitigated, so responseCacheDirective should only be set to Always when you can verify that there are no redirect loops.
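Because Always bypasses the loop safeguard, it can help to list the rules that use it before deploying. The following is a minimal sketch, not an official tool: the sample configuration, rule names, and patterns are made up for illustration, and the helper functions are hypothetical. Only the responseCacheDirective attribute and its four values come from the text above.

```python
import xml.etree.ElementTree as ET

# Illustrative web.config fragment; rule names and patterns are invented.
SAMPLE_CONFIG = """
<configuration>
  <system.webServer>
    <rewrite>
      <rules>
        <rule name="CanonicalHost" responseCacheDirective="Always">
          <match url="(.*)" />
          <action type="Rewrite" url="app/{R:1}" />
        </rule>
        <rule name="Legacy">
          <match url="^old/(.*)" />
          <action type="Redirect" url="new/{R:1}" />
        </rule>
      </rules>
    </rewrite>
  </system.webServer>
</configuration>
"""


def rules_by_directive(config_xml):
    """Map each rule name to its responseCacheDirective (Auto when absent)."""
    root = ET.fromstring(config_xml)
    return {
        rule.get("name"): rule.get("responseCacheDirective", "Auto")
        for rule in root.iter("rule")
    }


def rules_needing_review(config_xml):
    """Names of rules set to Always, which should be checked for loops."""
    directives = rules_by_directive(config_xml)
    return [name for name, value in directives.items() if value == "Always"]


print(rules_needing_review(SAMPLE_CONFIG))  # ['CanonicalHost']
```

A rule without the attribute is reported as Auto here because Auto is the documented default.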
What happens when you define multiple rules with different responseCacheDirective?
URL Rewrite tries to match an incoming URL against a set of rules sequentially. Each rule has three possible results when it is applied to an incoming URL: Unmatched, URL Matched, and Rule Matched, in increasing degrees of matching. Rule Matched differs from URL Matched in that the rule's conditions are met in addition to the URL being a match.
The cacheability of the response is reconsidered with each rule. The initial state is neutral: URL Rewrite does not instruct the kernel cache either way. If the state changes to not cacheable, further rules are not considered when determining cacheability. In other words, a single rule among all the rules executed is enough to make the entire response uncacheable. This makes rule ordering important in cases where processing stops when a "Rule Matched" occurs. Consider a rule that evaluates to "URL Matched" and is set to either "Never" or "Auto" with cache-unfriendly server variables. If this rule comes before the rule that would be "Rule Matched", it disables kernel caching. If, on the other hand, it comes after the "Rule Matched" rule and is therefore skipped, it has no effect on cacheability.
For a rule to have an effect on cacheability, the rule must at minimum be "URL Matched". If "NotIfRuleMatched" is selected as the responseCacheDirective for a given rule, kernel caching for that response is disabled only if the entire rule is matched, meaning both the URL and the conditions. Keep in mind that "NotIfRuleMatched" does not take cache-unfriendly server variables into account. This is also true for Never and Always, leaving Auto as the only value where the presence of cache-unfriendly server variables causes kernel caching to be disabled.
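The evaluation order described above can be modeled in a few lines. This is a simplified sketch rather than the module's actual implementation: the `Rule` and `Match` types and the state names are hypothetical, and it assumes that Always moves the state to cacheable as soon as the rule is at least URL Matched, and that an Auto rule with cache-friendly variables leaves the state unchanged. Neither assumption is spelled out in the text.

```python
from dataclasses import dataclass
from enum import Enum


class Match(Enum):
    UNMATCHED = 0
    URL_MATCHED = 1    # URL pattern matched, conditions did not
    RULE_MATCHED = 2   # URL pattern and conditions both matched


@dataclass
class Rule:
    directive: str = "Auto"          # Always, Never, NotIfRuleMatched, Auto
    result: Match = Match.UNMATCHED  # outcome for the current request
    unfriendly_vars: bool = False    # uses cache-unfriendly server variables
    stop_processing: bool = False    # stop evaluating rules once fully matched


def response_cacheability(rules):
    """Return 'neutral', 'cacheable', or 'not cacheable' for one request."""
    state = "neutral"
    for rule in rules:
        if rule.result is not Match.UNMATCHED:
            fully_matched = rule.result is Match.RULE_MATCHED
            if rule.directive == "Never":
                return "not cacheable"
            if rule.directive == "NotIfRuleMatched" and fully_matched:
                return "not cacheable"
            if rule.directive == "Auto" and rule.unfriendly_vars:
                return "not cacheable"
            if rule.directive == "Always":
                state = "cacheable"  # assumption: URL Matched is enough
        if rule.result is Match.RULE_MATCHED and rule.stop_processing:
            break  # later rules are skipped and cannot affect the state
    return state


never = Rule("Never", Match.URL_MATCHED)
final = Rule("Always", Match.RULE_MATCHED, stop_processing=True)
print(response_cacheability([never, final]))  # not cacheable
print(response_cacheability([final, never]))  # cacheable
```

The two calls differ only in rule order, which is the ordering effect described above: the Never rule disables caching when it runs first and has no effect when processing stops before reaching it.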
Preserve original URL encoding
In URL Rewrite versions prior to v7.1.1980, when one tries to use UNENCODED_URL, URL Rewrite encodes it, which may lead to double encoding if the original URL was already encoded. This is in violation of section 2.4 of RFC3986, which says "Implementations must not percent-encode or decode the same string more than once, as decoding an already decoded string might lead to misinterpreting a percent data octet as the beginning of a percent-encoding, or vice versa in the case of percent-encoding an already percent-encoded string." It also made the use of UNENCODED_URL impractical, especially in reverse-proxy scenarios with ARR where the backend servers expect the URL to be passed unmodified.
In v7.1.1980, we are adding a feature flag, useOriginalURLEncoding, that allows you to turn off this non-compliant URL encoding when set to false. The default behavior remains unchanged (useOriginalURLEncoding is true by default).
To further explain this, let's look at the example below, where the incoming URL is https://contoso.com/ab%2fde/. In this example, the cooked representation of the URL that IIS receives from HTTP.SYS is the once-decoded URL, /ab/de/.
When Original URL Encoding is preserved (useOriginalURLEncoding=true), the UNENCODED_URL server variable is computed by encoding the incoming URL, which leads to the double encoding ab%252f. After turning off the non-compliant behavior (useOriginalURLEncoding=false), UNENCODED_URL is just the incoming URL.
A more common way of using URL Rewrite is with back-references, where {R:0} represents the entire part of the URL that was matched by a rule and {R:n} represents the part of the URL that matched a specific part of the regular expression enclosed in parentheses. If more than one part of the regular expression is enclosed in parentheses, n denotes the order of the pair, where 1 <= n <= the number of parenthesis pairs used in the regular expression.
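The numbering of back-references can be illustrated with ordinary capture groups. In this sketch Python's `re` module stands in for the module's own regular-expression engine, and the pattern, the URL path, and the `back_references` helper are hypothetical examples rather than anything taken from URL Rewrite itself.

```python
import re


def back_references(pattern, url_path):
    """Return {'R:0': ..., 'R:1': ...} for a match, or None when unmatched."""
    m = re.search(pattern, url_path)
    if m is None:
        return None
    refs = {"R:0": m.group(0)}  # the entire matched text
    for n, value in enumerate(m.groups(), start=1):
        refs[f"R:{n}"] = value  # the n-th parenthesized group
    return refs


print(back_references(r"^article/([0-9]+)/([a-z-]+)$", "article/42/url-rewrite"))
# {'R:0': 'article/42/url-rewrite', 'R:1': '42', 'R:2': 'url-rewrite'}
```

With two pairs of parentheses in the pattern, n ranges from 1 to 2, and {R:0} always covers the whole match.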
A resulting back-reference is computed by encoding the corresponding part of the cooked URL. However, since we are encoding the cooked URL, it is impossible to ascertain if the “/” was present in the original URL or it was an artifact of the first decode and hence, we will not attempt to encode it. After setting useOriginalURLEncoding to false, the back reference is now just the cooked URL.
Original URL: https://contoso.com/ab%2fde/
Rewrite rule contains | useOriginalURLEncoding=true | useOriginalURLEncoding=false |
BACK_REFERENCE | /ab/de/ | /ab/de/ |
UNENCODED_URL | /ab%252fde/ | /ab%2fde/ |
Let's look at another example, where the incoming URL is already double-encoded: http://contoso.com/ab%2520de/. In this example, the *cooked* representation of the URL that IIS receives from HTTP.SYS is the once-decoded URL, /ab%20de/.
When Original URL Encoding is preserved, the UNENCODED_URL server variable is again computed by encoding the incoming URL. After turning off the non-compliant behavior, UNENCODED_URL is still just the original URL.
The back-reference is computed by encoding the cooked URL. After setting useOriginalURLEncoding to false, the back-reference is just the cooked URL.
Rewrite rule contains | useOriginalURLEncoding=true | useOriginalURLEncoding=false |
BACK_REFERENCE | /ab%2520de/ | /ab%20de/ |
UNENCODED_URL | /ab%252520de/ | /ab%2520de/ |
In both examples, setting useOriginalURLEncoding=false provides a way to pass the original URL unmodified by using UNENCODED_URL. This is usually the desired outcome in a reverse-proxy scenario. It also eliminates any double encoding that would otherwise have been performed by the URL Rewrite Module.
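The values in the two tables above can be reproduced with standard percent-encoding helpers. This is only a sketch of the described behavior, not the module's code: `rewrite_values` is a hypothetical helper, and Python's `urllib.parse.quote` (which leaves "/" unencoded by default) is assumed to be a close enough stand-in for the module's encoder on these inputs.

```python
from urllib.parse import quote, unquote


def rewrite_values(original_path, use_original_url_encoding):
    """Model BACK_REFERENCE and UNENCODED_URL for one incoming URL path."""
    cooked = unquote(original_path)  # decoded once, as received from HTTP.SYS
    if use_original_url_encoding:
        # Default behavior: both values are percent-encoded again.
        return {
            "BACK_REFERENCE": quote(cooked),
            "UNENCODED_URL": quote(original_path),
        }
    # Flag set to false: values are passed through without re-encoding.
    return {"BACK_REFERENCE": cooked, "UNENCODED_URL": original_path}


for path in ("/ab%2fde/", "/ab%2520de/"):
    for flag in (True, False):
        print(path, flag, rewrite_values(path, flag))
```

With the flag set to false, UNENCODED_URL equals the incoming path in both cases, which is the unmodified pass-through a reverse proxy usually needs.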