Requirements for Recursive Caching Resolver
        (a.k.a. Treeshrew, Unbound-C)
By W.C.A. Wijngaards, NLnet Labs, October 2006.

Contents
1. Introduction
2. History
3. Goals
4. Non-Goals
5. Choices


1. Introduction
---------------
This is the requirements document for a DNS name server and aims to
document the goals and non-goals of the project. The DNS (the Domain
Name System) is a global, replicated database that uses a hierarchical
structure for queries.

Data in the DNS is stored in Resource Record sets (RR sets), and has a
time to live (TTL). During this time the data can be cached. It is
thus useful to cache data to speed up future lookups. A server that
looks up data in the DNS for clients and caches previous answers to
speed up processing is called a caching, recursive nameserver.

This project aims to develop such a nameserver in modular components, so
that also DNSSEC (secure DNS) validation and stub-resolvers (that do not
run as a server, but are linked into an application) are easily possible.

The main components are the Validator that validates the security
fingerprints on data sets, the Iterator that sends queries to the
hierarchical DNS servers that own the data and the Cache that stores
data from previous queries. The networking and query management code
then interfaces with the modules to perform the necessary processing.

In Section 2 the origins of the Unbound project are documented. Section
3 lists the goals, while Section 4 lists the explicit non-goals of the
project. Section 5 discusses choices made during development.


2. History
----------
The unbound resolver project was started by Bill Manning, David Blacka, and
Matt Larson (from the University of California and from Verisign), who
created a Java based prototype resolver called Unbound. The prototype
carried out the basic design decision of clean modules.

The Java prototype worked very well, with contributions from Geoff
Sisson and Roy Arends from Nominet. Around 2006 the idea came to create
a full-fledged C implementation ready for deployed use. NLnet Labs
volunteered to write this implementation.


3. Goals
--------
o A validating recursive DNS resolver.
o Code diversity in the DNS resolver monoculture.
o Drop-in replacement for BIND apart from config.
o DNSSEC support.
o Fully RFC compliant.
o High performance
        * even with validation.
o Used as
        * stub resolver.
        * full caching name server.
        * resolver library.
o Elegant design of validator, resolver, cache modules.
        * provide the ability to pick and choose modules.
o Robust.
o In C, open source: The BSD license.
o Highly portable, targets include modern Unix systems, such as *BSD,
  Solaris, Linux, and maybe also the Windows platform.
o Smallest possible component that does the job.
o Stub-zones can be configured (local data or AS112 zones).


4. Non-Goals
------------
o An authoritative name server.
o Too many Features.


5. Choices
----------
o rfc2181 discourages duplicate RRs in RRsets. unbound does not create
  duplicates, but when presented with duplicates on the wire from the
  authoritative servers, does not perform duplicate removal.
  It does do some rrsig duplicate removal, in the msgparser, for dnssec qtype
  rrsig and any, because of special rrsig processing in the msgparser.
o The harden-glue feature: when set to yes, all out of zone glue is
  deleted; when set to no, out of zone glue is used for further resolving.
  The actual behaviour is more complicated than that, see below.
  Main points:
        * rfc2181 trust handling is used.
        * data is let through only in very specific cases.
        * spoofability remains possible.
  Not all glue is let through (despite the name of the option).
  Only glue that is present in a delegation, is of type A or AAAA, and
  whose name is present in the NS record in the authority section is let
  through.
  The glue that is let through is stored in the cache (marked as 'from the
  additional section'). It will then be used as a target to send queries
  to. It will not be present in the reply to the client (if RD is off).
  A direct query for that name will attempt to get a message into the
  message cache. Since A and AAAA queries are not synthesized by the unbound
  cache, this query will be (eventually) sent to the authoritative server and
  its answer will be put in the cache, marked as 'from the answer section'.
  This removes the 'from the additional section' data, and the answer record
  is returned to the client.
  The message has a TTL smaller than or equal to the TTL of the answer RR.
  If the cache memory is low, the answer RR may be dropped, and a glue
  RR may be inserted, within the message TTL time, and thus the spoofed
  glue may be returned to a client. When the message expires, it is
  refetched and the cached RR is updated with the correct content.
  The server can be spoofed by getting it to visit a specially prepared
  domain. This domain then inserts an address for another authoritative
  server into the cache. When visiting that other domain, this address may
  then be used to send queries to, and fake answers may be returned.
  If the other domain is signed by DNSSEC, the fakes will be detected.

  In summary, the harden glue feature presents a security risk if
  disabled. Disabling the feature leads to possibly better performance
  as more glue is present for the recursive service to use. The feature
  is implemented so as to minimise the security risk, while trying to
  keep this performance gain.
o The method by which dnssec-lameness is detected is not secure. DNSSEC lame
  is when a server has the zone in question, but lacks dnssec data, such as
  signatures.
  The method to detect dnssec lameness looks at nonvalidated
  data from the parent of a zone. This can be used, by spoofing the parent,
  to create a false sense of dnssec-lameness in the child, or a false sense
  of dnssec-non-lameness in the child. The first results in the server being
  marked lame, and not used for 900 seconds, and the second will result in a
  validator failure (SERVFAIL again), when the query is validated later on.

  Concluding, a spoof of the parent delegation can be used for many cases
  of denial of service. For example, a completely different NS set could be
  returned, or the information withheld. All of these alterations can be
  caught by the validator if the parent is signed, and result in 900 seconds
  bogus. The dnssec-lameness detection is used to detect operator failures,
  before the validator will properly verify the messages.

  Also for zones for which no chain of trust exists, but a DS is given by the
  parent, dnssec-lameness detection is enabled. This delivers dnssec to our
  clients when possible (for client validators).

  The following issue needs to be resolved:
        a server that serves both a parent and child zone, where
        parent is signed, but child is not. The server must not be marked
        lame for the parent zone, because the child answer is not signed.
  Instead of a false positive, we want false negatives; failure to
  detect dnssec-lameness is less of a problem than marking honest
  servers lame. dnssec-lameness is a config error and deserves the trouble.
  So, only messages that identify the zone are used to mark the zone
  lame. The zone is identified by SOA or NS RRsets in the answer/auth.
  That includes almost all negative responses and also A, AAAA qtypes.
  That would be most responses from servers.
  For referrals, delegations that add a single label can be checked to be
  from their zone; this covers most delegation-centric zones.

  So possibly, for complicated setups, with multiple (parent-child) zones
  on a server, dnssec-lameness detection does not work - no dnssec-lameness
  is detected. Instead the zone that is dnssec-lame becomes bogus.

o authority features.
  This is a recursive server, and authority features are out of scope.
  However, some authority features are expected in a recursor, such as
  localhost, reverse lookup for 127.0.0.1, or blocking AS112 traffic.
  Also redirection of domain names with fixed data is needed by service
  providers. Limited support is added specifically to address this.

  Adding full authority support requires much more code, and more complex
  maintenance.

  The limited support allows adding some static data (for localhost and so
  on), and responding with a fixed rcode (NXDOMAIN) for domains (such as
  AS112).

  You can put authority data on a separate server, and set the server in
  unbound.conf as stub for those zones. This allows clients to access data
  from the server without making unbound authoritative for the zones.

o the access control denies queries before any other processing.
  This denies queries that are not authoritative, or version.bind, or any,
  and thus prevents cache-snooping (denied hosts cannot make non-recursive
  queries and get answers from the cache).

o If a client makes a query without RD bit, in the case of a returned
  message from cache which is:
        answer section: empty
        auth section: NS record present, no SOA record, no DS record,
                maybe NSEC or NSEC3 records present.
        additional: A records or other relevant records.
  A SOA record would indicate that this was a NODATA answer.
  A DS record would indicate a referral.
  Absence of NS record would indicate a NODATA answer as well.

  Then the receiver does not know whether this was a referral
  (with attempt at no-DS proof) or a nodata answer (with attempt
  at no-data proof). It could be determined by attempting to prove
  either condition and looking if only one is valid, but both
  proofs could be valid, or neither could be valid, which creates
  doubt. This case is validated by unbound as a 'referral' which
  ascertains that RRSIGs are OK (and not omitted), but does not
  check NSEC/NSEC3.

o Case preservation
  Unbound preserves the casing received from authority servers as best
  as possible. It compresses without case, so case can get lost there.
  The casing from the query name is used in preference to the casing
  of the authority server. This is the same as BIND. RFC4343 allows either
  behaviour.

o Denial of service protection
  If many queries are made, and they are made to names for which the
  authority servers do not respond, then the requestlist for unbound
  fills up fast. This results in denial of service for new queries.
  To combat this the first 50% of the requestlist can run to completion.
  Queries in the last 50% of the requestlist get at least 200 msec, and
  once they are older than that they can be replaced by newer queries (LIFO).
  When a new query comes in, and a place in the first 50% is available, this
  is preferred. Otherwise, it can replace older queries out of the last 50%.
  Thus, even long queries get a 50% chance to be resolved. And many 'short'
  one or two round-trip resolves can be done in the last 50% of the list.
  The timeout can be configured.

o EDNS fallback.
  This is done according to the EDNS RFC (and update draft-00).
  Unbound assumes EDNS 0 support for the first query. Then it can detect
  support (if the server replies) or non-support (on a NOTIMPL or FORMERR).
  Some middleboxes drop EDNS 0 queries, mainly when forwarding, not when
  routing packets.
  To detect this, when timeouts keep happening, as the
  timeout approaches 5-10 seconds, and EDNS status has not been detected yet,
  a single probe query is sent. This probe has a sub-second timeout, and
  if the server responds (quickly) without EDNS, this is cached for 15 min.
  This works very well when detecting an address that you use much - like
  a forwarder address - which is where the middleboxes need to be detected.
  Otherwise, it results in a 5 second wait time before EDNS timeout is
  detected, which is slow but it works at least.
  It minimizes the chances of a dropped query making a (DNSSEC) EDNS server
  falsely EDNS-nonsupporting, and thus DNSSEC-bogus, works well with
  middleboxes, and can detect the occasional authority that drops EDNS.
  For some boxes it is necessary to probe for every failing query: a
  reassurance that the DNS server does EDNS does not mean that the path can
  take large DNS answers.

o 0x20 backoff.
  The draft describes backing off to the next server, and going through all
  servers several times. Unbound goes on to get the full list of nameserver
  addresses, and then makes 3 * number of addresses queries.
  They are sent to a random server, but no one address more than 4 times.
  It succeeds if one has 0x20 intact, or else all are equal.
  Otherwise, servfail is returned to the client.

o NXDOMAIN and SOA serial numbers.
  Unbound keeps TTL values for message formats, and thus rcodes, such
  as NXDOMAIN. Also it keeps the latest rrsets in the rrset cache.
  So it will faithfully negative cache for the exact TTL as originally
  specified for an NXDOMAIN message, but send a newer SOA record if
  this has been found in the meantime. In particular, this could lead to a
  negative cached NXDOMAIN reply with a SOA RR where the serial number
  indicates a zone version where this domain is no longer NXDOMAIN.
  These situations become consistent once the original TTL expires.
  If the domain is DNSSEC signed, by the way, then NSEC records are
  updated more carefully. If one of the NSEC records in an NXDOMAIN is
  updated from another query, the NXDOMAIN is dropped from the cache,
  and queried for again, so that its proof can be checked again.

o SOA records in negative cached answers for DS queries.
  The current unbound code uses a negative cache for queries for type DS.
  This speeds up building chains of trust, and uses NSEC and NSEC3
  (optout) information to speed up lookups. When used internally,
  the bare NSEC(3) information is sufficient, probably picked up from
  a referral. When answering to clients, a SOA record is needed for
  the correct message format. If available, a SOA record is picked from
  the cache (and may not actually match the serial number of the SOA for
  which the NSEC and NSEC3 records were obtained); otherwise network
  queries are performed to get the data.

o Parent and child with different nameserver information.
  A misconfiguration that sometimes happens is where the parent and child
  have different NS, glue information. The child is authoritative, and
  unbound will not trust information from the parent nameservers as the
  final answer. To help lookups, unbound will however use the parent-side
  version of the glue as a last resort lookup. This resolves lookups for
  those misconfigured domains where the servers reported by the parent
  are the only ones working, and servers reported by the child do not work.

o Failure of validation and probing.
  Retries on a validation failure are now 5x to a different nameserver IP
  (if possible), and then it gives up, for one name, type, class entry in
  the message cache.
  If a DNSKEY or DS in the chain of trust fails, then
  after the probing a bad key entry is additionally created in the key
  cache, which makes the entire zone bogus for 900 seconds. This is a fixed
  value at this time and is conservative in sending probes. It makes the
  compound effect of many resolvers smaller and easier to handle, but
  penalizes individual resolvers by having fewer probes and a longer time
  before fixes are picked up.
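
As an illustration of the last item above (failure of validation and
probing), the retry limit and the 900 second bad key entry can be sketched
in Python as follows. This is a minimal sketch and not the Unbound code:
KeyCache, validate_key and try_server are hypothetical names. It assumes
that the 5x limit counts attempts in total, and that addresses are reused
in turn when fewer than five distinct ones are known.

```python
from dataclasses import dataclass, field

MAX_VALIDATION_ATTEMPTS = 5   # "5x to a different nameserver IP (if possible)"
BAD_KEY_TTL_SECONDS = 900     # fixed bogus period named in the text


@dataclass
class KeyCache:
    """Hypothetical key cache that only tracks bad key entries."""
    # zone name -> absolute time (in seconds) at which the bad entry expires
    bad_until: dict = field(default_factory=dict)

    def mark_bad(self, zone, now):
        self.bad_until[zone] = now + BAD_KEY_TTL_SECONDS

    def is_bad(self, zone, now):
        expiry = self.bad_until.get(zone)
        return expiry is not None and now < expiry


def validate_key(zone, server_ips, try_server, key_cache, now):
    """Return True if the key validates, False if the zone is treated as bogus.

    try_server(ip) is a caller-supplied callable (an assumption of this
    sketch) that fetches the DNSKEY or DS from one address and returns
    True when it validates.
    """
    if key_cache.is_bad(zone, now):
        return False  # no new probes while the bad key entry is alive

    ips = list(dict.fromkeys(server_ips))  # distinct addresses, order kept
    attempts = MAX_VALIDATION_ATTEMPTS if ips else 0
    for attempt in range(attempts):
        # Prefer a different address each time; wrap around if there are few.
        if try_server(ips[attempt % len(ips)]):
            return True

    # Every probe failed: the whole zone is bogus for the fixed period.
    key_cache.mark_bad(zone, now)
    return False
```

With two known addresses and a server that never validates, the sketch
probes five times, alternating between the addresses, and then answers
from the bad key entry without probing until 900 seconds have passed.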