Some (intentional) changes from Sun's submission are noted in the
install guide.

Bugs or issues:

The "full resync" part of the protocol involves the primary side
firing off a normal kprop (and going back to servicing requests), and
the replica side stopping all the incremental propagation stuff and
waiting for the kprop. If the connection from the primary never comes
in for some reason, the replica side just blocks forever, and never
resumes incremental propagation.

The protocol does not currently pass policy database changes; this was
an intentional decision on Sun's part. The policy database is only
relevant to the primary KDC, and is usually fairly static (aside from
refcount updates), but not propagating it does mean that a replica
maintained via iprop can't simply be promoted to a primary in disaster
recovery or other cases without doing a full propagation or restoring
a database from backups.

Shawn had a good suggestion after I started the integration work,
which I haven't had a chance to implement: Make the update-log code
fit in as a sort of pseudo-database layer via the DAL, being called
through the standard DAL methods, and doing its work around calls
through to the real database back end again through the DAL methods.
So for example, creating an "iprop+db2" database would create an update
log and the real db2 database; storing a principal entry would update
the update log as well; etc. At least initially, we wouldn't treat it
as a differently-named database; the installation of the hooks would
be done by explicitly checking if iprop is enabled, etc.

The "iprop role" is assumed to be either primary or replica. The
primary writes a log, and the replica fetches it. But what about a
cascade propagation model where A sends to B which sends to C, perhaps
because A's bandwidth is highly limited, or B and C are co-located?
In such a case, B would want to operate in both modes. Granted, with
iprop the bandwidth issues should be less important, but there may
still be reasons one may wish to run in such a configuration.

The propagation of changes does not happen in real time. It's not a
"push" protocol; the replicas poll periodically for changes. Perhaps
a future revision of the protocol could address that.

kadmin/cli/kadmin.c call to kadm5_init_iprop - is this needed in a
client-side program? Should it be done in libkadm5srv instead as part
of the existing kadm5_init* so that database-accessing applications
that don't get updated at the source level will automatically start
changing the update log as needed?
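
The pseudo-database layer suggested above might look roughly like the
following Python sketch. It is not the DAL API: UpdateLoggingBackend,
DictBackend, and the put_principal/delete_principal method names are
all hypothetical, and the update log is modelled as an in-memory list
rather than the real update log file. The point is only that callers
keep using the ordinary database methods and the log is updated as a
side effect.

```python
class DictBackend:
    """Stand-in for a real back end such as db2 (hypothetical)."""

    def __init__(self):
        self.entries = {}

    def put_principal(self, name, entry):
        self.entries[name] = entry

    def delete_principal(self, name):
        del self.entries[name]


class UpdateLoggingBackend:
    """Wraps a back end and records each change after it succeeds."""

    def __init__(self, backend, iprop_enabled=True):
        self.backend = backend
        self.iprop_enabled = iprop_enabled
        self.update_log = []      # list of (serial, operation, name)
        self.last_serial = 0

    def _log(self, operation, name):
        # Hooks are only active when iprop is enabled.
        if not self.iprop_enabled:
            return
        self.last_serial += 1
        self.update_log.append((self.last_serial, operation, name))

    def put_principal(self, name, entry):
        self.backend.put_principal(name, entry)
        self._log("put", name)

    def delete_principal(self, name):
        self.backend.delete_principal(name)
        self._log("delete", name)


db = UpdateLoggingBackend(DictBackend())
db.put_principal("user@EXAMPLE.COM", {"kvno": 1})
db.delete_principal("user@EXAMPLE.COM")
```

An application that only knows the back-end methods would pick up
logging without source changes, which is the same goal as moving the
kadm5_init_iprop call into the library.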

Locking: Currently DAL exports the DB locking interface to the caller;
we want to slip the iprop code in between -- run it plus the DB update
operation with the DB lock held, whether or not the caller grabbed the
lock. (Does the caller always grab the lock before making changes?)
Currently we're using a file lock on the update log itself; this will
be independent of whether the DB back end implements locking (which
may be a good thing or a bad thing, depending).

Various logging calls with odd format strings like "<null>" should be
fixed.

Why are different principal names used, when incremental propagation
requires that normal kprop (which uses host principals) be possible
anyways?

Why is this tied to kadmind, aside from (a) wanting to prevent other
db changes, which locking protocols should deal with anyways, (b)
existing acl code, (c) existing server process?

The incremental propagation protocol requires an ACL entry on the
primary, listing the replica. Since the full-resync part uses normal
kprop, the replica also has to have an ACL entry for the primary. If
this is missing, I suspect the behavior will be that every two
minutes, the primary side will (at the prompting of the replica) dump
out the database and attempt a full propagation.
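
The locking idea noted under "Locking" (run the update-log write and
the DB update under one lock, whether or not the caller took it) can be
sketched in Python as follows. This is only an illustration and assumes
a POSIX system: it uses fcntl.flock on the log file, locked_update and
apply_db_update are hypothetical names, and the log record is a plain
text line, not the real update log layout.

```python
import fcntl
import os
import tempfile


def locked_update(log_path, record, apply_db_update):
    """Run the DB update and append `record` to the log under one file lock."""
    with open(log_path, "a") as log:
        fcntl.flock(log, fcntl.LOCK_EX)      # blocks until the lock is free
        try:
            apply_db_update()                # the real back-end call goes here
            log.write(record + "\n")
            log.flush()
        finally:
            fcntl.flock(log, fcntl.LOCK_UN)


db = {}
log_path = os.path.join(tempfile.mkdtemp(), "ulog.txt")
locked_update(log_path, "put user@EXAMPLE.COM",
              lambda: db.update({"user@EXAMPLE.COM": {"kvno": 1}}))
```

Because the lock is taken on the log file, it works whether or not the
back end does its own locking, which is the trade-off mentioned under
"Locking".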

Possible optimizations: If an existing dump file has a recent enough
serial number, just send it, without dumping again? Use just one dump
file instead of one per replica?

Requiring normal kprop means the replica still can't be behind a NAT
or firewall without special configuration. The incremental parts can
work in such a configuration, so long as outgoing TCP connections are
allowed.

Still limited to IPv4 because of limitations in MIT's version of the
RPC code. (This could be fixed for kprop, if IPv6 sites want to do
full propagation only. Doing incremental propagation over IPv6 will
take work on the RPC library, and probably introduce
backwards-incompatible ABI changes.)

Overflow checks for ulogentries times block size?

If the file can't be made the size indicated by ulogentries, should we
truncate or error out? If we error out, this could blow out when
resizing the log because of a too-large log entry.

The kprop invocation doesn't specify a realm name, so it'll only work
for the default realm. No clean way to specify a port number, either.
Would it be overkill to come up with a way to configure host+port for
kpropd on the primary? Preferably in a way that'd support cascading
propagations.
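
For the overflow question above (ulogentries times block size), a
checked multiplication is the usual shape of the fix. The Python sketch
below assumes, purely for illustration, that the log size must fit in
an unsigned 32-bit value and that the file has a fixed-size header;
MAX_SIZE, ULOG_HEADER_SIZE, and ulog_file_size are hypothetical names
and values, not taken from the code.

```python
MAX_SIZE = 2**32 - 1        # assumed limit: unsigned 32-bit size
ULOG_HEADER_SIZE = 64       # assumed header size, for illustration only


def ulog_file_size(ulogentries, block_size):
    """Return header + ulogentries * block_size, or raise if it won't fit."""
    if ulogentries < 0 or block_size <= 0:
        raise ValueError("invalid entry count or block size")
    # Divide instead of multiplying, as C code would to avoid wrapping.
    if ulogentries > (MAX_SIZE - ULOG_HEADER_SIZE) // block_size:
        raise OverflowError("ulogentries * block size exceeds the size limit")
    return ULOG_HEADER_SIZE + ulogentries * block_size
```

Whether an oversized request should then be truncated or rejected is
the open question in the paragraph that follows the overflow note; the
sketch simply rejects it.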

The kadmind process, when it needs to run kprop, extracts the replica
host name from the client principal name. It assumes that the
principal name will be of the form foo/hostname@REALM, and looks
specifically for the "/" and "@" to chop up the string form of the
name. If looking up that name won't give a working IPv4 address for
the replica, kprop will fail (and kpropd will keep waiting,
incremental updates will stop, etc).

When mapping between file offsets and structure addresses, we should
be careful about alignment. We're probably okay on current platforms,
but if we break log-format compatibility with Sun at some point, use
the chance to make the kdb_ent_header_t offsets be more strictly
aligned in the file. (16 or 32 bytes?)

Not thread safe! The kdb5.c code will get a lock on the update log
file while making changes, but the lock is per-process. Currently
there are no processes I know of that use multiple threads and change
the database. (There's the Novell patch to make the KDC
multithreaded, but the kdc-kdb-update option doesn't currently
compile.)

Logging in kpropd is poor to useless. If there are any problems, run
it in debug mode ("-d"). You'll still lose all output from the
invocation of kdb5_util dump and kprop run out of kadmind.

Other man page updates needed: Anything with new -x options.
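
The host-name extraction described earlier (looking for "/" and "@" in
the string form of the principal) can be made to fail loudly on names
that don't have the expected shape. The Python sketch below is an
assumption about what a stricter version could look like, not the
kadmind code; replica_host_from_principal is a hypothetical name, and
the sketch ignores any quoting of "/" and "@" inside components.

```python
def replica_host_from_principal(principal):
    """Return the host part of a two-component name like foo/hostname@REALM."""
    name, at, realm = principal.rpartition("@")
    if not at or not name or not realm:
        raise ValueError("no realm in principal: %r" % principal)
    components = name.split("/")
    if len(components) != 2 or not all(components):
        raise ValueError("expected foo/hostname, got: %r" % name)
    return components[1]


host = replica_host_from_principal("foo/replica.example.com@EXAMPLE.COM")
```

A name that parses cleanly can still fail to resolve to a working IPv4
address, so the kprop failure mode described above remains.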

Comments from lha:

Verify both client and server are demanding privacy from RPC.

Authorization code in check_iprop_rpcsec_auth is weird. Check realm
checking, is it trusting the client realm length?

What will happen if my realm is named "A" and I can get a cross realm
(though multihop) to ATHENA.MIT.EDU's iprop server?

Why is the ACL not applied before we get to the functions themselves?
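
The concern about trusting the client realm length can be illustrated
with a small Python sketch. This is not the code in
check_iprop_rpcsec_auth; both functions are hypothetical and only
contrast a comparison bounded by the client's realm length with an
exact one.

```python
def realm_matches_prefix(client_realm, local_realm):
    """Flawed: compares only as many characters as the client's realm has."""
    n = len(client_realm)
    return client_realm[:n] == local_realm[:n]


def realm_matches_exact(client_realm, local_realm):
    """Requires equal length and equal contents."""
    return client_realm == local_realm


flawed = realm_matches_prefix("A", "ATHENA.MIT.EDU")
strict = realm_matches_exact("A", "ATHENA.MIT.EDU")
```

With the length-bounded version, a client from realm "A" would be
accepted by a server for ATHENA.MIT.EDU, which is the scenario raised
in the question above; the exact comparison rejects it.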