.\"
.\" Copyright (c) 2002 Poul-Henning Kamp
.\" Copyright (c) 2002 Networks Associates Technology, Inc.
.\" All rights reserved.
.\"
.\" This software was developed for the FreeBSD Project by Poul-Henning Kamp
.\" and NAI Labs, the Security Research Division of Network Associates, Inc.
.\" under DARPA/SPAWAR contract N66001-01-C-8035 ("CBOSS"), as part of the
.\" DARPA CHATS research program.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\" 3. The names of the authors may not be used to endorse or promote
.\"    products derived from this software without specific prior written
.\"    permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $FreeBSD$
.\"
.Dd March 27, 2023
.Dt GEOM 4
.Os
.Sh NAME
.Nm GEOM
.Nd "modular disk I/O request transformation framework"
.Sh SYNOPSIS
.Cd options GEOM_BDE
.Cd options GEOM_CACHE
.Cd options GEOM_CONCAT
.Cd options GEOM_ELI
.Cd options GEOM_GATE
.Cd options GEOM_JOURNAL
.Cd options GEOM_LABEL
.Cd options GEOM_LINUX_LVM
.Cd options GEOM_MAP
.Cd options GEOM_MIRROR
.Cd options GEOM_MOUNTVER
.Cd options GEOM_MULTIPATH
.Cd options GEOM_NOP
.Cd options GEOM_PART_APM
.Cd options GEOM_PART_BSD
.Cd options GEOM_PART_BSD64
.Cd options GEOM_PART_EBR
.Cd options GEOM_PART_EBR_COMPAT
.Cd options GEOM_PART_GPT
.Cd options GEOM_PART_LDM
.Cd options GEOM_PART_MBR
.Cd options GEOM_PART_VTOC8
.Cd options GEOM_RAID
.Cd options GEOM_RAID3
.Cd options GEOM_SHSEC
.Cd options GEOM_STRIPE
.Cd options GEOM_UZIP
.Cd options GEOM_VIRSTOR
.Cd options GEOM_ZERO
.Sh DESCRIPTION
The
.Nm
framework provides an infrastructure in which
.Dq classes
can perform transformations on disk I/O requests on their path from
the upper kernel to the device drivers and back.
.Pp
Transformations in a
.Nm
context range from the simple geometric
displacement performed in typical disk partitioning modules over RAID
algorithms and device multipath resolution to full blown cryptographic
protection of the stored data.
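.Pp
As a rough illustration of the simplest of these transformations,
geometric displacement, the following C sketch shifts a request by the
starting offset of a partition.
It is a user-space toy written under stated assumptions: the names
toy_request, toy_partition and toy_displace are hypothetical, they are not
part of any kernel interface, and the real partitioning classes are
considerably more involved.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-in for an I/O request (not struct bio). */
struct toy_request {
	uint64_t offset;	/* byte offset relative to the provider */
	uint64_t length;	/* number of bytes requested */
};

/* Hypothetical description of one partition on the underlying disk. */
struct toy_partition {
	uint64_t start;		/* byte offset of the partition on the disk */
	uint64_t size;		/* size of the partition in bytes */
};

/*
 * Translate a request addressed to the partition into one addressed to
 * the underlying disk by adding the partition's starting offset.
 * Returns 0 on success, or -1 if the request does not fit inside the
 * partition (in which case *out is left untouched).
 */
int
toy_displace(const struct toy_partition *p, const struct toy_request *in,
    struct toy_request *out)
{
	/* Written to avoid overflow in offset + length. */
	if (in->offset > p->size || in->length > p->size - in->offset)
		return (-1);
	out->offset = p->start + in->offset;
	out->length = in->length;
	return (0);
}
```

The bounds check comes first so that a request can never be displaced
onto a neighboring partition.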
.Pp
Compared to traditional
.Dq "volume management" ,
.Nm
differs from most
and in some cases all previous implementations in the following ways:
.Bl -bullet
.It
.Nm
is extensible.
It is trivially simple to write a new class
of transformation and it will not be given stepchild treatment.
If
someone for some reason wanted to mount IBM MVS diskpacks, a class
recognizing and configuring their VTOC information would be a trivial
matter.
.It
.Nm
is topologically agnostic.
Most volume management implementations
have very strict notions of how classes can fit together; very often
one fixed hierarchy is provided, for instance, subdisk - plex -
volume.
.El
.Pp
Being extensible means that new transformations are treated no differently
than existing transformations.
.Pp
Fixed hierarchies are bad because they make it impossible to express
the intent efficiently.
In the fixed hierarchy above, it is not possible to mirror two
physical disks and then partition the mirror into subdisks; instead,
one is forced to make subdisks on the physical volumes and to mirror
these two and two, resulting in a much more complex configuration.
.Nm
on the other hand does not care in which order things are done;
the only restriction is that cycles in the graph will not be allowed.
.Sh "TERMINOLOGY AND TOPOLOGY"
.Nm
is quite object oriented and consequently the terminology
borrows a lot of context and semantics from the OO vocabulary:
.Pp
A
.Dq class ,
represented by the data structure
.Vt g_class ,
implements one
particular kind of transformation.
Typical examples are MBR disk
partition, BSD disklabel, and RAID5 classes.
.Pp
An instance of a class is called a
.Dq geom
and represented by the data structure
.Vt g_geom .
In a typical i386
.Fx
system, there
will be one geom of class MBR for each disk.
.Pp
A
.Dq provider ,
represented by the data structure
.Vt g_provider ,
is the front gate at which a geom offers service.
A provider is
.Do
a disk-like thing which appears in
.Pa /dev
.Dc - a logical
disk in other words.
All providers have three main properties:
.Dq name ,
.Dq sectorsize
and
.Dq size .
.Pp
A
.Dq consumer
is the backdoor through which a geom connects to another
geom provider and through which I/O requests are sent.
.Pp
The topological relationships between these entities are as follows:
.Bl -bullet
.It
A class has zero or more geom instances.
.It
A geom has exactly one class it is derived from.
.It
A geom has zero or more consumers.
.It
A geom has zero or more providers.
.It
A consumer can be attached to zero or one providers.
.It
A provider can have zero or more consumers attached.
.El
.Pp
All geoms have a rank-number assigned, which is used to detect and
prevent loops in the acyclic directed graph.
This rank number is
assigned as follows:
.Bl -enum
.It
A geom with no attached consumers has rank=1.
.It
A geom with attached consumers has a rank one higher than the
highest rank of the geoms of the providers its consumers are
attached to.
.El
.Sh "SPECIAL TOPOLOGICAL MANEUVERS"
In addition to the straightforward attach, which attaches a consumer
to a provider, and detach, which breaks the bond, a number of special
topological maneuvers exist to facilitate configuration and to
improve the overall flexibility.
.Bl -inset
.It Em TASTING
is a process that happens whenever a new class or new provider
is created, and it provides the class a chance to automatically configure an
instance on providers which it recognizes as its own.
A typical example is the MBR disk-partition class which will look for
the MBR table in the first sector and, if found and validated, will
instantiate a geom to multiplex according to the contents of the MBR.
.Pp
A new class will be offered to all existing providers in turn and a new
provider will be offered to all classes in turn.
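.Pp
The offering order just described can be sketched as a small user-space
model in C.
Everything here is an assumption made for illustration: toy_mesh,
toy_class, toy_provider and the functions operating on them are
hypothetical names, a string comparison stands in for inspecting on-disk
metadata, and the sketch does not model the new providers that a freshly
instantiated geom would itself offer for tasting.

```c
#include <stddef.h>
#include <string.h>

#define TOY_MAX 8	/* arbitrary capacity chosen for this sketch */

/* Hypothetical stand-in for a provider; "magic" mimics on-disk metadata. */
struct toy_provider {
	const char *name;
	const char *magic;
};

/* Hypothetical stand-in for a class and the metadata it recognizes. */
struct toy_class {
	const char *name;
	const char *magic;
	int ngeoms;		/* geoms instantiated so far */
};

/* Hypothetical registry of all known classes and providers. */
struct toy_mesh {
	struct toy_class *classes[TOY_MAX];
	size_t nclasses;
	const struct toy_provider *providers[TOY_MAX];
	size_t nproviders;
};

/* Offer one provider to one class; the class decides whether to accept. */
void
toy_taste(struct toy_class *cp, const struct toy_provider *pp)
{
	if (strcmp(cp->magic, pp->magic) == 0)
		cp->ngeoms++;
}

/* A new class is offered every existing provider in turn. */
int
toy_add_class(struct toy_mesh *m, struct toy_class *cp)
{
	size_t i;

	if (m->nclasses == TOY_MAX)
		return (-1);
	m->classes[m->nclasses++] = cp;
	for (i = 0; i < m->nproviders; i++)
		toy_taste(cp, m->providers[i]);
	return (0);
}

/* A new provider is offered to every existing class in turn. */
int
toy_add_provider(struct toy_mesh *m, const struct toy_provider *pp)
{
	size_t i;

	if (m->nproviders == TOY_MAX)
		return (-1);
	m->providers[m->nproviders++] = pp;
	for (i = 0; i < m->nclasses; i++)
		toy_taste(m->classes[i], pp);
	return (0);
}
```

Because both entry points funnel into the same taste function, the
outcome in this model does not depend on whether the class or the
provider arrived first.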
.Pp
Exactly what a class does to recognize if it should accept the offered
provider is not defined by
.Nm ,
but the sensible options are:
.Bl -bullet
.It
Examine specific data structures on the disk.
.It
Examine properties like
.Dq sectorsize
or
.Dq mediasize
for the provider.
.It
Examine the rank number of the provider's geom.
.It
Examine the method name of the provider's geom.
.El
.It Em ORPHANIZATION
is the process by which a provider is removed while
it potentially is still being used.
.Pp
When a geom orphans a provider, all future I/O requests will
.Dq bounce
on the provider with an error code set by the geom.
Any
consumers attached to the provider will receive notification about
the orphanization when the event loop gets around to it, and they
can take appropriate action at that time.
.Pp
A geom which came into being as a result of a normal taste operation
should self-destruct unless it has a way to keep functioning whilst
lacking the orphaned provider.
Geoms like disk slicers should therefore self-destruct whereas
RAID5 or mirror geoms will be able to continue as long as they do
not lose quorum.
.Pp
When a provider is orphaned, this does not necessarily result in any
immediate change in the topology: any attached consumers are still
attached, any opened paths are still open, any outstanding I/O
requests are still outstanding.
.Pp
The typical scenario is:
.Pp
.Bl -bullet -offset indent -compact
.It
A device driver detects a disk has departed and orphans the provider for it.
.It
The geoms on top of the disk receive the orphanization event and
orphan all their providers in turn.
Providers which are not attached to will typically self-destruct
right away.
This process continues in a quasi-recursive fashion until all
relevant pieces of the tree have heard the bad news.
.It
Eventually the buck stops when it reaches geom_dev at the top
of the stack.
.It
Geom_dev will call
.Xr destroy_dev 9
to stop any more requests from
coming in.
It will sleep until any and all outstanding I/O requests have
been returned.
It will explicitly close (i.e., zero the access counts), a change
which will propagate all the way down through the mesh.
It will then detach and destroy its geom.
.It
The geom whose provider is now detached will destroy the provider,
detach and destroy its consumer and destroy its geom.
.It
This process percolates all the way down through the mesh, until
the cleanup is complete.
.El
.Pp
While this approach seems byzantine, it does provide the maximum
flexibility and robustness in handling disappearing devices.
.Pp
The one absolutely crucial detail to be aware of is that if the
device driver does not return all I/O requests, the tree will
not unravel.
.It Em SPOILING
is a special case of orphanization used to protect
against stale metadata.
It is probably easiest to understand spoiling by going through
an example.
.Pp
Imagine a disk,
.Pa da0 ,
on top of which an MBR geom provides
.Pa da0s1
and
.Pa da0s2 ,
and on top of
.Pa da0s1
a BSD geom provides
.Pa da0s1a
through
.Pa da0s1e ,
and that both the MBR and BSD geoms have
autoconfigured based on data structures on the disk media.
Now imagine the case where
.Pa da0
is opened for writing and those
data structures are modified or overwritten: now the geoms would
be operating on stale metadata unless some notification system
can inform them otherwise.
.Pp
To avoid this situation, when the open of
.Pa da0
for write happens,
all attached consumers are told about this and geoms like
MBR and BSD will self-destruct as a result.
When
.Pa da0
is closed, it will be offered for tasting again
and, if the data structures for MBR and BSD are still there, new
geoms will instantiate themselves anew.
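.Pp
The spoil and retaste behavior in this example can be modeled with a
counter of write opens, as in the following C sketch.
It is only a toy under stated assumptions: toy_disk and its functions are
hypothetical names, a single flag stands in for the whole set of
autoconfigured geoms, and the exclusive-open rules that govern when such
an open is permitted are deliberately left out.

```c
/* Hypothetical model of a disk provider with autoconfigured geoms on top. */
struct toy_disk {
	int write_count;	/* current number of opens for writing */
	int metadata_present;	/* nonzero if recognizable metadata is on disk */
	int geoms_configured;	/* nonzero while autoconfigured geoms exist */
};

/* Offer the disk for tasting: geoms appear only if metadata is present. */
void
toy_disk_taste(struct toy_disk *d)
{
	d->geoms_configured = d->metadata_present;
}

/* Open for writing: the first writer spoils the geoms on top. */
void
toy_disk_open_write(struct toy_disk *d)
{
	if (d->write_count++ == 0)
		d->geoms_configured = 0;
}

/* Close a write open: once the last writer is gone, the disk is retasted. */
void
toy_disk_close_write(struct toy_disk *d)
{
	if (d->write_count > 0 && --d->write_count == 0)
		toy_disk_taste(d);
}
```

Keying both actions to the first open and the last close means that a
second concurrent writer neither spoils again nor triggers a premature
retaste.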
.Pp
Now for the fine print:
.Pp
If any of the paths through the MBR or BSD module were open, they
would have opened downwards with an exclusive bit thus rendering it
impossible to open
.Pa da0
for writing in that case.
Conversely,
the requested exclusive bit would render it impossible to open a
path through the MBR geom while
.Pa da0
is open for writing.
.Pp
From this it also follows that changing the size of open geoms can
only be done with their cooperation.
.Pp
Finally: the spoiling only happens when the write count goes from
zero to non-zero and the retasting happens only when the write count goes
from non-zero to zero.
.It Em CONFIGURE
is the process where the administrator issues instructions
for a particular class to instantiate itself.
There are multiple
ways to express intent in this case - a particular provider may be
specified with a level of override forcing, for instance, a BSD
disklabel module to attach to a provider which was not found palatable
during the TASTE operation.
.Pp
Finally, I/O is the reason we even do this: it concerns itself with
sending I/O requests through the graph.
.It Em "I/O REQUESTS" ,
represented by
.Vt "struct bio" ,
originate at a consumer,
are scheduled on its attached provider and, when processed, are returned
to the consumer.
It is important to realize that the
.Vt "struct bio"
which enters through the provider of a particular geom does not
.Do
come out on the other side
.Dc .
Even simple transformations like MBR and BSD will clone the
.Vt "struct bio" ,
modify the clone, and schedule the clone on their
own consumer.
Note that cloning the
.Vt "struct bio"
does not involve cloning the
actual data area specified in the I/O request.
.Pp
In total, four different I/O requests exist in
.Nm :
read, write, delete, and
.Dq "get attribute" .
.Pp
Read and write are self-explanatory.
.Pp
Delete indicates that a certain range of data is no longer used
and that it can be erased or freed as the underlying technology
supports.
Technologies like flash adaptation layers can arrange to erase
the relevant blocks before they will become reassigned and
cryptographic devices may want to fill random bits into the
range to reduce the amount of data available for attack.
.Pp
It is important to recognize that a delete indication is not a
request and consequently there is no guarantee that the data actually
will be erased or made unavailable unless guaranteed by specific
geoms in the graph.
If
.Dq "secure delete"
semantics are required, a
geom should be pushed which converts delete indications into (a
sequence of) write requests.
.Pp
.Dq "Get attribute"
supports inspection and manipulation
of out-of-band attributes on a particular provider or path.
Attributes are named by
.Tn ASCII
strings and they will be discussed in
a separate section below.
.El
.Pp
(Stay tuned while the author rests his brain and fingers: more to come.)
.Sh DIAGNOSTICS
Several flags are provided for tracing
.Nm
operations and unlocking
protection mechanisms via the
.Va kern.geom.debugflags
sysctl.
All of these flags are off by default, and great care should be taken in
turning them on.
.Bl -tag -width indent
.It 0x01 Pq Dv G_T_TOPOLOGY
Provide tracing of topology change events.
.It 0x02 Pq Dv G_T_BIO
Provide tracing of buffer I/O requests.
.It 0x04 Pq Dv G_T_ACCESS
Provide tracing of access check controls.
.It 0x08 (unused)
.It 0x10 (allow foot shooting)
Allow writing to Rank 1 providers.
This would, for example, allow the super-user to overwrite the MBR on the root
disk or write random sectors elsewhere to a mounted disk.
The implications are obvious.
.It 0x40 Pq Dv G_F_DISKIOCTL
This is unused at this time.
.It 0x80 Pq Dv G_F_CTLDUMP
Dump contents of gctl requests.
.El
.Sh SEE ALSO
.Xr libgeom 3 ,
.Xr geom 8 ,
.Xr DECLARE_GEOM_CLASS 9 ,
.Xr disk 9 ,
.Xr g_access 9 ,
.Xr g_attach 9 ,
.Xr g_bio 9 ,
.Xr g_consumer 9 ,
.Xr g_data 9 ,
.Xr g_event 9 ,
.Xr g_geom 9 ,
.Xr g_provider 9 ,
.Xr g_provider_by_name 9
.Sh HISTORY
This software was initially developed for the
.Fx
Project by
.An Poul-Henning Kamp
and NAI Labs, the Security Research Division of Network Associates, Inc.\&
under DARPA/SPAWAR contract N66001-01-C-8035
.Pq Dq CBOSS ,
as part of the
DARPA CHATS research program.
.Pp
The following obsolete
.Nm
components were removed in
.Fx 13.0 :
.Bl -bullet -offset indent -compact
.It
.Cd GEOM_BSD ,
.It
.Cd GEOM_FOX ,
.It
.Cd GEOM_MBR ,
.It
.Cd GEOM_SUNLABEL ,
and
.It
.Cd GEOM_VOL .
.El
.Pp
Use
.Bl -bullet -offset indent -compact
.It
.Cd GEOM_PART_BSD ,
.It
.Cd GEOM_MULTIPATH ,
.It
.Cd GEOM_PART_MBR ,
.It
.Cd GEOM_PART_VTOC8 ,
and
.It
.Cd GEOM_LABEL
.El
options, respectively, instead.
.Sh AUTHORS
.An Poul-Henning Kamp Aq Mt phk@FreeBSD.org