Copyright (c) 2004, Sun Microsystems, Inc. All Rights Reserved
The contents of this file are subject to the terms of the Common Development and Distribution License (the "License"). You may not use this file except in compliance with the License.
You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE or http://www.opensolaris.org/os/licensing. See the License for the specific language governing permissions and limitations under the License.
When distributing Covered Code, include this CDDL HEADER in each file and include the License file at usr/src/OPENSOLARIS.LICENSE. If applicable, add the following below this CDDL HEADER, with the fields enclosed by brackets "[]" replaced with your own identifying information: Portions Copyright [yyyy] [name of copyright owner]
/sbin/metainit -h
/sbin/metainit [generic options] concat/stripe numstripes width component... [-i interlace]
/sbin/metainit [width component... [-i interlace]] [-h hot_spare_pool]
/sbin/metainit [generic options] mirror -m submirror [read_options] [write_options] [pass_num]
/sbin/metainit [generic options] RAID -r component... [-i interlace] [-h hot_spare_pool] [-k] [-o original_column_count]
/sbin/metainit [generic options] hot_spare_pool [hotspare...]
/sbin/metainit [generic options] metadevice-name
/sbin/metainit [generic options] -a
/sbin/metainit [generic options] softpart -p [-e] component [-A alignment] size
/sbin/metainit -r
The metainit command configures metadevices and hot spares according to the information specified on the command line. Alternatively, you can run metainit so that it uses configuration entries you specify in the /etc/lvm/md.tab file (see md.tab(4)). All metadevices must be set up by the metainit command before they can be used.
Solaris Volume Manager supports storage devices and logical volumes greater than 1 terabyte (TB) when a system runs a 64-bit Solaris kernel. Support for large volumes is automatic. If a device greater than 1 TB is created, Solaris Volume Manager configures it appropriately and without user intervention.
If a system with large volumes is rebooted under a 32-bit Solaris kernel, the large volumes are visible through metastat output. Large volumes cannot be accessed, modified or deleted, and no new large volumes can be created. Any volumes or file systems on a large volume in this situation are unavailable. If a system with large volumes is rebooted under a version of Solaris prior to the Solaris 9 4/03 release, Solaris Volume Manager does not start. You must remove all large volumes before Solaris Volume Manager runs under an earlier version of the Solaris Operating System.
If you edit the /etc/lvm/md.tab file to configure metadevices, specify one complete configuration entry per line. You then run the metainit command with either the -a option, to activate all metadevices you entered in the /etc/lvm/md.tab file, or with the metadevice name corresponding to a specific configuration entry.
metainit does not maintain the state of the volumes that would have been created when metainit is run with both the -a and -n flags. Any volumes in md.tab that have dependencies on other volumes in md.tab are reported as errors when metainit -a -n is run, although the operations might succeed when metainit -a is run. See md.tab(4).
Solaris Volume Manager never updates the /etc/lvm/md.tab file. Complete configuration information is stored in the metadevice state database, not md.tab. The only way information appears in md.tab is through editing it by hand.
When setting up a disk mirror, the first step is to use metainit create a one-on-one concatenation for the root slice. See EXAMPLES.
The following options are supported:
Root privileges are required for all of the following options except -h.
The following generic options are supported: -f
Forces the metainit command to continue even if one of the slices contains a mounted file system or is being used as swap, or if the stripe being created is smaller in size than the underlying soft partition. This option is required when configuring mirrors on root (/), swap, and /usr.
Displays usage message.
Checks the syntax of your command line or md.tab entry without actually setting up the metadevice. If used with -a, all devices are checked but not initialized.
Only used in a shell script at boot time. Sets up all metadevices that were configured before the system crashed or was shut down. The information about previously configured metadevices is stored in the metadevice state database (see metadb(1M)).
Specifies the name of the diskset on which metainit works. Without the -s option, the metainit command operates on your local metadevices and/or hotspares.
The following concat/stripe options are supported: concat/stripe
Specifies the metadevice name of the concatenation, stripe, or concatenation of stripes being defined.
Specifies the number of individual stripes in the metadevice. For a simple stripe, numstripes is always 1. For a concatenation, numstripes is equal to the number of slices. For a concatenation of stripes, numstripes varies according to the number of stripes.
Specifies the number of slices that make up a stripe. When width is greater than 1, the slices are striped.
The logical name for the physical slice (partition) on a disk drive, such as /dev/dsk/c0t0d0s0. For RAID level 5 metadevices, a minimum of three slices is necessary to enable striping of the parity information across slices.
Specifies the interlace size. This value tells Solaris Volume Manager how much data to place on a slice of a striped or RAID level 5 metadevice before moving on to the next slice. interlace is a specified value, followed by either `k' for kilobytes, `m' for megabytes, or `b' for blocks. The characters can be either uppercase or lowercase. The interlace specified cannot be less than 16 blocks, or greater than 100 megabytes. If interlace is not specified, it defaults to 512 kilobytes.
Specifies the hot_spare_pool to be associated with the metadevice. If you use the command line, the hot spare pool must have been previously created by the metainit command before it can be associated with a metadevice. Use /-h hspnnn when the concat/stripe being created is to be used as a submirror. Names for hot spare pools can be any legal file name that is composed of alphanumeric characters, a dash ("-"), an underscore ("_"), or a period ("."). Names must begin with a letter. The words "all" and "none" are reserved and cannot be used.
The following mirror options are supported: mirror -m submirror
Specifies the metadevice name of the mirror. The -m indicates that the configuration is a mirror. submirror is a metadevice (stripe or concatentation) that makes up the initial one-way mirror. Solaris Volume Manager supports a maximum of four-way mirroring. When defining mirrors, first create the mirror with the metainit command as a one-way mirror. Then attach subsequent submirrors using the metattach command. This method ensures that Solaris Volume Manager properly syncs the mirrors. (The second and any subsequent submirrors are first created using the metainit command.)
The following read options for mirrors are supported: -g
Enables the geometric read option, which results in faster performance on sequential reads.
Directs all reads to the first submirror. This should only be used when the devices comprising the first submirror are substantially faster than those of the second mirror. This flag cannot be used with the -g flag.
The following write options for mirrors are supported: -S
Performs serial writes to mirrors. The first submirror write completes before the second is started. This can be useful if hardware is susceptible to partial sector failures. If -S is not specified, writes are replicated and dispatched to all mirrors simultaneously.
A number in the range 0-9 at the end of an entry defining a mirror that determines the order in which that mirror is resynced during a reboot. The default is 1. Smaller pass numbers are resynced first. Equal pass numbers are run concurrently. If 0 is used, the resync is skipped. 0 should be used only for mirrors mounted as read-only, or as swap.
The following RAID level 5 options are available: RAID -r
Specifies the name of the RAID level 5 metadevice. The -r specifies that the configuration is RAID level 5.
For RAID level 5 metadevices, informs the driver that it is not to initialize (zero the disk blocks) due to existing data. Only use this option to recreate a previously created RAID level 5 device. Use the -k option with extreme caution. This option sets the disk blocks to the OK state. If any errors exist on disk blocks within the metadevice, Solaris Volume Manager might begin fabricating data. Instead of using the -k option, you might want to initialize the device and restore data from tape.
For RAID level 5 metadevices, used with the -k option to define the number of original slices in the event the originally defined metadevice was grown. This is necessary since the parity segments are not striped across concatenated devices. Use the -o option with extreme caution. This option sets the disk blocks to the OK state. If any errors exist on disk blocks within the metadevice, Solaris Volume Manager might begin fabricating data. Instead of using the -o option, you might want to initialize the device and restore data from tape.
The following soft partition options are supported: softpart -p [-e] component [-A alignment] size
The softpart argument specifies the name of the soft partition. The -p specifies that the configuration is a soft partition. The -e specifies that the entire disk specified by component as c*t*d* should be repartitioned and reserved for soft partitions. The specified component is repartitioned such that slice 7 reserves space for system (state database replica) usage and slice 0 contains all remaining space on the disk. Slice 7 is a minimum of 4MB, but can be larger, depending on the disk geometry. The newly created soft partition is placed on slice 0 of the device. The component argument specifies the disk (c*t*d*), slice (c*t*d*s*), or meta device (d*) from which to create the soft partition. The size argument determines the space to use for the soft partition and can be specified in K or k for kilobytes, M or m for megabytes, G or g for gigabytes, T or t for terabyte (one terabyte is the maximum size), and B or b for blocks (sectors). All values represent powers of 2, and upper and lower case options are equivalent. Only integer values are permitted. The -A alignment option sets the value of the soft partition extent alignment. This option used when it is important specify a starting offset for the soft partition. It preserves the data alignment between the metadevice address space and the address space of the underlying physical device. For example, a hardware device that does checksumming should not have its I/O requests divided by Solaris Volume Manager. In this case, use a value from the hardware configuration as the value for the alignment. When you use this option in conjunction with a software I/O load, the alignment value corresponds to the I/O load of the application. This prevents I/O from being divided unnecessarily and affecting performance. The literal all, used instead of specifying size, specifies that the soft partition should occupy all available space on the device.
The following hot spare pool options are supported: hot_spare_pool [ hotspare... ]
When used as arguments to the metainit command, hot_spare_pool defines the name for a hot spare pool, and hotspare... is the logical name for the physical slice(s) for availability in that pool. Names for hot spare pools can be any legal file name that is composed of alphanumeric characters, a dash ("-"), an underscore ("_"), or a period ("."). Names must begin with a letter. The words "all" and "none" are reserved and cannot be used.
The following md.tab file options are supported: metadevice-name
When the metainit command is run with a metadevice-name as its only argument, it searches the /etc/lvm/md.tab file to find that name and its corresponding entry. The order in which entries appear in the md.tab file is unimportant. For example, consider the following md.tab entry:
d0 2 1 c1t0d0s0 1 c2t1d0s0When you run the command metainit d0, it configures metadevice d0 based on the configuration information found in the md.tab file.
Activates all metadevices defined in the md.tab file. metainit does not maintain the state of the volumes that would have been created when metainit is run with both the -a and -n flags. If a device d0 is created in the first line of the md.tab file, and a later line in md.tab assumes the existence of d0, the later line fails when metainit -an runs (even if it would succeed with metainit -a).
Example 1 Creating a One-on-One Concatenation
The following command creates a one-on-one concatenation for the root slice. This is the first step you take when setting up a mirror for the root slice (and any other slice that cannot be unmounted). The -f option is required to create a volume with an existing file system, such as root(/).
# metainit -f dl 1 1 c0t0d0s0
The preceding command makes d1 a one-on-one concatenation, using the root slice. You can then enter:
# metainit d0 -m d1
...to make a one-way mirror of the root slice.
Example 2 Concatenation
All drives in the following examples have the same size of 525 Mbytes.
This example shows a metadevice, /dev/md/dsk/d7, consisting of a concatenation of four slices.
# metainit d7 4 1 c0t1d0s0 1 c0t2d0s0 1 c0t3d0s0 1 /dev/dsk/c0t4d0s0
The number 4 indicates there are four individual stripes in the concatenation. Each stripe is made of one slice, hence the number 1 appears in front of each slice. The first disk sector in all of these devices contains a disk label. To preserve the labels on devices /dev/dsk/c0t2d0s0, /dev/dsk/c0t3d0s0, and /dev/dsk/c0t4d0s0, the metadisk driver must skip at least the first sector of those disks when mapping accesses across the concatenation boundaries. Because skipping only the first sector would create an irregular disk geometry, the entire first cylinder of these disks is skipped. This allows higher level file system software to optimize block allocations correctly.
Example 3 Stripe
This example shows a metadevice, /dev/md/dsk/d15, consisting of two slices.
# metainit d15 1 2 c0t1d0s0 c0t2d0s0 -i 32k
The number 1 indicates that one stripe is being created. Because the stripe is made of two slices, the number 2 follows next. The optional -i followed by 32k specifies the interlace size as 32 Kbytes. If the interlace size were not specified, the stripe would use the default value of 16 Kbytes.
Example 4 Concatentation of Stripes
This example shows a metadevice, /dev/md/dsk/d75, consisting of a concatenation of two stripes of three disks.
# metainit d75 2 3 c0t1d0s0 c0t2d0s0 \e c0t3d0s0 -i 16k \e 3 c1t1d0s0 c1t2d0s0 c1t3d0s0 -i 32k
On the first line, the -i followed by 16k specifies that the stripe interlace size is 16 Kbytes. The second set specifies the stripe interlace size as 32 Kbytes. If the second set did not specify 32 Kbytes, the set would use the default interlace value of 16 Kbytes. The blocks of each set of three disks are interlaced across three disks.
Example 5 Mirroring
This example shows a two-way mirror, /dev/md/dsk/d50, consisting of two submirrors. This mirror does not contain any existing data.
# metainit d51 1 1 c0t1d0s0 # metainit d52 1 1 c0t2d0s0 # metainit d50 -m d51 # metattach d50 d52
In this example, two submirrors, d51 and d52, are created with the metainit command. These two submirrors are simple concatenations. Next, a one-way mirror, d50, is created using the -m option with d51. The second submirror is attached later using the metattach command. When creating a mirror, any combination of stripes and concatenations can be used. The default read and write options in this example are a round-robin read algorithm and parallel writes to all submirrors.
Example 6 Creating a metadevice in a diskset
This example shows a metadevice, /dev/md/dsk/d75, consisting of a concatenation of two stripes within a diskset called set1.
# metainit -s set1 d75 2 3 c2t1d0s0 c2t2d0s0 \e c2t3d0s0 -i 32k # metainit -s set1 d51 1 1 c2t1d0s0 # metainit -s set1 d52 1 1 c3t1d0s0 # metainit -s set1 d50 -m d51 # metattach -s set1 d50 d52
In this example, a diskset is created using the metaset command. Metadevices are then created within the diskset using the metainit command. The two submirrors, d51 and d52, are simple concatenations. Next, a one-way mirror, d50, is created using the -m option with d51. The second submirror is attached later using the metattach command. When creating a mirror, any combination of stripes and concatenations can be used. The default read and write options in this example are a round-robin read algorithm and parallel writes to all submirrors.
Example 7 RAID Level 5
This example shows a RAID level 5 device, d80, consisting of three slices:
# metainit d80 -r c1t0d0s0 c1t1d0s0 c1t3d0s0 -i 20k
In this example, a RAID level 5 metadevice is defined using the -r option with an interlace size of 20 Kbytes. The data and parity segments are striped across the slices, c1t0d0s0, c1t2d0s0, and c1t3d0s0.
Example 8 Soft Partition
The following example shows a soft partition device, d1, built on metadevice d100 and 100 Mbytes (indicated by 100M) in size:
# metainit d1 -p d100 100M
The preceding command creates a 100 Mbyte soft partition on the d100 metadevice. This metadevice could be a RAID level 5, stripe, concatenation, or mirror.
Example 9 Soft Partition on Full Disk
The following example shows a soft partition device, d1, built on disk c3t4d0:
# metainit d1 -p -e c3t4d0 9G
In this example, the disk is repartitioned and a soft partition is defined to occupy all 9 Gbytes of disk c3t4d0s0.
Example 10 Soft Partition Taking All Available Space
The following example shows a soft partition device, d1, built on disk c3t4d0:
# metainit d1 -p -e c3t4d0 all
In this example, the disk is repartitioned and a soft partition is defined to occupy all available disk space on slice c3t4d0s0.
Example 11 Hot Spare
This example shows a two-way mirror, /dev/md/dsk/d10, and a hot spare pool with three hot spare components. The mirror does not contain any existing data.
# metainit hsp001 c2t2d0s0 c3t2d0s0 c1t2d0s0 # metainit d41 1 1 c1t0d0s0 -h hsp001 # metainit d42 1 1 c3t0d0s0 -h hsp001 # metainit d40 -m d41 # metattach d40 d42
In this example, a hot spare pool, hsp001, is created with three slices from three different disks used as hot spares. Next, two submirrors are created, d41 and d42. These are simple concatenations. The metainit command uses the -h option to associate the hot spare pool hsp001 with each submirror. A one-way mirror is then defined using the -m option. The second submirror is attached using the metattach command.
Example 12 Setting the Value of the Soft Partition Extent Alignment
This example shows how to set the alignment of the soft partition to 1 megabyte.
# metainit -s red d13 -p c1t3d0s4 -A 1m 4m
In this example the soft partition, d13, is created with an extent alignment of 1 megabyte. The metainit command uses the -A option with an alignment of 1m to define the soft partition extent alignment.
Contains list of metadevice and hot spare configurations for batch-like creation.
This section contains information on different types of warnings.
Do not create large (>1 TB) volumes if you expect to run the Solaris Operating Environment with a 32-bit kernel or if you expect to use a version of the Solaris Operating Environment prior to Solaris 10.
Do not use the metainit command to create a multi-way mirror. Rather, create a one-way mirror with metainit then attach additional submirrors with metattach. When the metattach command is not used, no resync operations occur and data could become corrupted.
If you use metainit to create a mirror with multiple submirrors, the following message is displayed:
WARNING: This form of metainit is not recommended. The submirrors may not have the same data. Please see ERRORS in metainit(1M) for additional information.
When creating stripes on top of soft partitions it is possible for the size of the new stripe to be less than the size of the underlying soft partition. If this occurs, metainit fails with an error indicating the actions required to overcome the failure.
If you use the -f option to override this behavior, the following message is displayed:
WARNING: This form of metainit is not recommended. The stripe is truncating the size of the underlying device. Please see ERRORS in metainit(1M) for additional information.
When mirroring data in Solaris Volume Manager, transfers from memory to the disks do not all occur at exactly the same time for all sides of the mirror. If the contents of buffers are changed while the data is in-flight to the disk (called write-on-write), then different data can end up being stored on each side of a mirror.
This problem can be addressed by making a private copy of the data for mirror writes, however, doing this copy is expensive. Another approach is to detect when memory has been modified across a write by looking at the dirty-bit associated with the memory page. Solaris Volume Manager uses this dirty-bit technique when it can. Unfortunately, this technique does not work for raw I/O or direct I/O. By default, Solaris Volume Manager is tuned for performance with the liability that mirrored data might be out of sync if an application does a "write-on-write" to buffers associated with raw I/O or direct I/O. Without mirroring, you were not guaranteed what data would actually end up on media, but multiple reads would return the same data. With mirroring, multiple reads can return different data. The following line can be added to /etc/system to cause a stable copy of the buffers to be used for all raw I/O and direct I/O write operations.
set md_mirror:md_mirror_wow_flg=0x20
Setting this flag degrades performance.
The following exit values are returned: 0
Successful completion.
An error occurred.
See attributes(5) for descriptions of the following attributes:
ATTRIBUTE TYPE ATTRIBUTE VALUE |
Interface Stability Stable |
mdmonitord(1M), metaclear(1M), metadb(1M), metadetach(1M), metahs(1M), metaoffline(1M), metaonline(1M), metaparam(1M), metarecover(1M), metarename(1M), metareplace(1M), metaroot(1M), metaset(1M), metassist(1M), metastat(1M), metasync(1M), metattach(1M), md.tab(4), md.cf(4), mddb.cf(4), md.tab(4), attributes(5), md(7D)
Recursive mirroring is not allowed; that is, a mirror cannot appear in the definition of another mirror.
Recursive logging is not allowed; that is, a trans metadevice cannot appear in the definition of another metadevice.
Stripes, concatenations, and RAID level 5 metadevices must consist of slices only.
Mirroring of RAID level 5 metadevices is not allowed.
Soft partitions can be built on raw devices, or on stripes, RAID level 5, or mirrors.
RAID level 5 or stripe metadevices can be built directly on soft partitions.
Trans metadevices have been replaced by UFS logging. Existing trans devices are not logging--they pass data directly through to the underlying device. See mount_ufs(1M) for more information about UFS logging.