Copyright (c) 2008, Sun Microsystems, Inc. All Rights Reserved
The contents of this file are subject to the terms of the Common Development and Distribution License (the "License"). You may not use this file except in compliance with the License. You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE or http://www.opensolaris.org/os/licensing.
See the License for the specific language governing permissions and limitations under the License. When distributing Covered Code, include this CDDL HEADER in each file and include the License file at usr/src/OPENSOLARIS.LICENSE. If applicable, add the following below this CDDL HEADER, with the
fields enclosed by brackets "[]" replaced with your own identifying information: Portions Copyright [yyyy] [name of copyright owner]
METASET 1M "Mar 4, 2009"
NAME
metaset - configure disk sets
SYNOPSIS

/usr/sbin/metaset -s setname [-M] -a -h hostname...

/usr/sbin/metaset -s setname -A {enable | disable}

/usr/sbin/metaset -s setname [-A {enable | disable}] -a -h hostname...

/usr/sbin/metaset -s setname -a [-l length] [-L] drivename...

/usr/sbin/metaset -s setname -C {take | release | purge}

/usr/sbin/metaset -s setname -d [-f] -h hostname...

/usr/sbin/metaset -s setname -d [-f] drivename...

/usr/sbin/metaset -s setname -j

/usr/sbin/metaset -s setname -r

/usr/sbin/metaset -s setname -w

/usr/sbin/metaset -s setname -t [-f] [-u tagnumber] [-y]

/usr/sbin/metaset -s setname -b

/usr/sbin/metaset -s setname -P

/usr/sbin/metaset -s setname -q

/usr/sbin/metaset -s setname -o [-h hostname]

/usr/sbin/metaset [-s setname]

/usr/sbin/metaset [-s setname] {-a | -d} -m mediator_host_list
DESCRIPTION

The metaset command administers sets of disks in named disk sets. Named disk sets include any disk set that is not in the local set. While disk sets enable a high-availability configuration, Solaris Volume Manager itself does not actually provide a high-availability environment.

A single-owner disk set configuration manages storage on a SAN or fabric-attached storage, or provides namespace control and state database replica management for a specified set of disks.

In a shared disk set configuration, multiple hosts are physically connected to the same set of disks. When one host fails, another host can take exclusive access to the disks. Each host can control a shared disk set, but only one host can control it at a time.

When you add a new disk to any disk set, Solaris Volume Manager checks the disk format. If necessary, it repartitions the disk to ensure that the disk has an appropriately configured reserved slice 7 (or slice 6 on an EFI labelled device) with adequate space for a state database replica. The precise size of slice 7 (or slice 6 on an EFI labelled device) depends on the disk geometry. For traditional disk sets, the slice is no less than 4 Mbytes, and probably closer to 6 Mbytes, depending on where the cylinder boundaries lie. For multi-owner disk sets, the slice is a minimum of 256 Mbytes. The minimal size for slice 7 might change in the future. Such a change would be based on a variety of factors, including the size of the state database replica and the information to be stored in it.

For use in disk sets, disks must have a dedicated slice (six or seven) that meets specific criteria:

The slice must start at sector 0.

The slice must include enough space for the disk label.

The state database replicas cannot be mounted.

The slice must not overlap with any other slices, including slice 2.

If the existing partition table does not meet these criteria, or if the -L flag is specified, Solaris Volume Manager repartitions the disk. A small portion of each drive is reserved in slice 7 (or slice 6 on an EFI labelled device) for use by Solaris Volume Manager. The remainder of the space on each drive is placed into slice 0. Any existing data on the disks is lost by repartitioning.

After you add a drive to a disk set, it can be repartitioned as necessary, with the exception that slice 7 (or slice 6 on an EFI labelled device) is not altered in any way.

After a disk set is created and metadevices are set up within the set, the metadevice name is in the following form:

/dev/md/setname/{dsk,rdsk}/dnumber

where setname is the name of the disk set, and number is the number of the metadevice (0-127).
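The naming form above can be sketched as a small shell helper. This is only an illustration: the function name md_path is hypothetical and is not part of Solaris Volume Manager.

```sh
# Build the path of a metadevice inside a named disk set.
# Hypothetical helper; usage: md_path SETNAME {dsk|rdsk} NUMBER
md_path() {
    set_name=$1
    dev_type=$2
    md_number=$3
    case $dev_type in
        dsk|rdsk) ;;
        *) echo "md_path: type must be dsk or rdsk" >&2; return 1 ;;
    esac
    printf '/dev/md/%s/%s/d%s\n' "$set_name" "$dev_type" "$md_number"
}

md_path relo-red dsk 10
```

For the disk set relo-red and metadevice number 10, the helper prints the block device path; passing rdsk instead yields the raw device path.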

If you have disk sets that you upgraded from Solstice DiskSuite software, the default state database replica size on those sets is 1034 blocks, not the 8192 block size from Solaris Volume Manager. Also, slice 7 on the disks that were added under Solstice DiskSuite is correspondingly smaller than slice 7 on disks that were added under Solaris Volume Manager.

If disks you add to a disk set have acceptable slice 7s (that start at cylinder 0 and that have sufficient space for the state database replica), they are not reformatted.

Hot spare pools within local disk sets use standard Solaris Volume Manager naming conventions. Hot spare pools within shared disk sets use the following convention:

setname/hot_spare_pool

where setname is the name of the disk set, and hot_spare_pool is the name of the hot spare pool associated with the disk set.

Multi-node Environment

To create and work with a disk set in a multi-node environment, root must be a member of Group 14 on all hosts, or the /.rhosts file must contain an entry for all other host names. This is not required in a Sun Cluster 3.x environment.

Tagged Data

Tagged data occurs when there are different versions of a disk set's replicas. This tagged data consists of the set owner's nodename, the hardware serial number of the owner, and the time it was written out to the available replicas. The system administrator can use this information to determine which replica contains the correct data.

When a disk set is configured with an even number of storage enclosures and has replicas balanced across them evenly, it is possible that up to half of the replicas can be lost (for example, through a power failure of half of the storage enclosures). After the enclosure that went down is rebooted, half of the replicas are not recognized by SVM. When the set is retaken, the metaset command returns an error of "stale databases", and all of the metadevices are in a read-only state.

Some of the replicas that are not recognized need to be deleted. The action of deleting the replicas also causes updates to the replicas that are not being deleted. In a dual hosted disk set environment, the second node can access the deleted replicas instead of the existing replicas when it takes the set. This leads to the possibility of getting the wrong replica record on a disk set take. An error message is displayed, and user intervention is required.

Use the -q option to query the disk set and the -t, -u, and -y options to select the tag and take the disk set. See OPTIONS.

Mediator Configuration

SVM provides support for a low-end HA solution consisting of two hosts that share only two strings of drives. The hosts in this type of configuration, referred to as mediators or mediator hosts, run a special daemon, rpc.metamedd(1M). The mediator hosts take on additional responsibilities to ensure that data is available in the case of host or drive failures.

A mediator configuration can survive the failure of a single host or a single string of drives, without administrative intervention. If both a host and a string of drives fail (multiple failures), the integrity of the data cannot be guaranteed. At this point, administrative intervention is required to make the data accessible. See mediator(7D) for further details.

Use the -m option to add or delete a mediator host. See OPTIONS.

OPTIONS

The following options are supported:

-a drivename

Add drives or hosts to the named set. For a drive to be accepted into a set, the drive must not be in use within another metadevice or disk set, mounted on, or swapped on. When the drive is accepted into the set, it is repartitioned and the metadevice state database replica (for the set) can be placed on it. However, if slice 7 (or slice 6 on an EFI labelled device) starts at cylinder 0 and is large enough to hold a state database replica, then the disk is not repartitioned. Also, a drive is not accepted if it cannot be found on all hosts specified as part of the set. This means that if a host within the specified set is unreachable due to network problems, or is administratively down, the add fails. Specify a drive name in the form cnumtnumdnum. Do not specify a slice number (snum). For drives in a Sun Cluster, you must specify a complete pathname for each drive. Such a name has the form:

/dev/did/[r]dsk/dnum

-a | -d | -m mediator_host_list

Add (-a) mediator hosts to, or delete (-d) mediator hosts from, the specified disk set. A mediator_host_list is the nodename(4) of the mediator host to be added and (for adding) up to two other aliases for the mediator host. The nodename and aliases for each mediator host are separated only by commas. Up to three mediator hosts can be specified for the named disk set. To delete a mediator host, specify only the nodename of that host as the argument to -m. In a single metaset command you can add or delete up to three mediator hosts. See EXAMPLES.
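The comma-separated form of one mediator host entry can be sketched as follows. The helper name mediator_entry is hypothetical; it only joins a nodename and up to two aliases and does not contact any host.

```sh
# Join a mediator nodename and up to two aliases with commas.
# Hypothetical helper; usage: mediator_entry NODENAME [ALIAS1 [ALIAS2]]
mediator_entry() {
    if [ $# -lt 1 ] || [ $# -gt 3 ]; then
        echo "usage: mediator_entry nodename [alias1 [alias2]]" >&2
        return 1
    fi
    # "$*" joins the arguments with the first character of IFS.
    ( IFS=,; printf '%s\n' "$*" )
}

mediator_entry myhost1 alias1
```

Each resulting string is one argument in the mediator_host_list passed to -m.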

-A {enable | disable}

Specify auto-take status for a disk set. If auto-take is enabled for a set, the disk set is automatically taken at boot, and file systems on volumes within the disk set can be mounted through /etc/vfstab entries. Only a single host can be associated with an auto-take set, so attempts to add a second host to an auto-take set or attempts to configure a disk set with multiple hosts as auto-take fail with an error message. Disabling auto-take status for a specific disk set causes the disk set to revert to normal behavior. That is, the disk set is potentially shared (non-concurrently) among hosts, and unavailable for mounting through /etc/vfstab.

-b

Ensure that the replicas are distributed according to the replica layout algorithm. This can be invoked at any time, and does nothing if the replicas are correctly distributed. In cases where the user has used the metadb command to manually remove or add replicas, this command can be used to ensure that the distribution of replicas matches the replica layout algorithm.

-C {take | release | purge}

Do not interact with the Cluster Framework when used in a Sun Cluster 3 environment. In effect, this means do not modify the Cluster Configuration Repository. These options should only be used to fix a broken disk set configuration.

take

Take ownership of the disk set but do not inform the Cluster Framework that the disk set is available. This option is not for use with a multi-owner disk set.

release

Release ownership of the disk set without informing the Cluster Framework. This option should only be used if the disk set ownership was taken with the corresponding -C take option. This option is not for use with a multi-owner disk set.

purge

Remove the disk set without informing the Cluster Framework that the disk set has been purged. This option should only be used when the disk set is not accessible and requires rebuilding.

-d drivename

Delete drives or hosts from the named disk set. For a drive to be deleted, it must not be in use within the set. The last host cannot be deleted unless all of the drives within the set are deleted. Deleting the last host in a disk set destroys the disk set. Specify a drive name in the form cnumtnumdnum. Do not specify a slice number (snum). For drives in a Sun Cluster, you must specify a complete pathname for each drive. Such a name has the form:

/dev/did/[r]dsk/dnum

This option fails on a multi-owner disk set if attempting to withdraw the master node while other nodes are in the set.

-f

Force one of three actions to occur: take ownership of a disk set when used with -t; delete the last disk drive from the disk set; or delete the last host from the disk set. Deleting the last drive or host from a disk set requires the -d option. When used to forcibly take ownership of the disk set, this causes the disk set to be grabbed whether or not another host owns the set. All of the disks within the set are taken over (reserved) and fail fast is enabled, causing the other host to panic if it had disk set ownership. The metadevice state database is read in by the host performing the take, and the shared metadevices contained in the set are accessible. You can use this option to delete the last drive in the disk set, because this drive would implicitly contain the last state database replica. You can use the -f option to delete hosts from a set. When specified with a partial list of hosts, it can be used for one-host administration. One-host administration could be useful when a host is known to be non-functional, thus avoiding timeouts and failed commands. When specified with a complete list of hosts, the set is completely deleted. It is generally specified with a complete list of hosts to clean up after one-host administration has been performed.

-h hostname...

Specify one or more host names to be added to or deleted from a disk set. Adding the first host creates the set. The last host cannot be deleted unless all of the drives within the set have been deleted. The host name is not accepted if all of the drives within the set cannot be found on the specified host. The host name is the same name found in /etc/nodename.

-j

Join a host to the owner list for a multi-owner disk set. The concepts of take and release, used with traditional disk sets, do not apply to multi-owner sets, because multiple owners are allowed. As a host boots and is brought online, it must go through three configuration levels to be able to use a multi-owner disk set:

1. It must be included in the cluster nodelist, which happens automatically in a cluster or single-node situation.

2. It must be added to the multi-owner disk set with the -a -h options documented elsewhere in this man page.

3. It must join the set. When the host is first added to the set, it is automatically joined.

On manual restarts, the administrator must manually issue

metaset -s multinodesetname -j

to join the host to the owner list. After the cluster reconfiguration, when the host reenters the cluster, the node is automatically joined to the set. The metaset -j command joins the host to all multi-owner sets that the host has been added to. In a single node situation, joining the node to the disk set starts any necessary resynchronizations.

-L

When adding a disk to a disk set, force the disk to be repartitioned using the standard Solaris Volume Manager algorithm. See DESCRIPTION.

-l length

Set the size (in blocks) for the metadevice state database replica. The length can only be set when adding a new drive; it cannot be changed on an existing drive. The default (and maximum) size is 8192 blocks, which should be appropriate for most configurations. Replica sizes of less than 128 blocks are not recommended.

-M

Specify that the disk set to be created or modified is a multi-owner disk set that supports multiple concurrent owners. This option is required when creating a multi-owner disk set. Its use is optional on all other operations on a multi-owner disk set and has no effect. Existing disk sets cannot be converted to multi-owner sets.

-o

Return an exit status of 0 if the local host or the host specified with the -h option is the owner of the disk set.
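Because -o reports ownership only through its exit status, it is convenient in scripts. The following is a minimal sketch; the function name owns_set and the METASET override variable are assumptions made for illustration (the variable only makes the helper easy to try without the real command).

```sh
# Path to the metaset command; overridable for experimentation (assumption).
METASET=${METASET:-/usr/sbin/metaset}

# Succeed (exit 0) when the local host, or HOSTNAME if given, owns the set.
# Hypothetical helper; usage: owns_set SETNAME [HOSTNAME]
owns_set() {
    if [ $# -ge 2 ]; then
        "$METASET" -s "$1" -o -h "$2" >/dev/null 2>&1
    else
        "$METASET" -s "$1" -o >/dev/null 2>&1
    fi
}
```

A script might then guard an operation with a test such as: if owns_set relo-red; then echo "this host owns relo-red"; fi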

-P

Purge the named disk set from the node on which the metaset command is run. The disk set must not be owned by the node that runs this command. If the node does own the disk set, the command fails. If you need to delete a disk set but cannot take ownership of the set, use the -P option.

-q

Display an enumerated list of tags pertaining to "tagged data" that can be encountered during a take of the ownership of a disk set. This option is not for use with a multi-owner disk set.

-r

Release ownership of a disk set. All of the disks within the set are released. The metadevices set up within the set are no longer accessible. This option is not for use with a multi-owner disk set.

-s setname

Specify the name of a disk set on which metaset works. If no setname is specified, all disk sets are returned.

-t

Take ownership of a disk set safely. If metaset finds that another host owns the set, this host is not allowed to take ownership of the set. If the set is not owned by any other host, all the disks within the set are owned by the host on which metaset was executed. The metadevice state database is read in, and the shared metadevices contained in the set become accessible. The -t option takes a disk set that has stale databases. When the databases are stale, metaset exits with code 66, and prints a message. At that point, the only operations permitted are the addition and deletion of replicas. Once the addition or deletion of the replicas has been completed, the disk set should be released and retaken to gain full access to the data. This option is not for use with a multi-owner disk set.

-u tagnumber

Once a tag has been selected, a subsequent take with -u tagnumber can be executed to select the data associated with the given tagnumber.

-w

Withdraw a host from the owner list for a multi-owner disk set. The concepts of take and release, used with traditional disk sets, do not apply to multi-owner sets, because multiple owners are allowed. Instead of releasing a set, a host can issue

metaset -s multinodesetname -w

to withdraw from the owner list. A host automatically withdraws on a reboot, but can be manually withdrawn if it should not be able to use the set, but should be able to rejoin at a later time. A host that withdrew due to a reboot can still appear joined from other hosts in the set until a reconfiguration cycle occurs. metaset -w withdraws from ownership of all multi-owner sets of which the host is a member. This option fails if you attempt to withdraw the master node while other nodes are in the disk set owner list. This option cancels all resyncs running on the node. A cluster reconfiguration process that is removing a node from the cluster membership list effectively withdraws the host from the ownership list.

-y

Execute a subsequent take. If the take operation encounters "tagged data," the take operation exits with code 2. You can then run the metaset command with the -q option to see an enumerated list of tags.
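The exit codes named in the -t and -y descriptions (66 for stale databases, 2 for tagged data) can be interpreted in a script along these lines. This is a sketch under stated assumptions: the function name take_set, the METASET override variable, and the message texts are hypothetical, and any other non-zero status is simply reported as a failure.

```sh
# Path to the metaset command; overridable for experimentation (assumption).
METASET=${METASET:-/usr/sbin/metaset}

# Attempt a safe take and explain the exit status.
# Hypothetical helper; usage: take_set SETNAME
take_set() {
    status=0
    "$METASET" -s "$1" -t || status=$?
    case $status in
        0)  echo "taken" ;;
        2)  echo "tagged data: list tags with -q, then retake with -t -u tagnumber" ;;
        66) echo "stale databases: add or delete replicas, then release and retake" ;;
        *)  echo "failed with status $status" ;;
    esac
    return $status
}
```

The helper returns the original exit status, so a caller can still branch on it after the message is printed.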

EXAMPLES

Example 1 Defining a Disk Set

This example defines a disk set.

# metaset -s relo-red -a -h red blue

The name of the disk set is relo-red. The names of the first and second hosts added to the set are red and blue, respectively. (The hostname is found in /etc/nodename.) Adding the first host creates the disk set. A disk set can be created with just one host, with the second added later. The last host cannot be deleted until all of the drives within the set have been deleted.

Example 2 Adding Drives to a Disk Set

This example adds drives to a disk set.

# metaset -s relo-red -a c2t0d0 c2t1d0 c2t2d0 c2t3d0 c2t4d0 c2t5d0

The name of the previously created disk set is relo-red. The names of the drives are c2t0d0, c2t1d0, c2t2d0, c2t3d0, c2t4d0, and c2t5d0. There is no slice identifier ("sx") at the end of the drive names.

Example 3 Adding Multiple Mediator Hosts

The following command adds three mediator hosts to the specified disk set.

# metaset -s mydiskset -a -m myhost1,alias1 myhost2,alias2 myhost3,alias3

Example 4 Purging a Disk Set from the Node

The following command purges the disk set relo-red from the node:

# metaset -s relo-red -P

Example 5 Querying a Disk Set for Tagged Data

The following command queries the disk set relo-red for a list of the tagged data:

# metaset -s relo-red -q

This command produces the following results:

The following tag(s) were found:
 1 - vha-1000c - Fri Sep 20 17:20:08 2002
 2 - vha-1000c - Mon Sep 23 11:01:27 2002
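When a script needs the tag numbers for a later take with -u, they can be pulled out of output shaped like the listing above. The sketch below assumes each tag line has the form "N - nodename - timestamp" as in this example; the function name list_tag_numbers is hypothetical, and the sample text is copied from the example rather than produced by a live command.

```sh
# Print the tag numbers found in "metaset -q" style output read from stdin.
# Assumes tag lines look like " N - nodename - timestamp".
list_tag_numbers() {
    awk '$2 == "-" && $1 ~ /^[0-9]+$/ { print $1 }'
}

# Sample output copied from the example above.
sample_output='The following tag(s) were found:
 1 - vha-1000c - Fri Sep 20 17:20:08 2002
 2 - vha-1000c - Mon Sep 23 11:01:27 2002'

printf '%s\n' "$sample_output" | list_tag_numbers
```

In practice the input would come from metaset -s relo-red -q rather than from a fixed string.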

Example 6 Selecting a Tag and Taking a Disk Set

The following command selects a tag and takes the disk set relo-red:

# metaset -s relo-red -t -u 2

Example 7 Defining a Multi-Owner Disk Set

The following command defines a multi-owner disk set:

# metaset -s blue -M -a -h hahost1 hahost2

The name of the disk set is blue. The names of the first and second hosts added to the set are hahost1 and hahost2, respectively. The hostname is found in /etc/nodename. Adding the first host creates the multi-owner disk set. A disk set can be created with just one host, with additional hosts added later. The last host cannot be deleted until all of the drives within the set have been deleted.

FILES

/etc/lvm/md.tab

Contains list of metadevice configurations.

EXIT STATUS

The following exit values are returned:

0

Successful completion.

>0

An error occurred.

ATTRIBUTES

See attributes(5) for descriptions of the following attributes:

ATTRIBUTE TYPE         ATTRIBUTE VALUE
Interface Stability    Stable

SEE ALSO

mdmonitord(1M), metaclear(1M), metadb(1M), metadetach(1M), metahs(1M), metainit(1M), metaoffline(1M), metaonline(1M), metaparam(1M), metarecover(1M), metarename(1M), metareplace(1M), metaroot(1M), metassist(1M), metastat(1M), metasync(1M), metattach(1M), md.tab(4), md.cf(4), mddb.cf(4), attributes(5), md(7D)

NOTES

Disk set administration, including the addition and deletion of hosts and drives, requires all hosts in the set to be accessible from the network.