1 /* 2 * CDDL HEADER START 3 * 4 * The contents of this file are subject to the terms of the 5 * Common Development and Distribution License, Version 1.0 only 6 * (the "License"). You may not use this file except in compliance 7 * with the License. 8 * 9 * You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE 10 * or http://www.opensolaris.org/os/licensing. 11 * See the License for the specific language governing permissions 12 * and limitations under the License. 13 * 14 * When distributing Covered Code, include this CDDL HEADER in each 15 * file and include the License file at usr/src/OPENSOLARIS.LICENSE. 16 * If applicable, add the following below this CDDL HEADER, with the 17 * fields enclosed by brackets "[]" replaced with your own identifying 18 * information: Portions Copyright [yyyy] [name of copyright owner] 19 * 20 * CDDL HEADER END 21 */ 22 /* 23 * Copyright 1998-2003 Sun Microsystems, Inc. All rights reserved. 24 * Use is subject to license terms. 25 */ 26 27 #pragma ident "%Z%%M% %I% %E% SMI" 28 29 /* 30 * Overview of sort(1) 31 * 32 * sort(1) implements a robust sorting program, compliant with the POSIX 33 * specifications for sort, that is capable of handling large sorts and merges 34 * in single byte and multibyte locales. Like most sort(1) implementations, 35 * this implementation uses an internal algorithm for sorting subsets of the 36 * requested data set and an external algorithm for sorting the subsets into the 37 * final output. In the current implementation, the internal algorithm is a 38 * ternary radix quicksort, modified from the algorithm described in Bentley and 39 * Sedgewick [1], while the external algorithm is a priority-queue based 40 * heapsort, as outlined in Sedgewick [2]. 41 * 42 * We use three major datatypes, defined in ./types.h: the line record, 43 * line_rec_t; the stream, stream_t; and the field definition, field_t. 44 * Because sort supports efficient code paths for each of the C, single-byte, 45 * and wide character/multibyte locales, each of these types contains unions 46 * and/or function pointers to describe appropriate properties or operations for 47 * each locale type. 48 * 49 * To utilize the radix quicksort algorithm with the potentially complex sort 50 * keys definable via the POSIX standard, we convert each line to a collatable 51 * string based on the key definition. This approach is somewhat different from 52 * historical implementations of sort(1), which have built a complex 53 * field-by-field comparison function. There are, of course, tradeoffs that 54 * accompany this decision, particularly when the duration of use of a given 55 * collated form is short. However, the maintenance costs of parallel 56 * conversion and collation functions are estimated to be high, and the 57 * performance costs of a shared set of functions were found to be excessive in 58 * prototype. 59 * 60 * [1] J. Bentley and R. Sedgewick, Fast Algorithms for Sorting and Searching 61 * Strings, in Eighth Annual ACM-SIAM Symposium on Discrete Algorithms, 62 * 1997 (SODA 1997), 63 * [2] R. Sedgewick, Algorithms in C, 3rd ed., vol. 1, Addison-Wesley, 1998. 64 */ 65 66 #include "main.h" 67 68 static sort_t S; 69 70 int 71 main(int argc, char *argv[]) 72 { 73 initialize_pre(&S); 74 75 if (options(&S, argc, argv)) 76 return (2); 77 78 initialize_post(&S); 79 80 if (S.m_check_if_sorted_only) 81 check_if_sorted(&S); 82 83 if (!S.m_merge_only) 84 internal_sort(&S); 85 86 merge(&S); 87 88 return (0); 89 } 90