#
# CDDL HEADER START
#
# The contents of this file are subject to the terms of the
# Common Development and Distribution License (the "License").
# You may not use this file except in compliance with the License.
#
# You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
# or http://www.opensolaris.org/os/licensing.
# See the License for the specific language governing permissions
# and limitations under the License.
#
# When distributing Covered Code, include this CDDL HEADER in each
# file and include the License file at usr/src/OPENSOLARIS.LICENSE.
# If applicable, add the following below this CDDL HEADER, with the
# fields enclosed by brackets "[]" replaced with your own identifying
# information: Portions Copyright [yyyy] [name of copyright owner]
#
# CDDL HEADER END
#

#
# Copyright 2008 Sun Microsystems, Inc.  All rights reserved.
# Use is subject to license terms.
#


Why 32-bit libelf is not Large File Aware
-----------------------------------------

The 32-bit ELF format (ELFCLASS32) uses unsigned 32-bit integers for
offsets, so the theoretical limit on a 32-bit ELF object is 4GB.
However, 32-bit libelf imposes a 2GB limit on the objects it can
create. The Solaris link-editor and related tools are all based on
libelf, so the 32-bit version of the link-editor also has a 2GB limit,
despite the theoretical limit of 4GB.

Large file support (LFS) is a half step between the 32-bit and 64-bit
worlds, in which an otherwise 32-bit limited process is allowed to
read and write data to a file that can be larger than 2GB (the extent
of a signed 32-bit integer, as represented by the system type off_t).
LFS is useful if the program only needs to access a small subset of
the file data at any given time (e.g. /usr/bin/cat). It is less useful
if the program needs to access a large amount of data at once: having
been freed from the file size limit, the program will simply hit the
virtual memory limit (4GB).
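
As an illustration of the two limits mentioned above, the following
sketch contrasts the range of a signed 32-bit off_t with that of an
unsigned 32-bit ELF offset. It is not libelf code; the helper names
are hypothetical and exist only for this example.

```c
#include <stdint.h>

/* Largest offset a signed 32-bit off_t can hold: 2^31 - 1 (just under 2GB). */
static const int64_t SIGNED32_MAX_OFFSET = INT32_MAX;

/* Largest offset an unsigned 32-bit ELF field can hold: 2^32 - 1 (just under 4GB). */
static const int64_t UNSIGNED32_MAX_OFFSET = UINT32_MAX;

/* Hypothetical helper: returns 1 if off fits in a signed 32-bit off_t. */
static int
fits_signed32(int64_t off)
{
	return (off >= 0 && off <= SIGNED32_MAX_OFFSET);
}

/* Hypothetical helper: returns 1 if off fits in an unsigned 32-bit ELF offset. */
static int
fits_elf32(int64_t off)
{
	return (off >= 0 && off <= UNSIGNED32_MAX_OFFSET);
}
```

An offset of 3GB, for example, is representable in the ELF format
itself but not in a 32-bit off_t.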

In particular, the link-editor generally requires twice as much
memory as the size of the output object, half to hold the input
objects, and half to hold the result. This means that a 32-bit
link-editor process will hit the 2GB file size limit and the 4GB
address space limit at roughly the same time. As a result, a
large file aware 32-bit version of libelf has no significant value.
Despite this, the question of what it would take to make libelf
large file aware comes up from time to time.
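
The arithmetic behind this can be sketched as follows. The factor of
two is the rough rule of thumb stated above, not a measured figure,
and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Rough estimate from the text: one half of the memory holds the
 * input objects, the other half holds the output image.
 */
static uint64_t
link_memory_estimate(uint64_t output_size)
{
	assert(output_size <= UINT64_MAX / 2);	/* guard against overflow */
	return (2 * output_size);
}

/* Returns 1 if the estimate stays below the 4GB 32-bit address space. */
static int
fits_32bit_address_space(uint64_t output_size)
{
	const uint64_t addr_space = (uint64_t)1 << 32;

	return (link_memory_estimate(output_size) < addr_space);
}
```

Under this estimate a 2GB output object already needs the entire 4GB
address space, so lifting the 2GB file limit alone buys very little.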

The first step would be to provide alternative versions of
all public data structures that involve the off_t data type.
These structs, found in /usr/include/libelf.h, are:

	/*
	 * Archive member header
	 */
	typedef struct {
		char		*ar_name;
		time_t		ar_date;
		uid_t		ar_uid;
		gid_t		ar_gid;
		mode_t		ar_mode;
		off_t		ar_size;
		char		*ar_rawname;
	} Elf_Arhdr;


	/*
	 * Data descriptor
	 */
	typedef struct {
		Elf_Void	*d_buf;
		Elf_Type	d_type;
		size_t		d_size;
		off_t		d_off;		/* offset into section */
		size_t		d_align;	/* alignment in section */
		unsigned	d_version;	/* elf version */
	} Elf_Data;

As off_t is a signed 32-bit type in the default 32-bit compilation
environment, these alternative versions would have to use the 64-bit
off64_t type instead.
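
A minimal sketch of what such an alternative data descriptor might
look like is shown below. No such type exists in libelf: the name
Example_Elf_Data64 and the stand-in typedef are hypothetical, and
int64_t is used in place of off64_t so the fragment compiles without
any large file headers.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for off64_t. */
typedef int64_t	example_off64_t;

/*
 * Hypothetical large file aware variant of Elf_Data.  Only the
 * off_t member changes type; everything else mirrors the original.
 */
typedef struct {
	void		*d_buf;		/* Elf_Void * in libelf.h */
	int		d_type;		/* Elf_Type in libelf.h */
	size_t		d_size;
	example_off64_t	d_off;		/* 64-bit offset into section */
	size_t		d_align;	/* alignment in section */
	unsigned	d_version;	/* elf version */
} Example_Elf_Data64;
```

The ar_size member of Elf_Arhdr would need the same treatment.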

In addition to providing alternative large file aware Elf_Arhdr and
Elf_Data types, it would be necessary to implement large file aware
versions of the public functions that use them, also found in
/usr/include/libelf.h:

	/*
	 * Function declarations
	 */
	unsigned  elf_flagdata(Elf_Data *, Elf_Cmd, unsigned);
	Elf_Arhdr *elf_getarhdr(Elf *);
	off_t	  elf_getbase(Elf *);
	Elf_Data  *elf_getdata(Elf_Scn *, Elf_Data *);
	Elf_Data  *elf_newdata(Elf_Scn *);
	Elf_Data  *elf_rawdata(Elf_Scn *, Elf_Data *);
	off_t	  elf_update(Elf *, Elf_Cmd);
	Elf_Data  *elf32_xlatetof(Elf_Data *, const Elf_Data *, unsigned);
	Elf_Data  *elf32_xlatetom(Elf_Data *, const Elf_Data *, unsigned);
	Elf_Data  *elf64_xlatetof(Elf_Data *, const Elf_Data *, unsigned);
	Elf_Data  *elf64_xlatetom(Elf_Data *, const Elf_Data *, unsigned);

It is important to note that these new versions cannot replace the
original definitions. Those must continue to be available to support
non-large-file-aware programs. These new types and functions would be in
addition to the pre-existing versions.

When you make code like this large file aware, it is necessary to undertake
a careful analysis of the code to ensure that all the surrounding code uses
variable types large enough to handle the increased range. Hence, this work
is more complicated than simply supplying variants that use a bigger
off_t and rebuilding; that is just the first step.
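
The kind of problem such an analysis looks for can be sketched as
follows. This is an illustration, not code taken from libelf, and
the function names are hypothetical: a 64-bit offset passes through
a 32-bit variable somewhere in the surrounding code and is silently
truncated.

```c
#include <stdint.h>

/*
 * Stores a 64-bit offset in a 32-bit variable, as code written
 * before large files existed might do.  Conversion to an unsigned
 * type is well defined in C: the value is reduced modulo 2^32.
 */
static uint32_t
store_offset_narrow(uint64_t off)
{
	uint32_t narrow = (uint32_t)off;	/* high 32 bits discarded */

	return (narrow);
}

/* Returns 1 if the offset survives the round trip, 0 if truncated. */
static int
offset_preserved(uint64_t off)
{
	return ((uint64_t)store_offset_narrow(off) == off);
}
```

A 5GB offset comes back as 1GB, with no error reported. Widening the
public types does nothing to catch intermediate variables like this
one.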

There are two standard preprocessor definitions used to control
large file support:

	_LARGEFILE64_SOURCE
	_FILE_OFFSET_BITS

These preprocessor definitions would be used to determine whether
a given program linked against libelf would see the regular, or
the large file aware versions of the above types and routines.
This is the same approach used in other large file capable software,
such as libc.
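
In libc, defining _FILE_OFFSET_BITS as 64 makes off_t itself a 64-bit
type, while _LARGEFILE64_SOURCE makes the explicit 64-bit names (such
as off64_t) visible alongside the regular ones. A header could select
between two sets of types in a similar way. The sketch below assumes
hypothetical example_* names and is not taken from libelf.h.

```c
/* The application requests the 64-bit interfaces before any #include. */
#ifndef _FILE_OFFSET_BITS
#define	_FILE_OFFSET_BITS	64
#endif

#include <stdint.h>

/* Hypothetical stand-ins for the regular and large file offset types. */
typedef int32_t	example_off32_t;
typedef int64_t	example_off64_t;

/* The header maps the generic name onto one of the two variants. */
#if _FILE_OFFSET_BITS == 64
typedef example_off64_t	example_off_t;
#else
typedef example_off32_t	example_off_t;
#endif

/* Reports the width, in bits, of the offset type that was selected. */
static int
example_off_bits(void)
{
	return ((int)(sizeof (example_off_t) * 8));
}
```

A program compiled without the definition would continue to see the
32-bit variant, which is why both sets must remain available.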

Finally, all the applications that rely on libelf would need to be made
large file aware. As with libelf itself, there is more to such an effort
than recompiling with preprocessor macros set. The code in these
applications would need to be examined carefully. Some of these programs
are very old, and were not originally written with such type portability
in mind. Such code can be difficult to transition.

To work around the 2GB limit in 32-bit libelf:

    - The fundamental limits of a 32-bit address space mean
      that a program this large should be 64-bit. Only a 64-bit
      address space has enough room for that much code, plus the
      stack and heap needed to do useful work with it.

    - The 64-bit version of libelf is also able to process
      32-bit objects, and does not have a 2GB file size limit.
      Therefore, the 64-bit link-editor can be used to build a 32-bit
      executable which is >2GB. The resulting program will consume over
      half the available address space just to start running. However,
      there may be enough address space left for it to do useful work.

      Note that the 32-bit limit for sharable objects remains at
      2GB. This limit is imposed by the runtime linker, which is
      also not large file aware.