#
# CDDL HEADER START
#
# The contents of this file are subject to the terms of the
# Common Development and Distribution License (the "License").
# You may not use this file except in compliance with the License.
#
# You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
# or http://www.opensolaris.org/os/licensing.
# See the License for the specific language governing permissions
# and limitations under the License.
#
# When distributing Covered Code, include this CDDL HEADER in each
# file and include the License file at usr/src/OPENSOLARIS.LICENSE.
# If applicable, add the following below this CDDL HEADER, with the
# fields enclosed by brackets "[]" replaced with your own identifying
# information: Portions Copyright [yyyy] [name of copyright owner]
#
# CDDL HEADER END
#

#
# Copyright 2008 Sun Microsystems, Inc.  All rights reserved.
# Use is subject to license terms.
#
# ident	"%Z%%M%	%I%	%E% SMI"


Notes On Cross Link-Editor Support in libld.so
----------------------------------------------

The Solaris link-editor is used in two contexts:

	1) The standard ld command
	2) Via the runtime linker (ld.so.1), when a program
	   calls dlopen() on a relocatable object (ET_REL).

To support both uses, it is packaged as a sharable library (libld.so).
The ld command is therefore a simple wrapper that uses libld.

libld.so is a cross linker: it can link objects for a system other
than the one running the link-editor (e.g. a link-editor running on
an amd64 system processing sparc objects). Every instance of libld.so
therefore contains code for building objects for every supported
target. It is not necessary to build libld specifically for the
platform you're targeting. This is possible because we only support
Solaris/ELF, with a small number of platforms, and the additional code
required per target is small.

At initialization, the caller of libld.so specifies the type of objects
being linked. By default, the ld command determines the machine type and
class of the object being generated from the first ELF object processed
from the command line. The -64 and -ztarget options exist to change this
default, which is useful when creating an object entirely from an archive
library or a mapfile. During initialization, the link-editor configures
itself to build an output object of the specified type. This is done via
indirection, using the global ld_targ structure to access code, data, and
constants for the specified target.
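The indirection described above can be pictured with a minimal sketch.
All names in it (target_t, targ_sparc, targ_x86, ld_targ_sketch,
select_target, and the MACH_ constants) are hypothetical stand-ins
chosen for illustration. The real definitions are the Target typedef
in _libld.h and the per-target init functions, which carry far more
state than is shown here.

```c
#include <stddef.h>

/* Hypothetical machine identifiers; the real code uses ELF EM_ values. */
enum { MACH_SPARC = 1, MACH_X86 = 2 };

/* Hypothetical, heavily reduced stand-in for the Target typedef. */
typedef struct {
	int		t_mach;			/* machine type */
	unsigned	t_word_align;		/* example per-target constant */
	const char	*(*t_name)(void);	/* example per-target code */
} target_t;

static const char *name_sparc(void) { return ("sparc"); }
static const char *name_x86(void) { return ("x86"); }

static const target_t targ_sparc = { MACH_SPARC, 4, name_sparc };
static const target_t targ_x86 = { MACH_X86, 4, name_x86 };

/* Plays the role of ld_targ: common code reaches a target only via this. */
static target_t ld_targ_sketch;

/* Select the target once, at initialization. Returns 0 on success. */
static int
select_target(int mach)
{
	switch (mach) {
	case MACH_SPARC:
		ld_targ_sketch = targ_sparc;
		return (0);
	case MACH_X86:
		ld_targ_sketch = targ_x86;
		return (0);
	default:
		return (-1);	/* unsupported target */
	}
}
```

After a successful select_target() call, common code in this sketch
would call ld_targ_sketch.t_name() without knowing which target was
chosen. The tables for the other targets remain present but unused.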

There are two types of source files used to build libld.so:

	1) Common code used for all targets
	2) Target-specific code used only when linking for
	   a given target.

All of these files reside in usr/src/cmd/sgs/libld/common. However,
it is easy to see which files belong in each category by examining
the object lists maintained in usr/src/cmd/sgs/libld/Makefile.com.
In addition, the target-specific files usually include the target
in their name (e.g. machrel.sparc.c).

Although the target-dependent and target-independent (common) code
are well separated, they are interdependent. For example, the common
code is aware of target-specific section types that can occur only
for some targets (e.g. SHT_AMD64_UNWIND). This is not an architecture
that allows for arbitrary target support to be dynamically plugged
into an unchanged platform-independent core. Rather, it is an
organization that allows a common core to support all the targets it
knows about in a way that is understandable and maintainable. A truly
pluggable architecture would be considerably more opaque and complex,
and is neither necessary nor desirable, given the wide commonality
between modern computer architectures.

It is possible to add support for new targets to libld.so. The process
of doing so is largely a matter of examining the files for existing
platforms, studying the ABI for the new target platform, and then
filling in the missing pieces for the new target. The remainder of this
file consists of sections that describe some of the issues and steps
that you will encounter in adding a new target.

-----------------------------------------------------------------------------
The relocation code used by ld is shared by the runtime linker (ld.so.1)
and by the kernel module loader (krtld), and is therefore found under
usr/src/uts. You must add code for a relocation engine to support the
new target. To do so, examine the common header files:

	usr/src/uts/common/krtld/reloc.h
	usr/src/uts/common/krtld/reloc_defs.h

and the existing relocation engines:

	usr/src/uts/intel/amd64/krtld/doreloc.c
	usr/src/uts/intel/ia32/krtld/doreloc.c
	usr/src/uts/sparc/krtld/doreloc.c

The ABI for the target platform will be the primary information
you require. If your new system has attributes not found in an existing
target, you may have to add or modify fields in the Rel_entry struct
typedef (reloc_defs.h), or you may have to add new flags. Either sort
of change may require you to also modify the existing relocation
engines, and undoubtedly the common code in libld.so as well.

When compiled for use by libld, the relocation engine requires an
argument named "bswap". Each relocation engine must be prepared to
swap the data bytes it is operating on. This support allows a link-editor
running on a platform with a different byte order than the target to
build objects for that target. To see how this is implemented, and how
to ifdef that support so it only exists in the libld version of
the engine, examine the code for the existing engines.
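The pattern can be sketched as follows. This is an illustration only,
not the code in doreloc.c: the function name apply_word32, the macro
SKETCH_LIBLD, and the argument layout are assumptions made for this
example. It shows a relocation step that adds a value to a 32-bit
word, with the byte swapping support compiled in only for the libld
build.

```c
#include <stdint.h>
#include <string.h>

#define	SKETCH_LIBLD	1	/* hypothetical: set only for the libld build */

/* Reverse the byte order of a 32-bit value. */
static uint32_t
swap32(uint32_t v)
{
	return ((v >> 24) | ((v >> 8) & 0x0000ff00) |
	    ((v << 8) & 0x00ff0000) | (v << 24));
}

/*
 * Add value to the 32-bit word at off. In the libld build, bswap is
 * nonzero when the data is in the opposite byte order from the host.
 */
static void
#ifdef SKETCH_LIBLD
apply_word32(unsigned char *off, uint32_t value, int bswap)
#else
apply_word32(unsigned char *off, uint32_t value)
#endif
{
	uint32_t word;

	(void) memcpy(&word, off, sizeof (word));	/* may be unaligned */
#ifdef SKETCH_LIBLD
	if (bswap)
		word = swap32(word);	/* target order to host order */
#endif
	word += value;
#ifdef SKETCH_LIBLD
	if (bswap)
		word = swap32(word);	/* host order back to target order */
#endif
	(void) memcpy(off, &word, sizeof (word));
}
```

The arithmetic is always done in host byte order. The swaps on either
side of it are the only part that differs between the two builds.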

-----------------------------------------------------------------------------
You must create a target subdirectory in usr/src/cmd/sgs/include,
and construct a machdep_XXX.h file (where XXX is the name of the
target). The machdep files for the current platforms can be helpful:

	usr/src/cmd/sgs/include/sparc/machdep_sparc.h
	usr/src/cmd/sgs/include/i386/machdep_x86.h

Note that these files support both the 32-bit and 64-bit versions of
a given platform, so there is only one subdirectory and machdep
file for each platform (e.g. "sparc", instead of "sparc" and "sparcv9").

Once you have created the target machdep_XXX.h file, you must edit:

	usr/src/cmd/sgs/include/machdep.h

and add a #include for your new file to it, surrounded by the
appropriate #ifdef for the target platform.

This two-level structure allows us to #include machdep information
in two different ways:

	1) Code that wants support for the current platform,
	   regardless of which platform that is, can include
	   usr/src/cmd/sgs/include/machdep.h. The runtime linker
	   (ld.so.1) is the primary consumer of this form.

	2) Code that needs to support multiple targets must never
	   include the generic machdep.h from (1) above. Instead,
	   such code explicitly includes the machdep file for the target
	   it is interested in. For example:

		#include <sparc/machdep_sparc.h>

	   libld.so uses this form to build non-native target
	   code.

You will find that most of the constants defined in the target
machdep file end up as initialization for the information that
libld.so accesses via the ld_targ global variable.

-----------------------------------------------------------------------------
Study the definition of the Target typedef in

	usr/src/cmd/sgs/libld/common/_libld.h

This is the type of the ld_targ global variable. Filling in a complete
copy of this definition is the primary task involved in adding support
for a new target to libld.so, so it will be helpful to be familiar with
it before you dive deeper. libld follows two simple rules with regard
to ld_targ and the Target type:

	1) The target-independent common code can only access
	   target-dependent code or data via the ld_targ global
	   variable.

	2) The target-dependent code can access the common
	   code or data freely.

A given link-editor invocation is always for a single target. The choice
of target is made at initialization, and does not change within a
single link. Code for the other targets is present, but is not
accessed.

-----------------------------------------------------------------------------
Files containing the target-specific code to support the new
platform must be added to libld.so. Examine the object lists
in usr/src/cmd/sgs/libld/Makefile.com to see the files for existing
platforms, and read those files to get a sense of what is required.

Among the other files, every platform will have a file named
machrel.XXX.c. This file contains the relocation-related functions,
and it also contains an init function for the target. This init function
is responsible for initializing the ld_targ global variable so that
the common code will use the code and definitions for your
target.

You should start by creating a machrel.XXX.c file for your new
target. Add other files as needed. Be aware that any functions or
variables you place in these target-dependent files must either
be static, or must have names that will not collide with the names
used by the rest of libld.so. The easiest way to do this is to
add a target suffix to the end of all such non-local names
(e.g. foo_sparc() instead of foo()).

The existing code is the documentation for this phase of things. The
best way to determine what a given function should do is to read the
code for other platforms, taking into account the similarities and
differences in the ABI for your new target and those existing ones.

-----------------------------------------------------------------------------
You may find that your new target requires support for new concepts
not found in other targets. A common example of this might be
a new target-specific ELF section type (e.g. SHT_AMD64_UNWIND). Another
might be details involving PIC code and PLT generation (as found for
sparc). It may be necessary to add new fields to the ld_targ global
variable, and to modify the libld.so common code to use these new
fields.

It is a standard convention that NULL function pointers are used to
represent functionality not required by a given target. Although the
common code can consult ld_targ.t_m.m_mach to determine the target it
is linking for, and although there is some code that does this, it
is generally undesirable and unnecessary. Instead, the common code
should test for such pointers, as with this sparc-specific example
from update.c:

	/*
	 * Assign a got offset if necessary.
	 */
	if ((ld_targ.t_mr.mr_assign_got != NULL) &&
	    (*ld_targ.t_mr.mr_assign_got)(ofl, sdp) == S_ERROR)
		return ((Addr)S_ERROR);

It may be tempting to include information in the comment that explains
the target-specific nature of this, and that may even be appropriate.
Consider, however, that a new target may come along with the same feature
later, and if that occurs, your comments will instantly be dated. In general,
the use of ld_targ is a strong hint to the reader that they should go read
the target-specific code referenced to understand what is going on. It is
best to supply comments at the call site that describe the operation
in generic terms (e.g. "assign a got if necessary") instead of in
explicit target terms (e.g. "Assign a sparc got if necessary"). Of
course, some features are extremely target-specific (like amd64 unwind
sections), and it doesn't really help to be obscure in such cases.
This is a judgement call.

If you do add a new field to ld_targ that uses NULL to represent
an optional feature *YOU MUST DOCUMENT IT AS SUCH*. You will find
comments in _libld.h for existing optional fields. It suffices to
add a comment for your new field. In the absence of such a comment,
the common code assumes that all function pointers are safe to call
through (dereference) without first testing them.
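A minimal sketch of the convention follows. The names relfuncs_t,
rf_assign_got, assign_got_example, update_sym, and SK_ERROR are
hypothetical and exist only for this illustration. The real fields,
and the comments that mark the optional ones, are in _libld.h.

```c
#include <stddef.h>

#define	SK_ERROR	(-1)	/* hypothetical stand-in for S_ERROR */

/* Hypothetical reduced table of target-specific relocation functions. */
typedef struct {
	/*
	 * Optional: NULL if the target does not need it. The common
	 * code must test this pointer before calling through it.
	 */
	int	(*rf_assign_got)(int *got_off);
} relfuncs_t;

/* Example target function: hands out an 8-byte GOT slot. */
static int
assign_got_example(int *got_off)
{
	*got_off += 8;
	return (0);
}

/* Common code: assign a got offset if necessary. */
static int
update_sym(const relfuncs_t *rf, int *got_off)
{
	if ((rf->rf_assign_got != NULL) &&
	    ((*rf->rf_assign_got)(got_off) == SK_ERROR))
		return (SK_ERROR);
	return (0);
}
```

A target that sets rf_assign_got to NULL passes through update_sym()
untouched, and the common code never needs to ask which target it is
linking for.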

-----------------------------------------------------------------------------
Byte swapping is a big issue in cross linking, as the system running
the link-editor may have the opposite byte order from the target. It is
important to know when, and when not, to swap bytes.

If the build system and target have different byte orders, the
FLG_OF1_ENCDIFF bit of the ofl_flags1 field of the output file
descriptor will be set. If this bit is not set, the target and
system byte orders are the same, and no byte swapping
is required.

libld uses libelf to read and write objects. libelf automatically
swaps bytes for the sections it knows about, such as symbol tables,
relocation records, and the usual ELF plumbing. It is therefore never
necessary for your code to swap the bytes in this data. If you find that
this is not the case, you have probably uncovered a bug in libelf that
you should look into.

The big exception to libelf transparently handling byte swapping is
progbits sections (SHT_PROGBITS). libelf does not understand program
code or data as anything other than a series of byte values, and as such,
cannot do byte swapping for them. If your code examines or modifies
such data, you are responsible for handling the byte swapping required.

The OFL_SWAP_RELOC macros found in _libld.h can be helpful in making such
determinations. You should use these macros instead of writing your own
tests for this, as they have high documentation value. If you find they
don't handle your target, add a new one that does.

GOT and PLT sections are SHT_PROGBITS. You will probably find
that the vast majority of the byte swapping you have to handle
concerns these items.

libld contains generic functions for byte swapping:

	ld_bswap_Word();
	ld_bswap_Xword();

These functions are built on top of the BSWAP_ macros found
in usr/src/cmd/sgs/include/_machelf.h:

	BSWAP_HALF
	BSWAP_WORD
	BSWAP_XWORD

When copying data from one address to another in a cross link environment,
the source and/or destination addresses may not have adequate alignment for
the data type being copied. For example, a sparc platform cannot access
8-byte data types on 4-byte boundaries, but it might need to do so when
linking x86 objects where the alignment of such data can be 4. The
UL_ASSIGN macros can be used to copy potentially unaligned data:

	UL_ASSIGN_HALF
	UL_ASSIGN_WORD
	UL_ASSIGN_XWORD

The UL_ASSIGN_BSWAP macros do unaligned copies, and also perform
byte swapping when the linker host and target byte orders are
different:

	UL_ASSIGN_BSWAP_HALF
	UL_ASSIGN_BSWAP_WORD
	UL_ASSIGN_BSWAP_XWORD
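To make the idea concrete, here is a minimal sketch of an unaligned
copy, with and without byte swapping, for a 32-bit value. It is not
the definition of the UL_ASSIGN or BSWAP_ macros. The names
sk_bswap32, sk_assign32, and sk_assign_bswap32 are hypothetical, and
the sketch leaves the decision of whether a swap is needed to the
caller.

```c
#include <stdint.h>
#include <string.h>

/* Reverse the byte order of a 32-bit value. */
static uint32_t
sk_bswap32(uint32_t v)
{
	return ((v >> 24) | ((v >> 8) & 0x0000ff00) |
	    ((v << 8) & 0x00ff0000) | (v << 24));
}

/* Copy a 32-bit value between possibly unaligned addresses. */
static void
sk_assign32(void *dst, const void *src)
{
	(void) memcpy(dst, src, sizeof (uint32_t));
}

/* As sk_assign32(), but reverse the byte order during the copy. */
static void
sk_assign_bswap32(void *dst, const void *src)
{
	uint32_t v;

	(void) memcpy(&v, src, sizeof (v));
	v = sk_bswap32(v);
	(void) memcpy(dst, &v, sizeof (v));
}
```

Copying through memcpy() avoids dereferencing a misaligned pointer,
which is the access that would fault on a strict-alignment host.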

If you are reading or writing relocation data, the following
routines understand relocation records and will get/set the
proper amount of data while handling any needed swapping:

	ld_reloc_targval_get()
	ld_reloc_targval_set()

Byte swapping is a fertile area for mistakes. If you're having trouble
getting a successful link in a cross link situation, you should always
try the same link on a platform with the same byte order as the
target. If that link succeeds, then you are looking at a bug involving
incorrect byte swapping.

-----------------------------------------------------------------------------
As mentioned above, incorrect byte swapping is a common
error when developing libld target-specific code. In addition to
trying a build machine with the same byte order as the target, elfdump
can also be a powerful tool for debugging. The first step with
elfdump is to simply dump everything and read it, looking for obviously
bad information:

	% elfdump outobj 2>&1 | more

elfdump tries to do sanity checking on the objects it
displays. Hence, the following command is a common
idiom:

	% elfdump outobj > /dev/null

Any problems with the file that elfdump can detect will be
written to stderr.

-----------------------------------------------------------------------------
Once you have the target-specific code in place, you must modify the
libld initialization code so that it will know how to use it. This
logic is found in

	usr/src/cmd/sgs/libld/common/ldmain.c

in the function ld_init_target().

-----------------------------------------------------------------------------
The ld front end program that uses libld must be modified so that
the "-z target=platform" command line option recognizes your
new target. This code is found in

	usr/src/cmd/sgs/ld/common

The change consists of adding an additional strcasecmp() to the
command line processing for -ztarget.
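The shape of that change can be sketched as follows. This is not the
code in usr/src/cmd/sgs/ld/common: the function name parse_ztarget,
the SK_MACH_ constants, and the platform name "newplat" are all
hypothetical, and the existing platform names shown are assumed for
illustration.

```c
#include <strings.h>

/* Hypothetical machine identifiers; the real code uses ELF EM_ values. */
enum { SK_MACH_NONE = 0, SK_MACH_SPARC, SK_MACH_X86, SK_MACH_NEW };

/*
 * Map the platform name given to -z target= onto a machine type.
 * "newplat" stands for the target being added.
 */
static int
parse_ztarget(const char *name)
{
	if (strcasecmp(name, "sparc") == 0)
		return (SK_MACH_SPARC);
	if (strcasecmp(name, "x86") == 0)
		return (SK_MACH_X86);
	if (strcasecmp(name, "newplat") == 0)	/* the added comparison */
		return (SK_MACH_NEW);
	return (SK_MACH_NONE);	/* unrecognized platform */
}
```

Because strcasecmp() ignores case, "newplat" and "NewPlat" select the
same target in this sketch.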

-----------------------------------------------------------------------------
You probably changed things getting your target integrated.
Please update this document to reflect your changes.