#
# CDDL HEADER START
#
# The contents of this file are subject to the terms of the
# Common Development and Distribution License (the "License").
# You may not use this file except in compliance with the License.
#
# You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
# or http://www.opensolaris.org/os/licensing.
# See the License for the specific language governing permissions
# and limitations under the License.
#
# When distributing Covered Code, include this CDDL HEADER in each
# file and include the License file at usr/src/OPENSOLARIS.LICENSE.
# If applicable, add the following below this CDDL HEADER, with the
# fields enclosed by brackets "[]" replaced with your own identifying
# information: Portions Copyright [yyyy] [name of copyright owner]
#
# CDDL HEADER END
#

#
# Copyright 2009 Sun Microsystems, Inc.  All rights reserved.
# Use is subject to license terms.
#


Notes On Cross Link-Editor Support in libld.so
----------------------------------------------

The Solaris link-editor is used in two contexts:

	1) The standard ld command
	2) Via the runtime linker (ld.so.1), when a program
	   calls dlopen() on a relocatable object (ET_REL).

To support both uses, it is packaged as a sharable library (libld.so).
The ld command is therefore a simple wrapper that uses libld.

libld.so is a cross linker: it can link objects for a system other than
the one running the link-editor (e.g. a link-editor running on an amd64
system processing sparc objects). Every instance of libld.so therefore
contains code for building objects for every supported target. It is not
necessary to build libld specifically for the platform you're targeting.
This is possible because we only support Solaris/ELF, with a small number
of platforms, and the additional code required per target is small.

At initialization, the caller of libld.so specifies the type of objects
being linked. By default, the ld command determines the machine type and
class of the object being generated from the first ELF object processed
from the command line. The -64 and -ztarget options exist to change this
default, which is useful when creating an object entirely from an archive
library or a mapfile. During initialization, the link-editor configures
itself to build an output object of the specified type. This is done via
indirection, using the global ld_targ structure to access code, data, and
constants for the specified target.
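
The indirection can be pictured with a minimal sketch. Everything in it
is hypothetical and greatly simplified: the Target_sketch type, the sk_
names, the machine identifiers, and the PLT entry sizes are illustrative
assumptions, not the real Target definition or its contents.

```c
#include <stddef.h>

/* Hypothetical machine identifiers, for illustration only. */
enum { SK_MACH_SPARC = 1, SK_MACH_X86 = 2 };

/* Hypothetical, greatly simplified stand-in for the real Target type. */
typedef struct {
	int	t_mach;			/* machine identifier */
	int	(*t_plt_entsize)(void);	/* target-supplied code */
} Target_sketch;

/* Target-dependent code: names carry a target suffix. */
static int plt_entsize_sparc(void) { return (32); }	/* arbitrary value */
static int plt_entsize_x86(void) { return (16); }	/* arbitrary value */

static const Target_sketch sk_targ_sparc =
	{ SK_MACH_SPARC, plt_entsize_sparc };
static const Target_sketch sk_targ_x86 =
	{ SK_MACH_X86, plt_entsize_x86 };

/* Selected once at initialization; plays the role of ld_targ. */
static const Target_sketch *sk_ld_targ = NULL;

/* Select the target. Returns 0 on success, -1 for an unknown machine. */
int
sk_init_target(int mach)
{
	switch (mach) {
	case SK_MACH_SPARC:
		sk_ld_targ = &sk_targ_sparc;
		return (0);
	case SK_MACH_X86:
		sk_ld_targ = &sk_targ_x86;
		return (0);
	default:
		return (-1);
	}
}

/* Common code: reaches target code only through the global. */
int
sk_common_plt_entsize(void)
{
	return ((*sk_ld_targ->t_plt_entsize)());
}
```

In this sketch, only sk_init_target() names specific targets. The common
routine sk_common_plt_entsize() works unchanged for whichever target was
selected.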

There are two types of source files used to build libld.so:

	1) Common code used for all targets
	2) Target specific code used only when linking for
	   a given target.

All of these files reside in usr/src/cmd/sgs/libld/common. However,
it is easy to see which files belong in each category by examining
the object lists maintained in usr/src/cmd/sgs/libld/Makefile.com.
In addition, the target-specific files usually include the target
in their name (e.g. machrel.sparc.c).

Although the target dependent and independent (common) code are well
separated, they are interdependent. The common code is explicitly aware of
target-specific section types that can occur only for some targets
(e.g. SHT_AMD64_UNWIND). This is not an architecture that allows
for arbitrary target support to be dynamically plugged into an unchanged
platform independent core. Rather, it is an organization that allows
a common core to support all the targets it knows about in a way that
is understandable and maintainable. A truly pluggable architecture
would be considerably more opaque and complex, and is neither necessary
nor desirable, given the wide commonality between modern computer
architectures.

It is possible to add support for new targets to libld.so. The process
of doing so is largely a matter of examining the files for existing
platforms, studying the ABI for the new target platform, and then
filling in the missing pieces for the new target. The remainder of this
file consists of sections that describe some of the issues and steps
that you will encounter in adding a new target.

-----------------------------------------------------------------------------
The relocation code used by ld is shared by the runtime linker (ld.so.1)
and by the kernel module loader (krtld), and is therefore found under
usr/src/uts. You must add code for a relocation engine to support the
new target. To do so, examine the common header files:

	usr/src/uts/common/krtld/reloc.h
	usr/src/uts/common/krtld/reloc_defs.h

and the existing relocation engines:

	usr/src/uts/intel/amd64/krtld/doreloc.c
	usr/src/uts/intel/ia32/krtld/doreloc.c
	usr/src/uts/sparc/krtld/doreloc.c

The ABI for the target platform will be the primary information
you require. If your new system has attributes not found in an existing
target, you may have to add/modify fields in the Rel_entry struct typedef
(reloc_defs.h), or you may have to add new flags. Either sort of change
may require you to also modify the existing relocation engines, and
undoubtedly the common code in libld.so as well.

When compiled for use by libld, the relocation engine requires an
argument named "bswap". Each relocation engine must be prepared to
swap the data bytes it is operating on. This support allows a link-editor
running on a platform with a different byte order than the target to
build objects for that target. To see how this is implemented, and how
to ifdef that support so it only exists in the libld version of
the engine, examine the code for the existing engines.
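
As a rough illustration of what a bswap argument implies, the following
sketch applies an additive fixup to a 32-bit word held in target byte
order. The function names and the single-word interface are assumptions
made for this example; the real engines in doreloc.c have a different
interface and handle many relocation types and sizes.

```c
#include <stdint.h>

/* Reverse the byte order of a 32-bit value. */
static uint32_t
sk_swap32(uint32_t v)
{
	return ((v >> 24) | ((v >> 8) & 0x0000ff00u) |
	    ((v << 8) & 0x00ff0000u) | (v << 24));
}

/*
 * Hypothetical relocation step: add 'value' to a word that is stored
 * in target byte order, and return the updated stored word. A non-zero
 * 'bswap' means the host and target byte orders differ.
 */
uint32_t
sk_do_reloc_word(uint32_t stored, uint32_t value, int bswap)
{
	uint32_t word = stored;

	if (bswap)
		word = sk_swap32(word);	/* target -> host order */
	word += value;			/* arithmetic in host order */
	if (bswap)
		word = sk_swap32(word);	/* host -> target order */
	return (word);
}
```

The point of the sketch is the ordering: the arithmetic must always be
done on a host-order value, so the swap happens both before and after it.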

-----------------------------------------------------------------------------
You must create a target subdirectory in usr/src/cmd/sgs/include,
and construct a machdep_XXX.h file (where XXX is the name of the
target). The machdep files for the current platforms can be helpful:

	usr/src/cmd/sgs/include/sparc/machdep_sparc.h
	usr/src/cmd/sgs/include/i386/machdep_x86.h

Note that these files support both the 32-bit and 64-bit versions of
a given platform, so there is only one subdirectory and machdep
file for each platform (e.g. "sparc", instead of "sparc" and "sparcv9").

Once you have created the target machdep_XXX.h file, you must edit:

	usr/src/cmd/sgs/include/machdep.h

and add a #include for your new file to it, surrounded by the
appropriate #ifdef for the target platform.

This two level structure allows us to #include machdep information
in two different ways:

	1) Code that wants support for the current platform,
	   regardless of which platform that is, can include
	   usr/src/cmd/sgs/include/machdep.h. The runtime linker
	   (ld.so.1) is the primary consumer of this form.

	2) Code that needs to support multiple targets must never
	   include the generic machdep.h from (1) above. Instead,
	   such code explicitly includes the machdep file for the target
	   it is interested in. For example:

		#include <sparc/machdep_sparc.h>

	   libld.so uses this form to build non-native target
	   code.

You will find that most of the constants defined in the target
machdep file end up as initialization for the information that
libld.so accesses via the ld_targ global variable.

-----------------------------------------------------------------------------
Study the definition of the Target typedef in

	usr/src/cmd/sgs/libld/common/_libld.h

This is the type of the ld_targ global variable. Filling in a complete
copy of this definition is the primary task involved in adding support
for a new target to libld.so, so it will be helpful to be familiar with
it before you dive deeper. libld follows two simple rules with regard
to ld_targ and the Target type:

	1) The target-independent common code can only access
	   target-dependent code or data via the ld_targ global
	   variable.

	2) The target-dependent code can access the common
	   code or data freely.

A given link-editor invocation is always for a single target. The choice
of target is made at initialization, and does not change within a
single link. Code for the other targets is present, but is not
accessed.

-----------------------------------------------------------------------------
Files containing the target-specific code to support the new
platform must be added to libld.so. Examine the object lists
in usr/src/cmd/sgs/libld/Makefile.com to see the files for existing
platforms, and read those files to get a sense of what is required.

Among the other files, every platform will have a file named
machrel.XXX.c. This file contains the relocation-related functions,
and it also contains an init function for the target. This init function
is responsible for initializing the ld_targ global variable so that
the common code will use the code and definitions for your
target.

You should start by creating a machrel.XXX.c file for your new
target. Add other files as needed. Be aware that any functions or
variables you place in these target-dependent files must either
be static, or must have names that will not collide with the names
used by the rest of libld.so. The easiest way to do this is to
add a target suffix to the end of all such non-local names
(e.g. foo_sparc() instead of foo()).

The existing code is the documentation for this phase of things: The
best way to determine what a given function should do is to read the
code for other platforms, taking into account the similarities and
differences in the ABI for your new target and those existing ones.

-----------------------------------------------------------------------------
You may find that your new target requires support for new concepts
not found in other targets. A common example of this might be
a new target specific ELF section type (e.g. SHT_AMD64_UNWIND). Another
might be details involving PIC code and PLT generation (as found for
sparc). It may be necessary to add new fields to the ld_targ global
variable, and to modify the libld.so common code to use these new
fields.

It is a standard convention that NULL function pointers are used to
represent functionality not required by a given target. Although the
common code can consult ld_targ.t_m.m_mach to determine the target it
is linking for, and although there is some code that does this, it
is generally undesirable and unnecessary. Instead, the common code
should test for such pointers, as with this sparc-specific example
from update.c:

	/*
	 * Assign a got offset if necessary.
	 */
	if ((ld_targ.t_mr.mr_assign_got != NULL) &&
	    (*ld_targ.t_mr.mr_assign_got)(ofl, sdp) == S_ERROR)
		return ((Addr)S_ERROR);

It may be tempting to include information in the comment that explains
the target specific nature of this, and that may even be appropriate.
Consider, however, that a new target may come along with the same feature
later, and if that occurs, your comments will instantly be dated. In general,
the use of ld_targ is a strong hint to the reader that they should go read
the target-specific code referenced to understand what is going on. It is
best to supply comments at the call site that describe the operation
in generic terms (e.g. "assign a got if necessary") instead of in
explicit target terms (e.g. "Assign a sparc got if necessary"). Of
course, some features are extremely target-specific (like amd64 unwind
sections), and it doesn't really help to be obscure in such cases.
This is a judgement call.

If you do add a new field to ld_targ that uses NULL to represent
an optional feature, *YOU MUST DOCUMENT IT AS SUCH*. You will find
comments in _libld.h for existing optional fields. It suffices to
add a comment for your new field. In the absence of such a comment,
the common code assumes that all function pointers are safe to call
through (dereference) without first testing them.
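
The convention can be summarized with a small sketch. The Hooks_sketch
type and every sk_ name below are hypothetical and exist only for this
illustration; the real fields live in the Target type in _libld.h.

```c
#include <stddef.h>

/* Hypothetical table of target hooks, for illustration only. */
typedef struct {
	int	(*h_required)(int);	/* always set: called untested */
	int	(*h_optional)(int);	/* OPTIONAL: NULL if not needed */
} Hooks_sketch;

static int sk_double(int v) { return (v * 2); }
static int sk_incr(int v) { return (v + 1); }

/* One target needs the optional hook, the other does not. */
const Hooks_sketch sk_hooks_plain = { sk_double, NULL };
const Hooks_sketch sk_hooks_extra = { sk_double, sk_incr };

/* Common code: test only the pointers documented as optional. */
int
sk_call_hooks(const Hooks_sketch *h, int arg)
{
	int result = (*h->h_required)(arg);

	if (h->h_optional != NULL)
		result = (*h->h_optional)(result);
	return (result);
}
```

The comment on h_optional is what licenses the NULL test; h_required
carries no such comment, so the caller dereferences it directly.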

-----------------------------------------------------------------------------
Byte swapping is a big issue in cross linking, as the system running
the link-editor may have the opposite byte order from the target. It is
important to know when, and when not, to swap bytes.

If the build system and target have different byte orders, the
FLG_OF1_ENCDIFF bit of the ofl_flags1 field of the output file
descriptor will be set. If this bit is not set, the target and
system byte orders are the same, and no byte swapping
is required.
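
The underlying test amounts to comparing the host's byte order with the
data encoding of the target. The sketch below shows one way such a
comparison could be made. The sk_ names are hypothetical, and this is
not how libld itself computes the flag; only the two encoding values,
which mirror ELFDATA2LSB and ELFDATA2MSB, come from the ELF definitions.

```c
/* These values mirror ELFDATA2LSB (1) and ELFDATA2MSB (2). */
enum { SK_DATA_LSB = 1, SK_DATA_MSB = 2 };

/* Determine the byte order of the machine running this code. */
int
sk_host_encoding(void)
{
	const unsigned short probe = 1;

	/* On a little-endian host, the low-order byte is stored first. */
	return ((*(const unsigned char *)&probe == 1) ?
	    SK_DATA_LSB : SK_DATA_MSB);
}

/* Non-zero if host and target byte orders differ. */
int
sk_encdiff(int target_encoding)
{
	return (sk_host_encoding() != target_encoding);
}
```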

libld uses libelf to read and write objects. libelf automatically
swaps bytes for the sections it knows about, such as symbol tables,
relocation records, and the usual ELF plumbing. It is therefore never
necessary for your code to swap the bytes in this data. If you find that
this is not the case, you have probably uncovered a bug in libelf that
you should look into.

The big exception to libelf transparently handling byte swapping is
progbits sections (SHT_PROGBITS). libelf does not understand program
code or data as anything other than a series of byte values, and as such,
cannot do byte swapping for them. If your code examines or modifies
such data, you are responsible for handling the byte swapping required.

The OFL_SWAP_RELOC macros found in _libld.h can be helpful in making such
determinations. You should use these macros instead of writing your own
tests for this, as they have high documentation value. If you find they
don't handle your target, add a new one that does.

GOT and PLT sections are SHT_PROGBITS. You will probably find
that the vast majority of byte swapping you have to handle
concerns the handling of these items.

libld contains generic functions for byte swapping:

	ld_bswap_Word();
	ld_bswap_Xword();

These functions are built on top of the BSWAP_ macros found
in usr/src/cmd/sgs/include/_machelf.h:

	BSWAP_HALF
	BSWAP_WORD
	BSWAP_XWORD
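
For readers unfamiliar with such macros, here is a sketch of what
16, 32, and 64-bit byte swapping macros can look like. The SK_ names
are hypothetical and the definitions are written for this example; they
are not copied from _machelf.h, and the real macros may differ in form.

```c
#include <stdint.h>

/* Swap the two bytes of a 16-bit value. */
#define	SK_BSWAP_HALF(h) \
	((uint16_t)((((uint16_t)(h)) << 8) | (((uint16_t)(h)) >> 8)))

/* Reverse the four bytes of a 32-bit value. */
#define	SK_BSWAP_WORD(w) \
	((uint32_t)((((uint32_t)(w)) << 24) | \
	((((uint32_t)(w)) & 0x0000ff00u) << 8) | \
	((((uint32_t)(w)) >> 8) & 0x0000ff00u) | \
	(((uint32_t)(w)) >> 24)))

/* Reverse the eight bytes of a 64-bit value, one 32-bit half at a time. */
#define	SK_BSWAP_XWORD(x) \
	((((uint64_t)SK_BSWAP_WORD((uint64_t)(x) & 0xffffffffu)) << 32) | \
	((uint64_t)SK_BSWAP_WORD((uint64_t)(x) >> 32)))
```

As with any function-like macro, each argument is evaluated more than
once here, so arguments with side effects should be avoided.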

When copying data from one address to another in a cross link environment,
the source and/or destination addresses may not have adequate alignment for
the data type being copied. For example, a sparc platform cannot access
8-byte data types on 4-byte boundaries, but it might need to do so when
linking x86 objects where the alignment of such data can be 4. The
UL_ASSIGN macros can be used to copy potentially unaligned data:

	UL_ASSIGN_HALF
	UL_ASSIGN_WORD
	UL_ASSIGN_XWORD

The UL_ASSIGN_BSWAP macros do unaligned copies, and also perform
byte swapping when the linker host and target byte orders are
different:

	UL_ASSIGN_BSWAP_HALF
	UL_ASSIGN_BSWAP_WORD
	UL_ASSIGN_BSWAP_XWORD
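
The idea behind an unaligned copy is to move the data a byte at a time
rather than through a pointer of the wider type. The following sketch
shows the 32-bit case as plain functions. The sk_ names are hypothetical
and the real UL_ASSIGN macros may be implemented differently; note also
that this swapping variant always swaps, leaving the decision to the
caller.

```c
#include <stdint.h>
#include <string.h>

/* Copy a 32-bit word between possibly unaligned addresses. */
void
sk_ul_assign_word(void *dst, const void *src)
{
	/* memcpy() makes no alignment assumptions about either side. */
	(void) memcpy(dst, src, sizeof (uint32_t));
}

/* As above, but reverse the byte order while copying. */
void
sk_ul_assign_bswap_word(void *dst, const void *src)
{
	unsigned char tmp[sizeof (uint32_t)];
	unsigned char *d = dst;

	/* Stage through tmp so that dst and src may be the same address. */
	(void) memcpy(tmp, src, sizeof (tmp));
	d[0] = tmp[3];
	d[1] = tmp[2];
	d[2] = tmp[1];
	d[3] = tmp[0];
}
```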

If you are reading/writing to relocation data, the following
routines understand relocation records and will get/set the
proper amount of data while handling any needed swapping:

	ld_reloc_targval_get()
	ld_reloc_targval_set()

Byte swapping is a fertile area for mistakes. If you're having trouble
getting a successful link in a cross link situation, you should always
do the experiment of doing the link on a platform with the same byte
order as the target. If that link succeeds, then you are looking at
a bug involving incorrect byte swapping.

-----------------------------------------------------------------------------
As mentioned above, incorrect byte swapping is a common
error when developing libld target specific code. In addition to
trying a build machine with the same byte order as the target, elfdump
can also be a powerful tool for debugging. The first step with
elfdump is to simply dump everything and read it looking for obviously
bad information:

	% elfdump outobj 2>&1 | more

elfdump tries to do sanity checking on the objects it
displays. Hence, the following command is a common
idiom:

	% elfdump outobj > /dev/null

Any problems with the file that elfdump can detect will be
written to stderr.

-----------------------------------------------------------------------------
Once you have the target-specific code in place, you must modify the
libld initialization code so that it will know how to use it. This
logic is found in

	usr/src/cmd/sgs/libld/common/ldmain.c

in the function ld_init_target().

-----------------------------------------------------------------------------
The ld front end program that uses libld must be modified so that
the "-z target=platform" command line option recognizes your
new target. This code is found in

	usr/src/cmd/sgs/ld/common

The change consists of adding an additional strcasecmp() to the
command line processing for -ztarget.
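
The shape of that change is sketched below. The function name, the
SK_TARG_ constants, and the accepted strings "sparc" and "x86" are
assumptions made for this illustration; consult the actual option
processing code for the real names and values.

```c
#include <strings.h>

/* Hypothetical result codes, for illustration only. */
enum { SK_TARG_NONE = 0, SK_TARG_SPARC, SK_TARG_X86 };

/*
 * Map the value given to "-z target=" onto a target code, ignoring
 * case. Returns SK_TARG_NONE if the value is not recognized.
 */
int
sk_parse_ztarget(const char *value)
{
	if (strcasecmp(value, "sparc") == 0)
		return (SK_TARG_SPARC);
	if (strcasecmp(value, "x86") == 0)
		return (SK_TARG_X86);
	/* A new target would add one more strcasecmp() test here. */
	return (SK_TARG_NONE);
}
```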

-----------------------------------------------------------------------------
You probably changed things getting your target integrated.
Please update this document to reflect your changes.