#
# CDDL HEADER START
#
# The contents of this file are subject to the terms of the
# Common Development and Distribution License (the "License").
# You may not use this file except in compliance with the License.
#
# You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
# or http://www.opensolaris.org/os/licensing.
# See the License for the specific language governing permissions
# and limitations under the License.
#
# When distributing Covered Code, include this CDDL HEADER in each
# file and include the License file at usr/src/OPENSOLARIS.LICENSE.
# If applicable, add the following below this CDDL HEADER, with the
# fields enclosed by brackets "[]" replaced with your own identifying
# information: Portions Copyright [yyyy] [name of copyright owner]
#
# CDDL HEADER END
#

#
# Copyright 2008 Sun Microsystems, Inc. All rights reserved.
# Use is subject to license terms.
#
# ident "%Z%%M% %I% %E% SMI"


Notes On Cross Link-Editor Support in libld.so
----------------------------------------------

The Solaris link-editor is used in two contexts:

    1) The standard ld command
    2) Via the runtime linker (ld.so.1), when a program
       calls dlopen() on a relocatable object (ET_REL).

To support both uses, it is packaged as a sharable library (libld.so).
The ld command is therefore a simple wrapper that uses libld.

libld.so is a cross linker. This means that it can link objects for
a system other than the system running the link-editor (e.g. a link-editor
running on an amd64 system processing sparc objects). Consequently, every
instance of libld.so contains code for building objects for every supported
target. This is unlike GNU ld, where you build gld specifically for the
platform you're targeting. This is possible because unlike gld, we only
support Solaris, with a small number of platforms.

At initialization, the caller of libld.so specifies the type of objects
being linked. By default, the ld command determines the machine type and
class of the object being generated from the first ELF object processed
from the command line. The -64 and -ztarget options exist to change this
default, which is useful when creating an object entirely from an archive
library or a mapfile. During initialization, the link-editor configures
itself to build an output object of the specified type. This is done via
indirection, using the global ld_targ structure to access code, data, and
constants for the specified target.

There are two types of source files used to build libld.so:

    1) Common code used for all targets
    2) Target-specific code used only when linking for
       a given target.

All of these files reside in usr/src/cmd/sgs/libld/common. However,
it is easy to see which files belong in each category by examining
the object lists maintained in usr/src/cmd/sgs/libld/Makefile.com.
In addition, the target-specific files usually include the target
in their name (e.g. machrel.sparc.c).

Although the target-dependent and target-independent (common) code is
well separated, the two are interdependent. For example, the common code
is aware of the target-specific section types that can occur only for
some targets (e.g. SHT_AMD64_UNWIND). This is not an architecture that
allows for arbitrary target support to be dynamically plugged into an
unchanged platform-independent core. Rather, it is an organization that
allows a common core to support all the targets it knows about in a way
that is understandable and maintainable. A truly pluggable architecture
would be considerably more opaque and complex, and is neither necessary
nor desirable, given the wide commonality between modern computer
architectures.

It is possible to add support for new targets to libld.so.
The process of doing so is largely a matter of examining the files for
existing platforms, studying the ABI for the new target platform, and then
filling in the missing pieces for the new target. The remainder of this
file consists of sections that describe some of the issues and steps
that you will encounter in adding a new target.

-----------------------------------------------------------------------------
The relocation code used by ld is shared by the runtime linker (ld.so.1)
and by the kernel module loader (krtld), and is therefore found under
usr/src/uts. You must add code for a relocation engine to support the
new target. To do so, examine the common header files:

    usr/src/uts/common/krtld/reloc.h
    usr/src/uts/common/krtld/reloc_defs.h

and the existing relocation engines:

    usr/src/uts/intel/amd64/krtld/doreloc.c
    usr/src/uts/intel/ia32/krtld/doreloc.c
    usr/src/uts/sparc/krtld/doreloc.c

The ABI for the target platform will be the primary information
you require. If your new system has attributes not found in an existing
target, you may have to add/modify fields in the Rel_entry struct typedef
(reloc_defs.h), or you may have to add new flags. Either sort of change
may require you to also modify the existing relocation engines, and
undoubtedly the common code in libld.so as well.

When compiled for use by libld, the relocation engine requires an
argument named "bswap". Each relocation engine must be prepared to
swap the data bytes it is operating on. This support allows a link-editor
running on a platform with a different byte order than the target to
build objects for that target. To see how this is implemented, and how
to ifdef that support so it only exists in the libld version of
the engine, examine the code for the existing engines.

-----------------------------------------------------------------------------
You must create a target subdirectory in usr/src/cmd/sgs/include,
and construct a machdep_XXX.h file (where XXX is the name of the
target). The machdep files for the current platforms can be helpful:

    usr/src/cmd/sgs/include/sparc/machdep_sparc.h
    usr/src/cmd/sgs/include/i386/machdep_x86.h

Note that these files support both the 32 and 64-bit versions of
a given platform, so there is only one subdirectory and machdep
file for each platform (i.e. "sparc", instead of "sparc" and "sparcv9").

Once you have created the target machdep_XXX.h file, you must edit:

    usr/src/cmd/sgs/include/machdep.h

and add a #include for your new file to it, surrounded by the
appropriate #ifdef for the target platform.

This two level structure allows us to #include machdep information
in two different ways:

    1) Code that wants support for the current platform,
       regardless of which platform that is, can include
       usr/src/cmd/sgs/include/machdep.h. The runtime linker
       (ld.so.1) is the primary consumer of this form.

    2) Code that needs to support multiple targets must never
       include the generic machdep.h from (1) above. Instead,
       such code explicitly includes the machdep file for the target
       it is interested in. For example:

           #include <sparc/machdep_sparc.h>

       libld.so uses this form to build non-native target
       code.

You will find that most of the constants defined in the target
machdep file end up as initialization for the information that
libld.so accesses via the ld_targ global variable.

-----------------------------------------------------------------------------
Study the definition of the Target typedef in

    usr/src/cmd/sgs/libld/common/_libld.h

This is the type of the ld_targ global variable.
Filling in a complete copy of this definition is the primary task involved
in adding support for a new target to libld.so, so it will be helpful to
be familiar with it before you dive deeper. libld follows two simple rules
with regard to ld_targ and the Target type:

    1) The target-independent common code can only access
       target-dependent code or data via the ld_targ global
       variable.

    2) The target-dependent code can access the common
       code or data freely.

A given link-editor invocation is always for a single target. The choice
of target is made at initialization, and does not change within a
single link. Code for the other targets is present, but is not
accessed.

-----------------------------------------------------------------------------
Files containing the target-specific code to support the new
platform must be added to libld.so. Examine the object lists
in usr/src/cmd/sgs/libld/Makefile.com to see the files for existing
platforms, and read those files to get a sense of what is required.

Among the other files, every platform will have a file named
machrel.XXX.c. This file contains the relocation-related functions,
and it also contains an init function for the target. This init function
is responsible for initializing the ld_targ global variable so that
the common code will use the code and definitions for your
target.

You should start by creating a machrel.XXX.c file for your new
target. Add other files as needed. Be aware that any functions or
variables you place in these target-dependent files must either
be static, or must have names that will not collide with the names
used by the rest of libld.so. The easiest way to do this is to
add a target suffix to the end of all such non-local names
(e.g. foo_sparc() instead of foo()).

The existing code is the documentation for this phase of things: the
best way to determine what a given function should do is to read the
code for other platforms, taking into account the similarities and
differences in the ABI for your new target and those existing ones.

-----------------------------------------------------------------------------
You may find that your new target requires support for new concepts
not found in other targets. A common example of this might be
a new target-specific ELF section type (e.g. SHT_AMD64_UNWIND). Another
might be details involving PIC code and PLT generation (as found for
sparc). It may be necessary to add new fields to the ld_targ global
variable, and to modify the libld.so common code to use these new
fields.

It is a standard convention that NULL function pointers are used to
represent functionality not required by a given target. Although the
common code can consult ld_targ.t_m.m_mach to determine the target it
is linking for, and although there is some code that does this, it
is generally undesirable and unnecessary. Instead, the common code
should test for such pointers, as with this sparc-specific example
from update.c:

    /*
     * Assign a got offset if necessary.
     */
    if ((ld_targ.t_mr.mr_assign_got != NULL) &&
        (*ld_targ.t_mr.mr_assign_got)(ofl, sdp) == S_ERROR)
            return ((Addr)S_ERROR);

It may be tempting to include information in the comment that explains
the target-specific nature of this, and that may even be appropriate.
Consider, however, that a new target may come along with the same feature
later, and if that occurs, your comments will instantly be dated. In general,
the use of ld_targ is a strong hint to the reader that they should go read
the target-specific code referenced to understand what is going on.
It is best to supply comments at the call site that describe the operation
in generic terms (e.g. "assign a got if necessary") instead of in
explicit target terms (e.g. "Assign a sparc got if necessary"). Of
course, some features are extremely target-specific (like amd64 unwind
sections), and it doesn't really help to be obscure in such cases.
This is a judgement call.

If you do add a new field to ld_targ that uses NULL to represent
an optional feature *YOU MUST DOCUMENT IT AS SUCH*. You will find
comments in _libld.h for existing optional fields. It suffices to
add a comment for your new field. In the absence of such a comment,
the common code assumes that all function pointers are safe to call
through (dereference) without first testing them.

-----------------------------------------------------------------------------
Byte swapping is a big issue in cross linking, as the system running
the link-editor may have the opposite byte order from the target. It is
important to know when, and when not, to swap bytes.

If the build system and target have different byte orders, the
FLG_OF1_ENCDIFF bit of the ofl_flags field of the output file
descriptor will be set. If this bit is not set, the target and
system byte orders are the same, and no byte swapping
is required.

libld uses libelf to read and write objects. libelf automatically
swaps bytes for the sections it knows about, such as symbol tables,
relocation records, and the usual ELF plumbing. It is therefore never
necessary for your code to swap the bytes in this data. If you find that
this is not the case, you have probably uncovered a bug in libelf that
you should look into.

The big exception to libelf transparently handling byte swapping is
progbits sections (SHT_PROGBITS).
libelf does not understand program code or data as anything other than
a series of byte values, and as such, cannot do byte swapping for them.
If your code examines or modifies such data, you are responsible for
handling the byte swapping required.

The OFL_SWAP_RELOC macros found in _libld.h can be helpful in making such
determinations. You should use these macros instead of writing your own
tests for this, as they have high documentation value. If you find they
don't handle your target, add a new one that does.

GOT and PLT sections are SHT_PROGBITS. You will probably find
that the vast majority of the byte swapping you have to handle
concerns these items.

libld contains generic functions for byte swapping:

    ld_bswap_Word();
    ld_bswap_Xword();

These functions are built on top of the BSWAP_ macros found
in usr/src/cmd/sgs/include/_machelf.h:

    BSWAP_HALF
    BSWAP_WORD
    BSWAP_XWORD

When copying data from one address to another in a cross link environment,
the source and/or destination addresses may not have adequate alignment for
the data type being copied. For example, a sparc platform cannot access
8-byte data types on 4-byte boundaries, but it might need to do so when
linking x86 objects where the alignment of such data can be 4.
The UL_ASSIGN macros can be used to copy potentially unaligned data:

    UL_ASSIGN_HALF
    UL_ASSIGN_WORD
    UL_ASSIGN_XWORD

The UL_ASSIGN_BSWAP macros do unaligned copies, and also perform
byte swapping when the linker host and target byte orders are
different:

    UL_ASSIGN_BSWAP_HALF
    UL_ASSIGN_BSWAP_WORD
    UL_ASSIGN_BSWAP_XWORD

If you are reading/writing to relocation data, the following
routines understand relocation records and will get/set the
proper amount of data while handling any needed swapping:

    ld_reloc_targval_get()
    ld_reloc_targval_set()

Byte swapping is a fertile area for mistakes. If you're having trouble
getting a successful link in a cross link situation, you should always
try the experiment of doing the link on a platform with the same byte
order as the target. If that link succeeds, then you are looking at
a bug involving incorrect byte swapping.

-----------------------------------------------------------------------------
As mentioned above, incorrect byte swapping is a common
error when developing libld target-specific code. In addition to
trying a build machine with the same byte order as the target, elfdump
can also be a powerful tool for debugging. The first step with
elfdump is to simply dump everything and read it looking for obviously
bad information:

    % elfdump outobj 2>&1 | more

elfdump tries to do sanity checking on the objects it
displays. Hence, the following command is a common
idiom:

    % elfdump outobj > /dev/null

Any problems with the file that elfdump can detect will be
written to stderr.

-----------------------------------------------------------------------------
Once you have the target-specific code in place, you must modify the
libld initialization code so that it will know how to use it.
This logic is found in

    usr/src/cmd/sgs/libld/common/ldmain.c

in the function ld_init_target().

-----------------------------------------------------------------------------
The ld front end program that uses libld must be modified so that
the "-z target=platform" command line option recognizes your
new target. This code is found in

    usr/src/cmd/sgs/ld/common

The change consists of adding an additional strcasecmp() to the
command line processing for -ztarget.

-----------------------------------------------------------------------------
You probably changed things getting your target integrated.
Please update this document to reflect your changes.