DEC Alpha Options - Using the GNU Compiler Collection (GCC)


3.17.9 DEC Alpha Options

These `-m' options are defined for the DEC Alpha implementations:

-mno-soft-float

-msoft-float

Use (do not use) the hardware floating-point instructions for floating-point operations. When -msoft-float is specified, functions in libgcc.a will be used to perform floating-point operations. Unless those functions are replaced by routines that emulate the floating-point operations, or are compiled in such a way as to call such emulation routines, they will themselves issue floating-point operations. If you are compiling for an Alpha without floating-point operations, you must ensure that the library is built so as not to call them.

Note that Alpha implementations without floating-point operations are required to have floating-point registers.

-mfp-regs

-mno-fp-regs

Generate code that uses (does not use) the floating-point register set. -mno-fp-regs implies -msoft-float. If the floating-point register set is not used, floating-point operands are passed in integer registers as if they were integers and floating-point results are passed in $0 instead of $f0. This is a non-standard calling sequence, so any function with a floating-point argument or return value called by code compiled with -mno-fp-regs must also be compiled with that option.

A typical use of this option is building a kernel that does not use, and hence need not save and restore, any floating-point registers.

-mieee

The Alpha architecture implements floating-point hardware optimized for maximum performance. It is mostly compliant with the IEEE floating-point standard. However, for full compliance, software assistance is required. This option generates fully IEEE-compliant code, except that the inexact-flag is not maintained (see below). If this option is turned on, the preprocessor macro _IEEE_FP is defined during compilation. The resulting code is less efficient but is able to correctly support denormalized numbers and exceptional IEEE values such as not-a-number and plus/minus infinity. Other Alpha compilers call this option -ieee_with_no_inexact.
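
As a rough illustration of what this option affects, the sketch below checks for the _IEEE_FP macro and produces a denormalized value by halving DBL_MIN. Both function names are hypothetical and exist only for this example; the second function is written in portable C and only shows the kind of value that, per the text above, Alpha code handles correctly when built with -mieee.

```c
#include <float.h>

/* Hypothetical helper: returns 1 if this file was compiled with -mieee
   (GCC then defines _IEEE_FP), 0 otherwise. */
int compiled_with_mieee(void)
{
#ifdef _IEEE_FP
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical helper: halve the smallest normalized double and report
   whether the result is a nonzero value below DBL_MIN, i.e. a denormal. */
int half_of_dbl_min_is_denormal(void)
{
    volatile double x = DBL_MIN;  /* volatile keeps the division at run time */
    x = x / 2.0;
    return x > 0.0 && x < DBL_MIN;
}
```

A build script could call compiled_with_mieee() in a self-test to confirm that the option actually reached the compiler.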

-mieee-with-inexact

This is like -mieee except the generated code also maintains the IEEE inexact-flag. Turning on this option causes the generated code to implement fully-compliant IEEE math. In addition to _IEEE_FP, _IEEE_FP_EXACT is defined as a preprocessor macro. On some Alpha implementations the resulting code may execute significantly slower than the code generated by default. Since there is very little code that depends on the inexact-flag, you should normally not specify this option. Other Alpha compilers call this option -ieee_with_inexact.

-mfp-trap-mode=trap-mode

This option controls what floating-point related traps are enabled. Other Alpha compilers call this option -fptm trap-mode. The trap mode can be set to one of four values:

`n': This is the default (normal) setting. The only traps that are enabled are the ones that cannot be disabled in software (e.g., division by zero trap).
`u': In addition to the traps enabled by `n', underflow traps are enabled as well.
`su': Like `u', but the instructions are marked to be safe for software completion (see the Alpha architecture manual for details).
`sui': Like `su', but inexact traps are enabled as well.

-mfp-rounding-mode=rounding-mode

Selects the IEEE rounding mode. Other Alpha compilers call this option -fprm rounding-mode. The rounding-mode can be one of:

`n': Normal IEEE rounding mode. Floating-point numbers are rounded towards the nearest machine number or towards the even machine number in case of a tie.
`m': Round towards minus infinity.
`c': Chopped rounding mode. Floating-point numbers are rounded towards zero.
`d': Dynamic rounding mode. A field in the floating-point control register (fpcr, see Alpha architecture reference manual) controls the rounding mode in effect. The C library initializes this register for rounding towards plus infinity. Thus, unless your program modifies the fpcr, `d' corresponds to round towards plus infinity.
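
The four letters can be compared with a small sketch that rounds a value to an integer under each mode. This is only an analogy: the hardware rounds to the nearest representable machine number, not to an integer, and the function name round_like is hypothetical. The `d' case is modeled as rounding towards plus infinity, following the statement above about the default fpcr setting.

```c
/* Round x to an integer using the semantics of one -mfp-rounding-mode
   letter. Returns 0 for an unknown mode. Assumes x fits in a long. */
long round_like(double x, char mode)
{
    long t  = (long)x;                         /* C casts truncate toward zero */
    long lo = (x < (double)t) ? t - 1 : t;     /* floor of x */
    long hi = (x > (double)t) ? t + 1 : t;     /* ceiling of x */
    double frac = x - (double)lo;              /* distance above the floor */

    switch (mode) {
    case 'c': return t;                        /* chopped: toward zero */
    case 'm': return lo;                       /* toward minus infinity */
    case 'd': return hi;                       /* assumed default fpcr: toward plus infinity */
    case 'n':                                  /* nearest, ties to even */
        if (frac > 0.5) return hi;
        if (frac < 0.5) return lo;
        return (lo % 2 == 0) ? lo : hi;
    default:  return 0;
    }
}
```

For example, 2.5 rounds to 2 under `n' (the even neighbor), while -2.5 rounds to -3 under `m' and to -2 under `c'.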

-mtrap-precision=trap-precision

In the Alpha architecture, floating-point traps are imprecise. This means that, without software assistance, it is impossible to recover from a floating-point trap, and program execution normally needs to be terminated. GCC can generate code that assists operating system trap handlers in determining the exact location that caused a floating-point trap. Depending on the requirements of an application, different levels of precision can be selected:

`p': Program precision. This option is the default and means a trap handler can only identify which program caused a floating-point exception.
`f': Function precision. The trap handler can determine the function that caused a floating-point exception.
`i': Instruction precision. The trap handler can determine the exact instruction that caused a floating-point exception.

Other Alpha compilers provide the equivalent options called -scope_safe and -resumption_safe.

-mieee-conformant

This option marks the generated code as IEEE conformant. You must not use this option unless you also specify -mtrap-precision=i and either -mfp-trap-mode=su or -mfp-trap-mode=sui. Its only effect is to emit the line `.eflag 48' in the function prologue of the generated assembly file. Under DEC Unix, this has the effect that IEEE-conformant math library routines will be linked in.

-mbuild-constants

Normally GCC examines a 32- or 64-bit integer constant to see if it can construct it from smaller constants in two or three instructions. If it cannot, it will output the constant as a literal and generate code to load it from the data segment at run time.

Use this option to require GCC to construct all integer constants using code, even if it takes more instructions (the maximum is six).

You would typically use this option to build a shared library dynamic loader. Itself a shared library, it must relocate itself in memory before it can find the variables and constants in its own data segment.
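
To give a feel for how a constant can be built in code, the sketch below splits a 32-bit constant into the two sign-extended 16-bit pieces that an LDAH/LDA instruction pair would add together. It is an illustration under stated assumptions, not GCC's actual algorithm: the helper names are hypothetical, and the model assumes only that LDA adds a sign-extended 16-bit displacement and LDAH adds that displacement multiplied by 65536.

```c
#include <stdint.h>

/* Sign-extend the low 16 bits of v, as LDA and LDAH do with their
   displacement field. */
static int64_t sext16(int64_t v)
{
    return ((v & 0xffff) ^ 0x8000) - 0x8000;
}

/* Hypothetical helper: split the constant c into a low part (for LDA) and
   a high part (for LDAH) such that c == hi * 65536 + lo.
   Returns 1 when the two-piece split reproduces c, 0 when it does not and
   a longer sequence would be needed. Assumes c fits in 32 bits. */
int split_lda_ldah(int64_t c, int64_t *lo, int64_t *hi)
{
    *lo = sext16(c);
    *hi = sext16((c - *lo) / 65536);  /* c - lo is a multiple of 65536 */
    return *hi * 65536 + *lo == c;
}
```

For 0x1234ABCD the split gives hi = 0x1235 and lo = -0x5433, because the low half is sign-extended. For 0x7FFF8000 the function returns 0, which is one case where two pieces are not enough.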

-malpha-as

-mgas

Select whether to generate code to be assembled by the vendor-supplied assembler (-malpha-as) or by the GNU assembler (-mgas).

-mbwx

-mno-bwx

-mcix

-mno-cix

-mfix

-mno-fix

-mmax

-mno-max

Indicate whether GCC should generate code to use the optional BWX, CIX, FIX and MAX instruction sets. The default is to use the instruction sets supported by the CPU type specified via the -mcpu= option, or those of the CPU on which GCC was built if none was specified.
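
Source code can adapt to these options at compile time. The sketch below assumes that GCC predefines the macros __alpha_bwx__, __alpha_cix__, __alpha_fix__ and __alpha_max__ when the corresponding extension is enabled; these macro names are not stated in this section, so verify them against your compiler (for example with `gcc -dM -E') before relying on them. The enum and function names are hypothetical.

```c
/* Bit flags for the optional instruction set extensions. */
enum { HAVE_BWX = 1, HAVE_CIX = 2, HAVE_FIX = 4, HAVE_MAX = 8 };

/* Hypothetical helper: report which extensions this translation unit was
   compiled to use, based on macros GCC is assumed to predefine. */
unsigned alpha_extensions(void)
{
    unsigned mask = 0;
#ifdef __alpha_bwx__
    mask |= HAVE_BWX;
#endif
#ifdef __alpha_cix__
    mask |= HAVE_CIX;
#endif
#ifdef __alpha_fix__
    mask |= HAVE_FIX;
#endif
#ifdef __alpha_max__
    mask |= HAVE_MAX;
#endif
    return mask;
}
```

On a non-Alpha target none of the macros is defined and the function returns 0.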

-mfloat-vax

-mfloat-ieee

Generate code that uses (does not use) VAX F and G floating-point arithmetic instead of IEEE single and double precision.

-mexplicit-relocs

-mno-explicit-relocs

Older Alpha assemblers provided no way to generate symbol relocations except via assembler macros. Use of these macros does not allow optimal instruction scheduling. GNU binutils as of version 2.12 supports a new syntax that allows the compiler to explicitly mark which relocations should apply to which instructions. This option is mostly useful for debugging, as GCC detects the capabilities of the assembler when it is built and sets the default accordingly.

-msmall-data

-mlarge-data

When -mexplicit-relocs is in effect, static data is accessed via gp-relative relocations. When -msmall-data is used, objects 8 bytes long or smaller are placed in a small data area (the .sdata and .sbss sections) and are accessed via 16-bit relocations off of the $gp register. This limits the size of the small data area to 64KB, but allows the variables to be directly accessed via a single instruction.

The default is -mlarge-data. With this option the data area is limited to just below 2GB. Programs that require more than 2GB of data must use malloc or mmap to allocate the data in the heap instead of in the program's data segment.

When generating code for shared libraries, -fpic implies -msmall-data and -fPIC implies -mlarge-data.
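
The size rule and the 64KB limit can be expressed as two small checks. This is a sketch with hypothetical function names; it assumes that the 16-bit relocation is a signed displacement, so the reachable window is -32768 to 32767 bytes around the $gp value.

```c
#include <stddef.h>

/* Objects of 8 bytes or fewer are candidates for the small data area
   when -msmall-data is in effect. */
int goes_in_small_data(size_t size)
{
    return size <= 8;
}

/* A signed 16-bit displacement (assumed) reaches 64KB in total. */
int gp16_reachable(long offset_from_gp)
{
    return offset_from_gp >= -32768 && offset_from_gp <= 32767;
}
```

Under these rules a double qualifies for the small data area, while a 64-byte buffer does not.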

-msmall-text

-mlarge-text

When -msmall-text is used, the compiler assumes that the code of the entire program (or shared library) fits in 4MB, and is thus reachable with a branch instruction. When -msmall-data is used, the compiler can assume that all local symbols share the same $gp value, and thus reduce the number of instructions required for a function call from 4 to 1.

The default is -mlarge-text.
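
The 4MB figure follows from the reach of a branch instruction. The sketch below assumes the Alpha branch format: a signed 21-bit displacement counted in 4-byte instructions and applied relative to the address of the following instruction. The function name is hypothetical, and the encoding details are an assumption to check against the architecture manual.

```c
#include <stdint.h>

/* Hypothetical helper: can a branch at byte address pc reach target?
   Assumes a signed 21-bit displacement in units of 4-byte instructions,
   relative to pc + 4. */
int branch_reaches(int64_t pc, int64_t target)
{
    int64_t delta = target - (pc + 4);
    if (delta % 4 != 0)
        return 0;                      /* instructions are 4-byte aligned */
    delta /= 4;
    return delta >= -1048576 && delta <= 1048575;
}
```

With these assumptions a branch reaches about 4MB in either direction, which is why a program that fits in 4MB can use direct branches throughout.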

-mcpu=cpu_type

Set the instruction set and instruction scheduling parameters for machine type cpu_type. You can specify either the `EV' style name or the corresponding chip number. GCC supports scheduling parameters for the EV4, EV5 and EV6 families of processors and will choose the default values for the instruction set from the processor you specify. If you do not specify a processor type, GCC will default to the processor on which the compiler was built.

Supported values for cpu_type are:

`ev4'
`ev45'
`21064': Schedules as an EV4 and has no instruction set extensions.
`ev5'
`21164': Schedules as an EV5 and has no instruction set extensions.
`ev56'
`21164a': Schedules as an EV5 and supports the BWX extension.
`pca56'
`21164pc'
`21164PC': Schedules as an EV5 and supports the BWX and MAX extensions.
`ev6'
`21264': Schedules as an EV6 and supports the BWX, FIX, and MAX extensions.
`ev67'
`21264a': Schedules as an EV6 and supports the BWX, CIX, FIX, and MAX extensions.

Native toolchains also support the value `native', which selects the best architecture option for the host processor. -mcpu=native has no effect if GCC does not recognize the processor.
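
The list of cpu_type values can be restated as a lookup table, which may be convenient in build tooling. The sketch below copies the scheduling family and extensions from the list above; the struct, enum and function names are hypothetical, and `native' is deliberately left out because its meaning depends on the host.

```c
#include <stddef.h>
#include <string.h>

enum { X_BWX = 1, X_CIX = 2, X_FIX = 4, X_MAX = 8 };

struct cpu_info {
    const char *name;   /* value accepted by -mcpu= */
    const char *sched;  /* scheduling family */
    unsigned ext;       /* instruction set extensions */
};

static const struct cpu_info cpu_table[] = {
    { "ev4",     "EV4", 0 },
    { "ev45",    "EV4", 0 },
    { "21064",   "EV4", 0 },
    { "ev5",     "EV5", 0 },
    { "21164",   "EV5", 0 },
    { "ev56",    "EV5", X_BWX },
    { "21164a",  "EV5", X_BWX },
    { "pca56",   "EV5", X_BWX | X_MAX },
    { "21164pc", "EV5", X_BWX | X_MAX },
    { "21164PC", "EV5", X_BWX | X_MAX },
    { "ev6",     "EV6", X_BWX | X_FIX | X_MAX },
    { "21264",   "EV6", X_BWX | X_FIX | X_MAX },
    { "ev67",    "EV6", X_BWX | X_CIX | X_FIX | X_MAX },
    { "21264a",  "EV6", X_BWX | X_CIX | X_FIX | X_MAX },
};

/* Return the table entry for name, or NULL if it is not listed. */
const struct cpu_info *lookup_cpu(const char *name)
{
    size_t i;
    for (i = 0; i < sizeof cpu_table / sizeof cpu_table[0]; i++)
        if (strcmp(cpu_table[i].name, name) == 0)
            return &cpu_table[i];
    return NULL;
}
```

For instance, lookup_cpu("ev56") reports EV5 scheduling with only the BWX extension.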

-mtune=cpu_type

Set only the instruction scheduling parameters for machine type cpu_type. The instruction set is not changed.

Native toolchains also support the value `native', which selects the best architecture option for the host processor. -mtune=native has no effect if GCC does not recognize the processor.

-mmemory-latency=time

Sets the latency the scheduler should assume for typical memory references as seen by the application. This number is highly dependent on the memory access patterns used by the application and the size of the external cache on the machine.

Valid options for time are:

`number': A decimal number representing clock cycles.
`L1'
`L2'
`L3'
`main': The compiler contains estimates of the number of clock cycles for “typical” EV4 and EV5 hardware for the Level 1, 2 and 3 caches (also called Dcache, Scache, and Bcache), as well as for main memory. Note that L3 is only valid for EV5.
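
The accepted forms of time can be summarized with a small classifier. This is a sketch with hypothetical names, not GCC's own option parser: it only distinguishes a decimal cycle count from the four symbolic names, and it does not know the cycle estimates the compiler associates with them.

```c
#include <stdlib.h>
#include <string.h>

enum latency_kind { LAT_INVALID, LAT_CYCLES, LAT_L1, LAT_L2, LAT_L3, LAT_MAIN };

/* Classify the argument of -mmemory-latency=. For a decimal number the
   value is stored in *cycles; otherwise *cycles is left untouched. */
enum latency_kind parse_latency(const char *arg, long *cycles)
{
    char *end;
    long n;

    if (strcmp(arg, "L1") == 0)   return LAT_L1;
    if (strcmp(arg, "L2") == 0)   return LAT_L2;
    if (strcmp(arg, "L3") == 0)   return LAT_L3;   /* only valid for EV5 */
    if (strcmp(arg, "main") == 0) return LAT_MAIN;

    n = strtol(arg, &end, 10);
    if (end != arg && *end == '\0' && n >= 0) {
        *cycles = n;
        return LAT_CYCLES;
    }
    return LAT_INVALID;
}
```

With this sketch, "12" is classified as a cycle count of 12, while "L4" is rejected.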