eko

NAME

eko - The complete list of options and flags for the QLogic PathScale(TM) Compiler Suite

CG, INLINE, IPA, LANG, LNO, OPT, TENV, WOPT − other major topics covered

DESCRIPTION

This man page describes the various flags available for use with the QLogic PathScale pathcc, pathCC, and pathf95 compilers.

OPTIMIZATION FLAGS

Some suboptions either enable or disable a feature. To enable a feature, either specify only the suboption name or specify =1, =ON, or =TRUE. To disable a feature, specify =0, =OFF, or =FALSE. These values are case-insensitive: ’on’ and ’ON’ mean the same thing. Below, ON and OFF are used to indicate the enabling or disabling of a feature.
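As an illustration only, the value rules above can be modeled in a few lines of Python. The function name parse_switch is hypothetical and is not part of the compiler; the sketch simply encodes the accepted spellings listed in the paragraph above.

```python
def parse_switch(value=None):
    """Interpret a suboption value; None means only the suboption name was given."""
    if value is None:
        return True  # a bare suboption name enables the feature
    text = value.upper()  # the values are case-insensitive
    if text in {"1", "ON", "TRUE"}:
        return True
    if text in {"0", "OFF", "FALSE"}:
        return False
    raise ValueError(f"unrecognized switch value: {value!r}")


# Example: the three ways of enabling and one way of disabling a feature.
print(parse_switch(), parse_switch("1"), parse_switch("on"), parse_switch("Off"))
```

How the compiler itself treats a value outside these six spellings is not stated here; raising an error is an assumption of the sketch.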

Many options have an opposite ("no-") counterpart, shown as [no-] in the option description. Using the no- form turns off or prevents the action of the option. If [no-] is not shown, the listed option has no opposite.

−###

Like the −v option, except that no commands are run and arguments are quoted.

−A pred=ans

Make an assertion with the predicate ’pred’ and answer ’ans’. The form −A −pred=ans cancels an assertion with predicate ’pred’ and answer ’ans’.

−alignN

Align data in common blocks to the specified boundaries. The alignN specifications are as follows:

Option        Action

-align8       Align data in common blocks to 8−bit boundaries.
-align16      Align data in common blocks to 16−bit boundaries.
-align32      Align data in common blocks to 32−bit boundaries.
-align64      Align data in common blocks to 64−bit boundaries. This is the default.
-align128     Align data in common blocks to 128−bit boundaries.

When an alignment is specified, objects smaller than the specification are aligned on boundaries that correspond to their sizes. For example, when -align64 is specified, objects of at least 32 bits but fewer than 64 bits are aligned on 32−bit boundaries; objects of at least 16 bits but fewer than 32 bits are aligned on 16−bit boundaries; and objects of at least 8 bits but fewer than 16 bits are aligned on 8−bit boundaries.
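The rule above can be sketched as a small Python function. This is one reading of the paragraph, not the compiler's implementation: it assumes that an object at least as large as the requested alignment is placed on the requested boundary, and that a smaller object is placed on the largest listed boundary that does not exceed its size. The name common_block_alignment is hypothetical.

```python
BOUNDARIES = (8, 16, 32, 64, 128)  # the -alignN values listed above, in bits


def common_block_alignment(object_bits, align_bits=64):
    """Return the boundary (in bits) assumed for an object under -align<align_bits>."""
    if align_bits not in BOUNDARIES:
        raise ValueError("unsupported -alignN value")
    # The boundary may exceed neither the object size nor the requested N.
    limit = min(object_bits, align_bits)
    candidates = [b for b in BOUNDARIES if b <= limit]
    if not candidates:
        raise ValueError("object smaller than 8 bits")
    return candidates[-1]  # largest boundary that fits


# Example: boundaries chosen for several object sizes under the default -align64.
for size in (8, 16, 32, 48, 64, 128):
    print(size, common_block_alignment(size))
```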

−ansi

(For Fortran) Generate messages about constructs which violate standard Fortran syntax rules and constraints, plus messages about obsolescent and deleted features. This also disables all nonstandard intrinsic functions and subroutines. Specifying −ansi in conjunction with −fullwarn causes all messages, regardless of level, to be generated.

−ansi

(For C/C++) Enable pure ANSI/ISO C mode.

−apo

This auto-parallelizing option signals the compiler to automatically convert sequential code into parallel code when it is safe and beneficial to do so. The resulting executable can then run faster on a machine with more than one CPU.

−ar

Create an archive using ar(1) instead of a shared object or executable. The name of the archive is specified by using the −o option. Template entities required by the objects being archived are instantiated before creating the archive. The pathCC command implicitly passes the −r and −c options of ar to ar in addition to the name of the archive and the objects being created. Any other option that can be used in conjunction with the −c option of ar can be passed to ar using −WR,option_name.

NOTE: The objects specified with this option must include all of the objects that will be included in the archive. Failure to do so may cause prelinker internal errors. In the following example, liba.a is an archive containing only a.o, b.o, and c.o. The a.o, b.o, and c.o objects are prelinked to instantiate any required template entities, and the ar −r −c −v liba.a a.o b.o c.o command is executed. All three objects must be specified with −ar even if only b.o needs to be replaced in liba.a.

     pathCC −ar −WR,−v −o liba.a a.o b.o c.o

See the ld(1) man page for more information about shared libraries and archives.

−auto-use module_name[,module_name] ...

(For Fortran) Direct the compiler to behave as if a USE module_name statement were entered in your Fortran source code for each module_name. The USE statements are entered in every program unit and interface body in the source file being compiled. For example: pathf95 −auto-use mpi_interface or pathf95 −auto-use shmem_interface. Using this option can increase compile time in some situations.

−backslash

Treat a backslash as a normal character rather than as an escape character. When this option is used, the preprocessor will not be called.

−C

(For Fortran) Perform runtime subscript range checking. Subscripts that are out of range cause fatal runtime errors. If you set the F90_BOUNDS_CHECK_ABORT environment variable to YES, the program aborts.

−C

(For C) Keep comments after preprocessing.

−c

Create an intermediate object file for each named source file, but do not link the object files. The intermediate object file name corresponds to the name of the source file; a .o suffix is substituted for the suffix of the source file.

Because they are mutually exclusive, do not specify this option with the −r option.

−CG[:...]

The Code Generation option group controls the optimizations and transformations of the instruction−level code generator.

−CG:cflow=(ON|OFF)

Specifying OFF disables control flow optimization in the code generator. The default is ON.

−CG:cse_regs=N

When performing common subexpression elimination during code generation, assume there are N extra integer registers available over the number provided by the CPU. N can be positive, zero, or negative. The default is positive infinity. See also -CG:sse_cse_regs.

−CG:gcm=(ON|OFF)

Specifying OFF disables the instruction−level global code motion optimization phase. The default is ON.

−CG:load_exe=N

Specify the threshold for subsuming a memory load operation into the operand of an arithmetic instruction. The value of 0 turns off this subsumption optimization. If N is 1, this subsumption is performed only when the result of the load has only one use. This subsumption is not performed if the number of times the result of the load is used exceeds the value N, a non−negative integer. The default value varies based on processor target and source language.
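The threshold rule can be restated as a short Python predicate. This is only a model of the description above, with a hypothetical function name; the compiler applies further target-specific conditions that are not described here.

```python
def subsume_load(uses, n):
    """Return True if a load whose result has `uses` uses would be folded
    into an arithmetic operand under -CG:load_exe=n, per the text above."""
    if n < 0:
        raise ValueError("N must be a non-negative integer")
    if n == 0:
        return False  # 0 turns the subsumption optimization off
    return uses <= n  # not performed once the use count exceeds N


# Example: with N=1, only single-use loads qualify.
print(subsume_load(1, 1), subsume_load(2, 1))
```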

−CG:local_fwd_sched=(ON|OFF)

Change the instruction scheduling algorithm to work forward instead of backward for the instructions in each basic block. The default is OFF for 64-bit ABI, and ON for 32-bit ABI.

−CG:movnti=N

Convert ordinary stores to non−temporal stores when writing memory blocks of size larger than N KB. When N is set to 0, this transformation is avoided. The default value is 1000 (KB).

−CG:p2align=(ON|OFF)

Align loop heads to 64-byte boundaries. The default is OFF.

−CG:p2align_freq=N

Align branch targets based on execution frequency. This option is meaningful only under feedback−directed compilation. The default value N=0 turns off the alignment optimization. Any other value specifies the frequency threshold at or above which this alignment will be performed by the compiler.

−CG:prefer_legacy_regs=(ON|OFF)

Tell the local register allocator to use the first 8 integer and SSE registers whenever possible (%rax-%rbp, %xmm0-%xmm7). Instructions using these registers have smaller instruction sizes. The default is OFF.

−CG:prefetch=(ON|OFF)

Enable generation of prefetch instructions in the code generator. The default is ON. (-CG:prefetch=OFF and -LNO:prefetch=0 both suppress the generation of prefetch instructions, but -LNO:prefetch=0 also affects LNO optimizations that depend on prefetch.)

−CG:sse_cse_regs=N

When performing common subexpression elimination during code generation, assume there are N extra SSE registers available over the number provided by the CPU. N can be positive, zero, or negative. The default is positive infinity. See also -CG:cse_regs.

−CG:use_prefetchnta=(ON|OFF)

Prefetch when data is non−temporal at all levels of the cache hierarchy. This is for data streaming situations in which the data will not need to be re-used soon. The default is OFF.

−CG:use_test=(ON|OFF)

Make the code generator use the TEST instruction instead of CMP. See Opteron’s instruction description for the difference between these two instructions. The default is OFF.

−clist

(C only) Enable the C listing. Specifying −clist is the equivalent of specifying −CLIST:=ON.

−CLIST: ...

(C only) Control emission of the compiler’s internal program representation back into C code, after IPA inlining and loop−nest transformations. This is a diagnostic tool, and the generated C code may not always be compilable. The generated C code is written to two files, a header file containing file−scope declarations, and a file containing function definitions. With the exception of −CLIST:=OFF, any use of this option implies −clist. The individual controls in this group are as follows:

=(ON|OFF)

Enable the C listing. This option is implied by any of the others, but may be used to enable the listing when no other options are required. For example, specifying −CLIST:=ON is the equivalent of specifying −clist.

dotc_file=filename

Write the program units into the specified file, filename. Defaults to the source file name with the extension .w2c.c.

doth_file=filename

Specify the file into which file−scope declarations are deposited. Defaults to the source file name with the extension .w2c.h.

emit_pfetch[=(ON|OFF)]

Display prefetch information as comments in the transformed source. If ON or OFF is not specified, the default is OFF.

linelength=N

Set the maximum line length to N characters. The default is unlimited.

show[=(ON|OFF)]

Print the input and output file names to stderr. If ON or OFF is not specified, the default is ON.

−colN

(Fortran only) Specify the line width for fixed−format source lines. Specify 72, 80, or 120 for N (-col72, -col80, or -col120). By default, fixed−format lines are 72 characters wide. Specifying −col120 implies −extend-source and recognizes lines up to 132 characters wide. For more information on specifying line length, see the −extend-source and −noextend-source options.

−copyright

Show the copyright for the compiler being used.

−cpp

Run the preprocessor, cpp, on all input source files, regardless of suffix, before compiling. This preprocessor automatically expands macros outside of preprocessor statements.

The default is to run the C preprocessor (cpp) if the input file ends in a .F or .F90 suffix.

For more information on controlling preprocessing, see the −ftpp, −E, and −nocpp options. For information on enabling macro expansion, see the −macro-expand option. By default, no preprocessing is performed on files that end in a .f or .f90 suffix.

−d-lines

(Fortran only) Compile lines with a D in column 1.

−Dvar=[def][,var=[def] ...]

Define variables used for source preprocessing as if they had been defined by a #define directive. If no def is specified, 1 is used. For information on undefining variables, see the −Uvar option.

−default64

(For Fortran only) Set the sizes of default integer, real, logical, and double precision objects. This option is a synonym for the pair of options: −r8 −i8. Calling a routine in a specialized library, such as SCSL, requires that its 64−bit entry point be specified when 64−bit data are used. Similarly, its 32−bit entry point must be specified when 32−bit data are used.

−dumpversion

Show the version of the compiler being used and nothing else.

−E

Run only the source preprocessor on the input files, regardless of suffix, and write the result to stdout. This option overrides the −nocpp option. The output file contains line directives. To generate an output file without line directives, see the −P option. For more information on controlling source preprocessing, see the −cpp, −ftpp, −macro-expand, and −nocpp options.

−extend-source

(For Fortran only) Specify a 132−character line length for fixed−format source lines. By default, fixed−format lines are 72 characters wide. For more information on controlling line length, see the −colN option.

−fb-create <path>

Generate an instrumented executable program. Such an executable is suitable for producing feedback data files with the specified prefix for use in feedback-directed compilation (FDO). The commonly used prefix is "fbdata". This is OFF by default.

−fb-opt <prefix for feedback data files>

Used to specify feedback−directed compilation (FDO) by extracting feedback data from files with the specified prefix, which were previously generated using −fb-create. The commonly used prefix is "fbdata". The same optimization flags must have been used in the −fb-create compile. Feedback data files created from executables compiled with different optimization flags will give checksum errors. FDO is OFF by default.
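A minimal Python sketch of the two FDO builds follows. It only assembles the argument lists; the helper name fdo_commands, the source and output names, and the choice of −O2 are illustrative assumptions. The point it shows is that both builds share the same optimization flags and the same prefix, with a run of the instrumented executable in between to write the feedback data files.

```python
def fdo_commands(compiler, opt_flags, sources, output, prefix="fbdata"):
    """Return (instrumented_build, optimized_build) argument lists.

    The same optimization flags are used in both builds, as required above.
    """
    common = [compiler, *opt_flags]
    instrumented = common + ["-fb-create", prefix, "-o", output, *sources]
    optimized = common + ["-fb-opt", prefix, "-o", output, *sources]
    return instrumented, optimized


# Hypothetical program "app" built from app.c with pathcc at -O2.
step1, step3 = fdo_commands("pathcc", ["-O2"], ["app.c"], "app")
print(" ".join(step1))  # step 1: build the instrumented executable
print("./app")          # step 2: run it to produce the feedback data files
print(" ".join(step3))  # step 3: rebuild using the collected feedback
```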

-fb-phase=(0,1,2,3,4)

Specify the compilation phase at which instrumentation for the collection of profile data is performed. This option is useful only when used with −fb-create. The value must be in the range 0 to 4. The default value is 0, which specifies the earliest phase for instrumentation, immediately after front-end processing.

−f[no-]check-new

(For C++ only) Check the result of new for NULL. When −fno−check−new is used, the compiler does not check the result of operator new for NULL.

−fe

Stop after the front-end is run.

−f[no-]unwind-tables

−funwind-tables emits unwind information. −fno-unwind-tables tells the compiler never to emit any unwind information. This is the default. Flags to enable exception handling automatically enable -funwind-tables.

−f[no-]fast−math

−ffast-math improves FP speed by relaxing ANSI & IEEE rules. −ffast-math is implied by −Ofast. −fno-fast−math tells the compiler to conform to ANSI and IEEE math rules at the expense of speed. −ffast-math implies −OPT:IEEE_arithmetic=2 -fno-math-errno. -fno-fast-math implies -OPT:IEEE_arithmetic=1 -fmath-errno.

−f[no-]fast−stdlib

The −ffast-stdlib flag improves application performance by generating code to link against special versions of some standard library routines, and linking against the PathScale compiler runtime library. This option is enabled by default.

If −fno−fast−stdlib is used during compilation, the compiler will not emit code to link against fast versions of standard library routines. During compilation, −ffast−stdlib implies −OPT:fast_stdlib=on.

If −fno−fast−stdlib is used during linking, the compiler will not link against the PathScale compiler runtime library.

If you link code with −fno−fast−stdlib that was not also compiled with this flag, you may see linker errors. Much of the PathScale compiler Fortran runtime is compiled with −ffast−stdlib, so it is not advised to link Fortran applications with −fno-fast−stdlib.

−ffloat−store

Do not store floating point variables in registers, and inhibit other options that might change whether a floating point value is taken from a register or memory. This option prevents undesirable excess precision on the X87 floating-point unit, where all floating-point computations are performed in one precision regardless of the original type (see -mx87−precision). If the program uses floating point values with less precision, the extra precision in the X87 may violate the precise definition of IEEE floating point. -ffloat−store causes all pertinent intermediate computations to be stored to memory to force truncation to lower precision. However, the extra stores will slow down program execution substantially. -ffloat−store has no effect under -msse2, which is the default under both -m64 and -m32.

−ffortran-bounds-check

(For Fortran only) Check bounds.

−f[no-]gnu−keywords

(For C/C++ only) Recognize ’typeof’ as a keyword. If −fno−gnu−keywords is used, do not recognize ’typeof’ as a keyword.

−f[no-]implicit-inline-templates

(For C++ only) −fimplicit-inline-templates emits code for inline templates instantiated implicitly. −fno−implicit-inline-templates tells the compiler to never emit code for inline templates instantiated implicitly.

−f[no-]implicit-templates

(For C++ only) The −fimplicit-templates option emits code for non−inline templates instantiated implicitly. With −fno-implicit-templates the compiler will not emit code for non−inline templates instantiated implicitly.

−finhibit-size-directive

Do not generate .size directives.

−f[no-]inline-functions

(For C/C++ only) −finline−functions automatically integrates simple functions into their callers. −fno-inline-functions does not automatically integrate simple functions into their callers.

−fabi-version=N

(For C++ only) Use version N of the C++ ABI. Version 1 is the version of the C++ ABI that first appeared in G++ 3.2. Version 0 will always be the version that conforms most closely to the C++ ABI specification. Therefore, the ABI obtained using version 0 will change as ABI bugs are fixed. The default is version 1.

−fixedform

(For Fortran only) Treat all input source files, regardless of suffix, as if they were written in fixed source form (f77 72-column format), instead of F90 free format. By default, only input files suffixed with .f or .F are assumed to be written in fixed source form.

−fkeep-inline-functions

(For C/C++ only) Generate code for functions even if they are fully inlined.

−FLIST: ...

Invoke the Fortran listing control group, which controls production of the compiler’s internal program representation back into Fortran code, after IPA inlining and loop−nest transformations. This is used primarily as a diagnostic tool, and the generated Fortran code may not always compile. With the exception of −FLIST:=OFF, any use of this option implies −flist. The arguments to the −FLIST option are as follows:

Argument

Action

=setting

Enable or disable the listing. setting can be either ON or OFF. The default is OFF.

This option is enabled when any other −FLIST options are enabled, but it can also be used to enable a listing when no other options are enabled.

ansi_format=setting

Set ANSI format. setting can be either ON or OFF. When set to ON, the compiler uses a space (instead of tab) for indentation and a maximum of 72 characters per line. The default is OFF.

emit_pfetch=setting

Writes prefetch information, as comments, in the transformed source file. setting can be either ON or OFF. The default is OFF.

In the listing, PREFETCH identifies a prefetch and includes the variable reference (with an offset in bytes), an indication of read/write, a stride for each dimension, and a number in the range from 1 (low) to 3 (high), which reflects the confidence in the prefetch analysis. Prefetch identifies the reference(s) being prefetched by the PREFETCH descriptor. The comments occur after a read/write to a variable and note the identifier of the PREFETCH−spec for each level of the cache.

ftn_file=file

Write the program to file. By default, the program is written to file.w2f.f.

linelength=N

Set the maximum line length to N characters.

show=setting

Write the input and output filenames to stderr. setting can be either ON or OFF. The default is ON.

−flist

Invoke all Fortran listing control options. The effect is the same as if all −FLIST options are enabled.

−fms-extensions

(For C/C++ only) Accept broken MFC extensions without warning.

−fno-asm

(For C/C++ only) Do not recognize the ’asm’ keyword.

−fno-builtin

(For C/C++ only) Do not recognize any built-in functions.

−fno-common

(For C/C++ only) Use strict ref/def initialization model.

−f[no-]exceptions

(For C++ only) −fexceptions enables exception handling. This is the default. −fno-exceptions disables exception handling. This option has a subset of the effects of −fno-gnu-exceptions. Hence, it can be used on some C++ applications, on which −fno-gnu-exceptions cannot be applied.

−f[no-]gnu−exceptions

(For C++ only) −fgnu-exceptions enables exception handling, and is equivalent to −fexceptions. This is the default. −fno-gnu-exceptions disables exception handling, and is equivalent to GNU option −fno-exceptions.

−fno-ident

Ignore #ident directives.

−fno−math−errno

Do not set errno after calling math functions that are executed with a single instruction, e.g. sqrt. A program that relies on IEEE exceptions for math error handling may want to use this flag for speed while maintaining IEEE arithmetic compatibility. This is implied by −Ofast. The default is −fmath-errno.

−f[no−]signed−char

(For C/C++ only) −fsigned−char makes ’char’ signed by default. −fno−signed−char makes ’char’ unsigned by default.

−fpack-struct

(For C/C++ only) Pack structure members together without holes.

−f[no−]permissive

−fpermissive will downgrade messages about non−conformant code to warnings. −fno−permissive keeps messages about non−conformant code as errors.

−f[no−]preprocessed

−fpreprocessed tells the preprocessor that input has already been preprocessed. −fno−preprocessed tells the preprocessor that input has not already been preprocessed.

−freeform

(For Fortran only) Treat all input source files, regardless of suffix, as if they were written in free source form. By default, only input files suffixed with .f90 or .F90 are assumed to be written in free source form.

−f[no-]rtti

(For C++ only) Using −frtti will generate runtime type information. The −fno-rtti option will not generate runtime type information.

−f[no-]second-underscore

(For Fortran only) −fsecond-underscore appends a second underscore to symbols that already contain an underscore. −fno−second-underscore tells the compiler not to append a second underscore to symbols that already contain an underscore.

−f[no-]signed-bitfields

(For C/C++ only) −fsigned-bitfields makes bitfields signed by default. −fno-signed-bitfields makes bitfields unsigned by default.

−f[no-]strict-aliasing

(For C/C++ only) −fstrict−aliasing tells the compiler to assume strictest aliasing rules. −fno−strict−aliasing tells the compiler not to assume strict aliasing rules.

−f[no-]PIC

−fPIC tells the compiler to generate position independent code, if possible. The default is −fno−PIC, which tells the compiler not to generate position independent code.

−fprefix-function-name

(For C/C++ only) Add a prefix to all function names.

−fshared-data

(For C/C++ only) Mark data as shared rather than private.

−fshort-double

(For C/C++ only) Use the same size for double as for float.

−fshort-enums

(For C/C++ only) Use the smallest fitting integer to hold enums.

−fshort-wchar

(For C/C++ only) Use short unsigned int for wchar_t instead of the default underlying type for the target.

−ftest-coverage

Create data files for the pathcov(1) code-coverage utility. The data file names begin with the name of your source file:

SOURCENAME.bb

A mapping from basic blocks to line numbers, which pathcov uses to associate basic block execution counts with line numbers.

SOURCENAME.bbg

A list of all arcs in the program flow graph. This allows pathcov to reconstruct the program flow graph, so that it can compute all basic block and arc execution counts from the information in the SOURCENAME.da file.

Use −ftest-coverage with −fprofile-arcs; the latter option adds instrumentation to the program, which then writes execution counts to another data file:

SOURCENAME.da

Runtime arc execution counts, used in conjunction with the arc information in the file SOURCENAME.bbg.

Coverage data will map better to the source files if −ftest-coverage is used without optimization. See the gcc man pages for more information.

−ftpp

Run the Fortran source preprocessor on input Fortran source files before compiling. By default, files suffixed with .F or .F90 are run through the C source preprocessor (cpp). Files that are suffixed with .f or .f90 are not run through any preprocessor by default.

The Fortran source preprocessor does not automatically expand macros outside of preprocessor statements, so you need to specify −macro-expand if you want macros expanded.

−fullwarn

Request that the compiler generate comment−level messages. These messages are suppressed by default. Specifying this option can be useful during software development.

−f[no-]underscoring

(For Fortran only) −funderscoring appends underscores to symbols. −fno-underscoring tells the compiler not to append underscores to symbols.

−f[no-]unsafe-math-optimizations

−funsafe-math-optimizations improves FP speed by violating ANSI and IEEE rules. −fno-unsafe-math-optimizations makes the compilation conform to ANSI and IEEE math rules at the expense of speed. This option is provided for GCC compatibility and is equivalent to −OPT:IEEE_arithmetic=3 −fno−math−errno.

−fuse-cxa-atexit

(For C++ only) Register static destructors with __cxa_atexit instead of atexit.

−fwritable-strings

(For C/C++ only) Attempt to support writable-strings K&R style C.

−g[N]

Specify debugging support and indicate the level of information produced by the compiler. The supported values for N are:

0

No debugging information for symbolic debugging is produced. This is the default.

1

Produces minimal information, enough for making backtraces in parts of the program that you don’t plan to debug. This is also the flag to use if the user wants backtraces but does not want the overhead of full debug information. This flag also causes −−export−dynamic to be passed to the linker.

2

Produces debugging information for symbolic debugging. Specifying −g without a debug level is equivalent to specifying −g2. If there is no explicit optimization flag specified, the −O0 optimization level is used in order to maintain the accuracy of the debugging information. If optimization options −O1, −O2, −O3 or −ipa are explicitly specified, the optimizations are performed accordingly but the accuracy of the debugging cannot be guaranteed.

3

Produces additional debugging information for debugging macros.

−gcc

(For C/C++ only) Define the __GNUC__ and other predefined preprocessor macros.

−gnu[N]

(For C/C++ only) Enables the compiler to generate code compatible with the GNU N series of compilers, where N is either 3 or 4. On systems whose system compiler is GCC 3, the default is -gnu3; on GCC 4 systems the default is -gnu4. Use -show-defaults to display the default.

−GRA:home=(ON|OFF)

Specifying OFF turns off the rematerialization optimization for non−local user variables in the Global Register Allocator. The default is ON.

−GRA:optimize_boundary=(ON|OFF)

Allow the Global Register Allocator to allocate the same register to different variables in the same basic-block. Default is OFF.

−help

List all available options. The compiler is not invoked.

−help:

Print list of possible options that contain a given string.

−H

Print the name of each header file used.

−Idir

Specify a directory to be searched. This is used for the following types of files:

Files named in INCLUDE lines in the Fortran source file that do not begin with a slash (/) character

Files named in #include source preprocessing directives that do not begin with a slash (/) character

Files specified on Fortran USE statements

Files are searched in the following order: first, in the directory that contains the input file; second, in the directories specified by dir; and third, in the standard directory, /usr/include.
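The search order can be sketched as follows. This is a simplified model with hypothetical names: it lists the candidate locations in the documented order for a file name that does not begin with a slash, and it does not check whether the files exist.

```python
import posixpath


def candidate_paths(name, input_file, dash_i_dirs):
    """List the locations tried for `name`, in the order described above."""
    if name.startswith("/"):
        return [name]  # absolute names are not searched for
    input_dir = posixpath.dirname(input_file) or "."
    search_dirs = [input_dir, *dash_i_dirs, "/usr/include"]
    return [posixpath.join(d, name) for d in search_dirs]


# Example: src/main.c includes "defs.h" and is compiled with -Iinc.
for path in candidate_paths("defs.h", "src/main.c", ["inc"]):
    print(path)
```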

−iN

(For Fortran only) Specify the length of default integer constants, default integer variables, and logical quantities. Specify one of the following:

Option    Action

−i4       Specifies 32−bit (4−byte) objects. This is the default.
−i8       Specifies 64−bit (8−byte) objects.

−ignore-suffix

Determine the language of the source file being compiled by the command used to invoke the compiler. By default, the language is determined by the file suffixes (.c, .cpp, .C, .cxx, .f, .f90, .s). When the −ignore-suffix option is specified, the pathcc command invokes the C compiler, pathCC invokes the C++ compiler, and pathf95 invokes the Fortran 95 compiler.

−inline

Request inline processing.

−INLINE: ...

Specify options for subprogram inlining. With the exception of −INLINE:=OFF, any use of this option implies −inline.

If you have included inlining directives in your source code, the −INLINE option must be specified in order for those directives to be honored.

−INLINE:aggressive=(ON|OFF)

Tell the compiler to be more aggressive about inlining. The default is −INLINE:aggressive=OFF.

−INLINE:list=(ON|OFF)

Tell the inliner to list inlining actions to stderr as they occur. The default is −INLINE:list=OFF.

−INLINE:preempt=(ON|OFF)

Perform inlining of functions marked preemptible in the light-weight inliner. Default is OFF. This inlining prevents another definition of such a function, in another DSO, from preempting the definition of the function being inlined.

−ipa

Invoke inter-procedural analysis (IPA). Specifying this option is identical to specifying −IPA or −IPA:. Default settings for the individual IPA suboptions are used.

−IPA: ...

The inter-procedural analyzer option group controls application of inter-procedural analysis and optimization, including inlining, constant propagation, common block array padding, dead function elimination, alias analysis, and others. Specify −IPA by itself to invoke the inter-procedural analysis phase with default options. If you compile and link in distinct steps, you must specify at least −IPA for the compile step, and specify −IPA and the individual options in the group for the link step. If you specify −IPA for the compile step, and do not specify −IPA for the link step, you will receive an error.

−IPA:addressing=(ON|OFF)

Invoke the analysis of address operator usage. The default is OFF. −IPA:alias=ON is a prerequisite for this option.

−IPA:aggr_cprop=(ON|OFF)

Enable or disable aggressive inter-procedural constant propagation. Setting can be ON or OFF. This attempts to avoid passing constant parameters, replacing the corresponding formal parameters by the constant values. Less aggressive inter-procedural constant propagation is done by default. The default setting is ON.

−IPA:alias=(ON|OFF)

Invoke alias/mod/ref analysis. The default is ON.

−IPA:callee_limit=N

Functions whose size exceeds this limit will never be automatically inlined by the compiler. The default is 500.

−IPA:cgi=(ON|OFF)

Invoke constant global variable identification. This option marks non-scalar global variables that are never modified as constant, and propagates their constant values to all files. Default is ON.

−IPA:clone_list=(ON|OFF)

Tell the IPA function cloner to list cloning actions to stderr as they occur. The default is −IPA:clone_list=OFF.

−IPA:common_pad_size=N

This specifies the amount by which to pad common block array dimensions. By default, an amount is automatically chosen that will improve cache behavior for common block array accesses.

−IPA:cprop=(ON|OFF)

Turn on or off inter-procedural constant propagation. This option identifies the formal parameters that always have a specific constant value. Default is ON. See also -IPA:aggr_cprop.

−IPA:ctype=(ON|OFF)

When ON, causes the compiler to generate faster versions of the <ctype.h> macros such as isalpha, isascii, etc. This flag is unsafe both in multi-threaded programs and in all locales other than the 7-bit ASCII (or "C") locale. The default is OFF. Do not turn this on unless the program will always run under the 7-bit ASCII (or "C") locale and is single-threaded.

−IPA:depth=N

Identical to −IPA:maxdepth=N.

−IPA:dfe=(ON|OFF)

Enable or disable dead function elimination. Removes any functions that are inlined everywhere they are called. The default is ON.

−IPA:dve=(ON|OFF)

Enable or disable dead variable elimination. This option removes variables that are never referenced by the program. Default is ON.

−IPA:echo=(ON|OFF)

Option to echo (to stderr) the compile commands and the final link commands that are invoked from IPA. Default is OFF. This option can help monitor the progress of a large system build.

−IPA:field_reorder=(ON|OFF)

Enable the re−ordering of fields in large structs based on their reference patterns in feedback compilation to minimize data cache misses. The default is OFF.

−IPA:forcedepth=N

This option sets inline depths, directing IPA to attempt to inline all functions at a depth of (at most) N in the callgraph, instead of using the default inlining heuristics. This option ignores the default heuristic limits on inlining. Functions at depth 0 make no calls to any sub-functions. Functions only making calls to depth 0 functions are at depth 1, and so on.
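
The depth numbering described above can be sketched in C. This is only an illustration of the definition, not IPA's implementation. The adjacency-matrix representation and the name call_depth are assumptions made for the example, the call graph is assumed to contain no recursion, and a function that calls functions at several depths is assumed to sit one level above its deepest callee.

```c
#define MAX_FUNCS 8

/* calls[i][j] != 0 means function i calls function j.
 * Hypothetical representation; assumes the call graph has no cycles. */
int call_depth(int calls[MAX_FUNCS][MAX_FUNCS], int n, int f)
{
    int depth = 0;                       /* 0 when f makes no calls */
    for (int callee = 0; callee < n; callee++) {
        if (calls[f][callee]) {
            /* One level above the deepest function that f calls. */
            int d = call_depth(calls, n, callee) + 1;
            if (d > depth)
                depth = d;
        }
    }
    return depth;
}
```

With this numbering, a leaf function has depth 0 and a function that calls only leaves has depth 1, matching the description of forcedepth and maxdepth.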

−IPA:ignore_lang=(ON|OFF)

Enable/disable inlining across language boundaries of Fortran on one side, and C/C++ on the other. The compiler may not always be aware of the correct effective language semantics if this optimization is done, making it unsafe in some scenarios. The default is OFF.

−IPA:inline=(ON|OFF)

This option performs inter-file subprogram inlining during the main IPA processing. The default is ON. Does not affect the light-weight inliner.

−IPA:keeplight=(ON|OFF)

This option directs IPA not to send −keep to the compiler, in order to save space. The default is OFF.

−IPA:linear=(ON|OFF)

Controls conversion of a multi-dimensional array to a single dimensional (linear) array that covers the same block of memory. When inlining Fortran subroutines, IPA tries to map formal array parameters to the shape of the actual parameter. In the case that it cannot map the parameter, it linearizes the array reference. By default, IPA will not inline such callsites because they may cause performance problems. The default is OFF.

−IPA:map_limit=N

Control when IPA enables sp_partition. N is the maximum size (in bytes) of input files mapped before IPA invokes −IPA:sp_partition.

−IPA:maxdepth=N

This option directs IPA to not attempt to inline functions at a depth of more than N in the callgraph; where functions that make no calls are at depth 0, those that call only depth 0 functions are at depth 1, and so on. This inlining remains subject to overriding limits on code expansion. Also see −IPA:forcedepth, −IPA:space, and −IPA:plimit.

−IPA:max_jobs=N

This option limits the maximum parallelism when invoking the compiler after IPA to (at most) N compilations running at once. The option can take the following values:

0 = The parallelism chosen is equal to either the number of CPUs, the number of cores, or the number of hyperthreading units in the compiling system, whichever is greatest.

1 = Disable parallelization during compilation (default)

>1 = Specifically set the degree of parallelism

−IPA:min_hotness=N

When feedback information is available, a call site to a procedure must be invoked with a count that exceeds the threshold specified by N before the procedure will be inlined at that call site. The default is 10.

−IPA:multi_clone=N

This option specifies the maximum number of clones that can be created from a single procedure. Default value is 0. Aggressive procedural cloning may provide opportunities for inter-procedural optimization, but may also significantly increase the code size.

−IPA:node_bloat=N

When this option is used in conjunction with −IPA:multi_clone, it specifies the maximum percentage growth of the total number of procedures relative to the original program.

−IPA:plimit=N

This option stops inlining into a specific subprogram once it reaches size N in the intermediate representation. Default is 2500.

−IPA:pu_reorder=(0|1|2)

Control re−ordering the layout of program units based on their invocation patterns in feedback compilation to minimize instruction cache misses. This option is ignored unless under feedback compilation.

0 = Disable procedure reordering. This is the default for non−C++ programs.

1 = Reorder based on the frequency in which different procedures are invoked. This is the default for C++ programs.

2 = Reorder based on caller-callee relationship.

−IPA:relopt=(ON|OFF)

This option enables optimizations similar to those achieved with the compiler options −O and −c, where objects are built with the assumption that the compiled objects will be linked into a call-shared executable later. The default is OFF. In effect, optimizations based on position-dependent code (non-PIC) are performed on the compiled objects.

−IPA:small_pu=N

A procedure with size smaller than N is not subjected to the plimit restriction. The default is 30.

−IPA:sp_partition=[setting]

This option enables partitioning for disk/addressing−saving purposes. The default is OFF. Mainly used for building very large programs. Normally, partitioning would be done by IPA internally.

−IPA:space=N

Inline until a program expansion of N% is reached. For example, -IPA:space=20 limits code expansion due to inlining to approximately 20%. Default is no limit.

−IPA:specfile=filename

Open the file filename and read additional options from it. The specification file contains zero or more lines with inliner options in the form expected on the command line. The specfile option cannot occur in a specification file, so specification files cannot invoke other specification files.

−IPA:use_intrinsic=(ON|OFF)

Enable/disable loading the intrinsic version of standard library functions. The default is OFF.

−isystem dir

Search dir for header files, after all directories specified by −I but before the standard system directories. Mark it as a system directory, so that it gets the same special treatment as is applied to the standard system directories.

−keep

Write all intermediate compilation files. file.s contains the generated assembly language code. file.i contains the preprocessed source code. These files are retained after compilation is finished. If IPA is in effect and you want to retain file.s, you must specify −IPA:keeplight=OFF in addition to −keep.

−keepdollar

(For Fortran only) Treat the dollar sign ($) as a normal last character in symbol names.

−L directory

In XPG4 mode, changes the algorithm of searching for libraries named in −L operands to look in the specified directory before looking in the default location. Directories specified in −L options are searched in the specified order. Multiple instances of −L options can be specified.

−l library

In XPG4 mode, searches the specified library. A library is searched when its name is encountered, so the placement of a −l operand is significant.

−LANG: ...

Controls the language option group. The following sections describe the suboptions available in this group.

Argument

Action

copyinout=(ON|OFF)

When an array section is passed as the actual argument in a call, the compiler sometimes copies the array section to a temporary array and passes the temporary array, thus promoting locality in the accesses to the array argument. This optimization is relevant only to Fortran, and this flag controls the aggressiveness of this optimization. The default is ON for −O2 or higher and OFF otherwise.

formal_deref_unsafe=(ON|OFF)

Tell the compiler whether it is unsafe to speculate a dereference of a formal parameter in Fortran. The default is OFF, which is better for performance.

heap_allocation_threshold=size

Determine heap or stack allocation. If the size of an automatic array or compiler temporary exceeds size bytes it is allocated on the heap instead of the stack. If size is −1, objects are always put on the stack. If size is 0, objects are always put on the heap.

The default is −1 for maximum performance and for compatibility with previous releases.
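
The decision rule for heap_allocation_threshold can be written out as a small C function. This models only the rule as documented above; the function name and the use of long for sizes are assumptions made for the example.

```c
#include <stdbool.h>

/* Models the documented rule; the name is hypothetical.
 * Returns true when an object of object_bytes would go on the heap. */
bool allocated_on_heap(long object_bytes, long threshold)
{
    if (threshold == -1)
        return false;                 /* always on the stack */
    if (threshold == 0)
        return true;                  /* always on the heap */
    return object_bytes > threshold;  /* heap only when the size exceeds the threshold */
}
```

For example, with a threshold of 1024 a 2048-byte automatic array would be heap-allocated, while a 1024-byte array would remain on the stack.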

IEEE_minus_zero=setting

Enable or disable the SIGN(3I) intrinsic function’s ability to recognize negative floating−point zero (−0.0). Specify either ON or OFF for setting. The default is OFF, which suppresses the minus sign. The minus sign is suppressed by default to prevent problems from hardware instructions and optimizations that can return a −0.0 result from a 0.0 value. To obtain a minus sign (−) when printing a negative floating−point zero (−0.0), use the −z option on the assign(1) command.

IEEE_save=setting

(For Fortran) The ISO standard requires that any procedure which accesses the standard IEEE intrinsic modules via a "use" statement must save the floating point flags, halting mode, and rounding mode on entry; must restore the halting mode and rounding mode on exit; and must OR the saved flags with the current flags on exit. Setting this option to OFF may improve execution speed by skipping these steps.

recursive=setting

Control recursion support. setting can be either ON or OFF. The default is OFF.

In either mode, the compiler supports a recursive, stack−based calling sequence. The difference lies in the optimization of statically allocated local variables, as described in the following paragraphs.

With −LANG:recursive=ON, the compiler assumes that a statically allocated local variable could be referenced or modified by a recursive procedure call. Therefore, such a variable must be stored into memory before making a call and reloaded afterwards.

With −LANG:recursive=OFF, the compiler can safely assume that a statically allocated local variable is not referenced or modified by a procedure call. This setting enables the compiler to optimize more aggressively.
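
The following C sketch illustrates why a statically allocated local variable has to be reloaded after a call when recursion is possible. It is an illustration of the hazard only, with a hypothetical function name, and does not show code produced by the compiler.

```c
/* Every activation shares the single static variable 'calls'. */
int countdown(int n)
{
    static int calls = 0;     /* statically allocated local */
    calls++;
    int seen_before = calls;  /* value of 'calls' before the call */
    if (n > 0)
        countdown(n - 1);     /* the recursive call modifies 'calls' */
    /* 'calls' must be re-read from memory here: reusing the value
     * seen before the call would give the wrong answer. */
    return calls - seen_before;
}
```

A compiler that assumed the call could not touch calls might keep the old value in a register across the call, which is the optimization that −LANG:recursive=OFF allows and −LANG:recursive=ON rules out.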

rw_const=(ON|OFF)

Tell the compiler whether to treat a constant parameter in Fortran as read-only or read-write. If treated as read-write, the compiler has to generate extra code when passing these constant parameters so that the called function can modify them safely. The default is OFF, which is more efficient but will cause a segmentation fault if the constant parameter is written to.

short_circuit_conditionals=(ON|OFF)

Handle .AND. and .OR. via short-circuiting, in which the second operand is not evaluated if unnecessary, even if it contains side effects. Default is ON. This flag applies only to Fortran; it has no effect on C/C++ programs.
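
The observable difference between short-circuit and full evaluation can be sketched in C, used here only as an analogy for the Fortran behavior. All names are hypothetical; the counter stands in for any side effect in the second operand.

```c
int evaluations = 0;

/* Operand with a side effect: counts how often it is evaluated. */
int check(int value)
{
    evaluations++;
    return value;
}

/* Short-circuit form: check(b) is skipped when a is false. */
int and_short_circuit(int a, int b)
{
    return a && check(b);
}

/* Full-evaluation form: check(b) always runs. */
int and_full(int a, int b)
{
    int rhs = check(b);
    return a && rhs;
}
```

Both forms return the same logical result, but only the full-evaluation form is guaranteed to perform the side effect when the first operand is false.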

−LIST: ...

The listing option flag controls information that gets written to a listing (.lst) file. The individual controls in this group are:

=(ON|OFF)

Enable or disable writing the listing file. The default is ON if any −LIST: group options are enabled. By default, the listing file contains a list of options enabled.

all_options[=(ON|OFF)]

Enable or disable listing of most supported options. The default is OFF.

notes[=(ON|OFF)]

If an assembly listing is generated (for example, on −S), various parts of the compiler (such as software pipelining) generate comments within the listing that describe what they have done. Specifying OFF suppresses these comments. The default is ON.

options[=(ON|OFF)]

Enable or disable listing of the options modified (directly in the command line, or indirectly as a side effect of other options). The default is OFF.

symbols[=(ON|OFF)]

Enable or disable listing of information about the symbols (variables) managed by the compiler.

−LNO: ...

Specify options and transformations performed on loop nests by the Loop Nest Optimizer (LNO). The −LNO options are enabled only if the optimization level of −O3 or higher is in effect.

For information on the LNO options that are in effect during a compilation, use the −LIST:all_options=ON option.

−LNO:apo_use_feedback=(ON|OFF)

Effective only when specified with −apo under feedback−directed compilation, this flag tells the auto-parallelizer whether to use the feedback data of the loops in deciding whether each loop should be parallelized. When the compiler parallelizes a loop, it generates both a serial and a parallel version. If the trip count of the loop is small, it is not beneficial to use the parallel version during execution. When this flag is set to ON and the feedback data indicates that the loop has small trip count, the auto−parallelizer will not generate the parallel version, thus saving the runtime check needed to decide whether to execute the serial or parallel version of the loop. The default is OFF.

−LNO:build_scalar_reductions=(ON|OFF)

Build scalar reductions before any loop transformation analysis. Using this flag may enable further loop transformations involving reduction loops. The default is OFF. This flag is redundant when -OPT:roundoff=2 or greater is in effect.

−LNO:blocking=(ON|OFF)

Enable or disable the cache blocking transformation. The default is ON.

−LNO:blocking_size=N

This option specifies a block size that the compiler must use when performing any blocking. N must be a positive integer number that represents the number of iterations.
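
As a rough illustration of cache blocking, the C sketch below shows a matrix transpose written by hand in unblocked and blocked form. The function names, the fixed dimension DIM, and the choice of a transpose are assumptions made for the example; the code LNO actually generates will differ.

```c
#define DIM 8

/* Plain transpose: reads src by rows and writes dst by columns. */
void transpose(double dst[DIM][DIM], double src[DIM][DIM])
{
    for (int i = 0; i < DIM; i++)
        for (int j = 0; j < DIM; j++)
            dst[j][i] = src[i][j];
}

/* Blocked transpose: processes block x block tiles so that the
 * rows and columns touched by one tile can stay in cache.
 * 'block' is a number of iterations, as for blocking_size. */
void transpose_blocked(double dst[DIM][DIM], double src[DIM][DIM], int block)
{
    for (int ii = 0; ii < DIM; ii += block)
        for (int jj = 0; jj < DIM; jj += block)
            for (int i = ii; i < ii + block && i < DIM; i++)
                for (int j = jj; j < jj + block && j < DIM; j++)
                    dst[j][i] = src[i][j];
}
```

Both functions compute the same result; only the order in which elements are visited changes.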

−LNO:fission=(0|1|2)

This option controls loop fission. The options can be one of the following:

0 = Disable loop fission (default)

1 = Perform normal fission as necessary

2 = Specify that fission be tried before fusion

Because -LNO:fusion is on by default, turning on fission without turning off fusion may result in their effects being nullified. Ordinarily, fusion is applied before fission. Specifying -LNO:fission=2 will turn on fission and cause it to be applied before fusion.
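
Loop fission can be pictured with the hand-written C sketch below. It is an illustration of the transformation under the assumption that the arrays do not overlap, not output from the compiler, and the function names are hypothetical.

```c
#define LEN 6

/* Original loop: two independent statements share one loop. */
void combined(double *a, double *b, const double *x)
{
    for (int i = 0; i < LEN; i++) {
        a[i] = x[i] * 2.0;
        b[i] = x[i] + 1.0;
    }
}

/* After fission: each statement has its own loop. */
void fissioned(double *a, double *b, const double *x)
{
    for (int i = 0; i < LEN; i++)
        a[i] = x[i] * 2.0;
    for (int i = 0; i < LEN; i++)
        b[i] = x[i] + 1.0;
}
```

Fusion is the reverse step, which is why applying both without care can cancel their effects.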

−LNO:full_unroll,fu=N

Fully unroll loops with trip_count <= N inside LNO. N can be any integer between 0 and 100. The default value for N is 5. Setting this flag to 0 disables full unrolling of small trip count loops inside LNO.
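
A loop with a small, known trip count and its fully unrolled equivalent might look like the following C sketch. The functions are hypothetical and written by hand to show the idea; they are not compiler output.

```c
/* Loop with a known trip count of 4. */
double dot4(const double *a, const double *b)
{
    double sum = 0.0;
    for (int i = 0; i < 4; i++)
        sum += a[i] * b[i];
    return sum;
}

/* The same computation fully unrolled: no loop counter or branch. */
double dot4_unrolled(const double *a, const double *b)
{
    double sum = 0.0;
    sum += a[0] * b[0];
    sum += a[1] * b[1];
    sum += a[2] * b[2];
    sum += a[3] * b[3];
    return sum;
}
```

With the default limit of 5, a trip count of 4 would qualify for full unrolling, subject to the other full_unroll conditions.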

−LNO:full_unroll_size=N

Fully unroll loops with unrolled loop size <= N inside LNO. N can be any integer between 0 and 10000. The conditions implied by the full_unroll option must also be satisfied for the loop to be fully unrolled. The default value for N is 2000.

−LNO:full_unroll_outer=(ON|OFF)

Control the full unrolling of loops with known trip count that do not contain a loop and are not contained in a loop. The conditions implied by both the full_unroll and the full_unroll_size options must be satisfied for the loop to be fully unrolled. The default is OFF.

−LNO:fusion=N

Perform loop fusion. N can be one of the following:

0 = Loop fusion is off

1 = Perform conservative loop fusion

2 = Perform aggressive loop fusion

The default is 1.
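
As an illustration of loop fusion, the C sketch below merges a producer loop and a consumer loop by hand. The names are hypothetical, the arrays are assumed not to overlap, and the example does not distinguish between the conservative and aggressive levels.

```c
#define COUNT 5

/* Two separate loops: t is written in full, then read in full. */
void separate(double *t, double *y, const double *x)
{
    for (int i = 0; i < COUNT; i++)
        t[i] = x[i] * x[i];
    for (int i = 0; i < COUNT; i++)
        y[i] = t[i] + 1.0;
}

/* Fused loop: t[i] is consumed right after it is produced. */
void fused(double *t, double *y, const double *x)
{
    for (int i = 0; i < COUNT; i++) {
        t[i] = x[i] * x[i];
        y[i] = t[i] + 1.0;
    }
}
```

The fused form makes one pass over the data instead of two, which is the locality benefit fusion aims for.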

−LNO:fusion_peeling_limit=N

This option sets the limit for the number of iterations allowed to be peeled in fusion, where N >= 0. The default is 5.

−LNO:gather_scatter=N

This option enables gather-scatter optimizations. N can be one of the following:

0 = Disable all gather-scatter optimizations

1 = Perform gather-scatter optimizations in non-nested IF statements (default)

2 = Perform multi-level gather-scatter optimizations

−LNO:hoistif=(ON|OFF)

This option enables or disables hoisting of IF statements inside inner loops to eliminate redundant loops. Default is ON.

−LNO:ignore_feedback=(ON|OFF)

If the flag is ON then feedback information from the loop annotations will be ignored in LNO transformations. The default is OFF.

−LNO:ignore_pragmas=(ON|OFF)

This option specifies that the command-line options override directives in the source file. Default is OFF.

−LNO:local_pad_size=N

This option specifies the amount by which to pad local array dimensions. The compiler automatically (by default) chooses the amount of padding to improve cache behavior for local array accesses.

−LNO:minvariant,minvar=(ON|OFF)

Enable or disable moving loop-invariant expressions out of loops. The default is ON.
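
Moving a loop-invariant expression out of a loop can be sketched in C as follows. The functions are hypothetical and hand-written to show the effect; they are not compiler output.

```c
/* (a * b + 1.0) does not depend on i but is written inside the loop. */
void scale(double *out, const double *in, int n, double a, double b)
{
    for (int i = 0; i < n; i++)
        out[i] = in[i] * (a * b + 1.0);
}

/* The invariant expression is computed once, before the loop. */
void scale_hoisted(double *out, const double *in, int n, double a, double b)
{
    double factor = a * b + 1.0;
    for (int i = 0; i < n; i++)
        out[i] = in[i] * factor;
}
```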

−LNO:non_blocking_loads=(ON|OFF)

(For C/C++ only) The option specifies whether the processor blocks on loads. If not set, the default of the current processor is used.

−LNO:oinvar=(ON|OFF)

This option controls outer loop hoisting. Default is ON.

−LNO:opt=(0|1)

This option controls the LNO optimization level. The options can be one of the following:

0 = Disable nearly all loop nest optimizations.

1 = Perform full loop nest transformations. This is the default.

−LNO:ou_prod_max=N

This option indicates that the product of unrolling of the various outer loops in a given loop nest is not to exceed N, where N is a positive integer. The default is 16.

−LNO:outer=(ON|OFF)

This option enables or disables outer loop fusion. Default is ON.

−LNO:outer_unroll_max,ou_max=N

The outer_unroll_max option indicates that the compiler may unroll outer loops in a loop nest by as many as N per loop, but no more. The default is 5.

−LNO:parallel_overhead=N

Effective only when specified with -apo, the parallel_overhead option controls the auto-parallelizing compiler’s estimate of the overhead (in processor cycles) incurred by invoking the parallel version of a loop. When the compiler parallelizes a loop, it generates both a serial and a parallel version. If the amount of work performed by the loop is small, it may not be beneficial to use the parallel version during execution. The set value of parallel_overhead is used in this determination during execution time when the number of processors and the iteration count of the loop are taken into account. The default value is 4096. Because the optimal value varies across systems and programs, this option can be used for parallel performance tuning.

−LNO:prefetch=(0|1|2|3)

This option specifies the level of prefetching.

0 = Prefetch disabled.

1 = Prefetch is done only for arrays that are always referenced in each iteration of a loop.

2 = Prefetch is done without the above restriction. This is the default.

3 = Most aggressive.

−LNO:prefetch_ahead=N

Prefetch N cache line(s) ahead. The default is 2.

−LNO:prefetch_verbose=(ON|OFF)

−LNO:prefetch_verbose=ON prints verbose prefetch info to stdout. Default is OFF.

−LNO:processors=N

Tells the compiler to assume that the program compiled under -apo will be run on a system with the given number of processors. This helps reduce the amount of computation needed during execution to decide whether to enter the parallel or serial version of a parallelized loop (see the −LNO:parallel_overhead option). The default is 0, which means the number of processors is unknown. Use the default value of 0 if the program is intended to run on different systems with different numbers of processors. If the option is set to a non-zero value that differs from the actual number of processors, the parallelized code will not perform optimally.

−LNO:sclrze=(ON|OFF)

Turn ON or OFF the optimization that replaces an array by a scalar variable. The default is ON.
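
Replacing an array by a scalar can be illustrated with the C sketch below, in which a temporary array element is produced and consumed within the same iteration. The names and the fixed temporary size are assumptions made for the example, and the code is hand-written rather than compiler output.

```c
/* tmp[i] is written and then read in the same iteration only. */
void with_array(double *y, const double *x, int n)
{
    double tmp[64];                 /* example assumes n <= 64 */
    for (int i = 0; i < n; i++) {
        tmp[i] = x[i] * 2.0;
        y[i] = tmp[i] + 3.0;
    }
}

/* The array is replaced by a scalar that can live in a register. */
void with_scalar(double *y, const double *x, int n)
{
    for (int i = 0; i < n; i++) {
        double t = x[i] * 2.0;
        y[i] = t + 3.0;
    }
}
```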

−LNO:simd=(0|1|2)

This flag controls inner loop vectorization which makes use of SIMD instructions provided by the native processor.

0 = Turn off the vectorizer.

1 = (Default) Vectorize only if the compiler can determine that there is no undesirable performance impact due to sub-optimal alignment. Vectorize only if vectorization does not introduce accuracy problems with floating-point operations.

2 = Vectorize without any constraints (most aggressive).

−LNO:simd_reduction=(ON|OFF)

This flag controls whether reduction loops will be vectorized. Default is ON.

−LNO:simd_verbose=(ON|OFF)

−LNO:simd_verbose=ON prints verbose vectorizer info to stdout. Default is OFF.

−LNO:svr_phase1=(ON|OFF)

This flag controls whether the scalar variable naming phase should be invoked before first phase of LNO. The default is ON.

−LNO:trip_count_assumed_when_unknown,trip_count=N

This flag provides an assumed loop trip count when the actual count is unknown at compile time. LNO uses this value for loop transformations, prefetching, and similar decisions. N can be any positive integer, and the default value is 1000.

−LNO:vintr=(0|1|2)

This flag controls loop vectorization to make use of vector intrinsic routines (Note: a vector intrinsic routine is called once to compute a math intrinsic for the entire vector). −LNO:vintr=1 is the default. −LNO:vintr=0 turns off the vintr optimization. Under −LNO:vintr=2 the compiler will do aggressive optimization for all vector intrinsic routines. Note that −LNO:vintr=2 could be unsafe in that some of these routines could have accuracy problems.

−LNO:vintr_verbose=(ON|OFF)

−LNO:vintr_verbose=ON prints verbose information to stdout on optimizing for vector intrinsic routines. Default is OFF. This flag will let you know which loops are vectorized to make use of vector intrinsic routines.

Following are LNO Transformation Options. Loop transformation arguments allow control of cache blocking, loop unrolling, and loop interchange. They include the following options.

−LNO:interchange=(ON|OFF)

Enable or disable the loop interchange transformation in the loop nest optimizer. Default is ON.
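
Loop interchange can be pictured with the C sketch below, which swaps the two loops of a nest by hand so that a row-major C array is traversed with unit stride. The names and dimensions are hypothetical, and the example is an illustration rather than compiler output.

```c
#define ROWS 4
#define COLS 3

/* Column index in the outer loop: consecutive accesses are
 * COLS elements apart in memory. */
void fill_cols_outer(int m[ROWS][COLS])
{
    for (int j = 0; j < COLS; j++)
        for (int i = 0; i < ROWS; i++)
            m[i][j] = i * 10 + j;
}

/* After interchange: the inner loop walks along a row, so
 * consecutive accesses are adjacent in memory. */
void fill_rows_outer(int m[ROWS][COLS])
{
    for (int i = 0; i < ROWS; i++)
        for (int j = 0; j < COLS; j++)
            m[i][j] = i * 10 + j;
}
```

For Fortran arrays, which are stored in column-major order, the preferred loop order is the opposite.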

−LNO:unswitch=(ON|OFF)

Turn ON or OFF the optimization that performs a simple form of loop unswitching. The default is ON.
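
A simple form of loop unswitching is sketched below in C: a condition that does not change inside the loop is tested once outside it, and each branch gets its own copy of the loop. The functions are hypothetical and hand-written; they are not compiler output.

```c
/* The loop-invariant condition 'negate' is tested on every iteration. */
void apply(double *y, const double *x, int n, int negate)
{
    for (int i = 0; i < n; i++) {
        if (negate)
            y[i] = -x[i];
        else
            y[i] = x[i];
    }
}

/* After unswitching: the test is made once, outside the loops. */
void apply_unswitched(double *y, const double *x, int n, int negate)
{
    if (negate) {
        for (int i = 0; i < n; i++)
            y[i] = -x[i];
    } else {
        for (int i = 0; i < n; i++)
            y[i] = x[i];
    }
}
```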

−LNO:unswitch_verbose=(ON|OFF)

−LNO:unswitch_verbose=ON prints verbose info to stdout on unswitching loops. Default is OFF.

−LNO:ou=N

This option indicates that all outer loops for which unrolling is legal should be unrolled by N, where N is a positive integer. The compiler unrolls loops by this amount or not at all.
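
The C sketch below shows an outer loop unrolled by 2 by hand, together with the wind-down loop that handles leftover iterations. It assumes that outer unrolling merges the copies of the inner loop (often called unroll-and-jam); the names and the fixed inner dimension are hypothetical, and the code is not compiler output.

```c
#define INNER 3

/* Original nest: one row of the matrix per outer iteration. */
void matvec(double *y, double a[][INNER], const double *x, int rows)
{
    for (int i = 0; i < rows; i++) {
        y[i] = 0.0;
        for (int j = 0; j < INNER; j++)
            y[i] += a[i][j] * x[j];
    }
}

/* Outer loop unrolled by 2: each x[j] is loaded once and used
 * for two rows. */
void matvec_ou2(double *y, double a[][INNER], const double *x, int rows)
{
    int i;
    for (i = 0; i + 1 < rows; i += 2) {
        y[i] = 0.0;
        y[i + 1] = 0.0;
        for (int j = 0; j < INNER; j++) {
            y[i] += a[i][j] * x[j];
            y[i + 1] += a[i + 1][j] * x[j];
        }
    }
    /* Wind-down loop: runs when rows is not a multiple of 2. */
    for (; i < rows; i++) {
        y[i] = 0.0;
        for (int j = 0; j < INNER; j++)
            y[i] += a[i][j] * x[j];
    }
}
```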

−LNO:ou_deep=(ON|OFF)

This option specifies that for loops with 3-deep (or deeper) loop nests, the compiler should outer unroll the wind-down loops that result from outer unrolling loops further out. This results in larger code size, but generates faster code (whenever wind-down loop execution costs are important). Default is ON.

−LNO:ou_further=N

This option specifies whether or not the compiler performs outer loop unrolling on wind-down loops. N must be specified and be an integer.