Release Notes: PathScale, LLC.
PathScale Compiler Suite Release 3.2-BETA

NOTE: The most current version of these notes is on the PathScale Website
============================================================

Copyright (C) 2007,2008 PathScale, LLC. All Rights Reserved.

Thank you for purchasing the PathScale Compiler Suite.

This file describes new features, bugs fixed, and known issues with
the PathScale Compiler Suite. Where possible, we provide workarounds
for known problems.

Support & contact information
-----------------------------

To report bugs, or for more information, send an email message to
support@pathscale.com.

Please report problems the first time you encounter them, even if they
are listed here as known issues. Knowing who has encountered a bug
helps us prioritize the bugs we know about.

New Features in 3.2
-------------------

New Linux Distribution

* Support for the Ubuntu 7.10 Linux distribution

GCC Compatibility and Features

* Nested function gcc extension fully supported (with gcc style
  restrictions)

* High profile bug 14254 fixed, which fixes runtime issues with
  OpenMPI natively compiled with PathScale Fortran

* Fortran 2003 C interoperability, comprising the "iso_c_binding"
  module, the "value" attribute and declaration, and the "bind"
  attribute and declaration. (The "enumeration" declaration was
  already implemented in the previous release.) For 3.2 beta,
  "iso_c_binding" has been qualified for -m64 only; it will be
  qualified for -m32 as well in the 3.2 final release.

* Fortran 2003 feature allowing the "intent" and "pointer" attributes
  to be used together.

* Fortran 2003 "volatile" attribute and declaration (the compiler
  provided a limited implementation of "volatile" prior to this
  release, but it has been extended and brought into conformance with
  the standard.)

Performance

* Performance improvements, at all levels in the compiler, which we
  believe will result in balanced boosts in performance on both
  integer and floating point code.
  We are known for performance on floating point, so it is very
  satisfying to see these integer increases as well.

  - Better loop vectorization (not that we were bad before)
  - Better loop factorization
  - Improved prefetch efficiency
  - Improved instruction scheduling
  - Improved SPEC configuration file reflecting these optimizations,
    included in the release and available on the website
  - Better tuning for multicore on both AMD and Intel chip sets
    (-march=barcelona and -march=core)

* This release has been tested on the following Linux distributions:

    Fedora Core 3, 4, 5, 6
    Fedora 7, 8
    RedHat Enterprise Linux 4 and 5
    SUSE Linux Enterprise Server 9
    SUSE Linux Enterprise Server 10
    SUSE Linux Professional 9.3
    SUSE Linux Professional 10
    SUSE Linux Professional 10.1
    openSUSE 10.3
    Ubuntu 7.10 (Gutsy Gibbon)

  NOTES:

  Portions of the PathScale Compiler Suite are 32-bit C++ applications
  built with the GNU 3.x tool chain. They require the 32-bit GNU 3.x
  C++ runtime libraries to be installed. This requirement also applies
  to the GNU 4.x based PathScale Compilers.

  SUSE Linux Enterprise 10 refers to both the Desktop and Server
  editions.

Known issues and workarounds
=============================

OpenMP support for C++
----------------------

OpenMP support for C++ is available only under -gnu4, which is the
default on GNU 4.x-based systems.

Thread-local storage
--------------------

The compiler supports thread-local storage declared by the GCC
__thread storage class.

Thread-local support is not yet available for position-independent
code. When compiling code that uses thread-local storage, -fPIC must
not be used.

Thread-local storage is also not supported under the GCC 3.x C/C++
front-ends (-gnu3).

pragma support
--------------

"#pragma options" and "#pragma frequency_hint" are supported only in
GCC 3.x compatibility mode (-gnu3). They are not yet supported when
compiling for GCC 4.x (-gnu4).
Installation
------------

The PathScale Compiler Suite is a 32-bit program, and requires a 32-bit
execution environment to run. We have found that some users do not have
the 32-bit environment installed on their system, which leaves the
compiler unable to run. This tends to be a problem only on systems
where the non-root tar installation is used or where "--force" is used
to install the RPMs.

Installation Ubuntu 7.10
- - - - - - - - - - - - - - - - - - - - - - - - - - -

On Ubuntu you need to "export MANPATH=$MANPATH:/opt/pathscale/man" to
find the manuals.

Dpkg installation has been tested. We are currently not supplying a
tar package, although we expect that the Fedora tar files would work.

For an x86-64 system that has all prerequisites satisfied, the
installation will be clean. For an x86-64 system with prerequisites
missing, there will be some errors, and Debian will prompt the user to
install the required packages. For i386, the main components get
installed OK.

The PathScale install script does not support Ubuntu (it is rpm
based). To install the Debian compiler packages on Ubuntu, please use
the following instructions:

    dpkg -i debs/*/*.deb
    dpkg -i subscription-server/pathscale-sub-server_3.1.99.14.465.ubuntu7.10.psc_amd64.deb
    dpkg -i pathdb/x86_64/pathscale-pathdb-*    (the user selects x86_64 or ia32)

Debian normally does not allow you to build different architecture
packages on a system (other than using utilities like linux32,
chroot..), so you cannot easily build an i386 package on amd64.

AMD64 packages can be installed on an ia32 system by the above
commands and adding --force-architecture.

RPM Installation on SUSE 9.3, and SLES 9
- - - - - - - - - - - - - - - - - - - - - - - - - - -

On the SUSE Linux distributions, RPM may report that no package
provides ld-linux-x86-64.so.2 when that is not the case. (This library
is provided by the glibc RPM.) This is a bug in the version of RPM
distributed with the specified SUSE environments.
Installing multiple versions on the same system:
------------------------------------------------

On systems with multiple versions of the PathScale compiler installed
using the RPMs, users may receive the following message when they try
to execute an application built with the 3.2 beta (3.1.99) release of
the PathScale Compiler Suite:

    ./a.out: relocation error: ./a.out: symbol memset.pathscale.opteron,
    version LIBPSCRT_1.0 not defined in file libpscrt.so.1 with link
    time reference

The problem is that the 3.2 release expects new symbols to be resolved
by one of the runtime libraries, and those symbols are not present in
older versions of that library.

When installation is done using the RPMs, the system /etc/ld.so.conf
is appended with:

    /lib/3.1.99/32
    /lib/3.1.99

If multiple installations exist on this system, the ld.so.conf will
contain entries for each of the releases in the order that they were
installed:

    /export/pathscale_alt/lib/3.1/32
    /export/pathscale_alt/lib/3.1
    /export/pathscale_alt_2/lib/3.1.99/32
    /export/pathscale_alt_2/lib/3.1.99

When the application is executed, it will find the shared library in
the older directory ( in the above case), which does not have the
necessary symbols in it, and that will result in some form of the
above error message.

Workarounds:
- - - - - -

1. Update /etc/ld.so.conf, removing all entries pointing to older
   installed versions of the compiler suite.

   Edit the file /etc/ld.so.conf (as root) and remove the lines for
   the older versions. The file will look something like this:

   $ cat /etc/ld.so.conf
   /usr/X11R6/lib64
   /usr/lib64/qt-3.1/lib
   /usr/X11R6/lib
   /usr/lib/qt-3.1/lib
   /opt/pathscale/lib/3.1/32        <=== remove this
   /opt/pathscale/lib/3.1           <=== remove this
   /opt/pathscale/lib/3.1.99/32     <=== keep this
   /opt/pathscale/lib/3.1.99        <=== keep this

   You then need to let the system know about this reconfiguration by
   running (as root):

   # /sbin/ldconfig

2. Uninstall the older versions of the compiler suite.
   (This way, older versions of the library will not be found and it
   will resolve to the most recent version.)

Some of the bugs fixed between 3.1 and 3.2 beta:

BUG    Description
===================
10985  regression in ipa: gnu4_c++: sanity tests don't compile with -ipa
14284  nested_func test: -Ofast + -apo: Signal Segmentation fault in phase MP Lowering
14116  regression in 3.1.99-69 : mochi suite euroben failed
14072  -g -finstrument-functions causes multiple definition of xxx
14160  regression in LNO; Effect of operator^=(rhs) is incorrect
14297  regression in 3.1.99-250 (BOOST1.34): undefined reference to `powi'
14226  nested_function + goto: Get_WN_Label: label 2 greater than last label 1
14227  nested_func + goto : ## RID == NULL, label 1 doesn't have a matching target
14228  nest_func: assembler error: symbol `.LBB7_main' is already defined
14230  nested_func/20010209-1: ## Compiler Error during IPA Startup phase; passes w/o -ipa
14242  nested_func + apo: test aborts at runtime
14253  regression at -O3: Segmentation fault in phase WN_Instrument; passes -INLINE:=off
14281  nested_func test aborts with -finstrument-function
14308  gcc flag unknown to pathcc : -fno-builtin-abs: unknown flag
13044  wrong target .so (was: MIPS scheduler broken)
14219  freepooma: -mp causes ### POOMA Assertion Failure ###
14161  Provide method to disable Fortran runtime segv handler
14255  Fortran crash due to use on a local variable of attributes meant for dummy arg
14296  SHAPE95 regression at -O3: passed -LNO:hoistif=off
14023  -O3 -ipa -finstrument-functions fails; passes -INLINE:=off
14218  -finstrument-functions causes wide characters tests seg. fault at gnu4 system
13728  boost1.34/partial_regex_match -O2 runtime failure; passes with -WOPT:iv_recog=0
13826  REGRESSIONS_FTN/dfd_input: -static-data causes Error: attempt to .org/.space backwards
14285  nested_func test: aborts -O2 -ipa; passes -O2 -ipa -CG:all_sched=0
14307  GoTo test - Segmentation fault in phase Global Optimization -- Dead Store
13686  omni_cc: parallel for 003 fail at -gnu42
13690  c++ mp: LIBSTDC++/ostream_exception aborts if -fcxx-openmp
13723  boost1.34: -fcxx-openmp causes test seg. fault at runtime
14114  NAS_PAR_EP_MP_CC regression in 3.1.99-67 on eng-63 (all opt. level)

Known issues in 3.1 (this list will be updated for 3.2 Final):

[Bug 595] Some AMD64 glibc distributions include broken obstack code
[Bug 949] C, C++: complex integer data types not supported
[Bug 1316] F77: %loc extension not supported
[Bug 1320] Fortran: some kinds of variable alignment are not supported
[Bug 2312] IPA: linking may fail if filenames contain shell metacharacters
[Bug 2395] The implementation of __builtin_return_address is not complete.
[Bug 2446] IPA linker does not handle .a files containing IPA .o files correctly
[Bug 2509] Control of floating-point trapping behavior
[Bug 2809] C: compiler handles unspecified array sizes incorrectly
[Bug 2896] Some fast math routines are only available in the static math library
[Bug 3289] pathf90: writing to a constant passed as an argument causes program to abort
[Bug 3697] GNU `used' attribute not supported
[Bug 3830] C: Incomplete debug information in nested lexical scopes
[Bug 4374] IPA: "make: warning: Clock skew detected.
[Bug 4433] pathf90: Fortran programs without a "main" will appear to successfully link
[Bug 4716] Improper use of Fortran IOSTAT in data transfer list causes compiler error
[Bug 5090] Fortran OpenMP: statically nested parallel constructs not supported.
[Bug 5195] C OpenMP FIRSTPRIVATE and LASTPRIVATE on same variable for PARALLEL DO
[Bug 5882] OpenMP: Serial version of a parallel region is not localizing "private" variables.
[Bug 5952] pathf90: On 32-bit the logic to automatically set stack size may not work
[Bug 6236] pathf90: The Fortran compiler currently does not support IEEE intrinsics
[Bug 6259] pathf90: Restricted expressions use Fortran 90 rules.
[Bug 7728] Inline assembly '=A' constraint not supported
[Bug 8502] Cray Fortran 'buffer out' extension doesn't work
[Bug 10192] pathf90 cpp preprocessor does not handle ## concatenation
[Bug 10388] pathf95: Use of cpp with directives before or between continuations


[Bug 2446] IPA linker does not handle .a files containing IPA .o files correctly

The IPA linker attempts to correctly handle archive files containing
.o files compiled with -ipa, but this support is not complete.

While a regular linker will only use .o files that export needed
symbols, the IPA linker uses all .o files inside the archive, which
can often lead to undefined symbol errors.

To work around this, extract the .o files you need from the .a file,
and link explicitly against those files.

[Bug 2809] C: compiler handles unspecified array sizes incorrectly

If the compiler encounters C code that uses ISO C99 initializers to
initialize an array of flexible size, it will generate assembly code
that causes the assembler to issue an "attempt to move .org backwards"
error.

An example of this syntax usage is as follows:

> struct K {
>     int i;
>     int f[];
> };
> struct K a = {
>     9,
>     { 1, 4, 5 }
> };
> struct K b = {
>     9,
>     { 1, 4, 5, 12, 14,9,0,9 }
> };

A workaround is to specify a size in the array declaration.
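
The workaround can be sketched as follows. This is a minimal
illustration, assuming a fixed capacity large enough for the longest
initializer; the macro K_CAPACITY and the helper k_sum are hypothetical
names introduced here and are not part of the original test case.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical capacity: large enough for the longest initializer below. */
#define K_CAPACITY 8

struct K {
    int i;
    int f[K_CAPACITY];  /* explicit size instead of the flexible "int f[]" */
};

/* Elements not listed in an initializer are set to zero. */
struct K a = { 9, { 1, 4, 5 } };
struct K b = { 9, { 1, 4, 5, 12, 14, 9, 0, 9 } };

/* Hypothetical helper: sum the array so the initialization can be checked. */
int k_sum(const struct K *k)
{
    int total = 0;
    size_t n;

    assert(k != NULL);
    for (n = 0; n < K_CAPACITY; n++)
        total += k->f[n];
    return total;
}
```

With an explicit size, every struct K object has the same size, so the
shorter initializer leaves unused (zeroed) elements. That is the cost
of avoiding the flexible array member.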

[Bug 2509] Control of floating-point trapping behavior

The PathScale compilers support the following options for controlling
floating-point traps:

> -TENV:X=(0..4)
>     Specify the level of enabled exceptions that will be
>     assumed for purposes of performing speculative code
>     motion (default is level 1 at all optimization levels)
>     In general, an instruction will not be speculated (i.e.
>     moved above a branch by the optimizer) unless any
>     exceptions it might cause are disabled by this option.
>
>     Level 0 - No speculative code motion may be performed.
>
>     Level 1 - Safe speculative code motion may be performed,
>               with IEEE-754 underflow and inexact exceptions
>               disabled.
>
>     Level 2 - All IEEE-754 exceptions are disabled except
>               divide by zero.
>
>     Level 3 - All IEEE-754 exceptions are disabled including
>               divide by zero.
>
>     Level 4 - Memory exceptions may be disabled or ignored.
>
> -TENV:simd_imask=(ON|OFF)
>     Default is ON. Turning it OFF unmasks SIMD floating-point
>     invalid-operation exception.
>
> -TENV:simd_dmask=(ON|OFF)
>     Default is ON. Turning it OFF unmasks SIMD floating-point
>     denormalized-operand exception.
>
> -TENV:simd_zmask=(ON|OFF)
>     Default is ON. Turning it OFF unmasks SIMD floating-point
>     zero-divide exception.
>
> -TENV:simd_omask=(ON|OFF)
>     Default is ON. Turning it OFF unmasks SIMD floating-point
>     overflow exception.
>
> -TENV:simd_umask=(ON|OFF)
>     Default is ON. Turning it OFF unmasks SIMD floating-point
>     underflow exception.
>
> -TENV:simd_pmask=(ON|OFF)
>     Default is ON. Turning it OFF unmasks SIMD floating-point
>     precision exception.
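
The six SIMD mask options above are named after the six SSE
floating-point exception classes. On x86 processors those exceptions
are masked through bits 7 to 12 of the MXCSR control register, and the
sketch below decodes those bits from an MXCSR value. The pairing of
each -TENV:simd_*mask option with an MXCSR bit is an assumption
inferred from the option names; it is not stated in these notes. The
names simd_mask_bit, MXCSR_DEFAULT, and simd_exception_is_masked are
hypothetical.

```c
#include <assert.h>

/* MXCSR exception-mask bits as laid out by the x86 SSE architecture.
   Pairing them with the -TENV:simd_*mask options is an assumption
   based on the option names. */
enum simd_mask_bit {
    SIMD_IMASK = 1 << 7,   /* invalid operation   (simd_imask) */
    SIMD_DMASK = 1 << 8,   /* denormal operand    (simd_dmask) */
    SIMD_ZMASK = 1 << 9,   /* divide by zero      (simd_zmask) */
    SIMD_OMASK = 1 << 10,  /* overflow            (simd_omask) */
    SIMD_UMASK = 1 << 11,  /* underflow           (simd_umask) */
    SIMD_PMASK = 1 << 12   /* precision (inexact) (simd_pmask) */
};

/* Architectural default MXCSR value: all six exceptions masked. */
#define MXCSR_DEFAULT 0x1F80u

/* Returns 1 if the exception is masked (no trap), 0 if it is unmasked. */
int simd_exception_is_masked(unsigned int mxcsr, enum simd_mask_bit bit)
{
    return (mxcsr & (unsigned int)bit) != 0;
}

/* Usage sketch: clear the zero-divide mask bit, as "simd_zmask=OFF"
   is assumed to do, and confirm that only that exception is unmasked. */
void simd_mask_demo(void)
{
    unsigned int mxcsr = MXCSR_DEFAULT & ~(unsigned int)SIMD_ZMASK;

    assert(!simd_exception_is_masked(mxcsr, SIMD_ZMASK));
    assert(simd_exception_is_masked(mxcsr, SIMD_OMASK));
}
```

The functions take the register value as a plain integer so that the
sketch stays portable; on an x86 system the live value could be read
with the _mm_getcsr() intrinsic from <xmmintrin.h>.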

[Bug 6236] pathf90: The Fortran compiler currently does not support IEEE intrinsics

The current release of the Fortran compiler does not implement the
IEEE floating point intrinsics:

> clear_ieee_exception
> disable_ieee_exception
> enable_ieee_exception
> get_ieee_exception
> get_ieee_interrupts
> get_ieee_rounding_mode
> get_ieee_status
> ieee_class
> ieee_next_after
> ieee_unordered
> set_ieee_exception
> set_ieee_exceptions
> set_ieee_interrupts
> set_ieee_rounding_mode
> set_ieee_status
> test_ieee_exception
> test_ieee_interrupt

[Bug 2312] IPA: linking may fail if filenames contain shell metacharacters

The IPA linker may produce strange error messages and fail to link if
a file name passed to it contains a shell or "make" metacharacter.
The list of characters to avoid is as follows:

    ;:$()[]{}<>%

[Bug 2395] The implementation of __builtin_return_address is not complete.

The current implementation of __builtin_return_address appears to only
work correctly for an argument of 0.

[Bug 2896] Some fast math routines are only available in the static math library

Some high-performance standard math library routines (single- and
double-precision versions of pow, fmin, fmax, finite, and copysign)
are only available in the static version of the math library
(libmpath) that we ship.

In order to benefit from these faster routines, you should link
explicitly against the static version of the math library. You can
find out the location of this library using the following command:

> pathcc -print-file-name=libmpath.a

The static version of the math library is faster in general than the
shared version. You should always use it if you want the highest
floating point performance.

[Bug 4374] IPA: "make: warning: Clock skew detected.

You may receive a "make: warning: Clock skew detected. Your build may
be incomplete." message when compiling with the '-ipa' option using a
remote file system such as NFS for the build directory.
The '-ipa' option currently uses the 'make' command, which can be
sensitive to differences in system clock between the system where the
compilation is taking place and the file server. Clock skew will not
affect the success of your build, but you can avoid the warning by
ensuring that the system times of the file server and the build server
are synchronized.

[Bug 5090] Fortran OpenMP: statically nested parallel constructs not supported.

The compiler does not support statically nested parallel constructs.
Such constructs may cause compilation or runtime failure. For
example, the following is not supported:

    !$OMP PARALLEL
    !$OMP PARALLEL
    !$OMP END PARALLEL
    !$OMP END PARALLEL

[Bug 5195] C OpenMP FIRSTPRIVATE and LASTPRIVATE on same variable for PARALLEL DO

The current implementation of OpenMP in the C compiler does not
support FIRSTPRIVATE and LASTPRIVATE on the same variable for PARALLEL
DO. The compiler will issue the following error if this is
encountered:

> Error: FIRSTPRIVATE and LASTPRIVATE on same variable not
> yet implemented for PARALLEL DO

[Bug 5571] pathCC: Inlining in the C++ front end.

The g++ 3.3 front end implements extremely aggressive inlining. We
have found that the increase in code size caused by this inlining may
cause the compiler's back end to have excessively long run times. As
a result, the default is to allow the compiler's back end to do the
inlining by turning off inlining in the C++ front end. In most cases
the back end inlining does as well as, if not better than, the C++
front end.

In some rare cases, allowing the C++ front end to do the inlining does
result in a faster executable. (This is typically true in cases where
g++ produces faster runtimes than pathCC.) To turn on front end
inlining use the -finline option. (NOTE: This option may go away in a
future release.) If that does result in better performance for your
application, please report it to PathScale support so we can further
refine the back end inlining.

[Bug 5882] OpenMP: Serial version of a parallel region is not localizing "private" variables.

The current OpenMP implementation may not create copies of private
variables in parallel regions in the single threaded case as required
by the OpenMP standard.

Workaround: Define the following environment variables:

> export PSC_OMP_SILENT=1
> export PSC_OMP_SERIAL_OUTLINE=1

[Bug 5952] pathf90: On 32-bit the logic to automatically set stack size may not work

For 32-bit applications the logic described in section 3.10 of the
User Guide may not work as documented. If the application aborts
during execution, try setting the stack size to a large value or
unlimited and see if that resolves the issue.

Workaround: Set the stack size to unlimited before running the
application:

> ulimit -s unlimited

[Bug 6259] pathf90: Restricted expressions use Fortran 90 rules.

The current Fortran compiler implements the rules for restricted
expressions in accordance with the Fortran 90 standard. As a result,
some valid Fortran 95 programs that rely on the broader definition of
restricted expressions may generate errors when compiled with pathf90.

[Bug 3830] C: Incomplete debug information in nested lexical scopes

The compiler omits the debugging output that a debugger needs to
differentiate between variables with identical names in nested lexical
scopes. For example:

> void foo() {
>     int i = 0 ;
>     { int i = 1 ;
>       { int i = 2 ;
>       }
>     }
> }

A debugger will not be able to tell the difference between the three
variables called 'i'.

[Bug 1316] F77: %loc extension not supported

The F77 %loc directive, which returns the address of a variable, is
implemented by some Fortran compilers such as g77, but not yet by the
PathScale compiler. We recommend using loc() instead.

[Bug 1320] Fortran: some kinds of variable alignment are not supported

Given a Fortran program fragment like this:

> character c(11)
> real r
> equivalence (r,c(2))

The intention of the fragment is that the variable c should start on a
7-byte alignment, i.e. that c(2) should be aligned with r. This is
not standard Fortran, but is an extension supported by some Fortran
compilers, e.g. g77.

The PathScale Fortran compiler does not currently support this kind of
alignment requirement, and will issue a compilation error.

[Bug 4433] pathf90: Fortran programs without a "main" will appear to successfully link.

Fortran programs that do not provide a "main" entry point will appear
to successfully link but will fail at runtime with the following
message:

> $ ./a.out
> Someone linked a Fortran program with no MAIN__!

Workaround: Provide a main program.

[Bug 7728] Inline assembly '=A' constraint not supported

The =A constraint indicates that both eax and edx are used to hold a
64-bit value. This allows the 64-bit value to be propagated to ret
without having to build it up from two 32-bit parts. The current
PathScale compilers do not implement this constraint correctly.

[Bug 595] Some AMD64 glibc distributions include broken obstack code

Some glibc distributions for AMD64 include broken obstack code, which
incorrectly mixes 32-bit and 64-bit references to stack data. This
can cause code that uses obstacks (such as gcc) to crash under some
circumstances, and may occur when using either pathcc or other
compilers to build code that uses obstacks.

There are two possible workarounds:

For packages such as gcc, use their internal obstack implementations
if available. For example, you can build gcc with -D_LIBC to do this.

You can also fix the obstack alignment mask manually before you use
any obstacks:

> obstack_alignment_mask (obstack_ptr) = 3;

[Bug 949] C, C++: complex integer data types not supported

Although the PathScale Compiler Suite fully supports floating point
complex numbers, it does not support complex integer data types, such
as "_Complex int".

Complex integers are a gcc "we did it because we could" extension to
ISO C99, and we have never seen any uses of them outside of the gcc
test suite.

[Bug 1066] C, C++: __builtin_strpbrk not implemented correctly

The gcc builtin function __builtin_strpbrk (gcc's implementation of
the standard strpbrk library function) is not implemented correctly,
and causes code to crash at runtime.

[Bug 3289] pathf90: writing to a constant passed as an argument causes program to abort

The Fortran compiler places constants in read-only memory. As a
result, a Fortran program that passes a constant to a subprogram will
abort if the subprogram tries to write to that argument.

Workaround: Use the option "-LANG:rw_const=on".

Note: The use of this workaround may result in a degradation in
performance.

[Bug 3697] GNU `used' attribute not supported

The GNU 'used' attribute is not currently supported. As a result,
functions that the compiler perceives as dead code are eliminated even
when they carry this attribute.

Workaround: Use the '-INLINE:dfe=0' option when compiling the code.

[Bug 4716] Improper use of Fortran IOSTAT in data transfer list causes compiler error

If an iostat variable is improperly used, the compiler may abort with
a compiler error similar to:

> $ pathf90 foo.f90 -O0
> ### Compiler Error in file foo.f90
> during Lowering phase:
> ### copyout_temp_to_var: unexpected type (M) in I/O processing
> pathf90 INTERNAL ERROR: /opt/pathscale/lib/2.0/be returned non-zero status 1

An example of the offending statement is:

    read(cval,fmt='(i4)',iostat=stat) i1

[Bug 8502] Cray Fortran 'buffer out' extension is not currently supported.

The Cray Fortran extension 'buffer out' is not currently supported.

[Bug 10192] pathf95's -cpp may not handle the ## operator correctly

The command "pathf95 -cpp" invokes cpp with the -traditional option so
that tabs are handled properly. Without -traditional, cpp will
convert tabs to single spaces, which may corrupt a fixed format file.
In traditional mode, however, cpp does not recognize the ## operator.

[Bug 10388] pathf95: Use of cpp with directives before or between continuations

The use of cpp as the preprocessor will cause # lines to be generated.
When these lines appear before a continuation statement, they will
cause the compile to fail with an "unexpected syntax" message.