CPMD vs. Linux: Tips and Downloads
Introduction

On Linux systems, in contrast to most other 'workstation-type' operating systems, no single company or consortium enforces 'the standard way(TM)' of organizing the file system layout or configuring system services (although there are always attempts to establish one). Programs like CPMD therefore cannot easily provide a one-size-fits-all configuration (this is already quite difficult for some of the more 'controlled' operating systems). Matters are complicated further by the fact that several (commercial) Fortran compilers are available for Linux and that these compilers are mostly mutually incompatible. Since the default Linux Fortran compiler, GNU g77, is not sufficient to compile CPMD, while most ready-to-use precompiled libraries are configured for g77, compiling CPMD under Linux can become pretty tricky (especially when you want to run CPMD in parallel).
Disclaimer: Linux Fortran Compilers for CPMD

CPMD makes heavy use of the 'Cray Pointer' extension to Fortran 77 for dynamic memory management. In addition, a Fortran 90/95 compiler is required on some platforms and whenever the Gromos/Amber QM/MM interface is used. The g77 compiler, part of the GNU Compiler Collection and the default Fortran compiler on Linux machines, is therefore not sufficient to compile CPMD. The same is (currently) true for the G95 compiler, which is still in heavy development.

The most popular alternatives (in no particular order) are the Intel Fortran Compiler, the Portland Group Fortran Compiler, the Absoft Fortran Compiler, and the Lahey/Fujitsu Fortran Compiler. For the x86_64 (Athlon64/Opteron) platform there is also the Pathscale Compiler. All of these are commercial, license-managed compilers. For most of them, however, you can get a trial or evaluation license, so you can check whether the compiler works for you before you have to pay the (quite sizeable) license/subscription fee(s).

For the Intel Fortran Compiler you can also get a license for the Non-Commercial Unsupported Version free of charge. This version is identical to the commercial version, with the restrictions that the license is personal and non-transferable and that you are not allowed to sell the compiled executables. As of version 8.1 it also supports the EM64T instruction set, which makes it usable on the x86_64 (Athlon64/Opteron) platform as well.

A nice overview of compiling and running Fortran programs under Linux can be found at http://www.nikhef.nl/~templon/fortran.html. A nice paper on how to get the most out of your compiler is at http://www.fortran-2000.com/ArnaudRecipes/CompilerTricks.html.

Linux Configuration Files

As stated above, there are many different Linux installations, so compiling CPMD for your local machines can be a tricky hurdle to clear before you can actually start to use CPMD.
The default CPMD distribution provides a selection of configurations that, most of the time, have to be adapted to the local installation. Starting with CPMD version 3.9.x, new configurations can be added without changing the Configure script: you only need to add files with the proper definitions to the CONFIGURE subdirectory. The following additional configurations, adapted for our local installation, are available for download and may help to get you started.
Optimized LAPACK/BLAS/ATLAS Library Binaries

These are unified LAPACK and BLAS libraries based on the ATLAS library and the LAPACK/BLAS sources from netlib. They should give close to optimal CPMD performance on the platforms they were tuned for. Special care has been taken that none of the libraries here requires any other library to be installed (besides the ubiquitous libc and libm), and in particular not the Fortran compiler runtime libraries. All routines that need access to the Fortran compiler runtime were replaced with Fortran-compatible counterparts written in plain C. As a consequence, the libraries should be compatible with all currently available Linux Fortran compilers.

To use them, simply copy (or symlink) them to your compilation directory under the name libatlas.a and use '-L. -latlas' as linker flags (and delete all other linker flags related to lapack, blas, mkl etc.). The libraries are also available as RPM packages that emulate a 'normal' BLAS/LAPACK/ATLAS installation.

Please drop me a note at axel.kohlmeyer@theochem.ruhr-uni-bochum.de if you want to be notified (via email) when I update the libraries (which happens rather infrequently).

Recent changes:
NOTES: The libraries use the special instructions of the respective CPUs, so the resulting binaries are not portable between platforms.

RPM Packages

The following RPMs contain the ATLAS binaries from above repackaged as RPMs. They were packaged with the oldest RPM version available to me, so they should be compatible with all current RPM-based distributions. The libraries themselves are identical to the ones above, so there is no need to download both.
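The manual installation described above (copy or symlink the library into the compilation directory as libatlas.a) can be sketched as a small shell function. This is only an illustration: the function name link_atlas and the example paths are hypothetical, while the name libatlas.a and the linker flags '-L. -latlas' are taken from the text.

```shell
# Sketch: make a downloaded ATLAS/LAPACK archive available as libatlas.a
# in the CPMD compilation directory.
# link_atlas is a hypothetical helper; both paths are supplied by the caller.
link_atlas() {
    lib=$1        # absolute path to the downloaded library archive
    builddir=$2   # CPMD compilation directory
    [ -f "$lib" ] || { echo "no such library: $lib" >&2; return 1; }
    [ -d "$builddir" ] || { echo "no such directory: $builddir" >&2; return 1; }
    # -f replaces an older link of the same name. The link must be called
    # libatlas.a so that the linker flags '-L. -latlas' find it.
    ln -sf "$lib" "$builddir/libatlas.a"
}

# Example call (hypothetical paths):
#   link_atlas /path/to/downloaded/library.a /path/to/cpmd/build
```

Pass an absolute path for the library, since a relative symlink target would be resolved relative to the compilation directory and not to the directory you called the function from.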
CPMD compatible LAM-MPI RPMs for Linux

These RPMs provide binaries of the LAM-MPI message passing library, which can be used to run CPMD in parallel on shared memory SMP computers, on clusters of networked Linux PCs over TCP/IP, or on a combination thereof. In contrast to the standard MPI RPMs provided by RedHat, SuSE and other distributions, these RPMs were configured to be compatible with g77 as well as Intel ifc and PGI's pgf77/pgf90.

To compile with the different compilers just use mpif77 (for g77), mpiifc (for Intel ifc), mpiifort (for Intel ifort), mpipgf77 (for PGI's pgf77), mpipgf90 (for PGI's pgf90), or mpifort (for the Compaq Fortran Alpha Linux Compiler). You can easily adapt the mpi-wrapper scripts to other compilers as well (as long as they have the same underscoring conventions).

I have not tested the RPMs on other distributions, so you have to figure out which version matches yours. Alternatively, you can download the source RPM and build a matching binary RPM yourself (by running rpmbuild --rebuild on the source RPM).

NOTE: different Fortran compilers usually produce mutually incompatible object files, so you should compile all Fortran sources with the same compiler (or compiler wrapper script) to get a usable executable.

Please drop me a note at axel.kohlmeyer@theochem.ruhr-uni-bochum.de if you want to be notified (via email) when I update the RPMs (which happens rather infrequently).
Using the Intel Fortran Compiler with/without the Intel MKL

With version 8.0 Intel has switched the Fortran frontend. It now uses (almost) the same frontend as the DEC/Compaq/HP Alpha compiler, which has two major consequences: a) all write statements are now synchronous, which makes it easier to follow files while they are written but incurs a performance hit, especially on large networked file systems, and b) the resulting binaries need a lot more stack memory (technical explanation: local automatic variables are allocated via alloca(3) instead of malloc(3)). As a consequence, you have to raise the stack size limit in your shell or CPMD will crash with a segmentation fault. ulimit -a (for a Bourne/Korn shell) or limit (for (t)csh) will show you the current settings. You should set the limit to at least 320000 kbytes via ulimit -s 320000 or limit stacksize 320000, respectively. Note that this can be especially crucial (and difficult to debug) for parallel jobs.

The Intel Fortran compiler for Linux uses a large number of shared libraries. This makes the resulting binaries highly nonportable unless you link them statically, so using '-i-static', '-static-libcxa' or '-static' as a link option is highly recommended. If this does not work for one reason or another, you should at least try to link all compiler-provided libraries statically. This gets even more complicated if you use the Intel Math Kernel Library (MKL). With the following set of flags you will link every Intel library statically, but pthread and libc dynamically (tested with Intel IFC v7.1 and MKL v5.2):

    -static-libcxa -Xlinker -Bstatic -lsvml \
    -L/opt/intel/mkl/lib/32/ -lmkl_lapack -lmkl_p4 \
    -lguide -Vaxlib -Xlinker -Bdynamic -lpthread
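As a convenience, the stack size check described above can be scripted for a Bourne-type shell and run before starting a job. This is a minimal sketch: the helper name stack_ok is hypothetical, and the threshold is simply the 320000 kbytes recommended above.

```shell
# Sketch: warn when the soft stack limit is below the recommended value.
REQUIRED_KB=320000   # value recommended in the text

# stack_ok CURRENT REQUIRED: succeeds if CURRENT (in kbytes, or the word
# "unlimited") is at least REQUIRED. Hypothetical helper name.
stack_ok() {
    [ "$1" = "unlimited" ] && return 0
    [ "$1" -ge "$2" ]
}

current=$(ulimit -s)
if stack_ok "$current" "$REQUIRED_KB"; then
    echo "stack limit ok: $current"
else
    echo "stack limit too small: $current kbytes; try 'ulimit -s $REQUIRED_KB'" >&2
fi
```

The script only reports the problem and does not change the limit itself, because a limit raised inside a script would not carry over to the shell that later starts CPMD.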
If you are using RedHat 9 (or any newer distribution that uses the new native POSIX threads library) and an older Intel Fortran Compiler (up to version 7.1), you have to link everything dynamically. Upgrading to version 8 of the Intel compilers removes this limitation. If upgrading is not possible, dynamic linking can be achieved by using ifc -i_dynamic instead of plain ifc. To avoid having to set LD_LIBRARY_PATH every time, you can use the following set of flags to hardcode the default path of those shared libraries into the binary:

    -lsvml -Xlinker -rpath=/opt/intel/mkl/lib/32/ \
    -Xlinker -rpath=/opt/intel/compiler70/ia32/lib \
    -L/opt/intel/mkl/lib/32/ -lmkl_lapack -lmkl_p4 \
    -lguide -Vaxlib

If you run an MKL-linked CPMD binary on a multiprocessor Intel Xeon or a Hyperthreading-enabled desktop machine, you should be aware that this may prompt the library to use multiple threads internally. Depending on your configuration, this may interfere with the intended use of the machine and lead to suboptimal performance. Setting the environment variable OMP_NUM_THREADS to 1 via export OMP_NUM_THREADS=1 or setenv OMP_NUM_THREADS 1 will disable this 'feature'.

Of course you can avoid some of the trouble by using the combined LAPACK/ATLAS libraries from the section above. So far the performance loss compared to MKL for real-world applications seems to be quite small (~5%).

Compiling CPMD for OpenMP

To compile CPMD for OpenMP, you start by creating a regular makefile, as if you wanted to compile a standard serial or MPI-parallel executable. Next you need to tell the compiler to look for OpenMP directives. With the Portland Group compilers (pgf77/pgf90) this is done by adding the flag '-mp' to the FC and LD makefile variables, e.g.:

    CC = gcc -O2 -Wall -D_REENTRANT
    FC = pgf77 -c -fast -mp -tp athlon -D_REENTRANT
    LD = pgf77 -fast -mp -tp athlon -D_REENTRANT

For the Intel Linux Fortran compiler the equivalent compiler flags are -fpp -openmp.
As of 08/2004 I have been able to compile a working OpenMP executable with the Intel compiler Version 8.0 Build 20040412Z; Version 8.1-020 works as well. You need to register with Intel Premier Support (currently at no extra cost, even for the non-commercial version) to get access to the updated binaries (the stock v8.0 compiler does not work). You then have to compile everything without the OpenMP flags, delete the files util.o, mltfft.o, fftnew.o, and fftutil.o, change the Makefile for OpenMP, and type 'make'. This will recompile only those files for OpenMP (they also provide the largest part of the OpenMP speed gain) and link appropriately. As of CPMD version 3.9.2, OpenMP compilation of the whole package should work (again).

To enable OpenMP parallelization during a CPMD run, you have to set the environment variable OMP_NUM_THREADS to the number of CPUs you want to use for OpenMP (usually 2 on SMP Linux PCs). So you have to type either

    OMP_NUM_THREADS=2
    export OMP_NUM_THREADS

if you have a Bourne shell, or

    setenv OMP_NUM_THREADS 2

if you have a C shell. You then start the CPMD job and should see multiple cpmd threads/processes instead of a single one. The number of active OpenMP threads should also be visible in the CPMD output, e.g.:

    OPENMPOPENMPOPENMPOPENMPOPENMPOPENMPOPENMPOPEN
    NUMBER OF CPUS PER TASK 2
    OPENMPOPENMPOPENMPOPENMPOPENMPOPENMPOPENMPOPEN

Due to changes in the threads interface in Linux, you may need to tell the dynamic linker which interface to use when running OpenMP applications, by setting the environment variable LD_ASSUME_KERNEL. So you have to type either

    LD_ASSUME_KERNEL=2.4.20
    export LD_ASSUME_KERNEL

if you have a Bourne shell, or

    setenv LD_ASSUME_KERNEL 2.4.20

if you have a C shell.

The tricky part is to make this work for a parallel run. With LAM-MPI this may be done by adding the flag ' -x OMP_NUM_THREADS=2 ' to the mpirun command line. For Scali MPI you just have to set the environment as in the serial case.
I have not yet tried any other Linux MPI implementation with OpenMP. If all else fails, you could try to write a short shell wrapper for your parallel CPMD executable, which could look like this:

    #!/bin/sh
    OMP_NUM_THREADS=2
    LD_ASSUME_KERNEL=2.4.20
    export OMP_NUM_THREADS LD_ASSUME_KERNEL
    exec /path/to/my/real/cpmd-omp.x "$@"

Be prepared for depressing results on PC-style hardware. For a small number of tasks, the MPI parallelization in CPMD is significantly better than the OpenMP parallelization, and the overhead of spawning multiple threads frequently outweighs the speed gain. And although a combination of MPI and OpenMP seems to be a smart choice for running a CPMD job on a large number of CPUs (particularly if you do not have a high-speed interconnect), so far I have not found any significant gain from that kind of parallel configuration on current PC hardware. In fact, I was usually better off either using only MPI for all CPUs (connecting SMP CPUs via local shared memory) or not using the second CPU at all (the latter especially on dual Pentium-4/Xeon machines).

Reading binary files from other platforms

CPMD stores its restart information in a so-called 'unformatted' (i.e. binary) file format. While this greatly reduces the size of the file compared to a (formatted) text file (and retains full precision), it poses a problem when you want to continue (or restart) a run on a different platform. The same is true if you are using Vanderbilt ultra-soft pseudopotential files in CPMD, which are also read in 'unformatted' (there is a tool in the uspp distribution to convert the binary file to text and back, so you can move the text version to the new platform and create a new unformatted file from it).
The unformatted output is mostly a 1:1 copy of the memory contents with block size indicators added. Since the organization of multi-byte variables (like real*8) in computer memory differs between platforms, these files are not generally interchangeable between platforms. There is some hope, though. Most CPUs and operating systems currently support the IEEE-754-1985 standard for floating point numbers and use one of two ways to store the data internally: big-endian or little-endian. Between conforming machines you can exchange the binary files as long as the machines have the same endianness. Some compilers even allow you to compile binaries for the opposite endianness (see below) or to convert on the fly.
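To find out which of the two byte orders a given machine uses, a quick check from the shell is possible. The following is a sketch under the assumption that the standard printf and od utilities are available; the function name host_endian is hypothetical.

```shell
# Sketch: report the byte order of the machine this runs on.
# host_endian is a hypothetical helper name.
host_endian() {
    # Write the two bytes 0x01 0x02 and let od read them back as one
    # 16-bit unsigned integer in host byte order.
    value=$(printf '\001\002' | od -An -tu2 | tr -d ' \n')
    case "$value" in
        513) echo little ;;   # 0x0201: the first byte is the low byte
        258) echo big ;;      # 0x0102: the first byte is the high byte
        *)   echo unknown ;;
    esac
}

host_endian
```

Two machines that both report the same value here can, per the paragraph above, exchange unformatted files directly as long as both follow the IEEE floating point standard.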
How to enable on-the-fly endian conversion under Linux:

If you have the Portland Group Compiler (pgf77/pgf90), you can use the -Mbyteswapio or the -byteswapio flag when compiling CPMD. Your CPMD (or other) binary will then read and write only big-endian unformatted data files. Keep that in mind when you are using Vanderbilt USPPs, because the pseudopotential files need to have the same endianness. An error message like

    PGFIO-F-219/unformatted read/unit=22/attempt to read/write past end of record

is usually an indicator of such an endianness mismatch.

For the Intel Fortran compiler for Linux (ifc/efc) you don't have to recompile. Simply set the environment variable F_UFMTENDIAN to 'big' (i.e. with export F_UFMTENDIAN=big if you are in a Bourne/Korn shell, or setenv F_UFMTENDIAN big if you are in a (t)csh). Check your compiler documentation for more details (search for endian).

If you have a DEC/Compaq/HP Alpha machine and use the Compaq Fortran Compiler for Alpha, you can use the compiler flag -convert big_endian to compile an executable that is able to read and write big-endian unformatted binary files. Again, check your compiler documentation for more details (search for endian).
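Before choosing one of the conversion options above, it can help to guess which byte order an existing unformatted file was written in. The following heuristic is only a sketch and rests on two assumptions that are not stated in the text: that the file starts with a 4-byte record length marker (the 'block size indicator'), and that the first record is short, so that the correct byte order is the one giving the smaller length. The function name guess_file_endian is hypothetical.

```shell
# Sketch: guess the byte order of a Fortran unformatted file from its
# first record marker. Assumes a 4-byte marker and a short first record.
# guess_file_endian is a hypothetical helper name.
guess_file_endian() {
    # Read the first four bytes as unsigned decimal numbers, one per byte.
    set -- $(od -An -tu1 -N4 "$1")
    [ $# -eq 4 ] || { echo unknown; return 1; }
    # Interpret the same four bytes as a big-endian and as a
    # little-endian 32-bit integer.
    be=$(( (($1 * 256 + $2) * 256 + $3) * 256 + $4 ))
    le=$(( (($4 * 256 + $3) * 256 + $2) * 256 + $1 ))
    if [ "$be" -lt "$le" ]; then
        echo big
    elif [ "$le" -lt "$be" ]; then
        echo little
    else
        echo unknown    # both readings agree, no decision possible
    fi
}

# Example call (hypothetical file name):
#   guess_file_endian /path/to/unformatted-file
```

If the guess differs from the byte order of the machine you want to run on, one of the compiler-specific options described above is needed.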
Disclaimer / Author of this page: Axel.Kohlmeyer@theochem.ruhr-uni-bochum.de Source File: cpmd-linux.wml (Fri May 27 11:52:07 2005) ($Revision: 1.29 $) Translated to HTML: Mon Oct 10 00:07:28 2005