1.4.2 (revision 8839)
Score-P provides several possibilities to instrument user application code. Besides the automatic compiler-based instrumentation (Section 'Automatic Compiler Instrumentation' ), it provides manual instrumentation using the Score-P User API (Section 'Manual Region Instrumentation' ), semi-automatic instrumentation using POMP2 directives (Section 'Semi-Automatic Instrumentation of POMP2 User Regions' ) and, if configured, automatic source-code instrumentation using the PDToolkit-based instrumenter (Section 'Source-Code Instrumentation Using PDT' ).
As well as user routines and specified source regions, Score-P currently supports the following kinds of events:
MPI library calls:
Instrumentation is accomplished using the standard MPI profiling interface PMPI. To enable it, the application program has to be linked against the Score-P MPI (or hybrid) measurement library plus MPI-specific libraries. Note that the Score-P libraries must be linked before the MPI library to ensure interposition will be effective.
SHMEM library calls:
Instrumentation is accomplished using the SHMEM profiling interface or the GNU linker for library wrapping. To enable it, the application program has to be linked against the Score-P SHMEM (or hybrid) measurement library plus SHMEM-specific libraries. Note that the Score-P libraries must be linked before the SHMEM library to ensure interposition will be effective.
OpenMP directives & API calls:
The Score-P measurement system uses the OPARI2 tool for instrumentation of OpenMP constructs. See the OPARI2 documentation on how to instrument OpenMP source code. In addition, the application must be linked with the Score-P OpenMP (or hybrid) measurement library.
The Score-P instrumenter command scorep automatically takes care of compilation and linking to produce an instrumented executable, and should be prefixed to compile and link commands. Often this only requires prefixing definitions for CC or MPICC (and equivalents) in Makefiles.
Usually the Score-P instrumenter scorep is able to automatically detect the programming paradigm from the set of compile and link options given to the compiler. In some cases, however, when the compiler or compiler wrapper enables a specific programming paradigm by default (e.g., Pthreads on Cray and Blue Gene/Q systems), scorep needs to be made aware of the programming paradigm in order to perform the correct instrumentation. Please see scorep --help for the available options.
When using Makefiles, it is often convenient to define a "preparation preposition" placeholder (e.g., PREP) which can be prefixed to (selected) compile and link commands:
MPICC = $(PREP) mpicc
MPICXX = $(PREP) mpicxx
MPIF90 = $(PREP) mpif90
These can make it easier to prepare an instrumented version of the program with
make PREP="scorep"
while default builds (without specifying PREP on the command line) remain fully optimized and without instrumentation.
In order to instrument applications which employ GNU Autotools for building, the following instrumentation procedure has to be used:
1. Configure the application as usual, but provide the additional argument:
--disable-dependency-tracking
2. Build the application using the make command with the compiler specification variables set as follows:
make CC="scorep <your-cc-compiler>" \
CXX="scorep <your-cxx-compiler>" \
FC="scorep <your-fc-compiler>" ...
When compiling without the Score-P instrumenter, the scorep-config command can be used to simplify determining the appropriate linker flags and libraries, or include paths:
scorep-config [--mpp=none|--mpp=mpi|--mpp=shmem] \
[--thread=none|--thread=omp|--thread=pthread] --libs
The --mpp=<paradigm> switch selects which message passing paradigm is used. Currently, Score-P supports applications using MPI ( --mpp=mpi ) or SHMEM ( --mpp=shmem ) and applications without any message passing paradigm. It is not possible to specify two message passing systems for the same application. The --thread=<paradigm> switch selects which threading system is used in Score-P. You may use OpenMP ( --thread=omp ), no threading system ( --thread=none ) or POSIX threading system ( --thread=pthread ). It is not possible to specify two threading systems for the same application. However, you may combine a message passing system with a threading system.
The scorep-config command can also be used to determine the right compiler flags for specifying the include directory of the scorep/SCOREP_User.h or scorep/SCOREP_User.inc header files. When compiling without using the Score-P instrumenter, necessary defines and compiler instrumentation flags can be obtained by calling one of the following, depending on the language:
scorep-config --cflags [<options>]
scorep-config --cxxflags [<options>]
scorep-config --fflags [<options>]
If you compile a C file, you should use --cflags. If you compile a C++ file, you should use --cxxflags. And if you compile a Fortran source file, you should use --fflags.
With the additional options it is possible to select the adapter used, the threading system and the message passing system. For each adapter, Score-P provides a pair of flags of the form --adapter and --noadapter (replace adapter with the name of the adapter). This makes it possible to obtain options for non-default instrumentation possibilities. E.g., --user enables manual instrumentation with the Score-P user API, and the --nocompiler option disables compiler instrumentation.
Score-P supports a variety of instrumentation types for user-level source routines and arbitrary regions, in addition to fully-automatic MPI and OpenMP instrumentation, as summarized in the following table.
Type of instrumentation | Instrumenter switch | Default value | Instrumented routines | Runtime measurement control |
---|---|---|---|---|
MPI | --mpp=mpi | (auto) | configured by install | (see Sec. Selection of MPI Groups ) |
SHMEM | --mpp=shmem/--mpp=none | (auto) | configured by install | — |
OpenCL | --opencl/--noopencl | enabled | configured by install | — |
OpenMP | --thread=omp | (auto) | all parallel constructs | — |
Pthread | --thread=pthread | (auto) | Basic Pthread library calls | — |
Compiler (see Sec. Automatic Compiler Instrumentation ) | --compiler/--nocompiler | enabled | all | Filtering (see Sec. Filtering ) |
PDT instrumentation (see Sec. Source-Code Instrumentation Using PDT ) | --pdt/--nopdt | disabled | all | Filtering (see Sec. Filtering ) |
POMP2 user regions (see Sec. Semi-Automatic Instrumentation of POMP2 User Regions ) | --pomp/--nopomp | depends on OpenMP usage | manually annotated | Filtering (see Sec. Filtering ) |
Manual (see Sec. Manual Region Instrumentation ) | --user/--nouser | disabled | manually annotated | Filtering (see Sec. Filtering ) and selective recording (see Sec. Selective Recording ) |
When the instrumenter determines that MPI or OpenMP is being used, it automatically enables MPI library instrumentation or OPARI2-based OpenMP instrumentation, respectively. The default set of instrumented MPI library functions is specified when Score-P is installed. All OpenMP parallel constructs and API calls are instrumented by default.
Options can be passed to OPARI2 with the --opari=<parameter-list> option. For available parameters please refer to the OPARI2 manual.
By default, automatic instrumentation of user-level source routines by the compiler is enabled (equivalent to specifying --compiler). The compiler instrumentation can be disabled with --nocompiler when desired, for example when PDToolkit instrumentation, POMP2 directives, or Score-P user API manual source annotations are enabled with --pdt, --pomp and --user, respectively. Compiler, PDToolkit, POMP2 and Score-P user API instrumentation can all be used simultaneously or in arbitrary combinations. However, it is generally desirable to avoid instrumentation duplication (which would result if all are used to instrument the same routines). Note that enabling PDToolkit instrumentation automatically enables Score-P user instrumentation, because it inserts Score-P user macros into the source code.
Sometimes it is desirable to explicitly direct the Score-P instrumenter to do nothing except execute the associated compile/link command. For such cases it is possible to disable default instrumentation with --nocompiler, --thread=none, and/or --mpp=none. Although no instrumentation is performed, this can help verify that the Score-P instrumenter correctly handles the compile/link commands.
Each thread model uses a default internal locking mechanism for the Score-P measurement system. For the standard use case there is no need to specify an explicit locking mode. However, on certain systems or for performance reasons it might be useful to change the locking mode. For these cases the instrumenter provides the option --mutex=[omp|pthread|pthread:spinlock|pthread:wrap|none]. The current possibilities are OpenMP locking (omp), a Pthread mutex (pthread), a Pthread spinlock (pthread:spinlock), a Pthread mutex where the original functions are replaced with __real functions (pthread:wrap), and no locking at all (none). Which of these are available for a given installation is determined at configure time.
Most current compilers support automatic insertion of instrumentation calls at routine entry and exit(s), and Score-P can use this capability to determine which routines are included in an instrumented measurement.
Compiler instrumentation of all routines in the specified source file(s) is enabled by default by Score-P, or can be explicitly requested with --compiler. Compiler instrumentation is disabled with --nocompiler.
Automatic compiler-based instrumentation has been tested with a number of different compilers:
In all cases, Score-P supports automatic instrumentation of C, C++, and Fortran codes, except for the SUN Studio compilers, which only provide appropriate support in their Fortran compiler.
Names provided for instrumented routines depend on the compiler, which may add underscores and other decorations to Fortran and C++ routine names, and whether name "demangling" has been enabled when Score-P was installed and could be applied successfully.
In addition to the automatic compiler-based instrumentation (see Section 'Automatic Compiler Instrumentation' ), instrumentation can be done manually. Manual instrumentation can also be used to augment automatic instrumentation with region or phase annotations, which can improve the structure of analysis reports. Furthermore, it offers the possibility to record additional, user defined metrics. Generally, the main program routine should be instrumented, so that the entire execution is measured and included in the analysis.
Instrumentation can be performed in the following ways, depending on the programming language used.
Fortran:
#include "scorep/SCOREP_User.inc"
subroutine foo
  SCOREP_USER_REGION_DEFINE( my_region_handle )
  ! more declarations
  SCOREP_USER_REGION_BEGIN( my_region_handle, "foo", SCOREP_USER_REGION_TYPE_COMMON )
  ! do something
  SCOREP_USER_REGION_END( my_region_handle )
end subroutine foo
C/C++:
#include <scorep/SCOREP_User.h>
void foo()
{
    SCOREP_USER_REGION_DEFINE( my_region_handle )
    // more declarations
    SCOREP_USER_REGION_BEGIN( my_region_handle, "foo", SCOREP_USER_REGION_TYPE_COMMON )
    // do something
    SCOREP_USER_REGION_END( my_region_handle )
}
C++ only:
#include <scorep/SCOREP_User.h>
void foo()
{
    SCOREP_USER_REGION( "foo", SCOREP_USER_REGION_TYPE_FUNCTION )
    // do something
}
In Fortran, use #include with the leading '#' character to include the SCOREP_User.inc header file. Otherwise, the inclusion may happen after the C preprocessor has run, and as a result the Fortran compiler complains about unknown preprocessing directives.
Region handles (my_region_handle) should be registered in each annotated function/subroutine prologue before use within the associated body, and should not already be declared in the same program scope.
For every region, the region type can be indicated via the region type flag. Possible region types are:
To create a region of combined region types you can connect two or more types with the binary OR-operator, e.g.:
SCOREP_USER_REGION_BEGIN( handle, "foo", SCOREP_USER_REGION_TYPE_LOOP | SCOREP_USER_REGION_TYPE_PHASE | SCOREP_USER_REGION_TYPE_DYNAMIC )
For function instrumentation in C and C++, Score-P provides macros which automatically pass the name and function type to the Score-P measurement system. The SCOREP_USER_FUNC_BEGIN macro contains a variable definition. Thus, compilers that require strict separation of the declaration and execution parts may not work with this macro.
C/C++:
#include <scorep/SCOREP_User.h>
void foo()
{
    SCOREP_USER_FUNC_BEGIN()
    // do something
    SCOREP_USER_FUNC_END()
}
In some cases, it might be useful to define region handles with a global scope. In C/C++, a region handle can be defined at a global scope with SCOREP_USER_GLOBAL_REGION_DEFINE. In this case, the SCOREP_USER_REGION_DEFINE must be omitted. The SCOREP_USER_GLOBAL_REGION_DEFINE must only appear in one file. To use the same global variable in other files, declare the global region in those files with SCOREP_USER_GLOBAL_REGION_EXTERNAL.
File 1:
SCOREP_USER_GLOBAL_REGION_DEFINE( global_handle )
void foo()
{
    SCOREP_USER_REGION_BEGIN( global_handle, "phase 1", SCOREP_USER_REGION_TYPE_PHASE )
    // do something
    SCOREP_USER_REGION_END( global_handle )
}
File 2:
SCOREP_USER_GLOBAL_REGION_EXTERNAL( global_handle )
void bar()
{
    SCOREP_USER_REGION_BEGIN( global_handle, "phase 1", SCOREP_USER_REGION_TYPE_PHASE )
    // do something
    SCOREP_USER_REGION_END( global_handle )
}
In addition, the macros SCOREP_USER_REGION_BY_NAME_BEGIN( name, type ) and SCOREP_USER_REGION_BY_NAME_END( name ) are available. These macros might introduce more overhead than the standard macros, but they can annotate user regions without the need to manage the handle struct. This might be useful for automatically generating instrumented code or to avoid a global declaration of this variable.
C/C++:
#include <scorep/SCOREP_User.h>
/* Application functions are already instrumented with these two calls. */
void instrument_begin(const char* regionname)
{
    /* code added for Score-P instrumentation */
    SCOREP_USER_REGION_BY_NAME_BEGIN( regionname, SCOREP_USER_REGION_TYPE_COMMON )
}
void instrument_end(const char* regionname)
{
    SCOREP_USER_REGION_BY_NAME_END( regionname )
}
Fortran:
#include "scorep/SCOREP_User.inc"
subroutine instrument_begin(regionname)
  character(len=*) :: regionname
  SCOREP_USER_REGION_BY_NAME_BEGIN( regionname, SCOREP_USER_REGION_TYPE_COMMON )
end subroutine instrument_begin
subroutine instrument_end(regionname)
  character(len=*) :: regionname
  SCOREP_USER_REGION_BY_NAME_END( regionname )
end subroutine instrument_end
len=*.
The source files instrumented with Score-P user macros have to be compiled with -DSCOREP_USER_ENABLE, otherwise SCOREP_* calls expand to nothing and are ignored. If the Score-P instrumenter --user flag is used, the SCOREP_USER_ENABLE symbol will be defined automatically. Also note that Fortran source files instrumented this way have to be preprocessed with the C preprocessor (CPP).
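The role of SCOREP_USER_ENABLE can be illustrated with a minimal C sketch. The #else branch, which defines empty fallback macros so that the file also builds on systems without the Score-P headers, is an assumption added for this illustration and not part of Score-P; the function name sum_squares and the region name are hypothetical as well.

```c
#include <stddef.h>

#ifdef SCOREP_USER_ENABLE
#include <scorep/SCOREP_User.h>
#else
/* Assumption for this sketch: without Score-P, the user macros
 * are defined locally and expand to nothing. */
#define SCOREP_USER_REGION_DEFINE( handle )
#define SCOREP_USER_REGION_BEGIN( handle, name, type )
#define SCOREP_USER_REGION_END( handle )
#endif

/* Hypothetical workload: sum of the squares 1*1 + 2*2 + ... + n*n. */
long sum_squares( size_t n )
{
    SCOREP_USER_REGION_DEFINE( sum_region )
    long   total = 0;
    size_t i;

    /* The annotated region covers only the loop, not the whole function. */
    SCOREP_USER_REGION_BEGIN( sum_region, "sum_squares loop",
                              SCOREP_USER_REGION_TYPE_LOOP )
    for ( i = 1; i <= n; ++i )
    {
        total += ( long )( i * i );
    }
    SCOREP_USER_REGION_END( sum_region )

    return total;
}
```

Compiled through the instrumenter with the --user flag, the symbol is defined and the real header is used; a plain compiler invocation takes the fallback branch and produces an uninstrumented function with the same result.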
Manual routine instrumentation in combination with automatic source-code instrumentation by the compiler or PDT leads to double instrumentation of user routines, i.e., usually only user region instrumentation is desired in this case.
The Score-P user API also provides macros for parameter-based profiling. In parameter-based profiling, the parameters of a function are used to split up the call-path for executions with different parameter values. In Score-P, parameter-based profiling is supported for integer and string parameters. To associate a parameter value with a region entry, insert a call to SCOREP_USER_PARAMETER_INT64 for signed integer parameters, SCOREP_USER_PARAMETER_UINT64 for unsigned integer parameters, or SCOREP_USER_PARAMETER_STRING for string parameters after the region entry (e.g., after SCOREP_USER_REGION_BEGIN or SCOREP_USER_FUNC_BEGIN).
Fortran:
#include "scorep/SCOREP_User.inc"
subroutine foo(i, s)
  integer :: i
  character (*) :: s
  SCOREP_USER_REGION_DEFINE( my_region_handle )
  SCOREP_USER_PARAMETER_DEFINE( int_param )
  SCOREP_USER_PARAMETER_DEFINE( uint_param )
  SCOREP_USER_PARAMETER_DEFINE( string_param )
  SCOREP_USER_REGION_BEGIN( my_region_handle, "my_region",SCOREP_USER_REGION_TYPE_COMMON )
  SCOREP_USER_PARAMETER_INT64(int_param, "myint",i)
  SCOREP_USER_PARAMETER_UINT64(uint_param, "myuint",i)
  SCOREP_USER_PARAMETER_STRING(string_param, "mystring",s)
  ! do something
  SCOREP_USER_REGION_END( my_region_handle )
end subroutine foo
C/C++:
#include <scorep/SCOREP_User.h>
void foo(int64_t myint, uint64_t myuint, char *mystring)
{
    SCOREP_USER_REGION_DEFINE( my_region_handle )
    SCOREP_USER_REGION_BEGIN( my_region_handle, "foo",SCOREP_USER_REGION_TYPE_COMMON )
    SCOREP_USER_PARAMETER_INT64("myint",myint)
    SCOREP_USER_PARAMETER_UINT64("myuint",myuint)
    SCOREP_USER_PARAMETER_STRING("mystring",mystring)
    // do something
    SCOREP_USER_REGION_END( my_region_handle )
}
In C/C++, only a name for the parameter and the value need to be provided. In Fortran, the handle must be defined first with SCOREP_USER_PARAMETER_DEFINE. The defined handle name must be unique in the current scope. In Fortran, the macros SCOREP_USER_PARAMETER_INT64, SCOREP_USER_PARAMETER_UINT64 and SCOREP_USER_PARAMETER_STRING take the handle as the first argument, followed by the name and the value.
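As a sketch of how parameter-based profiling might be applied in C, the function below attaches its element count and a mode string to the region entry, so that calls with different values would appear as separate call-paths. The function scale_buffer, its parameters and the mode strings are hypothetical, and the empty fallback macros in the #else branch are an assumption that lets the file build without Score-P.

```c
#include <stdint.h>
#include <string.h>

#ifdef SCOREP_USER_ENABLE
#include <scorep/SCOREP_User.h>
#else
/* Assumption for this sketch: empty fallbacks when Score-P is absent. */
#define SCOREP_USER_REGION_DEFINE( handle )
#define SCOREP_USER_REGION_BEGIN( handle, name, type )
#define SCOREP_USER_REGION_END( handle )
#define SCOREP_USER_PARAMETER_INT64( name, value )
#define SCOREP_USER_PARAMETER_STRING( name, value )
#endif

/* Hypothetical routine: doubles every element when mode is "double",
 * leaves the data unchanged otherwise, and returns the resulting sum. */
int64_t scale_buffer( int64_t* data, int64_t count, const char* mode )
{
    SCOREP_USER_REGION_DEFINE( scale_region )
    int64_t factor = ( strcmp( mode, "double" ) == 0 ) ? 2 : 1;
    int64_t sum    = 0;
    int64_t i;

    SCOREP_USER_REGION_BEGIN( scale_region, "scale_buffer",
                              SCOREP_USER_REGION_TYPE_COMMON )
    /* Parameters are attached directly after the region entry. */
    SCOREP_USER_PARAMETER_INT64( "count", count )
    SCOREP_USER_PARAMETER_STRING( "mode", mode )

    for ( i = 0; i < count; ++i )
    {
        data[ i ] *= factor;
        sum       += data[ i ];
    }
    SCOREP_USER_REGION_END( scale_region )

    return sum;
}
```

Parameters with many distinct values split the call-path once per value, so in a sketch like this it is probably better to attach coarse values (a mode, a problem-size class) than a loop index.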
The Score-P user API also provides several macros for measurement control that can be incorporated in source files and activated during instrumentation. The macro SCOREP_RECORDING_OFF can be used to (temporarily) pause recording until a subsequent SCOREP_RECORDING_ON. Just like the already covered user-defined annotated regions, SCOREP_RECORDING_ON and the corresponding SCOREP_RECORDING_OFF must be correctly nested with other enter/exit events. Finally, with SCOREP_RECORDING_IS_ON you can test whether recording is switched on.
Events are not recorded when recording is switched off (though associated definitions are), resulting in smaller measurement overhead. In particular, traces can be much smaller and can target specific application phases (e.g., excluding initialization and/or finalization) or specific iterations. Since the recording switch is process-local and affects all threads of the process, it can only be initiated outside of OpenMP parallel regions. Switching recording on/off is done independently on each MPI process without synchronization.
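Restricting recording to specific iterations could look roughly like the following C sketch. It assumes that the recording macros are invoked as function-like macros with an empty argument list; the function run_iterations, its parameters and the returned counter are hypothetical and only serve to make the recorded window visible. The #else branch with empty fallbacks is likewise an assumption so the file builds without Score-P.

```c
#ifdef SCOREP_USER_ENABLE
#include <scorep/SCOREP_User.h>
#else
/* Assumption for this sketch: empty fallbacks when Score-P is absent. */
#define SCOREP_RECORDING_ON()
#define SCOREP_RECORDING_OFF()
#endif

/* Hypothetical driver: runs `total` iterations and records only those in
 * the window [first_recorded, last_recorded]. Returns how many iterations
 * fell into that window. Must be called outside of parallel regions. */
int run_iterations( int total, int first_recorded, int last_recorded )
{
    int recorded = 0;
    int it;

    SCOREP_RECORDING_OFF();              /* skip the warm-up iterations */
    for ( it = 0; it < total; ++it )
    {
        if ( it == first_recorded )
        {
            SCOREP_RECORDING_ON();       /* window starts between iterations */
        }

        /* ... one complete iteration of application work goes here ... */
        if ( it >= first_recorded && it <= last_recorded )
        {
            ++recorded;
        }

        if ( it == last_recorded )
        {
            SCOREP_RECORDING_OFF();      /* window ends between iterations */
        }
    }
    SCOREP_RECORDING_ON();               /* restore recording for the caller */

    return recorded;
}
```

The switch is toggled only at iteration boundaries, so every region entered inside an iteration is also left while the recording state is unchanged, which keeps the enter/exit events correctly nested.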
The Online Access interface to the measurement system of Score-P allows remote control of measurement and access to the profile data. The online access interface may not be available on all platforms. To use the Online Access interface, Score-P must have been built with Online Access (OA) support.
The Online Access module requires the user to specify at least one online access phase. The online access phase does not show the behavior of a region of type phase as defined in Section 'Manual Region Instrumentation' . However, the way to specify an online access phase is similar to manual region instrumentation. The start and end of the online access phase defines the interaction points, where new measurement control commands are applied and data requests are answered.
To insert an online access phase into the code, the user has to insert the macros SCOREP_USER_OA_PHASE_BEGIN
and the corresponding SCOREP_USER_OA_PHASE_END
at appropriate locations. These macros must be
Common practice is to mark the body of the application's main loop as the online access phase, in order to utilize the main loop iterations for iterative online analysis. Only the measurements collected inside the OA phase can be configured and retrieved.
Instrumentation can be performed in the following ways, depending on the programming language used.
Fortran:
#include "scorep/SCOREP_User.inc"
subroutine foo
  SCOREP_USER_REGION_DEFINE( my_region_handle )
  ! more declarations
  SCOREP_USER_OA_PHASE_BEGIN( my_region_handle, "foo",SCOREP_USER_REGION_TYPE_COMMON )
  ! do something
  SCOREP_USER_OA_PHASE_END( my_region_handle )
end subroutine foo
C/C++:
#include <scorep/SCOREP_User.h>
void foo()
{
    SCOREP_USER_REGION_DEFINE( my_region_handle )
    // do something
    SCOREP_USER_OA_PHASE_BEGIN( my_region_handle, "foo",SCOREP_USER_REGION_TYPE_COMMON )
    // do something
    SCOREP_USER_OA_PHASE_END( my_region_handle )
}
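The common practice of marking the main loop body as the online access phase might be sketched in C as follows. The function relax and its halving step are hypothetical stand-ins for real application work, and the empty fallback macros in the #else branch are an assumption so that the file also builds without Score-P.

```c
#ifdef SCOREP_USER_ENABLE
#include <scorep/SCOREP_User.h>
#else
/* Assumption for this sketch: empty fallbacks when Score-P is absent. */
#define SCOREP_USER_REGION_DEFINE( handle )
#define SCOREP_USER_OA_PHASE_BEGIN( handle, name, type )
#define SCOREP_USER_OA_PHASE_END( handle )
#endif

/* Hypothetical main loop: halves `value` once per iteration. Each loop
 * body execution is one online access phase, so the begin and end of
 * every iteration act as interaction points. */
double relax( double value, int iterations )
{
    SCOREP_USER_REGION_DEFINE( oa_phase )
    int it;

    for ( it = 0; it < iterations; ++it )
    {
        SCOREP_USER_OA_PHASE_BEGIN( oa_phase, "main loop body",
                                    SCOREP_USER_REGION_TYPE_COMMON )
        value = 0.5 * value;             /* stand-in for application work */
        SCOREP_USER_OA_PHASE_END( oa_phase )
    }

    return value;
}
```

Work done before the loop (setup) or after it (output) lies outside the phase in this sketch and would therefore not be available to online analysis.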
If you manually instrument the desired user functions and regions of your application source files using the POMP2 INST directives described below, the Score-P instrumenter --pomp flag will generate instrumentation for them. It is automatically enabled for OpenMP applications, but can be disabled with --nopomp. POMP2 instrumentation directives are supported for Fortran and C/C++. The main advantages are that
The INST BEGIN/END directives can be used to mark any user-defined sequence of statements. If this block has several exit points (as is often the case for functions), all but the last have to be instrumented by INST ALTEND.
Fortran:
subroutine foo(...)
!declarations
!POMP$ INST BEGIN(foo)
...
if (<condition>) then
!POMP$ INST ALTEND(foo)
return
end if
...
!POMP$ INST END(foo)
end subroutine foo
C/C++:
void foo(...)
{
    /* declarations */
#pragma pomp inst begin(foo)
    ...
    if (<condition>)
    {
#pragma pomp inst altend(foo)
        return;
    }
    ...
#pragma pomp inst end(foo)
}
At least the main program function has to be instrumented in this way, and additionally, one of the following should be inserted as the first executable statement of the main program:
Fortran:
program main
  ! declarations
!POMP$ INST INIT
  ...
end program main
C/C++:
int main(int argc, char** argv)
{
    /* declarations */
#pragma pomp inst init
    ...
}
To disable POMP2 user region instrumentation, pass --nopomp to the instrumenter.
By default, the source code is preprocessed before POMP2 instrumentation happens. For more information on the preprocessing, see Section 'Preprocessing before POMP2 and OpenMP instrumentation'.
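The handling of several exit points can be made concrete with a small C sketch. The function first_negative is hypothetical; the directives follow the form shown above. Compilers that do not know the pomp pragmas ignore them (possibly with a warning), so the function behaves the same with or without instrumentation.

```c
/* Hypothetical routine with two exit points: returns the index of the
 * first negative entry in `values`, or -1 if there is none. */
int first_negative( const int* values, int count )
{
    int i;

#pragma pomp inst begin(first_negative)
    for ( i = 0; i < count; ++i )
    {
        if ( values[ i ] < 0 )
        {
            /* Early exit: every exit except the last uses ALTEND. */
#pragma pomp inst altend(first_negative)
            return i;
        }
    }
    /* Last exit: closed with the regular END directive. */
#pragma pomp inst end(first_negative)
    return -1;
}
```

Because the function returns a value, the END directive is placed immediately before the final return statement, so the region is closed on that path before control leaves the function.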
By default, source files are preprocessed before the semi-automatic POMP2 instrumentation or the OpenMP construct instrumentation with OPARI2 happens. This ensures that all constructs and regions that might be contained in header files, templates, or macros are properly instrumented. Furthermore, conditional compilation directives take effect, too. The necessary steps are performed by the Score-P instrumenter tool.
Some Fortran compilers do not regard information about the original source location that the preprocessing leaves in the preprocessed code. This causes wrong source code information for regions from compiler instrumentation, and manual source code instrumentation. However, these compilers also disregard the source code information left by OPARI2. Thus, for these compilers the source location information is incorrect anyway.
If the preprocessing is not desired, you can disable it with the --nopreprocess flag. In this case the instrumentation is performed before the preprocessing happens, so constructs and regions in header files, macros, or templates are not instrumented. Conditional compilation directives around constructs may also lead to broken instrumentation.
The preprocessing does not work in combination with PDT source code instrumentation. Thus, if PDT instrumentation is enabled, it changes the default to not preprocess a source file. If you manually specify preprocessing and PDT source code instrumentation, the instrumenter will abort with an error.
If Score-P has been configured with PDToolkit support, automatic source-code instrumentation can be used as an alternative instrumentation method. In this case, the source code of the target application is pre-processed before compilation, and appropriate Score-P user API calls will be inserted automatically. However, please note that this feature is still somewhat experimental and has a number of limitations (see Section 'Limitations' ).
To enable PDT-based source-code instrumentation, call scorep with the --pdt option, e.g.,
scorep --pdt mpicc -c foo.c
This will by default instrument all routines found in foo.c. (To avoid double instrumentation, automatic compiler instrumentation is disabled when using source-code instrumentation with PDT. However, you can enforce additional compiler instrumentation with --compiler.) The underlying PDT instrumentor supports a set of instrumentation options, which can be set like
scorep --pdt="-f <inclusion/exclusion file>" mpicc -c foo.c
This particular option for example can be used to manually include/exclude specific functions from the instrumentation process. The respective file format is described here. Please check the documentation about the tau_instrumentor for more valid options.
Currently the support for the PDT-based source-code instrumenter still has a number of limitations:
-ffixed-line-length-132
or -qfixed=132
) to the compiler. include
keyword) will currently not be instrumented.
If Score-P was built with both shared and static libraries, the instrumenter uses the compiler defaults for linking. E.g., if the compiler chooses shared libraries by default, the instrumenter will link your application with the shared Score-P libraries. Furthermore, the linking is affected by parameters in the original link command. E.g., if your link command contains a -Bstatic flag, the Score-P libraries appended afterwards are also linked statically.
If you want to override the default and enforce linking of static or dynamic Score-P libraries, you can add the flag --static or --dynamic for the instrumenter. E.g., a command to enforce static linking can look like:
scorep --static mpicc foo.c -o foo
In this case, the linking against the static version of the Score-P libraries is enforced.
If enforcing static or dynamic linking is not possible on your system, e.g., because no static/dynamic Score-P libraries are installed, the instrumenter will abort with an error. You can determine whether --static or --dynamic is available from the output of scorep --help. If the --static or --dynamic flags are not shown, then they are not available.