Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add CMake support, use allocatable arrays in Fortran #11

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

mathomp4
Copy link

@mathomp4 mathomp4 commented Nov 5, 2023

This is an attempt at a more generic CMake support compared to #9. It also converts the Fortran code to use allocatable arrays.

The latter is because I tried this out on my macOS laptop and without the allocatable arrays you get:

❯ ./build/stream_f.exe
zsh: illegal hardware instruction  ./build/stream_f.exe

I think this is okay since in the fortran code we have:

*          You may put the arrays in common or not, at your discretion.
*          There is a commented-out COMMON statement below.
*          Fortran90 "allocatable" arrays are fine, too.

but maybe we can summon @jdmccalpin himself and see if he is okay with this and that it doesn't violate the STREAM terms. If not, I could add some ifdefs and say if(APPLE) we use allocatables.

I'll also mention @scivision as, well, he writes nicer CMake code than I do and he can critique it (and I used his "ignore build dir" trick).

@scivision
Copy link

scivision commented Nov 6, 2023

Anything more than a kB or so should be allocatable to avoid blowing up the stack. Those were way too big to be non-allocatable regardless of compiler / OS.

Here is my CMakeLists.txt. It shows how to set per-compiler options. The target names shouldn't explicity use .exe as this breaks on Windows.

cmake_minimum_required(VERSION 3.15)

project(
  STREAM
  VERSION 1.0
  DESCRIPTION "STREAM benchmark"
  LANGUAGES C Fortran)

enable_testing()

if(CMAKE_C_COMPILER_ID MATCHES "(GNU|Clang)")
  add_compile_options("$<$<COMPILE_LANGUAGE:C>:-O3;-march=native>")
elseif(CMAKE_C_COMPILER_ID MATCHES "Intel")
  add_compile_options("$<$<COMPILE_LANGUAGE:C>:-O3;-xHost>")
endif()

if(CMAKE_Fortran_COMPILER_ID STREQUAL "GNU")
  add_compile_options("$<$<COMPILE_LANGUAGE:Fortran>:-O3;-march=native>")
elseif(CMAKE_C_COMPILER_ID MATCHES "Intel")
  add_compile_options("$<$<COMPILE_LANGUAGE:Fortran>:-O3;-xHost>")
endif()

# The C STREAM benchmark only requires
# stream.c
add_executable(stream_c stream.c)

add_test(NAME STREAM_C COMMAND stream_c)

# The Fortran STREAM benchmark requires
# stream.f and mysecond.o from mysecond.c
add_executable(stream_f stream.f mysecond.c)

add_test(NAME STREAM_Fortran COMMAND stream_f)

# Look for OpenMP support is found, link it to the executables
# Note that if you are using clang on macOS, you will need to
# install libomp via Homebrew and then set the following
# environment variables:
#   export OpenMP_ROOT=$(brew --prefix)/opt/libomp
# see https://www.scivision.dev/cmake-openmp/ for more details

find_package(OpenMP COMPONENTS C Fortran)
target_link_libraries(stream_c PRIVATE $<$<BOOL:${OpenMP_C_FOUND}>:OpenMP::OpenMP_C>)
target_link_libraries(stream_f PRIVATE $<$<BOOL:${OpenMP_Fortran_FOUND}>:OpenMP::OpenMP_Fortran>)

# Per @scivision, ignore the build directory
# (see https://www.scivision.dev/cmake-auto-gitignore-build-dir/)
if(NOT PROJECT_SOURCE_DIR STREQUAL PROJECT_BINARY_DIR)
  file(GENERATE OUTPUT .gitignore CONTENT "*")
endif()

This was referenced Nov 6, 2023
CMakeLists.txt Outdated Show resolved Hide resolved
@mathomp4
Copy link
Author

mathomp4 commented Nov 6, 2023

After adding @scivision 's fixes, I found like @jeffhammond did that the prints were all out of wack (see #7). I've pushed a fix for that.

Fixes #7

@scivision
Copy link

scivision commented Nov 7, 2023

This all looks good. It works on Windows with MinGW as well.

#12 is unrelated to this that further improves (simplifies) the Fortran code.
The C code also works on Windows with #12 including MSVC and oneAPI and MinGW.

@jdmccalpin
Copy link

This is an attempt at a more generic CMake support compared to #9. It also converts the Fortran code to use allocatable arrays.

The latter is because I tried this out on my macOS laptop and without the allocatable arrays you get:

❯ ./build/stream_f.exe
zsh: illegal hardware instruction  ./build/stream_f.exe

I think this is okay since in the fortran code we have:

*          You may put the arrays in common or not, at your discretion.
*          There is a commented-out COMMON statement below.
*          Fortran90 "allocatable" arrays are fine, too.

but maybe we can summon @jdmccalpin himself and see if he is okay with this and that it doesn't violate the STREAM terms. If not, I could add some ifdefs and say if(APPLE) we use allocatables.

I have always accepted STREAM results using allocatable arrays. The original STREAM code was written with static arrays to help C compilers determine that no aliasing was present (and that the loop bounds were "large"). Static arrays in Fortran were never a problem in the olden days, but they often cause trouble in C. (That is one reason why the default problem size is set to 80 million elements per array -- the three arrays fit in less than 2GiB.) Many C compilers still have a bit of trouble with optimization with allocatable arrays and require extra hints (e.g., the quite reasonable "restrict" keyword or the very ugly "-qopt-streaming-stores always" compiler flag) to generate their best code.

Most of my versions of STREAM use posix_memalign() or mmap() to allocate the arrays. For the "standard" shared-memory results, the only requirement has been that the memory is allocated contiguously, so each kernel can be processed by a single loop with contiguous indices. The goal has always been to restrict "standard" results to straightforward implementations of the computational kernels, e.g., single loops or array notation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants