Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PMR^3: templated eigensolver for Elemental #189

Open
wants to merge 44 commits into
base: master
Choose a base branch
from

Conversation

mcopik
Copy link
Contributor

@mcopik mcopik commented Oct 21, 2016

This pull request introduces a templated version of PMR^3. It is a result of student project at RWTH from last winter semester and it has been already discussed on mailing list in February.
Our results generated with several tridiagonal matrices of different types (Legendre, Wilkinson, matrices obtained from chemical problems) show a decrease of computation time between 20 and 50%, when desired accuracy allows for single precision computations. We didn't notice any performance change for double-precision computation after switching from pure C to C++.

I've been working with the original PMR^3 repository to create a templatized version and then I applied changes from Elemental's version of PMR^3. Significant changes between C and C++ versions enforced a rather manual process of patching the eigensolver.

PMR^3 requires two additional preprocessor flags (pthreads and spinlock) and to avoid polluting Elemental with additional flags, a config file for PMR^3 is created by CMake and placed in both build and install directory.

There are few issues to resolve; I'm also not quite sure about coding style and properly defining imports for templated PMR^3. All suggestions and comments are welcomed.

Issues:

  1. In Elemental's version code for sorting eigenvalues has been removed, but a corresponding preprocessor flag in pmrrr.h is still there (ASSERT_SORTED_EIGENPAIRS), only the default value has been changed to false. It seems to me that sorting has been moved to Elemental. Should I apply those changes to new version and remove additional sorting?
  2. Are Fortran prototypes still necessary?
  3. global.h in PMR^3 handles different versions of C, including lack of support for standard C. Is it safe to assume that we have a decent support for C++ (at least C++03/11)? I could remove all those unnecessary definitions.

CC: @pauldj

@jeffhammond
Copy link
Member

  1. global.h in PMR^3 handles different versions of C, including lack of support for standard C. Is it safe to assume that we have a decent support for C++ (at least C++03/11)? I could remove all those unnecessary definitions.

Elemental assumed C++11 in late 2013 according to my email archives.

@rhl-
Copy link
Member

rhl- commented Oct 22, 2016

I'm not sure exactly how you've made changes here, but, it appears that you are "moving" files. I would strongly recommend making use of git mv as it helps make the review process easier. that way git shows a more intelligent diff.

@mcopik
Copy link
Contributor Author

mcopik commented Oct 22, 2016

@jeffhammond thanks!
@rhl I might have made the mistake of manually moving files in the beginning. Recently I have been using git mv all the time.

@mcopik
Copy link
Contributor Author

mcopik commented Oct 22, 2016

There is insane amount of warnings in Clang build caused by C code translated from Fortran. Maybe it's worth fixing, because they will appear in compilation of each translation unit.

@poulson
Copy link
Member

poulson commented Oct 22, 2016

I haven't had a chance to step through this yet, but this is a much needed addition. And I fully agree that it would be better to not ever use f2c. My recent experience with implementing tridiagonal and bidiagonal divide and conquer natively in Elemental shows that starting from the bottom up is not as much work as one would think.

@@ -0,0 +1,115 @@
/* odcpy.f -- translated by f2c (version 20061008) */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't it be easier to just have a for loop instead of calling odcpy?

@@ -0,0 +1,107 @@
/**
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same for this routine. I vote for the obvious two-line trivial implementation.


namespace pmrrr { namespace lapack {

template<typename FloatingType>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This routine seems like low-hanging fruit for a ground-up implementation as well; also, how will the fabs function for non-standard datatypes (e.g., El::BigFloat)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, to be honest I haven't paid much attention to non-standard datatypes.

Is it fine that I use existing MPI and math functions wrappers from Elemental? Or do you prefer for the PMR^3 to be independent on Elemental? I can implement simple serialization there if it is necessary.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In an ideal world, it would be part of the main library and use proper C++ functions instead of f2c, but the latter may take a substantial amount of work. You may want to look at how I handled similar issues in ElSuiteSparse in external/suite_sparse.

@mcopik
Copy link
Contributor Author

mcopik commented Oct 22, 2016

@poulson Templated eigensolver is not new, it is still based on existing PMR^3 implementation which did use f2c extensively. I agree that the code is horrible to read and some changes may be necessary. For example, detection of character encoding in olsame could be replaced only with tools from standard library.

@jedbrown
Copy link
Member

@rhl- @mcopik For what it's worth, git mv doesn't do anything special. It is semantically identical to adding the new file and removing the old one. The Git file format doesn't even have a way to state that a file was "moved". Instead, Git (log, diff, etc.) computes a similarity index when deciding whether to report a file as being new versus renamed. I haven't looked at this case, but a character encoding change would make a file look totally different (all lines changed). Anyway, you can control the similarity threshold:

       -M[<n>], --find-renames[=<n>]
           Detect renames. If n is specified, it is a threshold on the similarity index
           (i.e. amount of addition/deletions compared to the file’s size). For example,
           -M90% means Git should consider a delete/add pair to be a rename if more than
           90% of the file hasn’t changed. Without a % sign, the number is to be read as a
           fraction, with a decimal point before it. I.e., -M5 becomes 0.5, and is thus
           the same as -M50%. Similarly, -M05 is the same as -M5%. To limit detection to
           exact renames, use -M100%. The default similarity index is 50%.

typedef double Real;
typedef Complex<Real> C;
template<typename Real>
void run_example(Int n, bool print)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It isn't properly documented, but the examples/ folder is meant to be more for demonstrating functionality, and the tests/ folder is meant for correctness tests.

As an aside, Elemental uses CamelCase for function names rather than snake_case.

It might also be preferred to test both single-precision and double-precision in the same run (with the ideal case being able to individually disable each precision).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the comment, I'm going to fix that.

By being able to disable each precision you mean a compilation flag? Or runtime argument?

option(HAVE_SPINLOCKS "Enable if pthread lib supports spinlocks" OFF)
MARK_AS_ADVANCED(HAVE_SPINLOCKS)
if(NOT HAVE_SPINLOCKS)
add_definitions(-DNOSPINLOCKS)
set(pmrrr_defines "${pmrrr_defines}#define NOSPINLOCKS\n")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the reason for this change?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those defines have been used as a flag to build PMR^3 as a library. Now most of the code (and 100% of code relying on availability of pthreads and spinlocks) have been moved to templated code included by Elemental. To make configuration flags available, I changed one of PMR^3 headers to a CMake configuration file, installed in both build and install directory.

That's why those flags are accumulated and used for configuration of the header, not passed directly to compiler.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the detailed reply; though it seems it would be more usual to use cmakedefine within the configure file rather than explicit string includes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I didn't know about that feature.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants