Dear ILGPU Devs & Community,
From my understanding, when allocating memory the hardware/driver returns a pointer aligned to a specific number of bytes for technical and performance reasons. According to this, that number is 256 for CUDA GPUs. I imagine other types of hardware adhere to different alignments.
Does ILGPU provide a hardware-independent way to query this information?
The reason I am asking is that I'm trying to reduce the number of memory allocations I make (in order to avoid accumulating API overhead). Instead of multiple buffers, I allocate a single large one and split it into multiple sections for arrays and variables (sketched below). According to the guide I referenced, these sections should be aligned the same way as the original allocation; otherwise things may break (at least on CUDA devices).
Quote:
Any address of a variable residing in global memory or returned by one of the memory allocation routines from the driver or runtime API is always aligned to at least 256 bytes.
Reading non-naturally aligned 8-byte or 16-byte words produces incorrect results (off by a few words), so special care must be taken to maintain alignment of the starting address of any value or array of values of these types. A typical case where this might be easily overlooked is when using some custom global memory allocation scheme, whereby the allocations of multiple arrays (with multiple calls to cudaMalloc() or cuMemAlloc()) is replaced by the allocation of a single large block of memory partitioned into multiple arrays, in which case the starting address of each array is offset from the block’s starting address.
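To make the scheme concrete, here is a minimal sketch of what I mean. The 256-byte constant, the AlignUp helper, and the element counts are assumptions made up for this example; the typed sections are carved out of one byte buffer with SubView and Cast:

```csharp
using ILGPU;
using ILGPU.Runtime;

static class PackedAllocationSketch
{
    // Assumed minimum alignment, taken from the CUDA guide quoted above.
    const long Alignment = 256;

    // Rounds a byte offset up to the next multiple of Alignment.
    static long AlignUp(long offsetInBytes) =>
        (offsetInBytes + Alignment - 1) / Alignment * Alignment;

    static void Run(Accelerator accelerator)
    {
        // Element counts of the logical arrays packed into one allocation (made up).
        const long floatCount = 1000;
        const long intCount = 500;

        // Place each section on an assumed 256-byte boundary relative to the block start.
        long floatOffset = 0;
        long intOffset = AlignUp(floatOffset + floatCount * sizeof(float));
        long totalBytes = AlignUp(intOffset + intCount * sizeof(int));

        // One allocation instead of many; the driver aligns the block itself.
        using var block = accelerator.Allocate1D<byte>(totalBytes);
        ArrayView<byte> raw = block.View;

        // Carve the block into typed sections.
        var floatView = raw.SubView(floatOffset, floatCount * sizeof(float)).Cast<float>();
        var intView = raw.SubView(intOffset, intCount * sizeof(int)).Cast<int>();

        // floatView and intView are then passed to kernels in place of separately
        // allocated buffers.
    }
}
```

Besides the 256-byte question, the quote above also implies that each section's byte offset must be a multiple of its element size (8 or 16 bytes for the wide types), which the rounding here takes care of as a side effect.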
Hi @TriceHelix, thanks for reaching out to us regarding this topic. Currently, ILGPU is internally aware of the global pointer alignment (e.g., here) and uses this information to automatically vectorize IO operations, for instance. Depending on the backend, the accelerator (... and your kernel, of course), ILGPU will do "everything for you" when using the O2 optimization pipeline. However, you can also control the alignment of views inside kernels via view.AlignToXYZ() and view.AlignTo(xyz) (where xyz evaluates to a constant). Moreover, you can "tell" ILGPU that a particular view has a certain alignment via view.AsAlignedXYZ() and view.AsAligned(xyz) (where xyz evaluates to a constant at kernel compile time).
To answer your actual question: there is no such API to query the automatic alignment rules at the moment. However, I think it makes absolute sense to add one. @MoFtZ, what do you think?
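To make the kernel-side part concrete, here is a minimal sketch of how AsAligned might be used, assuming it is callable on an ArrayView<float> and returns a view of the same element type; the 16-byte constant and all names here are arbitrary choices for this example:

```csharp
using ILGPU;
using ILGPU.Runtime;

static class KernelAlignmentSketch
{
    // Declares to ILGPU that 'data' starts on a 16-byte boundary so the compiler
    // may vectorize loads and stores. The constant must be known at kernel
    // compile time; 16 is an arbitrary value chosen for this sketch.
    static void ScaleKernel(Index1D i, ArrayView<float> data, float factor)
    {
        var aligned = data.AsAligned(16);
        aligned[i] = aligned[i] * factor;
    }

    static void Run(Accelerator accelerator)
    {
        using var buffer = accelerator.Allocate1D<float>(1 << 20);
        var kernel = accelerator.LoadAutoGroupedStreamKernel<
            Index1D, ArrayView<float>, float>(ScaleKernel);

        kernel((int)buffer.Length, buffer.View, 2.0f);
        accelerator.Synchronize();
    }
}
```

Under the O2 pipeline mentioned above, ILGPU applies such vectorization automatically where it can prove the alignment; the explicit calls are for cases where the programmer knows more than the compiler can prove.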
Turns out most of the issues I was facing were caused by the Cast method interfering with aligned memory, not by the alignments themselves. I also realized that ArrayViews are not equivalent to raw pointers; under the hood they all refer to a buffer at a certain offset. As you've said, the alignment of the buffer itself is handled automatically. However, I'd still say that an API making the alignment amounts transparent would be neat. For now, consider the other issue the actual problem I've encountered.