Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ImmutableView support #542

Open
CsabaStupak opened this issue Jul 18, 2021 · 9 comments
Open

ImmutableView support #542

CsabaStupak opened this issue Jul 18, 2021 · 9 comments
Labels
enhancement feature A new feature (or feature request)

Comments

@CsabaStupak
Copy link

I know that there is a support for ImmutableArray which content is copied to GPU const memory. However it must be a static variable what is not the best solution at least for my library. At least I was not able to set such ImmutableArray as kernel parameter - when I call LoadAutoGroupedStreamKernel it throws exception:

Not supported kernel-parameter type 'System.Collections.Immutable.ImmutableArray`1[System.Single]'

It would be much better if I could use ImmutableArray as input parameter to the kernel. Or maybe much better to have something like ImmutableView. Its content would be automatically copied to GPU const memory on kernel function call and from the kernel function it could be accessed as other View types.

@m4rs-mt
Copy link
Owner

m4rs-mt commented Jul 18, 2021

@CsabaStupak thanks a lot for your feature request. I think that this feature is also related to #415 and #463. We were also thinking about a separate API for ConstantMemory in general. I am not sure whether passing arrays via kernel parameters would be the best way of storing them in constant memory... 🤔

@m4rs-mt m4rs-mt added enhancement feature A new feature (or feature request) labels Jul 18, 2021
@CsabaStupak
Copy link
Author

I don't mind if the kernel will have to explicitly transfer the variables from GPU memory to GPU constant memory - similarly as I do today when I transfer data to GPU cache. The key point here is to somehow get data from CPU or GPU memory to GPU constant memory which is AFAIK much faster (even faster than GPU cache?). I can even speedup my kernels since I can use GPU cache for something different than const values.

The static approach is not the best way since my "const" arrays are changed between kernel calls but they do not change during the kernel call. So even today I have these arrays in GPU memory. Moreover for multi-thread case - when I create two accelerators, I have problem with static ImmutableArray approach.

@MoFtZ
Copy link
Collaborator

MoFtZ commented Jul 25, 2021

This could potentially be split into two parts. The first part is adding support for ImmutableArrayView, which would be useful by itself in ensuring that the code does not modify the contents of an ArrayView. Then, the second part, adding support for constant memory. We would need to decide if this is automatically handled by ILGPU, or explicitly annotated by the caller.

@CsabaStupak
Copy link
Author

My vote is for automatic handling. Today if I call a kernel method I provide its arguments - view and even other non-view/const values. I guess ILGPU automatically move these values to GPU const memory or to GPU registers. I suppose that I can't move the GPU memory content to the GPU const memory inside the kernel since it should be const during the kernel call. However I do not see the whole picture here :-).

I also hope that I will be able to use SubView and As2DView/As3DView methods on this ImmutableArrayView so I do not need to think about 2D/3D arrach indexing logic :-P

@m4rs-mt
Copy link
Owner

m4rs-mt commented Jun 16, 2022

@CsabaStupak We have internally discussed adding support for ImmutableView types in several sessions. We believe that adding a separate type (namely the ImmutableView type) will give our internal optimizer additional knowledge about read-only sections in memory and thus (potentially) leading to improved runtime performance. However, we currently have plans to add this feature in a different way, by providing an explicit AsImmutable function for views:

var immutableSubView = data.SubView(...).AsImmutable();
// Avoid potential read-write dependency issues from here on
int myReadOnlyData = immutableSubView[...];
...

In addition, we plan to add support for automatically moving array data (accessed read-only within the kernel) to sections in constant memory to improve performance.

We are closing this issue for now and will provide PRs when ready. Thanks again for your feature request 👍.

@m4rs-mt m4rs-mt closed this as completed Jun 16, 2022
@CsabaStupak
Copy link
Author

Hi guys, I love the progress on the library. I wonder if there is any update on this? Many thanks! :-)

@MoFtZ MoFtZ reopened this Dec 21, 2022
@MoFtZ
Copy link
Collaborator

MoFtZ commented Feb 14, 2023

@m4rs-mt what did you want to do with this feature request? what should we try to implement?

@CsabaStupak
Copy link
Author

It would be great to have a way to specify that my Array should go to GPU cont memory. AFAIK the kernel function non-array arguments (mostly primitives int, float, etc.) and ImmutableArray go to cont memory. This have a big limitation when I have an array which is changed by one kernel and in another kernel it is not changed at all (read-only).

Maybe the solution is to extend the kernel configuration where I specify the kernel dimension and allocate the shared memory. Here we could allocate the general read/write shared memory and also could allocate the read-only (const) shared memory which would utilize the GPU const memory.

In the kernel function then I could access the allocated read/write shared memory and transfer the data from array/view to this memory as I do it today. But I could also access the read-only/const shared memory and transfer array/view content to this memory.

So it looks like that we do not need the ImmutableArray at all. Only exception is when such array is automatically copied to GPU const memory before the kernel function is called. I guess this is what happens today. However the above kernel config approach could give more freedom. I could use any array/view to transfer its content to GPU const memory in the kernel.

@MoFtZ
Copy link
Collaborator

MoFtZ commented Apr 30, 2023

@CsabaStupak Have you tried using a struct instead of ImmutableArray?

If you used a struct with a fixed-size buffer as the kernel parameter, it would be copied over into constant memory on the GPU.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement feature A new feature (or feature request)
Projects
None yet
Development

No branches or pull requests

3 participants