Pass all channel-related aesthetics to fortify #92

Merged (4 commits) on Aug 20, 2023

Conversation

djhammill
Contributor

@mikejiang, this PR implements my proposed fixes for #91.

Currently, only the x and y parameters are exposed to ggplot through the various fortify methods, attached as the dims attribute of the flowSet. I have expanded this dims attribute to capture all top-level channel-related aesthetics so that they can be exposed to ggplot along with the x-y channels. With these updates, the following code now works, as CD44 is extracted along with CD4 and CD8:

ggcyto(
    gs[1],
    aes(CD4, CD8, colour = CD44),
    subset  = "T Cells"
) +
geom_point()

image

You will notice that the points are also sorted based on CD44 intensity for better visualisation. I added support for this by passing the dims attribute to .fr2dt() to allow sorting of rows in the extracted exprs matrix. At the moment, I have only added support for sorting by size, colour and fill aesthetics but this can be easily updated in the future.
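To illustrate the idea of sorting rows by a channel-related aesthetic, here is a minimal base R sketch. It is not the actual .fr2dt() code: `sort_by_aes()`, the `events` data frame and the shape of `dims` (a named character vector mapping aesthetics to column names) are all assumptions made for this example.

```r
# Hypothetical helper, for illustration only (not ggcyto's .fr2dt()).
# `dims` is assumed to be a named character vector: aesthetic -> column name.
sort_by_aes <- function(df, dims, sortable = c("size", "colour", "fill")) {
  # Keep only the aesthetics we are willing to sort by, in priority order
  keys <- intersect(sortable, names(dims))
  if (length(keys) == 0) {
    return(df)  # nothing to sort by, leave the row order untouched
  }
  col <- dims[[keys[1]]]
  # Ascending order: rows with the highest values come last,
  # so a plotting layer that draws rows in sequence puts them on top
  df[order(df[[col]]), , drop = FALSE]
}

# Made-up events standing in for an extracted exprs matrix
set.seed(1)
events <- data.frame(
  CD4  = rnorm(6),
  CD8  = rnorm(6),
  CD44 = c(3.2, 0.5, 4.1, 1.7, 2.9, 0.1)
)
dims <- c(x = "CD4", y = "CD8", colour = "CD44")

sorted <- sort_by_aes(events, dims)
sorted$CD44  # CD44 values are now in ascending order
```

When no size, colour or fill entry is present in `dims`, the sketch returns the data unchanged, which mirrors the x-y only case.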

I also made sure that we can still specify pData() variables through aes(), as per current behaviour, with or without interaction():

ggcyto(
    gs[1],
    aes(CD4, CD8, colour = CD44, size = name),
    subset  = "T Cells"
) +
geom_point()

image

It is also not currently possible to display the colour scale legend on the plots above, as legends have been completely turned off here:

p <- p + theme(legend.position = 'none')

I think that line of code should be removed if possible to allow more control over the legend. It should be perfectly fine to display counts for downsampled data in the legend (this is what CytoExploreR does too). I did have a go at removing this line, but a lot of tests fail just because all the hexbin plots gain count legends. I don't see this as an issue, and the plots actually look better with the legends included. I have left this line untouched for now until I receive confirmation from you that it is OK to remove it.

I ran all the tests locally for the changes included in this PR and all tests passed, with the exception of a couple of plots where the ellipses had shifted slightly due to the lymphGate gating method.

I have also re-rendered all the vignettes, which remained unchanged with this PR.

SamGG commented Aug 9, 2023

Hi,
Thanks Dillon for this interesting and relevant update. This is a great addition.
I am just wondering if there will be a way to avoid the automatic sorting of points by the color. If I would like to see the range of CD44 in each population on the plot, a random order would be more appropriate IMHO.
Best.

@djhammill
Contributor Author

@mikejiang, I am busy working on a more elegant solution to the point sorting problem - I have a bit more testing to do before I can report back with additional commits.

djhammill commented Aug 10, 2023

@mikejiang, I removed the automated sorting based on size, colour or fill in favour of adding a new order aesthetic to ggcyto(). order used to be supported in ggplot2 v2.0.0 but has since been deprecated in ggplot2 v3.0.0. The order aesthetic is not exposed to ggplot2 at all; instead, it is simply attached to the dims attribute of the flowSet along with any other channel-related aesthetics. As an example, I have plotted data for two samples with different shapes based on pData(gs)$name in the CD4 and CD8 dimensions, with points coloured by CD44 expression:

ggcyto(
  gs[c(25, 32)],
  aes(
    x = CD4,
    y = `Alexa Fluor 488-A`,
    color = CD44,
    shape = name
  ),
  subset = "T Cells"
) +
geom_point() +
axis_x_inverse_trans() +
axis_y_inverse_trans()

image

In the example above, the points are randomly ordered based on the sampling imposed by the sampleFilter. We can easily force the points to be plotted in order based on CD44 expression using the new order aesthetic:

ggcyto(
  gs[c(25, 32)],
  aes(
    x = CD4,
    y = `Alexa Fluor 488-A`,
    color = CD44,
    shape = name,
    order = CD44
  ),
  subset = "T Cells"
) +
geom_point() +
axis_x_inverse_trans() +
axis_y_inverse_trans()

image

I played around with allowing control over ordering as well (descending vs ascending, multiple variables etc.) but things get pretty messy, so I think it's best to only support ascending order for now (not sure why anyone would want to sort in descending order anyway).
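As a rough illustration of how an order entry could be kept out of the plotting aesthetics and used only to arrange rows, here is a small base R sketch. It is not the code from this PR: `split_order()`, `apply_order()`, the `events` data frame and the named character vector form of `dims` are hypothetical and exist only for this example.

```r
# Hypothetical helpers for illustration only; these are not ggcyto functions.
# `dims` is assumed to be a named character vector: aesthetic -> column name.
split_order <- function(dims) {
  order_col <- if ("order" %in% names(dims)) dims[["order"]] else NULL
  list(
    plot_dims = dims[names(dims) != "order"],  # what would be passed on for plotting
    order_col = order_col                      # used only to arrange rows
  )
}

apply_order <- function(df, order_col) {
  if (is.null(order_col)) {
    return(df)  # no order aesthetic: keep the existing (e.g. sampled) row order
  }
  df[order(df[[order_col]]), , drop = FALSE]  # ascending order only
}

# Made-up events standing in for the extracted data
events <- data.frame(
  CD4  = c(1.0, 2.0, 3.0, 4.0),
  CD8  = c(0.4, 0.3, 0.2, 0.1),
  CD44 = c(2.5, 0.7, 3.9, 1.2)
)
dims <- c(x = "CD4", y = "CD8", colour = "CD44", order = "CD44")

parts   <- split_order(dims)
ordered <- apply_order(events, parts$order_col)
ordered$CD44  # rows arranged by ascending CD44
```

Keeping the ordering step separate from the plotting aesthetics means that leaving out order simply preserves the incoming row order, which matches the opt-in behaviour described above.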

All tests pass locally and vignettes build as expected following these changes. Happy to add comments about the new order aesthetic in one of the vignettes if you think it's appropriate.

Please let me know what you think about removing this line as well:

p <- p + theme(legend.position = 'none')

The plots become much easier to interpret when the legends are added:
image

SamGG commented Aug 10, 2023

I like it. Great job.

@mikejiang mikejiang merged commit 6f17074 into RGLab:devel Aug 20, 2023