NaN in coefficient when using model with several covariates (but FDR < 1) #135
Comments
I am running into the same problem. Many genes with an NA logFC also have a small FDR, and they could be interesting to study. Any help would be appreciated!
NA is missing, NaN is "Not a Number". They are not the same.
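To make the distinction concrete, here is a minimal base R sketch. The vector `x` is made up for illustration and is not taken from the data in this issue.

```r
# Toy vector with one ordinary value, one missing value, and one NaN.
x <- c(1.5, NA, NaN)

# is.na() is TRUE for both NA and NaN.
print(is.na(x))

# is.nan() is TRUE only for NaN, so it separates the two cases.
print(is.nan(x))

# NaN typically comes from an undefined arithmetic result such as 0 / 0.
print(0 / 0)
```

In practice this means a filter written with `is.na()` will catch both kinds of value, while `is.nan()` is needed to tell them apart.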
Hi @gfinak, thanks for your reply. Thanks
I possibly spotted the issue. Assuming testing by
Yes, that's precisely why.
Ok, thanks! Just asking: is there a way to define the threshold more precisely? I imagine that if one population contributes less than 10% of the total number of cells and its markers are highly specific, we might lose those markers even though we have enough cells to compute DE.
Think about statistical power. How much power do you have to detect a difference with a sample size of 3? Especially after you adjust for multiple testing. We chose 10% because it's an empirical lower limit for the discrete part of the test.
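As a rough illustration of the power argument, the sketch below uses base R's `power.t.test` with a two-sample t-test standing in for the comparison. This is only an analogy, not the hurdle model itself, and the effect size, standard deviation, number of genes, and the Bonferroni-style threshold are all assumptions chosen for the example.

```r
# Illustration only: a two-sample t-test is used as a stand-in.
# delta, sd, and genes_tested are assumed values, not taken from this issue.
genes_tested <- 10000

# Power with 3 observations per group, before any multiple-testing adjustment.
p_raw <- power.t.test(n = 3, delta = 1, sd = 1, sig.level = 0.05)$power

# Same design with a Bonferroni-style threshold across all tested genes.
p_adj <- power.t.test(n = 3, delta = 1, sd = 1,
                      sig.level = 0.05 / genes_tested)$power

# For comparison: 30 observations per group at the adjusted threshold.
p_big <- power.t.test(n = 30, delta = 1, sd = 1,
                      sig.level = 0.05 / genes_tested)$power

print(c(n3_raw = p_raw, n3_adjusted = p_adj, n30_adjusted = p_big))
```

Comparing the three numbers shows how much the stricter threshold costs at n = 3 and how much a larger group recovers.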
Ok thanks, it is clear.
For future reference: if I read this correctly, genes for which the continuous component cannot be estimated should be dropped.
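A minimal sketch of that filtering step in base R, assuming the results have already been collected into a data frame. The table `res`, its column names, the gene names, and the 0.05 cutoff are all hypothetical and only mimic the kind of output discussed here.

```r
# Hypothetical results table; gene names, column names, and values are made up.
res <- data.frame(
  primerid = c("geneA", "geneB", "geneC", "geneD"),
  coef     = c(1.2, NA, NaN, -0.4),   # logFC; NA/NaN where it could not be estimated
  fdr      = c(0.001, 0.004, 0.020, 0.300),
  stringsAsFactors = FALSE
)

# is.na() is TRUE for both NA and NaN, so one filter covers both cases.
no_logfc <- res[is.na(res$coef), ]

# Keep genes that have an estimated coefficient and pass the FDR cutoff.
kept <- res[!is.na(res$coef) & res$fdr < 0.05, ]

print(no_logfc$primerid)
print(kept$primerid)
```

Keeping `no_logfc` as a separate table, rather than discarding it silently, makes it easy to see how many genes were dropped for this reason.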
Hi! Thanks for your work on this tool.
I have an issue computing differential gene expression in a model with several covariates.
The model is the following:
~ group + n_genes + pair + percent_mito + percent_ribo_p
where group is the condition that I want to test (i.e. Case or Control), n_genes is the number of detected genes, pair is a categorical label that indicates the batch, and percent_mito and percent_ribo_p are the percentages of mitochondrial RNA and ribosomal protein RNA in each cell, respectively.
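For readers following along, this base R sketch shows how a formula like this expands into design-matrix columns. The `cells` data frame and all of its values are invented for the example, and choosing Control as the reference level is an assumption.

```r
# Toy per-cell metadata; all values are made up for illustration.
cells <- data.frame(
  group          = factor(c("Case", "Control", "Case", "Control",
                            "Case", "Control", "Case", "Control")),
  n_genes        = c(2100, 1800, 2500, 1950, 2300, 2050, 2200, 1900),
  pair           = factor(c("p1", "p1", "p1", "p1", "p2", "p2", "p2", "p2")),
  percent_mito   = c(2.1, 3.4, 1.8, 2.9, 2.5, 3.1, 2.2, 3.0),
  percent_ribo_p = c(12.0, 10.5, 14.2, 11.1, 13.3, 9.8, 12.7, 10.9)
)

# Assumption: Control is the reference level, so the tested coefficient
# is named "groupCase".
cells$group <- relevel(cells$group, ref = "Control")

design <- model.matrix(
  ~ group + n_genes + pair + percent_mito + percent_ribo_p,
  data = cells
)
print(colnames(design))
```

The name of the group column depends on which level is the reference, which matters when specifying the coefficient to test.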
When I analyze the DEG results, I notice that some genes have NA in place of the coefficient but still have FDR < 1. For example:
On the other hand, a gene with FDR < 1 and a non-NA coefficient is the following:
I am not sure how to interpret genes with FDR < 0.01 (for example) but no coefficient.
Is this an issue, or how can it be interpreted? I was also reading #98 but I'm not sure how to adapt the reply to that issue to my data.
I created a small dataset of cells (n=132) in which this behavior appears, which I can share privately if necessary. Please let me know if you need any other information.
Thanks