Added test data for SAMMYseq data #836

Merged: 7 commits, Oct 17, 2023

41 changes: 13 additions & 28 deletions README.md
@@ -1,38 +1,23 @@
# ![nfcore/test-datasets](docs/images/test-datasets_logo.png)
Test data to be used for automated testing with the nf-core pipelines

> ⚠️ **Do not merge your test data to `master`! Each pipeline has a dedicated branch (and a special one for modules)**
# test-datasets: `sammyseq`

## Introduction
This branch contains data to be used for automated testing with the [nf-core/sammyseq](https://github.com/daisymut/sammyseq) pipeline.

nf-core is a collection of high quality Nextflow pipelines. This repository contains various files for CI and unit testing of nf-core pipelines and infrastructure.
## Content of this repository

The principle for nf-core test data is as small as possible, as large as necessary. Please see the [guidelines](https://nf-co.re/docs/contributing/test_data_guidelines) for more detailed information. Always ask for guidance on the [nf-core slack](https://nf-co.re/join) before adding new test data.
`testdata/CTRL004_S*_chr22only.fq.gz`: Human fibroblast single-end test data for the pipeline, sub-sampled to reads mapping to part of chr22.

## Documentation
## Minimal test dataset origin

nf-core/test-datasets comes with documentation in the `docs/` directory:
_H. sapiens_ fibroblast, 50 bp single-end 3-fraction SAMMY-seq sequences were obtained from:

01. [Add a new test dataset](https://github.com/nf-core/test-datasets/blob/master/docs/ADD_NEW_DATA.md)
02. [Use an existing test dataset](https://github.com/nf-core/test-datasets/blob/master/docs/USE_EXISTING_DATA.md)
> Sebestyén, E., Marullo, F., Lucini, F. et al. SAMMY-seq reveals early alteration of heterochromatin and deregulation of bivalent genes in Hutchinson-Gilford Progeria Syndrome. Nat Commun 11, 6274 (2020). https://doi.org/10.1038/s41467-020-20048-9. [Pubmed](https://pubmed.ncbi.nlm.nih.gov/33293552/) [GEO](https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE118633)

## Downloading test data
### Sampling information

Due to the large number of large files in this repository for each pipeline, we highly recommend cloning only the branches you will use.

```bash
git clone <url> --single-branch --branch <pipeline/modules/branch_name>
```

To subsequently clone other branches[^1]

```bash
git remote set-branches --add origin [remote-branch]
git fetch
```

## Support

For further information or help, don't hesitate to get in touch on our [Slack organisation](https://nf-co.re/join/slack) (a tool for instant messaging).

[^1]: From [stackoverflow](https://stackoverflow.com/a/60846265/11502856)
| GEO_sample | run_accession | read_count | SRA_experiment | sample_title |
| ---------- | ------------- | ---------- | -------------- | -------------------- |
| GSM3335763 | SRR7610706 | 78683296 | SRX4475555 | CTRL004 SAMMY-seq S2 |
| GSM3335764 | SRR7610707 | 60438514 | SRX4475554 | CTRL004 SAMMY-seq S3 |
| GSM3335765 | SRR7610708 | 54864540 | SRX4475553 | CTRL004 SAMMY-seq S4 |
20 changes: 10 additions & 10 deletions docs/ADD_NEW_DATA.md
@@ -2,13 +2,13 @@

Please fill in the appropriate checklist below (delete whatever is not relevant). These are the most common things requested when adding a new test dataset.

- [ ] Check [here](https://github.com/nf-core/test-datasets/branches/all) that there isn't already a branch containing data that could be used
  - If this is the case, follow the [documentation on how to use an existing test dataset](https://github.com/nf-core/test-datasets/blob/master/docs/USE_EXISTING_DATA.md)
- [ ] Fork the [nf-core/test-datasets repository](https://github.com/nf-core/test-datasets) to your GitHub account
- [ ] Create a new branch on your fork
- [ ] Check your proposed test data follows the [guidelines](https://nf-co.re/docs/contributing/test_data_guidelines)
- [ ] Add your test dataset
- [ ] If you clone it locally use `git clone <url> --branch <branch> --single-branch`
- [ ] Make a PR on a new branch with a relevant name
- [ ] Wait for the PR to be merged
- [ ] Use this newly created branch for your tests
- [ ] Check [here](https://github.com/nf-core/test-datasets/branches/all) that there isn't already a branch containing data that could be used
  - If this is the case, follow the [documentation on how to use an existing test dataset](https://github.com/nf-core/test-datasets/blob/master/docs/USE_EXISTING_DATA.md)
- [ ] Fork the [nf-core/test-datasets repository](https://github.com/nf-core/test-datasets) to your GitHub account
- [ ] Create a new branch on your fork
- [ ] Check your proposed test data follows the [guidelines](https://nf-co.re/docs/contributing/test_data_guidelines)
- [ ] Add your test dataset
- [ ] If you clone it locally use `git clone <url> --branch <branch> --single-branch`
- [ ] Make a PR on a new branch with a relevant name
- [ ] Wait for the PR to be merged
- [ ] Use this newly created branch for your tests
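
As a concrete instance of the clone step in the checklist above, here is a minimal sketch that only prints the single-branch clone command rather than running it. The helper name `clone_cmd` and the fork URL with `example-user` are hypothetical placeholders; the branch name `sammyseq` is assumed from the README title in this PR.

```bash
# Hypothetical helper: print (do not execute) the single-branch clone
# command for a given repository URL and branch name.
clone_cmd() {
  local url="$1"      # URL of your fork (placeholder below)
  local branch="$2"   # branch holding the pipeline's test data
  printf 'git clone %s --branch %s --single-branch\n' "$url" "$branch"
}

# Example: a fork owned by the placeholder account "example-user".
clone_cmd "https://github.com/example-user/test-datasets.git" "sammyseq"
```

Printing the command first makes it easy to check the URL and branch before fetching anything; pipe the output to `bash` or copy it into a terminal to actually clone.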
6 changes: 3 additions & 3 deletions docs/USE_EXISTING_DATA.md
@@ -2,6 +2,6 @@

Please fill in the appropriate checklist below (delete whatever is not relevant). These are the most common things requested when adding a new test dataset.

- [ ] Check [here](https://github.com/nf-core/test-datasets/branches/all) to find the branch corresponding to the test dataset you want to use
- [ ] Specify in the `conf/test.config` the path to the files from the test dataset
- [ ] Set up your CI tests following the nf-core best practices (cf [.github/workflows/ci.yml template](https://github.com/nf-core/tools/blob/dev/nf_core/pipeline-template/.github/workflows/ci.yml))
- [ ] Check [here](https://github.com/nf-core/test-datasets/branches/all) to find the branch corresponding to the test dataset you want to use
- [ ] Specify in the `conf/test.config` the path to the files from the test dataset
- [ ] Set up your CI tests following the nf-core best practices (cf [.github/workflows/ci.yml template](https://github.com/nf-core/tools/blob/dev/nf_core/pipeline-template/.github/workflows/ci.yml))
Binary file added testdata/CTRL004_S2_chr22_tinier.fq.gz
Binary file not shown.
Binary file added testdata/CTRL004_S3_chr22_tinier.fq.gz
Binary file not shown.
Binary file added testdata/CTRL004_S4_chr22_tinier.fq.gz
Binary file not shown.
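
For the `conf/test.config` step in `docs/USE_EXISTING_DATA.md`, the three files added here could be referenced by raw URL. The sketch below only builds and prints those URLs. It assumes the data lives on a branch named `sammyseq` of `nf-core/test-datasets` (inferred from the README title) and that GitHub's usual `raw.githubusercontent.com/<owner>/<repo>/<branch>/<path>` layout applies; `test_file_url` is a hypothetical helper, not part of any nf-core tool.

```bash
# Assumption: branch "sammyseq" on nf-core/test-datasets, served via
# GitHub's raw-content host.
base="https://raw.githubusercontent.com/nf-core/test-datasets/sammyseq/testdata"

# Hypothetical helper: print the raw URL for one SAMMY-seq fraction
# (S2, S3 or S4) of sample CTRL004.
test_file_url() {
  local fraction="$1"
  printf '%s/CTRL004_%s_chr22_tinier.fq.gz\n' "$base" "$fraction"
}

# List the URLs for the three fractions added in this PR.
for fraction in S2 S3 S4; do
  test_file_url "$fraction"
done
```

The printed URLs can then be pasted into a samplesheet or test profile; how the pipeline expects them to be laid out is not specified in this PR.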