Addressing review suggestions for Version 1.0.0 #9

Open
wants to merge 26 commits into
base: dev
26 commits
443ba0a
Implemented minor reviewer requests
Felix-Kummer Aug 7, 2024
3dca391
Added parameters to control publishing behavior for intermediate proc…
Felix-Kummer Aug 8, 2024
a5df347
Added descriptions to custom scripts.
Felix-Kummer Aug 8, 2024
2e208fc
Applied more minor changes requested by the reviewers
Felix-Kummer Aug 21, 2024
e5d4425
Reworked merge scripts
Felix-Kummer Aug 23, 2024
fd29a4b
Added authors and licenses to scripts
Felix-Kummer Aug 23, 2024
e042a1c
Removed unused sensors_level1 parameter
Felix-Kummer Aug 23, 2024
69fd493
Applied one-sentence-per-line-scheme in .md files
Felix-Kummer Aug 28, 2024
17db3ba
Removed unnecessarily ignored parameters in test profile
Felix-Kummer Aug 29, 2024
4a7518b
Improved parameter references in docs
Felix-Kummer Aug 29, 2024
4949334
Replaced retry strategies with error_retry label
Felix-Kummer Sep 6, 2024
d7ba83b
Adopted nf-core pseudo-standard directory structure for local modules
Felix-Kummer Sep 18, 2024
64c909a
Updated nf-core untar module
Felix-Kummer Sep 18, 2024
5e006c0
Enriched schema with patterns for parameters
Felix-Kummer Sep 18, 2024
980347f
Improved output documentation for tss files
Felix-Kummer Sep 18, 2024
e036cb3
Added date format to date parameters
Felix-Kummer Sep 18, 2024
7564dd6
Removed deprecated docker parameter
Felix-Kummer Sep 18, 2024
cd5fb92
Added tags to all modules
Felix-Kummer Sep 19, 2024
f12f0c9
Removed restrictions on FORCE thread numbers and corresponding parameter
Felix-Kummer Sep 26, 2024
7f7805b
Replaced usage of params in modules and subworkflows with channels
Felix-Kummer Sep 26, 2024
fd09671
Added more output channels to the top level workflow
Felix-Kummer Sep 26, 2024
a4d2e7f
Added automatic tarball extraction for input parameters
Felix-Kummer Sep 28, 2024
99ff909
Changed UNTAR container to its default container image
Felix-Kummer Oct 7, 2024
605f154
Changed force-pyramid to process a single file instead of groups of f…
Felix-Kummer Oct 11, 2024
ac70e35
Merge branch 'nf-core:dev' into dev
Felix-Kummer Oct 14, 2024
b6c2e63
Merge remote-tracking branch 'origin/dev' into dev
Felix-Kummer Oct 18, 2024
21 changes: 13 additions & 8 deletions README.md
Expand Up @@ -19,7 +19,8 @@

## Introduction

**nf-core/rangeland** is a geographical best-practice analysis pipeline for remotely sensed imagery. The pipeline processes satellite imagery alongside auxiliary data in multiple steps to arrive at a set of trend files related to land-cover changes. The main pipeline steps are:
**nf-core/rangeland** is a geographical best-practice analysis pipeline for remotely sensed imagery.
The pipeline processes satellite imagery alongside auxiliary data in multiple steps to arrive at a set of trend files related to land-cover changes. The main pipeline steps are:

1. Read satellite imagery, digital elevation model, endmember definition, water vapor database and area of interest definition
2. Generate allow list and analysis mask to determine which pixels from the satellite data can be used
Expand All @@ -28,20 +29,22 @@
5. Time series analyses to obtain trends in vegetation dynamics
6. Create mosaic and pyramid visualizations of the results

7. Read QC ([`FastQC`](https://www.bioinformatics.babraham.ac.uk/projects/fastqc/))
8. Present QC for raw reads ([`MultiQC`](http://multiqc.info/))
7. Present QC results ([`MultiQC`](http://multiqc.info/))

## Usage

> [!NOTE]
> If you are new to Nextflow and nf-core, please refer to [this page](https://nf-co.re/docs/usage/installation) on how to set-up Nextflow. Make sure to [test your setup](https://nf-co.re/docs/usage/introduction#how-to-run-a-pipeline) with `-profile test` before running the workflow on actual data.
> If you are new to Nextflow and nf-core, please refer to [this page](https://nf-co.re/docs/usage/installation) on how to set-up Nextflow.
> Make sure to [test your setup](https://nf-co.re/docs/usage/introduction#how-to-run-a-pipeline) with `-profile test` before running the workflow on actual data.

To run the pipeline on real data, input data needs to be acquired. Concretely, satellite imagery, water vapor data, a digital elevation model, endmember definitions, a datacube specification, and an area-of-interest specification are required. Please refer to the [usage documentation](https://nf-co.re/rangeland/usage) for details on the input structure.
To run the pipeline on real data, input data needs to be acquired.
Concretely, satellite imagery, water vapor data, a digital elevation model, endmember definitions, a datacube specification, and an area-of-interest specification are required.
Please refer to the [usage documentation](https://nf-co.re/rangeland/usage) for details on the input structure.

Now, you can run the pipeline using:

```bash
nextflow run nf-core/rangeland/main.nf \
nextflow run nf-core/rangeland \
-profile <docker/singularity/.../institute> \
--input <SATELLITE IMAGES> \
--dem <DIGITAL ELEVATION MODEL> \
Expand Down Expand Up @@ -72,7 +75,8 @@ The rangeland workflow was originally written by:

The original workflow can be found on [github](https://github.com/CRC-FONDA/FORCE2NXF-Rangeland).

Transformation to nf-core/rangeland was conducted by [Felix Kummer](https://github.com/Felix-Kummer). nf-core alignment started on the [nf-core branch of the original repository](https://github.com/CRC-FONDA/FORCE2NXF-Rangeland/tree/nf-core).
Transformation to nf-core/rangeland was conducted by [Felix Kummer](https://github.com/Felix-Kummer).
nf-core alignment started on the [nf-core branch of the original repository](https://github.com/CRC-FONDA/FORCE2NXF-Rangeland/tree/nf-core).

We thank the following people for their extensive assistance in the development of this pipeline:

Expand Down Expand Up @@ -114,7 +118,8 @@ You can cite the `nf-core` publication as follows:
>
> _Nat Biotechnol._ 2020 Feb 13. doi: [10.1038/s41587-020-0439-x](https://dx.doi.org/10.1038/s41587-020-0439-x).

This pipeline is based on the publication listed below. The publication can be cited as follows:
This pipeline is based on the publication listed below.
The publication can be cited as follows:

> **FORCE on Nextflow: Scalable Analysis of Earth Observation Data on Commodity Clusters**
>
Expand Down
70 changes: 36 additions & 34 deletions bin/merge_boa.r
@@ -1,44 +1,46 @@
#!/usr/bin/env Rscript

args = commandArgs(trailingOnly=TRUE)
## Originally written by Felix Kummer and released under the MIT license.
## See git repository (https://github.com/nf-core/rangeland) for full license text.

# Script for merging bottom of atmosphere (boa) .tif raster files.
# This can improve the performance of downstream tasks.

if (length(args) < 3) {
stop("\nthis program needs at least 3 inputs\n1: output filename\n2-*: input files", call.=FALSE)
}

fout <- args[1]
finp <- args[2:length(args)]
nf <- length(finp)

require(raster)


img <- brick(finp[1])
nc <- ncell(img)
nb <- nbands(img)
require(terra)

args <- commandArgs(trailingOnly = TRUE)

sum <- matrix(0, nc, nb)
num <- matrix(0, nc, nb)

for (i in 1:nf){

data <- brick(finp[i])[]

num <- num + !is.na(data)

data[is.na(data)] <- 0
sum <- sum + data

if (length(args) < 3) {
stop("\nError: this program needs at least 3 inputs\n1: output filename\n2-*: input files", call.=FALSE)
}

mean <- sum/num
img[] <- mean

fout <- args[1]
finp <- args[2:length(args)]

writeRaster(img, filename = fout, format = "GTiff", datatype = "INT2S",
options = c("INTERLEAVE=BAND", "COMPRESS=LZW", "PREDICTOR=2",
"NUM_THREADS=ALL_CPUS", "BIGTIFF=YES",
sprintf("BLOCKXSIZE=%s", img@file@blockcols[1]),
sprintf("BLOCKYSIZE=%s", img@file@blockrows[1])))
# Load input rasters
rasters <- lapply(finp, rast)

# Calculate the sum of non-NA values across all rasters
sum_rasters <- Reduce("+", lapply(rasters, function(x) {
x[is.na(x)] <- 0
return(x)
}))

# Calculate the number of non-NA values for each cell
count_rasters <- Reduce("+", lapply(rasters, function(x) {
return(!is.na(x))
}))

# Calculate the mean raster
mean_raster <- sum_rasters / count_rasters

# Write the mean raster
writeRaster(mean_raster,
filename = fout,
datatype = "INT2S",
filetype = "GTiff",
gdal = c("COMPRESS=LZW", "PREDICTOR=2",
"NUM_THREADS=ALL_CPUS", "BIGTIFF=YES",
sprintf("BLOCKXSIZE=%s", ncol(mean_raster)),
sprintf("BLOCKYSIZE=%s", nrow(mean_raster))))
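
The reworked merge_boa.r averages each cell over only those inputs that hold a valid value there: NA values are zeroed before summing, and the sum is divided by the per-cell count of non-NA values. A minimal sketch of that logic, independent of terra and written in plain Python, might look as follows. The function name `nan_aware_mean` and the use of `None` to stand in for NA are illustrative assumptions, not part of the pipeline.

```python
from typing import List, Optional

Layer = List[Optional[float]]


def nan_aware_mean(layers: List[Layer]) -> Layer:
    """Per-cell mean over equally sized layers, ignoring missing values.

    Hypothetical helper: each layer is a flat list of cell values and
    None stands in for NA. Cells that are missing in every layer stay None.
    """
    if not layers:
        raise ValueError("at least one layer is required")
    n_cells = len(layers[0])
    sums = [0.0] * n_cells   # running sum of valid values per cell
    counts = [0] * n_cells   # number of valid values per cell
    for layer in layers:
        if len(layer) != n_cells:
            raise ValueError("all layers must have the same number of cells")
        for i, value in enumerate(layer):
            if value is not None:
                sums[i] += value
                counts[i] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]


print(nan_aware_mean([[1.0, None, None], [3.0, 4.0, None]]))
# prints [2.0, 4.0, None]
```

The first cell is averaged over two inputs, the second over one, and the third has no valid input and therefore stays missing.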
59 changes: 31 additions & 28 deletions bin/merge_qai.r
Original file line number Diff line number Diff line change
@@ -1,38 +1,41 @@
#!/usr/bin/env Rscript

args = commandArgs(trailingOnly=TRUE)
## Originally written by Felix Kummer and released under the MIT license.
## See git repository (https://github.com/nf-core/rangeland) for full license text.

# Script for merging quality information (qai) .tif raster files.
# This can improve the performance of downstream tasks.

if (length(args) < 3) {
stop("\nthis program needs at least 3 inputs\n1: output filename\n2-*: input files", call.=FALSE)
}

fout <- args[1]
finp <- args[2:length(args)]
nf <- length(finp)

require(raster)


img <- raster(finp[1])
nc <- ncell(img)
require(terra)

args <- commandArgs(trailingOnly = TRUE)

last <- rep(1, nc)

for (i in 1:nf){

data <- raster(finp[i])[]

last[!is.na(data)] <- data[!is.na(data)]

if (length(args) < 3) {
stop("\nError: this program needs at least 3 inputs\n1: output filename\n2-*: input files", call.=FALSE)
}

img[] <- last

fout <- args[1]
finp <- args[2:length(args)]

writeRaster(img, filename = fout, format = "GTiff", datatype = "INT2S",
options = c("INTERLEAVE=BAND", "COMPRESS=LZW", "PREDICTOR=2",
"NUM_THREADS=ALL_CPUS", "BIGTIFF=YES",
sprintf("BLOCKXSIZE=%s", img@file@blockcols[1]),
sprintf("BLOCKYSIZE=%s", img@file@blockrows[1])))
# load raster files into single SpatRaster
rasters <- rast(finp)

# Merge rasters by maintaining the last non-NA value
merged_raster <- app(rasters, function(x) {
non_na_values <- na.omit(x)
if (length(non_na_values) == 0) {
return(1)
}
return(tail(non_na_values, 1)[1])
})

# Write merged raster
writeRaster(merged_raster,
filename = fout,
filetype = "GTiff",
datatype = "INT2S",
gdal = c("INTERLEAVE=BAND", "COMPRESS=LZW", "PREDICTOR=2",
"NUM_THREADS=ALL_CPUS", "BIGTIFF=YES",
sprintf("BLOCKXSIZE=%s", ncol(merged_raster)),
sprintf("BLOCKYSIZE=%s", nrow(merged_raster))))
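
Both the old loop and the new `app()` call in merge_qai.r keep, for every cell, the value from the last input file that is not NA, and fall back to 1 when no input has a value. A minimal sketch of that rule in plain Python is shown below. The name `merge_last_valid` and the use of `None` for NA are illustrative assumptions.

```python
from typing import List, Optional


def merge_last_valid(layers: List[List[Optional[int]]], fallback: int = 1) -> List[int]:
    """Per-cell merge that keeps the last non-missing value.

    Hypothetical helper: layers are flat lists of cell values in input
    order, None stands in for NA, and `fallback` mirrors the value 1 that
    the script uses for cells without any valid input.
    """
    if not layers:
        raise ValueError("at least one layer is required")
    merged = [fallback] * len(layers[0])
    for layer in layers:
        if len(layer) != len(merged):
            raise ValueError("all layers must have the same number of cells")
        for i, value in enumerate(layer):
            if value is not None:
                merged[i] = value  # later layers overwrite earlier ones
    return merged


print(merge_last_valid([[8, None, None], [None, 5, None], [2, None, None]]))
# prints [2, 5, 1]
```

Input order matters here: the third layer overrides the first in cell one, while the last cell receives the fallback because no layer provides a value.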
20 changes: 5 additions & 15 deletions bin/test.R
Original file line number Diff line number Diff line change
@@ -1,5 +1,10 @@
#!/usr/bin/env Rscript

## Originally written by David Frantz and Felix Kummer and released under the MIT license.
## See git repository (https://github.com/nf-core/rangeland) for full license text.

# Script to verify pipeline results from test and test_full profiles.

args = commandArgs(trailingOnly=TRUE)


Expand Down Expand Up @@ -116,21 +121,6 @@ peak_year_of_change <- peak_rast["YEAR-OF-CHANGE"]



# FOR REFERENCE: SAVE RASTERS
#######################################################################

#writeRaster(woody_cover_changes, "woody_cover_chg_ref.tif")
#writeRaster(woody_cover_year_of_change, "woody_cover_yoc_ref.tif")

#writeRaster(herbaceous_cover_changes, "herbaceous_cover_chg_ref.tif")
#writeRaster(herbaceous_cover_year_of_change, "herbaceous_cover_yoc_ref.tif")

#writeRaster(peak_changes, "peak_chg_ref.tif")
#writeRaster(peak_year_of_change, "peak_yoc_ref.tif")




# COMPARE TESTRUN WITH REFERENCE EXECUTION
#######################################################################
failure <- FALSE
Expand Down
22 changes: 11 additions & 11 deletions conf/modules.config
Original file line number Diff line number Diff line change
Expand Up @@ -35,8 +35,6 @@ process {
}

withName: "FORCE_PREPROCESS" {
errorStrategy = 'retry'
maxRetries = 5
publishDir = [
[
path: { "${params.outdir}/preprocess/${task.tag}/logs" },
Expand All @@ -45,16 +43,15 @@ process {
],
[
path: { "${params.outdir}/preprocess/${task.tag}" },
mode: 'symlink',
pattern: 'level2_ard/**/*'
mode: params.publish_dir_mode,
pattern: 'level2_ard/**/*',
saveAs: { params.save_ard ? it : null }
]
]
}


withName: "HIGHER_LEVEL_CONFIG" {
errorStrategy = 'retry'
maxRetries = 5
publishDir = [
path: { "${params.outdir}/higher-level/${task.tag}/param_files" },
mode: params.publish_dir_mode,
Expand All @@ -65,9 +62,9 @@ process {
withName: "FORCE_HIGHER_LEVEL" {
publishDir = [
path: { "${params.outdir}/higher-level/${task.tag}" },
mode: 'symlink',
mode: params.publish_dir_mode,
pattern: 'trend/*.tif',
saveAs: { "trend_files/${it.tokenize('/')[-1]}" }
saveAs: { params.save_ard ? "trend_files/${it.tokenize('/')[-1]}" : null }
]
}

Expand Down Expand Up @@ -97,15 +94,18 @@ process {
}

withName: "CHECK_RESULTS" {
errorStrategy = { task.exitStatus == 143 ? 'retry' : 'ignore' }
publishDir = [
enabled: false
]
}

withName: "CHECK_RESULTS_FULL" {
publishDir = [
enabled: false
]
}

withName: "PREPROCESS_CONFIG" {
errorStrategy = 'retry'
maxRetries = 5
publishDir = [
path: { "${params.outdir}/preprocess/${task.tag}/param_files" },
mode: params.publish_dir_mode,
Expand Down
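
The `saveAs` closures introduced in this file rely on a Nextflow convention: when the closure returns `null`, the file is not published. With `params.save_ard` unset, the matching outputs are therefore skipped; otherwise the file keeps its name (FORCE_PREPROCESS) or is moved under `trend_files/` (FORCE_HIGHER_LEVEL). The decision can be sketched in plain Python as below. The function `publish_name` and its arguments are hypothetical and only mirror the closures shown in the diff.

```python
from typing import Optional


def publish_name(filename: str, save_enabled: bool, subdir: Optional[str] = None) -> Optional[str]:
    """Mimic the saveAs closures from conf/modules.config.

    Hypothetical helper: returns None (do not publish) when saving is
    disabled; otherwise returns the unchanged name, or the base name
    placed under `subdir` when one is given.
    """
    if not save_enabled:
        return None
    if subdir is None:
        return filename
    # Like Groovy's tokenize('/'), ignore empty path components.
    parts = [p for p in filename.split("/") if p]
    return f"{subdir}/{parts[-1]}"


print(publish_name("trend/a.tif", True, "trend_files"))   # prints trend_files/a.tif
print(publish_name("trend/a.tif", False, "trend_files"))  # prints None
```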
8 changes: 1 addition & 7 deletions conf/test.config
Original file line number Diff line number Diff line change
Expand Up @@ -27,10 +27,6 @@ params {
dem = 'https://github.com/nf-core/test-datasets/raw/rangeland/dem/dem.tar.gz'
wvdb = 'https://github.com/nf-core/test-datasets/raw/rangeland/wvp/wvdb.tar.gz'

input_tar = true
dem_tar = true
wvdb_tar = true

data_cube = 'https://github.com/nf-core/test-datasets/raw/rangeland/datacube/datacube-definition.prj'
aoi = 'https://github.com/nf-core/test-datasets/raw/rangeland/vector/aoi.gpkg'
endmember = 'https://github.com/nf-core/test-datasets/raw/rangeland/endmember/hostert-2003.txt'
Expand All @@ -39,7 +35,6 @@ params {
start_date = '1987-01-01'
end_date = '1989-12-31'

sensors_level1 = 'LT04,LT05'
sensors_level2 = 'LND04 LND05'

// Reference data
Expand All @@ -58,12 +53,11 @@ params {
// enable mosaic for result checking
mosaic_visualization = true

validationSchemaIgnoreParams = "peak_yoc_ref,peak_change_ref,herbaceous_yoc_ref,herbaceous_change_ref,woody_yoc_ref,woody_change_ref,config_profile_description,config_profile_name"
validationSchemaIgnoreParams = "peak_yoc_ref,peak_change_ref,herbaceous_yoc_ref,herbaceous_change_ref,woody_yoc_ref,woody_change_ref"
}

process {
withName: "UNTAR_*" {
container = 'docker.io/ubuntu:23.10'
ext.args2 = "--strip-components=0"
}
}
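
The two test profiles pass different `--strip-components` values to the UNTAR processes (0 in conf/test.config, 1 in conf/test_full.config). This tar option removes the given number of leading path components from each archive member on extraction, and members with no remaining name are skipped. A minimal sketch of the effect on member paths, in plain Python, follows. The function `strip_components` and the example paths are illustrative assumptions and do not describe the actual archive layout.

```python
from typing import Optional


def strip_components(member_path: str, n: int) -> Optional[str]:
    """Approximate the effect of tar's --strip-components=n on one member.

    Hypothetical helper: returns the extracted relative path, or None when
    the member has n or fewer components and would therefore be skipped.
    """
    parts = [p for p in member_path.split("/") if p]
    if len(parts) <= n:
        return None
    return "/".join(parts[n:])


print(strip_components("dem/tile/a.tif", 0))  # prints dem/tile/a.tif
print(strip_components("dem/tile/a.tif", 1))  # prints tile/a.tif
print(strip_components("dem", 1))             # prints None
```

With a value of 0 the archive's top-level directory is preserved; with 1 it is dropped and its contents land directly in the extraction directory.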
6 changes: 0 additions & 6 deletions conf/test_full.config
Original file line number Diff line number Diff line change
Expand Up @@ -19,10 +19,6 @@ params {
dem = 's3://ngi-igenomes/test-data/rangeland/dem.tar'
wvdb = 's3://ngi-igenomes/test-data/rangeland/wvdb.tar'

input_tar = true
dem_tar = true
wvdb_tar = true

data_cube = 's3://ngi-igenomes/test-data/rangeland/datacube-definition.prj'
aoi = 's3://ngi-igenomes/test-data/rangeland/aoi.gpkg'
endmember = 's3://ngi-igenomes/test-data/rangeland/hostert-2003.txt'
Expand All @@ -32,7 +28,6 @@ params {
start_date = '1986-01-01'
end_date = '1989-12-31'

sensors_level1 = 'LT04,LT05'
sensors_level2 = 'LND04 LND05'

// enable time series stack output
Expand All @@ -50,7 +45,6 @@ params {

process {
withName: "UNTAR_*" {
container = 'docker.io/ubuntu:23.10'
ext.args2 = "--strip-components=1"
}
withName: "UNTAR_REF"{
Expand Down