Commit

Merge branch 'master' into topic-channel

bentsherman committed Nov 9, 2023
2 parents 358224f + d49f02a commit 64c4eed
Showing 89 changed files with 676 additions and 923 deletions.
27 changes: 24 additions & 3 deletions docs/config.md
@@ -789,7 +789,13 @@ The following settings are available:
: Set the minimum CPU Platform, e.g. `'Intel Skylake'`. See [Specifying a minimum CPU Platform for VM instances](https://cloud.google.com/compute/docs/instances/specify-min-cpu-platform#specifications) (default: none).

`google.batch.network`
: Set network name to attach the VM's network interface to. The value will be prefixed with `global/networks/` unless it contains a `/`, in which case it is assumed to be a fully specified network resource URL. If unspecified, the global default network is used.
: The URL of an existing network resource to which the VM will be attached.

You can specify the network as a full or partial URL. For example, the following are all valid URLs:

- https://www.googleapis.com/compute/v1/projects/{project}/global/networks/{network}
- projects/{project}/global/networks/{network}
- global/networks/{network}

`google.batch.serviceAccountEmail`
: Define the Google service account email to use for the pipeline execution. If not specified, the default Compute Engine service account for the project will be used.
@@ -798,7 +804,13 @@ The following settings are available:
: When `true`, enables the use of *spot* virtual machines (default: `false`).

`google.batch.subnetwork`
: Define the name of the subnetwork to attach the instance to must be specified here, when the specified network is configured for custom subnet creation. The value is prefixed with `regions/subnetworks/` unless it contains a `/`, in which case it is assumed to be a fully specified subnetwork resource URL.
: The URL of an existing subnetwork resource in the network to which the VM will be attached.

You can specify the subnetwork as a full or partial URL. For example, the following are all valid URLs:

- https://www.googleapis.com/compute/v1/projects/{project}/regions/{region}/subnetworks/{subnetwork}
- projects/{project}/regions/{region}/subnetworks/{subnetwork}
- regions/{region}/subnetworks/{subnetwork}
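
For example, a custom network and subnetwork could be configured as follows — a minimal sketch, where the project, region, and resource names are illustrative:

```groovy
google {
    project = 'my-project'
    location = 'us-central1'
    // partial URLs, resolved against the project above (names are hypothetical)
    batch.network = 'projects/my-project/global/networks/my-vpc'
    batch.subnetwork = 'regions/us-central1/subnetworks/my-subnet'
}
```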

`google.batch.usePrivateAddress`
: When `true` the VM will NOT be provided with a public IP address, and will only have an internal IP. If this option is enabled, the associated job can only load docker images from Google Container Registry, and the job executable cannot use external services other than Google APIs (default: `false`).
@@ -1335,6 +1347,12 @@ The following settings are available:
`singularity.noHttps`
: Pull the Singularity image with the HTTP protocol (default: `false`).

`singularity.oci`
: :::{versionadded} 23.11.0-edge
:::
: Enable OCI mode, which allows the use of native OCI-compatible containers with Singularity. See the [Singularity documentation](https://docs.sylabs.io/guides/4.0/user-guide/oci_runtime.html#oci-mode) for more details and requirements (default: `false`).
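
  A minimal configuration sketch to enable it:

  ```groovy
  singularity {
      enabled = true
      oci = true
  }
  ```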

`singularity.pullTimeout`
: The amount of time the Singularity pull can last, exceeding which the process is terminated (default: `20 min`).

@@ -1630,10 +1648,13 @@ The following environment variables control the configuration of the Nextflow ru
: Allows setting Java VM options. This is similar to `NXF_OPTS`, however it is only applied to the JVM running Nextflow and not to any Java pre-launching commands.

`NXF_OFFLINE`
: When `true` disables the project automatic download and update from remote repositories (default: `false`).
: When `true` prevents Nextflow from automatically downloading and updating remote project repositories (default: `false`).
: :::{versionchanged} 23.09.0-edge
This option also disables the automatic version check (see `NXF_DISABLE_CHECK_LATEST`).
:::
: :::{versionchanged} 23.11.0-edge
This option also prevents plugins from being downloaded. When offline mode is enabled, plugin versions must be specified explicitly, otherwise Nextflow will fail.
:::
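
: For example, when running with `NXF_OFFLINE=true`, plugin versions should be pinned explicitly in the configuration — a minimal sketch, where the plugin name and version are illustrative:

  ```groovy
  plugins {
      id 'nf-amazon@2.1.4'
  }
  ```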

`NXF_OPTS`
: Provides extra options for the Java and Nextflow runtime. It must be a space-separated list of `-Dkey[=value]` properties.
4 changes: 4 additions & 0 deletions docs/executor.md
@@ -328,6 +328,10 @@ The `local` executor is used by default. It runs the pipeline processes on the c

The `local` executor is useful for developing and testing a pipeline script on your computer, before switching to a cluster or cloud environment with production data.

:::{note}
While the `local` executor limits the number of concurrent tasks based on requested vs available resources, it does not enforce task resource requests. In other words, it is possible for a local task to use more CPUs and memory than it requested, in which case it may starve other tasks. An exception to this behavior is when using {ref}`container-docker` or {ref}`container-podman` containers, in which case the resource requests are enforced by the container runtime.
:::
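
For example, a task with the following resource requests is scheduled against the locally available CPUs and memory, but the command itself is free to exceed them unless a container runtime enforces the limits — a sketch, where the tool invocation is hypothetical:

```groovy
process ALIGN {
    // requests used for local scheduling, not enforced at runtime
    cpus 4
    memory '8 GB'

    script:
    """
    my_aligner --threads ${task.cpus} reads.fq > out.bam
    """
}
```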

(lsf-executor)=

## LSF
5 changes: 4 additions & 1 deletion docs/google.md
@@ -175,11 +175,14 @@ google {
}
```

:::{versionadded} 23.11.0-edge
:::

Since this type of virtual machine can be retired by the provider before job completion, it is advisable to add the following retry strategy to your config file, so that Nextflow automatically re-executes a job if its virtual machine was terminated preemptively:

```groovy
process {
    errorStrategy = { task.exitStatus==14 ? 'retry' : 'terminate' }
    errorStrategy = { task.exitStatus==50001 ? 'retry' : 'terminate' }
    maxRetries = 5
}
```
8 changes: 8 additions & 0 deletions docs/module.md
@@ -270,3 +270,11 @@ Those scripts will be made accessible like any other command in the task environ
:::{note}
This feature requires the use of a local or shared file system for the pipeline work directory, or {ref}`wave-page` when using cloud-based executors.
:::

## Sharing modules

Modules are designed to be easy to share and re-use across different pipelines, which helps eliminate duplicate work and spread improvements throughout the community. While Nextflow does not provide an explicit mechanism for sharing modules, there are several ways to do it:

- Simply copy the module files into your pipeline repository
- Use [Git submodules](https://git-scm.com/book/en/v2/Git-Tools-Submodules) to fetch modules from other Git repositories without maintaining a separate copy
- Use the [nf-core](https://nf-co.re/tools#modules) CLI to install and update modules with a standard approach used by the nf-core community
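
However a shared module is obtained, it can be included like any local module. For example — a sketch, where the module name and path are illustrative:

```groovy
include { FASTQC } from './modules/fastqc/main'

workflow {
    FASTQC(Channel.fromPath('*.fastq.gz'))
}
```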
27 changes: 27 additions & 0 deletions docs/plugins.md
@@ -249,6 +249,33 @@ channel
The above snippet is based on the [nf-sqldb](https://github.com/nextflow-io/nf-sqldb) plugin. The `fromQuery` factory
is included under the alias `fromTable`.

### Process directives

Plugins that implement a [custom executor](#executors) will likely need to access {ref}`process directives <process-directives>` that affect task execution. When an executor receives a task, the process directives can be accessed through that task's configuration. As a best practice, a custom executor should support all process directives that have executor-specific behavior and are relevant to that executor.

Nextflow does not provide the ability to define custom process directives in a plugin. Instead, you can use the {ref}`process-ext` directive to provide custom process settings to your executor. Try to use specific names that are not likely to conflict with other plugins or existing pipelines.

Here is an example of a custom executor that uses existing process directives as well as a custom setting through the `ext` directive:

```groovy
class MyExecutor extends Executor {

    @Override
    TaskHandler createTaskHandler(TaskRun task) {
        final cpus = task.config.cpus
        final memory = task.config.memory
        final myOption = task.config.ext.myOption
        println "This task is configured with cpus=${cpus}, memory=${memory}, myOption=${myOption}"
        // ...
    }

    // ...
}
```

### Trace observers

A *trace observer* in Nextflow is an entity that can listen for and react to workflow events, such as when a workflow starts, a task completes, a file is published, etc. Several components in Nextflow, such as the execution report and the DAG visualization, are implemented as trace observers.
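
As a rough sketch, an observer implements the `TraceObserver` interface from the Nextflow source and overrides the events it cares about — the method names below follow that interface at the time of writing, but check the source for the exact signatures:

```groovy
import nextflow.Session
import nextflow.trace.TraceObserver

class MyObserver implements TraceObserver {

    @Override
    void onFlowCreate(Session session) {
        // invoked when the workflow execution starts
        println "Workflow is starting!"
    }

    @Override
    void onFlowComplete() {
        // invoked when the workflow execution completes
        println "Workflow is done!"
    }
}
```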
64 changes: 62 additions & 2 deletions docs/process.md
@@ -1177,6 +1177,52 @@ output:

In this example, the process is normally expected to produce an `output.txt` file, but in cases where the file is legitimately missing, the process does not fail. The output channel will only contain values for those tasks that produce `output.txt`.

(process-multiple-outputs)=

### Multiple outputs

When a process declares multiple outputs, each output can be accessed by index. The following example prints the second process output (indexes start at zero):

```groovy
process FOO {
    output:
    path 'bye_file.txt'
    path 'hi_file.txt'

    """
    echo "bye" > bye_file.txt
    echo "hi" > hi_file.txt
    """
}

workflow {
    FOO()
    FOO.out[1].view()
}
```

You can also use the `emit` option to assign a name to each output and access them by name:

```groovy
process FOO {
    output:
    path 'bye_file.txt', emit: bye_file
    path 'hi_file.txt', emit: hi_file

    """
    echo "bye" > bye_file.txt
    echo "hi" > hi_file.txt
    """
}

workflow {
    FOO()
    FOO.out.hi_file.view()
}
```

See {ref}`workflow-process-invocation` for more details.

## When

The `when` block allows you to define a condition that must be satisfied in order to execute the process. The condition can be any expression that returns a boolean value.
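
For example, the following process runs only for inputs that satisfy a hypothetical naming convention:

```groovy
process FOO {
    input:
    val sample_id

    when:
    // execute the task only for samples whose name starts with 'case_'
    sample_id.startsWith('case_')

    script:
    """
    echo "processing ${sample_id}"
    """
}
```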
@@ -1653,15 +1699,26 @@ process mapping {
    tuple val(sampleId), path(reads)

    """
    STAR --genomeDir $genome --readFilesIn $reads
    STAR --genomeDir $genome --readFilesIn $reads ${task.ext.args ?: ''}
    """
}
```

In the above example, the process uses a container whose version is controlled by the `ext.version` property. This can be defined in the `nextflow.config` file as shown below:
In the above example, the process container version is controlled by `ext.version`, and the script supports additional command line arguments through `ext.args`.

The `ext` directive can be set in the process definition:

```groovy
process mapping {
    ext version: '2.5.3', args: '--foo --bar'
}
```

Or in the Nextflow configuration:

```groovy
process.ext.version = '2.5.3'
process.ext.args = '--foo --bar'
```

(process-fair)=
@@ -2075,6 +2132,9 @@ The following options are available:
`runAsUser: '<uid>'`
: Specifies the user ID with which to run the container. Shortcut for the `securityContext` option.

`schedulerName: '<name>'`
: Specifies which [scheduler](https://kubernetes.io/docs/tasks/extend-kubernetes/configure-multiple-schedulers/#specify-schedulers-for-pods) is used to schedule the container.

`secret: '<secret>/<key>', mountPath: '</absolute/path>'`
: *Can be specified multiple times*
: Mounts a [Secret](https://kubernetes.io/docs/concepts/configuration/secret/) with name and optional key to the given path. If the key is omitted, the path is interpreted as a directory and all entries in the `Secret` are exposed in that path.
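
For example, these options can be combined in a process definition as follows — a sketch, where the secret, mount path, and scheduler names are illustrative:

```groovy
process FOO {
    // mount the 'password' key of Secret 'my-secret' under /etc/secrets
    pod secret: 'my-secret/password', mountPath: '/etc/secrets'
    // schedule the pod with a custom scheduler
    pod schedulerName: 'my-scheduler'

    script:
    """
    cat /etc/secrets/password
    """
}
```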
4 changes: 4 additions & 0 deletions docs/workflow.md
@@ -41,6 +41,8 @@ The `main:` label can be omitted if there are no `take:` or `emit:` blocks.
Workflows were introduced in DSL2. If you are still using DSL1, see the {ref}`dsl1-page` page to learn how to migrate your Nextflow pipelines to DSL2.
:::

(workflow-process-invocation)=

## Process invocation

A process can be invoked like a function in a workflow definition, passing the expected input channels like function arguments. For example:
Expand Down Expand Up @@ -142,6 +144,8 @@ workflow {
}
```

See {ref}`process-multiple-outputs` for more details.

### Process named stdout

The `emit` option can also be used to name a `stdout` output:
5 changes: 0 additions & 5 deletions modules/nextflow/src/main/groovy/nextflow/NF.groovy
@@ -50,11 +50,6 @@ class NF {
NextflowMeta.instance.isDsl2()
}

@Deprecated
static boolean isDsl2Final() {
NextflowMeta.instance.isDsl2Final()
}

static Binding getBinding() {
isDsl2() ? ExecutionStack.binding() : session().getBinding()
}
2 changes: 0 additions & 2 deletions modules/nextflow/src/main/groovy/nextflow/Nextflow.groovy
@@ -407,6 +407,4 @@ class Nextflow {
*/
static Closure<TokenMultiMapDef> multiMapCriteria(Closure<TokenBranchDef> closure) { closure }

@Deprecated
static Closure<TokenMultiMapDef> forkCriteria(Closure<TokenBranchDef> closure) { closure }
}
16 changes: 1 addition & 15 deletions modules/nextflow/src/main/groovy/nextflow/NextflowMeta.groovy
@@ -112,12 +112,9 @@ class NextflowMeta {
result.version = version.toString()
result.build = build
result.timestamp = parseDateStr(timestamp)
if( isDsl2Final() ) {
if( isDsl2() ) {
result.enable = featuresMap()
}
else if( isDsl2() ) {
result.preview = featuresMap()
}
return result
}

@@ -135,17 +132,6 @@
enable.dsl == 2f
}

/**
* As of the removal of DSL2 preview mode, the semantic of this method
* is identical to {@link #isDsl2()}.
* @return
* {@code true} when the workflow script uses DSL2 syntax, {@code false} otherwise.
*/
@Deprecated
boolean isDsl2Final() {
enable.dsl == 2f
}

void enableDsl2() {
this.enable.dsl = 2f
}
@@ -38,7 +38,6 @@ import nextflow.script.TokenVar
import org.codehaus.groovy.ast.ASTNode
import org.codehaus.groovy.ast.ClassCodeVisitorSupport
import org.codehaus.groovy.ast.ClassNode
import org.codehaus.groovy.ast.ConstructorNode
import org.codehaus.groovy.ast.MethodNode
import org.codehaus.groovy.ast.Parameter
import org.codehaus.groovy.ast.VariableScope
@@ -79,8 +78,6 @@ import org.codehaus.groovy.transform.GroovyASTTransformation
@GroovyASTTransformation(phase = CompilePhase.CONVERSION)
class NextflowDSLImpl implements ASTTransformation {

@Deprecated final static private String WORKFLOW_GET = 'get'
@Deprecated final static private String WORKFLOW_PUBLISH = 'publish'
final static private String WORKFLOW_TAKE = 'take'
final static private String WORKFLOW_EMIT = 'emit'
final static private String WORKFLOW_MAIN = 'main'
@@ -151,63 +148,6 @@ class NextflowDSLImpl implements ASTTransformation {
super.visitMethod(node)
}

protected Statement makeSetProcessNamesStm() {
final names = new ListExpression()
for( String it: processNames ) {
names.addExpression(new ConstantExpression(it.toString()))
}

// the method list argument
final args = new ArgumentListExpression()
args.addExpression(names)

// some magic code
// this generates the invocation of the method:
// nextflow.script.ScriptMeta.get(this).setProcessNames(<list of process names>)
final scriptMeta = new PropertyExpression( new PropertyExpression(new VariableExpression('nextflow'),'script'), 'ScriptMeta')
final thiz = new ArgumentListExpression(); thiz.addExpression( new VariableExpression('this') )
final meta = new MethodCallExpression( scriptMeta, 'get', thiz )
final call = new MethodCallExpression( meta, 'setDsl1ProcessNames', args)
final stm = new ExpressionStatement(call)
return stm
}

/**
* Add to constructor a method call to inject parsed metadata.
* Only needed by DSL1.
*
* @param node The node representing the class to be invoked
*/
protected void injectMetadata(ClassNode node) {
for( ConstructorNode constructor : node.getDeclaredConstructors() ) {
def code = constructor.getCode()
if( code instanceof BlockStatement ) {
code.addStatement(makeSetProcessNamesStm())
}
else if( code instanceof ExpressionStatement ) {
def expr = code
def block = new BlockStatement()
block.addStatement(expr)
block.addStatement(makeSetProcessNamesStm())
constructor.setCode(block)
}
else
throw new IllegalStateException("Invalid constructor expression: $code")
}
}

/**
* Only needed by DSL1 to inject process names declared in the script
*
* @param node The node representing the class to be invoked
*/
@Override
protected void visitObjectInitializerStatements(ClassNode node) {
if( node.getSuperClass().getName() == BaseScript.getName() )
injectMetadata(node)
super.visitObjectInitializerStatements(node)
}

@Override
void visitMethodCallExpression(MethodCallExpression methodCall) {
// pre-condition to be verified to apply the transformation
@@ -508,12 +448,6 @@ class NextflowDSLImpl implements ASTTransformation {
visited[context] = true

switch (context) {
case WORKFLOW_GET:
syntaxError(stm, "Workflow 'get' is not supported anymore use 'take' instead")

case WORKFLOW_PUBLISH:
syntaxError(stm, "Workflow 'publish' is not supported anymore use process 'publishDir' instead")

case WORKFLOW_TAKE:
case WORKFLOW_EMIT:
if( !(stm instanceof ExpressionStatement) ) {