Install
To install the plugin, add it to the plugins block of your nextflow.config:
plugins {
    id "nf-aspera@0.1.0-edge2"
}
FileSystem
Once the plugin starts, it registers a new aspera URL scheme in Nextflow, similar to http or ftp.
For example, the following pipeline reads the README file from NCBI:
println file('aspera://ncbi/refseq/README').text
- "aspera" is the scheme
- "ncbi" is the identifier of the client (if you're running the enterprise version, it can be my-company, for example)
- '/refseq/README' is the path of the remote file
The file is exposed as a byte-array stream, so no temporary files are required.
WARNING: Similar to the http file system, aspera doesn't provide some capabilities such as upload, create directory, etc.
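The layout of an aspera:// URL described in this section can be illustrated with plain Groovy. This is only a sketch using the standard java.net.URI class to show how the three parts map onto a URL; it is not necessarily how the plugin parses URLs internally, and the variable names are chosen for illustration.

```groovy
// Illustration only: split an aspera:// URL into scheme, client and remote path.
// java.net.URI is part of the JDK; the plugin's own parsing may differ.
def uri = new URI('aspera://ncbi/refseq/README')

def scheme     = uri.scheme   // the URL scheme
def client     = uri.host     // the client identifier sits in the host position
def remotePath = uri.path     // the path of the remote file

println "scheme=${scheme} client=${client} path=${remotePath}"
```

The client identifier occupies the position where a host name would normally go, which is why a name such as my-company can be used there.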
ascp operator
For users who wish to use this plugin in channels, it provides an operator that downloads files in a way similar to the native ascp command:
Channel.ascp([
    client: 'demo',
    destination: 'downloads/',
    source: 'aspera-test-dir-small/10MB.1'
])
To download several files at once, use sources instead of source:
sources: [
    'aspera-test-dir-small/10MB.1',
    'aspera-test-dir-large/100MB',
]
The ascp operator emits one item per downloaded file.
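As a sketch, the sources option can be combined with the other options shown earlier into one complete call. It assumes the plugin is installed and that the demo server is reachable; view is the standard Nextflow operator for printing what a channel emits, and its use here is an illustration rather than something this section prescribes.

```groovy
// Sketch only: requires the nf-aspera plugin and network access to the demo server.
Channel.ascp([
    client: 'demo',
    destination: 'downloads/',
    sources: [
        'aspera-test-dir-small/10MB.1',
        'aspera-test-dir-large/100MB',
    ]
]).view()   // print each emitted item, one per downloaded file
```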
Aspera demo
Out of the box, the plugin includes a demo client configuration for the demo.asperasoft.com
server, with the following settings:
remote_host: "demo.asperasoft.com"
ssh_port: 33001
remote_user: "aspera"
remote_password: "demoaspera"
This lets you check the plugin installation easily, for example by running a simple pipeline such as:
long startTime = System.currentTimeMillis()
println "10MB.1 size in bytes = " + file('aspera://demo/aspera-test-dir-small/10MB.1').bytes.length
long endTime = System.currentTimeMillis()
println "Download took ${endTime-startTime} ms"
In-built NCBI Aspera client
Out of the box, the plugin includes an ncbi client configuration for the ftp.ncbi.nlm.nih.gov
server, with the following settings:
remote_host : 'ftp.ncbi.nlm.nih.gov',
ssh_port : 22,
remote_user : "anonftp",
ssh_private_key: 'BEGIN PRIVATE ------- ....',
ssh_private_key_passphrase : "743128bf-3bf3-45b5-ab14-4602c67f2950",
cipher : "none",
INFO: NCBI uses a published private key (aspera_tokenauth_id_rsa) and a published passphrase (743128bf-3bf3-45b5-ab14-4602c67f2950) to allow anonymous users to download files.
The plugin also provides an ncbi_ascp factory, a wrapper around the ascp function that uses ncbi as the client:
Channel
    .ncbi_ascp( destination:'downloads/', source: '/refseq/release/bacteria/bacteria.1.1.genomic.fna.gz' )
    | view
In-built ENA Aspera client
Out of the box, the plugin includes an ena client configuration for the fasp.ebi.ac.uk
server, with the following settings:
remote_host : 'fasp.ebi.ac.uk',
ssh_port : 33001,
remote_user : "fasp-public",
ssh_private_key: 'BEGIN PRIVATE ------- ....'
INFO: ENA uses a published private key (asperaweb_id_dsa.openssh) to allow anonymous users to download files.
The plugin also provides an ena_ascp factory, a wrapper around the ascp function that uses ena as the client:
Channel
    .ena_ascp( destination:'downloads/', source: 'vol1/fastq/ERR164/ERR164407/ERR164407.fastq.gz' )
    | view
Private Servers
To use a corporate Aspera server, provide the connection details in the aspera configuration scope:
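aspera {
    clients {
        // names containing a hyphen are quoted; unquoted, Groovy reads my-company as a subtraction
        'my-company' { //(1)
            remote_host = 'our-aspera-server'
            ssh_port = 33001
            remote_user = "aspera"
            remote_password = "a_secret_password"
        }
        'another-server' { //(2)
        }
    }
}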
(1) my-company will be the identifier to use in the transfer URLs
(2) You can provide multiple configurations in case you work with different servers
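As a sketch of how the configured identifier is then used, the client name takes the place of ncbi or demo in the earlier examples. The remote paths below are hypothetical placeholders, and the snippet assumes the Enterprise plugin version and a reachable server.

```groovy
// Sketch only: 'path/to/remote/file.txt' is a hypothetical path on the corporate server.
// Read a file through the aspera file system using the my-company client
println file('aspera://my-company/path/to/remote/file.txt').text

// Or download it through the ascp operator with the same client identifier
Channel.ascp([
    client: 'my-company',
    destination: 'downloads/',
    source: 'path/to/remote/file.txt'
]).view()
```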
INFO: This feature will be available in the Enterprise plugin version.
If you want more OpenData Aspera servers to be supported, please contact us at jorge@incsteps.com to have them included in the community version.