Galaxy
Galaxy is an open-source, web-based platform for data-intensive biomedical research. We currently run two versions at EMBL: the production version and an archived version.
We develop workflows and plugins for common NGS analyses and improve data transfer from Galaxy to your own file server.
We also produce tutorials on using Galaxy and analysing data with it.
Main facts
- Log in with your usual EMBL login and password
- Each user has a quota of 200 GB (fastq files are not counted, as we link them)
- The Galaxy (production) web interface is installed on gbcs
- All jobs are submitted to the EMBL clusters through the SLURM submission system
- We continuously update and add new tools and indices
File transfer to/from Galaxy
Getting your files into the EMBL Galaxy Server
NGS data is not copied into the Galaxy space but linked to the existing files. This avoids data duplication and saves a great deal of costly storage.
This linking can be requested at the GCBridge transfer step or later through the emBASE interface (see sections below).
For any other big files, or files that are not in emBASE: you cannot do this file linking yourself. If you need to make big NGS files available in Galaxy, please don't upload them but ask us to link them. Simply place your files on your file server and send us a request email listing the files and indicating in which Galaxy library/folder to place them. The files must not be renamed or moved afterwards.
For smaller files, use the usual Galaxy upload functionality, available under the "Get Data" link in the Tools panel.
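As an illustration of the link request described above, the following Python sketch assembles the body of such an email: it checks that every file exists on the file server and lists the absolute paths together with the target Galaxy library and folder. The function name `build_link_request` and the message layout are assumptions made for this example; no particular request format is required.

```python
from pathlib import Path


def build_link_request(paths, library, folder):
    """Return the text of a link-request email for the given files.

    Hypothetical helper: the layout of the message is an assumption,
    not a format required by the Galaxy administrators.
    """
    resolved = []
    for p in paths:
        path = Path(p).expanduser().resolve()
        # Linked files must exist and stay in place, so fail early
        # on anything that is not a regular file.
        if not path.is_file():
            raise FileNotFoundError(f"Not a regular file: {path}")
        resolved.append(path)

    lines = [
        f"Galaxy library: {library}",
        f"Galaxy folder: {folder}",
        "Files to link:",
    ]
    # One absolute path per line, sorted for readability.
    lines += [f"  {path}" for path in sorted(resolved)]
    return "\n".join(lines)
```

Calling `build_link_request(file_list, "Furlong Lab", "my_project")` returns a three-line header followed by one resolved path per file, and raises `FileNotFoundError` if any listed file is missing. The folder name `my_project` is a placeholder.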
In the GCBridge transfer validation form
Leave the "Push to Galaxy" checkbox checked and the files will be available in the Galaxy library named after your group (e.g. "Furlong Lab"), accessible from the Shared Data menu, then Data Libraries.
N.B.: be patient, it takes time for Galaxy to read all the information stored in your data library. Please don't click multiple times!
emBASE-to-Galaxy Synchronization
NGS data stored in emBASE can be synchronised with Galaxy. First link your emBASE experiment to a Galaxy Library and Folder (on the experiment edit page in emBASE). Once this is set up, click the "Synchronize the data of this experiment with Galaxy" link and all raw bioassay files in emBASE will be available in the indicated Galaxy Library/Folder.
Getting your data from Galaxy
Due to space limitations, the Galaxy server is not meant for long-term storage, and you must download the data you want to keep to your own storage, i.e. a project folder on your group file server.
Galaxy offers download options for this, but they work mostly one dataset at a time and cannot be used for automated data transfer during workflow execution. To solve this, we developed the NFS transfer tool, which can copy any dataset (and all associated extra files) to a location of your choice, provided this location lives on a file server (like your group file server), i.e. is reachable under a (Unix) path like:
/g/groupname/
Please read more about this awesome tool here.
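To make the path requirement concrete, here is a minimal Python sketch of a check that a destination lies inside a group share such as /g/groupname/. It is only an illustration: the function name `is_on_group_share` and the default root `/g` are assumptions taken from the example path above, and the NFS transfer tool itself may validate destinations differently.

```python
from pathlib import PurePosixPath


def is_on_group_share(destination, group, root="/g"):
    """Return True if `destination` is an absolute path inside <root>/<group>/.

    Hypothetical helper for illustration; it only inspects the path
    string and does not touch the file system.
    """
    dest = PurePosixPath(destination)
    share = PurePosixPath(root) / group
    # Reject relative paths and any attempt to climb out with "..".
    if not dest.is_absolute() or ".." in dest.parts:
        return False
    # Accept the share itself or anything below it.
    return dest == share or share in dest.parents
```

With this sketch, a destination such as `/g/groupname/project/run1` is accepted for the group `groupname`, while a local path such as `/tmp/results` is rejected.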
Main tool versions on the EMBL Galaxy Server
- Most tools are available under SEPP (/g/software/bin)
- Bowtie: version 0.12.7
- Bowtie2: version 2.0.0
- Novoalign: version 2.08.02
- Tophat: version 1.3.0
- Tophat2: version 2.0.4
- Gsnap: version 2011-11-14
- MACS: version 1.3.7.1
- MACS14: version 1.4.1
- R: version 2.15.2
- Samtools: version 0.1.16
- Picard: version 1.56.0
Third-party tools working with Galaxy
- JGalaxy is a desktop application for bulk downloading files from Galaxy (download)
- GalaxyQCReport: a home-grown command-line tool to parse Galaxy workflow results