Storage NGS

The NGS File Library in emBASE

All the NGS data files of your group are stored on your group's file server. This avoids billing problems for storage and gives each group member direct access to the files, so nobody needs to keep duplicate copies.

The image below shows the file structure of an NGS Data Library.

The top directory of an NGS Data Library is referred to as the library root. This root directory (the top incoming folder in the image) contains all sequencing run directories and usually a genecore_transfer directory, which genecore uses to copy files (these files are then moved into their final sequencing run directory upon GCBridge form submission). You can ignore this genecore_transfer directory.

So the real starting point is the run directory, which is named after the sequencing date and flow cell ID, e.g. "2014-01-07-C39MUACXX". All the lane files that belong to this sequencing run (and your group) are placed directly in this directory.

A lane folder is also available for each lane (e.g. 'lane6'). Within the lane folder there is one directory for each library described in the lane, i.e. you will find anywhere from one library folder (non-multiplexed library) to many library folders (multiplexed libraries). Each library folder contains the 'bam', 'fastq' and 'stats' directories. These are the only directories that users can write to (until a user locks them). All other files and directories are read-only for the groups.
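
The layout described above can be walked with a short script. The following is only a sketch under stated assumptions: the function names (parse_run_dir, list_libraries) are hypothetical and not part of emBASE, run directories are assumed to always follow the "date-flowcell" pattern of the example above, and lane folders are assumed to be named "lane" followed by a number.

```python
import re
from pathlib import Path

# Assumed naming patterns, inferred from the examples
# "2014-01-07-C39MUACXX" and "lane6"; adjust if your library differs.
RUN_DIR_PATTERN = re.compile(r"^(\d{4}-\d{2}-\d{2})-([A-Za-z0-9]+)$")
LANE_DIR_PATTERN = re.compile(r"^lane(\d+)$")
LIBRARY_SUBDIRS = ("bam", "fastq", "stats")


def parse_run_dir(name):
    """Return (sequencing_date, flow_cell_id), or None if the name does not match."""
    match = RUN_DIR_PATTERN.match(name)
    if match is None:
        return None
    return match.group(1), match.group(2)


def list_libraries(library_root):
    """List (run, lane, library, existing_subdirs) for every library folder."""
    results = []
    root = Path(library_root)
    for run_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        # Directories such as genecore_transfer do not match and are skipped.
        if parse_run_dir(run_dir.name) is None:
            continue
        for lane_dir in sorted(p for p in run_dir.iterdir() if p.is_dir()):
            if LANE_DIR_PATTERN.match(lane_dir.name) is None:
                continue
            for lib_dir in sorted(p for p in lane_dir.iterdir() if p.is_dir()):
                present = [d for d in LIBRARY_SUBDIRS if (lib_dir / d).is_dir()]
                results.append((run_dir.name, lane_dir.name, lib_dir.name, present))
    return results
```

Calling list_libraries with the path of your library root returns one entry per library folder, which makes it easy to spot libraries that are missing a 'bam', 'fastq' or 'stats' directory. The script only reads the directory tree, so it also works on the read-only parts of the library.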


New developments:

September 2014: Use a given MD5 file to fill the StorageFile database entry.

May 2014: Locking/unlocking of sequence lanes and libraries. This is useful for users who want to keep track of their demultiplexed files and load them into emBASE. Users can also free up space by deleting the lane files once all the libraries are demultiplexed.
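
Since an MD5 file now accompanies the data (September 2014 entry), you may want to check your local files against it, for example before deleting lane files. The sketch below is not how emBASE itself processes the MD5 file. It assumes the file uses the common md5sum layout (one "checksum  filename" pair per line, with file names relative to the MD5 file's directory), and the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1024 * 1024):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def read_md5_file(md5_path):
    """Parse an md5sum-style file into {filename: checksum} (assumed format)."""
    expected = {}
    for line in Path(md5_path).read_text().splitlines():
        line = line.strip()
        if not line:
            continue
        checksum, name = line.split(None, 1)
        # md5sum marks binary mode with a leading '*' before the file name.
        expected[name.lstrip("*")] = checksum.lower()
    return expected


def verify_md5_file(md5_path):
    """Return {filename: True/False}; False means the checksum does not match."""
    base_dir = Path(md5_path).parent
    return {
        name: md5_of_file(base_dir / name) == checksum
        for name, checksum in read_md5_file(md5_path).items()
    }
```

verify_md5_file takes the path of the MD5 file and reports, per listed file, whether the checksum computed on disk matches the recorded one.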

Click here to watch a video describing the NGS Data Library.