emBASE locking/unlocking functionality
This functionality is needed when demultiplexed data is to be loaded. It also covers the case where the user has no write access to the sequence lane files but wants to delete them after they have been successfully processed (e.g. demultiplexed).
The locking functionality consists of two parts: the locked flag in the database and the physical file permissions on the corresponding file servers.
Locking and the corresponding unlocking are available for NGS Assays and raw data sets. Please note that the terms "raw data set" and "raw bioassay" are used interchangeably!
A sequence lane file is locked by default when it is loaded with a lane file present.
A raw data set is only locked when a non-multiplexed experiment is loaded. In this case the lane and raw files are the same and the raw file is a symbolic link to the lane file.
The file structure used to store Sequence Lane and demultiplexed (Raw BioAssay) files is as follows:
A run directory holds the lane and info files and also a lane directory. The lane directory contains one folder for each library, and each library folder contains the bam and fastq folders. The bam and fastq folders are writable for the group by default.
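The layout described above can be sketched in a few lines of Python. This is only an illustration, not the emBASE implementation: the function names and the example directory names (run_001, lane_1, library_A) are hypothetical, and the mode 0o775 is just one way to express "writable for the group".

```python
from pathlib import Path


def library_folders(run_dir: Path, lane: str, library: str) -> dict[str, Path]:
    """Return the bam and fastq folder paths for one library of a lane."""
    library_dir = run_dir / lane / library
    return {"bam": library_dir / "bam", "fastq": library_dir / "fastq"}


def create_library_folders(run_dir: Path, lane: str, library: str) -> dict[str, Path]:
    """Create the bam and fastq folders and make them group-writable."""
    folders = library_folders(run_dir, lane, library)
    for folder in folders.values():
        folder.mkdir(parents=True, exist_ok=True)
        # Assumption: rwxrwxr-x stands in for "writable for the group by default".
        folder.chmod(0o775)
    return folders
```

Calling create_library_folders(Path("run_001"), "lane_1", "library_A") would create run_001/lane_1/library_A/bam and run_001/lane_1/library_A/fastq.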
Locking / Unlocking
The user can put files into the bam and fastq folders, which are writable for the owner group. Once a file is in place, the raw bioassay can be locked. To make the files read-only for the user, the system may need to copy the file so that ownership is transferred to the system user (‘galaxy’). The original file must therefore always be writable for the group so that it can be deleted afterwards. Depending on the file size, the copy may take some time.
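The copy step can be illustrated with a small Python sketch. It is not the actual emBASE code: the function name lock_file and the temporary ".locking" suffix are hypothetical, and the sketch assumes it runs as the system user, so that the copy it creates is owned by that user.

```python
import os
import shutil
import stat
from pathlib import Path


def lock_file(path: Path) -> Path:
    """Replace a user-owned file with a read-only copy owned by this process."""
    # Hypothetical temporary name next to the original file.
    tmp = path.with_name(path.name + ".locking")
    # A copy is a new file, so it belongs to the user running this process.
    # For large files this is the step that takes time.
    shutil.copyfile(path, tmp)
    # Read-only for owner, group and others.
    os.chmod(tmp, stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH)
    # Drop the user's original and put the copy in its place.
    os.replace(tmp, path)
    return path
```

After the call the file has the same name and content as before, but it can no longer be modified through its own permissions.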
Locking a raw bioassay can be done at the NGS Assay Edit page or the Raw BioAssay Edit page.
Click on the link ‘Lock all raw data set folders’ to load all the available files in the library folders (see picture below). If no file is found for a raw bioassay, it stays untouched. If one or more files are found in either the bam or fastq folder, the files are loaded and the raw bioassay is locked. The directories and files then become read-only for the user.
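The rule applied by ‘Lock all raw data set folders’ can be summarised in a short Python sketch. The function name lock_all and the data shapes are assumptions made for illustration; the real system works on database records and files rather than plain dictionaries.

```python
def lock_all(found_files: dict[str, list[str]], locked: set[str]) -> set[str]:
    """Return the set of locked raw bioassays after a 'lock all' request.

    found_files maps each raw bioassay name to the files found in its
    bam and fastq folders (hypothetical structure).
    """
    # A raw bioassay with at least one file is loaded and locked;
    # one without files stays untouched.
    newly_locked = {name for name, files in found_files.items() if files}
    return locked | newly_locked
```

For example, lock_all({"library_A": ["a.bam"], "library_B": []}, set()) locks only library_A.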
After clicking the link to lock all raw data, success messages will be shown and the link to ‘Unlock all raw data set folders’ will appear (see picture below). Additionally, if all raw data sets are locked, an icon to Unlock the sequence lane will appear.
A click on the ‘Lock’ symbol on the raw bioassay information page will lock a single raw bioassay.
Files that are not yet loaded are shown in the row ‘Not loaded files’. This row lists all BAM or FASTQ files found in the library directories (note: only files bigger than 1 MB are considered, to avoid listing links).
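How such a listing could be produced is sketched below in Python. This is a guess at the logic, not the emBASE code: the function name, the loaded_names argument and the exact byte value of the size threshold are assumptions. The sketch uses the size of the directory entry itself, so a symbolic link (whose own size is only the length of its target path) falls below the threshold.

```python
from pathlib import Path

MIN_SIZE_BYTES = 1024 * 1024  # assumed byte value of the size threshold
DATA_FOLDERS = ("bam", "fastq")


def not_loaded_files(library_dir: Path, loaded_names: set[str],
                     min_size: int = MIN_SIZE_BYTES) -> list[Path]:
    """List files in the bam and fastq folders that are not loaded yet."""
    found = []
    for folder in DATA_FOLDERS:
        data_dir = library_dir / folder
        if not data_dir.is_dir():
            continue
        for entry in sorted(data_dir.iterdir()):
            if not entry.is_file() or entry.name in loaded_names:
                continue
            # lstat() does not follow links, so links report a tiny size
            # and are skipped by the threshold.
            if entry.lstat().st_size > min_size:
                found.append(entry)
    return found
```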
To view the available files directly on the Assay page, click the link ‘Show not loaded files’ in the files column of the raw data set header. This detects all available, but not yet loaded, files of all the libraries.
A Sequence Lane file can only be unlocked when all the libraries are available and locked. This prevents data loss by mistake (think of shared lanes here). The system knows how many libraries to expect from the sample count and the library folders present in the lane directory.
If the system identifies the assay as unlockable, you will see the ‘unlock’ icon next to the lane's status.
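The check can be pictured with the following Python sketch. It rests on assumptions that the text does not spell out: that one library folder is expected per sample, and that the set of locked libraries is known by folder name. The function name lane_is_unlockable is hypothetical.

```python
from pathlib import Path


def lane_is_unlockable(lane_dir: Path, sample_count: int,
                       locked_libraries: set[str]) -> bool:
    """Return True when every expected library is present and locked."""
    library_dirs = {p.name for p in lane_dir.iterdir() if p.is_dir()}
    # Assumption: one library folder per sample. Fewer folders than
    # samples means some libraries are not available yet.
    if len(library_dirs) < sample_count:
        return False
    # Every library folder that is present must already be locked.
    return library_dirs <= locked_libraries
```

With two samples and two library folders, the lane becomes unlockable only once both libraries are locked.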
Clicking the unlock icon unlocks the sequence lane. On success, a confirmation message and the link ‘Remove lane files’ are shown. The status also changes to Unlocked, and an icon to lock the lane becomes available.
Clicking the link ‘Remove lane files’ will remove the available files from the file server.
Now that the raw bioassays are locked and the Sequence Lane is unlocked, the user might want to change a file for one or more raw bioassays. This is only possible after unlocking the raw bioassay again. On the sequence lane page, an ‘Unlock’ icon can be found next to each raw bioassay's locking status. The same ‘Unlock’ icon is also available on the raw bioassay's edit page.
After the raw bioassay has been successfully unlocked, the user can change the files within the BAM and FASTQ directories of that library.
If a new file (different filename) was added to the library folders, the file will be shown in the row ‘Not loaded files’. If an existing file was replaced (same filename) nothing will indicate that something changed. In both cases a ‘Lock’ icon will appear on the raw bioassay information page where the user can lock the raw data set and (re)load the files.
After the sequence lane files have been deleted, the user can add new files by locating them on a mounted file server.
Clicking the ‘Upload’ button will try to load both files (if available). The files will then be copied to the run directory specified for that sequence lane.