Being able to browse a genome with a great tool such as the UCSC Genome Browser seemed really interesting to me. Fortunately, Docker makes it possible in a few steps. We will need two Docker containers: one for Apache and another for the MySQL database. The Apache container will retrieve … Continue reading UCSC Genome Browser – Docker solution for a self-hosted instance
The main idea of Environment Modules is to help you manipulate the environment by loading and unloading modules. Environment Modules is a powerful tool with many options. Here we just give it a try. Using a fedora:31 Docker image, let's play: docker run -ti fedora:31 bash Now let's install Environment Modules: dnf update -y dnf … Continue reading Environment Modules – Dynamic environment using modulefiles
From time to time I get a list of IDs such as: ERR3277096 ERR3277097 ERR3277098 ERR3277099 along with a request for analysis of the data. Those IDs are the so-called "Run accession" IDs. They usually point to NGS data that have been deposited in public databases such as SRA, ENA and others. I … Continue reading European Nucleotide Archive (ENA) and REST – retrieving NGS data links.
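As a minimal sketch of how such run accessions can be turned into REST queries: the excerpt above is truncated, so the endpoint and parameters below are an assumption based on the public ENA Portal API (`filereport` with `result=read_run` and the `fastq_ftp` field), not necessarily what the full post uses. The helper name `filereport_url` is hypothetical, and the snippet only builds the URLs without making any network request.

```python
from urllib.parse import urlencode

# Assumption: the ENA Portal API "filereport" endpoint; the original post
# may use a different endpoint or a different set of fields.
ENA_FILEREPORT = "https://www.ebi.ac.uk/ena/portal/api/filereport"


def filereport_url(run_accession, fields="fastq_ftp"):
    """Build a file report URL for one run accession (hypothetical helper)."""
    query = urlencode({
        "accession": run_accession,
        "result": "read_run",   # report on sequencing runs
        "fields": fields,       # column(s) to return, e.g. FASTQ FTP links
        "format": "tsv",
    })
    return f"{ENA_FILEREPORT}?{query}"


if __name__ == "__main__":
    # The run accessions listed in the excerpt above.
    for run in ["ERR3277096", "ERR3277097", "ERR3277098", "ERR3277099"]:
        print(filereport_url(run))
```

Each printed URL can then be fetched with any HTTP client to get a tab-separated report for that run.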
HTOP is a nice ncurses-based monitoring tool. It gives you an overview of the running processes and lets you set "niceness", send "kill" signals and more. Recently I learned to use HTOP to monitor IO and decided to share it here. To get it to work we need to press "F2" for Setup (I am … Continue reading HTOP – Customizing it to get more.
If you don't want to keep checking GitHub pages to find out when new releases come out, there is a neat and easy solution. GitHub provides an Atom feed address that lets you get information about new releases. The pattern is: https://github.com/user/repository/releases.atom I use Newsboat to check RSS feeds. In the .newsboat directory, the … Continue reading Github – Atom feed for releases
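The URL pattern above is easy to generate for a whole list of repositories. The snippet below is only an illustrative sketch: the function name `releases_feed_url` and the example repositories are hypothetical, and the output is simply one feed URL per line, ready to paste into a feed reader's URL list.

```python
def releases_feed_url(user, repository):
    """Return the releases Atom feed URL following the pattern above."""
    return f"https://github.com/{user}/{repository}/releases.atom"


if __name__ == "__main__":
    # Hypothetical examples: replace with the repositories you follow.
    repositories = [("newsboat", "newsboat"), ("htop-dev", "htop")]
    for user, repository in repositories:
        print(releases_feed_url(user, repository))
```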
The idea is to use the power of GNU Make to simplify installing and maintaining multiple tools with multiple versions. The point of keeping multiple versions is to be able to: 1 - Reproduce results from papers; 2 - Reproduce your own results from the past. The way I organize it is … Continue reading GNU Make – Install and maintain multiple tools and versions
According to Wikipedia, an ACL "...is a list of permissions attached to an object." "An ACL specifies which users or system processes are granted access to objects, as well as what operations are allowed on given objects." This means that ACLs are an "alternative" to the traditional Unix permission system. Now, to the CLI! The setfacl help provides: … Continue reading setfacl – Setting file Access Control Lists (ACLs)
General aspects: the UCSC Genome Browser help page states that the 2bit format is a ... highly efficient way to store genomic sequence. In general it: stores multiple FASTA sequences; stores masking information; is compact; is randomly accessible; can store up to 4 Gb. A 2bit file contains multiple fields that are organized into: header, index, and sequence records. You can … Continue reading 2bit – Storing DNA
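To illustrate why the format is compact, here is a minimal sketch of the two-bits-per-base packing, assuming the encoding given in the UCSC 2bit specification (T=00, C=01, A=10, G=11, first base in the most significant bits of each byte). The function names `pack_dna` and `unpack_dna` are hypothetical; this is not a 2bit file reader or writer, and it ignores the header, the index, and the N and masking blocks that a real file stores separately.

```python
# Two-bit codes per base, per the UCSC 2bit specification.
BASE_TO_BITS = {"T": 0, "C": 1, "A": 2, "G": 3}
BITS_TO_BASE = "TCAG"


def pack_dna(seq):
    """Pack a DNA string (T, C, A, G only) into bytes, four bases per byte."""
    packed = bytearray()
    for start in range(0, len(seq), 4):
        chunk = seq[start:start + 4].upper()
        byte = 0
        for base in chunk:
            byte = (byte << 2) | BASE_TO_BITS[base]
        # Left-align a short final chunk; the unused low bits stay zero.
        byte <<= 2 * (4 - len(chunk))
        packed.append(byte)
    return bytes(packed)


def unpack_dna(packed, length):
    """Recover the first `length` bases from packed bytes."""
    bases = []
    for byte in packed:
        for shift in (6, 4, 2, 0):
            bases.append(BITS_TO_BASE[(byte >> shift) & 3])
    return "".join(bases[:length])
```

Because the padding bits are indistinguishable from the code for T, the sequence length has to be stored alongside the packed bytes, which is why `unpack_dna` takes it as an argument.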