Creating bigPsl track for UCSC genome browser

The starting point is a psl file. You also need pslToBigPsl, bedToBigBed, bigPsl.as and a chrom.sizes file available in your path. The Makefile to automate the process is:

    SHELL=/bin/bash
    .ONESHELL:
    PSL=$(wildcard *.psl)
    TXT:=$(addsuffix .txt, $(basename $(PSL)))
    BB:=$(addsuffix .bb,$(basename $(TXT)))

    .PHONY: all clear

    all: $(BB)

    %.txt: %.psl
    	pslToBigPsl $< stdout | sort -k1,1 -k2,2n > $@
    …

Pandas : compare – Checking differences between DataFrames

When you need to compare DataFrames, compare is here to help. Imagine you have two different methods and want to check how their results differ by comparing the output tables.

    import pandas as pd
    import numpy as np

    # Let's create two dataframes
    df1 = pd.DataFrame(np.array([[101, 102, 103], [201, 202, 203], [301, 302, 303]]),
                       columns=['Value1', 'Value2', 'Value3'],
                       index=['A1', 'A2', 'A3'])
    df1 …
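Since the excerpt is cut off before the second table appears, here is a minimal sketch of what a compare call can look like. The first DataFrame follows the excerpt; `df2` and the two changed cells are hypothetical, made up only for illustration. It assumes pandas 1.1 or later, where `DataFrame.compare` is available.

```python
import numpy as np
import pandas as pd

# First table, as in the excerpt.
df1 = pd.DataFrame(
    np.array([[101, 102, 103], [201, 202, 203], [301, 302, 303]]),
    columns=["Value1", "Value2", "Value3"],
    index=["A1", "A2", "A3"],
)

# Hypothetical second table: a copy of df1 with two cells changed.
df2 = df1.copy()
df2.loc["A1", "Value2"] = 999
df2.loc["A3", "Value3"] = 0

# compare() keeps only the rows and columns that differ and labels
# the two sides "self" (df1) and "other" (df2).
diff = df1.compare(df2)
print(diff)
```

Row A2 and column Value1 are identical in both tables, so they should not appear in the result. Both DataFrames must share the same row and column labels, otherwise compare raises an error.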

Pandas : pipe – Tablewise function application

pipe is designed to help chain function calls on DataFrames and Series. As a showcase, let's grab the Ensembl genomes table and play with it. First, import the libraries:

    # import libraries
    import pandas as pd
    import numpy as np

    # show full columns
    pd.set_option('display.max_colwidth', None)

Now, get the Ensembl genomes table:

    # get ensembl genomes table
    colnames = …
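The excerpt stops before the Ensembl table is loaded, so the sketch below shows the chaining idea on a small made-up table instead. The DataFrame contents and the helper functions `keep_min` and `add_scaled` are hypothetical and not taken from the original post; only `DataFrame.pipe` and `DataFrame.assign` are real pandas methods.

```python
import pandas as pd

# Hypothetical toy table standing in for the real genomes table.
genomes = pd.DataFrame({"name": ["a", "b", "c"], "size": [10, 20, 30]})


def keep_min(df, column, minimum):
    """Keep rows whose value in `column` is at least `minimum`."""
    return df[df[column] >= minimum]


def add_scaled(df, column, factor, new_name):
    """Return a copy with `column` multiplied by `factor` stored in `new_name`."""
    return df.assign(**{new_name: df[column] * factor})


# pipe() passes the DataFrame as the first argument of each function,
# so the steps read top to bottom instead of as nested calls.
result = (
    genomes
    .pipe(keep_min, "size", 15)
    .pipe(add_scaled, "size", 2, "size_x2")
)
print(result)
```

Without pipe the same steps would be written as `add_scaled(keep_min(genomes, "size", 15), "size", 2, "size_x2")`, which reads inside out.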

Conda environment and projects

A very important aspect of reproducible bioinformatics is managing software, tools and environments properly. One interesting option for this difficult task is Conda. As stated on Conda's website: "Conda is an open source package management system and environment management system that runs on Windows, macOS and Linux." I have been using conda environments …