Showing posts with label unix. Show all posts

Monday, July 6, 2015

GNU Guix for easily managing bioinformatics software ( or any other software )

We are using GNU Guix to manage the bioinformatics software that is installed on our HPC and on individual machines. By now, there are 54 bioinformatics-related packages in the Guix repository, and the number is growing. The list includes many NGS-related tools such as samtools, bowtie, STAR, sra-tools etc. The full list is here:
http://guix.mdc-berlin.de/packages?/?search=bioinfo

Why use Guix ?
GNU Guix is a package management tool that can build software from source along with its dependencies, set up user environments, and upgrade or downgrade installed packages.

- Easy to install and maintain packaged software. One line on the terminal can update or install a package.
- Guix creates an isolated yet shared environment. Each piece of software is installed in /gnu/store and does not alter global state; the user profiles link to this directory.
- User profiles are independent from each other. You can even create a specific user profile per project and this helps with reproducibility.
- Guix can be used in an HPC environment or individual machines connected to the same network (described here)


How to install software using Guix
If you have GNU Guix set up, installing pre-packaged software is super easy. And we have most of the popular tools already packaged. For example, below is how to install samtools.

guix package -i samtools

Setting up GNU Guix is also easy. Just follow the instructions provided by Ricardo Wurmus (who packaged most of the bioinformatics software for Guix by the way)
http://elephly.net/posts/2015-06-21-getting-started-with-guix.html


How to contribute to Guix and/or Guix bioinformatics packages
The suggested way to contribute to upstream is to use a git checkout, make changes there, create a patch with git format-patch, and send that to guix-devel@gnu.org (https://lists.gnu.org/mailman/listinfo/guix-devel).
There's also a contribution page: http://www.gnu.org/software/guix/contribute/
Luckily, getting started with development is relatively simple:
  • git clone git://git.savannah.gnu.org/guix.git
  • if you already have guix, run guix environment guix to load up everything you need to build guix from source
  • run ./bootstrap, ./configure (point --localstatedir to /var to make it use the same state as your binary installation of guix), and make
  • edit package modules in gnu/packages/
  • run ./pre-inst-env guix lint to find common problems (pre-inst-env makes guix use the packages in the local directory)
  • build the new package with ./pre-inst-env guix build my-package
  • if everything works fine: commit and create a patch with git format-patch -1
  • send the file to the mailing list guix-devel@gnu.org for review

Friday, December 14, 2012

Sending commands from Notepad++ to a remote R session

If you have your working environment set up in a Windows operating system, it can be a bit of a hassle to work with R sessions on remote Linux servers.

I use WinSCP + Notepad++ to handle my projects and Putty + screen to handle the R sessions. It becomes tiring to use the mouse to move the code from Notepad++ to Putty all the time. Thankfully the amazing AutoHotkey comes to the rescue.

Using AutoHotkey it is possible to set up a hotkey macro which will copy the code that you want from Notepad++ and paste it into Putty.

The following code does exactly that:

!q::
Send {End 2}+{Home}^c{End 2}{Down}{End 2}
WinActivate, root
Send +{Ins}{Enter}
SetTitleMatchMode, 2
WinActivate, Notepad
return


Let's break it apart.

!q:: - defines the hotkey as alt-q

Send {End 2}+{Home}^c{End 2}{Down}{End 2}
 - selects the current line of code and copies it to the clipboard. It does that by brute-force manipulation of the cursor: first it moves the cursor to the end of the current line, then a shift + home selects the text. ^c copies the text to the clipboard, and the remaining commands move the cursor to the end of the next line of code. The End command is sent twice because the macro gets confused if word wrap is on.


WinActivate, root - the WinActivate command selects a window by its name. When I log onto my remote server the Putty window is named root, so basically this line just switches to the terminal.

Send +{Ins}{Enter} - pastes the line of code from the clipboard and executes the line

SetTitleMatchMode, 2 - this command regulates the pattern matching for the WinActivate command. 2 means that the pattern can be found anywhere in the name of the window. The default is 1 which means that pattern must be found at the beginning of the name of the window.

WinActivate, Notepad - switches us back to the Notepad 

return - ends the macro


For the code to work, you need to install AutoHotkey, create a file that has the .ahk extension, copy the code into the file, and run it.

I hope this will save you some time!   


 

Tuesday, November 30, 2010

removing all the files that belong to a username

In UNIX, you can remove all the files that are owned by a user by using a combination of the "rm" and "find" commands.


The command below removes all the files belonging to the user "Karriem" in the /tmp folder. It doesn't remove directories or the files inside those directories.

rm `find /tmp -maxdepth 1 -type f -user Karriem`
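
The backtick form splits file names on whitespace, so a name containing a space can trip it up. Below is a minimal sketch of a variant that lets find call rm itself. The names TARGET_DIR and TARGET_USER are made up for illustration, and a scratch directory and the current user stand in for /tmp and "Karriem" so the sketch is safe to run.

```shell
# Assumption: a scratch directory and the current user replace /tmp and "Karriem".
TARGET_DIR="$(mktemp -d)"
TARGET_USER="$(id -u)"   # find -user accepts a user name or a numeric ID

# Set up a few test files, including one with spaces in its name.
touch "$TARGET_DIR/plain.txt" "$TARGET_DIR/name with spaces.txt"
mkdir "$TARGET_DIR/subdir"
touch "$TARGET_DIR/subdir/nested.txt"

# Preview what would be removed before deleting anything.
find "$TARGET_DIR" -maxdepth 1 -type f -user "$TARGET_USER" -print

# Remove only regular files directly inside TARGET_DIR; find hands the
# names to rm as separate arguments, so whitespace is not a problem.
find "$TARGET_DIR" -maxdepth 1 -type f -user "$TARGET_USER" -exec rm -- {} +
```

The subdirectory and the file inside it are left alone, because -maxdepth 1 keeps find from descending and -type f skips directories.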

Tuesday, November 23, 2010

Passing shell variables to AWK

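If you have a shell variable in a bash script, you can't use it inside single-quoted AWK code just by putting a "$" sign in front of it: the shell does not expand variables inside single quotes, and AWK reads "$" as a field reference. Instead, you can close the single quote, insert the double-quoted shell variable, and reopen the single quote. The value is then pasted into the AWK code and used with no problem.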

For example, say you have a bed file called "example.bed":

$ cat example.bed
chr1     1000     2000   id1
chr1     4000     5000   id2
chr1     5500     6000   id3

Let's say you want to prepend a string (in this case "brain_") to column 4 of this file. You can do this in AWK as follows:

$ awk '{OFS="\t";$4="brain_"$4; print;}' example.bed
chr1     1000     2000   brain_id1
chr1     4000     5000   brain_id2
chr1     5500     6000   brain_id3

However, if you store the string in a variable as follows, in the terminal or in a bash script:

$ TISSUE="brain_"

the following will not work:
$ awk '{OFS="\t";$4=$TISSUE$4; print;}' example.bed

but this will:

$ awk '{OFS="\t";$4="'"$TISSUE"'"$4; print;}' example.bed
chr1     1000     2000   brain_id1
chr1     4000     5000   brain_id2
chr1     5500     6000   brain_id3

For details and other ways to do this, check out:
http://www.tek-tips.com/faqs.cfm?fid=1281
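
One of those other ways is AWK's -v option, which copies a value into an AWK variable before the program runs. A minimal sketch follows. The AWK variable name prefix and the shell names BED_FILE and RESULT are made up for illustration, and a scratch file stands in for example.bed.

```shell
# Assumption: a scratch copy of the example file, tab-separated.
BED_FILE="$(mktemp)"
printf 'chr1\t1000\t2000\tid1\nchr1\t4000\t5000\tid2\nchr1\t5500\t6000\tid3\n' > "$BED_FILE"

TISSUE="brain_"

# -v prefix="$TISSUE" hands the shell value to AWK as the variable "prefix",
# so the AWK program itself can stay inside plain single quotes.
RESULT="$(awk -v prefix="$TISSUE" 'BEGIN{OFS="\t"} {$4 = prefix $4; print}' "$BED_FILE")"
printf '%s\n' "$RESULT"
```

Note that AWK processes backslash escapes in values passed with -v, so this form is best suited to plain strings like the one here.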

Saturday, November 20, 2010

Allowing multi-hop ssh

If you are trying to reach a server only accessible through another server, you will need to use ssh twice. This might cause mild irritation. Luckily, there is a recipe that can make things easier.

Assuming we are trying to reach hostnameB through hostnameA, add the following lines (after you put appropriate values for hostnames) to your SSH configuration in ~/.ssh/config

Host hostnameB
ProxyCommand ssh hostnameA nc hostnameB 22


For this to work, netcat (nc) needs to be installed on hostnameA, but many new systems have it, so yours may too. Now, if you type "ssh hostnameB", you will automatically ssh to hostnameA first and then on to hostnameB.
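
On newer systems the same hop, reaching hostnameB through hostnameA, can be written without netcat. The sketch below assumes OpenSSH 7.3 or newer, which provides the ProxyJump option. It writes the fragment to a scratch file (SKETCH_CONFIG is a made-up name) so that ~/.ssh/config is not touched.

```shell
# Assumption: the fragment goes into a scratch file, not ~/.ssh/config.
SKETCH_CONFIG="$(mktemp)"

cat > "$SKETCH_CONFIG" <<'EOF'
# Reach hostnameB by jumping through hostnameA (needs OpenSSH 7.3 or newer)
Host hostnameB
    ProxyJump hostnameA
EOF

# Show the fragment that was written.
cat "$SKETCH_CONFIG"

# To try it without editing ~/.ssh/config:
#   ssh -F "$SKETCH_CONFIG" hostnameB
```

On older clients that lack ProxyJump, "ProxyCommand ssh -W %h:%p hostnameA" (OpenSSH 5.4 or newer) is another netcat-free option.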

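Stay logged in to the server with ssh

Add the following lines to the ~/.ssh/config file and you will stay logged in to the server. With these settings, the client sends a keepalive message after every 120 seconds of silence from the server and gives up only after 3 consecutive messages go unanswered. If you are trying to connect to your work machines from home, this might be a useful trick.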

Host *
ServerAliveInterval 120
ServerAliveCountMax 3

Friday, November 19, 2010

Sending e-mail from command line (terminal) in unix/linux

This is useful for people who want status updates on their scripts: you can send e-mails from the unix terminal. I use it to report whether the scripts I'm running executed successfully or crashed, for example.

mail -s "script finished" fool@bs.com < file.txt
The line above sends a mail to fool@bs.com titled "script finished" and with the contents of file.txt. You can omit the file.txt part and send a small piece of content using "echo" and "|", like this:
echo "ABYSS 3rd  run finished" | mail -s "ABYSS run" fool@bs.com

This sends "ABYSS 3rd run finished" as the content of the e-mail.
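
To report whether a script finished or crashed, the exit code can be turned into the message body. This is a minimal sketch: status_message is a hypothetical helper, not a standard command, and "sh -c 'exit 3'" merely stands in for a real script that fails.

```shell
# Hypothetical helper: build a status line from a job name and an exit code.
status_message() {
    if [ "$2" -eq 0 ]; then
        echo "$1 finished successfully"
    else
        echo "$1 crashed with exit code $2"
    fi
}

# Stand-in for a real script; this one fails with exit code 3.
JOB_STATUS=0
sh -c 'exit 3' || JOB_STATUS=$?

MESSAGE="$(status_message "ABYSS run" "$JOB_STATUS")"
echo "$MESSAGE"

# Sending is left commented out because it needs a working mail setup:
# echo "$MESSAGE" | mail -s "ABYSS run" fool@bs.com
```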

Getting nth line in a file

Ever wondered what the nth line of a file is, without opening a text editor?
Here is something you can use in a unix environment.

sed -n '5 p' file1.txt

This sed one-liner outputs the 5th line of file1.txt.
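
When the line number lives in a shell variable, or the file is large, a small variation may help. The sketch below uses double quotes so the shell fills in the number, and adds q so sed stops reading once the line has been printed. SAMPLE_FILE, N and NTH_LINE are made-up names, and a generated scratch file stands in for file1.txt.

```shell
# Assumption: a scratch file with ten numbered lines replaces file1.txt.
SAMPLE_FILE="$(mktemp)"
seq 1 10 | sed 's/^/line /' > "$SAMPLE_FILE"

N=5

# Double quotes let the shell expand ${N}; "p" prints that line and
# "q" quits right away instead of scanning the rest of the file.
NTH_LINE="$(sed -n "${N}{p;q;}" "$SAMPLE_FILE")"
echo "$NTH_LINE"
```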