
Aug 27, 2019

Index a string in Bash

https://unix.stackexchange.com/questions/303960/index-a-string-in-bash

Substring Extraction
${string:position}
Extracts substring from $string at $position.
If the $string parameter is "*" or "@", then this extracts the positional parameters, starting at $position.
${string:position:length}
Extracts $length characters of substring from $string at $position.

stringZ=abcABC123ABCabc
#       0123456789.....
#       0-based indexing.

echo ${stringZ:0}                       # abcABC123ABCabc
echo ${stringZ:1}                       # bcABC123ABCabc
echo ${stringZ:7}                       # 23ABCabc 

echo ${stringZ:7:3}                     # 23A
                                        # Three characters of substring.


# Is it possible to index from the right end of the string?

echo ${stringZ:-4}                      # abcABC123ABCabc
# Defaults to full string, as in ${parameter:-default}.
# However . . . 

echo ${stringZ:(-4)}                    # Cabc
echo ${stringZ: -4}                     # Cabc
# Now, it works.
# Parentheses or added space "escape" the position parameter.
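
A minimal sketch of the same expansions applied to a variable, assuming bash (the ${var:position:length} form is not POSIX sh). The value of read_id is a hypothetical sample chosen only for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical sample value, used only to illustrate the expansions.
read_id="sample42_R1.fastq"

prefix=${read_id:0:8}     # 8 characters starting at position 0
suffix=${read_id: -5}     # last 5 characters (the space before -5 is required)
n=3
middle=${read_id:6:n}     # position and length are arithmetic expressions, so a variable name works

echo "$prefix"            # sample42
echo "$suffix"            # fastq
echo "$middle"            # 42_
```

Quoting the results ("$prefix") keeps any whitespace inside the extracted substring intact.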


Aug 1, 2019

Extract Reads From a Bam File That Fall Within A Given Region

https://www.biostars.org/p/48719/


# -h : include the header in the output
# The BAM file must be coordinate-sorted and indexed (e.g. with: samtools index input.indexed.bam)


samtools view -h input.indexed.bam "Chr1:10000-20000" > output.sam

Jul 26, 2019

Python numpy.isin: NumPy counterpart of MATLAB's ismember(A, B)

https://docs.scipy.org/doc/numpy/reference/generated/numpy.isin.html?highlight=numpy%20isin#numpy.isin


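import numpy as np
mask = np.isin(A, B)    # boolean array, True where an element of A is in B
idx = np.nonzero(mask)  # indices of the elements of A that are in B (a tuple of index arrays, one per dimension)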
 
numpy.isin(element, test_elements, assume_unique=False, invert=False)
Calculates element in test_elements, broadcasting over element only. Returns a boolean array of the same shape as element that is True where an element of element is in test_elements and False otherwise.
Parameters:
element : array_like
Input array.
test_elements : array_like
The values against which to test each value of element. This argument is flattened if it is an array or array_like. See notes for behavior with non-array-like parameters.
assume_unique : bool, optional
If True, the input arrays are both assumed to be unique, which can speed up the calculation. Default is False.
invert : bool, optional
If True, the values in the returned array are inverted, as if calculating element not in test_elements. Default is False. np.isin(a, b, invert=True) is equivalent to (but faster than) np.invert(np.isin(a, b)).
Returns:
isin : ndarray, bool
Has the same shape as element. The values element[isin] are in test_elements.

Linux comm command

http://www.unixcl.com/2009/08/linux-comm-command-brief-tutorial.html


From COMM(1) man page, the options available are:

-1 suppress lines unique to FILE1
-2 suppress lines unique to FILE2
-3 suppress lines that appear in both files

comm - compare two sorted files line by line

comm <(sort a.txt) <(sort b.txt)
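
A small sketch of how the three options combine, assuming bash (for the <(...) process substitution). The scratch files and their contents are hypothetical and exist only for this demonstration.

```shell
#!/usr/bin/env bash
# Hypothetical scratch files, created only for this demonstration.
tmp=$(mktemp -d)
printf '%s\n' pear apple fig > "$tmp/a.txt"
printf '%s\n' fig kiwi apple > "$tmp/b.txt"

# comm expects sorted input, so each file is sorted on the fly.
both=$(comm -12 <(sort "$tmp/a.txt") <(sort "$tmp/b.txt"))    # lines common to both files
only_a=$(comm -23 <(sort "$tmp/a.txt") <(sort "$tmp/b.txt"))  # lines unique to a.txt
only_b=$(comm -13 <(sort "$tmp/a.txt") <(sort "$tmp/b.txt"))  # lines unique to b.txt

echo "in both:"; echo "$both"
echo "only in a.txt: $only_a"
echo "only in b.txt: $only_b"

rm -r "$tmp"
```

Suppressing two columns at a time leaves a single column, which is usually the easiest output to feed into another command.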

Jul 11, 2019

Sed Command in Linux

Replacing all occurrences of a pattern in a line: the g (global) flag at the end of the substitute command tells sed to replace every occurrence of the string in each line, not just the first one.
sed 's/apple/lemon/g' test.txt
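
To make the effect of the flag visible, here is a short sketch that runs the substitution with no flag, with g, and with a numeric flag on the same input. The sample line is made up for illustration.

```shell
# Hypothetical sample line, used only to compare the flags.
line="apple pie, apple tart, apple jam"

first_only=$(printf '%s\n' "$line" | sed 's/apple/lemon/')    # no flag: first occurrence only
all=$(printf '%s\n' "$line" | sed 's/apple/lemon/g')          # g: every occurrence
second_only=$(printf '%s\n' "$line" | sed 's/apple/lemon/2')  # 2: second occurrence only

echo "$first_only"    # lemon pie, apple tart, apple jam
echo "$all"           # lemon pie, lemon tart, lemon jam
echo "$second_only"   # apple pie, lemon tart, apple jam
```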

https://www.tecmint.com/linux-sed-command-tips-tricks/


https://www.geeksforgeeks.org/sed-command-in-linux-unix-with-examples/

https://likegeeks.com/sed-linux/

Jul 5, 2019

Redirect stderr and stdout in Bash

https://stackoverflow.com/questions/637827/redirect-stderr-and-stdout-in-bash


Redirect stdout to a file first, then point stderr at stdout so that both streams end up in the file (the order of the two redirections matters):
some_command >file.log 2>&1

To append to the file instead of overwriting it:
echo "foo" 1>> bar.txt 2>&1
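
A rough sketch of why the order of the redirections matters. The noisy function and the log file names are hypothetical, defined only for this demonstration.

```shell
#!/usr/bin/env bash
tmp=$(mktemp -d)

# Hypothetical helper that writes one line to stdout and one line to stderr.
noisy() { echo "to stdout"; echo "to stderr" >&2; }

noisy > "$tmp/both.log" 2>&1     # stdout goes to the file, then stderr follows stdout
noisy >> "$tmp/both.log" 2>&1    # same, but appending instead of truncating

# Reversed order: stderr is pointed at the current stdout (here, the capture),
# and only afterwards is stdout sent to the file.
leaked=$(noisy 2>&1 > "$tmp/out_only.log")

both_lines=$(grep -c '' "$tmp/both.log")   # 4: two lines from each of the two runs
out_only=$(cat "$tmp/out_only.log")        # to stdout
echo "lines in both.log: $both_lines"
echo "out_only.log:      $out_only"
echo "not in the file:   $leaked"          # to stderr

rm -r "$tmp"
```

Redirections are processed left to right, so 2>&1 copies whatever stdout points to at that moment.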