I’m just writing this to see how code snippets work using markdown and jekyll…
Here is a simple thing I often do to summarize the distribution of a value stored in the INFO field of a VCF file…
Here I’ve extracted the contents of the tag ‘DP’, which contains the depth of coverage of the variant.
In this case I’m feeding a line of code, “cl”, to the system with `pipe()` and reading the results into R.
To break down “cl”: I use tabix to quickly access arbitrary chunks of VCF files, then pipe the output to grep to pull out the depth. The flag `-o` prints only the text that matches the regex, and `-P` turns on Perl-compatible regexes, which lets me use what’s called a ‘positive lookbehind’. That’s the section `(?<=DP=)`. It requires “DP=” to come immediately before the match, but doesn’t return it as part of the match. Then I match the characters in the field with `[^;]+`, which matches one or more of any character except “;”, which is the field delimiter.
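To see the regex on its own, here is a minimal sketch that runs the same grep pattern on two made-up INFO strings instead of real tabix output. The sample values and the variable names `info_lines` and `depths` are assumptions for illustration only, and `-P` needs a grep built with PCRE support (GNU grep usually is).

```shell
# Hypothetical INFO fields, invented for this example (not from a real VCF).
info_lines='NS=3;DP=14;AF=0.5
NS=2;DP=9;AF=0.25'

# -o prints only the matched text; -P enables Perl-compatible regexes.
# (?<=DP=) requires "DP=" just before the match without returning it.
# [^;]+ then consumes everything up to the next ";" delimiter.
depths=$(printf '%s\n' "$info_lines" | grep -oP '(?<=DP=)[^;]+')

# One depth value per line, ready to be read into R or summarized.
printf '%s\n' "$depths"
```

Each input line yields just the number that follows “DP=”, one per line, which is the kind of plain numeric column that is easy to read into R.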
Here’s the result:
I haven’t posted in a while, but the first of several papers from my postdoctoral work in the Whitehead lab has been published!
Reid, Noah M., and Andrew Whitehead. “Functional genomics to assess biological responses to marine pollution at physiological and evolutionary timescales: toward a vision of predictive ecotoxicology.” Briefings in Functional Genomics (2015): elv060.