...
Note that the input files here aren't technically XML files themselves, since they don't begin with an XML declaration, but this is a minor technical point.
Graphing a histogram
Graphing a histogram in CGView using each position as a datapoint will probably take lots of computational time, with marginal benefits to resolution. Averaging over some region of positions is probably a more efficient way of getting a histogram. The following script takes a table of position counts (as a text file), and outputs an XML fragment of the counts averaged over regions of length window_size
which can be used as input to makeMap.sh
above.
The position count table must have the form
<column 1> <column 2, probably position> <int> <int> ... <int>
<column 1> <column 2, probably position> <int> <int> ... <int>
...
where each <int>
is a count.
Expand |
---|
title | Script to average a count table |
---|
|
Code Block |
---|
|
#! /usr/bin/env python
import sys
try:
import argparse
except:
print 'Try typing "module load python" and then running this again.'
sys.exit(1)
def main():
parser = argparse.ArgumentParser(description='Averages read depth along chromosome positions from a number of samples.')
parser.add_argument('count_table')
args = parser.parse_args()
window_table = []
window_size = 250
window_sum = 0
line_count = 0
with open(args.count_table, 'r') as count_table:
for line in count_table:
fields = line.split()
count_fields = [int(x) for x in fields[2:]]
window_sum += sum(count_fields)
line_count += 1
if (line_count % window_size) == 0:
window_table.append([line_count-window_size+1, line_count, window_sum])
window_sum = 0
if (line_count % window_size) != 0:
window_table.append([line_count - (line_count % window_size) + 1, line_count, window_sum])
max_count = 0
for window in window_table:
if max_count < window[2]:
max_count = window[2]
for window in window_table:
window[2] = float(window[2])/max_count
print '<featureSlot strand="direct">'
print '<feature color="purple" decoration="arc">'
for window in window_table:
print '<featureRange start="{}" stop="{}" proportionOfThickness="{}" />'.format(window[0], window[1], window[2])
print '</feature>'
print '</featureSlot>'
if __name__ == '__main__':
main()
|
|