Changeset 5769
- Timestamp: Dec 3, 2019, 9:38:37 AM
- File: 1 edited
Index: other/pipeline/trunk/mut_stats.py
--- other/pipeline/trunk/mut_stats.py (r5767)
+++ other/pipeline/trunk/mut_stats.py (r5769)
@@ -4,9 +4,15 @@
 
 Script that calculates statistics for variants in a list of VCF files.
-The script requries a tab-separated text file as input. Each line
-should have two columns:
+The script need two parameters:
+
+1: Path to text file with list of VCF files to process (se below)
+2: Path to progress reporting file
+
+The text file given as the first parameter should be a tab-separated
+text file. Each line should have three columns:
 
 1: Patient identifier
 2: Path to VCF file
+3: Name of alignment (used for progress reporting)
 
 Counts and frequencies for all variants will calculated for the total and
@@ -24,4 +30,5 @@
 import gzip
 import datetime
+import time
 
 # Store all variants that we have seen in the VCF files we load
@@ -78,4 +85,10 @@
     vcf.close()
 
+# Report progress to the progressFile (1-90%)
+def progressReport(progressFile, alignment, current, total):
+    percent = 1+current * 89 / total
+    with open(progressFile, 'w') as p:
+        p.write("{0} Reading VCF file for '{1}' ({2} of {3})".format(percent, alignment, current, total))
+
 # Read the patient/VCF list
 # Each line should be tab-separated <PAT>\t<PATH-TO-VCF>
@@ -88,7 +101,18 @@
 lines.sort()
 
+progressFile = sys.argv[2]
+
 # Load the VCF files and count the variants in them
+vcfCount = 0
+totalVcf = len(lines)
+nextProgressReport = time.time()+15
+
 for line in lines:
-    cols = line.split('\t')
+    cols = line.split('\t')
+    vcfCount += 1
+    # Report progress every 15 seconds
+    if time.time() > nextProgressReport:
+        progressReport(progressFile, cols[2], vcfCount, totalVcf)
+        nextProgressReport = time.time()+15
     loadVcf(cols[1], cols[0])
 
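
Review note on the progress reporting added in this changeset: the percentage is computed with `/`, which is integer division on ints under Python 2 but true division under Python 3, where the progress file would start with a float rather than a whole percent. The changeset does not say which interpreter the script targets. The following is a minimal sketch, not the committed code, of the same 1-90% mapping and 15-second throttling using `//` so the result is an integer on either version. The names `progress_percent`, `format_progress`, `write_progress`, `Throttle` and `REPORT_INTERVAL` are hypothetical and do not appear in mut_stats.py.

```python
import time

# Assumed constant mirroring the hard-coded 15 seconds in the changeset
REPORT_INTERVAL = 15


def progress_percent(current, total):
    """Map current/total onto the 1-90 range using integer division."""
    return 1 + current * 89 // total


def format_progress(alignment, current, total):
    """Build the progress line in the same format as the changeset."""
    return "{0} Reading VCF file for '{1}' ({2} of {3})".format(
        progress_percent(current, total), alignment, current, total)


def write_progress(progress_file, alignment, current, total):
    """Overwrite the progress file with the latest status line."""
    with open(progress_file, 'w') as p:
        p.write(format_progress(alignment, current, total))


class Throttle(object):
    """Hypothetical helper: lets an action through at most once per interval."""

    def __init__(self, interval, clock=time.time):
        self.interval = interval
        self.clock = clock
        self.next_time = clock() + interval

    def ready(self):
        """Return True and re-arm the timer if the interval has elapsed."""
        now = self.clock()
        if now > self.next_time:
            self.next_time = now + self.interval
            return True
        return False
```

In the loading loop this would be used as `throttle = Throttle(REPORT_INTERVAL)` before the loop and `if throttle.ready(): write_progress(progressFile, cols[2], vcfCount, totalVcf)` inside it. Passing the clock in as a parameter is only there so the throttling can be exercised without waiting 15 seconds.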