# MakeValues.py
# This file takes information about each hypothetical protein from the GenBank
# file (MarDSV.gbff) and stores it in ValuesList. The elements of ValuesList
# then become the values of a dictionary whose keys are hypothetical protein
# IDs. This dictionary is used to create the Hypo.csv file.
from Bio import SeqIO
from Hypo import *
MarOutput = HypoLists('MarDSV.gbff')  # HypoLists function from the Hypo.py file
MarHypList = MarOutput[0]  # 565 features from MarDSV
MarProtIDList = MarOutput[1]  # 562 protein IDs from MarHypList, excluding the 3 pseudogene hypotheticals
TotalProtID = len(MarProtIDList)
# One empty list per protein ID; each inner list collects that protein's values
# and will become a value in the protein ID dictionary.
ValuesList = [[] for each in range(TotalProtID)]
OctOldHypoIDs = []
# Make a list of IDs that were hypothetical in March but got new products in the October annotation
for each in OctOldHypos:
    OctOldHypoIDs.append(each.qualifiers['protein_id'])
count = 0
for each in MarHypList:  # iterate through MarHypList features to extract product and tag info
    if 'protein_id' in each.qualifiers:  # features without a protein ID (e.g. the pseudogene hypotheticals) are skipped
        ValuesList[count].append(each.qualifiers['locus_tag'][0])
        if 'old_locus_tag' in each.qualifiers:
            ValuesList[count].append(each.qualifiers['old_locus_tag'][0])
        else:
            ValuesList[count].append('NULL')
        if each.qualifiers['protein_id'] in OctOldHypoIDs:
            # use the product assigned in the October annotation
            Index = OctOldHypoIDs.index(each.qualifiers['protein_id'])
            ValuesList[count].append(OctOldHypos[Index].qualifiers['product'][0])
        else:
            ValuesList[count].append('hypothetical')
        count += 1  # advance only for features that have a protein ID
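
# A minimal sketch of the step described in the header comment: pairing each
# protein ID with its row in ValuesList and writing the result to a CSV file.
# The function name, column headers, and output layout below are assumptions
# made for illustration; they are not taken from Hypo.py or from the rest of
# this file.
import csv


def write_values_csv(prot_ids, values_list, path):
    """Map protein IDs to their value rows and write one CSV row per ID."""
    # Biopython stores qualifier values as lists, so unwrap list-valued IDs
    # to get a hashable string key.
    keys = [pid[0] if isinstance(pid, list) else pid for pid in prot_ids]
    values_by_id = dict(zip(keys, values_list))
    with open(path, 'w', newline='') as handle:
        writer = csv.writer(handle)
        # Hypothetical header row matching the three values collected above.
        writer.writerow(['protein_id', 'locus_tag', 'old_locus_tag', 'product'])
        for prot_id, values in values_by_id.items():
            writer.writerow([prot_id] + values)
    return values_by_id


# Example call (left commented out so the sketch has no side effects):
# write_values_csv(MarProtIDList, ValuesList, 'Hypo.csv')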