- Write code that builds an index mapping each protein_id to its source file. The primary key should be protein_id.
Alongside #9, this code will make generating the parsed ENA tab file much quicker. Unchanged assembly files (TOC table) can skip parsing entirely; their contents are still recoverable by a reverse lookup on this index table (source file to protein_ids), followed by a query for the associated protein_id rows in the previous release's EFIDB.
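
A minimal sketch of what such an index could look like, assuming SQLite as the backing store. The table name `protein_source`, the column names, and the helper functions are all hypothetical and only illustrate the idea: protein_id is the primary key, and a secondary index on the source file keeps the reverse lookup cheap.

```python
import sqlite3


def create_index(conn):
    """Create the protein_id -> source_file table (hypothetical schema)."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS protein_source (
            protein_id  TEXT PRIMARY KEY,
            source_file TEXT NOT NULL
        )
        """
    )
    # Secondary index so reverse lookups by file avoid a full table scan.
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_protein_source_file "
        "ON protein_source (source_file)"
    )
    conn.commit()


def add_file(conn, source_file, protein_ids):
    """Record every protein_id found in one source file."""
    conn.executemany(
        "INSERT OR REPLACE INTO protein_source (protein_id, source_file) "
        "VALUES (?, ?)",
        [(pid, source_file) for pid in protein_ids],
    )
    conn.commit()


def source_file_for(conn, protein_id):
    """Forward lookup: which file did this protein_id come from?"""
    row = conn.execute(
        "SELECT source_file FROM protein_source WHERE protein_id = ?",
        (protein_id,),
    ).fetchone()
    return row[0] if row else None


def protein_ids_for_file(conn, source_file):
    """Reverse lookup: all protein_ids recorded for one source file."""
    rows = conn.execute(
        "SELECT protein_id FROM protein_source "
        "WHERE source_file = ? ORDER BY protein_id",
        (source_file,),
    )
    return [r[0] for r in rows]
```

For an unchanged assembly file, `protein_ids_for_file` would return the protein_ids to look up in the previous release's EFIDB instead of re-parsing the file. How those rows are then queried depends on the EFIDB schema, which this sketch does not assume.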