Problems running sci

Dear author,

Thanks for developing this package for calling sub-compartments. I tried to run it on the GM12878 dataset but ran into the problems described below. Perhaps you can help with the settings.

My pipeline:
1) Dump the inter-chromosomal matrix at 100 kb:

> hic_source='https://hicfiles.s3.amazonaws.com/hiseq/gm12878/in-situ/combined.hic'
> hic2sci.sh $hic_source $dir_main/inter_100kb.txt 100000

2) Run sci:

> python2.7 -m sci.sci -n GM12878_100kb -f $dir_main/inter_1000kb.txt -r 100000 -g chromosome_sizes/hg19.chrom.sizes -o both -s 1 -k 5

Then I got this error (on a server with 500 GB of memory):

> Reading inter_100kb_new.txt:   0%|                                                                                               | 149529/577250954 [00:00<57:29, 167287.38it/s]
> Traceback (most recent call last):
>   File "/usr/lib64/python2.7/runpy.py", line 162, in _run_module_as_main
>     "__main__", fname, loader, pkg_name)
>   File "/usr/lib64/python2.7/runpy.py", line 72, in _run_code
>     exec code in run_globals
>   File "3.Packages/sci/sci/sci.py", line 112, in <module>
>     run_sci()
>   File "/3.Packages/sci/sci/sci.py", line 97, in run_sci
>     myobject.load_interaction_data(oArgs.infile)
>   File "sci/hic.py", line 68, in load_interaction_data
>     start2, end2, count) = line.strip().split()
> ValueError: too many values to unpack
> 
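To narrow down which lines trigger the `ValueError`, one option is to count the whitespace-separated fields on every line of the input file. The sketch below is only a diagnostic idea, not part of sci: the function name `find_malformed_lines` is hypothetical, and the expected count of seven fields is an assumption, since the traceback shows only the last three unpacked names (`start2, end2, count`).

```python
from collections import Counter


def find_malformed_lines(lines, expected_fields=7, max_examples=5):
    """Tally fields per line and collect the first few lines that deviate.

    expected_fields=7 is an assumption; set it to the number of names
    that sci/hic.py actually unpacks from each line.
    """
    field_counts = Counter()
    examples = []
    for line_number, line in enumerate(lines, start=1):
        # Same tokenisation as in the traceback: strip, then split on whitespace.
        n_fields = len(line.strip().split())
        field_counts[n_fields] += 1
        if n_fields != expected_fields and len(examples) < max_examples:
            examples.append((line_number, line.rstrip("\n")))
    return field_counts, examples


# Possible usage on the dumped matrix (path taken from the command line):
#   import sys
#   with open(sys.argv[1]) as handle:
#       counts, bad = find_malformed_lines(handle)
#   print(dict(counts))
#   for line_number, text in bad:
#       print("line %d: %r" % (line_number, text))
```

If the histogram shows more than one field count, the listed examples should indicate whether the dump contains extra columns, header lines, or a different delimiter than the loader expects.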


Then I dumped the matrix at 1000kb and tried to do the analysis at 1Mb resolution:

> python2.7 -m sci.sci -n GM12878_1000kb -f $dir_main/inter_1000kb.txt -r 1000000 -g chromosome_sizes/hg19.chrom.sizes -o both -s 1 -k 5

The compartments I obtained seem to be randomly mixed up compared with those in predictions/GM12878_SCI_sub_compartments.bed:

![Screenshot 2020-11-17 at 20 19 55](https://user-images.githubusercontent.com/18595553/99437104-4f35ad00-2912-11eb-8ab3-cb13d7a3b721.png)

I also tried to find only two compartments, i.e. A and B:
> python2.7 -m sci.sci -n GM12878_1000kb -f $dir_main/inter_1000kb.txt -r 1000000 -g chromosome_sizes/hg19.chrom.sizes -o both -s 1 -k 2

Still, the compartments I obtained seem to be randomly mixed up compared with those in predictions/GM12878_SCI_sub_compartments.bed:

![Screenshot 2020-11-17 at 20 23 38](https://user-images.githubusercontent.com/18595553/99437470-ca975e80-2912-11eb-890a-ec4e7caa8fdf.png)

They are also not consistent with the A/B compartments annotated by the eigenvector:

![Screenshot 2020-11-17 at 20 42 25](https://user-images.githubusercontent.com/18595553/99439501-8bb6d800-2915-11eb-8a86-c2961df9902f.png)

Thanks for your help!


