mean vs identity pooling?

Hi,

The paper describes four pooling functions: 1. Mean, 2. Identity, 3. Transformer, and 4. LSTM.

I am confused about the difference between ```mean``` and ```identity```. As I understand it, ```mean``` simply averages the ```[CLS]``` embeddings of all the chunks, which gives a single ```[768]```-dimensional vector. How does the ```identity``` function work by comparison? Does it concatenate all the ```[CLS]``` vectors, and if so, wouldn't the result be a very long vector of size ```number of chunks x 768```?
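To make the shapes concrete, here is a minimal sketch of the two readings I have in mind. It uses random stand-ins for the ```[CLS]``` vectors and plain Python lists; the names `mean_pool` and `concat_pool` and the chunk count of 5 are made up for illustration. The concatenation is only my guess at what ```identity``` might mean, not something I have confirmed from the paper or the code.

```python
import random

HIDDEN = 768      # hidden size of each [CLS] embedding, as in the question
NUM_CHUNKS = 5    # hypothetical number of chunks for one document

random.seed(0)
# One [CLS] vector per chunk: NUM_CHUNKS rows of HIDDEN floats (random stand-ins).
cls_vectors = [
    [random.gauss(0.0, 1.0) for _ in range(HIDDEN)]
    for _ in range(NUM_CHUNKS)
]


def mean_pool(vectors):
    """Average the vectors element-wise; output length equals the hidden size."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def concat_pool(vectors):
    """Join the vectors end to end; output length is num_chunks * hidden size.

    This is the guessed reading of "identity", not a confirmed definition.
    """
    return [x for vec in vectors for x in vec]


mean_vec = mean_pool(cls_vectors)
concat_vec = concat_pool(cls_vectors)

print(len(mean_vec))    # 768
print(len(concat_vec))  # 3840
```

With mean pooling the output size stays fixed no matter how many chunks a document has, whereas under the concatenation reading it grows with the chunk count, which is the part I don't understand.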

Any help in understanding this concept would be appreciated! 

Thanks!
