송신일 | Notion

Paper summary in one PPT slide

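CNN (1 convolutional layer) + pre-trained word vectors (word2vec) ⇒ sentence classification
little hyperparameter tuning + static vectors (‘universal’ feature extractors) ⇒ excellent results
task-specific vectors learned through fine-tuning ⇒ further performance gains
variant using both task-specific & static vectors; the CNN models improve on the state of the art on 4 out of 7 tasks (incl. sentiment analysis, question classification)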
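
A minimal sketch of what one filter in the single convolutional layer computes, assuming the usual formulation: a filter w slides over windows of h consecutive word vectors, applies ReLU, and max-over-time pooling keeps the largest response. The function names, the toy 2-dimensional embeddings, and the weights are hypothetical and only illustrate the idea.

```python
def relu(x):
    return max(0.0, x)

def feature_map(sentence, w, b, h):
    """One feature per window of h words: relu(w . concat(window) + b)."""
    feats = []
    for i in range(len(sentence) - h + 1):
        # Concatenate the h word vectors of the current window.
        window = [v for vec in sentence[i:i + h] for v in vec]
        feats.append(relu(sum(wi * xi for wi, xi in zip(w, window)) + b))
    return feats

def max_over_time(feats):
    """Keep only the strongest response of this filter over the sentence."""
    return max(feats)

# Toy input (hypothetical): 4 words, 2-dimensional embeddings, window h = 3.
sentence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]]
w = [1.0, 0.0, 0.0, 1.0, 1.0, -1.0]  # length h * k = 6
feats = feature_map(sentence, w, b=0.0, h=3)
pooled = max_over_time(feats)
```

Each filter contributes one pooled value, so sentences of different lengths map to a fixed-size feature vector.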

=========================

< model hyperparameters >

activation function: ReLU (rectified linear units)
filter windows (h) = 3, 4, 5
feature maps = 100 (per window size)
dropout rate (p) = 0.5
L2-norm constraint (s) = 3
mini-batch size = 50
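
The settings above can be collected in a small config, together with a sketch of the L2-norm constraint: after a gradient step, a weight vector w is rescaled to norm s whenever its norm exceeds s. `Hyperparams` and `apply_l2_constraint` are hypothetical names chosen for this note.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class Hyperparams:
    filter_windows: tuple = (3, 4, 5)
    feature_maps: int = 100      # per window size
    dropout_p: float = 0.5
    l2_constraint: float = 3.0   # s
    batch_size: int = 50

def apply_l2_constraint(w, s):
    """Rescale w so that ||w||_2 == s whenever ||w||_2 > s; otherwise keep w."""
    norm = math.sqrt(sum(v * v for v in w))
    if norm > s:
        return [v * s / norm for v in w]
    return list(w)

hp = Hyperparams()
# One pooled feature per filter: 3 window sizes x 100 feature maps.
penultimate_dim = len(hp.filter_windows) * hp.feature_maps
# A vector with norm 5 is scaled back to norm 3.
clipped = apply_l2_constraint([3.0, 4.0], hp.l2_constraint)
```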

=========================

CNN-rand (random initialization): does not perform well
CNN-static: remarkably well
CNN-non-static: further improvements
Multichannel: results mixed
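
A rough way to keep the four variants apart is by how the word vectors are initialized and whether they are updated during training. The sketch below assumes the multichannel model uses two copies of the word2vec vectors, one kept static and one fine-tuned. `Variant` and `VARIANTS` are hypothetical names.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Variant:
    init: str         # "random" or "word2vec"
    trainable: tuple  # per channel: are the word vectors updated in training?

VARIANTS = {
    "CNN-rand": Variant("random", (True,)),
    "CNN-static": Variant("word2vec", (False,)),
    "CNN-non-static": Variant("word2vec", (True,)),
    "CNN-multichannel": Variant("word2vec", (False, True)),
}

def num_channels(name):
    """Number of embedding channels fed to the convolutional layer."""
    return len(VARIANTS[name].trainable)
```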

=========================

Dropout (on the penultimate layer): the features are multiplied element-wise by r, a ‘masking’ vector of Bernoulli random variables with probability p of being 1.