[Paper Review] Dropout as a Bayesian Approximation Explained

The following is quoted from the actual thesis.
The claim is that over-parametrized models were better at predicting uncertainty than small, lightweight models.
- An over-parametrized model here means a complex, heavy model with a large number of parameters.
The interpretation given is that the more parameters a model has, the more freedom it has in expressing uncertainty.
(Original) Models with a large number of parameters can capture a larger class of functions, leading to more ways of explaining the data, and as a result larger uncertainty estimates further from the data.
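To make "uncertainty estimate" concrete, here is a minimal sketch of how MC dropout produces one: keep dropout switched on at prediction time, run T stochastic forward passes, and take the sample mean and variance. Everything in it is my own assumption for illustration: the helper names (`make_mlp`, `stochastic_forward`, `mc_dropout_predict`), the tiny one-hidden-layer network, and the random untrained weights. It also leaves out the inverse-precision term that the paper adds to the predictive variance, so it shows only the spread caused by the dropout masks.

```python
import random
import statistics

def make_mlp(n_hidden, seed=0):
    """Random weights for a 1-input, 1-output MLP (untrained, illustration only)."""
    rng = random.Random(seed)
    w1 = [rng.gauss(0, 1) for _ in range(n_hidden)]
    b1 = [rng.gauss(0, 1) for _ in range(n_hidden)]
    w2 = [rng.gauss(0, 1) / n_hidden ** 0.5 for _ in range(n_hidden)]
    return w1, b1, w2

def stochastic_forward(params, x, drop_p, rng):
    """One forward pass with dropout kept ON, as it would be during training."""
    w1, b1, w2 = params
    keep = 1.0 - drop_p
    out = 0.0
    for a, b, c in zip(w1, b1, w2):
        h = max(0.0, a * x + b)      # ReLU hidden unit
        if rng.random() < keep:      # Bernoulli mask: keep the unit with prob. `keep`
            out += c * h / keep      # inverted-dropout scaling
    return out

def mc_dropout_predict(params, x, drop_p=0.5, T=200, seed=1):
    """Sample mean and variance over T stochastic forward passes."""
    rng = random.Random(seed)
    samples = [stochastic_forward(params, x, drop_p, rng) for _ in range(T)]
    return statistics.mean(samples), statistics.pvariance(samples)

if __name__ == "__main__":
    params = make_mlp(n_hidden=50)
    mean, var = mc_dropout_predict(params, x=2.0)
    print("predictive mean:", mean, "dropout variance:", var)
```

With `drop_p=0.0` every pass is identical and the variance collapses to zero, which is a quick way to check that the spread really comes from the dropout masks.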

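The dropout ratio is an important factor in predicting uncertainty.
The larger the model, the higher the dropout ratio it needs.
This is because dropout becomes less effective as the model grows.
When a network has many layers or nodes and dropout is applied to regularize it, the remaining layers and nodes tend to compensate for that regularization.
Reportedly, the appropriate dropout ratio here was found by grid search.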
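A grid search over the dropout ratio could look roughly like the sketch below. This is not the authors' setup: the candidate rates, the function name `grid_search_dropout`, and the toy scoring function are all hypothetical. In practice `score_fn` would train a model with the given rate and return something like the validation log-likelihood.

```python
def grid_search_dropout(score_fn, candidates=(0.05, 0.1, 0.2, 0.3, 0.5)):
    """Evaluate every candidate dropout rate; return the best one and all scores."""
    scores = {p: score_fn(p) for p in candidates}
    best_p = max(scores, key=scores.get)   # higher score is better
    return best_p, scores

def toy_validation_score(drop_p):
    # Hypothetical stand-in for "train with drop_p, then measure validation quality".
    # It simply peaks at 0.2 so the search has something to find.
    return -(drop_p - 0.2) ** 2

best_p, scores = grid_search_dropout(toy_validation_score)
print("best dropout rate:", best_p)
```

Since the point above is that larger models need larger ratios, the candidate grid would have to be chosen again for each model size.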

The key factors that determine the uncertainty are the model structure, the model prior, and the approximating distribution.
- Model Structure: ResNet, AlexNet, LeNet, ...
- Model Prior: p(f), p(W)
- Approximating Distribution: the method used to obtain a distribution over the NN weights (Dropout, PBP)
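For reference, this is how the three factors enter the predictive uncertainty, written in the paper's notation as I understand it. The symbols are: $q(W)$ is the approximating distribution, $\hat{y}(x^*, W_t)$ is one stochastic forward pass through the chosen model structure with weights $W_t \sim q(W)$, $T$ is the number of passes, and $\tau$ is the model precision, which is tied to the prior. Treat it as a summary sketch rather than a full derivation.

```latex
\begin{align}
q(y^* \mid x^*) &= \int p(y^* \mid x^*, W)\, q(W)\, dW \\
\mathbb{E}_{q}[y^*] &\approx \frac{1}{T} \sum_{t=1}^{T} \hat{y}(x^*, W_t) \\
\operatorname{Var}_{q}[y^*] &\approx \tau^{-1} I
  + \frac{1}{T} \sum_{t=1}^{T} \hat{y}(x^*, W_t)^{\top} \hat{y}(x^*, W_t)
  - \mathbb{E}_{q}[y^*]^{\top}\, \mathbb{E}_{q}[y^*]
\end{align}
```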

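That concludes my summary of the paper.

I have not read that many papers, but I remember this one as being very hard to study.

I honestly cannot tell how much of it will make sense to someone encountering it for the first time.

I tried to unpack it as much as I could, and I hope it was that much easier to follow.

If possible, I recommend the edwith lectures by Dr. Sungjoon Choi (최성준) or Prof. Il-Chul Moon (문인철).

Their explanations are not exactly gentle, but I doubt you will find better ones.

I will end this post hoping that its readers go through less trial and error than I did.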
