
Posting info

  • Posted by: Anonymous
  • Time: 2021-06-03 09:00:15

Unanswered: DataScience - Teacher Student Model Semi-supervised

In "Billion-scale semi-supervised learning for image classification" (Facebook AI Research), the paper gives the following reasons why the student model is not trained on D and D-hat combined:
Remark: It is possible to use a mixture of data in D and Dˆ for training like in previous approaches [34]. However, this requires searching for optimal mixing parameters, which depend on other parameters. This is resource-intensive in the case of our large-scale training. Additionally, as shown later in our analysis, taking full advantage of large-scale unlabelled data requires adopting long pre-training schedules, which adds some complexity when mixing is involved.

I'm not sure what the first reason, "searching for mixing parameters", refers to.

And for the second reason: isn't D + D-hat already prepared before training the student model? Why would mixing add complexity?
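For context on what "mixing parameters" could mean, here is a hypothetical sketch (not the paper's actual code; `alpha`, `lam`, and the helper names are my own assumptions). When labelled data D and pseudo-labelled data D-hat are mixed in one training loop, at least two extra hyperparameters appear: the fraction of each batch drawn from D, and the relative weight of the pseudo-label loss. These must be tuned jointly with the learning-rate schedule, which is the search the paper avoids by pre-training on D-hat alone and then fine-tuning on D:

```python
import random

def make_mixed_batch(D, D_hat, batch_size, alpha):
    """Draw a batch with a fraction `alpha` of true-labelled examples.

    `alpha` is one of the "mixing parameters": change it, and the optimal
    learning rate and schedule length typically change too, forcing a joint
    hyperparameter search.
    """
    n_labelled = int(round(alpha * batch_size))
    batch = random.sample(D, min(n_labelled, len(D)))
    batch += random.sample(D_hat, batch_size - len(batch))
    return batch

def mixed_loss(loss_labelled, loss_pseudo, lam):
    """Weighted sum of the two loss terms; `lam` is a second mixing knob."""
    return loss_labelled + lam * loss_pseudo

# Toy usage with dummy examples tagged by their source.
D = [("x%d" % i, "true") for i in range(100)]
D_hat = [("u%d" % i, "pseudo") for i in range(1000)]
batch = make_mixed_batch(D, D_hat, batch_size=32, alpha=0.25)
```

With `alpha=0.25` and `batch_size=32`, each batch holds 8 true-labelled and 24 pseudo-labelled examples; the two-stage scheme in the paper has no such per-batch ratio to tune.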

Thanks, everyone.

--

0 answers
