Is the validation set a subset of the training set, or of the test set?

Reply 1: AI森

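The validation set can be taken from the test set. In the Person re-ID code by Zhedong Zheng (鄭哲東) of the University of Technology Sydney, the validation set is drawn from part of the test set: https://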

That is the link to the GitHub repository.

Reply 2:

For this reason, no example from the test set can be used in the validation set. Therefore, we always construct the validation set from the training data. Specifically, we split the training data into two disjoint subsets.

One of these subsets is used to learn the parameters. The other subset is our validation set, used to estimate the generalization error during or after training, allowing for the hyperparameters to be updated accordingly.

Source: "Deep Learning", Ian Goodfellow, Yoshua Bengio, and Aaron Courville, Chapter 5.3, "Hyperparameters and Validation Sets"

What the quoted passage means:

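Test set data must not take part in any model selection decision, including the choice of hyperparameters, so the validation set cannot be drawn from the test set. Instead, the training set is split into two parts: one is used to learn the parameters (the weight vector w), and the other serves as the validation set, which is used to estimate the generalization error during or after training so that we can adjust the hyperparameters we chose beforehand.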

Therefore, the validation set is a subset of the training set.
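
To make the procedure concrete, here is a minimal sketch of it, not code from the book. It assumes a toy one-dimensional ridge regression through the origin, with examples given as (x, y) pairs. The names split_train_validation, fit_ridge_1d, mse, and select_lambda, as well as the regularization strength lam, are hypothetical and chosen only for illustration. The training data is shuffled and split into two disjoint subsets; the weight w is learned on one, and the hyperparameter lam is chosen by the error on the other. The test set never appears.

```python
import random


def split_train_validation(train_examples, val_fraction=0.2, seed=0):
    """Split the training data into two disjoint subsets: (fit, validation)."""
    if not 0.0 < val_fraction < 1.0:
        raise ValueError("val_fraction must be strictly between 0 and 1")
    indices = list(range(len(train_examples)))
    random.Random(seed).shuffle(indices)  # seeded shuffle, reproducible
    n_val = max(1, int(len(indices) * val_fraction))
    val_subset = [train_examples[i] for i in indices[:n_val]]
    fit_subset = [train_examples[i] for i in indices[n_val:]]
    return fit_subset, val_subset


def fit_ridge_1d(examples, lam):
    """Learn the parameter w of y = w * x with an L2 penalty of strength lam."""
    sxy = sum(x * y for x, y in examples)
    sxx = sum(x * x for x, _ in examples)
    return sxy / (sxx + lam)


def mse(examples, w):
    """Mean squared error of the model y = w * x on the given examples."""
    return sum((y - w * x) ** 2 for x, y in examples) / len(examples)


def select_lambda(fit_subset, val_subset, candidates):
    """Pick the hyperparameter with the lowest validation error."""
    best_lam, best_err = None, float("inf")
    for lam in candidates:
        w = fit_ridge_1d(fit_subset, lam)  # parameters: learned on fit subset
        err = mse(val_subset, w)           # hyperparameter: judged on validation
        if err < best_err:
            best_lam, best_err = lam, err
    return best_lam
```

After select_lambda has returned, the model would be evaluated on the test set exactly once to report the final error. If that test error were fed back into the choice of lam, the test set would in effect become a second validation set.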

Reply 3:

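These two points don't contradict each other, do they? The validation set is taken from the training set and comes with supervision (labels). The test set usually has no labels, or if it does, you are not allowed to look at them. The reason the validation set should match the test set's distribution as closely as possible is so that the hyperparameters you tune on the validation set also give the best performance on the test set. Whether you can actually obtain such an identically distributed validation set in practice is a separate matter, and a difficult one.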
