Channel training for reconfigurable intelligent surfaces (RIS) differs from that for conventional MIMO: in addition to pilots, it requires configuring a sequence of RIS training states. Several RIS training schemes have been studied in the literature, but the effect of training overhead and estimation accuracy on capacity has received little attention. Understanding this effect is necessary for guiding the choice of RIS dimensionality and of the modulation and coding parameters. The present work fills this gap and shows that the spectral efficiency of RIS with DFT training (which uses the columns of the DFT matrix as the RIS training states) is higher than that with canonical training (which uses the canonical basis as the RIS training states). Specifically, with 32 RIS elements and a coherence interval of 150 time slots, DFT training achieves a gain of 2 bits/s/Hz over canonical training. Our results also reveal that beyond a certain critical RIS array size, a further increase in size is not beneficial, and that the optimal array size decreases as the SNR increases.
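
To make the two training schemes concrete, the following minimal sketch builds both sets of RIS training states and compares the per-element least-squares (LS) estimation error. It rests on assumptions not taken from this work: a simplified noise-only model y = Phi^T h + n with one pilot slot per training state, no direct link, and an unconstrained LS estimator. The names `training_states` and `ls_error_variance` are hypothetical. The sketch illustrates one commonly cited reason DFT states can help (all elements reflect in every slot), not the capacity analysis reported here.

```python
import numpy as np


def training_states(n_elements, scheme):
    """Return an n x n matrix whose columns are the RIS training states."""
    if scheme == "dft":
        # Columns of the (unnormalized) DFT matrix: every entry has unit modulus,
        # so all RIS elements are active in every training slot.
        return np.fft.fft(np.eye(n_elements))
    if scheme == "canonical":
        # Canonical basis: exactly one RIS element is active per training slot.
        return np.eye(n_elements, dtype=complex)
    raise ValueError(f"unknown scheme: {scheme}")


def ls_error_variance(states, noise_var=1.0):
    """Average per-element LS error variance under the assumed model
    y = states.T @ h + n, with n ~ CN(0, noise_var * I)."""
    inv = np.linalg.inv(states.T)          # LS error is inv @ n
    cov = noise_var * inv @ inv.conj().T   # covariance of the LS error
    return float(np.real(np.diag(cov)).mean())


if __name__ == "__main__":
    n = 32  # number of RIS elements, matching the example in the abstract
    for scheme in ("dft", "canonical"):
        var = ls_error_variance(training_states(n, scheme))
        print(f"{scheme:9s} per-element LS error variance: {var:.4f}")
```

Under these assumptions, both schemes spend the same number of training slots, but the DFT states reduce the per-element error variance by a factor equal to the number of RIS elements, because the DFT matrix F satisfies F^H F = N I.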