ModernBERT, like BERT, is an encoder-only model. Encoder-only models output an embedding vector, a list of numbers, for their input, which means that they literally encode human language as a numerical representation.
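A minimal sketch of pulling such an embedding out of ModernBERT, assuming the Hugging Face `transformers` library and the `answerdotai/ModernBERT-base` checkpoint; mean pooling is just one common way to collapse the per-token vectors into a single sentence embedding, not the only option:

```python
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("answerdotai/ModernBERT-base")
model = AutoModel.from_pretrained("answerdotai/ModernBERT-base")

inputs = tokenizer("ModernBERT encodes text into vectors.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# last_hidden_state has shape (batch, sequence_length, hidden_size):
# one vector per token. Mean-pool over the token axis to get one
# fixed-size vector for the whole sentence. (For padded batches you
# would weight the mean by the attention mask.)
embedding = outputs.last_hidden_state.mean(dim=1)
print(embedding.shape)  # e.g. torch.Size([1, 768]) for the base model
```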
BERT is pre-trained on a large corpus of unlabelled text, including the entire English Wikipedia (that's 2,500 million words!) and BookCorpus (800 million words).
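That unlabelled text is usable because BERT's masked-language-modelling objective manufactures its own labels: hide a token, ask the model to restore it. A small illustration of the result, assuming the `transformers` fill-mask pipeline and the `bert-base-uncased` checkpoint:

```python
from transformers import pipeline

# Predicting masked tokens is what BERT learned from all that raw text;
# no human annotation was needed, the corpus itself supplied the answers.
fill = pipeline("fill-mask", model="bert-base-uncased")
for pred in fill("The capital of France is [MASK]."):
    print(f"{pred['token_str']:>10}  {pred['score']:.3f}")
```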