batch_size=16, block_size=8192, max_iters=50000, eval_interval=1, learning_rate=3e-4, n_embd=6144, n_head=48, n_layer=48, dropout=0.0. I think the tokenizer has to be a GPT-4-family BPE: o200k_base (the GPT-4o encoding).
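
To get a feel for what these settings imply, here is a rough sketch that wraps them in a config object and derives a few numbers from them. `TrainConfig` and its helper names are hypothetical, not from any particular repo. The `vocab_size` of 200,000 is a rounded assumption for o200k_base, and the parameter count uses the common `12 * n_layer * n_embd^2` approximation for a GPT-style model with a 4x MLP and tied embeddings, so treat the result as a ballpark only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hypothetical container for the hyperparameters quoted above."""

    batch_size: int = 16
    block_size: int = 8192
    max_iters: int = 50_000
    eval_interval: int = 1
    learning_rate: float = 3e-4
    n_embd: int = 6144
    n_head: int = 48
    n_layer: int = 48
    dropout: float = 0.0
    # Assumption: rounded vocabulary size for o200k_base, not an exact value.
    vocab_size: int = 200_000

    @property
    def head_dim(self) -> int:
        # Each attention head gets an equal slice of the embedding.
        if self.n_embd % self.n_head != 0:
            raise ValueError("n_embd must be divisible by n_head")
        return self.n_embd // self.n_head

    @property
    def tokens_per_iter(self) -> int:
        # Tokens processed in one optimizer step (no gradient accumulation).
        return self.batch_size * self.block_size

    def approx_params(self) -> int:
        # Per block: about 4*d^2 for attention plus 8*d^2 for a 4x MLP.
        # Biases and layer norms are ignored; embeddings are assumed tied.
        blocks = 12 * self.n_layer * self.n_embd**2
        embeddings = self.vocab_size * self.n_embd
        return blocks + embeddings


cfg = TrainConfig()
print(f"head_dim        = {cfg.head_dim}")
print(f"tokens per iter = {cfg.tokens_per_iter:,}")
print(f"total tokens    = {cfg.tokens_per_iter * cfg.max_iters:,}")
print(f"approx params   = {cfg.approx_params() / 1e9:.1f}B")
```

Under those assumptions this comes out to roughly 23B parameters with a head dimension of 128, so it is worth checking memory requirements before committing to these values. Note also that eval_interval=1 runs an evaluation after every training step.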

Answer selected by Eamon2009