Configurations? #49

Eamon2009 (Maintainer) · 2026-05-23T13:06:46Z

What hyperparameters are good for training on a GPU cluster?


codeaddict-119 (Collaborator) · 2026-05-23T13:11:56Z

batch_size=16, block_size=8192, max_iters=50000, eval_interval=1, learning_rate=3e-4, n_embd=6144, n_head=48, n_layer=48, dropout=0.0. For the tokenizer, I think you should use a GPT-4-class BPE: o200k_base (the encoding tiktoken ships for GPT-4o).
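As a rough sanity check on those numbers, here is a minimal sketch that puts the suggested values into a config object and estimates the model size. The `TrainConfig` class and the helper functions are hypothetical, not this repository's API. The `vocab_size` of 200,000 is an approximation for o200k_base, and the parameter count uses the common 12 * n_layer * n_embd^2 rule of thumb (4x MLP expansion, no biases, tied input/output embeddings), which may not match the actual model code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hypothetical container for the values suggested in this thread."""
    batch_size: int = 16
    block_size: int = 8192
    max_iters: int = 50_000
    eval_interval: int = 1
    learning_rate: float = 3e-4
    n_embd: int = 6144
    n_head: int = 48
    n_layer: int = 48
    dropout: float = 0.0
    # Assumption: approximate vocabulary size for o200k_base.
    vocab_size: int = 200_000


def head_dim(cfg: TrainConfig) -> int:
    """Per-head width; n_embd must divide evenly across heads."""
    if cfg.n_embd % cfg.n_head != 0:
        raise ValueError("n_embd must be divisible by n_head")
    return cfg.n_embd // cfg.n_head


def approx_params(cfg: TrainConfig) -> int:
    """Rule-of-thumb parameter count.

    Assumes 4*d^2 attention weights and 8*d^2 MLP weights per layer,
    plus one token-embedding matrix shared with the output head.
    """
    blocks = 12 * cfg.n_layer * cfg.n_embd ** 2
    embedding = cfg.vocab_size * cfg.n_embd
    return blocks + embedding


def tokens_per_iter(cfg: TrainConfig) -> int:
    """Tokens consumed per optimizer step on a single process."""
    return cfg.batch_size * cfg.block_size


cfg = TrainConfig()

if __name__ == "__main__":
    print(f"head_dim        = {head_dim(cfg)}")
    print(f"approx params   = {approx_params(cfg) / 1e9:.1f}B")
    print(f"tokens per iter = {tokens_per_iter(cfg):,}")
```

Under these assumptions the config comes out at roughly 21.7B parameters in the transformer blocks plus about 1.2B for the embedding table, with a head width of 128 and 131,072 tokens per step per process. A model of that size does not fit on a single GPU in a plain training loop, so it only makes sense with some form of sharding or model parallelism across the cluster.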
