
0 points · germanjoey · 3y ago · 0 comments

How big is this model? (i.e., how many parameters?) I can't find this anywhere.

germanjoey (OP) · 3y ago

welp,

This report focuses on the capabilities, limitations, and safety properties of GPT-4. GPT-4 is a Transformer-style model [33] pre-trained to predict the next token in a document, using both publicly available data (such as internet data) and data licensed from third-party providers. The model was then fine-tuned using Reinforcement Learning from Human Feedback (RLHF) [34]. Given both the competitive landscape and the safety implications of large-scale models like GPT-4, this report contains no further details about the architecture (including model size), hardware, training compute, dataset construction, training method, or similar.