Implementing a ChatGPT-like LLM from scratch, step by step

(github.com)

739 points · rasbt · 2y ago · 98 comments

76 comments · 21 top-level

wslh2y ago· 11 in thread

I jumped to GitHub thinking this would be a free resource (with all due respect to the author's work).

What free resources are available and recommended in the "from scratch" vein?

natrys2y ago

Neural Networks: Zero to Hero[1] by Andrej Karpathy

[1] https://karpathy.ai/zero-to-hero.html

villedespommes2y ago

+1, Andrej is an amazing educator! I'd also recommend his https://youtu.be/kCc8FmEb1nY?si=mP0cQlQ4rcceL2uP and checking out his GitHub repos. minGPT, for example, implements a small GPT model that's compatible with the HF API, whereas the more modern nanoGPT shows how to use newer features such as flash attention. The quality of every video and every blog post is just so high.

larme2y ago

https://jaykmody.com/blog/gpt-from-scratch/ for a gpt2 inference engine in numpy

then

https://www.dipkumar.dev/becoming-the-unbeatable/posts/gpt-k... for adding a kv cache implementation
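To make the KV-cache idea concrete, here is a minimal pure-Python sketch. It is not code from either linked post: the names `attend`, `KVCache`, `decode_with_cache`, and `decode_without_cache` are hypothetical, and the query/key/value vectors are assumed to be given rather than produced by learned projections. It only illustrates that caching past keys and values yields the same attention output as reprocessing the whole prefix at every step.

```python
import math

def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    """Single-head scaled dot-product attention for one query vector."""
    scale = 1.0 / math.sqrt(len(query))
    scores = [scale * sum(q * k for q, k in zip(query, key)) for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]

class KVCache:
    """Keeps keys/values of earlier positions so each step only adds the new pair."""
    def __init__(self):
        self.keys, self.values = [], []

    def step(self, q, k, v):
        self.keys.append(k)
        self.values.append(v)
        return attend(q, self.keys, self.values)

def decode_without_cache(qs, ks, vs):
    # Rebuilds the full causal prefix at every position.
    return [attend(qs[t], ks[: t + 1], vs[: t + 1]) for t in range(len(qs))]

def decode_with_cache(qs, ks, vs):
    # Feeds one position at a time and reuses what is already cached.
    cache = KVCache()
    return [cache.step(q, k, v) for q, k, v in zip(qs, ks, vs)]
```

In a real model the saving comes from not recomputing the key and value projections for earlier tokens; this sketch skips the projections entirely, so it shows the bookkeeping rather than the speedup.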

larme2y ago

I'd like to add that most of these texts only talk about the inference part. This book (I also purchased the draft version) has training and finetuning in the TOC. I assume it will include material on how to do training and finetuning from scratch.

politelemon2y ago

I'd go with https://course.fast.ai/

It's much more accessible to regular developers, and doesn't make assumptions about any kind of mathematics background. It's a good starting point after which other similar resources start to make more sense.

PheonixPharts2y ago

I honestly cannot fathom why anyone working in the AI space would find $50 too much to spend to gain a deeper insight into the subject. Creating educational materials requires an insane amount of work, and I can promise, no matter how successful this book is, if rasbt were to do the math on income generated over hours spent creating it, it wouldn't make sense as an hourly rate.

Plenty of other people have this understanding of these topics, and you know what they chose to do with that knowledge? Keep it to themselves and go work at OpenAI to make far more money keeping that knowledge private.

If you want to live in a world where this knowledge is open, at the very least refrain from publicly complaining about a book that cost roughly the same as a decent dinner.

rasbtOP2y ago

Yeah, I don't think creating educational materials makes sense from an economical perspective, but it's one of my hobbies that gives me joy for some reason :). Hah, and 'insane amount of work' is probably right -- lots of sacrifices to carve out that necessary time.

layer82y ago

> anyone working in the AI space

I would have expected the main target audience to be people NOT working in the AI space, that don’t have any prior knowledge (“from scratch”), just curious to learn how an LLM works.

wslh2y ago

Not talking about affordability but about following links thinking that I would find another kind of resource. Beyond this case, this happens all the time with click-baity content. Again, if the link were to Amazon or the publisher it would be clearly associated with a product, while GitHub is associated with open source content. Not being pedantic, just an observation from browsing the web.

_giorgio_2y ago

Not to be pedantic, but in this case it's probably 30 usd for print and ebook (there are always coupons on the manning website).

rasbtOP2y ago

I added notes to the Jupyter notebooks, I hope they are also readable as standalone from the repo.

npalli2y ago· 9 in thread

  import torch

From the first code sample, not quite from scratch :-)

rasbtOP2y ago

Lol ok, otherwise it would probably be not very readable due to the verbosity. The book shows how to implement LayerNorm, Softmax, Linear layers, GeLU etc without using the pre-packaged torch versions though.
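As a rough illustration of what "without the pre-packaged versions" can look like, here is a minimal sketch of those building blocks in plain Python (no torch at all). This is not the book's code: the function names and list-based shapes are assumptions chosen for readability, and the GELU uses the common tanh approximation.

```python
import math

def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def gelu(x):
    # tanh approximation of the Gaussian Error Linear Unit
    return 0.5 * x * (1.0 + math.tanh(math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)))

def layer_norm(xs, scale=None, shift=None, eps=1e-5):
    # Normalize one feature vector to zero mean and unit (biased) variance,
    # then apply an optional per-feature scale and shift.
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    scale = scale or [1.0] * n
    shift = shift or [0.0] * n
    return [g * (x - mean) / math.sqrt(var + eps) + b for x, g, b in zip(xs, scale, shift)]

def linear(xs, weight, bias):
    # weight is a list of rows with shape (out_features, in_features)
    return [sum(w * x for w, x in zip(row, xs)) + b for row, b in zip(weight, bias)]
```

For example, `linear([1.0, 2.0], [[1.0, 0.0], [0.0, 1.0]], [0.5, 0.5])` applies an identity weight matrix plus a bias, and `layer_norm([1.0, 2.0, 3.0])` returns a vector centered on zero.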

PheonixPharts2y ago

Automatic differentiation is why we are able to have complex models like transformers, it's arguably the key reason (in addition to large amounts of data and massive compute resources) that we have the revolution in AI that we have.

Nobody working in this space is hand calculating derivatives for these models. Thinking in terms of differentiable programming is a given and I think certainly counts as "from scratch" in this case.

Any time I see someone post a comment like this, I suspect they don't really understand what's happening under the hood or how contemporary machine learning works.

re2y ago

> Thinking in terms of differentiable programming is a given and I think certainly counts as "from scratch" in this case.

I have to disagree on that being an obvious assumption for the meaning of "from scratch", especially given that the book description says that readers only need to know Python. It feels like if I read "Crafting Interpreters" only to find that step one is to download Lex and Yacc because everyone working in the space already knows how parsers work.

> I suspect they don't really understand what's happening under the hood or how contemporary machine learning works.

Everyone has to start somewhere. I thought I would be interested in a book like this precisely because I don't already fully understand what's happening under the hood, but it sounds like it might not actually be a good starting point for my idea of "from scratch."

d0mine2y ago

Nobody writes code in terms of NANDs, but there is the Nand to Tetris course (and the book "The Elements of Computing Systems: Building a Modern Computer from First Principles"): https://www.nand2tetris.org

Going from PyTorch to LLMs has a lot to show even without the Python-to-PyTorch part. It reminds me of "Neural Networks: Zero to Hero" by Andrej Karpathy https://m.youtube.com/playlist?list=PLAqhIrjkxbuWI23v9cThsA9... Prerequisites: solid programming (Python), intro-level math (e.g. derivative, gaussian). https://karpathy.ai/zero-to-hero.html

schneems2y ago

I’m very comfortable with AI in general but not so much with Machine Learning. I understand transformers are a key piece of the puzzle that enables tools like LLMs but don’t know much about them.

Do you (or others) have good resources explaining what they are and how they work at a high level?

nerdponx2y ago

I don't think implementing autograd is relevant or in-scope for learning about how transformers work (or writing out the gradient for transformer by hand, I can't even imagine doing that).

iopq2y ago

To code from scratch you should first fab your own semiconductors

politelemon2y ago

They should probably

    import universe

first.

two_in_one2y ago

at least it wasn't

   from transformers import

whartung2y ago· 8 in thread

Can I use any of the information in this book to learn about reinforcement learning?

My goal is to have something learn to land, like a lunar lander. Simple, start at 100 feet, thrust in one direction, keep trying until you stop making craters.

Then start adding variables, such as now it's moving horizontally, adding a horizontal thruster.

next, remove the horizontal thruster and let the lander pivot.

Etc.

I just have no idea how to start with this, but this seems "mainstream" ML, curious if this book would help with that.

Buttons8402y ago

I enjoyed "Grokking Deep Reinforcement Learning"[0]. It doesn't include anything about transformers though. Also, see Python's gymnasium[1] library for a lunar lander environment, it's the one I focused on most while I was learning and I've solved it a few different ways now. You can also look at my own notebook I used when implementing Soft Actor Critic with PyTorch not too long ago[2], it's not great for teaching, but maybe you can get something out of it.

[0]: https://www.manning.com/books/grokking-deep-reinforcement-le...

[1]: https://gymnasium.farama.org/environments/box2d/

[2]: https://github.com/DevJac/learn-pytorch/blob/main/SAC.ipynb

PheonixPharts2y ago

Reinforcement learning is an entirely separate area of research from LLMs and, while often seen as part of ML (Tom Mitchell's classic Machine Learning has a great section on Q learning, even if it feels a bit dated in other areas) it has little to do with contemporary ML work. Even with things like AlphaGo, what you find is basically work in using deep neural networks as an input into classic RL techniques.

Sutton and Barto's Reinforcement Learning: An Introduction is widely considered the definitive intro to the topic.

rasbtOP2y ago

Sorry, in that case I would rather recommend a dedicated RL book. The RL part in LLMs will be very specific to LLMs, and I will only cover what's absolutely relevant in terms of background info. I do have a longish intro chapter on RL in my other general ML/DL book (https://github.com/rasbt/machine-learning-book/tree/main/ch1...) but like others said, I would recommend a dedicated RL book in your case.

thatguysaguy2y ago

Try OpenAI's spinning up: https://spinningup.openai.com/en/latest/

Buttons8402y ago

This is a good and short introduction to RL. The density of the information in Spinning Up was just right for me and I think I've referred to it more often than any other resource when actually implementing my own RL algorithms (PPO and SAC).

If I had to recommend a curriculum to a friend I would say:

(1) Spend a few hours on Spinning Up.

(2) If the mathematical notation is intimidating, read Grokking Deep Reinforcement Learning (from Manning), which is slower paced and spends a lot of time explaining the notation itself, rather than just assuming the mathematical notation is self-explanatory as is so often the case. This book has good theoretical explanations and will get you some running code.

(3) Spend a few hours with Spinning Up again. By this point you should be a little comfortable with a few different RL algorithms.

(4) Read Sutton's book, which is "the bible" of reinforcement learning. It's quite approachable, but it would be a bit dry and abstract without some hands-on experience with RL I think.

sorenjan2y ago

That's exactly what the Q-learning lab in this course does:

https://www.ida.liu.se/~TDDC17/info/labs/rl.en.shtml
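For a sense of what tabular Q-learning on a lander-style problem looks like, here is a minimal sketch. It is not taken from the linked lab or from gymnasium: the one-dimensional dynamics in `step`, the reward values, and all constants are hypothetical choices made to keep the example tiny.

```python
import random

ACTIONS = (0, 1)              # 0 = coast, 1 = fire the thruster
MAX_ALT, MAX_SPEED = 10, 3    # hypothetical bounds on altitude and speed

def step(state, action):
    """Toy 1-D lander dynamics: returns (next_state, reward, done)."""
    alt, vel = state
    # Gravity changes velocity by -1 per step; the thruster adds +2.
    vel = max(-MAX_SPEED, min(MAX_SPEED, vel - 1 + 2 * action))
    alt = min(MAX_ALT, alt + vel)
    if alt <= 0:
        # Touching down slowly counts as a landing, anything faster is a crater.
        return (0, vel), (1.0 if vel >= -1 else -1.0), True
    return (alt, vel), 0.0, False

def train(episodes=5000, alpha=0.1, gamma=0.99, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    q = {}  # maps (state, action) -> estimated return, missing entries count as 0
    for _ in range(episodes):
        state = (MAX_ALT, 0)
        for _t in range(100):  # step limit so hovering cannot run forever
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # explore
            else:
                action = max(ACTIONS, key=lambda a: q.get((state, a), 0.0))  # exploit
            nxt, reward, done = step(state, action)
            best_next = 0.0 if done else max(q.get((nxt, a), 0.0) for a in ACTIONS)
            old = q.get((state, action), 0.0)
            # Standard Q-learning update toward reward + discounted best next value.
            q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
            state = nxt
            if done:
                break
    return q

def greedy_landing(q):
    """Roll out the greedy policy once and return the last reward seen."""
    state, reward = (MAX_ALT, 0), 0.0
    for _t in range(100):
        action = max(ACTIONS, key=lambda a: q.get((state, a), 0.0))
        state, reward, done = step(state, action)
        if done:
            break
    return reward
```

Calling `greedy_landing(train())` reports whether the learned policy ends in a soft landing (1.0), a crater (-1.0), or runs out of steps (0.0). Adding a horizontal axis or a pivoting lander, as described in the question, would enlarge the state and action sets but leave the update rule unchanged.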

smokel2y ago

This book seems to focus on large language models, for which RLHF is sometimes a useful addition.

To learn more about RL, most people would advise the Sutton and Barto book, available at: http://incompleteideas.net/book/the-book-2nd.html

Buttons8402y ago

I would recommend this as a second book after reading a "cookbook" style book that is more focused on getting real code working. After some hands-on experience with RL (whether you succeed or fail), Sutton's book will be a lot more interesting and approachable.

malermeister2y ago· 5 in thread

How does this compare to the karpathy video [0]? I'm trying to get into LLMs and am trying to figure out what the best resource to get that level of understanding would be.

[0] https://www.youtube.com/watch?v=kCc8FmEb1nY

rasbtOP2y ago

Haven't fully watched this but from a brief skimming, here are some differences that the book has:

- it implements a real word-level LLM instead of a character-level LLM

- after pretraining also shows how to load pretrained weights

- instruction-finetune that LLM after pretraining

- code the alignment process for the instruction-finetuned LLM

- also show how to finetune the LLM for classification tasks

- the book overall has lots of figures. For Chapter 3 alone, there are 26 figures :)

The video looks awesome though. I think it's probably a great complementary resource to get a good solid intro because it's just 2 hours. I think reading the book will probably be more like 10 times that time investment.

malermeister2y ago

Thank you for the answer! What is the knowledge that your book requires? If I have a lot of software dev experience and sorta kinda remember algebra from uni, would it be a good fit?

_giorgio_2y ago

You can't understand it unless you already know most of the stuff.

I've watched it many times to understand well most of it.

And obviously you must already know pytorch really well, including the matrix multiplication, backpropagation etc. He speaks very fast too...

hadjian2y ago

Did you really watch all videos in the playlist? I am at video 4 and had no background in PyTorch or numpy.

In my opinion he covers everything needed to understand his lectures. Even broadcasting and multidimensional indexing with numpy.

Also in the first lecture you will implement your own python class for building expressions including backprop with an API modeled after PyTorch.

IMHO it is the second lecture I can recommend without hesitation. The other is Gilbert Strang on linear algebra.
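The kind of expression-building class with backprop mentioned above can be sketched in a few dozen lines. The following is a minimal illustration in that spirit, not the lecture's actual code: the class name `Value` and its internals are assumptions, and only addition, multiplication, and tanh are supported. The `.backward()` call and `.grad` attribute mirror the PyTorch-style API.

```python
import math

class Value:
    """A scalar that remembers how it was computed so gradients can flow backward."""

    def __init__(self, data, parents=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents
        self._backward = lambda: None  # leaf nodes have nothing to propagate

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other))
        def _backward():
            # d(a + b)/da = 1 and d(a + b)/db = 1
            self.grad += out.grad
            other.grad += out.grad
        out._backward = _backward
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other))
        def _backward():
            # d(a * b)/da = b and d(a * b)/db = a
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad
        out._backward = _backward
        return out

    def tanh(self):
        t = math.tanh(self.data)
        out = Value(t, (self,))
        def _backward():
            # d tanh(x)/dx = 1 - tanh(x)^2
            self.grad += (1.0 - t * t) * out.grad
        out._backward = _backward
        return out

    def backward(self):
        # Visit nodes in reverse topological order so each node's gradient
        # is complete before it is pushed to its parents.
        order, seen = [], set()
        def visit(node):
            if node not in seen:
                seen.add(node)
                for parent in node._parents:
                    visit(parent)
                order.append(node)
        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            node._backward()
```

For example, with `x = Value(2.0)`, `w = Value(-3.0)`, and `b = Value(1.0)`, calling `(x * w + b).backward()` leaves `x.grad == -3.0`, `w.grad == 2.0`, and `b.grad == 1.0`.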

tayo422y ago

He has like 4 or 5 videos that can be watched before that one where all of that is covered. He goes over stuff like writing back prop from scratch and implementing layers without torch.

turnsout2y ago· 4 in thread

This looks amazing @rasbt! Out of curiosity, is your primary goal to cultivate understanding and demystify, or to encourage people to build their own small models tailored to their needs?

rasbtOP2y ago

I'd say my primary motivation is an educational goal, i.e., helping people understand how LLMs work by building one. LLMs are an important topic, and there are lots of hand-wavy videos and articles out there -- I think if one codes an LLM from the ground up, it will clarify lots of concepts.

Now, the secondary goal is, of course, also to help people with building their own LLMs if they need to. The book will code the whole pipeline, including pretraining and finetuning, but I will also show how to load pretrained weights because I don't think it's feasible to pretrain an LLM from a financial perspective. We are coding everything from scratch in this book using a GPT-2-like LLM (so that we can load the weights for models ranging from 124M that run on a laptop to the 1558M that runs on a small GPU). In practice, you probably want to use a framework like HF transformers or axolotl, but I hope this from-scratch approach will demystify the process so that these frameworks are less of a black box.
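The 124M and 1558M figures can be reproduced with some quick arithmetic. The sketch below is not from the book; it assumes the published GPT-2 hyperparameters (vocabulary of 50257 tokens, context length 1024, 12 layers with width 768 for the smallest model, 48 layers with width 1600 for the largest) and tied input/output embeddings. The function name `gpt2_param_count` is hypothetical.

```python
def gpt2_param_count(n_layer, d_model, vocab_size=50257, n_ctx=1024):
    """Count parameters of a GPT-2-style decoder with tied input/output embeddings."""
    # Token embeddings plus learned positional embeddings.
    embeddings = vocab_size * d_model + n_ctx * d_model
    # Fused query/key/value projection and the attention output projection (weights + biases).
    attention = (d_model * 3 * d_model + 3 * d_model) + (d_model * d_model + d_model)
    # Feed-forward network with a 4x hidden expansion (weights + biases).
    mlp = (d_model * 4 * d_model + 4 * d_model) + (4 * d_model * d_model + d_model)
    # Two layer norms per block, each with a scale and a shift vector.
    layer_norms = 2 * (2 * d_model)
    per_block = attention + mlp + layer_norms
    final_norm = 2 * d_model
    return embeddings + n_layer * per_block + final_norm

small = gpt2_param_count(n_layer=12, d_model=768)
xl = gpt2_param_count(n_layer=48, d_model=1600)
print(f"{small:,}")  # 124,439,808
print(f"{xl:,}")     # 1,557,611,200
```

Most of the growth comes from the per-block term, which scales with the square of the model width, while the embedding tables grow only linearly.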

pr337h4m2y ago

While pretraining a decent-sized LLM from scratch is not financially feasible for the average person, it is very much feasible for the average YC/VC backed startup (ignoring the fact that it's almost always easier to just use something like Mixtral or LLaMa 2 and fine-tune as necessary).

>Introducing MPT-7B, the first entry in our MosaicML Foundation Series. MPT-7B is a transformer trained from scratch on 1T tokens of text and code. It is open source, available for commercial use, and matches the quality of LLaMA-7B. MPT-7B was trained on the MosaicML platform in 9.5 days with zero human intervention at a cost of ~$200k

https://www.databricks.com/blog/mpt-7b

turnsout2y ago

Thanks for such a thoughtful response. I'm building with LLMs, and do feel uncomfortable with my admittedly hand-wavy understanding of the underlying transformer architecture. I've ordered your book and look forward to following along!

teleforce2y ago

Hi Rasbt, thanks for writing the new guide and the upcoming book on LLM, another must buy book from Manning.

Just wondering, are you going to include any specific section or chapter on RAG in your LLM book? I think it would be a very welcome addition for the build-your-own-LLM crowd.

AndrewKemendo2y ago· 3 in thread

Writing a technical book in public is a level of anxiety I can’t imagine, so kudos to the author!

rasbtOP2y ago

It kind of is, but it's also kind of motivating :)

waynesonfire2y ago

It's actually less risky. The author may be able to reap the benefits of writing a book without actually finishing it. Ideally, maybe not much more than Chapter 1.

rasbtOP2y ago

I'd say that I've finished all of my previous books, and I have no intention of doing anything different here. Of course, there's always the chance that I get run over by a bus or equivalent, but in that case, I assume that Manning would find a replacement (as per contract) who finishes the book. I don't think there are any benefits to be reaped from not finishing.

canyon2892y ago· 2 in thread

For an additional resource, I'm writing a guidebook, though it's in various stages of completion.

The fine tuning guide is the best resource so far https://ravinkumar.com/GenAiGuidebook/language_models/finetu...

czechdeveloper2y ago

Such a great source of information. Thank you.

canyon2892y ago

Of course! If there's anything in particular you're interested in or a topic you want me to cover, let me know. This tech is powerful by itself. Hoping to empower people with knowledge of how all this works too :)

photon_collider2y ago· 2 in thread

Bought a copy! Looking forward to reading it. :)

Is there a way for readers to give feedback on the book as you write it?

rasbtOP2y ago

Thanks for the support! There's the official Manning Forum for the book, but you are also welcome to use the Discussions page on the GitHub page.

_giorgio_2y ago

The book's forum on Manning

iamcreasy2y ago· 2 in thread

Thank you for this endeavour.

Do you have an ETA for the completion of the book?

rasbtOP2y ago

The ETA for the last chapter is August if things continue to go well. It's usually available in the MEAP a few weeks after that, some time in September. And print version should be available early 2025 I think.

iamcreasy2y ago

I'll definitely buy it once released.

In the meantime, do you know any other free/paid resource that comes close to what you are trying to achieve with this book?

clueless2y ago· 2 in thread

Is the code for chapters 4 through 8 missing?

rasbtOP2y ago

It's in progress still. I have most of the code working, but it's not organized into the chapter structure, yet. I am planning to add a new chapter every ~month (I wish I could do this faster, but I also have some other commitments). Chapter 4 will be either uploaded by the end of this weekend or by the end of next weekend.

_giorgio_2y ago

Depending on your level, it could take many weeks to go through the already available material (code and PDF), so I'd suggest purchasing it anyway... It makes no sense to wait until the end if you're interested in the subject.

Buttons8402y ago· 1 in thread

Question for the author:

I'm not interested in language models specifically, but there are techniques involved with language models I would like to understand better and use elsewhere. For example, I know "attention" is used in a variety of models, and I know transformers are used in more than just language models. Will this book help me understand attention and transformers well enough that I can use them outside of language models?

rasbtOP2y ago

The attention mechanism we implement in this book* is specific to LLMs in terms of the text inputs, but it's fundamentally the same attention mechanism that is used in vision transformers. The only difference is that in LLMs, you turn text into tokens, and convert these tokens into vector embeddings that go into an LLM. In vision transformers, instead of text tokens, you use an image patch as a token and turn those patches into vector embeddings (a bit hard to explain without visuals here). In both the text and the vision context, it's the same attention mechanism, and in both cases it receives vector embeddings.

(*Chapter 3, already submitted last week and should be online in the MEAP soon, in the meantime the code along with the notes is also available here: https://github.com/rasbt/LLMs-from-scratch/blob/main/ch03/01...)
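The point that attention only ever sees vector embeddings can be illustrated with a small sketch. This is not the Chapter 3 code: `self_attention`, `embed_tokens`, and `embed_patches` are hypothetical names, the attention here is unmasked with identity query/key/value projections, and the patch embedding simply flattens pixels, whereas an actual vision transformer would also apply a learned projection.

```python
import math

def self_attention(embeddings):
    """Unmasked single-head self-attention with identity Q/K/V projections (a simplification)."""
    d = len(embeddings[0])
    outputs = []
    for query in embeddings:
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in embeddings]
        m = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Each output is a weighted average of all input vectors.
        outputs.append([sum(w * v[i] for w, v in zip(weights, embeddings)) for i in range(d)])
    return outputs

def embed_tokens(token_ids, table):
    """Text path: look up one embedding vector per token id."""
    return [table[t] for t in token_ids]

def embed_patches(image, patch_size):
    """Vision path: flatten non-overlapping square patches of a 2-D image into vectors."""
    patches = []
    for r in range(0, len(image), patch_size):
        for c in range(0, len(image[0]), patch_size):
            patches.append(
                [image[r + i][c + j] for i in range(patch_size) for j in range(patch_size)]
            )
    return patches

# The same attention function consumes both kinds of input.
table = {0: [0.1, 0.2, 0.3, 0.4], 1: [0.4, 0.3, 0.2, 0.1]}
text_out = self_attention(embed_tokens([0, 1, 0], table))
image = [[0.0, 0.1, 0.2, 0.3], [0.4, 0.5, 0.6, 0.7], [0.8, 0.9, 1.0, 0.0], [0.1, 0.2, 0.3, 0.4]]
image_out = self_attention(embed_patches(image, 2))
```

Here `text_out` has one 4-dimensional vector per token and `image_out` has one per 2x2 patch; the attention function itself never needs to know which modality the vectors came from.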

towelpluswater2y ago· 1 in thread

Bought a copy! Your posts and newsletter content have been such a huge inspiration for me throughout 2023 - good luck, this is a huge effort!

rasbtOP2y ago

thanks for the kind words!

two_in_one2y ago· 1 in thread

As it's still a work in progress, may I make a suggestion? It would be nice if you went beyond what others have already published and added more details, like different position encodings, MoE, decoding methods, and tokenization. As it's educational, ease of use should be a priority, of course.

rasbtOP2y ago

Thanks, comparing positional encodings, MoEs, kv-caches etc are all good topics that I have in mind for either supplementary material and/or a follow-up book. The reason why it probably won't land in this current book is the length and time line. It's already going to be a big book as it is (400-500 pages). And I also want to be a bit mindful of the planned release date. However, these are indeed good suggestions.

kif2y ago· 1 in thread

Looks like just the kind of book I'd want to read. I bought a copy :)

rasbtOP2y ago

Glad to hear and thanks for the support. Chapter 3 should be in the MEAP soonish (submitted the draft last week). Will also upload my code for chapter 4 to GitHub soonish, in the next couple of days, just have to type up the notes.

theogravity2y ago· 1 in thread

Purchased the book. Really excited to read it!

rasbtOP2y ago

Thanks! And please don't hesitate to reach out via the Forum or the GitHub Discussions if you have any feedback or questions.

bosky1012y ago· 1 in thread

How was the process of pitching to Manning?

rasbtOP2y ago

That was pretty smooth. They reached out to ask whether I was interested in writing a book for them (probably because of my other writings online), I mentioned what kind of book I wanted to write, submitted a proposal, and they liked that idea :)

Karupan2y ago· 1 in thread

Bought a copy. Good luck rasbt!

rasbtOP2y ago

Thanks :)

intalentive2y ago

The model architecture itself is really not too complex, especially with torch. The whole process is pretty straightforward. Nice feasible project.

SushiHippie2y ago

fyi probably qualifies as a "Show HN:"

ijustwanttovote2y ago

Wow, great info. Thanks for sharing.

corethree2y ago

Nowadays anyone can probably put together a good book about this topic by using an LLM.
