0 points
DesaiAshu
2mo ago
Data bandwidth limits distributed training under current architectures. Really interesting implications if we can make progress on that.
3 comments · 2 top-level
andoando
2mo ago
What bandwidth limits? I'm assuming the forward and backward passes have to be done sequentially?
DesaiAshu
OP
2mo ago
Yes, and also passing data within each layer.
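To put a rough number on the bandwidth point: one common place the cost shows up is gradient synchronization in data-parallel training, where every worker has to exchange a full copy of the gradients each step. Below is a back-of-envelope sketch, not anything from the linked work. It assumes a ring all-reduce and ignores latency, compression, and any overlap with compute. The function name, model size, worker count, and link speeds are all hypothetical values picked for illustration.

```python
def ring_allreduce_seconds(num_params, bytes_per_param, num_workers, link_bits_per_second):
    """Lower-bound time for one ring all-reduce of the gradients.

    Assumption: bandwidth-bound only (no latency, no overlap with compute).
    """
    payload_bytes = num_params * bytes_per_param
    # In a ring all-reduce each worker sends 2 * (N - 1) / N times the payload.
    bytes_on_wire = 2 * (num_workers - 1) / num_workers * payload_bytes
    return bytes_on_wire * 8 / link_bits_per_second


if __name__ == "__main__":
    # Hypothetical setup: 1B parameters, 2-byte gradients, 8 workers.
    params, width, workers = 1_000_000_000, 2, 8
    for label, bps in [("100 Gbit/s link", 100e9), ("100 Mbit/s link", 100e6)]:
        seconds = ring_allreduce_seconds(params, width, workers, bps)
        print(f"{label}: {seconds:.2f} s per gradient sync")
```

Under these assumed numbers the sync takes about 0.28 s on the fast link and about 280 s on the slow one, which is the kind of gap that makes syncing every step impractical over ordinary internet connections.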
dogcomplex
2mo ago
Limits, but doesn't prohibit. See https://www.primeintellect.ai/blog/intellect-3 - still useful and can scale enormously. Takes a particular shape and relies heavily on RL, but still big.