[1]: https://boards.4channel.org/g/thread/91848262#p91850335
[2]: https://boards.4channel.org/g/thread/91848262#p91849717
[3]: https://boards.4channel.org/g/thread/91848262#p91849855
[4]: https://boards.4channel.org/g/thread/91848262#p91850503
https://www.reddit.com/r/replika/
Hundreds of men (and yes women) full on acting like they lost a spouse and posting constantly about it for weeks. AI is going to create some unusual social situations the general public isn't ready to grasp. And we're only in the early alpha stages.
An eager-to-please conversational partner who can generate endless content seems quite dangerous and addictive, especially when it crosses over into romantic areas. There are already posts from people spending entire days interacting with LLMs, using them as their therapist, romantic partner, etc.
Combined with findings like social engineering through prompt injection on Bing [1], the potential for systems that can manipulate people is clear.
While some of us may think that the LLMs appear ultimately limited in their capabilities, there's a ton of specific applications where they're more than sufficient, including customer service chat bots and telephone scams that target vulnerable people. It's only a matter of time until scammers stop using international call centers and switch over to something powered by these technologies.
Anecdotally, as a roleplaying chat experience, char.ai seems to perform way better than anything else publicly available (doesn't get repetitive, very long memory). It also feels different from GPT3 in how it is affected by prompts.
I've just assumed that char.ai is doing its own thing, as it was founded by two engineers who worked on Google's LaMDA.
Look at what fueled SD's ultimate K.O. of DALL-E 2: extremely high-quality custom-tailored porn images, one sentence away. The top models on civitai are all about it.
It's all somehow par for the course but I'm still wondering when exactly we switched to the satire version of reality.
I ended up deleting my account. I won't allow some chatbot made by a couple of 20-year-old Silicon Valley billionaires to teach me about ethics and morality.
I think that's nonsense, and 4chan is bent towards pessimism but it's still surprising to me.
The way I think of it is, all current programming languages are now assembly languages. Coding will not go away -- not by any means -- but the job will be utterly unrecognizable in ten to fifteen years.
And it's about fucking time.
I just picked up a new 13900k / RTX4090 box the other day at the local white-box builder. I was telling my partner how cool it was that it could do almost a trillion calculations per second on the CPU, and maybe 40x that on the graphics card. "How does that compare to the big mainframes from the late 60s?" she asked. "About ten million times faster. But I still program the same way those guys did, using almost the same language and tools. How weird is that?"
I'd love to understand the sociology behind the change in vibe that happened there.
https://wiki.installgentoo.com/wiki//g/#:~:text=%2Fg%2F%20is....
Not sure how that would play out for accelerationism and existential risk, but I certainly don't trust the current powers that be.
Modern AI is pretty harmless though, so it doesn't matter yet.
See also: https://twitter.com/Teknium1/status/1631322496388722689
But it does give them cover for whatever people end up doing with it - they can claim they did all they could to support research while promoting safety.
This has not been the case for most commercial software for the past 20 years, during the cloud era. If you could steal a dump of random Facebook source code, it would be 99% useless because it’s so closely tied to the infrastructure. There’s almost nothing you could usefully run on your own PC or server VM.
But these ML models are like neutron stars of computation density. You can’t really peek inside to see what’s going on either. An unknown stolen model’s properties would need to be discovered by experimentation.
So, like movies or software
I was trying to come to grips with how many resources are concentrated in one of these models. Somehow I came to the conclusion that training one costs more than buying a jet airliner, and is on about the same order as commissioning and building a skyscraper in Manhattan. Is that approximately correct?
That's $30 million if you want to train at that scale. Also IIRC it took 23 days to train the biggest model. Someone else can do the power consumption cost calculations.
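Taking up the power consumption question, here is a rough back-of-the-envelope sketch in Python. Only the 23 days comes from the comment above. The GPU count (2048), the per-GPU draw (400 W), and the electricity price ($0.10/kWh) are assumptions for illustration, and the result covers GPU draw only (no cooling, CPUs, or networking).

```python
def training_energy_kwh(num_gpus: int, days: float, watts_per_gpu: float) -> float:
    """Energy drawn by the GPUs alone over the whole run, in kWh."""
    hours = days * 24
    return num_gpus * watts_per_gpu / 1000 * hours


def electricity_cost_usd(energy_kwh: float, usd_per_kwh: float) -> float:
    """Cost of that energy at a flat price per kWh."""
    return energy_kwh * usd_per_kwh


# Assumed inputs: 2048 GPUs at 400 W each, $0.10 per kWh.
# The 23 days is the figure quoted in the thread.
energy = training_energy_kwh(num_gpus=2048, days=23, watts_per_gpu=400)
cost = electricity_cost_usd(energy, usd_per_kwh=0.10)
print(f"{energy:,.0f} kWh, about ${cost:,.0f} in electricity")
```

Under these assumptions the electricity bill is a small fraction of a multi-million-dollar budget, so most of the cost would be hardware or rental time rather than power.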
The crazy thing about these models is that the compute power going into them is at least somewhat reversible.
The bar to competition is far lower, as already evidenced by the plethora of AI products being put forward. It's a race to the bottom on pricing.
It's not automatic and would require some ML engineering, but nothing is stopping you if you have the PyTorch graph and weights.
/s
Calculating things takes time, and that time is unrelated to output size. There are NP problems that simply output true or false yet require more computational power than the universe can support.
I really like that expression.
There's an ass ton of hardware that might otherwise be idle.
Are models like this copyrightable? It seems like this falls under the realm of "fact", which can't be copyrighted.
From Wikipedia:
> The ruling of the court was written by Justice Sandra Day O'Connor. It examined the purpose of copyright and explained the standard of copyrightability as based on originality.
> The case centered on two well-established principles in United States copyright law: that facts are not copyrightable, and that compilations of facts can be.
> "There is an undeniable tension between these two propositions", O'Connor wrote in her opinion. "Many compilations consist of nothing but raw data—i.e. wholly factual information not accompanied by any original expression. On what basis may one claim a copyright upon such work? Common sense tells us that 100 uncopyrightable facts do not magically change their status when gathered together in one place. … The key to resolving the tension lies in understanding why facts are not copyrightable: The ″Sine qua non of copyright is originality."
> ...
> The standard for creativity is extremely low. It need not be novel; it need only possess a "spark" or "minimal degree" of creativity to be protected by copyright.
> In regard to collections of facts, O'Connor wrote that copyright can apply only to the creative aspects of collection: the creative choice of what data to include or exclude, the order and style in which the information is presented, etc.—not to the information itself. If Feist were to take the directory and rearrange it, it would destroy the copyright owned in the data. "Notwithstanding a valid copyright, a subsequent compiler remains free to use the facts contained in another's publication to aid in preparing a competing work, so long as the competing work does not feature the same selection and arrangement", she wrote.
> The court held that Rural's directory was nothing more than an alphabetic list of all subscribers to its service, which it was required to compile under law, and that no creative expression was involved. That Rural spent considerable time and money collecting the data was irrelevant to copyright law, and Rural's copyright claim was dismissed.
---
And so, my (I am not a lawyer) take on this is that the numbers of the model are not copyrightable. The selection of the source material is... kind of. This gets into "a recipe is not copyrightable, yet a recipe book is" territory.
The model may, however, be a trade secret. ( https://en.wikipedia.org/wiki/Trade_secret )
Sure, the binary probably depends on a lot of internal proprietary infrastructure, but also most of that infrastructure is easy to write a mock implementation of, as long as you are happy for it to be in-ram, not multi-homed and don't need it to scale to billions of users.
Most of the binaries have a standalone mode for running on a developers PC with few/no dependencies anyway.
And anything more complex than that would probably have dependencies on so many fat client libs, so much infrastructure, and so many external services, that you'll need months-years to even make sense of them, let alone mock them up.
(Disclaimer: I work at Meta, but have no relationship with the team that owns the models and have no internal information on this)
Given that the cat is out of the bag, if I were them, I would say that it is now publicly downloadable under the terms listed in the form. It is great PR, which if this was unintentional, is a positive outcome out of a bad situation.
But you just said it was not Meta. Is that based on internal information? ;-)
Did you mean to write "someone (possibly Meta)" ?
I mean at least they didn’t pick a name that heavily implied they were, are, and always will be open. Then do the opposite. You know, like OpenAI?
So now we got some weights I guess.
Facebook almost certainly knew leaks would happen. My guess is keeping the model "contained" was a legal shield more than anything else, to protect themselves from liability in the case someone misuses the model.
For Google's and OpenAI's offerings, have fun reimplementing them from descriptions in the paper (including small crucial details that they may have left out), training for a month, and then wondering whether the implementation or the training data is the reason your model isn't as good as theirs.
Or someone pretending to be a renegade academic. It's not like there is a KYC process.
I don't think it's misleading. Even saying that Facebook "hoards" the weights when they are more open than any other major AI company is baffling.
There is no Aaron Swartz thing here; a huge number of people will have them, and someone was going to leak them for sure.
However, code changes are necessary to achieve that, although they won't be crazy complex.
A supposedly better model by some accounts that strikes right at the heart of their business plan of selling access for $250k/year. One month of access to their service could buy a machine capable of running this leaked model.
Facebook nerfs a potential upstart competitor to keep current big-tech cartel stable.
Maybe this is a bit conspiratorial, but we live in the age of big tech and big conspiracy.
The problem with leaks is that they don't come with a license, so you don't have the right to use them for any legitimate purpose. No one who could afford a 250k/year license would touch that leak, as it could get them in big trouble.
6x24GB of VRAM on 6 GPUs linked with NVSwitch is a little pricey, but totally doable.
[1]: https://btdig.com/b8287ebfa04f879b048d4d4404108cf3e8014352/l...
one person broke their agreement with Meta, they're the only person that has a problem and the only person who gets to find out if the agreement was applicable at all.
if you released a chat bot that could be prompted to regurgitate some copyrighted information, so what? it just proves that you didn't need the $30 million in funding yet to train your own because you are using an existing model. So either use the funding for that or don't sell shares or a product based on that pretext. Nobody else has a problem.
Anything I missed? Now I wouldn't reshare the model, but aside from use and commercial use of its output? Not everyone gets their way, that's not controversial.
I believe the AI models would also be copyrightable as such, subject to arguments that the underlying data was protected and thus it was subject to prior copyrights instead
- Are there any inference optimizations we can use, similar to those for StableDiffusion, to bring down the VRAM requirements?
I only know about these:
- use 8bit precision
- https://github.com/bigscience-workshop/petals
- https://github.com/FMInference/FlexGen
- https://github.com/microsoft/DeepSpeed
Anything that could bring this to a 10GB 3080 or 24GB 3090 without taking 60 s per token?
EDIT: There seems to be a lot of overhead. Here someone struggles to fit the 7B parameter model (13.5GB checkpoint) on a 24GB card: https://github.com/facebookresearch/llama/issues/55
EDIT2: actually, using the tip in that link I got it to run on a P5000 with 16GB of VRAM! It just barely fits, so I had to log out of GNOME and run it terminal-only.
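To put the 13.5GB checkpoint and the 16GB card in perspective, here is a minimal sketch of the weights-only footprint at different precisions. The parameter counts are approximate assumptions (roughly 6.7B, 13.0B and 65.2B for the 7B, 13B and 65B models), and the estimate ignores activations, the key/value cache, and framework overhead, which is where the rest of the memory goes.

```python
def weight_gib(n_params: float, bytes_per_param: int) -> float:
    """Memory needed just to hold the weights, in GiB."""
    return n_params * bytes_per_param / 2**30


# Approximate parameter counts; treat these as assumptions.
MODELS = {"7B": 6.7e9, "13B": 13.0e9, "65B": 65.2e9}

for name, n_params in MODELS.items():
    fp16 = weight_gib(n_params, 2)  # 2 bytes per weight
    int8 = weight_gib(n_params, 1)  # 1 byte per weight (8-bit precision)
    print(f"{name}: fp16 ~{fp16:.1f} GiB, int8 ~{int8:.1f} GiB")
```

By this estimate the 7B weights alone take about 12.5 GiB in fp16, which matches the observation that the model only just fits on a 16GB card once overhead is added, and 8-bit precision would halve the weight footprint.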
nvidia-smi reports this model is using 15475MiB after changing the max batch size from 32 to 8 (see link in above post)
As others have stated, someone may have injected unknown code into the pickled checkpoint, so I recommend running this in Docker. I use this command to run the Docker image after getting the NVIDIA Docker stuff configured.
docker run --runtime=nvidia -it --mount type=bind,source=/MY_LLAMA_SOURCE_PATH,target=/llama --gpus all nvidia/cuda:11.6.2-base-ubuntu20.04
Then install the necessary dependencies in that container (obviously you could make a Dockerfile), put your model and tokenizer files (from the root dir of the download) into some directory (here models/LLaMA-7B), and run this:
torchrun example.py --ckpt_dir ./models/LLaMA-7B --tokenizer_path ./models/LLaMA-7B/tokenizer.model
enjoy! the 7B parameter model is not amazing from my initial testing. I am very curious if larger models can be run on 1 GPU by sequentially loading checkpoints. I don't know how all this stuff is organized.
Example output below.
Prompt:
Please respond to this question: As a large language model, what are three things that you find most important?
Output:
To get students to analyze what they are doing in their learning
To get students to analyze what they are doing in their learning so that they can find the best practices
To get students to analyze what they are doing in their learning so that they can find the best practices and then to make changes to those practices so that they can improve their learning
To get students to analyze what they are doing in their learning so that they can find the best practices and then to make changes to those practices so that they can improve their learning
I don't think this question really made much sense because the sentence in the question is incomplete
To get students to analyze what they are doing in their learning so that they can find the best practices and then to make changes to those practices so that they can improve their learning
I don't think this question really made much sense because the sentence in the question is incomplete.
I think this question is in fact a good question, even though the initial sentence is incomplete, and I do think it makes sense.
I think this question is in fact a good question, even though the initial sentence is incomplete, and I do think it makes sense. (I think it is a good question but I am not sure it makes sense).
That would be using the CPU, as the M1 GPU is not yet supported.
Not that I could run it if I downloaded it.
But ML models generally can execute arbitrary code
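For pickled checkpoints specifically, the risk is that unpickling can import and call arbitrary functions. A minimal sketch of how one might inspect a pickle stream without loading it, using only the standard library: it lists the module/name pairs the pickle would import. The helper name `pickle_globals` is made up for this example, the STACK_GLOBAL handling is a heuristic (it assumes the two names were pushed as plain strings just before), and a real `.pth` file may wrap the pickle in a zip archive that would need unpacking first.

```python
import pickle
import pickletools

# Opcodes that push a plain string onto the pickle stack.
STRING_OPS = {"SHORT_BINUNICODE", "BINUNICODE", "BINUNICODE8", "UNICODE"}


def pickle_globals(data: bytes) -> set:
    """Return (module, name) pairs a pickle would import, without loading it."""
    found = set()
    recent_strings = []
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name == "GLOBAL":
            # Older protocols carry "module name" as one space-separated argument.
            module, name = arg.split(" ", 1)
            found.add((module, name))
        elif opcode.name == "STACK_GLOBAL" and len(recent_strings) >= 2:
            # Newer protocols push module and name as two strings first.
            found.add((recent_strings[-2], recent_strings[-1]))
        elif opcode.name in STRING_OPS:
            recent_strings.append(arg)
    return found


class Demo:
    """Harmless stand-in for a malicious object: loading it would call print()."""

    def __reduce__(self):
        return (print, ("this would run on load",))


payload = pickle.dumps(Demo())  # dumping does not execute anything
print(pickle_globals(payload))
```

Anything in that set beyond the tensor-rebuilding helpers you expect is a reason not to load the file outside a sandbox.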
Is it the case if we're only talking about weights? I thought the rest is actually "open".
Good times.
https://aibusiness.com/meta/meta-s-llama-language-model-outp...
BLOOM does indeed go up to 176B parameters, and is certainly better than OPT. However, at least in my specific tests, it's still significantly inferior to OpenAI models, and actually on par with a few smaller models. There's also a "newer" fine-tuned model, called BLOOMZ, but at least in my tests it's even worse. Of course, that depends a lot on what you ask the model to do...
If LLaMA can indeed match OpenAI products, and do so with far fewer parameters, then it would be really great, and I'd really like to test it. However, even if the weights are now in the wild, using them would clearly be against the user agreement, and there's no way I'm going to do that on my work time :-) so let's hope Meta will come to its senses and release them with a friendlier set of terms...
This should lead to quite a lot of innovation and it’s inevitable that someone will get these working slowly on your average MacBook.
EDIT: correcting the type of GPU
Or am I understanding it all wrong
Edit: maybe in this case it's a leak though
Watermarking the output is also possible, but more complex and with a statistical success rate Vs performance tradeoff.
Does anyone have any idea what hardware is needed to run this?
Aren't you a cheeky lad? Meta turned out lots of open-source database systems:
* RocksDB
* Hive
* Presto
* Cassandra
* Velox
LFP
The worst offender is AMZ; the rest of big tech is pretty open-source friendly.
Getting anything that could produce, look like, or smell anything like misinformation out of meta is very hard (for good reason!)
My friends have had repeated pushback on various papers because they are ML-based and could be in the same room as something that could possibly be used by miscreants.
And here we have an LLM that can spit out all sorts of things that are misinformation-like.
If their department tried to launch something like Galactica they would have been slapped down and told to think again about what they were doing in life.
No, I'm not trolling. The jargon and the ideas around LLMs are completely foreign to me. I have no idea how they work.
LLAMA
│   tokenizer.model
│   tokenizer_checklist.chk
│
├───13B
│       checklist.chk
│       consolidated.00.pth
│       consolidated.01.pth
│       params.json
│
└───7B
        checklist.chk
        consolidated.00.pth
        params.json

(As they didn’t reply to my request for the model - I specified it was for personal use and my use case was “I think it would be fun to run it on my own hardware” - I appreciate this little stunt a great deal!)
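The checklist.chk files in the listing above look like checksum manifests. Assuming they use the md5sum format (one `<hex digest>  <filename>` pair per line, which is an assumption and not confirmed anywhere in this thread), a download could be checked for corruption with a sketch like this. The function names are made up for the example.

```python
import hashlib
from pathlib import Path


def md5_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so multi-gigabyte checkpoints never sit in RAM."""
    digest = hashlib.md5()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_manifest(manifest: Path) -> dict:
    """Map each file named in an md5sum-style manifest to True if its hash matches."""
    results = {}
    for line in manifest.read_text().splitlines():
        if not line.strip():
            continue
        expected, name = line.split(maxsplit=1)
        name = name.lstrip("*")  # md5sum prefixes binary-mode names with '*'
        results[name] = md5_of(manifest.parent / name) == expected.lower()
    return results
```

Calling `verify_manifest(Path("LLAMA/7B/checklist.chk"))` would then report which files match. A matching checksum only shows the file is what the manifest's author hashed; it says nothing about whether the contents are safe to load.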
That's quite big!
Honestly though, even if you just finetune it, which you will want anyway for any serious commercial application, it's essentially impossible to determine the origin.
See the original license: "a. Subject to your compliance with the Documentation and Sections 2, 3, and 5, Meta grants you a non-exclusive, worldwide, non-transferable, non-sublicensable, revocable, royalty free and limited license under Meta’s copyright interests to reproduce, distribute, and create derivative works of the Software solely for your non-commercial research purposes. The foregoing license is personal to you, and you may not assign or sublicense this License or any other rights or obligations under this License without Meta’s prior written consent; any such assignment or sublicense will be void and will automatically and immediately terminate this License."
Anyhow, I do remember a post from someone stating this would never happen, but access is just a web form and a request to describe what type of research you do.
Of course it will be leaked
TeamMysticAvengers-meta-llm-x-cars-movie-model-x-angelina-jolie-naked-xxx-2023.zip.exe.torrent
[I believe the meaning of life is] to be happy, and it is also to live in the moment. I think that is the most important thing. I'm not really a party girl. I'm not a girl's girl. I have a really small group of close girlfriends and that's all I need. I believe in equal rights for everyone. I'm not a rebel. I don't really rebel against anything. I'm a very traditional girl, very loyal. I'm a mum's girl and I'm a dad's girl. People have a right to know what's going on. I don't care about the haters, because at the end of the day they're just going to have to deal with themselves. I've been getting more and more into fashion since I was about 16. I know I'm a little different, but so what? I think that's good. I don't think you should be like everyone else. It's my birthday, and I'll cry if I want to. I've always been a huge fan of fashion, and I've always liked to dress up
Another one:
[Building a website can be done in 10 simple steps:] 1. Defining Goals 2. Your Branding and Web Presence 3. Defining Your Marketing Strategy 4. Creating Your Website 5. Your Website Design 6. Your Website Development 7. Your Website Launch 8. Your Website’s Content 9. Your Website’s Conversion Rate 10. Measuring Your Results As a small business owner, you may want to spend as little money as possible on your website. But if you want to see a positive ROI, you will need to spend some money. Defining goals is critical when building a website. You should know what you want to accomplish with your website. You need to know what you want your website to achieve. You need to know who you want to convert to a customer. You need to know how you want to reach your goals. You need to know what the timeframe is for your website goals. You need to know what you want to get out of your website. When building a website, you need to clearly define your goals. Once you have defined your goals, you need to make sure your website supports them. If you want to reach your goals, you
> looks like some people have been complaining about the link. it will need more seeders before we can merge into main
from someone claiming to be
> Research Scientist at Facebook AI Research. Working on [...]
and who has previously merged pull requests for a repo under https://github.com/facebookresearch
(I'm going to leave their name out of this... because it feels like that comment might come back to bite them)
Knew the net would probably squash print and privacy the first minute I logged into AOL.
Who knew it would breed a generation of robot loving losers?