I would not suggest that anyone rely on ChatGPT outputs for actual knowledge at this point.
> "Discovering Latent Knowledge in Language Models Without Supervision" Existing techniques for training language models can be misaligned with the truth: if we train models with imitation learning, they may reproduce errors that humans make; if we train them to generate text that humans rate highly, they may output errors that human evaluators can't detect. We propose circumventing this issue by directly finding latent knowledge inside the internal activations of a language model in a purely unsupervised way.
https://arxiv.org/abs/2212.03827
In other words, the model already tries to predict the truth because doing so is useful for next-token prediction, but we need to find a way to detect that 'truth alignment' in its activations.
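As I understand the paper, the probe is trained on "contrast pairs" (the same statement phrased as true and as false) with a loss that rewards answers that are consistent and confident. Here is a minimal sketch of that objective in plain Python. The names `probe` and `ccs_loss` are mine, not the authors', and a real implementation would normalize the activations and optimize the weights with a proper optimizer.

```python
import math


def probe(activation, weights, bias):
    """Linear probe plus sigmoid: maps a hidden-state vector to a probability."""
    z = sum(w * a for w, a in zip(weights, activation)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def ccs_loss(p_pos, p_neg):
    """Average consistency + confidence loss over contrast pairs.

    p_pos[i]: probe output for statement i phrased as true.
    p_neg[i]: probe output for the same statement phrased as false.
    """
    total = 0.0
    for pp, pn in zip(p_pos, p_neg):
        consistency = (pp - (1.0 - pn)) ** 2  # the two probabilities should sum to 1
        confidence = min(pp, pn) ** 2         # penalize the degenerate 0.5 / 0.5 answer
        total += consistency + confidence
    return total / len(p_pos)
```

No labels appear anywhere in the loss, which is the sense in which the method is unsupervised.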
aws cloudfront update-distribution --id <distribution-id> --distribution-config <new-config> --no-reset-origin-access-identity
It took me a while to figure out that the --no-reset-origin-access-identity parameter was not merely broken: it has never existed in any version of the CLI tool.
Meaning, if all the facts exist in the prompt then the likelihood of synthesizing fiction is diminished.
There are a number of ways to use the principle of analytic augmentation to add most, if not all, of the facts required for a truthful response, ranging from simple “prompt engineering” to evaluating code to document embedding in latent space.
For example, if you use prompt engineering to k-shot the task of turning math word problems into executable JavaScript, so that the LLM acts only as a translator and the computation is done by a software interpreter, then the results are much more likely to be truthful.
Sampling from a number of variations on a prompt can lead to a more accurate outcome if, say, 1 in 10 translation attempts produces a different answer.
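To make the translate-then-execute idea concrete, here is a rough sketch. I'm using Python rather than JavaScript, and `translate` is a hypothetical stand-in for the LLM call (it returns a canned expression for one made-up word problem). The point is only that the arithmetic is done by `evaluate`, an ordinary interpreter step, not by the model.

```python
import ast
import operator

# Only these arithmetic operators are allowed in a translated expression.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.USub: operator.neg,
}


def evaluate(expression):
    """Evaluate a plain arithmetic expression without calling eval()."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if (isinstance(node, ast.Constant)
                and isinstance(node.value, (int, float))
                and not isinstance(node.value, bool)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.operand))
        raise ValueError("unsupported expression")

    return walk(ast.parse(expression, mode="eval"))


def translate(word_problem):
    """Hypothetical stand-in for the few-shot LLM translation step."""
    canned = {
        "Ann has 3 boxes of 12 pencils and gives away 5. How many are left?":
            "3 * 12 - 5",
    }
    return canned[word_problem]


problem = "Ann has 3 boxes of 12 pencils and gives away 5. How many are left?"
print(evaluate(translate(problem)))
```

Anything that is not simple arithmetic (function calls, names, attribute access) is rejected with a `ValueError`, so a bad translation fails loudly instead of producing a plausible-looking number.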
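The sampling step can be as simple as a majority vote over the answers computed from each prompt variation. A small sketch, assuming the answers have already been collected into a list (`majority_answer` is my own name for the helper):

```python
from collections import Counter


def majority_answer(answers):
    """Return the most common answer and the fraction of samples that agree with it."""
    counts = Counter(answers)
    answer, votes = counts.most_common(1)[0]
    return answer, votes / len(answers)


# Nine of ten translation attempts agree; the outlier is outvoted.
samples = [31, 31, 31, 36, 31, 31, 31, 31, 31, 31]
print(majority_answer(samples))
```

A low agreement fraction is itself a useful signal that the translation step is unreliable for that prompt.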
Edit: [*] 'we' in my comment here is indicating the HN community, not the entirety of humanity.
That seems useful, prudent, and completely in line with the spirit of a community like HN.
There’s no more reason that every critique should come with a “proposal” than that every cheer should come with some kind of admonition. As a community, multiple points of view are expressed and developed simultaneously.
Of course, some points of view might personally frustrate you or leave you feeling like you don’t know how to respond to them. But is that so bad? Does it need to be squelched just because you don’t enjoy it?
This says more about you than the hypothetical "people" you are talking about.
Then again, perhaps LLMs could simply be incorporated into the peer-review process, where after submitting your paper, you'd have to answer the AI's basic questions. As a reviewer, I could imagine a structured AI report for a paper being helpful in guiding discussion: "The paper compares to recent approaches X, Y, and Z. And the work is indeed novel."
I'm a big believer in using all these new AI services/programs as tools to enhance my workflows, not to replace them.
Or, to ask the same question a different way: what sort of workflow would be enhanced by an inaccurate summary?
Disclosure: I work at MSFT but not on Bing
This is a conflict of incentives. ArxivGPT, by contrast, has no reason not to report the problems first.
Are you using your own API key and paying for the usage? How can you justify operating programs that incur high costs but produce no income? Isn't the API key exposed on the client side and potentially subject to theft and abuse?
With this new class of products based on crafting prompts that best exploit a GPT's algorithm and training data, are we going to start seeing pull requests that tweak individual parts or words of the prompt? I'm also curious how the test suite for projects like this would look for specific facts or phrases to be contained in the responses for specific inputs.
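One way I could imagine such a test looking: record the model's response for a fixed input and assert that required phrases show up. This is only a sketch. `missing_phrases` is a hypothetical helper, and the response string is made up rather than real model output.

```python
def missing_phrases(response, required_phrases):
    """Return the required phrases that do not appear in the response (case-insensitive)."""
    lowered = response.casefold()
    return [p for p in required_phrases if p.casefold() not in lowered]


# Made-up response standing in for a recorded model output.
recorded = "Key insight: the probe is unsupervised. Limitation: tested on few models."
print(missing_phrases(recorded, ["key insight", "limitation", "future work"]))
```

A test would then assert that the returned list is empty. Since model output varies between runs, substring checks like this are about as strict as such a test can reasonably be.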
Edit: btw, congratulations on the release. This is the kind of stuff I think should be explored more using LLMs. Great choice on making a Chrome extension, it's a great UI for this kind of thing.
[1] https://github.com/hunkimForks/chatgpt-arxiv-extension#how-t...
https://huggingface.co/ml6team/keyphrase-extraction-kbir-ins... is a decent tool to explore the constant stream of publications. The last mile still is left to the human.