I think this is false, but if you're reading quickly, it sounds cogent enough. As Sarah Constantin observed about GPT2 [0]:
> if you skim text, you miss obvious absurdities. The point is OpenAI HAS achieved the ability to pass the Turing test against humans on autopilot...The mental motion of “I didn’t really parse that paragraph, but sure, whatever, I’ll take the author’s word for it” is, in my introspective experience, absolutely identical to “I didn’t really parse that paragraph because it was bot-generated and didn’t make any sense so I couldn’t possibly have parsed it”, except that in the first case, I assume that the error lies with me rather than the text. This is not a safe assumption in a post-GPT2 world. Instead of “default to humility” (assume that when you don’t understand a passage, the passage is true and you’re just missing something) the ideal mental action in a world full of bots is “default to null” (if you don’t understand a passage, assume you’re in the same epistemic state as if you’d never read it at all.)
[0] https://www.skynettoday.com/editorials/humans-not-concentrat...
As a programmer, I find R hard to use and not very well designed, so I can see why you'd call that nonsense.
But when I was a math student, I found that in some ways R does have "a more mature package universe". There were many math algorithms that I could find packages for in R and not in Python, even as a mere grad student.
Why? It doesn't sound too arrogant to me.
My experience with ChatGPT thus far is that it is as intelligent as a very broadly read person who just doesn't reeeally get the complex or nuanced aspects of the content it reads, much like many real humans.
"After eighteen years of being a professor, I’ve graded many student essays. And while I usually try to teach a deep structure of concepts, what the median student actually learns seems to mostly be a set of low order correlations. They know what words to use, which words tend to go together, which combinations tend to have positive associations, and so on. But if you ask an exam question where the deep structure answer differs from the answer you’d guess looking at low order correlations, most students usually give the wrong answer."
If he wanted it to replicate that classic HN feel he would either have to extend the prompt with additional examples or, better yet, use finetuning.
I guess he could also just randomly sprinkle in some terms like 'stochastic parrot' and find a way to shoehorn Tesla FSD into every conversation about AI.
First sentence of the first paragraph on OP's page
EDIT: it's a bit misleading, further down they describe what looks like a semantic-search approach
https://sonnet.io/posts/emotive-conjugation/#:~:text=I%27m%2...
My conclusion was that you can use LLMs to automate and scale attribution bias.
We did it guys!
> iii. Put top matching comments into a prompt and ask GPT-3 to answer the question using the context
It depends on the prompt used to ask GPT the question. A prompt that instructs GPT to write like an HN poster should fix that.
The thing people don't realize is that right now there is a very large gap between the capabilities of a few models, including OpenAI's most recent ones, and most other LLMs. So there are several options for training or fine-tuning open models, but none of them have language understanding and generation capabilities at the level of those new OpenAI models.
As far as I know.
It's very difficult to otherwise fine-tune existing LLMs. GPT itself is closed-source and doesn't allow fine-tuning (except via an opaque API and with limited amounts of data). Other open models are either very difficult to load in memory or simply not as expressive as GPT.
Military budgets and manpower would make sock puppets interesting. https://www.theguardian.com/technology/2011/mar/17/us-spy-op...
Does anyone find it curious that Edward Snowden is mute on the military's use of this technology?
How many people would be surprised to learn they are communicating with a program on websites like this?
Question: what are the biggest fallacies of people that post on hackernews?
Answer: One of the biggest fallacies of people that post on Hacker News is the nirvana fallacy, which is when someone jumps from "thing x is imperfect" to "thing x is worthless" without considering any of the benefits. People also tend to be familiar with certain technologies and will discount the flaws of existing technology vs. the flaws of new technology. Another fallacy is the assumption that there is one best tool for doing things - the one that supports the largest amount of use cases. Additionally, people may use appeals to authority, invent their own solutions rather than use existing ones, and assume that the approaches they first learned are most correct. Finally, people may assume open source software is a gift and that dependencies don't need to be tested.
Is that really a problem? It's only natural to get accustomed to old problems caused by old tech (where workarounds probably already exist or limitations understood and accepted) while it makes sense to hold new tech to a higher standard since it has the benefit of learning from past mistakes and should avoid introducing obvious problems for people in the future to deal with. Our future selves will thank us for being more critical of flawed new tech than we are of old tech.
(edited for clarity)
See a demo on the huggingface transformers documentation: https://huggingface.co/spaces/jerpint/buster
The only other project that I've seen that's doing something close to this is this one: https://github.com/getbuff/Buff
It's a bit more similar to the OP's bot (it's a Discord bot).
Cool to see momentum in this space!
HN data is indexed with embeddings for semantic search. When queried, it finds the closest article and top comments, then summarizes them with GPT-3.
GPT-3 serves as a rendering tool for compressed comments.
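A minimal sketch of that retrieve-then-summarize flow. Everything here is illustrative: the comment corpus is made up, and the bag-of-words `embed` is a toy stand-in for a real embedding model (the actual bot reportedly used OpenAI embeddings and Pinecone).

```python
import math
from collections import Counter

# Toy stand-in for a real embedding model: a bag-of-words vector.
# The real pipeline would call an embedding API instead.
def embed(text):
    return Counter(text.lower().replace("?", "").split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical mini-corpus of HN comments, pre-embedded into an index.
comments = [
    "Use pandas for data wrangling in Python",
    "R has a more mature package universe for statistics",
    "Haskell is the correct answer to every language question",
]
index = [(c, embed(c)) for c in comments]

def top_matches(question, k=2):
    """Semantic search: rank indexed comments by similarity to the question."""
    q = embed(question)
    return [c for c, v in sorted(index, key=lambda cv: -cosine(q, cv[1]))[:k]]

# The matched comments become the context GPT-3 "renders" an answer from.
context = top_matches("Should I learn Python or R for statistics?")
prompt = "Answer like an HN commenter, using this context:\n" + "\n".join(context)
# Final step (not shown): send `prompt` to a completion model such as GPT-3.
```

The completion model never sees the whole corpus; it only paraphrases the handful of comments the search step surfaces, which is why it behaves like a "rendering tool".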
> “AskHN” is a GPT-3 bot I trained on a corpus of over 6.5 million Hacker News comments to represent the collective wisdom of the HN community in a single bot.
I'm assuming you used the openai fine-tuning pathway to make a custom model?
Have you tested the responses on vanilla GPT3 vs your custom model?
I'd be curious to see the comparison.
The semantic search approach seems to focus the answers better than fine-tuning, at the cost of preloading the prompt with a lot of tokens, but with the benefit of a more constrained response.
I don't see any of the sublime and succinct snark.
A bot focused on the output of HNers would insist on providing arguments against going through Google's interview process in the first place and suggestions that the correct answer to "Python or R" should be Haskell or Julia and would never suggest prioritising emotional vulnerability or being a happy person!
This whole space is moving so fast that it's hard to keep up for someone whose immediate day job doesn't revolve around it. Congrats.
Banana Sebastian housewares fly swimmingly under terrestrial Zruodroru'th Memphis Steve Jobs archipelagos
It's actually more likely to require a bathtub to increase the volume of the reticulated lorries, so I really don't think a farmer's market is the ideal place.
I for one am oh so proud that my valuable ramblings contributed to this majestic machinery.
Actually, it makes me feel as bad as knowing that CAPTCHA were used to train image recognition models...
I think it could be a good time to reconsider the question of consent. I may agree that my words are used to train some AI... but 1) I must be asked (kindly) first and 2) it won't be free! (It may be paid to me or to the service provider like HN... but it's NOT unpaid work ;-) )
Was planning to see how I can build something like these sites but without the need to regularly update the subtitles catalog: https://subscene.be https://subtitlecat.com https://subtitletrans.com https://my-subs.co
This actually works surprisingly well.
Check out the OpenAI cookbook for examples.
I see that you are asking about "How to get a job at Google". I will help you with "How to get a job at Google". In order to solve the problem of "How to get a job at Google" please follow the following steps first:
- rewrite your resume in Google Docs
- reinstall Chrome
- apply to the job
Let me know if I can help further with "How to get a job at Google".

I like using it, but I have to tune my prompts to make sure that they don't bullshit me before getting to the point.
I do not like that it seems to be effectively an ad.
> Embedding every single one of the 6.5 million eligible comments was prohibitively time-consuming and expensive (12 hours and ~$2,000).
Does anybody understand what he's talking about here? Assuming 6.5 million comments and an average length of 70 tokens, we'd be looking at ~$180 ($0.0004 / 1K tokens).
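Taking the parent's assumptions at face value (6.5 million comments, ~70 tokens each, $0.0004 per 1K tokens), the arithmetic is:

```python
comments = 6_500_000
avg_tokens = 70          # assumed average tokens per comment
price_per_1k = 0.0004    # $ per 1K tokens, as quoted above

total_tokens = comments * avg_tokens      # 455,000,000 tokens
cost = total_tokens / 1000 * price_per_1k
print(f"${cost:,.0f}")   # prints $182
```

That's roughly $180, an order of magnitude below the ~$2,000 the post quotes, so either the average comment is much longer than 70 tokens or something else dominated the cost.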
A version where you can turn knobs of flavored contributors would be pretty funny. I know my comment style is easily identifiable and reproducible, and it encodes a certain type of logical conjugation, albeit biased with some principles and trigger topics, and I think there is enough material on HN that there may be such a thing as a distinct, motohagiographic lens. :)
I am not a lawyer but there has to be a jurisdiction where I can establish standing at least.
If OP is reading: I'm curious about the database you're using to store the embeddings. Pinecone, Weaviate...?
> The embeddings were then indexed with Pinecone.
Is there an easy way to load up GPT with my thoughts to have it be a fake me?
The methodology used here is a generic solution for distilling a non-generic corpus of utterances into a generic platitude machine.
Curious how that would differ; but it would be an expensive endeavour.
I would very much like to see the ghost of Terry pop up from time to time, to offer his wisdom and unique style of response.
How long did it take to scrape them and train the "corpus" on this content?
AI: of course .. here is your bash script (220 lines long)
Otoh, did I miss something or is it only on discord?
Thanks bottie, very use, much helpful.
Can you cut me a distro of myself?
Made me smile