Using GPT3, Supabase and Pinecone to automate a personalized marketing campaign

(vimota.me)

252 points · vimota · 3y ago · 62 comments

41 comments · 17 top-level

pbourke · 3y ago · 8 in thread

> My script read through each of the products we had responses for, called OpenAI's embedding api and loaded it into Pinecone - with a reference to the Supabase response entry.

OpenAI and the Pinecone database are not really needed for this task. A simple SBERT encoding of the product texts, followed by storing the vectors in a dense numpy array or faiss index, would be more than sufficient. Especially if one is operating in batch mode, the locality and simplicity can’t be beat, and you can easily scale to 100k-1M texts in your corpus on commodity hardware/VPS (though an NVMe disk will see a nice performance gain over a regular SSD).
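
A minimal sketch of the "dense array plus brute-force search" idea in the comment above, using only the standard library. The three-dimensional vectors are made-up stand-ins; in practice they would come from an SBERT model, and a numpy array or Faiss index would replace the Python loop. The function names are hypothetical.

```python
import math

def normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)

def top_k(query, corpus, k=2):
    """Return the k (index, score) pairs with the highest cosine similarity."""
    q = normalize(query)
    scored = [
        (i, sum(a * b for a, b in zip(q, normalize(v))))
        for i, v in enumerate(corpus)
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]

# Toy vectors standing in for real sentence embeddings (assumption).
corpus = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
]
results = top_k([0.95, 0.0, 0.05], corpus, k=2)
nearest_ids = [i for i, _ in results]
```

Exhaustive search like this is linear in corpus size, which is usually acceptable for batch jobs at the scale the comment mentions.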

vimota (OP) · 3y ago

Yep that's true! I'd probably do something like that if I were starting again, but the ease of calling a few APIs is pretty nice. I feel like that alone will drive a lot of adoption of some of these platforms even if it can just be done locally.

freediver · 3y ago

I'd argue that using something like SBERT + Faiss is easier and would take less time (you do not have two account creations + one billing setup), plus a working example of SBERT + Faiss is probably less than 10 lines of code in total.

rvnx · 3y ago

Great result!

cardosof · 3y ago

I come to HN for posts like this, thank you.

bilater · 3y ago

most HN comment ever

asah · 3y ago

had the same reaction to the jargon...

...but I said WTH, googled SBERT and followed my nose and got it installed in minutes on my Mac and they kindly included a cut/paste example of semantic search.

jamesblonde · 3y ago

I appreciated (and understood) the comment.

If i have a lot of data, it's cheaper and more efficient to build your own batch inference application and just use two well-known libraries (s-bert and the FAISS indexing library). That didn't occur to me. I come here for insights - and i got one here.

sjcsjc · 3y ago

Yes. Very Dropbox.

swyx · 3y ago · 6 in thread

> And it pretty much worked! Using prompts to find matches is not really ideal, but we want to use GPT's semantic understanding. That's where Embeddings come in.

sounds like you ended up not using GPT3 in the end which is probably wise.

i'm curious if you might see further savings using other cheaper embeddings that are available on huggingface. but it's probably not material at this point.

did you also consider using pgvector instead of pinecone? https://news.ycombinator.com/item?id=34684593 any pain points with pinecone you can recall?

roseway4 · 3y ago

I've seen really good results using BERT and other open-source models for matching symptoms / healthcare service names to CPT/HCPCS code descriptions. Even for specialized domains, some of the freely available models perform well for ANN search. While BERT may not be viewed as state-of-the-art, versions of it have relatively low memory requirements versus newer models which is nice if you're self-hosting and/or care about scalability.

joshgel · 3y ago

I’m curious what problems you’ve applied this to. Would love to chat if you are open. (My email is in my profile)

vimota (OP) · 3y ago

I didn't use the GPT3 autocomplete API in the end (though I did play around with it) but did use the embeddings API (which I believe is still considered part of the "GPT3" model, but I could be wrong!).

I totally could! I think each use case should dictate which model you use; in my case I was not super cost or latency sensitive since it was a small dataset and I cared more about accuracy. But I'm planning on using something like https://huggingface.co/sentence-transformers/all-MiniLM-L6-v... for my next project where latency and cost will matter more :)

I have a lot of thoughts around that last question! The Supabase article came out way after I implemented this (August of last year) so I didn't even think to do that, and I'm not sure if it was even supported back then, but I'd probably reach for that if I was re-doing the project to reduce the number of systems I needed. I think the power of having the vector search done in the same DB as the rest of the data is that sometimes you may want to have structured filtering before the semantic/vector ranking (i.e. only select user N's items and rank by similarity to <query>), which is trickier to do in Pinecone. They support metadata filtering but it feels like an afterthought. For the project I'm working on now (https://pinched.io), we'd like to filter on certain parameters as well as rank by relevance, so I'm going to explore combining structured querying with semantic search (i.e. pgvector or something similar on DuckDB if it adds support for this).
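
A rough sketch of "structured filter first, then rank by similarity", as described above. It is plain Python over an in-memory list rather than a real database, and the row fields (`user_id`, `item`, `embedding`) and the toy two-dimensional vectors are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def filtered_search(rows, user_id, query_vec, k=5):
    """Keep only one user's rows, then rank them by similarity to the query."""
    candidates = [r for r in rows if r["user_id"] == user_id]
    ranked = sorted(
        candidates,
        key=lambda r: cosine(r["embedding"], query_vec),
        reverse=True,
    )
    return [r["item"] for r in ranked[:k]]

# Made-up rows; real embeddings would have hundreds of dimensions.
rows = [
    {"user_id": 1, "item": "shampoo", "embedding": [0.9, 0.1]},
    {"user_id": 1, "item": "toothpaste", "embedding": [0.1, 0.9]},
    {"user_id": 2, "item": "conditioner", "embedding": [1.0, 0.0]},
]
hits = filtered_search(rows, user_id=1, query_vec=[1.0, 0.0], k=2)
```

In pgvector the same shape would roughly be a `WHERE` clause on the structured columns followed by an `ORDER BY` on a distance operator such as `<=>` (cosine distance), all in one SQL query.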

swyx · 3y ago

> https://pinched.io

requested invite! i have a moderately large twitter so could be a good test heheh. i use https://www.flock.network/ for this stuff normally but the UX isn't that great so hoping for better.

itake · 3y ago

My understanding is pinecone is much faster, but for this small search space, I doubt pgvector would be noticeably worse.

I tested an early version of pgvector against faiss and found faiss had much better performance:

https://github.com/pgvector/pgvector/issues/3

devxpy · 3y ago

Creating the index on pinecone takes about a minute. Creating a table on postgres should take a few milliseconds!

jamesblonde · 3y ago · 4 in thread

I have seen a lot of people write about how important the interaction between vector DBs and chat-GPT3 (and GPT3) is. I am still not much wiser after this article. Is it that it makes it easier to go from:

user query -> GPT3 response -> Lookup in VectorDB -> send response based on closest embedding in VectorDB

curo · 3y ago

Query (→ GPT3 completion) → vector db lookups → GPT3 synthesis

The optional step two is used when the lookups are more closely related to an answer's latent space than the original query text. This approach is called HyDE (first published here: https://arxiv.org/abs/2212.10496).

The synthesis is also optional. You can essentially summarize your lookups or refine them or do whatever you want at this stage.

If you skip steps 2 and 4, it's just a semantic search engine. If you skip only step 2, you're either doing it for latency/performance reasons, or because the user query's embeddings are more similar to the docs in the vector db.
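
The four-step flow above can be sketched with pluggable callables. Everything here is a toy assumption: `toy_embed` and `toy_search` use word overlap instead of real embeddings, and the lambda passed as `complete` stands in for a GPT-3 completion call. Only the control flow (optional hypothetical answer, lookup, optional synthesis) is the point.

```python
def answer(query, embed, search, complete=None, synthesize=None, k=3):
    """Query -> (optional completion) -> vector lookup -> (optional synthesis)."""
    # Step 2 (optional, HyDE-style): search with a hypothetical answer
    # instead of the raw query text.
    probe = complete(query) if complete else query
    # Step 3: look up the k nearest documents for the probe.
    docs = search(embed(probe), k)
    # Step 4 (optional): let a model summarize or refine the lookups.
    return synthesize(query, docs) if synthesize else docs

# Toy stand-ins (assumptions, not real APIs).
docs_db = {
    "refund policy": "Refunds take 5 days.",
    "shipping": "Ships in 2 days.",
}

def toy_embed(text):
    """Pretend embedding: the set of lowercase words."""
    return set(text.lower().split())

def toy_search(vec, k):
    """Rank documents by word overlap between the probe and each key."""
    ranked = sorted(docs_db, key=lambda key: len(vec & set(key.split())), reverse=True)
    return [docs_db[key] for key in ranked[:k]]

plain = answer("refund policy please", toy_embed, toy_search, k=1)
hyde = answer(
    "where is my order",
    toy_embed,
    toy_search,
    complete=lambda q: "shipping update",  # stand-in for a model completion
    k=1,
)
```

Leaving both `complete` and `synthesize` as `None` reduces this to plain semantic search, which matches the last paragraph of the comment.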

pablo24602 · 3y ago

All the embedding-enabled GPT-3 apps I've seen do the following: User query -> Retrieve closest embedding's plaintext counterpart -> feed plaintext as context to GPT-3 prompt.

jamesblonde · 3y ago

Is this a form of prompt engineering then?

Your vector DB has well-formed prompts - users write random stuff, and you map it to the closest well-formed prompt?

vimota (OP) · 3y ago

To be clear, I didn't use the autocomplete GPT-3 API, just the embeddings one! Pinecone has some good docs on it: https://docs.pinecone.io/docs/openai. Happy to answer any questions :)

EGreg · 3y ago · 2 in thread

Why not just use GPT-3 or even GPT-2 classifier API? No generative AI needed

EForEndeavour · 3y ago

Looks like the Classifications API is deprecated: https://platform.openai.com/docs/guides/classifications

apienx · 3y ago

Embeddings are superior (classifiers are deprecated).

Hyption · 3y ago · 1 in thread

I don't like the unscientific ad for his gf's company.

'which helped launch the movement of those opposed to endocrine disruptors, was retracted and its author found to have committed scientific misconduct'

oluwie · 3y ago

mehh … you might be overthinking it

it’s his blog either way.

throwthere · 3y ago · 1 in thread

This looks incredible and magical to me. How do you learn to create things like this as a mostly web programmer? I had no idea vectorization etc. could integrate with GPT, but honestly the author makes it look kind of obvious/effortless.

vimota (OP) · 3y ago

Appreciate the kind words, but I'm sure you could pick it up pretty quickly too :) The OpenAI docs are pretty helpful as a starting point: https://platform.openai.com/docs/guides/embeddings/use-cases

djoldman · 3y ago · 1 in thread

How long did this take?

Did you consider something like OpenRefine or fuzzy matching / Levenshtein distance?

Seems like a common data cleaning ask with a small amount of data.
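
For reference, a minimal sketch of the fuzzy-matching alternative raised above: a plain dynamic-programming Levenshtein distance and a helper that picks the closest canonical name. The product names are made up, and `closest` is a hypothetical helper, not something from the article.

```python
def levenshtein(a, b):
    """Edit distance: minimum insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute (or match)
            ))
        prev = curr
    return prev[-1]

def closest(name, candidates):
    """Return the candidate with the smallest case-insensitive edit distance."""
    return min(candidates, key=lambda c: levenshtein(name.lower(), c.lower()))

# Made-up canonical product names.
canonical = ["Colgate Toothpaste", "Dove Soap"]
match = closest("colgat toothpaste", canonical)
```

Edit distance handles typos well but not paraphrases ("body wash" vs "shower gel"), which is the gap embeddings are meant to cover.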

vimota (OP) · 3y ago

I played around with that and pg_trgm (https://www.postgresql.org/docs/current/pgtrgm.html) a bit but unfortunately didn't have great results. I did do a bunch of manual data cleaning though, and also had some overriding logic: if certain keywords matched, it would skip semantic search and default to a result (for common ones).
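
The overriding logic described above might look something like this sketch. The keyword table and function names are hypothetical; the article does not show the actual rules, and `semantic_search` stands in for whatever embedding lookup is used.

```python
# Hypothetical keyword -> canonical product table for common cases.
OVERRIDES = {
    "shampoo": "Shampoo",
    "deodorant": "Deodorant",
}

def match_product(raw_text, semantic_search):
    """Return a hard-coded match if a known keyword appears, else fall back."""
    text = raw_text.lower()
    for keyword, canonical in OVERRIDES.items():
        if keyword in text:
            return canonical  # skip the semantic lookup entirely
    return semantic_search(raw_text)

def fake_semantic_search(text):
    """Stand-in for an embedding lookup (assumption)."""
    return "semantic match for " + text

direct = match_product("my Shampoo bottle", fake_semantic_search)
fallback = match_product("lip balm", fake_semantic_search)
```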

fswd · 3y ago · 1 in thread

What is pinecone and is there a link to a website?

madmax108 · 3y ago

Pinecone is a vector database: https://www.pinecone.io/

mdorazio · 3y ago

Are you saving the match pairs somewhere? I imagine 1) there are a finite number of them, 2) doing an exact lookup in a DB first will be faster and easier than calling GPT3 and Pinecone every time, and 3) eventually GPT3 APIs will get pricey enough to make you think twice unless you're running your own instance on a cluster.
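
A small sketch of the "exact lookup first" idea from the comment above: remember each match pair and only call the expensive matcher on a miss. `MatchCache` is a hypothetical name, and an in-memory dict stands in for a database table.

```python
class MatchCache:
    """Remember (normalized text -> match) pairs so repeats skip the slow path."""

    def __init__(self, expensive_match):
        self._expensive_match = expensive_match  # e.g. embedding + vector lookup
        self._pairs = {}
        self.misses = 0

    def lookup(self, text):
        # Normalize case and whitespace so trivially different inputs share a key.
        key = " ".join(text.lower().split())
        if key not in self._pairs:
            self.misses += 1
            self._pairs[key] = self._expensive_match(text)
        return self._pairs[key]

# str.upper stands in for the real, costly matcher (assumption).
cache = MatchCache(lambda text: text.strip().upper())
first = cache.lookup("Dove soap")
second = cache.lookup("  dove   SOAP ")
```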

mattfrommars · 3y ago

Pardon my ignorance here. I started to play around with text generation today and came across plenty of resources but found it hard to make any sense of them. I got https://github.com/oobabooga/text-generation-webui working, but instead of being able to answer questions, it revolves around the concept of generating text.

In your case and ChatGPT3, does it provide output based on the data you feed it? If that is the case, is there anything related to training the model to use your data?

I am trying to gauge a sense of what is going on.

ipv6ipv4 · 3y ago

How do you know if the output that was sent to customers (who believe they are getting accurate results from a knowledgeable human being, BTW) is correct?

espe · 3y ago

i fail to see how this data cleaning could not be solved with proper tokenization and some distance measure. the amount of power used for those api calls is slightly obscene.

edit: don't want to rant. it's not a bad post and i'm sure there are many far more wasteful examples than this.
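
One possible reading of "tokenization and some distance measure" is token-level Jaccard similarity, sketched below. This is an assumption about what the commenter had in mind, and the product strings are made up.

```python
import re

def tokens(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def jaccard(a, b):
    """Shared tokens divided by total distinct tokens (1.0 means identical sets)."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 1.0

score = jaccard("Dove Body Wash", "dove body-wash 500ml")
unrelated = jaccard("Dove Body Wash", "Colgate Toothpaste")
```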

1f60c · 3y ago

I'm disappointed that the article doesn’t explain what they ended up doing.

sexangel · 3y ago

> 100s of human hours saved

wait till it's thousands, millions, billions . . .

pjakubowski · 3y ago

Awesome to see the integration between Klaviyo automation and GPT-3 AI and using it to streamline your girlfriend's processes. Keep up the fantastic work!

NotYourLawyer · 3y ago

This is pure spam.

wyem · 3y ago

Loved reading it. Will feature this in my newsletter on AI Tools and learning resources, AI Brews https://aibrews.com
