AI for AWS Documentation

(awsdocsgpt.com)

221 pointswhatsthenews2y ago129 comments

underlines2y ago· 9 in thread

RAG is very difficult to do right. I am experimenting with various RAG projects from [1]. The main problems are:

- Chunking can interfere with context boundaries

- Content vectors can differ vastly from question vectors, for this you have to use hypothetical embeddings (they generate artificial questions and store them)

- Instead of saving just one embedding per text chunk you should store several (text chunk, hypothetical embedding questions, metadata)

- RAG will miserably fail with requests like "summarize the whole document"

- to my knowledge, OpenAI embeddings don't perform well; use an embedding that is optimized for question answering or information retrieval and supports multiple languages. SOTA textual embedding models can be found on the MTEB Leaderboard [2]. Also look into instructorEmbeddings

- the LLM used for the Q&A using your context should be fine-tuned for this task. There are several open (source?) LLMs based on openllama and others that are fine-tuned for information retrieval. They hallucinate less and stick to the context given.

1 https://github.com/underlines/awesome-marketing-datascience/...

2 https://github.com/embeddings-benchmark/mteb
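The "store several embeddings per chunk" idea from the list above can be sketched as follows. This is a minimal illustration, not the approach of any project linked in the thread: `toy_embed` is a bag-of-words stand-in for a real embedding model, and `ChunkRecord`, `search`, and the sample records are hypothetical names and data.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass, field


def toy_embed(text: str) -> Counter:
    """Stand-in for a real embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


@dataclass
class ChunkRecord:
    text: str                      # the chunk itself
    questions: list[str]           # hypothetical questions written at ingestion time
    metadata: dict = field(default_factory=dict)

    def vectors(self) -> list[Counter]:
        # One vector for the chunk text plus one per hypothetical question.
        return [toy_embed(self.text)] + [toy_embed(q) for q in self.questions]


def search(records: list[ChunkRecord], query: str) -> ChunkRecord:
    """Return the record whose best-matching vector is closest to the query."""
    q = toy_embed(query)
    return max(records, key=lambda r: max(cosine(q, v) for v in r.vectors()))


records = [
    ChunkRecord("S3 Glacier is a low-cost archive storage class.",
                ["What is Glacier?"], {"source": "example-doc-1"}),
    ChunkRecord("Lambda runs code without provisioning servers.",
                ["What is Lambda?"], {"source": "example-doc-2"}),
]
print(search(records, "what is lambda").metadata["source"])
```

Because a user question tends to look more like the stored questions than like the chunk text, matching against every vector of a record gives question-shaped queries a better chance of landing on the right chunk.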

darkteflon2y ago

This comment was very helpful for me, thanks.

I’ve been working with RAG for months, too, and it’s vanishingly rare to see anything but toy examples in the wild. This is a solid, concise list of where the dragons are.

Any idea where all the RAG practitioners hang out and trade war stories? Is there a forum or Discord or something?

tomhamer2y ago

Marqo provides automatic, configurable chunking (for example with overlap) and can allow you to bring your own model or choose from a wide range of opensource models. I think e5-large would be a good one to try. https://github.com/marqo-ai/marqo

d4rkp4ttern2y ago

Using Qdrant doesn’t require Docker, like Marqo does (from the README). Any trade-offs between the two? Doc chunking is an independent functionality and there are already libs that help chunk with overlap etc., and also it’s not hard to roll your own.
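For reference, a roll-your-own chunker with overlap can be as small as the sketch below. It splits on whitespace and counts words, which is an assumption made for brevity; real pipelines often count tokens or respect sentence and heading boundaries instead. `chunk_text` and its defaults are hypothetical, not taken from Marqo or Qdrant.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into word-based chunks; consecutive chunks share `overlap` words."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this window already reached the end of the text
    return chunks


sample = " ".join(f"w{i}" for i in range(10))
for chunk in chunk_text(sample, chunk_size=4, overlap=2):
    print(chunk)
```

The overlap means a sentence cut at a chunk boundary still appears whole in one of the two neighbouring chunks, at the cost of storing some text twice.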

avereveard2y ago

To solve the question embedding issue I prefer another approach, you store document embeddings as normal, then for searching you let LLM hallucinate an answer and use the wrongish answer embedding to do the vector search.

underlines2y ago

The inverse idea of Hypothetical Embeddings, HyDE [1] "HyDE is an embedding technique that takes queries, generates a hypothetical answer, and then embeds that generated document and uses that as the final example."

BriefGPT [2] is implementing this and it uses the following prompt at ingestion-time:

"Given the user's question, please generate a response that mimics the exact format in which the relevant information would appear within a document, even if the information does not exist. The response should not offer explanations, context, or commentary, but should emulate the precise structure in which the answer would be found in a hypothetical document. Factuality is not important, the priority is the hypothetical structure of the excerpt. Use made-up facts to emulate the structure. For example, if the user question is "who are the authors?", the response should be something like 'Authors: John Smith, Jane Doe, and Bob Jones' The user's question is:"

1 https://python.langchain.com/docs/modules/chains/additional/...

2 https://github.com/e-johnstonn/BriefGPT
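The query-time variant described above (let the model write a hypothetical answer, then search with its embedding) can be sketched like this. Everything here is an assumption for illustration: `toy_embed` is a bag-of-words stand-in for an embedding model, `fake_llm` is a hard-coded stand-in for an LLM call, and the two documents are made up. It is not the code used by LangChain or BriefGPT.

```python
import math
import re
from collections import Counter
from typing import Callable


def toy_embed(text: str) -> Counter:
    """Stand-in for a real embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def plain_search(question: str, documents: list[str]) -> str:
    """Baseline: embed the question itself and return the closest document."""
    q = toy_embed(question)
    return max(documents, key=lambda d: cosine(q, toy_embed(d)))


def hyde_search(question: str, documents: list[str], generate: Callable[[str], str]) -> str:
    """HyDE-style search: embed a generated hypothetical answer instead of the question."""
    hypothetical = generate(question)  # may be factually wrong; only its shape matters
    h = toy_embed(hypothetical)
    return max(documents, key=lambda d: cosine(h, toy_embed(d)))


def fake_llm(question: str) -> str:
    # Hypothetical stand-in for an LLM call, hard-coded for the demo question.
    return "Authors: John Smith, Jane Doe, and Bob Jones"


docs = [
    "Authors: Alice Example and Bob Jones",
    "Who are the intended readers of this guide?",
]
question = "who are the authors?"
print(plain_search(question, docs))           # matches the question-shaped text
print(hyde_search(question, docs, fake_llm))  # matches the answer-shaped text
```

With these toy vectors the plain search is pulled toward the document that is phrased like a question, while the hypothetical answer shares its layout and vocabulary with the document that actually contains the answer.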

d4rkp4ttern2y ago

That’s an interesting list (awesome-marketing-data science). Curious what is specific to marketing in that list, or maybe it’s just an inaccurate name.

underlines2y ago

historical repo name, it's really not that anymore, besides a very old list of marketing stuff that i rarely update. I should rename the repo, but I hesitate :)

zim12y ago

Have you found any particularly promising approach towards "hypothetical embeddings" that you can recommend?

cypress662y ago

What context size is used? Can you "brute force" your way by using let's say 16k or 32k of context?

sovietmudkipz2y ago· 7 in thread

This is exactly the kind of software that people should be making. Even if this one is bad, the concept is very sound.

I want to have a specialist AI that is trained to help me learn how to use the software. 100% what should be happening.

General AI should know how to do stuff too but having a specialist AI implies that the company/group of people/person is making sure to tune the model.

Just an IMO.

raincole2y ago

MDN implemented an "AI explains" feature, and the reception is extremely negative.

https://github.com/mdn/yari/issues/9208

fragmede2y ago

Is it? It's true there are a lot of very loud messages on that linked issue, but only MDN knows how many people are using and engaging with their system compared to on that thread. We, as the general public, don't know anything other than there's a vocal group of dissidents, and what MDN tells us. I looked and didn't find stats on its usage.

scarface_742y ago

Why?

You can ask the same questions to ChatGPT and get the same or better answers.

I also know from personal experience with ChatGPT, that you can use it to:

- convert Python/boto3 to any language that has an AWS SDK

- convert CloudFormation to Terraform or the CDK

- write scripts that use the SDK

You will get the occasional hallucination.

KRAKRISMOTT2y ago

ChatGPT is not up to date unless you start using the plugins. This sort of indexing is based on vector databases and various intermediate prompting. If you want to get technical, the academic term is "Retrieval Augmented Generation".
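The "intermediate prompting" step of retrieval augmented generation usually amounts to pasting the retrieved passages into the prompt before the question. A minimal sketch, assuming the passages have already been fetched from a vector database; `build_prompt`, the instruction wording, and the sample passages are hypothetical and not taken from awsdocsgpt.com:

```python
def build_prompt(question: str, passages: list[tuple[str, str]]) -> str:
    """Assemble a grounded prompt from (source, text) pairs retrieved for the question."""
    context = "\n\n".join(
        f"[{i}] {source}\n{text}"
        for i, (source, text) in enumerate(passages, start=1)
    )
    return (
        "Answer the question using only the numbered sources below. "
        "If the sources do not contain the answer, reply \"I don't know\".\n\n"
        f"Sources:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


prompt = build_prompt(
    "What is Amazon CodeCatalyst?",
    [("example-doc-a", "First retrieved passage."),
     ("example-doc-b", "Second retrieved passage.")],
)
print(prompt)
```

Numbering the sources lets the answer cite them, and because the passages come from a freshly indexed corpus, the model can answer about services released after its training cut-off.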

1 more reply

cle2y ago

Here's one example question that ChatGPT utterly fails at, but that this answers fine: "What is Amazon CodeCatalyst?"

ChatGPT: "I'm sorry, but as of my knowledge cut-off in September 2021, there was no service, tool, or product known as Amazon CodeCatalyst offered by Amazon Web Services (AWS). [...]"

1 more reply

politician2y ago

Ideally, specialist models are small enough to run locally.

shaunpud2y ago

I was looking at the CloudLinux[1] docs the other day and noticed an "Ask GPT?" feature which I found quite helpful.

[1] https://docs.cloudlinux.com

1 more reply

CSDude2y ago· 7 in thread

Well, I'll pass.

Prompt: how to connect a lambda function to an RDS instance and query it safely?

1. Configure your RDS instance for outbound connections to AWS Lambda. This step involves ensuring that your DB instance is in a public subnet on your VPC and that the instance's "PubliclyAccessible" property is set to true.

klysm2y ago

Lol yeah no thanks. This is one of things that makes me nervous about using LLMs. If the internet contains shitty solutions for solving a problem, it’s going to recommend shitty solutions. Inexperienced folks won’t know better.

There’s also the thing where that’s a positive feedback loop and we overflow the world’s corpus with garbage.

WrtCdEvrydy2y ago

> Inexperienced folks won’t know better.

I've seen answers like this in Stack Overflow before. Technically correct, but basically broken in every other way: cybersecurity's future is gonna be interesting.

1 more reply

donmcronald2y ago

> If the internet contains shitty solutions for solving a problem, it’s going to recommend shitty solutions.

Have you ever noticed how the dumbest people tend to be the loudest, most confident, and most eager to share their opinions? Those are the voices that trained ChatGPT.

It’s a hard pass for me, but I fear I’m going to be forced to use it because the dumbest half of humanity will view it as a peer and use it for everything.

scarface_742y ago

Asking the same question to ChatGPT:

To connect a Lambda function to an RDS (Relational Database Service) instance and query it safely, you can follow these general steps:

1. Configure your Lambda function's execution role: Ensure that your Lambda function has the necessary permissions to access the RDS instance. The role should include the `AWSLambdaVPCAccessExecutionRole` and the appropriate RDS permissions.

2. Place your Lambda function and RDS instance in the same VPC: Ensure that both the Lambda function and the RDS instance are in the same Virtual Private Cloud (VPC) to enable their communication. You can configure the VPC settings in the AWS Management Console.

3. Configure the security group settings: Set up the security groups for your Lambda function and RDS instance to allow the necessary inbound and outbound connections. The Lambda function's security group should allow outbound access to the RDS instance's security group on the RDS port (usually port 3306 for MySQL or 5432 for PostgreSQL).

4. Create a subnet group: If your RDS instance is in a private subnet, create a subnet group that includes the appropriate subnets for your Lambda function to access the RDS instance securely.

5. Connect to the RDS instance from the Lambda function: In your Lambda function code, establish a database connection using the appropriate database client library, such as pymysql for MySQL or psycopg2 for PostgreSQL. Use the connection details (endpoint, username, password, etc.) of your RDS instance.

6. Execute queries safely: To query the RDS instance safely, ensure that your Lambda function's code incorporates secure coding practices. Use parameterized queries or prepared statements to prevent SQL injection attacks. Avoid storing sensitive information (such as database credentials) in your Lambda function code and instead use environment variables or AWS Secrets Manager for secure storage and retrieval.

Remember to regularly update and patch your Lambda function, RDS instance, and associated dependencies to maintain security and stay protected against potential vulnerabilities.

It's worth noting that the exact implementation details may vary depending on your specific use case, programming language, and AWS setup. Consulting the AWS documentation and resources related to Lambda, RDS, and VPC networking can provide more detailed and up-to-date instructions for your specific requirements.
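Step 6 above (parameterized queries) can be illustrated without an AWS account. The sketch below uses Python's built-in `sqlite3` with an in-memory database as a stand-in for the RDS connection; that substitution is an assumption made so the example runs anywhere. With `pymysql` or `psycopg2` the idea is the same but the placeholder is `%s` rather than `?`. The table and function names are hypothetical.

```python
import sqlite3


def find_user(conn: sqlite3.Connection, name: str) -> list[tuple]:
    # The driver binds `name` as data, so quotes inside it cannot change the SQL.
    return conn.execute(
        "SELECT id, name FROM users WHERE name = ?", (name,)
    ).fetchall()


def find_user_unsafe(conn: sqlite3.Connection, name: str) -> list[tuple]:
    # Anti-pattern shown only for contrast: the input is spliced into the SQL text.
    return conn.execute(
        f"SELECT id, name FROM users WHERE name = '{name}'"
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])

payload = "x' OR '1'='1"
print(find_user(conn, payload))         # no user has that literal name
print(find_user_unsafe(conn, payload))  # the injected OR clause matches every row
```

The safe version treats the whole payload as a name to look up, while the unsafe version lets the payload turn the WHERE clause into a condition that is always true.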

8note2y ago

This all looks about right, but how would somebody without aws experience be able to validate it without making a mess?

6 more replies

vasili1112y ago

I find the links to where the AI got its information more useful than the AI's answer.

Jianghong942y ago

AWS has some great documentation

JimtheCoder2y ago· 4 in thread

Are you allowed to use the AWS logo on a site that is not owned by Amazon?

I originally thought this was an official Amazon website...their lawyers would probably say the same thing...

QuinnyPig2y ago

“Allowed” is a funny thing.

I launched “Last Week in AWS” with AWS in the domain name seven years ago. AWS has never made an issue of it, though they obviously have that option.

I also have the option (and ownership) to migrate to “Last Week in the Cloud” and talk about their competitors, so it’s likely everyone is happier this way—but I confess to not kicking the bear hard enough to find out.

scarface_742y ago

It’s probably because no one at AWS has heard of your little podcast or website /s

I’m sure you know that your name is brought up frequently inside AWS.

jborden132y ago

I thought I heard openai was sending cease and desists for *gpt domain names as well

nextworddev2y ago

Definitely not

zoomzoom2y ago· 4 in thread

We did something similar for all the cloud whitepapers from AWS, Azure, GCP, CloudFlare, and CNCF at https://cloudwhitepapers.withcoherence.com/

These are fun projects!

JimmyRuska2y ago

You might have better outcomes with the API docs and official documentation. Whitepapers tend to have a lot of marketing, may not age gracefully.

zoomzoom2y ago

Thanks for the perspective!

victor1062y ago

This looks cool.

How does this work?

zoomzoom2y ago

We actually open sourced the project - see here https://www.withcoherence.com/post/create-your-langchain-and...

jtokoph2y ago· 3 in thread

Prompt: What is glacier?

Glacier is a term that is not directly mentioned in the provided sources.

Prompt: What is a glacier?

A glacier is a large mass of ice that moves slowly over time due to the accumulation of snow, ice, and other forms of frozen precipitation.

Seems like it’s just using a general model?

serjester2y ago

It's definitely just using standard semantic search (otherwise you wouldn't be getting links). "What is glacier storage" gives you a great response.

redox992y ago

If they finetuned the model on AWS docs, would the embeddings, and thus the vector search improve?

1 more reply

manojlds2y ago

What is Glacier works for me but What is a Glacier doesn't.

bjt2y ago· 3 in thread

It invented an answer for something that AWS actually can't do right now.

https://imgur.com/a/0IYZ2WV

jrvarela562y ago

phind.com did not hallucinate in this case: https://capture.dropbox.com/4gIUDuAxr14bnNIt

https://www.phind.com/search?cache=d0b3a85b-17f9-4def-b8d0-b...

deanCommie2y ago

That is basically status quo for LLMs right now.

They're all so eager to please they will basically never say "that's actually not possible", and instead invent some plausible-sounding bullshit.

For beginners who are struggling to tell the difference between 1/ how to do something that's possible, 2/ how to do something that SHOULD be possible but just currently isn't, and 3/ how to do something that is RIDICULOUS and shouldn't even ever be possible, ChatGPT is worse than nothing :(

macksd2y ago

Maybe it could help them when designing APIs for new products to be consistent with previous design choices. But perhaps it's too late for that.

jamestimmins2y ago· 3 in thread

What's the best current way to make a bunch of documents searchable via LLMs like this?

I've tried using OpenAI with embeddings (iirc), but this was slow, got expensive quickly, and it struggled to answer questions about the text accurately. Curious if there are better standard approaches now.

Jianghong942y ago

A couple of things come to mind: 1. embedding methods: there are a couple of ways to do that; the most used one is OpenAI's text-davinci-002, although in my use case (short sentence descriptions of APIs) it didn't work very well; 2. how you split documents into pieces: LangChain has some implementations and helpful pointers for this.

I think you have to do a lot of experimentation on this until you find your best information retrieval strategy

linguistbreaker2y ago

I just came across this project which seems to be aiming at streamlining exactly that :

https://github.com/Mintplex-Labs/anything-llm

JimmyRuska2y ago

You could copy the steps listed in alpaca-lora with your own data. Try thebloke's superhot ggml models. Train it in lambda labs or replicate

ec1096852y ago· 2 in thread

Asked it whether you should use a shared vpc:

Made this up as reason for dedicated vpc: “Better performance: By using dedicated hardware, you have better control over the performance of your instances in the VPC. This can be beneficial for applications that have stringent performance requirements.”

jitix2y ago

I think ChatGPT and LLMs in general are more suited for the "What is xyz" or "How is xyz done" family of questions. Basically a smart search engine.

Beyond that it will simply have the output/personality of a person with rote-memory.

SV_BubbleTime2y ago

Even though it has better access to AWS Docs…

Still a language model and not a facts model.

mjhay2y ago· 1 in thread

It can't be worse than Amazon's own AWS documentation.

jacurtis2y ago

To be fair to AWS, their documentation has improved dramatically the past 24 months or so. Not perfect, but the difference between 2019 and 2023 is dramatic.

Also, I'd argue that it is very easy to be worse than AWS's own docs. A chatbot that hallucinates inaccurate answers, while sounding plausible and confident is far worse than documentation that is 100% accurate, but lacking in some areas.

I'd rather receive an answer of "I don't know" or "it isn't documented" than an explanation that seems legit, so I spend 15 hours building a solution in accordance with an LLM's response and then find out that it's not possible or real the way it was explained by the LLM.

scrum-treats2y ago· 1 in thread

Using ChatGPT for AWS service questions is actually pretty good. For instance, I asked it for a Cloud Practitioner study guide (using a small set of crafted prompts), and GPT performed quite well. While I have yet to query GPT about Solutions Architect or DevOps material, I know I can feed a set of URLs and GPT will "learn" the material and summarize it in ways meaningful and relevant to my prompts. In this way, ChatGPT is quite a powerful assistant on its own.

aradox662y ago

Agree, I've had great results asking chatgpt questions about AWS services. The interactivity is very helpful, and chatgpt will draft scripts too, although that's hit or miss. But for understanding concepts and services, it's great.

nextworddev2y ago· 1 in thread

Just use Phind.com for searching developer docs for most cases.

lukebbutton2y ago

This is cool, thanks for linking

71a54xd2y ago· 1 in thread

I've been using GPT4 for this since the beginning - ironically a large majority of AWS documentation has been machine generated since 2018. Circa 2019 the entire Elixir API for AWS was machine generated.

Asking GPT4 is also consistently less of a headache than asking the devops guy and getting a 20min explanation for a simple question.

istjohn2y ago

You need to add something like "You give clear and succinct answers to questions" to the beginning of your prompts to the devops guy.

JimmyRuska2y ago· 1 in thread

I wonder if people will make DSLs specifically for LLMs.

For example the terseness / symbols of APL, Perl, or even set notation.

LLMs could train and output the shorter symbolic notation, and it could be expanded for human readability by another program at export.

fragmede2y ago

that's just prompt engineering by another name

yayitswei2y ago· 1 in thread

Nice work! Would be even more useful to be able to have a conversation with it.

alsima2y ago

We're in the process of doing just that and adding chat context/basically remembering your past questions.

Aeolun2y ago· 1 in thread

It doesn’t distinguish between HTTP api gateway and REST api gateway.

When asked how to create a private HTTP gateway it happily tells you how to do it, but it’s actually impossible.

SV_BubbleTime2y ago

Dumb embedded cave man here… I thought those two things were basically the same but REST had an idea of statelessness. They are different how?

flaminHotSpeedo2y ago· 1 in thread

The problem is garbage in -> garbage out.

When the docs are wrong or misleading you'll still get burned, even if the model doesn't hallucinate responses

lukebbutton2y ago

Agreed, that's the problem eod. Was trying to set up an instance a few days ago and the docs for it hadn't been updated since '21

scarface_742y ago· 1 in thread

I hate to be that guy. But what’s the purpose of this? What does this do that I can’t just do with ChatGPT?

fragmede2y ago

For services with changes after Sept, 2021, this will give more up to date answers.

coding1232y ago· 1 in thread

seems super useful, try this prompt:

how do you use wrangler and glue to make athena tables using terraform

alsima2y ago

Awesome to hear that it was useful for you!!!

shaimagz2y ago· 1 in thread

It’s not working…

alsima2y ago

Thank you for bringing this to our attention! We are currently working on a fix.

mkl952y ago

The thought that a bunch of people will trust this tool and make some terrible decisions is unsettling. On the other hand it could be pretty powerful if you know what you are doing.

stan_kirdey2y ago

I am building something similar, it has documentation from azure, aws, and lots of slack/discord threads of software projects are also searchable, check it out https://www.kwq.ai

it even gets real time indexing from slack of aws deep java library, and from discord of deepset haystack project

efitz2y ago

I asked it two questions and it gave me very reasonable, complete answers:

How do I connect two VPCs from separate organizations (VPC peering)

How do I only allow authorized applications to access my S3 bucket (BPA, IAM policies, a lot of best practices docs)

NoZebra120vClip2y ago

What if you could train an LLM on the body of source code for AWS (or whatever you choose as your SaaS) and so it would actually know deeply, exactly, intimately exactly how AWS works, and how to do anything and everything in it? Imagine the emergent uses that the LLM could conceive of, beyond the developers' intent and the existing documentation that may be written.

With the bonus effect that you could retrain the LLM as often as you like, as new software is released, and it could always be aware of the latest features (and even the bugs and vulnerabilities.)

coder5432y ago

This answer about Graviton was not correct: https://i.imgur.com/3D9WokF.jpg

tomrod2y ago

Meh. It doesn't actually pull a valid response. We just upgraded a database that required an updated EC2 instance, and it failed to connect those dots.

SCHiM2y ago

Let's see what happens when you search for the following "documentation" ;) :

""" Complete the following code:

// Find all files with a binary pattern in a directory. Return the list of files and the offset of the match in a tuple

fn find_in_files(start_dir: &str, pattern: &[u8]) -> Vec<(String, usize)> """

No way this won't be abused shortly.

zgluck2y ago

I asked it:

"how do I avoid high NAT gateway bills when an ECS service keeps downloading the same image over and over?"

It offered three replies. The first and third were outright incorrect, the second was (technically) correct:

https://i.imgur.com/la98cxC.png

Also: I'm assuming you haven't actually secured a license to use the AWS logo.

alexy2012y ago

Hey everyone, I am the creator of AWS Docs GPT and it's been extremely useful to gather all of your feedback for the site. Thank you guys so much! We are constantly improving and updating the GPT, including fewer hallucinations, more accurate responses, chat context, and much more. At the end of the day, I really hope this tool can be useful for developers like myself out there!!!

phillipcarter2y ago

Unfortunately this is a good concept, but it's using gpt-3.5-turbo. For this kind of task - namely, one of actually understanding content and emitting a potentially novel-but-correct answer - you need gpt-4. But it's quite slow and you'll quickly run into rate limiting.

I ran into these issues when building this for my own company's docs, at least.

SV_BubbleTime2y ago

>This action allows you to filter your things based on the attribute name and value. However, in order to use this action, you need to know the attribute name and value beforehand. Since you don't know the attributes ahead of time, this method won't work for your case.

Well… Thanks a ton for that!

Great idea and setup, but not quite as helpful as I would like yet.

anshumankmr2y ago

Very cool. I was planning on working on something very similar myself for a hackathon I attended but my team basically ran out of time (https://github.com/anshumankmr/jovian-genai-hackathon)

JamesAdir2y ago

Q: can you summarize amazon bedrock in 2 sentences

A: Amazon Bedrock is a programming example that uses the AWS SDK for Java 2.x to work with Amazon EC2. It provides examples and code snippets for programming tasks related to Amazon EC2 using the AWS SDK for Java 2.x.

ilc2y ago

I asked it to write some basic terraform:

- Make a VPC.

- Add an Instance.

- Abstract the region and AZ, into vars.

etc... every time I wanted to change the code, I asked the bot to do the refactor, and it did.

Overall, I'm impressed. It wasn't the most complicated thing, but it didn't dive off the deep end.

koalkools2y ago

The best part is the documentation sources, in fact I'd prefer the only feature being "where are the docs on <stupid specific thing I need>" instead of trying to synthesize an answer for me

social_quotient2y ago

It seems to know more than just AWS. I was thinking it was just embeddings, but then I asked it “Is AWS better than Azure?” and it gave an answer that didn't seem to be derived from the source documents.

fzliu2y ago

We built something very similar, except for other open-source projects rather than AWS documentation: https://osschat.io

jdlyga2y ago

I really love this concept. While I do get better results from GPT-4 for AWS questions right now, AI as the "interpreter" for documentation works really well.

shanghaikid2y ago

You can try https://osschat.io/ for some open source projects.

ghomem2y ago

Simple AWS snapshot:

https://imgur.com/a/IGu1syf

ghomem2y ago

Please I beg you: ask it how to take a snapshot of an EC2 instance and then how to restore it :)

destroy-2A2y ago

It still does not understand boolean logic in IAM especially when you have Not conditions.

jgalt2122y ago

is AWS designed to take all my money?

No, AWS is not designed to take all your money. AWS offers a variety of payment options and cost-saving measures to help you manage your expenses effectively. ...

jsninjasamurai2y ago

This is down

j / k navigate · click thread line to collapse

129 comments

97 comments · 44 top-level

underlines2y ago· 9 in thread

RAG is very difficult to do right. I am experimenting with various RAG projects from [1]. The main problems are:

- Chunking can interfer with context boundaries

- Content vectors can differ vastly from question vectors, for this you have to use hypothetical embeddings (they generate artificial questions and store them)

- Instead of saving just one embedding per text-chuck you should store various (text chunk, hypothetical embedding questions, meta data)

- RAG will miserably fail with requests like "summarize the whole document"

1 https://github.com/underlines/awesome-marketing-datascience/...

2 https://github.com/embeddings-benchmark/mteb

darkteflon2y ago

This comment was very helpful for me, thanks.

I’ve been working with RAG for months, too, and it’s vanishingly rare to see anything but toy examples in the wild. This is a solid, concise list of where the dragons are.

Any idea where all the RAG practitioners hang out and trade war stories? Is there a forum or Discord or something?

tomhamer2y ago

d4rkp4ttern2y ago

avereveard2y ago

underlines2y ago

BriefGPT [2] is implementing this and it uses the following prompt at ingestion-time:

1 https://python.langchain.com/docs/modules/chains/additional/...

2 https://github.com/e-johnstonn/BriefGPT

d4rkp4ttern2y ago

That’s an interesting list (awesome-marketing-data science). Curious what is specific to marketing in that list, or maybe it’s just an inaccurate name.

underlines2y ago

historical repo name, it's really not that anymore, besides a very old list of marketing stuff that i rarely update. I should rename the repo, but I hesitate :)

zim12y ago

Have you found any particularly promising approach towards "hypothetical embeddings" that you can recommend?

cypress662y ago

What context size is used? Can you "brute force" your way by using let's say 16k or 32k of context?

sovietmudkipz2y ago· 7 in thread

This is exactly the kind of software that people should be making. Even if this one is bad, the concept is very sound.

I want to have a specialist AI that is trained to help me learn how to use the software. 100% what should be happening.

General AI should know how to do stuff too but having an specialist AI implies that the company/group of people/person is making sure to tune the model.

Just an IMO.

raincole2y ago

MDN implemented an "AI explains" feature, and the reception is extremely negative.

https://github.com/mdn/yari/issues/9208

fragmede2y ago

scarface_742y ago

Why?

You can ask the same questions to ChatGPT and get the same or better answers.

I also know from personal experience with ChatGPT, that you can use it to:

- convert Python/boto3 to any language that has an AWS SDK

- convert CloudFormation to Terraform or the CDK

- write scripts that use the SDK

You will get the occasional hallucination.

KRAKRISMOTT2y ago

1 more reply

cle2y ago

Here's one example question that ChatGPT utterly fails at, but that this answers fine: "What is Amazon CodeCatalyst?"

ChatGPT: "I'm sorry, but as of my knowledge cut-off in September 2021, there was no service, tool, or product known as Amazon CodeCatalyst offered by Amazon Web Services (AWS). [...]"

1 more reply

politician2y ago

Ideally, specialist models are small enough to run locally.

shaunpud2y ago

I was looking at the CloudLinux[1] docs the other day and noticed an "Ask GPT?" feature which I found quite helpful.

[1] https://docs.cloudlinux.com

1 more reply

CSDude2y ago· 7 in thread

Well, I'll pass.

Prompt: how to connect a lambda function to an RDS instance and query it safely?

klysm2y ago

There’s also the thing where that’s a positive feedback look and we overflow the worlds corpus with garbage.

WrtCdEvrydy2y ago

> Inexperienced folks won’t know better.

I've seen answers like this in Stack Overflow before. Technically correct, but basically broken in every other way: cybersecurity's future is gonna be interesting.

1 more reply

donmcronald2y ago

> If the internet contains shitty solutions for solving a problem, it’s going to recommend shitty solutions.

Have you ever noticed how the dumbest people tend be be the loudest, most confident, and most eager to share their opinions? Those are the voices that trained ChatGPT.

It’s a hard pass for me, but I fear I’m going to be forced to used it because the dumbest half of humanity will view it as a peer and use it for everything.

scarface_742y ago

Asking the same question to Chatgpt

To connect a Lambda function to an RDS (Relational Database Service) instance and query it safely, you can follow these general steps:

4. Create a subnet group: If your RDS instance is in a private subnet, create a subnet group that includes the appropriate subnets for your Lambda function to access the RDS instance securely.

Remember to regularly update and patch your Lambda function, RDS instance, and associated dependencies to maintain security and stay protected against potential vulnerabilities.

8note2y ago

This all looks about right, but how would somebody without aws experience be able to validate it without making a mess?

6 more replies

vasili1112y ago

I find the links from where AI got information more useful than answer of AI.

Jianghong942y ago

AWS has some great documentations

JimtheCoder2y ago· 4 in thread

Are you allowed to use the AWS logo on a site that is not owned by Amazon?

I originally thought this was an official Amazon website...their lawyers would probably say the same thing...

QuinnyPig2y ago

“Allowed” is a funny thing.

I launched “Last Week in AWS” with AWS in the domain name seven years ago. AWS has never made an issue of it, though they obviously have that option.

scarface_742y ago

It’s probably because no one at AWS has heard of your little podcast or website /s

I’m sure you know that your name is brought up frequently inside AWS.

jborden132y ago

I thought I heard openai was sending cease and desists for *gpt domain names as well

nextworddev2y ago

Definitely not

zoomzoom2y ago· 4 in thread

We did something similar for all the cloud whitepapers from AWS, Azure, GCP, CloudFlare, and CNCF at https://cloudwhitepapers.withcoherence.com/

These are fun projects!

JimmyRuska2y ago

You might have better outcomes with the API docs and official documentation. Whitepapers tend to have a lot of marketing, may not age gracefully.

zoomzoom2y ago

Thanks for the perspective!

victor1062y ago

This looks cool.

How does this work?

zoomzoom2y ago

We actually open sourced the project - see here https://www.withcoherence.com/post/create-your-langchain-and...

jtokoph2y ago· 3 in thread

Prompt: What is glacier?

Glacier is a term that is not directly mentioned in the provided sources.

Prompt: What is a glacier?

A glacier is a large mass of ice that moves slowly over time due to the accumulation of snow, ice, and other forms of frozen precipitation.

Seems like it’s just using a general model?

serjester2y ago

It's definitely just using standard semantic search (otherwise you wouldn't be getting links). "What is glacier storage" gives you a great response.

redox992y ago

If they finetuned the model on AWS docs, would the embeddings, and thus the vector search improve?


manojlds2y ago

"What is Glacier" works for me but "What is a Glacier" doesn't.
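The behavior in this subthread is consistent with plain similarity search over document chunks: a short query can score below the retrieval cutoff while a more specific one clears it. Below is a minimal sketch of that effect. Everything in it is an assumption for illustration: the bag-of-words `embed` function stands in for a real embedding model, the two `CHUNKS` strings are made up, and the `threshold` value is arbitrary. It says nothing about how awsdocsgpt.com is actually implemented.

```python
import math
from collections import Counter

# Hypothetical document chunks (not taken from the AWS docs).
CHUNKS = [
    "amazon s3 glacier storage classes are built for data archiving",
    "amazon ec2 provides resizable compute capacity",
]

def embed(text):
    """Toy 'embedding': word counts. A real system would call an embedding model."""
    return Counter(text.lower().replace("?", "").split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def retrieve(query, chunks, threshold=0.3):
    """Return (score, chunk) pairs at or above the threshold, best first."""
    q = embed(query)
    scored = sorted(((cosine(q, embed(c)), c) for c in chunks), reverse=True)
    return [(round(score, 3), chunk) for score, chunk in scored if score >= threshold]

# The more specific query clears the cutoff; the shorter one retrieves nothing,
# which is when a RAG system would report that its sources do not mention the term.
print(retrieve("What is glacier storage", CHUNKS))
print(retrieve("What is glacier?", CHUNKS))
```

With a real embedding model the scores would differ, but the failure mode is the same: if nothing is retrieved, the answer depends entirely on how the prompt tells the model to behave without sources.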

bjt2y ago· 3 in thread

It invented an answer for something that AWS actually can't do right now.

https://imgur.com/a/0IYZ2WV

jrvarela562y ago

phind.com did not hallucinate in this case: https://capture.dropbox.com/4gIUDuAxr14bnNIt

https://www.phind.com/search?cache=d0b3a85b-17f9-4def-b8d0-b...

deanCommie2y ago

That is basically status quo for LLMs right now.

They're all so eager to please that they will basically never say "that's actually not possible", and instead invent some plausible-sounding bullshit.

macksd2y ago

Maybe it could help them when designing APIs for new products to be consistent with previous design choices. But perhaps it's too late for that.

jamestimmins2y ago· 3 in thread

What's the best current way to make a bunch of documents searchable via LLMs like this?

Jianghong942y ago

I think you have to do a lot of experimenting with this until you find the information retrieval strategy that works best for you.

linguistbreaker2y ago

I just came across this project, which seems to be aiming at streamlining exactly that:

https://github.com/Mintplex-Labs/anything-llm

JimmyRuska2y ago

You could copy the steps listed in alpaca-lora with your own data. Try TheBloke's SuperHOT GGML models. Train it on Lambda Labs or Replicate.

ec1096852y ago· 2 in thread

Asked it whether you should use a shared vpc:

jitix2y ago

I think ChatGPT and LLMs in general are better suited to the "What is xyz" or "How is xyz done" family of questions. Basically a smart search engine.

Beyond that it will simply have the output/personality of a person with rote-memory.

SV_BubbleTime2y ago

Even though it has better access to AWS Docs…

Still a language model and not a facts model.

mjhay2y ago· 1 in thread

It can't be worse than Amazon's own AWS documentation.

jacurtis2y ago

To be fair to AWS, their documentation has improved dramatically the past 24 months or so. Not perfect, but the difference between 2019 and 2023 is dramatic.

scrum-treats2y ago· 1 in thread

aradox662y ago

nextworddev2y ago· 1 in thread

Just use Phind.com for searching developer docs for most cases.

lukebbutton2y ago

This is cool, thanks for linking

71a54xd2y ago· 1 in thread

Asking GPT4 is also consistently less of a headache than asking the devops guy and getting a 20min explanation for a simple question.

istjohn2y ago

You need to add something like "You give clear and succinct answers to questions" to the beginning of your prompts to the devops guy.

JimmyRuska2y ago· 1 in thread

I wonder if people will make DSLs specifically for LLMs.

For example, the terseness and symbols of APL, Perl, or even set notation.

LLMs could train and output the shorter symbolic notation, and it could be expanded for human readability by another program at export.
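A minimal sketch of the "expanded by another program at export" half of this idea, assuming a made-up symbol table. The `EXPANSIONS` mapping and the `expand` function are hypothetical names for illustration, not an existing DSL or library.

```python
import re

# Hypothetical symbol table: terse set-notation symbols and their readable forms.
EXPANSIONS = {
    "∀": "for all",
    "∃": "there exists",
    "∈": "in",
    "∧": "and",
    "∨": "or",
    "¬": "not",
    "⇒": "implies",
}

def expand(terse):
    """Replace each known symbol with its words, then collapse extra whitespace."""
    spaced = "".join(
        f" {EXPANSIONS[ch]} " if ch in EXPANSIONS else ch for ch in terse
    )
    return re.sub(r"\s+", " ", spaced).strip()

print(expand("∀x∈S: x>0 ⇒ ¬empty(S)"))
# prints: for all x in S: x>0 implies not empty(S)
```

Whether this saves anything in practice would depend on how the model's tokenizer splits such symbols, which this sketch does not measure.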

fragmede2y ago

that's just prompt engineering by another name

yayitswei2y ago· 1 in thread

Nice work! Would be even more useful to be able to have a conversation with it.

alsima2y ago

We're in the process of doing just that and adding chat context/basically remembering your past questions.

Aeolun2y ago· 1 in thread

It doesn’t distinguish between an HTTP API gateway and a REST API gateway.

When asked how to create a private HTTP gateway, it happily tells you how to do it, but it’s actually impossible.

SV_BubbleTime2y ago

Dumb embedded cave man here… I thought those two things were basically the same but REST had an idea of statelessness. They are different how?

flaminHotSpeedo2y ago· 1 in thread

The problem is garbage in -> garbage out.

When the docs are wrong or misleading you'll still get burned, even if the model doesn't hallucinate responses

lukebbutton2y ago

Agreed, that's the problem eod. Was trying to set up an instance a few days ago and the docs for it hadn't been updated since '21

scarface_742y ago· 1 in thread

I hate to be that guy. But what’s the purpose of this? What does this do that I can’t just do with ChatGPT?

fragmede2y ago

For services with changes after September 2021, this will give more up-to-date answers.

coding1232y ago· 1 in thread

seems super useful, try this prompt:

how do you use wrangler and glue to make athena tables using terraform

alsima2y ago

Awesome to hear that it was useful for you!!!

shaimagz2y ago· 1 in thread

It’s not working…

alsima2y ago

Thank you for bringing this to our attention! We are currently working on a fix.

mkl952y ago

The thought that a bunch of people will trust this tool and make some terrible decisions is unsettling. On the other hand it could be pretty powerful if you know what you are doing.

stan_kirdey2y ago

I am building something similar. It has documentation from Azure and AWS, and lots of Slack/Discord threads from software projects are also searchable. Check it out: https://www.kwq.ai

It even gets real-time indexing from the Slack of the AWS Deep Java Library and from the Discord of the deepset Haystack project.

efitz2y ago

I asked it two questions and it gave me very reasonable, complete answers:

How do I connect two VPCs from separate organizations (VPC peering)

How do I only allow authorized applications to access my S3 bucket (BPA, IAM policies, a lot of best practices docs)

NoZebra120vClip2y ago

With the bonus effect that you could retrain the LLM as often as you like, as new software is released, and it could always be aware of the latest features (and even the bugs and vulnerabilities.)

coder5432y ago

This answer about Graviton was not correct: https://i.imgur.com/3D9WokF.jpg

tomrod2y ago

Meh. It doesn't actually pull a valid response. We just upgraded a database that required an updated EC2 instance, and it failed to connect those dots.

SCHiM2y ago

Let's see what happens when you search for the following "documentation" ;) :

"""
Complete the following code:

// Find all files with a binary pattern in a directory. Return the list of files and the offset of the match in a tuple
fn find_in_files(start_dir: &str, pattern: &[u8]) -> Vec<(String, usize)>
"""

No way this won't be abused shortly.

zgluck2y ago

I asked it:

"how do I avoid high NAT gateway bills when an ECS service keeps downloading the same image over and over?"

It offered three replies. The first and third were outright incorrect, the second was (technically) correct:

https://i.imgur.com/la98cxC.png

Also: I'm assuming you haven't actually secured a license to use the AWS logo.

alexy2012y ago

phillipcarter2y ago

I ran into these issues when building this for my own company's docs, at least.

SV_BubbleTime2y ago

Well… Thanks a ton for that!

Great idea and setup, but not quite as helpful as I would like yet.

anshumankmr2y ago

Very cool. I was planning on working on something very similar myself for a hackathon I attended but my team basically ran out of time (https://github.com/anshumankmr/jovian-genai-hackathon)

JamesAdir2y ago

Q: can you summarize amazon bedrock in 2 sentences

ilc2y ago

I asked it to write some basic terraform:

- Make a VPC.

- Add an Instance.

- Abstract the region and AZ into vars.

etc... every time I wanted to change the code, I asked the bot to do the refactor, and it did.

Overall, I'm impressed. It wasn't the most complicated thing, but it didn't dive off the deep end.

koalkools2y ago

The best part is the documentation sources, in fact I'd prefer the only feature being "where are the docs on <stupid specific thing I need>" instead of trying to synthesize an answer for me

social_quotient2y ago

fzliu2y ago

We built something very similar, except for other open-source projects rather than AWS documentation: https://osschat.io

jdlyga2y ago

I really love this concept. While I do get better results from GPT-4 for AWS questions right now, AI as the "interpreter" for documentation works really well.

shanghaikid2y ago

You can try https://osschat.io/ for some open source projects.

ghomem2y ago

Simple AWS snapshot:

https://imgur.com/a/IGu1syf

ghomem2y ago

Please I beg you: ask it how to take a snapshot of an EC2 instance and then how to restore it :)

destroy-2A2y ago

It still does not understand boolean logic in IAM, especially when you have Not conditions.
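To illustrate why negated elements are easy to get wrong, here is a toy evaluator that takes `NotAction` as one example of a "Not" element. It is a heavily simplified sketch, not the real IAM engine: it ignores resources, conditions, principals, permission boundaries and SCPs, and the function names and sample policies are made up. The rules it does model are documented IAM behavior: an explicit Deny overrides any Allow, anything not allowed is implicitly denied, and an Allow with `NotAction` allows everything except the listed actions without denying them.

```python
from fnmatch import fnmatchcase

def _matches(patterns, action):
    """Case-insensitive wildcard match, e.g. 'iam:*' matches 'iam:CreateUser'."""
    return any(fnmatchcase(action.lower(), p.lower()) for p in patterns)

def statement_applies(stmt, action):
    """A statement applies if Action matches, or if NotAction does NOT match."""
    if "NotAction" in stmt:
        return not _matches(stmt["NotAction"], action)
    return _matches(stmt.get("Action", []), action)

def evaluate(statements, action):
    """Return 'ExplicitDeny', 'Allow', or 'ImplicitDeny' for one action."""
    decision = "ImplicitDeny"
    for stmt in statements:
        if not statement_applies(stmt, action):
            continue
        if stmt["Effect"] == "Deny":
            return "ExplicitDeny"  # an explicit deny always wins
        decision = "Allow"
    return decision

# Hypothetical policy: allow everything except IAM, plus one IAM read action.
POLICY = [
    {"Effect": "Allow", "NotAction": ["iam:*"]},
    {"Effect": "Allow", "Action": ["iam:ListUsers"]},
]

# Same policy plus a Deny that applies to everything EXCEPT the listed actions.
DENY_POLICY = POLICY + [
    {"Effect": "Deny", "NotAction": ["s3:*", "iam:ListUsers"]},
]

print(evaluate(POLICY, "iam:CreateUser"))       # ImplicitDeny, not ExplicitDeny
print(evaluate(POLICY, "iam:ListUsers"))        # Allow, via the second statement
print(evaluate(DENY_POLICY, "ec2:RunInstances"))  # ExplicitDeny
```

The first case is the usual trap: `NotAction` under Allow leaves the excluded actions merely unmentioned, so another statement can still grant them.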

jgalt2122y ago

is AWS designed to take all my money?

No, AWS is not designed to take all your money. AWS offers a variety of payment options and cost-saving measures to help you manage your expenses effectively. ...

jsninjasamurai2y ago

This is down
