I've always been more of a fastidious crafter than a "just get it built" person, but I also struggle with a blank page. I thrive on editing more than writing. Since forever, I've liked to get out something that works, or mostly works, and then start carving on it until I like it.
LLMs have been helping me get some ink on the page, but very little of what they suggest ends up in the final product.
Supposedly Tom Robbins writes books entirely by putting one word after another starting with the first one and finishing with the last one. I don't know if that's apocryphal, but I do think that's closer to the process for some people.
But if I were a writer, I'd be squarely in the "get out a first draft; it will be like pulling teeth, but just get something down; then you can do the fun part of revising and polishing" camp.
As one myself, using LLMs doesn't feel like giving up control; it feels like collaboration. You probably don't like collaboration.
When I interact with an LLM it feels like I'm speaking to someone who lacks inherent conceptual understanding, but is doing their utmost to bullshit me. This is problematic when it comes to algorithmic thinking.
I sometimes need to solve problems that involve tweaking existing algorithms in somewhat novel ways. For example "k-medoid clustering but with various capacity constraints on clusters", or "contraction hierarchies but accounting for varying vehicle dimension restrictions".
I love collaborating with smart colleagues to come up with ideas to solve these problems. Seeing ChatGPT fail to solve very simple traveling-salesman-type problems gives me little confidence that it can do novel conceptual thinking.
EDIT
To test my own claims, I just tried asking ChatGPT to make this modification to k-means clustering. It produced something that will give a result but will likely be far from optimal. When prompted to refine the solution, it failed to understand the issue and instead proposed the exact same solution, just with a bit more implementation detail.
User
Suggest a modification to the k-means algorithm where the number of elements assigned to each cluster is limited to a pre-defined maximum value.
ChatGPT
To modify the traditional k-means algorithm to limit the number of elements assigned to each cluster to a pre-defined maximum value, you can create a variation of k-means called "k-means with cluster size constraints." This modification ensures that no cluster exceeds the specified maximum size while still optimizing cluster centroids based on the sum of squared distances. Here's a high-level overview of the modified algorithm:
1 Initialize centroids: Start by randomly selecting K initial centroids, where K is the desired number of clusters.
2 Assign data points to clusters: Assign each data point to the nearest centroid based on Euclidean distance, just like in the traditional k-means algorithm.
3 Update centroids: Recalculate the centroids of each cluster by taking the mean of all the data points assigned to that cluster.
4 Check cluster sizes: After updating the centroids, check the size of each cluster. If any cluster exceeds the pre-defined maximum size, proceed to reassign some data points from that cluster to other clusters. Here's how you can do this: Identify the cluster(s) that exceed the maximum size. For each over-sized cluster: Sort the data points in that cluster by their distance to the cluster centroid in ascending order. Start reassigning the farthest data points to other clusters (the ones with the smallest distances) until the cluster size constraint is satisfied. Update the centroids of the affected clusters accordingly.
5 Repeat steps 2 and 3: Continue iterating between data point assignment and centroid updating until convergence, or until a predefined number of iterations have been reached.
6 Termination: The algorithm terminates when either the cluster assignments do not change significantly between iterations or after a fixed number of iterations.
Here are a few things to consider when implementing this modification:
You will need to experiment with the order in which you reassign data points from over-sized clusters to minimize the impact on the overall clustering quality.
You should set an appropriate stopping criterion to prevent the algorithm from running indefinitely.
Be prepared to handle situations where it is impossible to meet the cluster size constraint while maintaining good cluster quality. In such cases, you may need to relax the constraint or consider alternative clustering methods.
This modified k-means algorithm with cluster size constraints allows you to control cluster sizes while still finding meaningful clusters in your data.
User
Your proposed solution has an issue that points might be reassigned even if there are clearly no other suitable clusters. Please tweak the solution to avoid this issue.
ChatGPT
To address the issue of potentially reassigning points even when there are no other suitable clusters, we can modify the approach by introducing a priority queue or heap data structure to efficiently manage the reassignment process. Here's an improved version of the k-means algorithm with cluster size constraints:
1 Initialize centroids: Start by randomly selecting K initial centroids, where K is the desired number of clusters.
2 Assign data points to clusters: Assign each data point to the nearest centroid based on Euclidean distance, just like in the traditional k-means algorithm.
3 Update centroids: Recalculate the centroids of each cluster by taking the mean of all the data points assigned to that cluster.
4 Check cluster sizes: After updating the centroids, check the size of each cluster. If any cluster exceeds the pre-defined maximum size, proceed to reassign data points from that cluster to other clusters. Here's how you can do this: Identify the cluster(s) that exceed the maximum size. For each over-sized cluster: Create a priority queue (or heap) to store the data points in the cluster, ordered by their distance to the cluster centroid in ascending order. Start reassigning the farthest data points to other clusters (the ones with the smallest distances) until the cluster size constraint is satisfied. Update the centroids of the affected clusters accordingly. Continue this process until either the cluster size constraint is met or there are no more suitable clusters to reassign points to.
5 Repeat steps 2 and 3: Continue iterating between data point assignment and centroid updating until convergence, or until a predefined number of iterations have been reached.
6 Termination: The algorithm terminates when either the cluster assignments do not change significantly between iterations or after a fixed number of iterations.
This modified approach ensures that points are only reassigned to clusters that are suitable and have room to accommodate additional points while maintaining the overall quality of the clustering solution. The use of a priority queue helps prioritize the reassignment of points to clusters with the least impact on their quality.
Additionally, you should set an appropriate stopping criterion to prevent the algorithm from running indefinitely, and you may need to handle situations where it is impossible to meet the cluster size constraint while maintaining good cluster quality by relaxing the constraint or considering alternative clustering methods.
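For concreteness, the capped-assignment idea the transcript describes can be sketched as a greedy heuristic: at each iteration, points are assigned to the nearest centroid that still has room, processed in order of how strongly they prefer their best centroid over their second-best. This is my own illustrative sketch, not ChatGPT's output or an optimal method; the function name and the ordering rule are assumptions, and it assumes `k * max_size >= len(X)` so every point can be placed.

```python
import numpy as np

def capacitated_kmeans(X, k, max_size, n_iter=100, seed=0):
    """Greedy sketch of k-means with a per-cluster size cap.

    Heuristic only: points with the widest margin between their best and
    second-best centroid are placed first, so they are least likely to be
    bumped to a worse cluster. Assumes k * max_size >= len(X).
    """
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        # Pairwise distances: shape (n_points, k).
        d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        # Sort points so those that prefer their nearest centroid most
        # strongly (largest gap to second-nearest) are assigned first.
        two_smallest = np.partition(d, 1, axis=1)
        order = np.argsort(two_smallest[:, 0] - two_smallest[:, 1])
        counts = np.zeros(k, dtype=int)
        new_labels = np.empty(len(X), dtype=int)
        for i in order:
            # Nearest centroid that still has capacity.
            for c in np.argsort(d[i]):
                if counts[c] < max_size:
                    new_labels[i] = c
                    counts[c] += 1
                    break
        if np.array_equal(new_labels, labels):
            break  # converged: assignments unchanged
        labels = new_labels
        for c in range(k):
            if counts[c] > 0:
                centroids[c] = X[labels == c].mean(axis=0)
    return labels, centroids
```

The cap is enforced by construction, which makes the original commenter's objection visible in code: a point can be shoved into a distant cluster simply because its natural cluster filled up first, and nothing here reasons about whether that reassignment was actually warranted.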
Also, what you're doing here is asking ChatGPT for the answer. ChatGPT is more effective via collaboration. Meaning instead of shoving the entire problem down its throat and asking it to solve it, you ask it for advice. Ask it for bits and pieces of things.