I've always been more of a fastidious crafter than a "just get it built" person, but I also struggle with a blank page. I thrive on editing more than writing. Since forever, I've liked to get out something that works, or mostly works, and then start carving on it until I like it.
LLMs have been helping me get some ink on the page, but very little of what they suggest ends up in the final product.
Supposedly Tom Robbins writes books entirely by putting one word after another starting with the first one and finishing with the last one. I don't know if that's apocryphal, but I do think that's closer to the process for some people.
But if I were a writer, I'd be squarely in the "get out a first draft; it will be like pulling teeth, but just get something down; then you can do the fun part of revising and polishing" camp.
As one myself, using LLMs doesn't feel like giving up control; it feels like collaboration. You probably don't like collaboration.
When I interact with an LLM it feels like I'm speaking to someone who lacks inherent conceptual understanding, but is doing their utmost to bullshit me. This is problematic when it comes to algorithmic thinking.
I sometimes need to solve problems that involve tweaking existing algorithms in somewhat novel ways. For example "k-medoid clustering but with various capacity constraints on clusters", or "contraction hierarchies but accounting for varying vehicle dimension restrictions".
I love collaborating with smart colleagues to come up with ideas to solve these problems. Seeing ChatGPT fail to solve very simple traveling-salesman-type problems gives me little confidence that it can do novel conceptual thinking.
EDIT
To test my own claims, I just tried asking ChatGPT to make this modification to k-means clustering. It produced something that will give a result but will likely be far from optimal. When prompted to refine the solution, it failed to understand the issue and instead proposed the exact same solution, just with a bit more implementation detail.
User
Suggest a modification to the k-means algorithm where the number of elements assigned to each cluster is limited to a pre-defined maximum value.
ChatGPT
To modify the traditional k-means algorithm to limit the number of elements assigned to each cluster to a pre-defined maximum value, you can create a variation of k-means called "k-means with cluster size constraints." This modification ensures that no cluster exceeds the specified maximum size while still optimizing cluster centroids based on the sum of squared distances. Here's a high-level overview of the modified algorithm:
1 Initialize centroids: Start by randomly selecting K initial centroids, where K is the desired number of clusters.
2 Assign data points to clusters: Assign each data point to the nearest centroid based on Euclidean distance, just like in the traditional k-means algorithm.
3 Update centroids: Recalculate the centroids of each cluster by taking the mean of all the data points assigned to that cluster.
4 Check cluster sizes: After updating the centroids, check the size of each cluster. If any cluster exceeds the pre-defined maximum size, proceed to reassign some data points from that cluster to other clusters. Here's how you can do this: Identify the cluster(s) that exceed the maximum size. For each over-sized cluster: Sort the data points in that cluster by their distance to the cluster centroid in ascending order. Start reassigning the farthest data points to other clusters (the ones with the smallest distances) until the cluster size constraint is satisfied. Update the centroids of the affected clusters accordingly.
5 Repeat steps 2 and 3: Continue iterating between data point assignment and centroid updating until convergence, or until a predefined number of iterations have been reached.
6 Termination: The algorithm terminates when either the cluster assignments do not change significantly between iterations or after a fixed number of iterations.
Here are a few things to consider when implementing this modification:
You will need to experiment with the order in which you reassign data points from over-sized clusters to minimize the impact on the overall clustering quality.
You should set an appropriate stopping criterion to prevent the algorithm from running indefinitely.
Be prepared to handle situations where it is impossible to meet the cluster size constraint while maintaining good cluster quality. In such cases, you may need to relax the constraint or consider alternative clustering methods.
This modified k-means algorithm with cluster size constraints allows you to control cluster sizes while still finding meaningful clusters in your data.
User
Your proposed solution has an issue that points might be reassigned even if there are clearly no other suitable clusters. Please tweak the solution to avoid this issue.
ChatGPT
To address the issue of potentially reassigning points even when there are no other suitable clusters, we can modify the approach by introducing a priority queue or heap data structure to efficiently manage the reassignment process. Here's an improved version of the k-means algorithm with cluster size constraints:
1 Initialize centroids: Start by randomly selecting K initial centroids, where K is the desired number of clusters.
2 Assign data points to clusters: Assign each data point to the nearest centroid based on Euclidean distance, just like in the traditional k-means algorithm.
3 Update centroids: Recalculate the centroids of each cluster by taking the mean of all the data points assigned to that cluster.
4 Check cluster sizes: After updating the centroids, check the size of each cluster. If any cluster exceeds the pre-defined maximum size, proceed to reassign data points from that cluster to other clusters. Here's how you can do this: Identify the cluster(s) that exceed the maximum size. For each over-sized cluster: Create a priority queue (or heap) to store the data points in the cluster, ordered by their distance to the cluster centroid in ascending order. Start reassigning the farthest data points to other clusters (the ones with the smallest distances) until the cluster size constraint is satisfied. Update the centroids of the affected clusters accordingly. Continue this process until either the cluster size constraint is met or there are no more suitable clusters to reassign points to.
5 Repeat steps 2 and 3: Continue iterating between data point assignment and centroid updating until convergence, or until a predefined number of iterations have been reached.
6 Termination: The algorithm terminates when either the cluster assignments do not change significantly between iterations or after a fixed number of iterations.
This modified approach ensures that points are only reassigned to clusters that are suitable and have room to accommodate additional points while maintaining the overall quality of the clustering solution. The use of a priority queue helps prioritize the reassignment of points to clusters with the least impact on their quality.
Additionally, you should set an appropriate stopping criterion to prevent the algorithm from running indefinitely, and you may need to handle situations where it is impossible to meet the cluster size constraint while maintaining good cluster quality by relaxing the constraint or considering alternative clustering methods.
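For concreteness, the capped-assignment idea the transcript describes can be sketched as a greedy heuristic: at each iteration, points are assigned to the nearest centroid that still has room, processed in order of how strongly they prefer their best centroid over their second-best. This is my own illustrative sketch, not ChatGPT's output or an optimal method; the function name and the ordering rule are assumptions, and it assumes `k * max_size >= len(X)` so every point can be placed.

```python
import numpy as np

def capacitated_kmeans(X, k, max_size, n_iter=100, seed=0):
    """Greedy sketch of k-means with a per-cluster size cap.

    Heuristic only: points with the widest margin between their best and
    second-best centroid are placed first, so they are least likely to be
    bumped to a worse cluster. Assumes k * max_size >= len(X).
    """
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        # Pairwise distances: shape (n_points, k).
        d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        # Sort points so those that prefer their nearest centroid most
        # strongly (largest gap to second-nearest) are assigned first.
        two_smallest = np.partition(d, 1, axis=1)
        order = np.argsort(two_smallest[:, 0] - two_smallest[:, 1])
        counts = np.zeros(k, dtype=int)
        new_labels = np.empty(len(X), dtype=int)
        for i in order:
            # Nearest centroid that still has capacity.
            for c in np.argsort(d[i]):
                if counts[c] < max_size:
                    new_labels[i] = c
                    counts[c] += 1
                    break
        if np.array_equal(new_labels, labels):
            break  # converged: assignments unchanged
        labels = new_labels
        for c in range(k):
            if counts[c] > 0:
                centroids[c] = X[labels == c].mean(axis=0)
    return labels, centroids
```

The cap is enforced by construction, which makes the original commenter's objection visible in code: a point can be shoved into a distant cluster simply because its natural cluster filled up first, and nothing here reasons about whether that reassignment was actually warranted.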
Also, what you're doing here is asking ChatGPT for the answer. ChatGPT is more effective via collaboration. Meaning instead of shoving the entire problem down its throat and asking it to solve it, you ask it for advice. Ask it for bits and pieces of things.