fazlerocks | Better HN

fazlerocks

149 karma | Joined September 10, 2014 | 252 submissions

Exploring how AI agents can make software more resilient.

Also co-founded Hashnode (a prominent dev publishing platform with 5M MAU).

Recent submissions

Structural steel estimating: the steps were never the hard part

(bidferra.com)

2 points | fazlerocks | 9d ago | 0 comments

Which AI tools have you used every day for the past year?

For me:

1. ChatGPT desktop. This is the one I probably use the most without even thinking about it.

2. Gemini. I used 2.5 a lot for content stuff and recently started trying Gemini 3. The results are actually getting better now.

3. Codex in VS Code. Switched from Copilot about three months ago. Codex feels way better for me. The cloud task execution is wild, it can build an entire Next.js feature in one prompt.

4. ChatGPT on mobile. Still my go-to for quick grammar or English fixes. I keep forgetting Apple’s AI even exists.

5. Superhuman's AI. I use it here and there to clean up replies. Nice to have but not something I’d miss if I stopped paying.

What about you?

1 point | fazlerocks | 7mo ago | 0 comments

Playwright new Test Agents explained

(bug0.com)

1 point | fazlerocks | 8mo ago | 0 comments

Ask HN: What's your 2025 "quality stack"?

Feels like we’re all shipping faster, but the old “just write more tests” advice isn’t keeping up.

Teams are smaller, release cycles are tighter, and AI is sneaking into a lot of workflows.

I’m curious what people here are actually relying on these days to keep things from breaking:

- What layers are in your stack? (types/linters, unit, contract, integration, E2E, monitoring, flags, SLOs, etc.)

- Is AI playing a real role for you yet (test gen, self-healing, triage, anomaly detection)?

- Anything you dropped recently because it wasn’t worth the effort? (flaky UI tests, snapshot tests, staging envs…)

- For smaller teams, do you still bother with classic QA, or do you lean more on flags/observability/canaries?

- Anyone tried managed or AI-assisted QA instead of DIY? Curious if it actually worked, esp. around trust/cost/lock-in.

- How do you measure “confidence to release” beyond code coverage?

Would love to hear quick snapshots like:

- team size / release cadence

- stack (web, mobile, regulated or not)

- pre-merge checks

- post-deploy safeguards

- tools you kept vs abandoned

- biggest source of flakiness right now

- what you’d do differently if starting today

Looking for real, on-the-ground stories from folks shipping in 2025. What’s working for you?

1 point | fazlerocks | 10mo ago | 0 comments

Show HN: Mail42 – Disposable emails with AI-based text extraction

Hi HN,

I built Mail42 (https://mail42.ai) to make it easier to test email flows without dealing with regex or messy parsing.

The pain point: whenever I tested signup or checkout flows, I had to create throwaway inboxes and then write hacky regex just to grab a 6-digit OTP or a verification link.
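For reference, the regex approach described above tends to look something like the sketch below. The patterns, helper names, and sample email body are made up for illustration; real emails usually need per-template tweaks, which is where it gets hacky.

```python
import re
from typing import Optional

# Exactly six digits, not embedded in a longer run of digits.
OTP_PATTERN = re.compile(r"(?<!\d)(\d{6})(?!\d)")
# First http(s) URL, stopping at whitespace, quotes, or angle brackets.
LINK_PATTERN = re.compile(r"https?://[^\s\"'<>]+")


def extract_otp(body: str) -> Optional[str]:
    """Return the first standalone 6-digit code in the body, if any."""
    match = OTP_PATTERN.search(body)
    return match.group(1) if match else None


def extract_first_link(body: str) -> Optional[str]:
    """Return the first URL in the body, if any."""
    match = LINK_PATTERN.search(body)
    return match.group(0) if match else None


# Hypothetical email body, only for demonstration.
sample_body = (
    "Your code is 847291. "
    "Or verify here: https://example.com/verify?token=abc123"
)
```

This works until a template adds an order number, a phone number, or a second link, at which point the patterns have to be tightened per sender.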

With Mail42 you can:

- Generate disposable email addresses instantly for QA

- Query emails using natural language like "get the OTP" or "find the verification link"

- Use a simple REST API that works with curl, Postman, or your test suite

Example:

`curl "https://get.mail42.ai/?email=test.123@mail42.ai&prompt=get%20otp"`

Response:

847291
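From a test suite, the same call could look roughly like this sketch. It uses only the endpoint and the `email` and `prompt` parameters from the curl example, and it assumes the response body is plain text (as the sample response suggests) with no authentication required. The helper names are hypothetical.

```python
from urllib.parse import quote, urlencode
from urllib.request import urlopen

# Endpoint taken from the curl example above.
BASE_URL = "https://get.mail42.ai/"


def build_query_url(email: str, prompt: str) -> str:
    """Build the request URL with percent-encoded query parameters."""
    query = urlencode({"email": email, "prompt": prompt}, quote_via=quote)
    return f"{BASE_URL}?{query}"


def ask_inbox(email: str, prompt: str, timeout: float = 10.0) -> str:
    """Send the natural-language query and return the response text.

    Assumption: the API answers with a plain-text body such as "847291".
    """
    with urlopen(build_query_url(email, prompt), timeout=timeout) as response:
        return response.read().decode("utf-8").strip()
```

In a signup test this would be something like `otp = ask_inbox("test.123@mail42.ai", "get otp")`, followed by typing the code into the form. Encoding the prompt matters because raw spaces in a URL are rejected by some HTTP clients.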

It’s lightweight and intended for testing environments.

I’d love your feedback:

- would you use this in your testing workflow?

- what features or integrations would make it more useful?

- are there cases where regex still feels more reliable?

Thanks for checking it out.

1 point | fazlerocks | 10mo ago | 0 comments

Ask HN: What tools are you using for AI evals? Everything feels half-baked

We're running LLMs in production for content generation, customer support, and code review assistance. Been trying to build a proper evaluation pipeline for months but every tool we've tested has significant limitations.

What we've evaluated:

- OpenAI's Evals framework: Works well for benchmarking but challenging for custom use cases. Configuration through YAML files can be complex and extending functionality requires diving deep into their codebase. Primarily designed for batch processing rather than real-time monitoring.

- LangSmith: Strong tracing capabilities but eval features feel secondary to their observability focus. Pricing starts at $0.50 per 1k traces after the free tier, which adds up quickly with high volume. UI can be slow with larger datasets.

- Weights & Biases: Powerful platform but designed primarily for traditional ML experiment tracking. Setup is complex and requires significant ML expertise. Our product team struggles to use it effectively.

- Humanloop: Clean interface focused on prompt versioning with basic evaluation capabilities. Limited eval types available and pricing is steep for the feature set.

- Braintrust: Interesting approach to evaluation but feels like an early-stage product. Documentation is sparse and integration options are limited.

What we actually need:

- Real-time eval monitoring (not just batch)

- Custom eval functions that don't require PhD-level setup

- Human-in-the-loop workflows for subjective tasks

- Cost tracking per model/prompt

- Integration with our existing observability stack

- Something our product team can actually use
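To make the ask concrete, here is a minimal sketch of what "custom eval functions" plus "cost tracking per model/prompt" could mean when scored per call rather than in batch. Everything in it is an assumption for illustration: the `LLMCall` record, the `EvalTracker` class, the model name, and the token prices are hypothetical and not taken from any of the tools listed.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class LLMCall:
    """One production LLM call (hypothetical record shape)."""
    model: str
    prompt_id: str
    output: str
    input_tokens: int
    output_tokens: int


# Hypothetical prices in USD per 1M tokens: (input, output).
PRICES = {"model-a": (1.0, 2.0)}

# An eval is any function mapping a call to a score in [0, 1].
EvalFn = Callable[[LLMCall], float]


def max_length_eval(limit: int) -> EvalFn:
    """Build an eval that passes when the output fits within `limit` chars."""
    def check(call: LLMCall) -> float:
        return 1.0 if len(call.output) <= limit else 0.0
    return check


def call_cost(call: LLMCall) -> float:
    """Cost of one call in USD, based on the PRICES table."""
    in_price, out_price = PRICES[call.model]
    return (call.input_tokens * in_price + call.output_tokens * out_price) / 1_000_000


class EvalTracker:
    """Scores each call as it arrives and accumulates cost per (model, prompt)."""

    def __init__(self, evals: Dict[str, EvalFn]):
        self.evals = evals
        self.cost = defaultdict(float)
        self.scores = defaultdict(list)

    def record(self, call: LLMCall) -> None:
        key = (call.model, call.prompt_id)
        self.cost[key] += call_cost(call)
        for name, fn in self.evals.items():
            self.scores[(key, name)].append(fn(call))

    def mean_score(self, model: str, prompt_id: str, name: str) -> Optional[float]:
        values = self.scores.get(((model, prompt_id), name), [])
        return sum(values) / len(values) if values else None
```

Calling `record` from the request path (or from a log consumer) gives running scores and spend per model/prompt pair. Human review and dashboard integration would sit on top of this and are not sketched here.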

Current solution:

Custom scripts + monitoring dashboards for basic metrics. Weekly manual reviews in spreadsheets. It works but doesn't scale and we miss edge cases.

Has anyone found tools that handle production LLM evaluation well? Are we expecting too much or is the tooling genuinely immature? Especially interested in hearing from teams without dedicated ML engineers.

6 points | fazlerocks | 1y ago | 3 comments

Ask HN: What's the most overengineered tool everyone uses but won't admit sucks?

Looking to build something open source and trying to figure out which tools everyone pretends to love but actually hates.

I'll start. Jira. We all use it, we all hate it, nobody admits how much time we waste updating tickets.

Did you move it to the right column? Story points aren't filled out. Link it to the epic.

Meanwhile the actual work takes 2 hours, documenting it takes another hour.

Half the team ignores it, the other half are obsessed with workflows that have 47 different statuses. But try suggesting GitHub issues and suddenly "how will we track velocity??"

What tool is supposed to make you productive but just creates busywork?

28 points | fazlerocks | 1y ago | 42 comments

Ask HN: What's your serverless stack for AI/LLM apps in production?

I've been building AI applications using Next.js, GPT, and LangChain. As I'm approaching production scale, I'm curious how others are handling deployment infrastructure.

Current stack:

- Next.js on Vercel

- Serverless functions for AI/LLM endpoints

- Pinecone for vector storage

Questions for those running AI in production:

1. What's your serverless infrastructure choice? (Vercel/Cloud Run/Lambda)

2. How are you handling state management for long-running agent tasks?

3. What's your approach to cost optimization with LLM API calls?

4. Are you self-hosting any components?

5. How are you handling vector store scaling?

Particularly interested in hearing from teams who've scaled beyond prototype stage. Have you hit any unexpected limitations with serverless for AI workloads?

3 points | fazlerocks | 1y ago | 3 comments

Show HN: Awesome Docs Gallery – crowdsourced list of the best dev docs

(awesome-docs.gallery)

1 point | fazlerocks | 1y ago | 0 comments

Show HN: Docs by Hashnode, we built another docs product in 2024, here's why

Hey HN,

I’m not sure if this post will stick (we’re competing with some YC-backed startups in this space), but I’ll give it a try!

We built Docs by Hashnode because we saw a gap in current documentation tools. Many platforms are either too rigid and lack customization (like ReadMe) or require too much dev time to manage (like Docusaurus). With our docs product, we wanted to create something that’s both flexible and scalable, allowing teams to focus on creating docs without the complexity.

Here’s what we’re solving:

1. Most doc platforms don’t scale well or offer customization.

2. Teams often struggle with rigid templates and version control.

3. Companies need API references and product guides that grow with their product, not against it.

Our solution:

1. Offer both Hosted and Headless mode with GraphQL support for those who need more control.

2. Real-time collaboration with inline commenting, perfect for technical and non-technical teams to work together.

3. AI-powered search for faster, smarter discovery of docs content.

4. Unlimited API references and guides to help you scale your docs as your product evolves.

5. Create API references easily using OpenAPI specs.

6. Blazing-fast performance optimized for SEO and Lighthouse scores.

Our goal is to build documentation that evolves with your product, not something that slows you down. Some early users, including YC startups, are already using it and love the flexibility.

Would love to get your feedback or answer any questions!

Thanks for reading! Excited to hear what the community here thinks.

https://hashnode.com/products/docs

3 points | fazlerocks | 1y ago | 1 comment

Show HN: We built a developer-friendly Headless CMS for blogs

(hashnode.com)

5 points | fazlerocks | 2y ago | 2 comments

Why Postgres is winning over MySQL?

(youtube.com) Video

3 points | fazlerocks | 2y ago | 0 comments

The ideal headless CMS for blogs

(hashnode.com)

2 points | fazlerocks | 2y ago | 0 comments

Headless Blogs by Hashnode: Open-Source Next.js Starter Kit and GraphQL APIs

(hashnode.com)

1 point | fazlerocks | 2y ago | 0 comments

Show HN: ChatGQL – Natural Language Conversations with GraphQL APIs

(chatgql.com)