by sixhobbits on 8/11/25, 2:03 PM with 504 comments
by epiccoleman on 8/11/25, 2:30 PM
This sort of thing is a great demonstration of why I remain excited about AI in spite of all the hype and anti-hype. It's just fun to mess with these tools, to let them get friction out of your way. It's a revival of the feelings I had when I first started coding: "wow, I really can do anything if I can just figure out how."
Great article, thanks for sharing!
by cultofmetatron on 8/11/25, 6:14 PM
by js2 on 8/11/25, 3:43 PM
FYI, this can be shortened to:
IS_SANDBOX=1 claude --dangerously-skip-permissions
You don't need the export in this case, nor does it need to be two separate commands joined by &&. (It's semantically different in that the variable is set only for the single `claude` invocation, not for any commands which follow. That's often what you want, though.)

> I asked Claude to rename all the files and I could go do something else while it churned away, reading the files and figuring out the correct names.
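For anyone unsure of the difference, a quick sketch using a throwaway variable rather than `IS_SANDBOX`:

```shell
# Prefix form: the variable exists only in the environment of that one command.
FOO=1 sh -c 'echo "during: ${FOO:-unset}"'   # prints "during: 1"
echo "after: ${FOO:-unset}"                  # prints "after: unset"

# Export form: the variable persists for every command that follows.
export BAR=1
sh -c 'echo "during: ${BAR:-unset}"'         # prints "during: 1"
echo "after: ${BAR:-unset}"                  # prints "after: 1"
```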
It's got infinite patience for performing tedious tasks manually and will gladly eat up all your tokens. When I see it doing something like this manually, I stop it and tell it to write a program to do the thing I want. e.g. I needed to change the shape of about 100 JSON files the other day and it wanted to go through them one-by-one. I stopped it after the third file, told it to write a script to import the old shape and write out the new shape, and 30 seconds later it was done. I also had it write me a script to... rename my stupidly named bank statements. :-)
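The kind of throwaway reshape script js2 describes is tiny to write; the file layout and the old/new shapes below are made up purely for illustration:

```python
import json
import pathlib

def reshape(old: dict) -> dict:
    """Hypothetical transform: hoist fields out of a nested 'user' object."""
    return {"id": old["user"]["id"],
            "name": old["user"]["name"],
            "balance": old.get("balance", 0)}

def reshape_all(directory: str) -> int:
    """Rewrite every *.json file in place; returns the number of files changed."""
    count = 0
    for path in pathlib.Path(directory).glob("*.json"):
        data = json.loads(path.read_text())
        path.write_text(json.dumps(reshape(data), indent=2))
        count += 1
    return count
```

Thirty seconds of script beats a hundred token-hungry one-by-one edits.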
by tptacek on 8/11/25, 3:16 PM
Really, any coding agent our shop didn't write itself, though in those cases the smiting might be less theatrical than if you literally ran a yolo-mode agent on a prod server.
by spyder on 8/12/25, 10:58 AM
Yeah, I knew that was the case when I clicked on the thumbnails, couldn't close the image, and had to reload the whole page. The good thing is that you could just ask the AI to fix this; the bad thing is that you assumed it would produce fully working code in one shot and didn't test it properly.
by dabedee on 8/11/25, 2:37 PM
by chaosprint on 8/11/25, 2:36 PM
In fact, I now prefer to use a purely chat window to plan the overall direction and let LLM provide a few different architectural ideas, rather than asking LLM to write a lot of code whose detail I have no idea about.
by serf on 8/11/25, 9:54 PM
Sure, go have fun with the new software -- but for God's sake don't actually depend on a company that can't be bothered to reply to you. Even Amazon replies.
by jrflowers on 8/11/25, 3:33 PM
The future is vibe coding, but what some people don't yet appreciate is what that vibe is: a Pachinko machine permanently inserted between the user and the computer. It's wild to think that anybody got anything done without the thrill of feeding quarters into the computer and seeing if the ball lands on "post on Reddit" or "delete database".
by throwaway-11-1 on 8/11/25, 7:11 PM
by _pdp_ on 8/11/25, 3:13 PM
Using coding agents is great, btw, but at least learn how to double-check their work, because they are also quite terrible.
by felineflock on 8/11/25, 3:56 PM
by bodge5000 on 8/11/25, 10:17 PM
Otherwise, good article. I'm still not sure vibe coding is for me, and at the price it's hard to justify trying it out, but things like this do make me a little more tempted to give it a shot. I doubt it'd ever replace writing code by hand for me, but it could be fun for prototyping, I suppose.
by g42gregory on 8/11/25, 5:46 PM
It absolutely boggles my mind that anybody thinks this is OK.
Unless you are in North Korea, of course.
by mdasen on 8/11/25, 3:17 PM
by andrewstuart on 8/11/25, 4:27 PM
I throw their results at each other and get them to debug and review each other's work.
Often I get all three to write the code for a given need, then ask all three to review all three answers to find the best solution.
If I'm building something sophisticated, there might be 50 cycles of three-way code review until they all agree there are no critical problems.
There's no way I could do without all three at the same time; it's essential.
by alecco on 8/11/25, 7:46 PM
This is why we can't have nice things. Anthropic is placing more restrictive limits and now you risk being locked out for hours if you need to use it a bit more than usual (e.g. you have an impending deadline or presentation).
I wish Anthropic would just ban these abusive accounts instead of placing draconian (and fuzzy) limits. The other day an idiot YouTube streamer was actively trying to hit the limits with as many concurrent Claude Code sessions as he could, doing nonsense projects.
by 1gn15 on 8/11/25, 3:58 PM
Repo: https://github.com/sixhobbits/hn-comment-ranker
I need to modify this to work with local models, though. But it does illustrate the article's point -- we both had an idea, but only one of us actually went ahead and did it, because they're more familiar with agentic coding than I am.
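I haven't looked at the repo's internals, but the usual route to local models is pointing an OpenAI-compatible client at a local server (ollama, llama.cpp's server, and vLLM all expose one). Everything below -- the port, the model name, the prompt -- is an assumption to adapt:

```python
import json
import urllib.request

def build_request(comment: str,
                  base_url: str = "http://localhost:11434/v1",  # assumed ollama default
                  model: str = "llama3"):                       # hypothetical model name
    """Build a chat-completions request for an OpenAI-compatible local server."""
    payload = {"model": model,
               "messages": [{"role": "user",
                             "content": f"Rate this HN comment from 1-10: {comment}"}]}
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"})

def rank_comment(comment: str, **kw) -> str:
    """Send the request and pull the model's reply out of the response."""
    with urllib.request.urlopen(build_request(comment, **kw)) as resp:
        return json.loads(resp.read())["choices"][0]["message"]["content"]
```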
[1] Oh. I think I understand why. /lh
by t0md4n on 8/11/25, 3:34 PM
by rglover on 8/11/25, 4:52 PM
by dazzaji on 8/12/25, 4:18 AM
by wedn3sday on 8/11/25, 4:42 PM
by umvi on 8/11/25, 10:21 PM
by interpol_p on 8/12/25, 12:55 AM
Using it for iOS development is interesting. It does produce working output (sometimes!), but it's very hit-or-miss. Recently I gave it a couple of hours to build a CarPlay prototype of one of my apps. It was completely unable to refactor the codebase to correctly support CarPlay (even though I passed the entire CarPlay documentation into it). I gave it three attempts. Then I intervened, added support for CarPlay manually, and added a lot of skeleton code for it to flesh out. Claude was then able to build a prototype.
However, over the next few days, as I tried to maintain the code, I ended up rewriting 60% of it because it was not maintainable or correct. (By "not correct" I mean it had logic errors and was updating the display multiple times with incorrect information before replacing it with correct information, causing the displayed data to refresh at random.)
I also tried getting it to add some new screens to a game I develop. I wanted it to add some of the purchase flows into the app (boring code that I hate writing). It managed to do it, but with compile errors, and it was unable to fix its own build output despite having the tools to do so. Instead of fixing the build errors it caused, Claude Code decided to verify that its own changes were correct by running `swiftc` on only the files it had touched. Which was nonsense.
All that said, there was a benefit: Claude Code writing all this code and getting something up on the screen motivated me to finally pick up the work and do some of these tasks. I had been putting them off for months, and just having the work "get started," no matter how badly, was a good kick-start.
by codeulike on 8/12/25, 9:44 AM
Because I think "sending everything to the AI" would be a bit of an obstacle for most company environments.
by cloudking on 8/11/25, 6:45 PM
Tried Cursor, Windsurf and always ran into tool failures, edit failures etc.
by prmph on 8/11/25, 9:24 PM
Wow, the danger is not so much from Claude Code itself, but that it might download a package that will do nasty things on your machine when executed.
by e-brake on 8/12/25, 11:06 AM
by cyral on 8/12/25, 3:13 AM
Regarding some of the comments here: I found the article style fine, and I even like the "follow my journey" style of writing, as it helps the reader understand the process you went through. That kind of engineering and debugging workflow is something I enjoy about this industry.
by doppelgunner on 8/11/25, 2:07 PM
by jofer on 8/11/25, 3:00 PM
However, I disagree that LLMs are anywhere near as good as what's described here for most things I've worked with.
So far, I'm pretty impressed with Cursor as a toy. It's not a usable tool for me, though. I haven't used Claude a ton, though I've seen co-workers use it quite a bit. Maybe I'm just not embracing the full "vibe coding" thing enough and not allowing AI agents to fully run wild.
I will concede that Claude and Cursor have gotten quite good at frontend web development generation. I don't doubt that there are a lot of tasks where they make sense.
However, I still have yet to see a _single_ example of any of these tools working for my domain. Every single case, even when the folks who are trumpeting the tools internally run the prompting/etc, results in catastrophic failure.
The ones people trumpet internally are cases where folks can't be bothered to learn the libraries they're working with.
The real issue is that people who aren't deeply familiar with the domain don't notice the problems with the changes LLMs make. They _seem_ reasonable. Essentially by definition.
Despite this, we are being nearly forced to use AI tooling on critical production scientific computing code. I have been told I should never be editing code directly and been told I must use AI tooling by various higher level execs and managers. Doing so is 10x to 100x slower than making changes directly. I don't have boilerplate. I do care about knowing what things do because I need to communicate that to customers and predict how changes to parameters will affect output.
I keep hearing things described as an "overactive intern", but I've never seen an intern this bad, and I've seen a _lot_ of interns. Interns don't make 1000 line changes that wreck core parts of the codebase despite being told to leave that part alone. Interns are willing to validate the underlying mathematical approximations to the physics and are capable of accurately reasoning about how different approximations will affect the output. Interns understand what the result of the pipeline will be used for and can communicate that in simple terms or more complex terms to customers. (You'd think this is what LLMs would be good at, but holy crap do they hallucinate when working with scientific terminology and jargon.)
Interns have PhDs (or in some cases, are still in grad school, but close to completion). They just don't have much software engineering experience yet. Maybe that's the ideal customer base for some of these LLM/AI code generation strategies, but those tools seem especially bad in the scientific computing domain.
My bottleneck isn't how fast I can type. My bottleneck is explaining to a customer how our data processing will affect their analysis.
(To our CEO) - Stop forcing us to use the wrong tools for our jobs.
(To the rest of the world) - Maybe I'm wrong and just being a luddite, but I haven't seen results that live up to the hype yet, especially within the scientific computing world.
by hungryhobbit on 8/11/25, 2:41 PM
I thought the article was a satire after I read this ... but it wasn't!
by ramesh31 on 8/11/25, 2:35 PM
by SuperSandro2000 on 8/11/25, 8:58 PM
by devmor on 8/11/25, 2:53 PM
The author didn't do anything actually useful or impactful, they played around with a toy and mimicked a portion of what it's like to spin up pet projects as a developer.
But hey, it could be that this says something after all. The first big public uses of AI were toys, and they largely served as a sideshow attraction for amused netizens. Maybe we haven't come very far at all, in comparison to the resources spent. It seems like all of the truly impressive and useful applications of this technology are still in specialized private-sector work.
by lvl155 on 8/11/25, 3:04 PM
by aantix on 8/11/25, 4:12 PM
Are there internal guardrails within Claude Code to prevent such incidents?
`rm -rf`, `DROP DATABASE`, etc.?
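There are configurable permission rules: a project-level `.claude/settings.json` can deny specific Bash patterns. The rule syntax below is from memory of the settings docs, so verify it against the current documentation -- and note that `--dangerously-skip-permissions` is designed to skip prompting, so test what your version actually still enforces:

```json
{
  "permissions": {
    "deny": [
      "Bash(rm -rf:*)",
      "Bash(dropdb:*)",
      "Bash(psql:*)"
    ]
  }
}
```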
by vibecoding-grft on 8/11/25, 5:20 PM
by not_a_bot_4sho on 8/11/25, 3:09 PM
by anotherpaul on 8/12/25, 6:29 AM
by javier_e06 on 8/12/25, 2:28 PM
https://www.media.mit.edu/articles/a-i-is-homogenizing-our-t...
by einpoklum on 8/13/25, 12:47 PM
This is something like a third of your total gross income, if we take a median over the people of the world.
by varispeed on 8/11/25, 8:32 PM
by alberth on 8/11/25, 3:00 PM
(Sure, I could let them use my credentials but that isn’t really legit/fair use.)
by meistertigran on 8/12/25, 6:36 PM
Instead of giving it a VPS, I just made a tool that allows synchronization of localStorage data.
I now just upload the HTML it generates and have an app instantly.
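meistertigran doesn't share the tool, but the idea is simple enough to sketch: a tiny key-value server the generated HTML app can POST its localStorage dump to and GET it back from later. Everything here (the route scheme, the port) is hypothetical:

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

STORE = {}  # path (e.g. "/some-user-id") -> localStorage snapshot

class SyncHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # The page sends JSON.stringify(localStorage) as the request body.
        length = int(self.headers["Content-Length"])
        STORE[self.path] = json.loads(self.rfile.read(length))
        self.send_response(204)
        self.end_headers()

    def do_GET(self):
        # Hand back the last snapshot stored under this path (or an empty one).
        body = json.dumps(STORE.get(self.path, {})).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet

# To run: HTTPServer(("localhost", 8080), SyncHandler).serve_forever()
```

On the page side, a few lines of `fetch` in the generated HTML would round-trip `localStorage` through this endpoint.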
by rcvassallo83 on 8/11/25, 10:57 PM
Sir, do you realize that CRUD is such a solved problem that popular MVC frameworks from over a decade ago generate it for you from templates? No wasteful LLM prompting required.
by darqis on 8/11/25, 8:15 PM
by nickradford on 8/11/25, 6:09 PM
by howToTestFE on 8/11/25, 10:39 PM
by buyx on 8/12/25, 5:24 AM
by visarga on 8/11/25, 4:09 PM
by amelius on 8/11/25, 11:03 PM
by mdrzn on 8/12/25, 11:13 AM
by esafak on 8/11/25, 3:22 PM
by siva7 on 8/11/25, 3:06 PM
I think i'm done with this community in the age of vibe coding. The line between satire, venture capitalism, business idea guys and sane tech enthusiasts is getting too blurry.
by dangoodmanUT on 8/12/25, 2:49 PM
by ramoz on 8/11/25, 6:33 PM
by sgt101 on 8/11/25, 4:48 PM
by wolvesechoes on 8/12/25, 9:07 AM
by jmull on 8/12/25, 12:47 AM
A frustration of using tools is that they never act exactly the way you want... instead of the tool working the way you want, you have to work the way it wants (and before that, you have to figure out what that is).
...We're stuck with this, because it's just not feasible to build custom software for each person, that works exactly the way they want.
...Or is it?
I'm intrigued by the possibility that coding models do in fact make it feasible to have software customized exactly to what I want.
Of course, that includes the coding agent, so no need for Claude Code.
by Pomelolo on 8/14/25, 11:36 AM
by zb3 on 8/11/25, 3:07 PM
by ontigola on 8/12/25, 5:53 AM
by burntpineapple on 8/11/25, 2:31 PM
by almosthere on 8/11/25, 4:19 PM