from Hacker News

"I can't do that, Dave" – No agent yet

by freediver on 3/8/26, 6:21 PM with 26 comments

  • by rdiddly on 3/8/26, 6:46 PM

    Boy, that was fragmented. What should I have done for years leading up to today to prepare for reading this? Gaming? Doomscrolling social media? Chugging Mountain Dew? Reading poetry?
  • by Frost1x on 3/8/26, 6:50 PM

    It’s interesting: I’m seeing some emerging conversations where users prefer general agents that go along with their own biases over more constrained or purpose-built agents. They either have certain arbitrary goal criteria forced on them or want to force those criteria on the agent, and general-purpose agents tend to do well at this because they just trudge along and do whatever.

    Meanwhile, more specialized agents that try to add or enforce constraints around a problem space where certain aspects are well established don’t sit well with a lot of users. “No, you and general knowledge don’t know best, I know best… do this.”

    I can see the use case for both, but I’m seeing a whole lot more appetite for confirmation bias: people essentially want to automate away parts of jobs and tasks they already do, but in the personalized or opinionated way they’ve established, and they are unwilling to explore alternative options.

    So the general-purpose agent setups that just kick off whatever they can tend to fare best in terms of positive feedback from agent users. Meanwhile, that to some degree ignores many of the potential benefits of having agents with general knowledge that are bounded by generally established limits. It’s basically the whole “please do parts of my job for me but only the way I want them done.”

    People aren’t ready to be wrong or to change; they just want to automate parts of their processes away. So I’m not sure “no” is going to sit well with a lot of people.

  • by dvfjsdhgfv on 3/8/26, 6:33 PM

    Actually, I was once very surprised when testing one of the recent non-thinking Qwen models: it said "I'm sorry, this project is too complex, I can't do that". I was very impressed by this answer. So far, it is the only model that has ever reacted to this task that way. The remaining ones agreed to proceed and failed.
  • by speedgoose on 3/8/26, 6:42 PM

    For coding, perhaps. For general-purpose use, current models know how and when to refuse: politics, sexual taboos, drugs, …

    Perhaps we should train them to refuse developing more [insert your most hated stack here].

  • by ElFitz on 3/8/26, 6:45 PM

    I’ve been running a Claude Code "thing" in a loop for a few days, and that has been extremely frustrating.

    But after tons of nudging it has started developing a sort of "improvement engine", as it calls it, for itself to help address that.

    It goes through its own logs and sessions, documents and keeps track of patterns and signals and their associated strategies, then regularly evaluates their impact, independently of the agent itself, and feeds those back to it in each loop.

    It’s been quite fascinating to watch.
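
    A rough sketch of what such a feedback loop might look like, purely as an illustration: the class and method names below (`StrategyLog`, `record`, `success_rates`, `feedback`) are hypothetical and not taken from Claude Code or from the setup described above. It records which strategy was used in each loop iteration and whether it worked, scores the strategies outside the agent, and produces a short note that could be prepended to the next prompt.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class StrategyLog:
    """Hypothetical tracker: strategy name -> outcomes (True = iteration succeeded)."""

    outcomes: dict[str, list[bool]] = field(
        default_factory=lambda: defaultdict(list)
    )

    def record(self, strategy: str, succeeded: bool) -> None:
        # Called once per loop iteration, after the result is known.
        self.outcomes[strategy].append(succeeded)

    def success_rates(self, min_samples: int = 3) -> dict[str, float]:
        # Ignore strategies with too few observations to say anything.
        return {
            name: sum(results) / len(results)
            for name, results in self.outcomes.items()
            if len(results) >= min_samples
        }

    def feedback(self, threshold: float = 0.5) -> str:
        # Text summary meant to be fed back to the agent in the next loop.
        rates = self.success_rates()
        keep = sorted(n for n, r in rates.items() if r >= threshold)
        drop = sorted(n for n, r in rates.items() if r < threshold)
        return (
            f"Keep using: {', '.join(keep) or 'none'}. "
            f"Avoid: {', '.join(drop) or 'none'}."
        )
```

    The evaluation lives outside the agent on purpose, so the agent cannot grade its own work; the threshold and minimum sample count are arbitrary placeholders.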

  • by bronlund on 3/8/26, 7:02 PM

    «Think ultra deep and analyze this article. Make a detailed list of the top five alternatives as to what he is talking about.»
  • by porphyra on 3/8/26, 6:34 PM

    I would much rather have models that try and fail than have false refusals (which do happen and are really annoying).
  • by matchagaucho on 3/8/26, 6:57 PM

    Agents can propose refactoring just as readily as humans.

    If coding agents already read AGENTS.md before making changes, they can also maintain a TECHNICAL_DEBT.md checklist.

    Keep the loop intact: AGENTS.md ensures technical debt remains in context whenever changes are planned.
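
    As a minimal sketch of the checklist idea, assuming TECHNICAL_DEBT.md is a plain Markdown task list with `- [ ]` / `- [x]` items (that layout and the helper names `add_debt_item` and `open_debt_items` are assumptions for illustration, not an established convention):

```python
from pathlib import Path

DEBT_FILE = "TECHNICAL_DEBT.md"  # file name from the comment above


def add_debt_item(repo: Path, note: str) -> bool:
    """Append an unchecked item; return False if it is already recorded."""
    path = repo / DEBT_FILE
    if path.exists():
        existing = path.read_text(encoding="utf-8")
    else:
        existing = "# Technical debt\n\n"
    lines = existing.splitlines()
    # Skip duplicates, whether the item is still open or already ticked off.
    if f"- [ ] {note}" in lines or f"- [x] {note}" in lines:
        return False
    if not existing.endswith("\n"):
        existing += "\n"
    path.write_text(existing + f"- [ ] {note}\n", encoding="utf-8")
    return True


def open_debt_items(repo: Path) -> list[str]:
    """Return the text of every unchecked item, in file order."""
    path = repo / DEBT_FILE
    if not path.exists():
        return []
    prefix = "- [ ] "
    return [
        line[len(prefix):]
        for line in path.read_text(encoding="utf-8").splitlines()
        if line.startswith(prefix)
    ]
```

    An agent (or a pre-planning hook) could call `open_debt_items` to pull the outstanding items into context, and `add_debt_item` whenever it knowingly applies a workaround instead of a proper fix.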

  • by throwway262515 on 3/9/26, 12:13 AM

    Qwen, is that you?

    My experience with it is that it tends to create such 3-word sentences when asked to write an article.

  • by nilirl on 3/8/26, 7:00 PM

    Huh? The point of the article is that we should use git to store an LLM's output as it works?

    How do any of the quotes and citations used coherently form that argument?

    What is this writing style? Why does it feel like it doesn't want me to understand what the heck it's saying?

  • by grian42 on 3/8/26, 6:58 PM

    haven't read through the crap poetry lol nobody got time for that, but have experienced the same "i can't do that" - no agent yet. llms are very eager to apply a bodge fix or something rather than going "this design is shit consider changing it pal lol", which is the "fix" i did myself.