from Hacker News

GPT-5.2 derives a new result in theoretical physics

by davidbarker on 2/13/26, 7:20 PM with 397 comments

  • by outlace on 2/13/26, 7:36 PM

    The headline may make it seem like AI just discovered some new result in physics all on its own, but reading the post, humans started off trying to solve some problem; it got complex; GPT simplified it and found a solution with the simpler representation. It took 12 hours for GPT Pro to do this. In my experience LLMs can make new things when they are some linear combination of existing things, but I haven't been able to get them to do something totally out of distribution from first principles yet.
  • by square_usual on 2/13/26, 8:20 PM

    It's interesting to me that whenever a new breakthrough in AI use comes up, there's always a flood of people who come in to handwave away why this isn't actually a win for LLMs. Like with the novel solutions GPT 5.2 has been able to find for Erdős problems - many users here (even in this very thread!) think they know more about this than Fields medalist Terence Tao, who maintains this list showing that, yes, LLMs have driven these proofs: https://github.com/teorth/erdosproblems/wiki/AI-contribution...
  • by Davidzheng on 2/13/26, 7:36 PM

    "An internal scaffolded version of GPT‑5.2 then spent roughly 12 hours reasoning through the problem, coming up with the same formula and producing a formal proof of its validity."

    When I use GPT 5.2 Thinking Extended, it gives me the impression that it's consistent enough/has a low enough rate of errors (or enough error-correcting ability) to autonomously do math/physics for many hours if it were allowed to [but I guess Extended cuts off around the 30-minute mark and Pro maybe 1-2 hours]. It's good to see some confirmation of that impression here. I hope scientists/mathematicians at large will soon be able to play with tools that think at this time-scale and see how much capability these machines really have.

  • by cpard on 2/13/26, 8:49 PM

    AI can be an amazing productivity multiplier for people who know what they're doing.

    This result reminded me of the C compiler case that Anthropic posted recently. Sure, agents wrote the code for hours but there was a human there giving them directions, scoping the problem, finding the test suites needed for the agentic loops to actually work etc etc. In general making sure the output actually works and that it's a story worth sharing with others.

    The "AI replaces humans in X" narrative is primarily a tool for driving attention and funding. It works great for creating impressions and building brand value but also does a disservice to the actual researchers, engineers and humans in general, who do the hard work of problem formulation, validation and at the end, solving the problem using another tool in their toolbox.

  • by nilkn on 2/13/26, 8:03 PM

    It would be more accurate to say that humans using GPT-5.2 derived a new result in theoretical physics (or, if you're being generous, humans and GPT-5.2 together derived a new result). The title makes it sound like GPT-5.2 produced a complete or near-complete paper on its own, but what it actually did was take human-derived datapoints, conjecture a generalization, then prove that generalization. Having scanned the paper, this seems to be a significant enough contribution to warrant a legitimate author credit, but I still think the title on its own is an exaggeration.
  • by turzmo on 2/14/26, 2:00 AM

    Physicist here. Did you guys actually read the paper? Am I missing something? The "key" AI-conjectured formula (39) is an obvious generalization of (35)-(38), and something a human would have guessed immediately.

    (35)-(38) are the AI-simplified versions of (29)-(32). Those earlier formulae look formidable to simplify by hand, but they are also the sort of thing you'd try to use a computer algebra system for.

    I'm willing to (begrudgingly) admit the possibility for AI to do novel work, but this particular result does not seem very impressive.

    I picture ChatGPT as the rich kid whose parents privately donated to a lab to get their name on a paper for college admissions. In this case, I don't think I'm being too cynical in thinking that something similar is happening here and that the role of AI in this result is being well overplayed.

  • by amai on 2/15/26, 9:21 PM

    "There is no question that dialogue between physicists and LLMs can generate fundamentally new knowledge."

    That is what one of the authors says. It doesn't quite fit the headline of the post.

  • by Insanity on 2/13/26, 7:28 PM

    They also claimed ChatGPT solved novel Erdős problems when that wasn't the case. I'll take it with a grain of salt until there's more external validation. But very cool if true!
  • by qnleigh on 2/14/26, 5:31 PM

    I'm surprised to see that the valence of comments here is mostly negative. Nima Arkani-Hamed is one of the top living physicists, and he has nice things to say about the work. The fact that researchers can increasingly use these models to (help) find new results is a big deal, even considering the caveats.
  • by castigatio on 2/14/26, 10:05 AM

    I'm not sure where people think humans are getting these magical leaps of insight that transcend combinations of existing things. Magic? Ghost in the machine? The simplest explanation is that "leaps of insight" are simply novel combinations that demonstrate themselves to have some utility within the boundaries of a test case or objective.

    Snow + stick + need to clean driveway = snow shovel. Snow shovel + hill + desire for fun = sled

    At one point people were arguing that you could never get "true art" from linear programs. Now you get true art and people are arguing you can't get magical flashes of insight. The will to defend human intelligence / creativity is strong but the evidence is weak.

  • by mym1990 on 2/13/26, 9:14 PM

    Many innovations are built off cross-pollination of domains, and I think we are not too far off from having a loop where multiple agents grounded very well in specific domains can find intersections and optimizations by communicating with each other, especially if they are able to run for 12+ hours. The truth is that 99% of attempts at innovation will fail, but the 1% can yield something fantastic; the more attempts we can take, the faster progress will happen.
  • by elashri on 2/13/26, 7:48 PM

    Of all particle physics concepts, I would be less interested in scattering amplitudes as a test case, because they have one of the most concise definitions and their solution is straightforward (not easy, of course). So once you have a good grasp of QM and scattering, it is a matter of applying your knowledge of math to solve the problem. Usually the real problem is actually defining your parameters from your model and setting up the tree-level calculations. For an LLM to solve these is impressive, but the researchers defined everything and came up with the workflow.

    So I would read this (with more information available) with less emphasis on the LLM discovering a new result. The title is a little bit misleading, but "derives" is actually the operative word here, so it would be technically correct for people in the field.

  • by crorella on 2/13/26, 7:31 PM

  • by vbarrielle on 2/13/26, 8:29 PM

    I'm far from being an LLM enthusiast, but this is probably the right use case for this technology: conjectures which are hard to find, but whose proofs can then be checked with automated theorem provers. Isn't that what AlphaProof does, by the way?
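    A toy version of that conjecture-then-verify loop (a hypothetical sketch, not the paper's workflow: exact rational interpolation stands in for the conjecture step, and a polynomial identity check stands in for the theorem prover):

```python
from fractions import Fraction
from math import comb

def poly_mul(a, b):
    """Multiply two polynomials given as low-to-high coefficient lists."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ca in enumerate(a):
        for j, cb in enumerate(b):
            out[i + j] += ca * cb
    return out

def lagrange(points):
    """'Conjecture' a closed form by exact interpolation of base cases."""
    res = [Fraction(0)]
    for xi, yi in points:
        term = [Fraction(yi)]
        for xj, _ in points:
            if xj != xi:
                term = poly_mul(term, [Fraction(-xj, xi - xj), Fraction(1, xi - xj)])
        if len(term) > len(res):
            res += [Fraction(0)] * (len(term) - len(res))
        for i, c in enumerate(term):
            res[i] += c
    return res

def shift(p):
    """Return coefficients of p(x+1), used to verify an induction step."""
    out = [Fraction(0)] * len(p)
    for i, c in enumerate(p):
        for j in range(i + 1):
            out[j] += c * comb(i, j)
    return out

# "Base cases" computed by brute force: S(n) = 1^3 + ... + n^3.
base = [(n, sum(k ** 3 for k in range(1, n + 1))) for n in range(5)]
P = lagrange(base)  # conjectured closed form, n^2 (n+1)^2 / 4

# "Proof" by induction: P(0) = 0 and P(x+1) - P(x) = (x+1)^3 identically.
diff = [a - b for a, b in zip(shift(P), P)]
while diff and diff[-1] == 0:
    diff.pop()
assert P[0] == 0 and diff == [1, 3, 3, 1]  # coefficients of (x+1)^3
```

    The same shape as the paper's story: base cases supplied by humans, a generalization guessed from them, then a symbolic (rather than numerical) check of the guess.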
  • by computator on 2/13/26, 10:32 PM

    I have a weird long-shot idea for GPT to make a new discovery in physics: Ask it to find a mathematical relationship between some combination of the fundamental physical constants[1]. If it finds (for example) a formula that relates electron mass, Bohr radius, and speed of light to a high degree of precision, that might indicate an area of physics to explore further if those constants were thought to be independent.

    [1] https://en.wikipedia.org/wiki/List_of_physical_constants
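    A minimal sketch of that search (my own hand-picked constant list with CODATA 2018 values; a real attempt would need many more constants, higher precision, and guards against coincidences from multiple comparisons):

```python
import itertools
import math

# Brute-force small integer exponents over a handful of constants, keep only
# the dimensionless products, and inspect their values. Dimension vectors
# are SI exponents of (kg, m, s, A).
CONSTANTS = {
    "c":    (2.99792458e8,      (0, 1, -1, 0)),   # speed of light
    "hbar": (1.054571817e-34,   (1, 2, -1, 0)),   # reduced Planck constant
    "m_e":  (9.1093837015e-31,  (1, 0, 0, 0)),    # electron mass
    "e":    (1.602176634e-19,   (0, 0, 1, 1)),    # elementary charge
    "eps0": (8.8541878128e-12,  (-1, -3, 4, 2)),  # vacuum permittivity
    "a0":   (5.29177210903e-11, (0, 1, 0, 0)),    # Bohr radius
}

def dimensionless_products(max_exp=2):
    """Yield ({name: exponent}, value) for every dimensionless product."""
    names = list(CONSTANTS)
    for exps in itertools.product(range(-max_exp, max_exp + 1), repeat=len(names)):
        if not any(exps):
            continue
        dims = [0, 0, 0, 0]
        log_val = 0.0  # work in log space to avoid overflow
        for name, k in zip(names, exps):
            value, d = CONSTANTS[name]
            log_val += k * math.log10(value)
            for i in range(4):
                dims[i] += k * d[i]
        if all(x == 0 for x in dims):
            yield dict(zip(names, exps)), 10.0 ** log_val
```

    Among the hits, e² / (eps0 · ħ · c) ≈ 0.0917 = 4π·α, i.e. the search "rediscovers" the fine-structure constant, which is already a known relation; the hope in the comment above is that a combination thought to be independent might match to high precision.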

  • by JanisErdmanis on 2/14/26, 10:12 AM

    Such tedious derivations used to be the work of poor PhD students who were instrumentalized for such tasks. I envy those doing PhDs in theoretical physics in the age of AI; people can learn so much about their field more quickly via chat than by reading dense papers.
  • by singularfutur on 2/14/26, 11:11 AM

    Humans did the actual work: framing the problem, computing base cases, verifying results. GPT just refactored a formula. That's a compiler's job, not a physicist's. Stop letting marketing write science headlines.
  • by giantg2 on 2/14/26, 11:54 AM

    GPT-5.2 can't even process a 1-2 page PDF and give me a subset of the content as a formatted Word doc. Nor can it even be truthful about its own capabilities.
  • by kaelandt on 2/14/26, 1:43 PM

    Misleading title: it's more like GPT-5.2 derives the generalization of a formula that physicists conjectured. Not really related to physics.
  • by sciencejerk on 2/14/26, 5:01 AM

    "An internal scaffolded version of GPT‑5.2..."

    Any reason to believe that public versions of GPT-5.2 could have accomplished this task? "scaffolded" is a very interesting word choice

  • by smj-edison on 2/13/26, 10:58 PM

    Regardless of whether this means AGI has been achieved or not, I think this is really exciting since we could theoretically have agents look through papers and work on finding simpler solutions. The complexity of math is dizzying, so I think anything that can be done to simplify it would be amazing (I think of this essay[1]), especially if it frees up mathematicians' time to focus even more on the state of the art.

    [1] https://distill.pub/2017/research-debt/

  • by PlatoIsADisease on 2/13/26, 8:44 PM

    I'll read the article in a second, but let me guess ahead of time: Induction.

    Okay read it: Yep Induction. It already had the answer.

    Don't get me wrong, I love Induction... but we aren't having any revolutions in understanding with Induction.

  • by globalnode on 2/13/26, 11:01 PM

    Even if GPT's results are debatable and we sometimes dislike misapplications of AI where it's not needed, it feels as though another milestone is being reached. The first was when they were initially released and everyone was amazed. This second milestone seems to be that their competence has increased. I am often amazed at their output despite being a huge skeptic. I guess the fine-tuning is coming along well, but I still don't think we will see AGI from these chatbots, and I doubt there's a third milestone. The second was just a refinement of the first.
  • by major4x on 2/13/26, 9:54 PM

  • by gaigalas on 2/13/26, 8:09 PM

    I like the use of the word "derives". However, it gets outshone by "new result" in the public eye.

    I expect lots of derivations (new discoveries whose pieces were already in place somewhere, but no one had put them together).

    In this case, the human authors did the thinking and also used the LLM, but this could happen without the original human author too (some guy posts a partial result on the internet, no one realizes it is novel knowledge, and it gets reused by AI later). It would be tremendously nice if credit were kept in such scenarios.

  • by cagz on 2/14/26, 9:46 AM

    Does the article have a strong marketing vibe? Absolutely. Does the research performed move the needle, however small, in theoretical physics? Yes. Could we have expected this to happen a year ago? Not really.

    My personal opinion is that things will only accelerate from here.

  • by vonneumannstan on 2/13/26, 7:32 PM

    Interesting considering the Twitter froth recently about AI being incapable in principle of discovering anything.
  • by nxobject on 2/14/26, 3:10 AM

    Man, I'd be more worried about the impact of this on Mathematica than actual humans.
  • by another_twist on 2/13/26, 9:53 PM

    That's great. I think we need to start researching how to get cheaper models to do math. I have a hunch it should be possible to get leaner models to achieve these results with the right sort of reinforcement learning.
  • by longfacehorrace on 2/13/26, 8:00 PM

    Car manufacturers need to step up their hype game...

    New Honda Civic discovered Pacific Ocean!

    New F150 discovers Utah Salt Flats!

    Sure it took humans engineering and operating our machines, but the car is the real contributor here!

  • by snarky123 on 2/13/26, 7:38 PM

    So wait, GPT found a formula that humans couldn't, then the humans proved it was right? That's either terrifying or the model just got lucky. Probably the latter.
  • by emp17344 on 2/13/26, 8:22 PM

    Cynically, I wonder if this was released at this time to ward off any criticism from the failure of LLMs to solve the 1stproof problems.
  • by the_king on 2/14/26, 7:29 AM

    This is very impressive. But scrolling through the preprint, I wouldn't call any of it elegant.

    I'm not blaming the model here, but Python is much easier to read and more universal than math notation in most cases (especially for whatever's going on at the bottom of page four). I guess I'll have one translate the PDF.

  • by ares623 on 2/13/26, 8:21 PM

    I guess the important question is, is this enough news to sustain OpenAI long enough for their IPO?
  • by hackable_sand on 2/14/26, 9:28 AM

    Wonderful. Where's my money
  • by dadb00ty on 2/13/26, 11:19 PM

    But what does it all mean, Basil?
  • by baalimago on 2/13/26, 8:10 PM

    Well, anyone can derive a new result in anything. The question is more often whether the result makes any sense.
  • by pruufsocial on 2/13/26, 7:35 PM

    All I saw was gravitons and thought we’re finally here the singularity has begun
  • by user3939382 on 2/14/26, 8:51 AM

    I’m able to recover Schwarzschild using only known constants, starting with hydrogen, using a sort of calculator I made along these lines. No Schrödinger. There’s a lot there, so I'm working on what to publish.
  • by nsxwolf on 2/13/26, 11:29 PM

    Warp drive next.
  • by sfmike on 2/13/26, 9:00 PM

    5.2 is the best model on the market.
  • by brcmthrowaway on 2/13/26, 7:38 PM

    End times approach..
  • by getnormality on 2/13/26, 9:57 PM

    I'll believe it when someone other than OpenAI says it.

    Not saying they're lying, but I'm sure it's exaggerated in their own report.

  • by Noaidi on 2/14/26, 12:40 PM

    "Let's put 'GPT' in our paper to get clicks!"?
  • by anonym29 on 2/14/26, 3:21 AM

    sToChAsTiC pArRoTs CaNt PrOdUcE aNyTHiNg NeW!!!!1
  • by pear01 on 2/13/26, 10:31 PM

    If a researcher uses an LLM to get a novel result, should the LLM also reap the rewards? Could a Nobel Prize ever be given to an LLM, or is that like giving a Nobel to a calculator?
  • by mrguyorama on 2/13/26, 8:41 PM

    Don't lend much credence to a preprint. I'm not insinuating fraud, but plenty of preprints turn out to be "Actually you have a math error here", or are retracted entirely.

    Theoretical physics is throwing a lot of stuff at the wall and theory crafting to find anything that might stick a little. Generation might actually be good there, even generation that is "just" recombining existing ideas.

    I trust physicists and mathematicians to mostly use tools because they provide benefit, rather than because they are in vogue. I assume they were approached by OpenAI for this, but glad they found a way to benefit from it. Physicists have a lot of experience teasing useful results out of probabilistic and half broken math machines.

    If LLMs end up being solely tools for exploring some symbolic math, that's a real benefit. Wish it didn't involve destroying all progress on climate change, platforming truly evil people, destroying our economy, exploiting already disadvantaged artists, destroying OSS communities, enabling yet another order of magnitude increase in spam profitability, destroying the personal computer market, stealing all our data, sucking the oxygen out of investing into real industry, and bold faced lies to all people about how these systems work.

    Also, last I checked, MATLAB wasn't a trillion dollar business.

    Interestingly, the OpenAI wrangler is last in the list of authors and acknowledgements. That somewhat implies the physicists don't think it deserves much credit. They could be biased against LLMs, like me.

    When Victor Ninov (fraudulently) analyzed his team's accelerator data using an existing software suite to find a novel SuperHeavy element, he got first billing on the authors list. Probably he contributed to the theory and some practical work, but he alone was literate in the GOOSY data tool. Author lists are often a political game as well as credit, but Victor got top billing above people like his bosses, who were famous names. The guy who actually came up with the idea of how to create the element, in an innovative recipe that a lot of people doubted, was credited 8th

    https://journals.aps.org/prl/abstract/10.1103/PhysRevLett.83...

  • by jtrn on 2/13/26, 10:26 PM

    This is my favorite field to have opinions about despite having no training or skill in it. Fundamental research is just something I enjoy thinking about, even though I am a psychologist. I try to pull in my experience from the clinic and clinical research when I read theoretical physics. Don't take this text too seriously; it's just my attempt at understanding what's going on.

    I am generally very skeptical about work at this level of abstraction. The result here appears only after choosing Klein signature instead of physical spacetime, complexifying momenta, restricting to a "half-collinear" regime that doesn't exist in our universe, and picking a specific kinematic sub-region. Then they check the result against internal consistency conditions of the same mathematical system. This pattern should worry anyone familiar with the replication crisis. The conditions this field operates under are a near-perfect match for what psychology has identified as maximising systematic overconfidence: extreme researcher degrees of freedom (choose your signature, regime, helicity, ordering until something simplifies), no external feedback loop (the specific regimes studied have no experimental counterpart), survivorship bias (ugly results don't get published, so the field builds a narrative of "hidden simplicity" from the survivors), and tiny expert communities where fewer than a dozen people worldwide can fully verify any given result.

    The standard defence is that the underlying theory — Yang-Mills / QCD — is experimentally verified to extraordinary precision. True. But the leap from "this theory matches collider data" to "therefore this formula in an unphysical signature reveals deep truth about nature" has several unsupported steps that the field tends to hand-wave past.

    Compare to evolution: fossils, genetics, biogeography, embryology, molecular clocks, observed speciation — independent lines of evidence from different fields, different centuries, different methods, all converging. That's what robust external validation looks like. "Our formula satisfies the soft theorem" is not that.

    This isn't a claim that the math is wrong. It's a claim that the epistemic conditions are exactly the ones where humans fool themselves most reliably, and that the field's confidence in the physical significance of these results outstrips the available evidence.

    I wrote up a more detailed critique in a substack: https://jonnordland.substack.com/p/the-psychologists-case-ag...

  • by My_Name on 2/14/26, 1:20 PM

    I talked about basic principles of QM, gravity, time, and relativity with Claude, then talked about the implications of that, and Claude came up with the idea that mass causes time and gravity as emergent properties that only affect macro-scale objects; QM particles do not have to obey either of them, and this explains the double-slit experiment, the delayed-choice experiment, 'spooky action at a distance', and other aspects of entanglement.

    Basically, if you are small enough you can move forwards and backwards in time, from the moment you were put into a superposition, or entangled, until you interact with an object too large to ignore the emergent effects of time and gravity. This is 'being observed' and 'collapsing the wave function'. You occupy all possible positions in space as defined by the probability of you being there. Once observed, you move forward in linear time again and the last route you took is the only one you ever took even though that route could be affected by interference with other routes you took that now no longer exist. When in this state there is no 'before' or 'after' so the delayed choice experiment is simply an illusion caused by our view of time, and there is no delay, the choice and result all happen together.

    With entanglement, both particles return to the entanglement point, swap places and then move to the current moment and back again, over and over. They obey GR, information always travels under the speed of light (which to the photon is infinite anyway), so there is no spooky action at a distance, it is sub-lightspeed action through time that has the illusion of being instant to entities stuck in linear time.

    It then went on to talk about how mass creates time, and how time is just a different interpretation of gravity leading it to fully explain how a black hole switches time and space, and inwards becomes forwards in time inside the event horizon. Mass warps 4D (or more) space. That is gravity, and it is also time.