by davidbarker on 2/13/26, 7:20 PM with 397 comments
by outlace on 2/13/26, 7:36 PM
by square_usual on 2/13/26, 8:20 PM
by Davidzheng on 2/13/26, 7:36 PM
When I use GPT 5.2 Thinking Extended, it gives me the impression that it's consistent enough/has a low enough rate of errors (or enough error-correcting ability) to autonomously do math/physics for many hours if it were allowed to [but I guess the Extended time cuts off around the 30-minute mark, and Pro maybe 1-2 hours]. It's good to see some confirmation of that impression here. I hope scientists/mathematicians at large will be able to play with tools which think at this time-scale soon and see how much capability these machines really have.
by cpard on 2/13/26, 8:49 PM
This result reminded me of the C compiler case that Anthropic posted recently. Sure, agents wrote the code for hours but there was a human there giving them directions, scoping the problem, finding the test suites needed for the agentic loops to actually work etc etc. In general making sure the output actually works and that it's a story worth sharing with others.
The "AI replaces humans in X" narrative is primarily a tool for driving attention and funding. It works great for creating impressions and building brand value but also does a disservice to the actual researchers, engineers and humans in general, who do the hard work of problem formulation, validation and at the end, solving the problem using another tool in their toolbox.
by nilkn on 2/13/26, 8:03 PM
by turzmo on 2/14/26, 2:00 AM
(35)-(38) are the AI-simplified versions of (29)-(32). Those earlier formulae look formidable to simplify by hand, but they are also the sort of thing you'd try to use a computer algebra system for.
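To make the "you'd try a computer algebra system" point concrete, here is a minimal SymPy sketch. The expression is a hypothetical stand-in (the paper's formulae (29)-(32) are far more involved), but the workflow is the same: feed the unwieldy closed form to the CAS and ask it to simplify.

```python
import sympy as sp

# Hypothetical stand-in expression; the point is the workflow, not this formula.
x = sp.symbols('x')
expr = sp.sin(x)**4 - 2*sp.cos(x)**2*sp.sin(x)**2 + sp.cos(x)**4

# simplify() searches for a more compact equivalent form,
# which is the kind of mechanical reduction a CAS is built for.
simplified = sp.simplify(expr)
print(simplified)  # equivalent to (cos(4*x) + 1)/2
```

Real amplitude expressions need more structure (spinor brackets, kinematic constraints), but the division of labor is the point: the human picks what to simplify, the tool grinds through the algebra.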
I'm willing to (begrudgingly) admit the possibility for AI to do novel work, but this particular result does not seem very impressive.
I picture ChatGPT as the rich kid whose parents privately donated to a lab to get their name on a paper for college admissions. In this case, I don't think I'm being too cynical in thinking that something similar is happening here and that the role of AI in this result is being well overplayed.
by amai on 2/15/26, 9:21 PM
That is what one of the authors says. This doesn't quite fit the headline of the post.
by Insanity on 2/13/26, 7:28 PM
by qnleigh on 2/14/26, 5:31 PM
by castigatio on 2/14/26, 10:05 AM
Snow + stick + need to clean driveway = snow shovel. Snow shovel + hill + desire for fun = sled
At one point people were arguing that you could never get "true art" from linear programs. Now you get true art and people are arguing you can't get magical flashes of insight. The will to defend human intelligence / creativity is strong but the evidence is weak.
by mym1990 on 2/13/26, 9:14 PM
by elashri on 2/13/26, 7:48 PM
So I would read this (with more information available) with less emphasis on the LLM discovering a new result. The title is a little bit misleading, but "derives" is the operative word here, so it would be technically correct for people in the field.
by crorella on 2/13/26, 7:31 PM
by vbarrielle on 2/13/26, 8:29 PM
by computator on 2/13/26, 10:32 PM
[1] https://en.wikipedia.org/wiki/List_of_physical_constants
by JanisErdmanis on 2/14/26, 10:12 AM
by singularfutur on 2/14/26, 11:11 AM
by giantg2 on 2/14/26, 11:54 AM
by kaelandt on 2/14/26, 1:43 PM
by sciencejerk on 2/14/26, 5:01 AM
Any reason to believe that public versions of GPT-5.2 could have accomplished this task? "scaffolded" is a very interesting word choice
by smj-edison on 2/13/26, 10:58 PM
by PlatoIsADisease on 2/13/26, 8:44 PM
Okay read it: Yep Induction. It already had the answer.
Don't get me wrong, I love Induction... but we aren't having any revolutions in understanding with Induction.
by globalnode on 2/13/26, 11:01 PM
by major4x on 2/13/26, 9:54 PM
by gaigalas on 2/13/26, 8:09 PM
I expect lots of derivations (new discoveries whose pieces were already in place somewhere, but no one has put them together).
In this case, the human authors did the thinking and also used the LLM, but this could happen without the original human author too (some guy posts some partial result on the internet, no one realizes it is novel knowledge, and it gets reused by AI later). It would be tremendously nice if credit were kept in such possible scenarios.
by cagz on 2/14/26, 9:46 AM
My personal opinion is that things will only accelerate from here.
by vonneumannstan on 2/13/26, 7:32 PM
by nxobject on 2/14/26, 3:10 AM
by another_twist on 2/13/26, 9:53 PM
by longfacehorrace on 2/13/26, 8:00 PM
New Honda Civic discovers Pacific Ocean!
New F150 discovers Utah Salt Flats!
Sure it took humans engineering and operating our machines, but the car is the real contributor here!
by snarky123 on 2/13/26, 7:38 PM
by emp17344 on 2/13/26, 8:22 PM
by the_king on 2/14/26, 7:29 AM
I'm not blaming the model here, but Python is much easier to read and more universal than math notation in most cases (especially for whatever's going on at the bottom of page four). I guess I'll have one translate the PDF.
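For what it's worth, the readability gap the comment describes is easy to illustrate. Take a hypothetical compact formula like S_i = prod over j != i of 1/(x_i - x_j) (the kind of nested product notation that gets dense on a page); the Python version reads almost like its own definition:

```python
from math import prod

def partial_fraction_weights(xs):
    """For each i, compute S_i = prod over j != i of 1/(x_i - x_j).

    A made-up example formula (not from the paper), chosen only to show
    how index-heavy math notation flattens into readable Python.
    """
    return [
        prod(1.0 / (xi - xj) for j, xj in enumerate(xs) if j != i)
        for i, xi in enumerate(xs)
    ]

print(partial_fraction_weights([0.0, 1.0, 3.0]))  # roughly [1/3, -1/2, 1/6]
```

Whether this scales to "whatever's going on at the bottom of page four" is another question, but for moderately nested sums and products the translation is usually mechanical.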
by ares623 on 2/13/26, 8:21 PM
by hackable_sand on 2/14/26, 9:28 AM
by dadb00ty on 2/13/26, 11:19 PM
by baalimago on 2/13/26, 8:10 PM
by pruufsocial on 2/13/26, 7:35 PM
by user3939382 on 2/14/26, 8:51 AM
by nsxwolf on 2/13/26, 11:29 PM
by sfmike on 2/13/26, 9:00 PM
by brcmthrowaway on 2/13/26, 7:38 PM
by getnormality on 2/13/26, 9:57 PM
Not saying they're lying, but I'm sure it's exaggerated in their own report.
by Noaidi on 2/14/26, 12:40 PM
by anonym29 on 2/14/26, 3:21 AM
by pear01 on 2/13/26, 10:31 PM
by mrguyorama on 2/13/26, 8:41 PM
Theoretical physics is throwing a lot of stuff at the wall and theory crafting to find anything that might stick a little. Generation might actually be good there, even generation that is "just" recombining existing ideas.
I trust physicists and mathematicians to mostly use tools because they provide benefit, rather than because they are in vogue. I assume they were approached by OpenAI for this, but glad they found a way to benefit from it. Physicists have a lot of experience teasing useful results out of probabilistic and half broken math machines.
If LLMs end up being solely tools for exploring some symbolic math, that's a real benefit. Wish it didn't involve destroying all progress on climate change, platforming truly evil people, destroying our economy, exploiting already disadvantaged artists, destroying OSS communities, enabling yet another order of magnitude increase in spam profitability, destroying the personal computer market, stealing all our data, sucking the oxygen out of investing into real industry, and bald-faced lies to all people about how these systems work.
Also, last I checked, MATLAB wasn't a trillion dollar business.
Interestingly, the OpenAI wrangler is last in the list of Authors and acknowledgements. That somewhat implies the physicists don't think it deserves much credit. They could be biased against LLMs like me.
When Victor Ninov (fraudulently) analyzed his team's accelerator data using an existing software suite to find a novel superheavy element, he got first billing on the authors list. Probably he contributed to the theory and some practical work, but he alone was literate in the GOOSY data tool. Author lists are often a political game as well as credit, but Victor got top billing above people like his bosses, who were famous names. The guy who actually came up with the idea of how to create the element, in an innovative recipe that a lot of people doubted, was credited 8th.
https://journals.aps.org/prl/abstract/10.1103/PhysRevLett.83...
by jtrn on 2/13/26, 10:26 PM
I am generally very skeptical about work at this level of abstraction. The result holds only after choosing Klein signature instead of physical spacetime, complexifying momenta, restricting to a "half-collinear" regime that doesn't exist in our universe, and picking a specific kinematic sub-region. Then they check the result against internal consistency conditions of the same mathematical system. This pattern should worry anyone familiar with the replication crisis. The conditions this field operates under are a near-perfect match for what psychology has identified as maximising systematic overconfidence: extreme researcher degrees of freedom (choose your signature, regime, helicity, ordering until something simplifies), no external feedback loop (the specific regimes studied have no experimental counterpart), survivorship bias (ugly results don't get published, so the field builds a narrative of "hidden simplicity" from the survivors), and tiny expert communities where fewer than a dozen people worldwide can fully verify any given result.
The standard defence is that the underlying theory — Yang-Mills / QCD — is experimentally verified to extraordinary precision. True. But the leap from "this theory matches collider data" to "therefore this formula in an unphysical signature reveals deep truth about nature" has several unsupported steps that the field tends to hand-wave past.
Compare to evolution: fossils, genetics, biogeography, embryology, molecular clocks, observed speciation — independent lines of evidence from different fields, different centuries, different methods, all converging. That's what robust external validation looks like. "Our formula satisfies the soft theorem" is not that.
This isn't a claim that the math is wrong. It's a claim that the epistemic conditions are exactly the ones where humans fool themselves most reliably, and that the field's confidence in the physical significance of these results outstrips the available evidence.
I wrote up a more detailed critique in a substack: https://jonnordland.substack.com/p/the-psychologists-case-ag...
by My_Name on 2/14/26, 1:20 PM
Basically, if you are small enough you can move forwards and backwards in time, from the moment you were put into a superposition, or entangled, until you interact with an object too large to ignore the emergent effects of time and gravity. This is 'being observed' and 'collapsing the wave function'. You occupy all possible positions in space as defined by the probability of you being there. Once observed, you move forward in linear time again and the last route you took is the only one you ever took even though that route could be affected by interference with other routes you took that now no longer exist. When in this state there is no 'before' or 'after' so the delayed choice experiment is simply an illusion caused by our view of time, and there is no delay, the choice and result all happen together.
With entanglement, both particles return to the entanglement point, swap places and then move to the current moment and back again, over and over. They obey GR, information always travels under the speed of light (which to the photon is infinite anyway), so there is no spooky action at a distance, it is sub-lightspeed action through time that has the illusion of being instant to entities stuck in linear time.
It then went on to talk about how mass creates time, and how time is just a different interpretation of gravity leading it to fully explain how a black hole switches time and space, and inwards becomes forwards in time inside the event horizon. Mass warps 4D (or more) space. That is gravity, and it is also time.