by skilled on 9/30/25, 4:55 PM with 878 comments
System card: https://openai.com/index/sora-2-system-card/
by the_duke on 9/30/25, 7:37 PM
It seems like OpenAI is trying to turn Sora into a social network - TikTok but AI.
The webapp is heavily geared towards consumption, with a feed as the entry point, liking and commenting for posts, and user profiles having a prominent role.
The creation aspect seems about as important as on Instagram, TikTok etc - easily available, but not the primary focus.
Generated videos are very short, with minimal controls. The only selectable option is picking between landscape and portrait mode.
There is no mention or attempt to move towards long form videos, storylines, advanced editing/controls/etc, like others in this space (eg Google Flow).
Seems like they want to turn this into AITok.
Edit: regarding accurate physics ... check out these two videos below...
To be fair, Veo fails miserably with those prompts also.
https://sora.chatgpt.com/p/s_68dc32c7ddb081919e0f38d8e006163...
https://sora.chatgpt.com/p/s_68dc3339c26881918e45f61d9312e95...
Veo:
https://veo-balldrop.wasmer.app/ballroll.mp4
https://veo-balldrop.wasmer.app/balldrop.mp4
Couldn't help but mock them a little; here is a bit of fun. The prompt adherence is pretty good, at least.
NOTE: there are plenty of quite impressive videos being posted, and a lot of horrible ones also.
by davidmurdoch on 10/1/25, 1:20 AM
If I start a new chat it works.
I'm a Plus subscriber and didn't hit rate limits.
This video gen tool will probably be even more useless.
by mscbuck on 9/30/25, 11:19 PM
My boss sends me complete AI Workslop made with these tools and he goes "Look how wild this is! This is the future" or sends me a youtube video with less than a thousand views of a guy who created UGC with Telegram and point and click tools.
I don't think he ever takes a beat, looks at the end product, and asks himself, "who is this for? Who even wants this?", and that's aside from the fact that I still think there are so many obvious tells with this content that make you know right away that it is AI.
by simonw on 9/30/25, 6:13 PM
I expect the "cameo" feature is an attempt at capturing that viral magic a second time.
by saguntum on 9/30/25, 7:15 PM
If they got the generation "live" enough, imagine walking past a mirror in a department store and seeing yourself in different clothes.
Wild times.
by rushingcreek on 9/30/25, 5:20 PM
However, I still don't see how OpenAI beats Google in video generation. As this was likely a data innovation, Google can replicate and improve this with their ownership of YouTube. I'd be surprised if they didn't already have something like this internally.
by btbuildem on 9/30/25, 10:59 PM
Tangentially related: it's wild to me that people heading such consequential projects have so little life experience. It's all exuberance and shiny things, zero consideration of the impacts and consequences. First Meta with "Vibes", now this.
1: https://www.gurufocus.com/news/3124829/openai-plans-to-launc...
by kveykva on 9/30/25, 5:11 PM
by samuelfekete on 9/30/25, 8:56 PM
by cogman10 on 9/30/25, 10:43 PM
What am I looking at that's super technically impressive here? The clips look nice, but from one cut to the next there's a lot of obvious differences (usually in the background, sometimes in the foreground).
by gorgoiler on 9/30/25, 5:55 PM
1/ 0m23s: The moon polo players begin with the red coat rider putting on a pair of gloves, but they are not wearing gloves in the left-vs-right charge-down.
2/ 1m05s: The dragon flies up the coast with the cliffs on one side, but then the close-up has the direction of flight reversed. Also, the person speaking seemingly has their back to the direction of flight. (And a stripy instead of plain shirt and a harness that wasn’t visible before.)
3/ 1m45s: The ducks aren't taking the right hand corner into the straightaway. They are heading into the wall.
I do wonder what the workflow will be for fixing any more challenging continuity errors.
by TheAceOfHearts on 9/30/25, 7:59 PM
I think OpenAI is actually doing a great job at easing people into these new technologies. It's not such a huge leap in capabilities that it's shocking, and it helps people acclimate for what's coming. This version is still limited but you can tell that in another generation or two it's going to break through some major capabilities threshold.
To give a comparison: in the LLM model space, the big capabilities threshold event for me came with the release of Gemini 2.5 Pro. The models before that were good in various ways, but that was the first model that felt truly magical.
From a creative perspective, it would be ideal if you could first generate a fixed set of assets, locations, and objects, which are then combined and used to bring multiple scenes to life while providing stronger continuity guarantees.
by willahmad on 9/30/25, 6:01 PM
The state of things with doomscrolling was already bad; add to it layoffs and replacing people with AI (just admit it, interns are struggling to compete with Claude Code, Cursor, and Codex).
What's coming next? A bunch of people with lots of free time, watching nonsense AI-generated content?
I'm genuinely curious, because I was, and still am, excited about AI, until I saw how doomscrolling is getting worse.
by adidoit on 9/30/25, 6:07 PM
by mempko on 9/30/25, 5:26 PM
The worst part is we are already seeing bad actors saying 'I didn't say that' or 'I didn't do that, it was a deep fake'. Now you will be able to say anything in real life and use AI for plausible deniability.
by baalimago on 10/1/25, 6:26 AM
by minimaxir on 9/30/25, 5:14 PM
Sora 2 itself as a video model doesn't seem better than Veo 3/Kling 2.5/Wan 2.2, and the primary touted feature of having a consistent character can be sufficiently emulated in those models with an input image.
by SeanAnderson on 9/30/25, 7:24 PM
by simonw on 9/30/25, 5:40 PM
The recent Google Veo 3 paper "Video models are zero-shot learners and reasoners" made a fascinating argument for video generation models as multi-purpose computer vision tools in the same way that LLMs are multi-purpose NLP tools. https://video-zero-shot.github.io/
It includes a bunch of interesting prompting examples in the appendix, it would be interesting to see how those work against Sora 2.
I wrote some notes on that paper here: https://simonwillison.net/2025/Sep/27/video-models-are-zero-...
by haolez on 9/30/25, 6:52 PM
For example, I saw a lot of people criticizing "Wish" (2023, Disney) for being a good movie in the first half and totally dropping the ball in the second half. I haven't seen it yet, but I'm wondering if fans will be able to evolve the source material in the future to get the best possible version of it.
Maybe we will even get a good closure for Lost (2004)!
(I'm ignoring copyright aspects, of course, because those are too boring :D)
by rd on 9/30/25, 5:17 PM
by mdrzn on 9/30/25, 5:16 PM
by mempko on 9/30/25, 5:28 PM
I predict a resurgence in live performances. Live music and live theater. People are going to get tired of video content when everything is fake.
by stan_kirdey on 9/30/25, 6:25 PM
by etrvic on 10/1/25, 10:46 AM
by TechSquidTV on 10/1/25, 1:56 PM
I saw some promise with the Segment Anything model, but I haven't seen anyone turn it into a motion solver yet. In fact, I'm not sure if it can do that at all. It may be that we need an AI algorithm to translate the video into a simpler rendition (colored dots representing the original motion) that can then be tracked more traditionally.
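To make the "colored dots" idea concrete, here is a toy sketch in pure Python with a synthetic clip (nothing here uses Segment Anything; the helper name and the 32x32 frames are made up for illustration): once footage is reduced to one high-contrast marker dot per feature, tracking collapses to per-frame peak-finding.

```python
def track_dot(frames):
    """Track the single brightest pixel in each frame.

    A stand-in for a motion solver: if a segmentation model could paint
    each feature as a marker dot, recovering its motion path becomes
    simple peak-finding per frame.
    """
    path = []
    for frame in frames:
        rows, cols = len(frame), len(frame[0])
        best = max(
            ((r, c) for r in range(rows) for c in range(cols)),
            key=lambda rc: frame[rc[0]][rc[1]],
        )
        path.append(best)
    return path

# Synthetic clip: one "marker dot" drifting diagonally across 5 frames.
frames = []
for t in range(5):
    f = [[0.0] * 32 for _ in range(32)]
    f[4 + 2 * t][6 + 3 * t] = 1.0  # dot position at time t
    frames.append(f)

print(track_dot(frames))  # [(4, 6), (6, 9), (8, 12), (10, 15), (12, 18)]
```

The recovered coordinate list is exactly the kind of data a traditional match-move tool expects as input.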
by jablongo on 9/30/25, 5:28 PM
by aaroninsf on 9/30/25, 5:24 PM
It's not that I disagree with the criticism; it's rather that when you live on the moving edge it's easy to lose track of the fact that things like this are miraculous and I know not a single person who thought we would get results "even" like this, this quickly.
This is a forum frequented by people making a living on the edge, I get it. But still, remember to enjoy a little that you are living in a time of miracles. I hope we have leave to enjoy that.
by seydor on 9/30/25, 8:47 PM
by qoez on 9/30/25, 5:25 PM
by minimaxir on 10/1/25, 1:15 AM
> How the FUCK does Sora 2 have such a perfect memory of this Cyberpunk side mission that it knows the map location, biome/terrain, vehicle design, voices, and even the name of the gang you're fighting for, all without being prompted for any of those specifics??
> Sora basically got two details wrong, which is that the Basilisk tank doesn't have wheels (it hovers) and Panam is inside the tank rather than on the turret. I suppose there's a fair amount of video tutorials for this mission scattered around the internet, but still––it's a SIDE mission!
Everyone already assumed that Sora was trained on YouTube, but "generate gameplay of Cyberpunk 2077 with the Basilisk Tank and Panam" would have generated incoherent slop in most other image/video models, not verbatim gameplay footage that is consistent.
For reference, this is what you get when you give the same prompt to Veo 3 Fast (trained by the company that owns YouTube): https://x.com/minimaxir/status/1973192357559542169
by echelon on 9/30/25, 7:18 PM
I love this AI video technology.
Here are some of the films my friends and I have been making with AI. These are not "prompted", but instead use a lot of hand animation, rotoscoping, and human voice acting in addition to AI assistance:
https://www.youtube.com/watch?v=H4NFXGMuwpY
https://www.youtube.com/watch?v=tAAiiKteM-U
https://www.youtube.com/watch?v=7x7IZkHiGD8
https://www.youtube.com/watch?v=Tii9uF0nAx4
Here are films from other industry folks. One of them writes for a TV show you probably watch:
https://www.youtube.com/watch?v=FAQWRBCt_5E
https://www.youtube.com/watch?v=t_SgA6ymPuc
https://www.youtube.com/watch?v=OCZC6XmEmK0
I see several incredibly good things happening with this tech:
- More people being able to visually articulate themselves, including "lay" people who typically do not use editing software.
- Creative talent at the bottom rungs being able to reach high with their ambition and pitch grand ideas. With enough effort, they don't even need studio capital anymore. (Think about the tens of thousands of students that go to film school that never get to direct their dream film. That was a lot of us!)
- Smaller studios can start to compete with big studios. A ten person studio in France can now make a well-crafted animation that has more heart and soul than recent by-the-formula Pixar films. It's going to start looking like indie games. Silksong and Undertale and Stardew Valley, but for movies, shows, and shorts. Makoto Shinkai did this once by himself with "Voices of a Distant Star", but it hasn't been oft repeated. Now that is becoming possible.
You can't just "prompt" this stuff. It takes work. (Each of the shorts above took days of effort - something you probably wouldn't know unless you're in the trenches trying to use the tech!)
For people that know how to do a little VFX and editing, and that know the basic rules of storytelling, these tools are remarkable assets that complement an existing skill set. But every shot, every location, every scene is still work. And you have to weave that all into a compelling story with good hooks and visuals. It's multi-layered and complex. Not unlike code.
And another code analogy: think of these models like Claude Code for the creative. An exoskeleton, but not the core driving engineer or vision that draws it all together. You can't prompt a code base, and similarly, you can't prompt a movie. At least not anytime soon.
by Awesomedonut on 10/1/25, 6:39 PM
Here's to hoping that the industry will adapt to have it aid animators for in-betweening and other things that supplement production. Anime studios are infamously terrible with overworking their employees, so I legitimately see benefits coming from this tool if devs can get it to function as proper frame interpolation (where animators do the keyframes themselves and the model in-betweens).
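For what "in-betweening" means mechanically, here is the crudest possible baseline, a linear cross-fade between two keyframes, in pure Python (a toy: real interpolation models warp pixels along estimated motion paths rather than blending them, which is why cross-fades ghost on large movements, but the interface, keyframes in, tweens out, is the same).

```python
def linear_inbetweens(key_a, key_b, n):
    """Generate n in-between frames by cross-fading two keyframes.

    Frames are flat lists of pixel brightness values. The blend weight
    t runs from 0 (pure key_a) toward 1 (pure key_b).
    """
    tweens = []
    for i in range(1, n + 1):
        t = i / (n + 1)  # evenly spaced between the two keys
        tweens.append([(1 - t) * a + t * b for a, b in zip(key_a, key_b)])
    return tweens

# One-pixel keyframes going from black (0.0) to white (1.0), three tweens.
print(linear_inbetweens([0.0], [1.0], 3))  # [[0.25], [0.5], [0.75]]
```

An animator-in-the-loop workflow would keep the keyframes hand-drawn and swap this blend for a learned motion model.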
by msp26 on 9/30/25, 5:13 PM
by modeless on 9/30/25, 5:28 PM
I watch videos for two reasons. To see real things, or to consume interesting stories. These videos are not real, and the storytelling is still very limited.
by mempko on 9/30/25, 5:31 PM
by polishdude20 on 9/30/25, 7:22 PM
by neom on 9/30/25, 7:12 PM
by causal on 9/30/25, 5:16 PM
by darkwater on 9/30/25, 7:07 PM
> A lot of problems with other apps stem from the monetization model incentivizing decisions that are at odds with user wellbeing. Transparently, our only current plan is to eventually give users the option to pay some amount to generate an extra video if there’s too much demand relative to available compute. As the app evolves, we will openly communicate any changes in our approach here, while continuing to keep user wellbeing as our main goal.
by neilv on 9/30/25, 8:04 PM
How much are they (and providers of similar tools) going to be able to keep anyone from putting anyone else in a video, shown doing and saying whatever the tool user wants?
Will some only protect politicians and celebrities? Will the less-famous/less-powerful of us be harassed, defamed, exploited, scammed, etc.?
by Aeolun on 9/30/25, 10:53 PM
by d--b on 9/30/25, 6:45 PM
by dagaci on 9/30/25, 6:26 PM
by sys32768 on 9/30/25, 7:21 PM
by jug on 9/30/25, 11:09 PM
by jsnell on 9/30/25, 5:25 PM
Like, it should be preferable to keep all the slop in the same trough. But it's like they can't come up with even one legitimate use case, and so the best product they can build around the technology is to try to create an addictive loop of consuming nothing but auto-generated "empty-calories" content.
by nycdatasci on 9/30/25, 9:20 PM
by DetroitThrow on 9/30/25, 5:17 PM
by clgeoio on 9/30/25, 8:49 PM
> We are giving users the tools and optionality to be in control of what they see on the feed. Using OpenAI's existing large language models, we have developed a new class of recommender algorithms that can be instructed through natural language. We also have built-in mechanisms to periodically poll users on their wellbeing and proactively give them the option to adjust their feed.
So, nothing? I can see this being generated and then reposted to TikTok, Meta, etc for likes and engagement.
by robotsquidward on 9/30/25, 7:19 PM
by joshdavham on 9/30/25, 6:10 PM
I imagine it won’t necessarily be used in long scenes with subtle body language, etc involved. But maybe it’ll be used in other types of scenes?
by wantering on 10/3/25, 2:34 AM
by FullMetul on 9/30/25, 10:20 PM
by mavamaarten on 10/1/25, 11:22 AM
Ever since the launch of Veo, there are already so many AI slop videos on YouTube that it sometimes becomes hard to find real ones.
I'm tired, boss.
by tminima on 10/1/25, 7:59 AM
The biggest problem OpenAI has is not having an immense data backbone like Meta/Google/MSFT have. I think this is a step in that direction -- create a data moat which in turn will help them make better models.
by jack_riminton on 10/1/25, 9:28 AM
Can it do Will Smith eating spaghetti? (I can't get access in UK)
by elpakal on 10/1/25, 2:28 AM
Also I find it neat that they still include an iOSMath bundle (in chatGPT too), makes me wonder how good their models really are at math.
by outlore on 9/30/25, 5:49 PM
by bgwalter on 9/30/25, 8:05 PM
Let me guess, the ultimate market will be teenagers "creating" a Skibidi Toilet and cheap TikTok propaganda videos which promote Gazan ocean front properties.
by bamboozled on 9/30/25, 11:18 PM
There would for sure be large swathes of people who would just lie about what they're doing and use AI to make it seem like they're skateboarding, or skiing or whatever at a pro or semi-pro level and have a lot of people watch it.
by intended on 9/30/25, 6:06 PM
Impressive that THAT was one of the issues to find, given where we were at the start of the year.
by wltr on 10/1/25, 4:37 AM
by fariszr on 9/30/25, 5:16 PM
by IncreasePosts on 9/30/25, 6:21 PM
by ElijahLynn on 9/30/25, 6:01 PM
click
takes me to the iPhone app store...
by ascorbic on 9/30/25, 6:50 PM
by alberth on 9/30/25, 8:51 PM
by anshumankmr on 10/1/25, 1:34 AM
by gvv on 9/30/25, 5:23 PM
edit: as per usual it's not yet...
by jp57 on 9/30/25, 8:14 PM
by sumeruchat on 9/30/25, 7:18 PM
by ashu1461 on 9/30/25, 8:43 PM
by vahid4m on 9/30/25, 8:18 PM
by Gnarl on 10/1/25, 9:18 AM
by doikor on 9/30/25, 7:38 PM
Basically proper working persistence of the scene.
by whimsicalism on 9/30/25, 5:27 PM
by nopinsight on 9/30/25, 10:55 PM
Their ultimate goal is physical AGI, although it wouldn’t hurt them if the social network takes off as well.
by squidsoup on 9/30/25, 8:20 PM
by Havoc on 9/30/25, 10:37 PM
Sam looks weirdly like Cillian Murphy in Oppenheimer in some shots. I wonder whether there was dataset bleedover from that.
by tptacek on 9/30/25, 7:18 PM
by Lucasoato on 9/30/25, 11:03 PM
by sandspar on 10/2/25, 6:46 AM
by GaggiX on 9/30/25, 6:28 PM
by natiman1000 on 10/1/25, 12:36 AM
by dyauspitr on 9/30/25, 7:03 PM
by NoahZuniga on 9/30/25, 8:00 PM
by VagabundoP on 9/30/25, 7:24 PM
There's still something off about the movements, faces and eyes. Gollum features.
by LarsDu88 on 9/30/25, 8:09 PM
One use case I'm really excited about is simply making animated sprites and rotational transformations of artwork using these videogen models, but unlike with local open models, they never seem to expose things like depth estimation output heads, aspect ratio alteration, or other things that would actually make these useful tools beyond shortform content generation.
by alkonaut on 9/30/25, 6:41 PM
by unethical_ban on 9/30/25, 7:21 PM
Multiple sci-fi-fantasy tales have been written about technology getting so out of control, either through its own doing or by abuse by a malevolent controller, that society must sever itself from that technology very intentionally and permanently.
I think the idea of AGI and transhumanism is that moment for society. I think it's hard to put the genie back in the bottle because multiple adversarial powers are racing to be more powerful than the rest, but maybe the best thing for society would be if every tensor chip disintegrated the moment they came into existence.
I don't see how society is better when everyone can run their own gooner simulation and share it with videos made of their high school classmates. Or how we'll benefit from being unable to trust any photo or video we see without trusting who sends it to you, and even then doubting its veracity. Not being able to hear your spouse's voice on the phone without checking the post-quantum digital signature of their transmission for authenticity.
Society is heading to a less stable, less certain moment than any point in its history, and it is happening within our lifetime.
by thebiglebrewski on 9/30/25, 6:25 PM
by kaicianflone on 9/30/25, 7:05 PM
by qgin on 9/30/25, 6:53 PM
by taikahessu on 10/1/25, 12:58 PM
by bergheim on 9/30/25, 6:19 PM
I kid.
Art should require effort. And by that I mean effort on the part of the artist. Not environmental damage. I am SO tired of non tech friends SWOONING me with some song they made in 0.3 seconds. I tell them, sarcastically, that I am indeed very impressed with their endeavors.
I know many people will disagree with me here, but I would be heart broken if it turned out someone like Nick Cave was AI generated.
And of course this goes into a philosophical debate. What does it matter if it was generated by AI?
And that's where we are heading. But for me I feel effort is required, where we are going means close to 0 effort required. Someone here said that just raises the bar for good movies. I say that mostly means we will get 1 billion movies. Most are "free" to produce and displaces the 0.0001% human made/good stuff. I dunno. Whoever had the PR machine on point got the blockbuster. Not weird, since the studio tried 300 000 000 of them at the same time.
Who the fuck wants that?
I feel like that ship in Wall-E. Let's invest in slurpies.
Anyway; AI is here and all of that, we are all embracing it. Will be interesting to see how all this ends once the fallout lands.
Sorry for a comment that feels all over the place; on the tram :)
by colonial on 9/30/25, 6:16 PM
by nickbettuzzi on 10/3/25, 2:27 AM
by drcongo on 9/30/25, 7:18 PM
by 2OEH8eoCRo0 on 9/30/25, 5:22 PM
by rvz on 9/30/25, 8:05 PM
by carabiner on 9/30/25, 7:34 PM
by beders on 9/30/25, 6:41 PM
by Josh5 on 9/30/25, 9:48 PM
by FrustratedMonky on 10/1/25, 1:26 AM
by basisword on 9/30/25, 6:11 PM
by outside1234 on 10/1/25, 4:08 AM
by taytus on 9/30/25, 5:26 PM
by dvngnt_ on 9/30/25, 5:10 PM
by barbarr on 9/30/25, 6:32 PM
by andybak on 9/30/25, 5:45 PM
Going back to sleep. Wake me up when it's available to me.
by boh on 9/30/25, 7:18 PM
by carrozo on 9/30/25, 7:44 PM
by baby on 10/1/25, 12:42 AM
by amelius on 9/30/25, 9:30 PM
by dcreater on 9/30/25, 10:51 PM
by ezomode on 9/30/25, 9:36 PM
by mrcino on 9/30/25, 7:33 PM
Brave new internet, where humans are not needed for any "social" media anymore, AI will generate slop for bots without any human interaction in an endless cycle.
by fersarr on 9/30/25, 8:58 PM
by _ZeD_ on 10/1/25, 4:30 AM
by ambicapter on 9/30/25, 7:11 PM
by umrashrf on 9/30/25, 10:47 PM
by sudohalt on 9/30/25, 6:05 PM
by egeres on 9/30/25, 8:20 PM
by apetresc on 9/30/25, 7:52 PM
by MangoToupe on 9/30/25, 6:39 PM
I guess copyright is pretty much dead now that the economy relies on violating it. Too bad those of us not invested into AI still won't be able to freely trade data as we please....
by dolebirchwood on 9/30/25, 9:21 PM
It's technically impressive, but all so very soulless.
When everything fake feels real, will everything real feel fake?
by LocalH on 10/1/25, 4:40 AM
by type0 on 10/2/25, 9:18 AM
by ionwake on 9/30/25, 7:38 PM
by yahoozoo on 9/30/25, 10:38 PM
by gainda on 9/30/25, 7:00 PM
it doesn't spark optimism or joy about the future of engaging with the internet & content which was already at a low point.
old is gold, even more so
by CSMastermind on 9/30/25, 7:46 PM
by groos on 9/30/25, 10:28 PM
by dragonwriter on 9/30/25, 6:55 PM
I think feeling like you need to use that in marketing copy is a pretty good clue in itself both that it's not, and that you don't believe it is so much as desperately wish it would be.
by bovermyer on 9/30/25, 7:14 PM
by deng on 9/30/25, 6:46 PM
I know, I know. Most people don't care. How exciting.
by dweekly on 9/30/25, 5:26 PM
I feel like this is the ultimate extension of "it feels like my feed is just the artificial version of what's happening with my friends and doesn't really tell me anything about how they're actually faring."
by dwa3592 on 9/30/25, 7:16 PM
The point is that the Sora 2 demo videos seemed impressive, but I just didn't feel any real excitement. I'm not sure who this is really helping.
by S0und on 9/30/25, 5:17 PM
by m3kw9 on 9/30/25, 6:05 PM
by beernet on 9/30/25, 5:13 PM
by marcofloriano on 9/30/25, 7:17 PM
So much visual power, yet so little soul power. We are dying.
by ChrisArchitect on 9/30/25, 5:49 PM
by ath3nd on 9/30/25, 7:16 PM
Absolutely cooked.
After the disaster that was chatGPT4.001, study mode, and now this: an impossibly expensive to maintain AI video slop copyright violator. Their releases are uninspired and bland, and smell of desperation.
Making me giddy for their imminent collapse.
by pton_xd on 9/30/25, 5:21 PM
by iLoveOncall on 9/30/25, 7:37 PM
by mclightning on 9/30/25, 7:35 PM
by tonyabracadabra on 9/30/25, 11:07 PM