by jarbus on 10/23/25, 10:15 AM with 42 comments
by pjmlp on 10/23/25, 11:55 AM
> Monarch is split into a Python-based frontend, and a backend implemented in Rust.
Other than that, it looks like quite an interesting project.
by alyxya on 10/23/25, 12:33 PM
As for potential performance losses here, one thing I'm wondering is whether custom kernels are supported. I'm also wondering how much granular control there is over communication between different actors calling a function. Overall, I really like this project and hope to see it adopted over multi-controller setups.
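To make the granularity question concrete, here's a minimal sketch of the single-controller actor pattern using Ray's actor API purely as a familiar stand-in (this is not Monarch's actual API, and the custom-kernel call site is my assumption about where such a kernel would live):

    # Actor-per-GPU sketch using Ray as a comparison point;
    # Monarch's real API may differ substantially.
    import ray
    import torch

    ray.init()

    @ray.remote(num_gpus=1)
    class Trainer:
        def __init__(self, rank: int):
            self.rank = rank

        def step(self, x: torch.Tensor) -> torch.Tensor:
            # A custom kernel (e.g. a compiled Triton or CUDA op) would be
            # invoked here like any local op -- whether a framework allows
            # that transparently is the "custom kernels" question above.
            return torch.relu(x.cuda()).cpu()

    # One actor per GPU (assumes 8 GPUs are available); the controller
    # fans a call out to every actor, which is where the granularity of
    # inter-actor communication becomes interesting.
    trainers = [Trainer.remote(i) for i in range(8)]
    results = ray.get([t.step.remote(torch.randn(4)) for t in trainers])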
by porridgeraisin on 10/23/25, 12:58 PM
In case someone who can fix this is reading:
by fadedsignal on 10/23/25, 2:27 PM
- Is this similar to Open MPI?
- How is a mesh established? Do they need to be on the same host?
by semessier on 10/23/25, 5:31 PM
> ...Note that this does not support tensor engine, which is tied to CUDA and RDMA (via ibverbs).
I.e., yet another CUDA-married approach. The issue is not ibverbs: the code shows they use GPUDirect RDMA, and from there it can only get worse, with more CUDA dependencies. OpenUCX would have been an alternative.
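For context, UCX already has Python bindings (ucx-py) where the transport (TCP, shared memory, ibverbs, CUDA IPC) is selected at runtime rather than hard-wired. A rough sketch of a vendor-neutral echo transfer, assuming ucx-py is installed (the port number is an arbitrary example):

    # Vendor-neutral transfer via ucx-py; UCX picks the transport
    # (TCP, shm, InfiniBand verbs, CUDA IPC) at runtime.
    import asyncio
    import numpy as np
    import ucp

    PORT = 13337  # arbitrary example port

    async def server():
        async def handler(ep):
            buf = np.empty(1 << 20, dtype="u1")
            await ep.recv(buf)   # receive into a host buffer
            await ep.send(buf)   # echo it back
            await ep.close()
        listener = ucp.create_listener(handler, PORT)
        while not listener.closed():
            await asyncio.sleep(0.1)

    async def client(host: str):
        ep = await ucp.create_endpoint(host, PORT)
        msg = np.ones(1 << 20, dtype="u1")
        await ep.send(msg)
        echo = np.empty_like(msg)
        await ep.recv(echo)
        await ep.close()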
by bjourne on 10/23/25, 8:25 PM
There is some infamous tech based on the "hiding" paradigm. PHP comes to mind. By hiding how the HTTP request/response cycle actually works, it fostered a generation of web developers who didn't know what a session cookie was, resulting in login systems that leaked like a sieve. Distributed computing is complicated. There are many parameters you need to tweak and many design decisions you need to make for distributed model training to run smoothly. I think explicit and transparent architectures are far better. Distributed model training shouldn't "feel" like running on a single device, because it isn't.
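To make "explicit and transparent" concrete, here is the kind of bare torch.distributed sketch I mean, where the process group, the rank, and every collective are spelled out instead of hidden (assuming the NCCL backend and torchrun's environment-variable rendezvous):

    # Explicit distributed setup: nothing "feels" like a single device.
    # Assumes NCCL and torchrun env vars (MASTER_ADDR, LOCAL_RANK, ...).
    import os
    import torch
    import torch.distributed as dist

    def main():
        dist.init_process_group(backend="nccl")  # explicit rendezvous
        world = dist.get_world_size()
        torch.cuda.set_device(int(os.environ["LOCAL_RANK"]))

        grad = torch.randn(1024, device="cuda")
        # The collective is visible in the code: you choose what gets
        # averaged, when, and over which group -- exactly the tweakable
        # surface I'm arguing should stay exposed.
        dist.all_reduce(grad, op=dist.ReduceOp.SUM)
        grad /= world

        dist.destroy_process_group()

    if __name__ == "__main__":
        main()  # launch with: torchrun --nproc-per-node=8 this_file.py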
by SomaticPirate on 10/23/25, 2:12 PM
Found a few typos. The em dashes make me suspect an LLM was involved in the proofreading.