> For the past few decades, building a datacenter has been a well-understood, predictable exercise in utility engineering.
> In modern AI clusters, the network is no longer just infrastructure sitting beneath compute
It always makes me smile when someone presents these kinds of topologies as "New", "Modern A.I" or anything remotely "Revolutionary".
The HPC domain and supercomputers in general have been doing RDMA networking centered around "all-to-all" and "all-reduce" operations for at least 3 fucking decades now.
Those operations are the main reason supercomputers are almost always built around stupidly complex Torus or Dragonfly network topologies.
The only difference now is that it switched from "This niche thing 3 nerds were using for weather simulations" to "this cool thing any hyperscaler NEEDS to have for A.I".
This is the third time I've seen a website with this styling (serif, yellow and white on black). What's going on? Is it a template or some AI-induced convergence?
Default output of Claude Code. Another obvious example is https://trumprx.gov/, with the beige background that's kinda close to the Hacker News one (to my eyes at least).
I can do "ctrl + +" to increase the font, but it's still serif and low contrast, so I have to do "ctrl + A". Or better yet, press "reader view" in Firefox.
We're two data center networking engineers who've spent years designing and operating data center infrastructure for governments, telcos and banks in West Africa. This piece came out of our work on a new AI architecture based on associative memory rather than transformers. The GPU-free argument here is how we think about the next phase of AI networking. Happy to discuss it further.
PS: Taking a look at our manifesto (https://almartis.xyz/) can help with more context.
Yes, read that. What these people are talking about seems to be replacing the training of NNs with something else entirely. The big question is, does that work? At all?
It's premature to discuss network architecture until that basic question is answered.
I'm maybe 10% of the way in but I find I'm increasingly skeptical. If the basic building block dates back to the 1970s, haven't other people tried this by now? If not, isn't the first order of business to throw together a prototype that solves MNIST or one of the many other small datasets floating around out there as a proof of concept?
So unfortunately I'm inclined to assume this is empty conjecture shat out by an LLM. Because who would write something up in this much detail rather than typing `import numpy as ...` and going to town?
I'll also note that the document has all the usual crank signs. Lots of grand visions, hypotheses, and expounding at an overly high level on how various things work with hardly anything concrete.