Why current consoles use X86-64 hardware – Part 2
Last time we started a topic on the Cell processor. This time we'll close out the Cell, and discuss ARM and PowerPC.
Cell Processor (continued)
SPEs (Synergistic Processor Elements) within the Cell processor are basically mini vector processors. They use the same in-order design as the PPE. To get the absolute most out of the PS3, developers such as Naughty Dog needed to maximise the power of not only the PPE but also the SPEs. They are also SIMD (Single Instruction, Multiple Data) units – which should sound familiar if you read my previous article, because it is almost identical to how modern GPUs work. SIMD is well suited to graphical tasks and physics alike, and these are exactly the kinds of tasks the PS3's Cell processor was designed for. The best example I can think of is one of the final hurrahs for the PS3, The Last of Us: that game has graphical fidelity that completely obliterates anything the Xbox 360 could muster.
Each SPE has its own 256KB of local store. Careful use of this is essential to get the most out of the SPEs.
All these reasons contributed to why the PS3 was so expensive at launch.
ARM Processors
(I don’t really wanna talk about ARM but for the sake of comparison I have to)
ARM is quite possibly one of the most efficient widely licensed architectures. ARM is a company that designs processors and their dies and licenses them to whoever pays. (Just Wikipedia them; the details aren't relevant here.)
8GB of RAM was desired for this current generation, but due to cost constraints and marketing ***, this was replaced by cheaper and in some ways better RAM, such as specialised SDRAM with incredible bandwidth. But ARM falls into the same category as other architectures: it just wasn't easy enough to implement, and so it was ditched.
PowerPC: (ughh)
PowerPC was enormously popular in the last generation. Not only did the PS3 use Cell (which is based on PowerPC instructions), the Xbox 360 of course used PowerPC too. PowerPC was also very popular for Apple's Macs, but they too have abandoned PowerPC and gone the Intel x86 route.
Cost is said to have been one of Apple's primary motivators, but I suspect that being much more compatible with Windows software (with many Macs shipping with Windows partitions) didn't hurt either. The Cell processor is in some ways very similar to PowerPC – indeed, the Cell's PPE and a PowerPC core are two sides of the same coin.
This came with the benefit of being easier to program for, but it still wasn't ideal. One can guess at the numerous reasons MS decided to abandon it.
I'd imagine wanting to run a version of Windows easily would be one (it could have been compiled for the PowerPC architecture, of course, but that would no doubt have added complexity). I suspect they also wanted to reduce production costs and go with a solution that was already available.
PowerPC is a RISC CPU, and as we discussed earlier, that means some instructions are simply missing. A programmer can get around that with more code – but for games development and graphics libraries, it is probably not the way they wanted to go.
Why the current generation switched to x86-64
To wrap up: why did the most recent generation of consoles decide to use x86-64?
As with all business decisions in the world, it comes down to money and money alone. Sony and Microsoft want developers to be able to port their titles across consoles and platforms with as little overhead as possible, to maximise profit.
The APU design in the Jaguar means that both systems can use the GPU and CPU interchangeably, sharing memory. It also allows the easiest possible porting to the likes of the PC, where the GPU and CPU are usually separate: because the APU can be treated as both a GPU and a CPU, developers can more easily optimise for the PC as well.
Update from Wololo: it has come to my attention that this article was heavily "inspired" by an article on redgamingtech.com. Author AGCrown has admitted the plagiarism and will not be writing for our site anymore. I am keeping this article up because I hate sites that silently make issues go away rather than face them, but the original authors are welcome to contact me if this is a problem.

Nice Post Most Informative!…..Good Job. MORE PLEASE!!!
Please stop writing these articles. There are so many false statements in these, I don't even know where I should begin correcting them. Instead I'm just going to say this: Kill it, kill it with fire!
Instead of b*tching about how wrong something is, why don’t you write your own articles and show us how awesome you are at getting everything right, Wololo does let people write for the blog…
It is true that there are some mistakes, but to a first approximation it is OK.
The Cell was especially complex to program because it didn’t have a cache (as mentioned in the article) but a scratchpad. It means that the programmer could not access the global memory directly but had to move data with DMA operations into that scratchpad memory.
The ARM ISA and architecture are proprietary and sold as IP blocks. The only non-proprietary but decent ISA/architecture I know of is RISC-V. The previous article mentioned that ARM is RISC. That was true 20 years ago. Now it is a freaking mess. ARMv8 tries to fix that, but traditional 32-bit ARM is a *** mess.
The AMD GPUs are really good, especially with their asynchronous compute and HSA. The CPU cores are weak, but they are supposed to only do latency critical operations and synchronizations/scheduling. Almost everything else is supposed to be offloaded to the GPU.
ARM is no longer RISC, dude.
RISC means a 4-stage pipeline design (and maybe an added SIMD engine).
But now ARM does (today) what x86 does => implementing hundreds and hundreds of special instructions, like encryption and others.
ARM is no longer RISC. Why do you think ARM cannot go higher than 2-3 GHz and then overheats? Magic?
No dude. It's because => the more complex a CPU gets, the LESS performance it has 😉 Natural logic.
A real RISC is the CPU you have in your router at home. It's a 4-stage pipeline design. It gets as hot as ~100°C.
While today's typical ARM smartphones don't get 100°C hot lol.
Smartphones FAIL at temperatures higher than 70-80°C.
That's the typical temperature at which today's x86 CPUs also shut off (AMD) or clock down (Intel). Intel itself shuts its CPU cores down at exactly 85°C.
PowerPC AND RISC work at much higher temperatures. More like 100-110°C.
Which is why it's called RISC => it does more per clock at the same MHz than an x86/ARM CPU.
The ARM architecture has another problem => it cannot advance. That's why we don't have 8-core smartphones today with a real 8 cores in them that you can use all at once 😉 see?
Or do you see any 16-32 core smartphones today? I don't! After 4 cores... the advancements stopped...
Well Said Gregory Rasputin!!…
Nah, he isn't.
Is it just me or do his articles really suck? Either post some quality stuff or let it be.
Ok. These articles are garbage.
Firstly, Apple switched to x86 because, at that time, IBM could not create efficient notebook processors, and Apple was strengthening their MacBook portfolio.
“8GB of RAM was desired for this current generation but due to cost constraints and marketing ***, this was replaced by cheaper and in some ways better RAM like special SDRAM with incredible bandwidth speeds.”
What do you want to tell us? That ARM cannot use 8GB of RAM? That current consoles have SDRAM in them instead of DDR RAM? That current ARM devices use SDRAM?
The only reason consoles use x86-64 this generation is because AMD had the best performance/cost ratio of the bidders. MS and Sony wanted APUs with unified memory between GPU and CPU. Imagination could not deliver the performance, and nVidia could not deliver it in 2014. So AMD remained.
The problem is => the PS4/Xbox One both only have ~16 Gflops across all 8 cores combined. So only 14 Gflops for GAMES themselves.
And that's the same as using a smartphone with 4 cores @ 2 GHz each 😉 that also does 16 Gflops. And that's mobile hardware.
But more and more points to => the Wii U's CPU being able to do ~150 Gflops. With the coming Unity 5.4, that is. Since it's a CUSTOM PowerPC.
You cannot just write the same code for it as x86 uses.
They say the "laughable" 1.25 GHz cores in the Wii U are as fast as a 3 GHz Core i3 (Haswell/Broadwell) from Intel!
They specifically meant "it's misunderstood and you cannot program it like x86. Each core is about comparable to a mid-range Intel Core i3".
Where did the comments go?
Ops. Wrong article.
Hi AGCROWN,
Are you the author of this article also?
http://www.redgamingtech.com/why-ps4-and-xbox-one-moved-to-x86-64/
If you are not, cite them in your sources and rewrite the material in your own words instead of straight copying a lot of sentences. Anyway, adding to the mistakes pointed out by others, I would note that you cannot simply recompile an OS for a new architecture – it takes considerable effort even with a good hardware abstraction layer. Also, the programmer doesn't have to add more code or care about missing instructions; that is done by the compiler (unless the guy is writing in asm or doing some low-level optimisations).
So, pick some more and better sources and rock on with the third part. Good luck.
Think you might have got him thrown out of his position for his articles. Ha. xD
For the love of $DEITY, please get an editor.
this article and these comments hmmmmmm.
errrmmmm……. if the next article is going to be like this, don’t bother!
Never show emotion or favoritism in an article, and always stick to research, not what you like or don't like.
I don't care if you like AMD since they run in Jaguar, BMW and Lotus ECUs, or if you prefer Apple over Blackberry. This article is ridiculous.
The problem with today's ARM architecture?
It only gets ~4-5 Gflops (max!) from an ARM A72 core running @ 2.5-3 GHz. Even with 4 cores @ 2.5-3 GHz you still don't get more than ~15-20 Gflops. And that's the tip of the iceberg, since they can't run at e.g. 4 GHz or 5 GHz like x86 CPUs can.
It overheats when clocked higher. There it is: a big "dark silicon" problem too. Which leads to => the ARM architecture has NO FUTURE. It is too complex, and any further attempt to overclock it leads to overheating its SRAM cache or other parts (which then leads to malfunctioning => unwanted behaviour).
Just as all x86 CPUs today (manufactured at 32nm or smaller) have problems with dark silicon effects, ARM having those problems in newer smartphones introduced in 2016 is nothing new.
Dark silicon is everywhere. No matter what device you use (computers, x86 consoles, smartphones, or smartwatches 😉.
And the Wii U is the ONLY console this gen which has no dark silicon problems and cannot overheat (thanks to custom IBM eDRAM, which helps make the general PowerPC architecture much faster), since the eDRAM it uses draws 5x less energy and needs much less space – which protects the architecture from getting hotter than 90-110°C (the typical temperature for RISC 4-stage pipelines clocked very high).
So yes, if you didn't get it by now:
This means => the only SOLUTION to dark silicon is to implement eDRAM from IBM's server segment, instead of using space-wasting and therefore overheating SRAM cache cells.
That's the problem of "dark silicon" today => too many transistors in too little space.
And IBM made that solution for Nintendo, since there was no other way to solve it.
So what happens then if you use eDRAM?
Your eDRAM cache cells draw much less energy (I think it's 1/5th of a typical SRAM cache cell), they need much less space, and you can place them ANYWHERE on the chip (which is how the Wii U's Espresso works => the HOT PowerPC CPU cores are on the RIGHT, and the very important, must-not-overheat eDRAM cache cells are on the LEFT of the chip => a perfect solution against overheating).
And yes => since eDRAM draws much less energy, that also means the chip cools down and doesn't overheat (which is the problem of today's x86/ARM CPUs).
See? That's how it goes. The only problem with the IBM eDRAM used in servers (normally)?
It costs mucho money. They say 10 dollars for a 1 MB segment of eDRAM cells. So it's 30 dollars for 3 MB of eDRAM.
And the more of that cache you need, the more expensive it gets. So 10 MB is 100 dollars (for example).
Today's supercomputers use ~100 MB of that cache. So, times 10, it costs 1000 dollars alone to get 100 MB of WORKING eDRAM into such a server CPU.
See? You only have to know one thing today: it costs ~50% of the DIE SPACE of a typical AMD Jaguar (48mm² for an 8-core module) to implement normal SRAM cache cells there. So the cache on today's CPUs occupies 50% of the die space.
While on the Wii U's IBM Espresso the cache only occupies ~3 or 6mm² out of 30mm². So that's only 1/5th or even 1/10th.
They also said (when designing the CPU) => a lot of the space on IBM Espresso is EMPTY.
Now we know why => it would overheat if everything were crunched together as closely as possible.
See? The SMALLER a CPU gets, the hotter it gets if it drains the same amount of energy.
The BIGGER a CPU die gets, the LESS energy it drains per mm².
So if you ask me, the only FUTURE for a CPU right now is PowerPC. Everything else... => forget about it lol... There is no future for x86.
Intel will simply abandon x86 pretty soon, since there are no more advances to make. They simply don't try to make x86 faster anymore.
They are only interested in making it need a little less energy now. But that's it. Skylake is already slower in certain games than e.g. Broadwell.
And THAT tells you that dark silicon is taking place everywhere...