The Missing Piece of the Computing Revolution

Porter & Co.

The CPUs Powering Today’s Parallel-Processing Boom

Transforming Data Centers Into Supercomputers

For the past three weeks in The Big Secret on Wall Street, we have detailed, like no one else has, the technological transformation that is driving artificial intelligence, machine learning, and so much more. We call this big story The Parallel-Processing Revolution.

Next week, we will pause for an Independence Day special report. Then on July 12, we’ll dig deeper into the exponential transformations taking place in computing right now. And on July 19, we will conclude the series with a look at the energy and data-storage needs required to propel this revolution over the next few decades.

In honor of Porter & Co.’s second anniversary, we’re making this vital six-part series free to all our readers. You’ll get a high-level view of this tech revolution and receive several valuable investment ideas along the way. However, detailed recommendations and portfolio updates will be reserved for our paid subscribers. If you are not already a subscriber to The Big Secret on Wall Street, click here.

Here are links to Part 1, which provides a series overview and tells how chipmaker Nvidia led the industry’s switch to parallel computing; Part 2, which chronicles the two near-monopoly companies that power global chip production; and Part 3, which compares today’s investment opportunities with those of the internet boom 25 years ago.

We share Part 4 of this series with you below.

No one insulted Sir Alan Sugar’s pet robot and got away with it.

Especially not cheeky technology journalist Charles Arthur… who’d dared to suggest, in a 2001 article in the British newspaper The Independent, that Lord Sugar’s latest invention was a “techno-flop”… difficult to operate and even harder to sell.

(Lord Sugar with his pet robot)

Sugar’s brainchild, the E-M@iler – a clunky landline phone with an attached computer screen that allowed you to check your email – wasn’t flying off the shelves of British electronics retailers. With smartphones still years away, the idea of browsing emails on your telephone just seemed bizarre.

And the E-M@iler – despite its appealing, friendly-robot presence – had plenty of drawbacks: users were charged per minute… per email… and were bombarded by non-stop ads on the tiny screen. Customers’ phones were auto-billed each night when the E-M@iler downloaded the day’s mail from the server. (By the time you’d owned the phone for a year, you’d paid for it a second time in hidden fees and charges.)

But scrappy Sir Alan, who’d grown up in London’s gritty East End and now owned England’s top computer manufacturer, Amstrad, wasn’t about to admit that he’d pumped £6 million, and counting, into a misfire.

Sugar wasn’t afraid to fight dirty. And he had a weapon that Charles Arthur, the technology critic, didn’t: 95,000 email addresses, each one attached to an active E-M@iler.

On an April morning in 2001, Lord Sugar pinged his 95,000 robots with a mass email. It was a call to arms.

“I’m sure you are all as happy with your e-mailer as I am,” he wrote. “The other day… the technology editor of the Independent said that our e-mailer was a techno-flop… It occurred to me that I should send an email to Mr Charles Arthur telling him what a load of twaddle he is talking. If you feel the same as me and really love your e-mailer, why don’t you let him know your feelings by sending him an email.”

In a move that would likely get him in legal trouble today, Sugar included the offending critic’s email address… then sat back and waited.

A week or so later, the unsuspecting Charles Arthur returned from vacation to find his inbox bursting with 1,390 messages.

But – in a surprise turn of events for Lord Sugar – most of the missives were far from glowing endorsements of the E-M@iler…

Instead, scores of ticked-off robot owners took the opportunity to complain to the press about their dissatisfaction with their purchase. “This is by far the worst product I have bought ever.” “I won’t be getting another one, I think they’re crap.” “This emailer I am testing is the second one and I now know that it performs no better than the first one.” “I WANTED TO SAY THAT I’M NOT HAPPY WITH MINE!!!” And so on.

The whole story – including the flood of bad reviews – made the news and made Lord Sugar, probably rightly, out to be an ass.  

And the bad news kept piling up for him. Over the next five years, the E-M@iler continued to disappoint. In 2006, the price was cut from £80 to £19, with Amstrad making a loss on every unit. After a string of additional financial and operational misfires, Lord Sugar sold Amstrad to British Sky Broadcasting for a fire-sale price of £125 million… a far cry from its peak £1.3 billion valuation in the ’80s.

Fortunately for Sir Alan Sugar, he found a fulfilling second career post-Amstrad. Today, he serves as Britain’s answer to Donald Trump, yelling “You’re fired!” at contestants on the long-running British version of the TV show The Apprentice.

In a way, though, the E-M@iler fired Lord Sugar first.

It was an ignominious finish for a company that, in the 1980s, had held a 60% share of the home-computer market in England.

And in the end, Sugar’s most lasting contribution to computer technology wasn’t even something he created directly. Today, we have the super-computer chip that’s the “brains” of the Parallel Processing Revolution… largely because Lord Sugar got in a fight.

A Call to ARM1

From his childhood days hawking soda bottles on the streets of London, Alan Sugar was a hustler. He launched his technology company, Amstrad (short for Alan Michael Sugar Trading), selling TV antennas out of the back of a van in 1968, then graduated to car stereos in the ’70s and personal computers in the ’80s.

With the 1980s personal-computer boom – as technology advanced and formerly massive mainframes shrank to desktop size – came fierce competition…

On the American side of the pond, Apple and Microsoft jockeyed for the pole position (a dance that still continues today). The English home-computer arms race wasn’t as well known outside of trade publications… but in the early ’80s, British nerds would have drawn swords over the burning question of “Acorn or Sinclair.”

Acorn (known as the “Apple of Britain”) was the more sophisticated of the two, with a long-standing contract to make educational computers for the BBC (British Broadcasting Corporation), while Sinclair traded on mass appeal, producing Britain’s best-selling personal computer in 1982. (Their feud was dramatized in Micro Men, a 2009 BBC drama.)

Into this fray waded – you guessed it – Sir Alan Sugar, who knew nothing about programming or software but was determined to propel his protean tech company, Amstrad, to the top of the industry. He did this in the simplest way possible: He bought the more popular combatant, Sinclair, wholesale in 1986.

Just as he’d hoped, the existing Sinclair product line – plus a successful monitor-keyboard-printer combo designed by Sugar himself – propelled Sugar’s company to a £1.3 billion valuation by the mid-’80s.

Tiny £135 million Acorn, left out in the cold, couldn’t compete. Or could it?

Acorn fought valiantly before selling a controlling stake to the Italian firm Olivetti and ultimately bowing out of the computer business in the late ’90s. And out of that effort grew the super-speedy chip that, today, powers the Parallel Processing Revolution.

Acorn’s proprietary chip, called ARM1, was based on a “reduced instruction set computer” architecture, or RISC. It was 10 times faster than the “complex instruction set computer” (CISC) CPU chips in Lord Sugar’s computers – and it could be manufactured more cheaply, too. The ARM1-powered computer retailed at one-third the price point of a mainstream PC.
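To see the difference in miniature, consider the classic textbook contrast between the two designs (the instruction names below are invented for illustration – they’re not real ARM or x86 instructions):

```python
# Illustrative only: invented mnemonics, not real ARM or x86 instructions.

# CISC: one complex, memory-to-memory instruction does everything at once.
cisc_program = [
    "MULT a, b",        # fetch a and b from memory, multiply, write back
]

# RISC: the same work broken into simple, fixed-format steps. Each runs
# in roughly one clock cycle, which makes the processor easier to
# pipeline – and cheaper to design and manufacture.
risc_program = [
    "LOAD  r1, a",      # pull operand a from memory into a register
    "LOAD  r2, b",      # pull operand b into a second register
    "MUL   r1, r2",     # multiply the two registers
    "STORE r1, a",      # write the result back to memory
]
```

Fewer kinds of instructions meant simpler circuitry – which is why Acorn’s chip could be both faster and cheaper to make.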

But speedy or not, Acorn’s specialized computers couldn’t best Lord Sugar’s mass-produced PCs. Unlike Sugar’s IBM-compatible computers, Acorn’s machines couldn’t run the popular Microsoft Windows operating system.

Without Windows – and without Lord Sugar’s flair for headline-grabbing chaos – the ARM1-powered computer launched with little fanfare, selling just a few hundred thousand units over the next several years.

But long after Lord Sugar boxed up his E-M@ilers and joined the set of The Apprentice… and long after Acorn cashed in its chips and closed its doors… the ARM chip design has survived and thrived, thanks to a spinoff partnership with Apple.

And ultimately, that tiny piece of silicon developed into the “brain” that drives Nvidia’s supercomputing architecture… and powers the Parallel Processing Revolution today.

Remarkably – like the GPU (graphics processing unit) monopoly we explored in our June 7 issue – the intellectual property behind this chip technology is controlled today by just one company: ARM Holdings (Nasdaq: ARM). While it doesn’t get much attention in the media, ARM is poised to become one of the biggest winners from today’s parallel-computing revolution.

The Company That Ties It All Together

In our June 7 issue, which kicked off our Parallel Processing Revolution series, we explained how Nvidia is far more than just an artificial intelligence (“AI”) chipmaker. Over the last two decades, the company has laid the foundation for a technological revolution – one that’s now changing the very concept of what a computer is. 

The combination of Nvidia’s super-powered GPUs, which greatly increase the processing speed of computing, and its CUDA (Compute Unified Device Architecture) software platform unlocked the parallel-processing capacity required for training the large language models (“LLMs”) powering today’s AI revolution. But that was just the start. The massive computational workloads of training LLMs – which ingest huge amounts of data in order to generate human-like text – presented a new challenge beyond computing speed. The process of training LLMs to recognize patterns across huge swaths of data created an explosion in memory demand.
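To picture what parallel processing means in practice, here’s a toy sketch in Python, using the standard multiprocessing module as a stand-in for what a GPU does across thousands of cores at once (squaring numbers here stands in for the matrix math of model training):

```python
from multiprocessing import Pool

def do_work(x):
    # Stand-in for one small, independent unit of work (in AI training,
    # a slice of a much larger matrix multiplication)
    return x * x

if __name__ == "__main__":
    data = list(range(100_000))

    # Serial: a single worker grinds through every element in turn
    serial = [do_work(x) for x in data]

    # Parallel: the same work split across eight workers at once, the
    # way a GPU spreads it across thousands of cores
    with Pool(processes=8) as pool:
        parallel = pool.map(do_work, data)

    assert serial == parallel  # same answer, computed in parallel
```

The answer is identical either way – the revolution is in how many workers attack the problem at the same time.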

Consider the number of internal settings, known as parameters, that define today’s cutting-edge LLMs like the models behind ChatGPT. The first model, GPT-1, was trained with 120 million parameters in 2018. Each new iteration grew exponentially: GPT-2 reached 1.5 billion parameters in 2019, followed by 175 billion for GPT-3 in 2020. The parameter count of the latest iteration, GPT-4, hasn’t been disclosed, but experts estimate approximately 1.7 trillion – a roughly 10x jump over GPT-3, and more than 10,000 times the size of the original model, in a matter of a few years.
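To put those jumps in perspective, here’s a quick back-of-the-envelope calculation (remember, the GPT-4 figure is an outside estimate, not a disclosed number):

```python
# Approximate parameter counts by model generation
# (GPT-4's count is an industry estimate, not an official figure)
params = {
    "GPT-1 (2018)": 120e6,
    "GPT-2 (2019)": 1.5e9,
    "GPT-3 (2020)": 175e9,
    "GPT-4 (est.)": 1.7e12,
}

baseline = params["GPT-1 (2018)"]
for model, count in params.items():
    print(f"{model}: {count:>17,.0f} parameters "
          f"(~{count / baseline:,.0f}x GPT-1)")
```

Run it, and GPT-4’s estimated 1.7 trillion parameters come out to more than 14,000 times the size of the original 2018 model.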

LLMs must hold mountains of data in memory while those parameters are adjusted during training. In computer science, memory capacity is measured in bits and bytes. A bit is the smallest unit of data: a single 1 or 0 in binary code. A byte is a group of eight bits, enough to store a single character such as a letter or number. One trillion bytes make up one terabyte – and hundreds of terabytes are required to train today’s most advanced LLMs.
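To see how those units stack up, here’s a minimal sketch that converts a parameter count into raw memory, assuming each parameter is stored as a two-byte (16-bit) number – a common precision in AI training, though the exact choice varies:

```python
BYTES_PER_PARAM = 2        # assumption: 16-bit (2-byte) precision
PARAMS_GPT4_EST = 1.7e12   # estimated GPT-4 parameter count

weights_tb = PARAMS_GPT4_EST * BYTES_PER_PARAM / 1e12
print(f"Model weights alone: ~{weights_tb:.1f} TB")  # ~3.4 TB

# Training also keeps gradients, optimizer state, and intermediate
# activations in memory, multiplying that footprint many times over,
# which is how the totals climb into the hundreds of terabytes.
```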

The challenge: even cutting-edge GPUs, like Nvidia’s H100, hold only a small fraction of the memory needed for training today’s LLMs – just about 80 gigabytes (one gigabyte is roughly one billion bytes) of memory inside each individual H100 chip.
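To make the gap concrete, here’s a rough sketch (the 100-terabyte training footprint is an illustrative assumption, not a measured figure):

```python
H100_MEMORY_GB = 80    # memory on a single Nvidia H100 chip
TARGET_TB = 100        # assumed training footprint (illustrative)

chips_needed = TARGET_TB * 1_000 / H100_MEMORY_GB
print(f"H100s needed just to hold the data: {chips_needed:,.0f}")  # 1,250
```

No single chip can come close – which is why the industry’s answer was to wire many chips together.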

In the data-center architectures of just five years ago, systems engineers who needed more memory would link multiple chips together via Ethernet cables. This hack was sufficient to handle most data-center workloads before the age of AI.

But things have changed dramatically in those five years. Now, hundreds of terabytes of memory storage and transmission capacity are needed to power data centers. Almost no one in the industry anticipated that level of change would happen so quickly – but Jensen Huang did. As far back as 2019, the Nvidia CEO foresaw the future need to connect not just a few chips, but hundreds of chips in a data center. Huang reimagined the data center from a series of compartmentalized chips working independently on different tasks, to a fully integrated supercomputer, where each chip could contribute its memory and processing power toward a single goal – training and running LLMs.

In 2019, only one company in the world had the high-performance cables capable of connecting hundreds of high-powered data-center chips together: Mellanox, the sole producer of the InfiniBand cables (mentioned in “The Big Bang That No One Noticed”, on June 7). That year, Nvidia announced its $6.9 billion acquisition of the networking products company. During a conference call with journalists discussing the acquisition, CEO Huang laid out his vision for this new computing architecture:

Hyperscale data centers were really created to provision services and lightweight computing to billions of people. But over the past several years, the emergence of artificial intelligence and machine learning and data analytics has put so much load on the data centers, and the reason is that the data size and the compute size is so great that it doesn’t fit on one computer… All of those conversations lead to the same place, and that is a future where the datacenter is a giant compute engine… In the long term, I think we have the ability to create data-center-scale computing architectures.

David Rosenthal, co-host of the tech podcast Acquired, called Nvidia’s purchase of Mellanox “one of the best acquisitions of all time.” It provided the missing link the chipmaker needed to harness the power of hundreds of data-center chips in a massive and explosively fast architecture. Less than four years after the acquisition, Huang’s supercomputing vision became a reality, in the form of the Grace Hopper Superchip architecture, which combines the Hopper GPU (based on Nvidia’s H100 chip) and the Grace CPU (more on this below).

The big breakthrough in combining these was Nvidia’s ability to package Mellanox InfiniBand technology into its proprietary NVLink data-transmission cables. The NVLink system in the Grace Hopper architecture transmits memory data nine times faster than traditional Ethernet cables. This enabled Nvidia to connect up to 256 individual Grace Hopper chips together and tap into the full memory bank of both the Hopper GPU and the Grace CPU.
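As a rough sketch of what that pooling buys, here are the approximate per-chip memory figures from Nvidia’s published Grace Hopper specifications (treat the exact numbers as approximations):

```python
GPU_HBM_GB = 96      # approx. high-bandwidth memory on the Hopper GPU
CPU_RAM_GB = 480     # approx. LPDDR5X memory on the Grace CPU
CHIPS = 256          # maximum NVLink-connected Grace Hopper superchips

pool_tb = CHIPS * (GPU_HBM_GB + CPU_RAM_GB) / 1_000
print(f"Shared memory pool: ~{pool_tb:.0f} TB")  # roughly 147 TB
```

That pooled figure lands squarely in the hundreds-of-terabytes range that training today’s largest models demands.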

It was a major step in the Parallel Processing Revolution.

Read the full article here >>

Written by Porter & Co.

At Porter & Co., a boutique investment research firm led by Porter Stansberry, we provide astute market analysis and insight to help our subscribers create lasting wealth.

Porter & Co. is built around four fundamental "P's": Permanence. We build long-term wealth. Prescience. We see big stories first. Passion. We're committed to excellence. Partnership. We're committed to you.
