In 2000, a relatively unknown entrepreneur at the Intel Developer Forum said he’d like to take the entire Internet, which then existed as bits on hard drives scattered around the world, and put it in memory to speed it up.
“The Web, a good part of the Web, is a few terabits. So it’s not unreasonable,” he said. “We’d like to have the whole Web in memory, in random access memory.”
The comment raised eyebrows, but it was quickly forgotten. After all, the speaker, Larry Page, wasn’t well known at the time. Neither was Google for that matter, as the company’s backbone then consisted of 2,400 computers.
Flash forward to today. Google has become one of the world’s most important companies, and 2,400 servers would barely fill a corner of a modern data center. Experts estimate that Google now operates more than 1 million servers. And the Web has ballooned way past a few terabits. Facebook alone has 220 billion photos and juggles 4.5 billion updates, likes, new photos and other changes every day.
But Page’s original idea is alive and well. In fact, it’s more relevant than ever. Financial institutions, cloud companies and other enterprises with large data centers are shifting toward keeping data ‘in memory.’ Gartner even picked In-Memory Computing (IMC) as one of its top ten strategic technology trends for 2013.
Data Center History In the Making
Chalk it up to an imbalance in the pace of change. Moore’s Law is still going strong: the transistor count of microprocessors doubles roughly every two years, and performance keeps climbing with it. Software developers have created analytics that let researchers crunch millions of variables from disparate sources of information. Yet the time it takes a server or a smartphone to retrieve data on behalf of a business or consumer from a storage system deep in the bowels of a cloud company or hosting provider hasn’t decreased much at all.
Then as now, the process involves traveling across several congested lanes of traffic and then searching a spinning, mechanical hard drive. It is analogous to having to go home and get your credit card number every time you want to make a purchase at Amazon from work.
The lag has forced engineers and companies into unnatural acts. Large portions of application code are written today to maximize the use of memory and minimize access to high-latency storage. Likewise, many enterprises use only a small portion of the disk capacity they buy, because confining data to the outer edges of the platters reduces access time. To use another analogy, it is like renting an entire floor in an office building but only using the first fifteen square feet near the elevator so people can get in and out faster during rush hour.
IMC ameliorates these problems by reducing the need to fetch data from disks. A memory fabric based on flash can be more than 53 times faster than one built around disks. Each disk access might cost only a few milliseconds, but multiply that by millions of transactions a day and the delays add up. IMC architectures vary, but they generally combine DRAM, which holds data temporarily, with arrays of flash memory, which is slower than DRAM but persistent and still far faster than disk.
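To see how those milliseconds compound, here is a minimal back-of-envelope sketch in Python; the latency figures and daily access count are assumptions chosen for illustration, not measurements from any particular system.

```python
# Cumulative time spent waiting on storage: disk vs. a flash memory tier.
# All figures are illustrative assumptions, not measured values.

DISK_LATENCY_S = 0.010         # ~10 ms per random disk access (assumed)
FLASH_LATENCY_S = 0.0002       # ~0.2 ms per flash access (assumed)
ACCESSES_PER_DAY = 10_000_000  # hypothetical daily workload

disk_wait = DISK_LATENCY_S * ACCESSES_PER_DAY
flash_wait = FLASH_LATENCY_S * ACCESSES_PER_DAY

print(f"Waiting on disk:  {disk_wait / 3600:.1f} hours per day")
print(f"Waiting on flash: {flash_wait / 3600:.1f} hours per day")
print(f"Speed-up: {disk_wait / flash_wait:.0f}x")
```

Under those assumptions, a day’s worth of storage waits shrinks from roughly 28 cumulative hours to well under one, the same order of improvement as the flash-versus-disk figure above.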
The shift will have a cascading effect. Moving from drives to flash lets developers cut many of those workaround lines of code from their applications. In turn, that means fewer product delays and maintenance headaches.
The Future of In-Memory Computing
Some companies have already adopted IMC concepts. Social network Tagged.com was architected under the assumption that it would always retrieve data from the memory tier. SAP’s HANA database addresses data held in memory rather than fetching it from disk. Oracle is making a similar shift with Exadata, now combining DRAM and flash into a ‘memory tier.’ To SAP and Oracle, the Rubicon has been crossed. In tests, HANA has processed 1,000 times as much data as conventional databases in half the time. IMC will usher in an entirely new programming model and, ultimately, a new business model for software companies.
With IMC-based systems, your data center would go on a massive diet. Right now, servers in the most advanced data centers are sitting around with nothing to do because of latency: even Microsoft admits servers are in use just 15 percent of the time. Think of it: 85 percent of your computing cycles go to waste because the servers are waiting for something to do. That is a massive amount of excess overhead in hardware, real estate, power consumption and productivity.
We did some calculations on what would happen if you redesigned a data center around memory-based storage systems. You could store 40 times as much data in the same floor space. It takes four racks of disk storage to build a system capable of 1 million IOPS, or input/output operations per second; a flash-based storage system needs only one shelf. Energy consumption would drop by 80 percent, since memory-based systems consume less power and require fewer air conditioners.
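The rack-versus-shelf comparison can be reproduced with similar back-of-envelope arithmetic. In the sketch below, every per-device figure is an assumption chosen for illustration, not a specification from the original calculation.

```python
# Rough sizing math behind the "four racks vs. one shelf" comparison.
# Per-device IOPS and packing densities are assumptions for illustration.
import math

TARGET_IOPS = 1_000_000

HDD_IOPS = 400               # short-stroked 15K RPM drive (assumed)
HDDS_PER_RACK = 700          # drives per dense disk rack (assumed)

FLASH_DEVICE_IOPS = 100_000  # one flash module or SSD (assumed)
FLASH_PER_SHELF = 20         # devices per storage shelf (assumed)

hdds = math.ceil(TARGET_IOPS / HDD_IOPS)
racks = math.ceil(hdds / HDDS_PER_RACK)

flash_devices = math.ceil(TARGET_IOPS / FLASH_DEVICE_IOPS)
shelves = math.ceil(flash_devices / FLASH_PER_SHELF)

print(f"Disk:  {hdds} drives, about {racks} racks")
print(f"Flash: {flash_devices} devices, about {shelves} shelf")
```

With those assumptions the disk build lands at roughly four racks while the flash build fits on a single shelf; the power savings follow from the far smaller device count and the reduced cooling load.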
The metrics around in-memory computing will continue to get better. In the future, it may be possible to build systems holding hundreds of petabytes, enough to store all of the printed material ever produced five times over. All of that data would be instantly available to applications, allowing faster and more accurate decision making.
A shift to In-Memory Computing will allow Big Data analytics to sing. Think again about how IMC requires software to be reworked: stripping out excess code accelerates performance, and speed is crucial for predictive analytics to succeed. The Internet of Things, where inanimate objects and sensors collect data about the real world all the time, will become manageable. You will know what’s going on in near real time rather than waiting around.
This post was originally published on Forbes.com