Under the Hood at Google and Facebook - IEEE Spectrum
The entire article is worth reading for its discussion of Google’s and Facebook’s differing philosophies on datacenter construction and policy, but the part that stood out for me was the approach to servers that both competitors follow, which runs counter to workload-optimized hardware.
Giant data centers—even energy-efficient ones—are, of course, nothing without the proper servers. Facebook will be populating its Oregon and North Carolina locations with custom-designed servers, just as Google has long done.
Facebook’s Amir Michael, manager of hardware design, explains that when the company decided to build its own facilities, “we had a clean slate,” which allowed him and his colleagues to optimize the designs of their centers and servers in tandem for maximum energy efficiency. The result was a server that “looks very bare bones. I call it a ‘vanity-free’ design just because I don’t like people to call it ugly,” says Michael. “It has no front bezels. It has no paint. It has no logos or stickers on it. It really has only what is required.”
Google also keeps server frills to a minimum. Like Facebook, it buys commodity-level computing hardware and just fixes the many pieces that break, instead of purchasing high-end systems that are less prone to failure but also much more expensive. Economics, if nothing else, drove engineers at both companies to similar conclusions here. Fit and finish might count if you’re buying one server or even a hundred, but not when you’re shopping for tens of thousands at a time. And striving for high reliability is a little pointless at this scale, where failure is not only an option, it’s a daily fact of life.
Facebook’s Michael explains that he helped design three basic types of servers for running the Facebook application. The top layer of hardware, connected most directly with Facebook’s many users, consists of outward-facing Web servers. They don’t require much disk space—just enough for the operating system (Linux), the basic Web-server software (which until recently was Apache), the code needed to assemble Facebook pages (written in PHP, a scripting language), some log files, and a few other bits and pieces. Those machines are connected to a deeper layer of servers stuffed with hard disks and flash-based solid-state drives, which provide persistent storage for the giant MySQL databases that hold Facebook users’ photos, videos, comments, and friend lists, among other things. In between are RAM-heavy servers that run a memcached system to provide fast access to the most frequently used content.
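To make the three tiers concrete, here is a minimal sketch of the read path the paragraph above describes: a web-tier function checks the cache tier first and only falls back to the storage tier on a miss. This is an illustration under stated assumptions, not Facebook's actual code. The names `CacheTier`, `StorageTier`, and `fetch` are hypothetical, and plain Python dictionaries stand in for memcached and MySQL.

```python
from typing import Dict, Optional


class CacheTier:
    """Hypothetical in-memory stand-in for the RAM-heavy memcached layer."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> Optional[str]:
        value = self._data.get(key)
        if value is None:
            self.misses += 1
        else:
            self.hits += 1
        return value

    def set(self, key: str, value: str) -> None:
        self._data[key] = value


class StorageTier:
    """Hypothetical stand-in for the disk-backed database layer."""

    def __init__(self, rows: Dict[str, str]) -> None:
        self._rows = dict(rows)
        self.reads = 0  # counts how often the slow tier is touched

    def query(self, key: str) -> Optional[str]:
        self.reads += 1
        return self._rows.get(key)


def fetch(key: str, cache: CacheTier, storage: StorageTier) -> Optional[str]:
    """Web-tier read path: try the cache, fall back to storage on a miss."""
    value = cache.get(key)
    if value is not None:
        return value
    value = storage.query(key)
    if value is not None:
        # Populate the cache so the next request skips the storage tier.
        cache.set(key, value)
    return value
```

With this sketch, calling `fetch` twice for the same key reads the storage tier only once; the second request is served from the cache tier, which is the point of putting RAM-heavy machines in between.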
Alpha geeks will recognize that these pieces of software—Linux, Apache, PHP, MySQL, memcached—all hail from the open-source community. Facebook’s programmers have modified these and other open-source packages to suit their needs, but at the most basic level, they are doing exactly what countless Web developers have done: building their site on an open-source foundation.
Not so at Google. Programmers there have written most of their company’s impressive software from scratch—with the exception of the Linux running on its servers. Most prominent are the Google File System (or GFS, a large-scale distributed file system), Bigtable (a low-overhead database), and MapReduce (which provides a mechanism for carrying out various kinds of computations in a massively parallel fashion). What’s more, Google’s programmers have rewritten the company’s main search code more than once.
Speaking two years ago at the Second ACM International Conference on Web Search and Data Mining, Jeff Dean, a Google Fellow working in the company’s system infrastructure group, said that over the years his company has made seven significant revisions to the way it implements Web search. However, outsiders don’t realize that, because, as Dean explained, “you can replace the entire back end without anyone really noticing.”
How are we to interpret the difference between Google’s and Facebook’s engineering cultures with respect to the use of open-source code? Part of the answer may just be that Google, having started earlier, had no choice but to develop its own software, because open-source alternatives weren’t yet available. But Steve Lacy, who worked as a software engineer for Google from 2005 to 2010, thinks otherwise. In a recent blog post, he argues that Google just suffers from a bad case of not-invented-here syndrome. Many open-source packages “put Google infrastructure to shame when it comes to ease of use and product focus,” writes Lacy. “[Nevertheless, Google] engineers are discouraged from using these systems, to the point where they’re chastised for even thinking of using anything other than Bigtable/Spanner and GFS/Colossus for their products.”