MIT may have just solved all your data center network lag issues

By Jon Gold, Network World | July 17th, 2014

New Fastpass system opts for counter-intuitive central processing architecture.

A group of MIT researchers say they’ve invented a new technology that should all but eliminate queues in data center networks.

The technology will be fully described in a paper presented at the annual conference of the ACM Special Interest Group on Data Communication. According to MIT, the paper will detail a system — dubbed Fastpass — that uses a centralized arbiter to analyze network traffic holistically and make routing decisions based on that analysis, in contrast to the more decentralized protocols common today.

Experiments in Facebook data centers show that a Fastpass arbiter with just eight cores can manage a network carrying 2.2 terabits of data per second, according to the researchers.

Professor Hari Balakrishnan, a co-author of the paper, admitted that this isn’t an intuitive solution to the problem of network lag.

“It’s not obvious that this is a good idea,” he said in a statement.

Another author, graduate student Jonathan Perry, addressed the obvious issue — won’t nodes spend a lot of time communicating with the central hub?

“If you have to pay these maybe 40 microseconds to go to the arbiter, can you really gain much from the whole scheme? Surprisingly, you can,” he said.

The trick, the researchers said, is a new way of dividing up the processing power needed to calculate transmission timings among multiple cores. In essence, Fastpass organizes workloads by time slot, rather than by source and destination pair. Each core gets its own time slot and assigns requests to the first free servers it can find, then passes everything else on to the next core, which follows suit.
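To make the time-slot idea concrete, here is a rough sketch in Python of how such a scheme might work. It is not the researchers' code. The function names (allocate_timeslot, pipeline) are invented for illustration, the cores are simulated one after another rather than run in parallel, and the sketch assumes that each server can send one packet and receive one packet per time slot.

```python
def allocate_timeslot(requests):
    """Greedily fill one time slot.

    requests: list of (source, destination) pairs in arrival order.
    Returns (granted, leftover): the pairs admitted in this slot and the
    pairs handed on to whichever core owns the next slot.
    Assumption: a server sends at most once and receives at most once
    per time slot.
    """
    busy_sources, busy_destinations = set(), set()
    granted, leftover = [], []
    for source, destination in requests:
        if source not in busy_sources and destination not in busy_destinations:
            busy_sources.add(source)
            busy_destinations.add(destination)
            granted.append((source, destination))
        else:
            leftover.append((source, destination))
    return granted, leftover


def pipeline(requests, num_cores):
    """Hand consecutive time slots to cores in round-robin order.

    Returns a list of (slot, core, granted_pairs) tuples.
    """
    schedule = []
    pending = list(requests)
    slot = 0
    while pending:
        core = slot % num_cores  # the core that owns this time slot
        granted, pending = allocate_timeslot(pending)
        schedule.append((slot, core, granted))
        slot += 1
    return schedule


if __name__ == "__main__":
    demo = [("A", "B"), ("A", "C"), ("D", "B"), ("D", "C")]
    for slot, core, granted in pipeline(demo, num_cores=8):
        print(f"slot {slot} (core {core}): {granted}")
```

In this toy run, the first core admits A to B and D to C, because neither pair shares a sender or a receiver. The two conflicting requests are passed to the second core, which places both of them in the next time slot.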

However, the authors say, Fastpass isn’t quite ready for prime time.

“This paper is not intended to show that you can build this in the world’s largest data centers today,” said Balakrishnan. “But the question as to whether a more scalable centralized system can be built, we think the answer is yes.”