A 30 year sorting algorithm saga: from 250k to 14GB in one minute

Tags: Software

Microsoft Research busted through the world record for sorting an unprecedented amount of data in under one minute, with a new sorting technique called MinuteSort (with flat datacenter storage). The Microsoft Research team sorted the equivalent of two 100-byte data records for every human being on the planet.

That's a rough equivalent of a short email message, for example, this series of 1's represents 200 bytes:

1111111111111111111111111111111111111111111111111111111111111111111111111111111111
1111111111111111111111111111111111111111111111111111111111111111111111111111111111

This is 200 bytes of a plain text message you might send in an email.

How much wood could a woodchuck chuck if a woodchuck could chuck wood?
Good woodchucks chuck much.

Multiply that by the population of planet Earth, currently at around 7 BILLION, and the result is 1,400,000,000,000 bytes, or over 14GB in one minute.

That's a lot of data to wade through, and sorting hasn't always been so fast and sexy. Take a look at this scan of an ancient TI-99 programming magazine article from 1983 on five popular sorting algorithms, some still used today. 

  • Bubble Sort
  • Shell Sort
  • Selection Sort
  • Heap Sort
  • Quick Sort

Here's a sampling of the article, page 1 & 2 (Download the complete 5 pages)

Ti-1ti-1a

Turn to the last page or the article and you can see that in 1983 the Quick Sort, wins the honor of sorting roughly 250k in about 1 minute and the Bubble sort can run equivalent of the 200 byte sample sample above in about 6 minutes (ouch).

image

How times have changed.

Sorting basics from Carnegie Mellon http://www.cs.cmu.edu/~adityaa/211/Lecture12AG.pdf [pdf]

Download the complete article

Big congrats to the awesome team at Microsoft Research! 100 points and a Geritol if you are old enough to have used a TI-99 or have read this magazine.

5 Comments

  • Jesse Slicer said

    I had a subscription to that magazine back in the day AND not only have used a TI-99/4A (it was my first computer in 1982 and launched my interest in programming), but it also allowed me to run a BBS, meet my future wife, and the computer still works to this day :)

  • James Curran said

    Well, I had a TRS-80, and I not only read, but still have (easily accessible) copies of Creative Computing & Kilobaud magazines from the late 1970s.

  • Rachel said

    Jesse - Me too! It was one of my firsts and yeah, it helped push me a great deal to this industry. That and the TRS80 and Apple IIe. Ahhh the memories!

    James - I expect no less from you!

    Mark - I can't say I was using source code control 30 years ago. I don't have any source. The language is TI-BASIC.

  • Fadi said

    I wonder if the source code will be published anywhere (it's just a sorting algorithm - why not?) - but then again, this is Microsoft we're talking about.

    In any case, they're claiming on the website linked to from MS's blog that they have managed to sort almost 1 TB in less than a minute. Wow!

Add a Comment