[TriLUG] [Novalug] Comparing Clouds; A trivial test. (fwd)

Fri Nov 1 18:09:15 EDT 2013

 I did some quick benchmarking, and initial results looked extremely positive!

=====8< snip 8<=====
asjoyner@dns1:~$ git clone https://github.com/maxwax/cpu-looper
Cloning into 'cpu-looper'...
remote: Counting objects: 13, done.
remote: Compressing objects: 100% (9/9), done.
remote: Total 13 (delta 5), reused 9 (delta 4)
Unpacking objects: 100% (13/13), done.
asjoyner@dns1:~$ cd cpu-looper/
asjoyner@dns1:~/cpu-looper$ gcc -O2 cpu-looper.c -o cpu-looper
asjoyner@dns1:~/cpu-looper$ time ./cpu-looper

real    0m0.001s
user    0m0.000s
sys     0m0.000s
=====8< snip 8<=====

Of course, you presumably weren't using -O2 (-O1 optimizes out your
unused math operation; -O2 compiles out the for() loop altogether).  :)

I did some additional experimentation with an instance I had lying
around, the smallest size, f1-micro, and it looked like it would take
about 10 hours to complete the full version of your code.  That's a
far cry from your dedicated machine, but a healthy margin faster than
the AWS machine.  It's also fairly unrealistic to use an f1-micro for
a CPU-intensive job... so I fired up a g1-small instance, the smallest
reasonable choice for this task.

===== 8< snip 8<====
asjoyner@cputest-small:~/cpu-looper$ gcc cpu-looper.c -o cpu-looper
asjoyner@cputest-small:~/cpu-looper$ time ./cpu-looper

real    214m52.559s
user    213m54.280s
sys     0m9.120s
asjoyner@cputest-small:~/cpu-looper$
===== 8< snip 8<====

That's not as fast as your truly unladen properly dedicated machine,
but the g1-small instance is a relatively underpowered machine, not
really designed for heavy compute work.  Considering that, it's not
too shabby.  It's also only 5 cents per hour to run a g1-small
machine.  My power at home runs about 14 cents per kWh.  I don't
know much about the power efficiency of the Phenom II or what power
supply you're using, but assuming it takes around 350 W to 400 W to
run that machine, your electricity cost is roughly 4.9 to 5.6 cents
per hour, which is about the same, and GCE has no up-front capital
costs.
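
For anyone who wants to check that back-of-the-envelope comparison, here
is a minimal sketch of the arithmetic.  The 14 cents per kWh, the 5
cents per hour and the 350 W to 400 W range are the figures from the
paragraph above; the wattage is a guess, not a measurement, and the
function name is just something made up for illustration.

```python
# Figures quoted in the message; the wattage range is an assumption,
# not a measured draw for the Phenom II workstation.
PRICE_CENTS_PER_KWH = 14.0
G1_SMALL_CENTS_PER_HOUR = 5.0
ASSUMED_DRAW_WATTS = (350, 400)


def hourly_power_cost_cents(watts, cents_per_kwh=PRICE_CENTS_PER_KWH):
    """Electricity cost, in cents, of running a `watts` load for one hour."""
    return watts / 1000.0 * cents_per_kwh


for watts in ASSUMED_DRAW_WATTS:
    cost = hourly_power_cost_cents(watts)
    print(f"{watts} W at home: {cost:.1f} cents/hour "
          f"(g1-small: {G1_SMALL_CENTS_PER_HOUR:.1f} cents/hour)")
```

This only compares cost per hour of runtime; it says nothing about how
much work each machine gets done in that hour.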

The g1-small instance clocks in at 1.38 GCEUs or Google Compute Engine
Units, a measure of the quantity of CPU resources available to the
instance.  I figured I'd give it a whirl on the n1-standard-1 instance
type as well, to see how the GCEU relative metric stacked up.
Specifically, n1-standard-1 is 2.75 GCEUs, so we'd expect it to
complete in very close to half the time.  Here goes...

===== 8< snip 8<====
asjoyner@cputest-standard:~/cpu-looper$ time ./cpu-looper

real    103m29.333s
user    103m21.940s
sys     0m0.940s
asjoyner@cputest-standard:~/cpu-looper$
===== 8< snip 8<====

Compared to the g1-small, it's just over twice as fast (slightly
faster than the GCEUs would predict).  In terms of cost-effectiveness,
it's basically the same as the g1-small per arbitrary math operation;
it just gets done faster... although for most use cases you'd likely
be able to do better by taking advantage of the n1-standard-1's extra
RAM (more than twice as much), assuming your task can be structured to
make use of it.
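
To put numbers on the GCEU comparison, here is a rough sketch that
compares the speedup the GCEU ratings would predict with the one
measured above.  It assumes runtime scales inversely with GCEUs, which
is only a first approximation; the helper name parse_time is made up
for this example, and the inputs are the "real" times from the two
runs.

```python
import re


def parse_time(text):
    """Convert a `time`-style duration such as '214m52.559s' to seconds."""
    match = re.fullmatch(r"(\d+)m([\d.]+)s", text)
    if match is None:
        raise ValueError(f"unrecognised duration: {text!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


# GCEU ratings and wall-clock times quoted in the message.
G1_SMALL_GCEU = 1.38
N1_STANDARD_1_GCEU = 2.75
g1_small_secs = parse_time("214m52.559s")
n1_standard_secs = parse_time("103m29.333s")

# Assumption: runtime is inversely proportional to GCEUs.
predicted_speedup = N1_STANDARD_1_GCEU / G1_SMALL_GCEU
measured_speedup = g1_small_secs / n1_standard_secs
predicted_minutes = g1_small_secs / predicted_speedup / 60

print(f"predicted speedup: {predicted_speedup:.2f}x")
print(f"measured speedup:  {measured_speedup:.2f}x")
print(f"predicted n1-standard-1 runtime: {predicted_minutes:.1f} min")
```

The predicted ratio comes out just under 2x and the measured one just
over 2x, so the n1-standard-1 finished a few minutes sooner than the
GCEU ratings alone would suggest.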

Of all the configurations tested in this thread, it's the fastest way
to do arbitrary math: about 30% faster than your VM on an unloaded,
otherwise-dedicated machine.
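
For reference, a small sketch that lines up every wall-clock time
quoted in this message against the fastest one.  The labels and the
function name are mine, and the AWS figure assumes "24:55:26 elapsed"
means hours:minutes:seconds, which is consistent with the roughly
89,500 seconds of user time reported alongside it.

```python
# Wall-clock ("real"/elapsed) times from this message, in seconds.
REAL_SECONDS = {
    "GCE n1-standard-1": 103 * 60 + 29.333,
    "local KVM VM (Phenom II)": 134 * 60 + 15.134,
    "Rackspace (Opteron 4170 HE)": 192 * 60 + 41.034,
    "GCE g1-small": 214 * 60 + 52.559,
    # Assumption: 24:55:26 is hours:minutes:seconds.
    "AWS (E5-2650)": 24 * 3600 + 55 * 60 + 26,
}


def relative_to_fastest(times):
    """Return {name: runtime / fastest runtime}, ordered fastest first."""
    fastest = min(times.values())
    ordered = sorted(times.items(), key=lambda item: item[1])
    return {name: secs / fastest for name, secs in ordered}


for name, ratio in relative_to_fastest(REAL_SECONDS).items():
    print(f"{name:30s} {ratio:5.2f}x the fastest runtime")
```

These are single runs on different hardware and hypervisors, so treat
the ratios as ballpark figures rather than a careful benchmark.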

I'm curious which instance type you tested with on AWS.  Based on the
performance, I'd assume it was their t1.micro instance?

Aaron S. Joyner

--
[Sorry, context trimmed due to list length restrictions.  Here's the
summary version of Maxwell's original numbers so you don't have to
hunt them down in the list archive.]

Debian 7.2 virtual machine running under KVM on my local
workstation.  The workstation is a 2010-era 4-core AMD Phenom II CPU
with 12G of RAM and no other significant workloads.

root@debian72:~# time ./cputest
real 134m15.134s   [2.23 hours]
user 134m9.707s
sys 0m0.020s
____________________________________________________________________________
First test: Rackspace.  Unknown server with AMD Opteron 2.1GHz 4170 HE
"Lisbon" processor.  Similar to a 6-core version of my Phenom II.

[root@rackfree ~]# time ./cputest
real 192m41.034s  [3.20 hours]
user 192m8.815s
sys 0m2.852s
____________________________________________________________________________
Second test: Amazon AWS.  Unknown server with Intel Sandy Bridge
E5-2650 CPU @ 2.0 GHz.
24:55:26 elapsed

89501.70 user
6.95 system
99%CPU

[This message weighs in at 4567 bytes, and after it's double-encoded
by GMail in plain text and HTML when being sent, I wonder if it will
trip over the 9k list limitation...?  Might come down to 1000 vs. 1024
math in Mailman... :) ]