A NAS in every household will help you and archaeologists. Do it now!

Our lives are digital.  Our cameras are no longer film.  Our notes are no longer postcards.  The USPS is having a hard time staying in business.

To get really deep about this … Thousands of years from now, archaeologists will see our world as vividly as it looked on the day your iPhone or DSLR captured it. That is … if the data’s still around.

We’re losing data left and right because we aren’t practicing good storage habits.

Stop spreading your digital existence across 12 devices (including the long-retired ones you never copied data off before they went to the attic/garage/dumpster/Goodwill). Keep a definitive copy of everything in one place.

It’d be a shame if cave paintings outlived our digital pictures, and right now that’s scarily possible.

If we could just centralize and manage it better, then maybe we could also have an easier time archiving it all.

So, let’s get practical!

First off … problems … how data was stored in the dark ages:

  • Cloud services.  They keep things accessible, help you centralize, and they’re often inexpensive.  But they miss the boat on your precious pictures and home movies because:
    • Your internet is too slow, and while Google et al are working on this, it’ll be a while yet.
    • Easy-to-use cloud storage providers charge too much.
    • Inexpensive cloud storage providers are usually too hard to use.
  • The hard drive inside your computer can die at any time, and it’s probably not big enough.  Plus, it’s harder (not impossible) to share that stuff with say … your smart TV … and the rest of your family.
  • Portable/external hard drives.  Don’t get me started.  No.  I own far too many, and I have no clue what’s on most of them.  Plus 1/3 of them are broken — in some cases with precious photos or bits of source code lost forever.

Solution:  Get a Network Attached Storage device.  Today.  Without delay.

Why?  If you can centralize everything, it’s easier to back up.  You also have super fast access to it, and everybody in your home can share (or not — they do have access control features).

I have serious love for Synology’s devices for three reasons:

  1. They integrate with Amazon’s Glacier service.  To me, this is a killer feature.  Now I can store every single one of my selfies, vacation pictures, inappropriate home movies, etc. in a very safe place until my credit card stops working.  At $10 per terabyte per month, that credit card should work a while (quick cost math after this list).  Glacier is a good deal.
  2. Seriously awesome, fully featured software.
  3. Quality, fast hardware.
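
To put that rate in perspective, here’s some quick back-of-the-envelope math (the 4 TB archive size is just an assumption for illustration):

# Rough Glacier cost at the ~$10 per TB per month rate quoted above
archive_tb        = 4.0     # assumed size of your photo/video archive
rate_per_tb_month = 10.0    # dollars per terabyte per month
puts "~$#{(archive_tb * rate_per_tb_month).round} per month"   # => ~$40 per month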

All at a price that, while not the cheapest, doesn’t break the bank.

Now, I’ll assume that if you’re anything like me, you want speed.  You want fast access to your data, or you’re not going to use that NAS the way it’s meant to be used.

You’re also not going to invest in a 24-drive enterprise SSD NAS because … well … you’re a home user.

So, some guidelines:

  • Buy at least twice as much storage as you think you need.  Your estimate is low.
  • Plan to upgrade/replace in 3 years.  You don’t have to make a perfect buying decision — nor do you have to buy for eternity.  Plan to MIGRATE! — which is why you’ll want hardware that’s fast enough you can copy data off it before the earth crashes into the sun!
  • Don’t plan to add more hard drives anytime soon.  Fill all the drive bays.
  • Buy the largest available drives.
  • Forget SSD.  SSD is too small and far too expensive for the storage you want.  Buy more spinning drives and get the performance advantage of multiple drives instead.
  • Plan on backing up every computer you own to the NAS — size appropriately — and then some.  (A quick sizing sketch follows this list.)
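
Here’s a hypothetical sizing sketch that applies the “buy at least twice what you think you need” rule.  The machine counts and sizes are made up; plug in your own numbers:

# Hypothetical sizing: add up what you use today, then double it
computers_gb   = [500, 256, 1_000]   # used space on each computer you'll back up
photo_video_gb = 2_000               # photo and home movie library
total_gb   = computers_gb.sum + photo_video_gb
minimum_gb = total_gb * 2            # buy at least twice what you think you need
puts "Plan for at least #{minimum_gb / 1000.0} TB usable"   # => Plan for at least 7.512 TB usable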

My Picks

With price and performance in mind, I’ll wade through Synology’s mess of models and tell you what makes sense, in my opinion:

Recommendation 1:  Synology DS414

  • Four drives (4 × 4 TB) provide 16TB of raw space — 10-12TB usable with Synology’s own RAID (rough math after this list).
  • Four drives provide better read performance than two or one
  • Spare fan just in case one fails
  • Link aggregation, but you’ll never use it.
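
A rough sketch of where that usable range comes from, assuming 4 × 4 TB drives, one drive’s worth of redundancy (how Synology’s hybrid RAID is typically configured) and the usual TB-versus-TiB accounting:

# Rough usable space for 4 x 4 TB with one drive's worth of redundancy
drives, drive_tb = 4, 4.0
raw_tb     = drives * drive_tb          # 16 TB of marketed capacity
usable_tb  = (drives - 1) * drive_tb    # 12 TB once one drive goes to redundancy
usable_tib = usable_tb * 1e12 / 2**40   # ~10.9 TiB as your OS will report it
puts format("raw %.0f TB, usable ~%.0f TB (~%.1f TiB)", raw_tb, usable_tb, usable_tib)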

Recommendation 2:  Synology DS214+

  • Fastest Synology two drive model.
  • Two drives, so you get redundancy (one drive can fail without data loss).
  • For some users, the video playback features of the DS214play may be more appropriate, but it’s slower and more expensive.

Recommendation 3:  Synology DS114

  • Danger!  Just one drive — no redundancy.  You are backing up with Glacier, right?
  • Fast for a single drive NAS

All provide:

  • USB 3.0 port(s) to load your data from a portable drive
  • Gigabit ethernet
  • All that lovely Synology software!

Hard drives?

Personally, I’d buy the Western Digital Red 5400RPM NAS drives in 4TB.  Based on Amazon’s pricing, I don’t see much of a premium, if any, for getting the largest model on the market.  The larger the drives, the more benefit you get from your NAS, so I wouldn’t skimp.

If you really truly believe you won’t need the space, but you’d like the performance of four drives on the DS414, then you can save around 350 USD by purchasing 4x 2TB drives instead of 4x 4TB.

Your Network Needs Speed

Now, along with all that firepower in the NAS, you need the network to feed that speed addiction.

Get a good quality switch, and if you’re going to use your NAS over wireless, check out the Amped Wireless RTA15.  Wired speeds will nearly always be faster, but I like wireless convenience just like you.

You’ll Love Speedy Backups

For extra credit, Apple’s Time Machine backup works really nicely with my NAS.  It works a lot faster when I plug in the ethernet cable.  On a Cisco 2960G switch (yes, I have some serious commercial grade switches lying around), my late-model Apple MacBook Pro Retina did around 100 gigs in under 15 minutes.
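
For a sense of what that means, here’s the quick math on those numbers (100 gigs and 15 minutes are the figures above; the gigabit ceiling is theoretical):

# Back-of-the-envelope throughput for ~100 GB in ~15 minutes
gigabytes, minutes = 100.0, 15.0
mb_per_s        = gigabytes * 1000 / (minutes * 60)   # ~111 MB/s sustained
gigabit_ceiling = 1000 / 8.0                           # 125 MB/s theoretical gigabit maximum
puts format("~%.0f MB/s out of a ~%.0f MB/s gigabit ceiling", mb_per_s, gigabit_ceiling)

In other words, that backup was already close to saturating the gigabit link, which is why the wired connection and a decent switch pay off.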

Do I need a NAS in the future?

Possibly not, once bandwidth gets there and cloud offerings match up at the right price points.

Oh, and a little rearrangement of the letters NAS gives you … NSA.  Trust!  Yes, relying on the cloud assumes you trust cloud services.  Then again, the NSA can probably backdoor your NAS if they really want to.  Sorry.  Nothing’s perfect.

Happy Trails

Your mileage may vary.  My new DS414 was a religious experience.

Why Amazon’s EC2 Outage Should Not Have Mattered

This past week I got a call in the middle of the night from my team that a major web site we operate had gone down. The reason: Amazon’s EC2 service was having issues.

This is the outage that famously interrupted access to web sites ordinarily visited by millions of people, knocked Reddit alternately offline or into an emergency read-only mode for about a day (or more?), and got mentions in the Wall Street Journal, on MSNBC and in other major news outlets.

In the Northern Virginia region where the outage occurred and where we were hosted, Amazon divides the EC2 service into four availability zones. We were unlucky enough to have the most recent copies of crucial data in exactly the wrong availability zone, and this made an immediate, graceful fail-over to another zone nearly impossible because the data was not retrievable at the time. Furthermore, we were unable to immediately transition to another region because our AMIs (Amazon Machine Images) were stuck in the crippled Northern Virginia region and we lacked pre-arranged procedures to migrate services.

Procedures to migrate to another region were in the works but not yet established. Having some faith in Amazon’s engineering team, we decided to stand pat. Our belief was that by the time we took mitigating measures, Amazon’s services would be back to life anyway. And … that proved to be true to the extent that we needed.

The lessons learned are these (a sketch of (1) and (2) follows this list):
(1) Replicate your data across multiple Amazon regions
(2) Do (1) with your machine images and configuration
(3) For extra safety, do (1) and (2) with another cloud provider as well
(4) It’s probably a good idea to also keep an off-cloud backup
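
As a rough illustration of (1) and (2), here’s a minimal sketch using today’s aws-sdk-ec2 Ruby gem.  Cross-region copy APIs like these did not exist in this form at the time of the outage, and the image and snapshot IDs below are placeholders:

# Minimal sketch: keep copies of machine images and EBS snapshots outside the
# region you normally run in.  IDs and regions below are placeholders.
require 'aws-sdk-ec2'

SOURCE_REGION = 'us-east-1'   # the region that had the outage (Northern Virginia)
TARGET_REGION = 'us-west-1'   # any other region

ec2 = Aws::EC2::Client.new(region: TARGET_REGION)

# (2) Copy the machine image into another region
ec2.copy_image(
  name: 'web-frontend-dr-copy',        # hypothetical name
  source_image_id: 'ami-xxxxxxxx',     # placeholder AMI ID
  source_region: SOURCE_REGION
)

# (1) Copy a recent snapshot of crucial data into another region
ec2.copy_snapshot(
  source_region: SOURCE_REGION,
  source_snapshot_id: 'snap-xxxxxxxx', # placeholder snapshot ID
  description: 'Cross-region contingency copy'
)

Run on a schedule, something like this keeps a warm copy of images and data in the target region, so failing over becomes a matter of launching instances there.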

Had we already done just (1) and (2), our downtime would have been measured in minutes, not hours, as one of our SAs flipped a few switches … all WHILE STAYING on Amazon systems. Notice how Amazon’s shopping site never seemed to go down? I suspect they do this.

As for the coverage stating that Amazon is down for a third day and horribly crippled, I can tell you that we are operating around the present issues, are still on Amazon infrastructure and are not significantly impacted at this time. Had we completed implementation of our contingency plans only within Amazon by the time this happened, things would have barely skipped a beat.

So, take the hype about the “Great Amazon Crash of 2011” with a grain of salt. The real lesson is that in today’s cloud, contingency planning still counts. Amazon resources providing alternatives in California, Ireland, Tokyo and Singapore have hummed along without a hiccup throughout this time.

If Amazon would make it easier to move or replicate things among regions, this would make implementation of our contingency plans easier. If cloud providers in general could make portability among each other a point and click affair, that would be even better.

Other services such as Amazon’s RDS (Relational Database Service) and Beanstalk rely on EC2 as a sub-component. As such, they were impacted as well. The core issue at Amazon appears to have involved the storage component upon which EC2 increasingly relies: EBS (Elastic Block Store). Ultimately, a series of related failures and the overload of the remaining online systems caused instability across many components within the same data center.

Moving into the future, I would like to see a world where Amazon moves resources automagically across data centers and replicates in multiple regions seamlessly. Also, I question the nature of the storage systems behind the scenes that power things like EBS, and until I have more information it is difficult to comment on their robustness.

Both users and providers of clouds should take steps to get away from reliance on a single data center. Initially, the burden by necessity falls on the cloud’s customers. Over time, providers should develop ways such that global distribution and redundancy happen more seamlessly.

Going up a level, components must be designed to operate as autonomously as possible. If a system goes down in New York City, and a system in London relies upon that system, then London may go down as well. Therefore, a burden also exists to design software and/or infrastructure that carefully takes all failure and degradation scenarios into account.

Ruby Developers: Manage a Multi-Gem Project with RuntimeGemIncluder (Experimental Release)

A couple of years ago in the dark ages of Ruby, one created one Gem at a time, hopefully unit tested it and perhaps integrated it into a project.

Every minute change in a Gem could mean painstaking work: repeating various build, include and/or install steps over and over.  No more!

I created this simple Gem (a Gem itself!) that, at run time, builds and installs all Gems in paths matching patterns you define.

I invite brave souls to try out this EXPERIMENTAL release now, pending a more thoroughly tested/mature release. Install RuntimeGemIncluder, define some simple configuration in your environment.rb or a similar place, and use require as you normally would.

Here’s an example I used to include everything in my NetBeans workspace with JRuby.

Download the Gem from http://rubyforge.org/frs/?group_id=9252

To install, go to the directory where you have downloaded the Gem and type:

gem install runtime-gem-includer-0.0.1.gem

(Soon you may be able to install directly from RubyForge by simply typing 'gem install runtime-gem-includer'.)

Somewhere before you load the rest of your project (in environment.rb if you’re using Rails, for example), insert the following code:

trace_flag = "--trace"
$runtime_gem_includer_config =
{
  # Commands used to build, install, uninstall and clean your Gems (JRuby here)
  :gem_build_cmd     => "\"#{ENV['JRUBY_HOME']}/bin/jruby\" -S rake #{trace_flag} gem",
  :gem_install_cmd   => "\"#{ENV['JRUBY_HOME']}/bin/jruby\" -S gem install",
  :gem_uninstall_cmd => "\"#{ENV['JRUBY_HOME']}/bin/jruby\" -S gem uninstall",
  :gem_clean_cmd     => "\"#{ENV['JRUBY_HOME']}/bin/jruby\" -S rake clean",
  :force_rebuild     => false,
  # Paths to scan for Gem source trees, and patterns to skip
  :gem_source_path_patterns           => [ "/home/erictucker/NetBeansProjects/*" ],
  :gem_source_path_exclusion_patterns => []
}
require 'runtime_gem_includer'

If you are using JRuby and would like to just use the defaults, the following code should be sufficient:


$runtime_gem_includer_config =
{
  # Only the path patterns are required; everything else falls back to defaults
  :gem_source_path_patterns           => [ "/home/erictucker/NetBeansProjects/*" ],
  :gem_source_path_exclusion_patterns => []
}
require 'runtime_gem_includer'

Now, in any source file, simply require the Gem as you normally would:

require 'my_gem_name'

And you’re off to the races!

Gems are dynamically built and installed at runtime (accomplished by overriding Kernel::require).  Edit everywhere, click run, watch the magic! There may be some applications for this Gem in continuous integration. Rebuilds and reloads of specified Gems should occur during application startup/initialization once per instance/run of your application.
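
For the curious, the general idea looks something like the sketch below.  This is a simplified illustration of the approach, not the Gem’s actual source; the directory-name matching heuristic and globals are made up:

# Simplified illustration only: wrap Kernel#require so that a Gem whose source
# directory matches a configured pattern gets rebuilt and reinstalled once per
# run, then fall through to the original require.
module Kernel
  alias_method :original_require, :require

  def require(name)
    config = $runtime_gem_includer_config || { :gem_source_path_patterns => [] }
    $__rebuilt_gems ||= {}
    unless $__rebuilt_gems[name]
      config[:gem_source_path_patterns].each do |pattern|
        Dir.glob(pattern).each do |gem_dir|
          next unless File.basename(gem_dir) == name   # naive match: directory name == gem name
          Dir.chdir(gem_dir) do
            system("rake gem")                         # build the gem
            system("gem install pkg/*.gem")            # install the freshly built gem
          end
          $__rebuilt_gems[name] = true
        end
      end
    end
    original_require(name)                             # hand off to the normal require
  end
end

The actual Gem uses the configurable build/install commands and exclusion patterns shown earlier rather than the hard-coded commands above.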

Interested in source, documentation, etc.? http://rtgemincl.rubyforge.org/

My Project – Better Information: It’s Coming

As many close to me know, I have spent the last few years working on a largely stealth project. The original idea hatched in late 2005 on a 25-hour journey to visit a friend in Singapore.

The project remains mostly in stealth, but I will make some public comments.

Broadly speaking, today’s information suffers from intentional and unintentional inaccuracy, bias, incompleteness, inconsistency, inefficient presentation and other problems.

I look to bridge the gap between masses of loosely structured information and usable knowledge. Raw data needs to become real wisdom in your brain … faster.

To this end, my team has explored many solutions both technological and non-technological.

Stay tuned.