Virtual Machine and Examples for OSCON Workshop

The virtual machine and code samples used in my upcoming Introduction to Apache Hadoop workshop at OSCON are now available. These are instructor-led demonstrations, so feel free to simply watch during the session and run the examples yourself later if you prefer.
  1. Download the Cloudera QuickStart VM. System requirements are detailed on that page, and you'll need the image appropriate for the virtualization software you use, such as VMware or VirtualBox.
  2. Once you've booted the VM, run the wget http://www.tomwheeler.com/publications/oscon_hadoop_intro.tgz command from the terminal window inside the VM. If you'd rather just study the code, you can download a copy.
  3. Finally, run the tar -zxf oscon_hadoop_intro.tgz command from the terminal window.
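
For convenience, steps 2 and 3 can be combined into a short script to run in the VM's terminal. This is only a sketch: the URL and archive name come from the steps above, while the function name fetch_examples is a made-up helper for illustration.

```shell
# URL taken from step 2 above.
URL="http://www.tomwheeler.com/publications/oscon_hadoop_intro.tgz"

# Derive the archive file name from the last path segment of the URL.
ARCHIVE="${URL##*/}"

# fetch_examples is a hypothetical helper name, not part of the workshop materials.
# It downloads the archive and unpacks it only if the download succeeded.
fetch_examples() {
    wget "$URL" && tar -zxf "$ARCHIVE"
}
```

Calling fetch_examples from the directory where you want the code performs both steps. Chaining with && means a failed download won't be followed by an attempt to unpack a missing or partial file.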

Please run these commands before the workshop. I will explain the code examples and specific steps during the session.

While the virtual machine contains Cloudera's CDH distribution, the code should also run unchanged on any modern version of Hadoop. You may, however, need to adjust the library paths used to compile and run it.
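
If you do need to adjust library paths on another Hadoop installation, one possible approach is to let Hadoop report its own classpath instead of hard-coding it. The sketch below assumes the hadoop and javac commands are on your PATH. The source file name, the output directory, and the function name compile_example are placeholders, not names from the workshop archive.

```shell
# Placeholder names: substitute a real source file from the extracted archive.
SRC="ExampleJob.java"
OUT_DIR="classes"

# compile_example is a hypothetical helper name used only for illustration.
compile_example() {
    # `hadoop classpath` prints the classpath of the local Hadoop installation,
    # so the same command works regardless of where the libraries live.
    mkdir -p "$OUT_DIR" &&
    javac -classpath "$(hadoop classpath)" -d "$OUT_DIR" "$SRC"
}
```

The build files included with the examples may handle this differently, so treat this as a fallback for compiling by hand.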