Linux Hadoop Cloudera CDH3


Description:

The Cloudera distributions are Linux distributions ready to form a Hadoop cluster.

What is Hadoop?

Hadoop is a free framework written in Java that facilitates the writing of distributed applications.

Some components of Hadoop are extremely popular today:

  • MapReduce, a programming model that parallelizes a task or calculation over large amounts of data;
  • HBase, a distributed database for large data volumes;
  • HDFS, the distributed file system.

One notable feature of Hadoop is its ability to keep operating even if several nodes in the cluster fail.
http://hadoop.apache.org/
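
To illustrate the MapReduce model mentioned above, here is a minimal single-process sketch in Python of a word count. It does not use Hadoop itself; the function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical and chosen only for illustration. On a real cluster, the framework distributes the map and reduce steps across the nodes and performs the grouping step between them.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group emitted values by key (done by the framework on a real cluster)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values collected for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence over a list of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["hadoop stores data", "hadoop processes data"]))
```

Because each line is mapped independently and each key is reduced independently, both steps can be spread over many machines, which is what makes the model suitable for large data volumes.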

Why the Cloudera distribution?

Cloudera is now a reference in the Hadoop world and a major contributor. We offer the CDH3 version on Ubuntu 10.04 (64-bit), including: HDFS, MapReduce, HBase, Hive, ZooKeeper, Hue.
http://www.cloudera.com/

What are the characteristics of the 3 versions offered by OVH?

  • 'Pseudo-distributed' mode: a version for testing and development. All Hadoop components run on a single machine.
  • 'Master' mode: a Hadoop cluster needs a 'Master' server responsible for managing the cluster. The Master acts as 'JobTracker' for MapReduce and 'namenode' for HDFS.
  • 'Slave' mode: for all the other nodes in your cluster, which perform the calculations ('TaskTracker') and store the data ('datanode').
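
The role split described above can be summarized as a small lookup, sketched here in Python. The names DAEMONS_BY_MODE and daemons_for are hypothetical, and the mapping only reflects the roles listed in this description; an actual installation may run additional services.

```python
# Roles taken from the description above; a real install may add other services.
MASTER_DAEMONS = {"JobTracker", "NameNode"}
SLAVE_DAEMONS = {"TaskTracker", "DataNode"}

DAEMONS_BY_MODE = {
    # Pseudo-distributed: every role runs on the same machine.
    "pseudo-distributed": MASTER_DAEMONS | SLAVE_DAEMONS,
    "master": MASTER_DAEMONS,
    "slave": SLAVE_DAEMONS,
}


def daemons_for(mode):
    """Return the sorted list of roles a server takes on in the given mode."""
    return sorted(DAEMONS_BY_MODE[mode.lower()])


if __name__ == "__main__":
    for mode in DAEMONS_BY_MODE:
        print(mode, "->", ", ".join(daemons_for(mode)))
```

A typical cluster would therefore combine one server in 'Master' mode with several servers in 'Slave' mode, while the 'Pseudo-distributed' mode stands alone.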



Uses:

Web Hosting


Email Hosting


Gaming


Easy to use



Specifications:

Versions:


Core:


Languages:




Technical details

Access to a server        Email service
FTP (Port 21)    -        POP3 (Port 110)  -
SSH (Port 22)    -        IMAP (Port 143)  -
TSE (Port 3389)  -        SMTP (Port 25)   -

Web services              Programming
Web (Port 80)    -        MySQL            -
Named (Port 53)  -        PHP              -