Saturday, 15 August 2015

Popular Hadoop Distributions in the Market

Popular Hadoop distributions are listed below. Commercial distributions generally offer a more stable Hadoop and provide patches for reported issues.

Apache Hadoop:

  • Complex cluster setup.
  • Manual installation and integration of Hadoop ecosystem components.
  • No commercial support; help is available only through discussion forums.
  • Good for a first try.

Cloudera:

  • Established distribution with many reference deployments.
  • Powerful tools for deployment, management, and monitoring, such as Cloudera Manager.
  • Impala for interactive querying and analytics.

Hortonworks:

  • Spun out of Yahoo.
  • The only distribution that ships Apache Hadoop without any modification.
  • HCatalog for metadata management.

MapR:

  • Supports a native Unix file system.
  • HA features such as snapshots, mirroring, and stateful failover.

Amazon Elastic MapReduce (EMR):

  • Hosted solution.
  • The most popular applications, such as Hive, Pig, HBase, DistCp, and Ganglia, are already integrated with Amazon EMR.

Other distributions include IBM's BigInsights and Microsoft's HDInsight.