Hadoop Distributions
Azure HDInsight
Azure HDInsight is Microsoft's distribution of Hadoop. The Azure HDInsight ecosystem includes the following components: Pig, Hive, HBase, Sqoop, Oozie, Ambari, Microsoft Avro Library, YARN, Cluster Dashboard, and Tez.
Apart from the components listed above, a few other components enable reporting and analytics on top of the data stored in Azure HDInsight.
More information: http://azure.microsoft.com/en-us/documentation/articles/hdinsight-introduction
Here are a few highlights of Azure HDInsight:
- Azure HDInsight is based on the Hortonworks Data Platform.
- Azure HDInsight offers Apache Hadoop as a service in the Microsoft Azure cloud, so Hadoop workloads gain the benefits of cloud computing.
- Azure HDInsight offers strong support for PowerShell via the HDInsight PowerShell Cmdlets.
- The Windows Azure and HDInsight PowerShell Cmdlets can be used to upload, download, and move data between Azure Blob Storage and on-premises file systems, to configure, execute, and post-process jobs on HDInsight, and to perform other related activities.
- Because Azure HDInsight is a Hadoop service in the cloud, you can provision a cluster, process the data, destroy the cluster, and pay only for the resources used.
- Microsoft also offers an HDInsight Emulator, which lets developers explore HDInsight on-premises without an Azure account.
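The provision, process, and destroy pattern in the highlights above can be illustrated with a rough cost comparison. The sketch below is only an arithmetic illustration, not an HDInsight API: the `ClusterPlan` class, the node count, the hourly rate, and the 730-hour month are all hypothetical values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class ClusterPlan:
    """Hypothetical cluster description; not an HDInsight API object."""
    nodes: int
    hourly_rate_per_node: float  # assumed price per node-hour, any currency


def usage_cost(plan: ClusterPlan, hours_running: float) -> float:
    """Cost of keeping the cluster provisioned for the given number of hours."""
    return plan.nodes * plan.hourly_rate_per_node * hours_running


def monthly_comparison(plan: ClusterPlan, job_hours: float, jobs_per_month: int,
                       hours_in_month: float = 730.0) -> tuple:
    """Return (transient_cost, always_on_cost) for one month.

    transient: the cluster exists only while each job runs.
    always_on: the cluster stays provisioned for the whole month.
    """
    transient = usage_cost(plan, job_hours * jobs_per_month)
    always_on = usage_cost(plan, hours_in_month)
    return transient, always_on


if __name__ == "__main__":
    plan = ClusterPlan(nodes=4, hourly_rate_per_node=0.5)  # assumed figures
    transient, always_on = monthly_comparison(plan, job_hours=2, jobs_per_month=30)
    print(f"transient: {transient:.2f}, always on: {always_on:.2f}")
```

Under these assumed figures, a cluster that exists only for a two-hour daily job accrues far fewer node-hours than one left running all month, which is the saving the highlight describes.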
Getting Started
- Getting started with HDInsight
- Sign up for a free trial account and get started with Azure HDInsight
- Download the HDInsight Emulator and get started on-premises
Cloudera
Cloudera was the first company formed to build enterprise solutions based on Hadoop. Its Hadoop distribution is known as Cloudera's Distribution Including Apache Hadoop (CDH). Here is a simplified representation of Cloudera's Hadoop ecosystem.
Source: http://www.cloudera.com/content/cloudera/en/products-and-services/cdh.html
Cloudera's Hadoop ecosystem includes the following components: Apache Avro, Apache Crunch, Apache DataFu, Apache Flume, Apache Hadoop, Apache HBase, Apache Hive, Hue, Cloudera Impala, Kite SDK (formerly CDK), Llama, Apache Mahout, Apache Oozie, Parquet, Apache Pig, Cloudera Search, Apache Sentry, Apache Spark, Apache Sqoop, and Apache ZooKeeper.
More Information: http://www.cloudera.com/content/dev-center/en/home/developer-admin-resources/cdh-components.html
Here are a few highlights of CDH:
- CDH can be deployed on-premises as well as in the cloud.
- Cloudera Manager simplifies the deployment and management of Hadoop and the other components in Cloudera's Hadoop ecosystem.
- Cloudera's enterprise edition, Cloudera Enterprise, is proprietary. It comes in three variations: Basic, Flex, and Data Hub.
- The Express edition is available as a free download.
- The Cloudera Enterprise Data Hub edition is supported on the AWS cloud.
Hortonworks
Hortonworks has a Hadoop distribution known as Hortonworks Data Platform (HDP). Here is a simplified representation of Hortonworks Data Platform.
Source: http://hortonworks.com/hdp/
Hortonworks Data Platform includes the following components: Apache Hadoop, Apache Pig, Apache Hive, Apache HBase, Apache ZooKeeper, Apache Oozie, Apache Sqoop, Apache Flume, Apache Ambari, Hue, Apache Mahout, Apache Knox, Apache Storm, Apache Tez, Apache Phoenix, Apache Accumulo, and Apache Falcon.
More Information: http://hortonworks.com/hadoop/
Here are a few highlights of Hortonworks Data Platform:
- HDP can be deployed on-premises as well as in the cloud.
- HDP supports deployment on Linux as well as Windows platforms.
- HDP is built in the open through Apache projects.
Links & Additional Information
- Hortonworks Documentation
- HDP as part of a modern data architecture
- Hortonworks Training
- Hortonworks Forums
Getting Started
- Getting started with Hortonworks Data Platform
- Hortonworks Tutorials - Learn Hadoop on Sandbox
- Download ready-to-use VMs
Amazon Elastic MapReduce (EMR)
Amazon Web Services (AWS) Elastic MapReduce (EMR) was among the first Hadoop offerings available in the market. Here is a high-level architecture/job flow of Amazon EMR.
Source: http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/emr-what-is-emr.html
Amazon EMR integrates most of the popular components, such as Hive, Pig, HBase, DistCp, and Ganglia.
Here are a few highlights of Amazon EMR:
- EMR is a Hadoop distribution in the cloud.
- It leverages AWS's Elastic Compute Cloud (EC2) for computation.
- It leverages AWS's Simple Storage Service (S3) for storage.
- It is tightly integrated with other AWS services.
- Deployment and management are simplified through the AWS Management Console and the AWS Toolkit.
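As a rough illustration of how EMR pairs EC2 for computation with S3 for storage, the sketch below builds the kind of request that could be passed to the `run_job_flow` call of the boto3 EMR client. The bucket name, JAR path, instance types, release label, and step name are assumptions made for illustration. The sketch only constructs the request and does not contact AWS.

```python
def build_job_flow_request(bucket: str, instance_count: int = 3) -> dict:
    """Build keyword arguments for boto3's EMR run_job_flow call.

    Every concrete value here (bucket, JAR path, instance types, release
    label) is a placeholder assumption, not a recommendation.
    """
    return {
        "Name": "example-transient-cluster",
        "ReleaseLabel": "emr-6.15.0",  # assumed release label
        "LogUri": f"s3://{bucket}/logs/",  # cluster logs are written to S3
        "Instances": {
            # EC2 instance types supply the compute capacity.
            "MasterInstanceType": "m5.xlarge",
            "SlaveInstanceType": "m5.xlarge",
            "InstanceCount": instance_count,
            # Shut the cluster down once all steps have finished.
            "KeepJobFlowAliveWhenNoSteps": False,
        },
        "Steps": [
            {
                "Name": "example-step",
                "ActionOnFailure": "TERMINATE_CLUSTER",
                "HadoopJarStep": {
                    "Jar": f"s3://{bucket}/jars/job.jar",  # hypothetical JAR
                    # Input is read from S3 and output is written back to S3.
                    "Args": [f"s3://{bucket}/input/", f"s3://{bucket}/output/"],
                },
            }
        ],
        "JobFlowRole": "EMR_EC2_DefaultRole",
        "ServiceRole": "EMR_DefaultRole",
    }


if __name__ == "__main__":
    request = build_job_flow_request("example-bucket")
    print(request["Steps"][0]["HadoopJarStep"]["Args"])
```

With AWS credentials configured and the default EMR roles in place, such a request could be submitted with `boto3.client("emr").run_job_flow(**request)`.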
Links & Additional Information
- Amazon's Elastic MapReduce (EMR)
- AWS Simple Monthly Calculator
- Amazon EMR Articles/Tutorials
- Amazon EMR Forum
MapR
MapR is another major distribution available in the market. Below is a simplified architecture of MapR Data Platform.
Source: http://www.mapr.com/products/product-overview/overview
Here are a few highlights of MapR:
- MapR is available in the cloud through some of the leading cloud providers: Amazon Web Services (AWS), Google Compute Engine, CenturyLink Technology Solutions, and OpenStack.
- MapR integrates and supports more than 20 open source projects.
- MapR supports multiple versions of the individual projects it integrates into its data platform. This gives users the flexibility to migrate to newer versions at their own pace.
Apart from the distributions listed above, there are various other distributions available in the market from leading providers like Intel, Oracle, HP, and many others.