Welcome to Apache™ Hadoop®!
- What Is Apache Hadoop?
- Getting Started
- Download Hadoop
- Who Uses Hadoop?
- News
- 15 October, 2013: release 2.2.0 available
- 25 August, 2013: release 2.1.0-beta available
- 27 December, 2011: release 1.0.0 available
- March 2011 - Apache Hadoop takes top prize at Media Guardian Innovation Awards
- January 2011 - ZooKeeper Graduates
- September 2010 - Hive and Pig Graduate
- May 2010 - Avro and HBase Graduate
- July 2009 - New Hadoop Subprojects
- March 2009 - ApacheCon EU
- November 2008 - ApacheCon US
- July 2008 - Hadoop Wins Terabyte Sort Benchmark
What Is Apache Hadoop?
The Apache™ Hadoop® project develops open-source software for reliable, scalable, distributed computing.
The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Rather than rely on hardware to deliver high-availability, the library itself is designed to detect and handle failures at the application layer, so delivering a highly-available service on top of a cluster of computers, each of which may be prone to failures.
The project includes these modules:
- Hadoop Common: The common utilities that support the other Hadoop modules.
- Hadoop Distributed File System (HDFS™): A distributed file system that provides high-throughput access to application data.
- Hadoop YARN: A framework for job scheduling and cluster resource management.
- Hadoop MapReduce: A YARN-based system for parallel processing of large data sets.
Other Hadoop-related projects at Apache include:
- Ambari™: A web-based tool for provisioning, managing, and monitoring Apache Hadoop clusters which includes support for Hadoop HDFS, Hadoop MapReduce, Hive, HCatalog, HBase, ZooKeeper, Oozie, Pig and Sqoop. Ambari also provides a dashboard for viewing cluster health such as heatmaps and ability to view MapReduce, Pig and Hive applications visually alongwith features to diagnose their performance characteristics in a user-friendly manner.
- Avro™: A data serialization system.
- Cassandra™: A scalable multi-master database with no single points of failure.
- Chukwa™: A data collection system for managing large distributed systems.
- HBase™: A scalable, distributed database that supports structured data storage for large tables.
- Hive™: A data warehouse infrastructure that provides data summarization and ad hoc querying.
- Mahout™: A Scalable machine learning and data mining library.
- Pig™: A high-level data-flow language and execution framework for parallel computation.
- Spark™: A fast and general compute engine for Hadoop data. Spark provides a simple and expressive programming model that supports a wide range of applications, including ETL, machine learning, stream processing, and graph computation.
- ZooKeeper™: A high-performance coordination service for distributed applications.
Getting Started
To get started, begin here:
- Learn about Hadoop by reading the documentation.
- Download Hadoop from the release page.
- Discuss Hadoop on the mailing list.
Download Hadoop
Please head to the releases page to download a release of Apache Hadoop.
Who Uses Hadoop?
A wide variety of companies and organizations use Hadoop for both research and production. Users are encouraged to add themselves to the Hadoop PoweredBy wiki page.
News
15 October, 2013: release 2.2.0 available
Apache Hadoop 2.x reaches GA milestone! Full information about this milestone release is available at Hadoop Releases.
25 August, 2013: release 2.1.0-beta available
Apache Hadoop 2.x reaches beta milestone! Full information about this milestone release is available at Hadoop Releases.
27 December, 2011: release 1.0.0 available
Hadoop reaches 1.0.0! Full information about this milestone release is available at Hadoop Releases.
March 2011 - Apache Hadoop takes top prize at Media Guardian Innovation Awards
Described by the judging panel as a "Swiss army knife of the 21st century", Apache Hadoop picked up the innovator of the year award for having the potential to change the face of media innovations.
January 2011 - ZooKeeper Graduates
Hadoop's ZooKeeper subproject has graduated to become a top-level Apache project.
Apache ZooKeeper can now be found at http://zookeeper.apache.org/
September 2010 - Hive and Pig Graduate
Hadoop's Hive and Pig subprojects have graduated to become top-level Apache projects.
Apache Hive can now be found at http://hive.apache.org/
Pig can now be found at http://pig.apache.org/
May 2010 - Avro and HBase Graduate
Hadoop's Avro and HBase subprojects have graduated to become top-level Apache projects.
Apache Avro can now be found at http://avro.apache.org/
Apache HBase can now be found at http://hbase.apache.org/
July 2009 - New Hadoop Subprojects
Hadoop is getting bigger!
- Hadoop Core is renamed Hadoop Common.
- MapReduce and the Hadoop Distributed File System (HDFS) are now separate subprojects.
- Avro and Chukwa are new Hadoop subprojects.
See the summary descriptions for all subprojects above. Visit the individual sites for more detailed information.
March 2009 - ApacheCon EU
In case you missed it.... ApacheCon Europe 2009
November 2008 - ApacheCon US
In case you missed it.... ApacheCon US 2008
July 2008 - Hadoop Wins Terabyte Sort Benchmark
Hadoop Wins Terabyte Sort Benchmark: One of Yahoo's Hadoop clusters sorted 1 terabyte of data in 209 seconds, which beat the previous record of 297 seconds in the annual general purpose (Daytona) terabyte sort benchmark. This is the first time that either a Java or an open source program has won.
Welcome to Apache™ Hadoop®!的更多相关文章
- Hive创建表格报【Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException】引发的血案
在成功启动Hive之后感慨这次终于没有出现Bug了,满怀信心地打了长长的创建表格的命令,结果现实再一次给了我一棒,报了以下的错误Error, return code 1 from org.apache ...
- hive 使用where条件报错 java.lang.NoSuchMethodError: org.apache.hadoop.hive.ql.ppd.ExprWalkerInfo.getConvertedNode
hadoop 版本 2.6.0 hive版本 1.1.1 错误: java.lang.NoSuchMethodError: org.apache.hadoop.hive.ql.ppd.ExprWalk ...
- Hive:org.apache.hadoop.hdfs.protocol.NSQuotaExceededException: The NameSpace quota (directories and files) of directory /mydir is exceeded: quota=100000 file count=100001
集群中遇到了文件个数超出限制的错误: 0)昨天晚上spark 任务突然抛出了异常:org.apache.hadoop.hdfs.protocol.NSQuotaExceededException: T ...
- Hadoop程序运行中的Error(1)-Error: org.apache.hadoop.hdfs.BlockMissingException
15/03/18 09:59:21 INFO mapreduce.Job: Task Id : attempt_1426641074924_0002_m_000000_2, Status : FAIL ...
- Ubuntu14.04用apt在线/离线安装CDH5.1.2[Apache Hadoop 2.3.0]
目录 [TOC] 1.CDH介绍 1.1.什么是CDH和CM? CDH一个对Apache Hadoop的集成环境的封装,可以使用Cloudera Manager进行自动化安装. Cloudera-Ma ...
- 【解决】org.apache.hadoop.util.Shell$ExitCodeException: /bin/bash: line 0: fg: no job control
[环境信息] Hadoop版本:2.4.0 客户端OS:Windows Server 2008 R2 服务器端OS:CentOS 6.4 [问题现象] 在通过Windows客户端向Linux服务器提交 ...
- ERROR [org.apache.hadoop.security.UserGroupInformation] - PriviledgedActionExcep
换了个环境,出现此异常 016-10-18 23:54:01,334 WARN [org.apache.hadoop.util.NativeCodeLoader] - Unable to load n ...
- org.apache.hadoop.ipc.RemoteException(java.io.IOException)
昨晚突然之间mr跑步起来了 jps查看 进程都在的,但是在reduce任务跑了85%的时候会抛异常 异常情况如下: 2016-09-21 21:32:28,538 INFO [org.apache.h ...
- kylin cube测试时,报错:org.apache.hadoop.security.AccessControlException: Permission denied: user=root, access=WRITE, inode="/user":hdfs:supergroup:drwxr-xr-x
异常: org.apache.hadoop.security.AccessControlException: Permission denied: user=root, access=WRITE, i ...
- org.apache.hadoop.security.AccessControlException: Permission denied:
org.apache.hadoop.security.AccessControlException: Permission denied: user=xxj, access=WRITE, inode= ...
随机推荐
- HTML5摇一摇
方式一 (function(){ /** * 摇一摇 * @author rubekid */ function Shake(options){ this.init(options); } Shake ...
- 获取android手机联系人信息
package com.yarin.android.Examples_04_04; import android.app.Activity; import android.database.Curso ...
- js动态新增组合Input标签
var x = 1; function addlink() { var linkdiv = document.getElementById("add1_0"); if (linkd ...
- [ERROR] Unknown/unsupported storage engine: InnoDB
将CentOS上的mysql升级以后,出现无法启动服务的问题.运行mysqld_safe后查看log信息,看到标题所示的错误.搜索以后发现是配置不对,难道两个版本的配置不能互相兼容?那还叫升级?坑爹啊 ...
- Swift开发必备技巧:内存管理、weak和unowned
因为 Playground 本身会持有所有声明在其中的东西,因此本节中的示例代码需要在 Xcode 项目环境中运行.在 Playground 中可能无法得到正确的结果. 不管在什么语言里,内存管理的内 ...
- oracle 语句
1.查询TRENDCHART_DLT表中的30条数据,统计字段FRONT01='0',BACK12='0'的条数 select sum(case when FRONT01='0' then 1 els ...
- 数据库和Doctrine(转载自http://www.111cn.net/phper/332/85987.htm)
对于任何应用程序来说最为普遍最具挑战性的任务,就是从数据库中 读取和持久化数据信息.尽管symfony完整的框架没有默认集成ORM,但是symfony标准版,集成了很多程序,还自带集成了Doctrin ...
- 手机端禁止iPhone字体放大
/*禁止iphone字体放大 */ html { -webkit-text-size-adjust: none; }
- Hibernate数据库对象的创建与导出
Hibernate 与数据库的关系是ORM关系,对象映射数据库. 那么如何通过对象对数据库进行各种对象的ddl与dml操作呢? 数据库对象操作的〈database-object /〉+ SchemaE ...
- C程序设计语言练习题1-17
练习1-17 编写一个程序,打印长度大于80个字符的所有输入行. 代码如下: #include <stdio.h> // 包含标准库的信息. #define MAXROW 10 // 最大 ...