[Hive - LanguageManual] Alter Table/Partition/Column
Alter Table/Partition/Column
Alter table statements enable you to change the structure of an existing table. You can add columns/partitions, change SerDe, add table and SerDe properties, or rename the table itself. Similarly, alter table partition statements allow you change the properties of a specific partition in the named table.
Alter Table
Rename Table
ALTER TABLE table_name RENAME TO new_table_name; |
This statement lets you change the name of a table to a different name.
As of version 0.6, a rename on a managed table moves its HDFS location as well. (Older Hive versions just renamed the table in the metastore without moving the HDFS location.)
Alter Table Properties
ALTER TABLE table_name SET TBLPROPERTIES table_properties;table_properties: : (property_name = property_value, property_name = property_value, ... ) |
You can use this statement to add your own metadata to the tables. Currently last_modified_user, last_modified_time properties are automatically added and managed by Hive. Users can add their own properties to this list. You can do DESCRIBE EXTENDED TABLE to get this information.
Alter Table Comment
To change the comment of a table you have to change the comment property of the TBLPROPERTIES:
ALTER TABLE table_name SET TBLPROPERTIES ('comment' = new_comment); |
Add SerDe Properties
ALTER TABLE table_name SET SERDE serde_class_name [WITH SERDEPROPERTIES serde_properties];ALTER TABLE table_name SET SERDEPROPERTIES serde_properties;serde_properties: : (property_name = property_value, property_name = property_value, ... ) |
These statements enable you to change a table's SerDe or add user-defined metadata to the table's SerDe object.
The SerDe properties are passed to the table's SerDe when it is being initialized by Hive to serialize and deserialize data. So users can store any information required for their custom SerDe here. Refer to the SerDe documentation and Hive SerDe in the Developer Guide for more information, and see Row Format, Storage Format, and SerDe above for details about setting a table's SerDe and SERDEPROPERTIES in a CREATE TABLE statement.
Example, note that both property_name and property_value must be quoted:
ALTER TABLE table_name SET SERDEPROPERTIES ('field.delim' = ','); |
Alter Table Storage Properties
ALTER TABLE table_name CLUSTERED BY (col_name, col_name, ...) [SORTED BY (col_name, ...)] INTO num_buckets BUCKETS; |
These statements change the table's physical storage properties.
NOTE: These commands will only modify Hive's metadata, and will NOT reorganize or reformat existing data. Users should make sure the actual data layout conforms with the metadata definition.
Additional Alter Table Statements
See Alter Either Table or Partition below for more DDL statements that alter tables.
Alter Partition
Add Partitions
ALTER TABLE table_name ADD [IF NOT EXISTS] PARTITION partition_spec [LOCATION 'location1'] partition_spec [LOCATION 'location2'] ...;partition_spec: : (partition_column = partition_col_value, partition_column = partition_col_value, ...) |
You can use ALTER TABLE ADD PARTITION to add partitions to a table. Partition values should be quoted only if they are strings. The location must be a directory inside of which data files reside. (location后面的参数值一定是HDFS文件系统上的目录,而不是文件。)(ADD PARTITION changes the table metadata, but does not load data. If the data does not exist in the partition's location, queries will not return any results.)
Note that it is proper syntax to have multiple partition_spec in a single ALTER TABLE, but if you do this in version 0.7, your partitioning scheme will fail. That is, every query specifying a partition will always use only the first partition. Instead, you should use the following form if you want to add many partitions in Hive 0.7:
ALTER TABLE table_name ADD PARTITION (partCol = 'value1') location 'loc1';ALTER TABLE table_name ADD PARTITION (partCol = 'value2') location 'loc2';...ALTER TABLE table_name ADD PARTITION (partCol = 'valueN') location 'locN'; |
Specifically, the following example (which was the default example before) will FAIL silently and without error, and all queries will go only to dt='2008-08-08' partition, no matter which partition you specify.
ALTER TABLE page_view ADD PARTITION (dt='2008-08-08', country='us') location '/path/to/us/part080808' PARTITION (dt='2008-08-09', country='us') location '/path/to/us/part080809'; |
In Hive 0.8 and later, you can add multiple partitions in a single ALTER TABLE statement as shown in the previous example.
An error is thrown if the partition_spec for the table already exists. You can use IF NOT EXISTS to skip the error.
Dynamic Partitions
Partitions can be added to a table dynamically, using a Hive INSERT statement (or a Pig STORE statement). See these documents for details and examples:
- Design Document for Dynamic Partitions
- Tutorial: Dynamic-Partition Insert
- Hive DML: Dynamic Partition Inserts
- HCatalog Dynamic Partitioning
Rename Partition
Version information
Icon
As of Hive 0.9.
ALTER TABLE table_name PARTITION partition_spec RENAME TO PARTITION partition_spec; |
This statement lets you change the value of a partition column.
Exchange Partition
ALTER TABLE table_name_1 EXCHANGE PARTITION (partition_spec) WITH TABLE table_name_2; |
This statement lets you move the data in a partition from a table to another table that has the same schema but does not already have that partition. For details, see Exchange Partition and HIVE-4095.
Recover Partitions (MSCK REPAIR TABLE)
Hive stores a list of partitions for each table in its metastore. If, however, new partitions are directly added to HDFS (say by using hadoop fs -put command), the metastore (and hence Hive) will not be aware of these partitions unless the user runs ALTER TABLE table_name ADD PARTITION commands on each of the newly added partitions.
However, users can run
MSCK REPAIR TABLE table_name; |
which will add metadata about partitions to the Hive metastore for partitions for which such metadata doesn't already exist. In other words, it will add any partitions that exist on HDFS but not in metastore to the metastore. See HIVE-874 for more details.
The equivalent command on Amazon Elastic MapReduce (EMR)'s version of Hive is
ALTER TABLE table_name RECOVER PARTITIONS; |
Drop Partitions
ALTER TABLE table_name DROP [IF EXISTS] PARTITION partition_spec, PARTITION partition_spec,...; |
You can use ALTER TABLE DROP PARTITION to drop a partition for a table. This removes the data and metadata for this partition.
For tables that are protected by NO_DROP CASCADE, you can use the predicate IGNORE PROTECTION to drop a specified partition or set of partitions (for example, when splitting a table between two Hadoop clusters).
ALTER TABLE table_name DROP [IF EXISTS] PARTITION partition_spec IGNORE PROTECTION; |
The above command will drop that partition regardless of protection stats.
In Hive 0.7.0 or later, DROP returns an error if the partition doesn't exist, unless IF EXISTS is specified or the configuration variable hive.exec.drop.ignorenonexistent is set to true.
ALTER TABLE page_view DROP PARTITION (dt='2008-08-08', country='us'); |
(Un)Archive Partition
ALTER TABLE table_name ARCHIVE PARTITION partition_spec;ALTER TABLE table_name UNARCHIVE PARTITION partition_spec; |
Archiving is a feature to moves a partition's files into a Hadoop Archive (HAR). Note that only the file count will be reduced; HAR does not provide any compression. See LanguageManual Archiving for more information
Alter Either Table or Partition
Alter Table/Partition File Format
ALTER TABLE table_name [PARTITION partition_spec] SET FILEFORMAT file_format; |
This statement changes the table's (or partition's) file format. For available file_format options, see the section above on CREATE TABLE.
Alter Table/Partition Location
ALTER TABLE table_name [PARTITION partition_spec] SET LOCATION "new location"; |
Alter Table/Partition Touch
ALTER TABLE table_name TOUCH [PARTITION partition_spec]; |
TOUCH reads the metadata, and writes it back. This has the effect of causing the pre/post execute hooks to fire. An example use case is if you have a hook that logs all the tables/partitions that were modified, along with an external script that alters the files on HDFS directly. Since the script modifies files outside of hive, the modification wouldn't be logged by the hook. The external script could call TOUCH to fire the hook and mark the said table or partition as modified.
Also, it may be useful later if we incorporate reliable last modified times. Then touch would update that time as well.
Note that TOUCH doesn't create a table or partition if it doesn't already exist. (See Create Table.)
Alter Table/Partition Protections
Version information
Icon
ALTER TABLE table_name [PARTITION partition_spec] ENABLE|DISABLE NO_DROP [CASCADE]; ALTER TABLE table_name [PARTITION partition_spec] ENABLE|DISABLE OFFLINE; |
Protection on data can be set at either the table or partition level. Enabling NO_DROP prevents a table from being dropped. Enabling OFFLINE prevents the data in a table or partition from being queried, but the metadata can still be accessed.
If any partition in a table has NO_DROP enabled, the table cannot be dropped either. Conversely, if a table has NO_DROP enabled then partitions may be dropped, but with NO_DROP CASCADE then partitions cannot be dropped either.
Alter Table/Partition Compact
Version information
Icon
In Hive release 0.13.0 and later when transactions are being used, the ALTER TABLE statement can request compaction of a table or partition.
ALTER TABLE table_name [PARTITION (partition_key = 'partition_value' [, ...])] COMPACT 'compaction_type'; |
In general you do not need to request compactions when Hive transactions are being used, because the system will detect the need for them and initiate the compaction. However, if compaction is turned off for a table or you want to compact the table at a time the system would not choose to, ALTER TABLE can initiate the compaction. The statement will enqueue a request for compaction and return. To watch the progress of the compaction, use SHOW COMPACTIONS.
The compaction_type can be MAJOR or MINOR. See the Basic Design section in Hive Transactions for more information.
Alter Table/Partition Concatenate
Version information
Icon
ALTER TABLE table_name [PARTITION (partition_key = 'partition_value' [, ...])] CONCATENATE; |
If the table or partition contains many small RCFiles or ORC files, then the above command will merge them into larger files. In case of RCFile the merge happens at block level whereas for ORC files the merge happens at stripe level thereby avoiding the overhead of decompressing and decoding the data.
Alter Column
Rules for Column Names
Column names are case insensitive.
Version information
Icon
In Hive release 0.12.0 and earlier, column names can only contain alphanumeric and underscore characters.
In Hive release 0.13.0 and later, by default column names can be specified within backticks (`) and contain any Unicode character (HIVE-6013). Within a string delimited by backticks, all characters are treated literally except that double backticks (``) represent one backtick character. The pre-0.13.0 behavior can be used by setting hive.support.quoted.identifiers to none, in which case backticked names are interpreted as regular expressions. See Supporting Quoted Identifiers in Column Names for details.
Change Column Name/Type/Position/Comment
ALTER TABLE table_name [PARTITION partition_spec] CHANGE [COLUMN] col_old_name col_new_name column_type [COMMENT col_comment] [FIRST|AFTER column_name] [CASCADE|RESTRICT]; |
This command will allow users to change a column's name, data type, comment, or position, or an arbitrary combination of them. The PARTITION clause is available in Hive 0.14.0 and later; seeUpgrading Pre-Hive 0.13.0 Decimal Columns for usage. A patch for Hive 0.13 is also available (see HIVE-7971).
The CASCADE|RESTRICT clause is available in Hive 0.15.0. ALTER TABLE CHANGE COLUMN with CASCADE command changes the columns of a table's metadata, and cascades the same change to all the partition metadata. RESTRICT is the default, limiting column change only to table metadata.
ALTER TABLE CHANGE COLUMN CASCADE clause will override the table partition's column metadata regardless of the table or partition's protection mode. Use with discretion.
The column change command will only modify Hive's metadata, and will not modify data. Users should make sure the actual data layout of the table/partition conforms with the metadata definition.
Example:
CREATE TABLE test_change (a int, b int, c int);// First change column a's name to a1.ALTER TABLE test_change CHANGE a a1 INT;// Next change column a1's name to a2, its data type to string, and put it after column b.ALTER TABLE test_change CHANGE a1 a2 STRING AFTER b;// The new table's structure is: b int, a2 string, c int. // Then change column c's name to c1, and put it as the first column.ALTER TABLE test_change CHANGE c c1 INT FIRST;// The new table's structure is: c1 int, b int, a2 string. |
Add/Replace Columns
ALTER TABLE table_name [PARTITION partition_spec] ADD|REPLACE COLUMNS (col_name data_type [COMMENT col_comment], ...) [CASCADE|RESTRICT] |
ADD COLUMNS lets you add new columns to the end of the existing columns but before the partition columns. This is supported for Avro backed tables as well, for Hive 0.14 and later. The PARTITION clause is available in Hive 0.14.0 and later; REPLACE COLUMNS removes all existing columns and adds the new set of columns. This can be done only for tables with native SerDe (DynamicSerDe, MetadataTypedColumnsetSerDe, LazySimpleSerDe and ColumnarSerDe). Refer to Hive SerDe for more information. REPLACE COLUMNS can also be used to drop columns. For example:
"ALTER TABLE test_change REPLACE COLUMNS (a int, b int);" will remove column 'c' from test_change's schema.
The PARTITION clause is available in Hive 0.14.0 and later; see Upgrading Pre-Hive 0.13.0 Decimal Columns for usage.
The CASCADE|RESTRICT clause is available in Hive 0.15.0. ALTER TABLE ADD|REPLACE COLUMNS with CASCADE command changes the columns of a table's metadata, and cascades the same change to all the partition metadata. RESTRICT is the default, limiting column changes only to table metadata.
The column change command will only modify Hive's metadata, and will not modify data. Users should make sure the actual data layout of the table/partition conforms with the metadata definition.
Partial Partition Specification
As of Hive 0.14 (HIVE-8411), users are able to provide a partial partition spec for certain above alter column statements, similar to dynamic partitioning. So rather than having to issue an alter column statement for each partition that needs to be changed:
ALTER TABLE foo PARTITION (ds='2008-04-08', hr=11) CHANGE COLUMN dec_column_name dec_column_name DECIMAL(38,18);ALTER TABLE foo PARTITION (ds='2008-04-08', hr=12) CHANGE COLUMN dec_column_name dec_column_name DECIMAL(38,18);... |
... you can change many existing partitions at once using a single ALTER statement with a partial partition specification:
// hive.exec.dynamic.partition needs to be set to true to enable dynamic partitioning with ALTER PARTITIONSET hive.exec.dynamic.partition = true; // This will alter all existing partitions in the table with ds='2008-04-08' -- be sure you know what you are doing!ALTER TABLE foo PARTITION (ds='2008-04-08', hr) CHANGE COLUMN dec_column_name dec_column_name DECIMAL(38,18);// This will alter all existing partitions in the table -- be sure you know what you are doing!ALTER TABLE foo PARTITION (ds, hr) CHANGE COLUMN dec_column_name dec_column_name DECIMAL(38,18); |
Similar to dynamic partitioning, hive.exec.dynamic.partition must be set to true to enable use of partial partition specs during ALTER PARTITION. This is supported for the following operations:
- Change column
- Add column
- Replace column
- File Format
- Serde Properties
[Hive - LanguageManual] Alter Table/Partition/Column的更多相关文章
- faster alter table add column
Create a new table (using the structure of the current table) with the new column(s) included. execu ...
- Oracle alter table modify column Syntax example
http://www.dba-oracle.com/t_alter_table_modify_column_syntax_example.htm For complete tips on Oracle ...
- SQL SERVER删除列,报错."由于一个或多个对象访问此列,ALTER TABLE DROP COLUMN ... 失败"
队友给我修改数据的语句.总是执行失败.很纳闷. 如下图: 仔细看了下这个列,并没有什么特殊.如下图: 但其确实有个约束: 'DF__HIS_DRUG___ALL_I__04E4BC85' . 为什么有 ...
- [Hive - LanguageManual] Create/Drop/Alter Database Create/Drop/Truncate Table
Hive Data Definition Language Hive Data Definition Language Overview Create/Drop/Alter Database Crea ...
- [Hive - LanguageManual ] ]SQL Standard Based Hive Authorization
Status of Hive Authorization before Hive 0.13 SQL Standards Based Hive Authorization (New in Hive 0. ...
- [Hive - LanguageManual] Hive Default Authorization - Legacy Mode
Disclaimer Prerequisites Users, Groups, and Roles Names of Users and Roles Creating/Dropping/Using R ...
- hive 使用笔记(partition; HDFS乱码;日期函数)
6. insert 语句 1) 因为目标表有partition, 所以刚开始我使用的语句是 insert overwrite table sa_r_item_sales_day_week_month ...
- [Hive - LanguageManual] Create/Drop/Alter -View、 Index 、 Function
Create/Drop/Alter View Create View Drop View Alter View Properties Alter View As Select Version info ...
- 【转】Hive 修改 table、column
表 1.重命名表重命名表的语句如下: ALTER TABLE table_name RENAME TO new_table_name 2.修改表属性: ALTER TABLE table_name S ...
随机推荐
- C++异常以及异常与析构函数
1. 抛出异常 1.1 抛出异常(也称为抛弃异常)即检测是否产生异常,在C++中,其采用throw语句来实现,如果检测到产生异常,则抛出异常. 该语句的格式为: throw 表达式; 如果在try语句 ...
- [从jQuery看JavaScript]-匿名函数与闭包(Anonymous Function and Closure)【转】
(function(){ //这里忽略jQuery所有实现 })(); 半年前初次接触jQuery的时候,我也像其他人一样很兴奋地想看看源码是什么样的.然而,在看到源码的第一眼,我就迷糊了.为什么只有 ...
- struts2中token防止重复提交表单
struts2中token防止重复提交表单 >>>>>>>>>>>>>>>>>>>&g ...
- Android用户界面布局(layouts)
Android用户界面布局(layouts) 备注:view理解为视图 一个布局定义了用户界面的可视结构,比如activity的UI或是APP widget的UI,我们可以用下面两种方式来声明布局: ...
- Redis安装教程
1. Linux下Redis安装教程 (1)安装 #tar xf redis-2.6.14.tar.gz #cd redis-2.6.14 #make #make install (2)配置 修改re ...
- 自己动手实现STL:前言
一.前言 最近,刚看完<STL源码剖析>,深深被实现STL库的那些的大牛们所折服.同时又感觉自己与大牛们差距之大,便萌生深入学习之意.如果仅仅只是看看<STL源码剖析>的话,又 ...
- 转载:iOS 推送的服务端实现
参考网址1: iOS消息推送机制的实现 http://www.cnblogs.com/qq78292959/archive/2012/07/16/2593651.html 参考网址2: iOS 推送的 ...
- Java [Leetcode 125]Valid Palindrome
题目描述: Given a string, determine if it is a palindrome, considering only alphanumeric characters and ...
- 解释一下,在你往浏览器中输入一个URL后都发生了什么,要尽可能详细
这道题目没有所谓的完全的正确答案,这个题目可以让你在任意的一个点深入下去, 只要你对这个点是熟悉的.以下是一个大概流程: 浏览器向DNS服务器查找输入URL对应的IP地址. DNS服务器返回网站的IP ...
- Android Thread.UncaughtExceptionHandler捕获
在Java 的异常处理机制中:如果抛出的是Exception异常的话,必须有try..catch..进行处理,属于checked exception.如果抛出的是RuntimeException异常的 ...