Programming Impala Applications

The core development language with Impala is SQL. You can also use Java or other languages to interact with Impala through the standard JDBC and ODBC interfaces used by many business intelligence tools. For specialized kinds of analysis, you can supplement the SQL built-in functions by writing user-defined functions (UDFs) in C++ or Java.

Continue reading:

 

Overview of the Impala SQL Dialect

The Impala SQL dialect is descended from the SQL syntax used in the Apache Hive component (HiveQL). As such, it is familiar to users who are already familiar with running SQL queries on the Hadoop infrastructure. Currently, Impala SQL supports a subset of HiveQL statements, data types, and built-in functions.

For users coming to Impala from traditional database backgrounds, the following aspects of the SQL dialect might seem familiar or unusual:

  • Impala SQL is focused on queries and includes relatively little DML. There is no UPDATE or DELETE statement. Stale data is typically discarded (by DROP TABLE orALTER TABLE ... DROP PARTITION statements) or replaced (by INSERT OVERWRITE statements).
  • All data loading is done by INSERT statements, which typically insert data in bulk by querying from other tables. There are two variations, INSERT INTO which appends to the existing data, and INSERT OVERWRITE which replaces the entire contents of a table or partition (similar to TRUNCATE TABLE followed by a new INSERT). There is no INSERT ... VALUES syntax to insert a single row.
  • You often construct Impala table definitions and data files in some other environment, and then attach Impala so that it can run real-time queries. The same data files and table metadata are shared with other components of the Hadoop ecosystem.
  • Because Hadoop and Impala are focused on data warehouse-style operations on large data sets, Impala SQL includes some idioms that you might find in the import utilities for traditional database systems. For example, you can create a table that reads comma-separated or tab-separated text files, specifying the separator in theCREATE TABLE statement. You can create external tables that read existing data files but do not move or transform them.
  • Because Impala reads large quantities of data that might not be perfectly tidy and predictable, it does not impose length constraints on string data types. For example, you can define a database column as STRING with unlimited length, rather than CHAR(1) or VARCHAR(64). Although in Impala 2.0 and later, you can also use length-constrained CHAR and VARCHAR types.)
  • For query-intensive applications, you will find familiar notions such as joinsbuilt-in functions for processing strings, numbers, and dates, aggregate functions, subqueries, and comparison operators such as IN() and BETWEEN.
  • From the data warehousing world, you will recognize the notion of partitioned tables.
  • In Impala 1.2 and higher, UDFs let you perform custom comparisons and transformation logic during SELECT and INSERT...SELECT statements.

Related information: Impala SQL Language Reference, especially SQL Statements and Built-in Functions

Overview of Impala Programming Interfaces

You can connect and submit requests to the Impala daemons through:

  • The impala-shell interactive command interpreter.
  • The Apache Hue web-based user interface.
  • JDBC.
  • ODBC.

With these options, you can use Impala in heterogeneous environments, with JDBC or ODBC applications running on non-Linux platforms. You can also use Impala on combination with various Business Intelligence tools that use the JDBC and ODBC interfaces.

Each impalad daemon process, running on separate nodes in a cluster, listens to several ports for incoming requests. Requests from impala-shell and Hue are routed to the impalad daemons through the same port. The impalad daemons listen on separate ports for JDBC and ODBC requests.

Programming Impala Applications的更多相关文章

  1. (updated on Mar 1th)Programming Mobile Applications for Android Handheld Systems by Dr. Adam Porter

    本作品由Man_华创作,采用知识共享署名-非商业性使用-相同方式共享 4.0 国际许可协议进行许可.基于http://www.cnblogs.com/manhua/上的作品创作. Lab - Inte ...

  2. Cloudera Impala Guide

    Impala Concepts and Architecture The following sections provide background information to help you b ...

  3. (转) [it-ebooks]电子书列表

    [it-ebooks]电子书列表   [2014]: Learning Objective-C by Developing iPhone Games || Leverage Xcode and Obj ...

  4. 我要成为前端工程师!给 JavaScript 新手的建议与学习资源整理

    来源于:http://blog.miniasp.com/post/2016/02/02/JavaScript-novice-advice-and-learning-resources.aspx 今年有 ...

  5. Michael Schatz - 序列比对课程

    Michael Schatz - Cold Spring Harbor Laboratory 最近在研究 BWA mem 序列比对算法,直接去看论文,看不懂,论文就3页,太精简了,好多背景知识都不了解 ...

  6. 一句话讲清楚什么是JavaEE

    Java技术不仅是一门编程语言而且是一个平台.同时Java语言是一门有着特定语法和风格的高级的面向对象的语言,Java平台是Java语言编写的特定应用程序运行的环境.Java平台有很多种,很多的Jav ...

  7. [转]Android 学习资料分享(2015 版)

    转 Android 学习资料分享(2015 版) 原文地址:http://www.jianshu.com/p/874ff12a4c01 目录[-] 我是如何自学Android,资料分享(2015 版) ...

  8. 什么是JavaEE

    Java技术不仅是一门编程语言而且是一个平台.同时Java语言是一门有着特定语法和风格的高级的面向对象的语言,Java平台是Java语言编写的特定应用程序运行的环境.Java平台有很多种,很多的Jav ...

  9. C10K问题渣翻译

    The C10K problem [Help save the best Linux news source on the web -- subscribe to Linux Weekly News! ...

随机推荐

  1. WCF入门(四)---WCF架构

    WCF是一个分层架构,为开发各种分布式应用的充分支持.该体系结构在下面将详细说明. 约定 约定层旁边就是应用层,并含有类似于现实世界的约定,指定服务和什么样的信息可以访问它会使操作的信息.约定基本都是 ...

  2. iOS:自定义工具栏、导航栏、标签栏

    工具栏为UIToolBar,导航栏UINavigationBar,标签栏UITabBar.它们的样式基本上时差不多的,唯一的一点区别就是,工具栏一般需要自己去创建,然后添加到视图中,而导航栏和标签栏不 ...

  3. 持久化框架Hibernate 开发实例(一)

    1 Hibernate简介 Hibernate框架是一个非常流行的持久化框架,其中在web开发中占据了非常重要的地位, Hibernate作为Web应用的底层,实现了对数据库操作的封装.HIberna ...

  4. SGU 275 To xor or not to xor (高斯消元)

    题目链接 题意:有n个数,范围是[0, 10^18],n最大为100,找出若干个数使它们异或的值最大并输出这个最大值. 分析: 一道高斯消元的好题/ 我们把每个数用二进制表示,要使得最后的异或值最大, ...

  5. hdu 4825 Xor Sum (建树) 2014年百度之星程序设计大赛 - 资格赛 1003

    题目 题意:给n个数,m次询问,每次给一个数,求这n个数里与这个数 异或 最大的数. 思路:建一个类似字典数的数,把每一个数用 32位的0或者1 表示,查找从高位向底位找,优先找不同的,如果没有不同的 ...

  6. java 语言里 遍历 collection 的方式

    我来简单说一下吧,一般有2种方法来遍历collection中的元素,以HashSet为例子HashSet hs=new HashSet();hs.add("hello");hs.a ...

  7. HDU 4902

    数据太弱,直接让我小暴力一下就过了,一开始没注意到时间是15000MS,队友发现真是太给力了 #include <cstdio> #include <cstring> ],x[ ...

  8. highcharts 设置标题不显示

    设置标题不显示:title:false 用法: title: {         text: false },

  9. fastdfs-client-java 文件上传

    FastDFS是一个开源的轻量级分布式文件系统,它对文件进行管理,功能包括:文件存储.文件同步.文件访问(文件上传.文件下载)等,解决了大容量存储和负载均衡的问题.特别适合以文件为载体的在线服务,如相 ...

  10. UVA 10098 Generating Fast, Sorted Permutation

    // 给你字符串 按字典序输出所有排列// 要是每个字母都不同 可以直接dfs ^_^// 用前面说的生成排列算法 也可以直接 stl next_permutation #include <io ...