Integrating Hive/Phoenix + Druid + JdbcTemplate under Spring Boot
1. POM dependencies
The author's Hadoop cluster environment is:
HDFS, YARN, MapReduce2: 2.7.3; Hive: 1.2.1000; HBase: 1.1.2
Note: the Phoenix client is tightly coupled to the cluster version, so watch for differences between distributions (taking the jar directly from a cluster server is the most reliable option).
- <properties>
-   <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
-   <spring-data-hadoop.version>2.4.0.RELEASE</spring-data-hadoop.version>
-   <hive.version>1.2.1</hive.version>
-   <phoenix-client.version>4.7</phoenix-client.version>
-   <druid.version>1.0.27</druid.version>
- </properties>
- <dependencies>
-   <dependency>
-     <groupId>org.springframework.boot</groupId>
-     <artifactId>spring-boot-starter-jdbc</artifactId>
-   </dependency>
-   <dependency>
-     <groupId>org.springframework.data</groupId>
-     <artifactId>spring-data-hadoop</artifactId>
-     <version>${spring-data-hadoop.version}</version>
-   </dependency>
-   <dependency>
-     <groupId>org.apache.hive</groupId>
-     <artifactId>hive-jdbc</artifactId>
-     <version>${hive.version}</version>
-   </dependency>
-   <dependency>
-     <groupId>org.apache.phoenix</groupId>
-     <artifactId>phoenix-client</artifactId>
-     <version>${phoenix-client.version}</version>
-   </dependency>
-   <dependency>
-     <groupId>com.alibaba</groupId>
-     <artifactId>druid</artifactId>
-     <version>${druid.version}</version>
-   </dependency>
- </dependencies>
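Because the Phoenix client in particular has to match the cluster, it can help to confirm early that both driver classes actually resolve from the classpath. The following is a minimal sketch using only the JDK; `DriverCheck` and `isAvailable` are hypothetical names introduced for illustration and are not part of any of the libraries above.

```java
// Hypothetical helper: reports whether a JDBC driver class can be found on the classpath.
final class DriverCheck {

    private DriverCheck() {
    }

    /** Returns true if the named class can be resolved by this class's class loader. */
    static boolean isAvailable(String driverClassName) {
        try {
            // initialize = false: only resolve the class, do not run its static initializers
            Class.forName(driverClassName, false, DriverCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String[] drivers = {
            "org.apache.hive.jdbc.HiveDriver",
            "org.apache.phoenix.jdbc.PhoenixDriver"
        };
        for (String driver : drivers) {
            System.out.println(driver + " available: " + isAvailable(driver));
        }
    }
}
```

A `false` result for either driver usually points at a missing or mismatched jar rather than at the Spring configuration.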
2. Spring Boot configuration file
Spring Boot supports, and by default recommends, YAML and properties configuration files, so this example uses YAML:
application.yml:
- # Custom Hive data source settings
- hive:
-   url: jdbc:hive2://192.168.61.43:10000/default
-   type: com.alibaba.druid.pool.DruidDataSource
-   driver-class-name: org.apache.hive.jdbc.HiveDriver
-   username: hive
-   password: hive
- # Custom Phoenix data source settings
- phoenix:
-   enable: true
-   url: jdbc:phoenix:192.168.61.43
-   type: com.alibaba.druid.pool.DruidDataSource
-   driver-class-name: org.apache.phoenix.jdbc.PhoenixDriver
-   username:
-   password:
-   default-auto-commit: true
-   # Druid pool settings (not read by the configuration classes in section 3)
-   max-active: 100
-   initialSize: 1
-   maxWait: 60000
-   minIdle: 1
-   timeBetweenEvictionRunsMillis: 60000
-   minEvictableIdleTimeMillis: 300000
-   testWhileIdle: true
-   testOnBorrow: false
-   testOnReturn: false
-   poolPreparedStatements: true
-   maxOpenPreparedStatements: 50
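The two `url` values follow different shapes: HiveServer2 uses `jdbc:hive2://host:port/database`, while Phoenix takes a ZooKeeper quorum, optionally followed by a port and a znode parent. The sketch below only assembles such strings; `JdbcUrls` is a hypothetical helper, and the port `2181` and znode `/hbase` in the usage note are assumptions for illustration (some distributions use a different znode parent, so check the cluster's hbase-site.xml).

```java
// Hypothetical helper that assembles JDBC URLs shaped like those in application.yml.
final class JdbcUrls {

    private JdbcUrls() {
    }

    /** Builds a HiveServer2 URL: jdbc:hive2://host:port/database */
    static String hive(String host, int port, String database) {
        return "jdbc:hive2://" + host + ":" + port + "/" + database;
    }

    /**
     * Builds a Phoenix URL: jdbc:phoenix:quorum[:port[:znodeParent]].
     * Pass null for zkPort to omit both optional parts, or null for
     * znodeParent to omit only the znode parent.
     */
    static String phoenix(String zkQuorum, Integer zkPort, String znodeParent) {
        StringBuilder url = new StringBuilder("jdbc:phoenix:").append(zkQuorum);
        if (zkPort != null) {
            url.append(':').append(zkPort);
            if (znodeParent != null) {
                url.append(':').append(znodeParent);
            }
        }
        return url.toString();
    }
}
```

For example, `JdbcUrls.phoenix("192.168.61.43", null, null)` reproduces the short form used above, and `JdbcUrls.phoenix("192.168.61.43", 2181, "/hbase")` spells out the optional parts.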
3. Spring Boot configuration beans
Because the settings above are custom properties, Spring Boot's auto-configuration cannot bind them on its own, so the DataSource beans have to be created manually:
Hive:
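- import javax.sql.DataSource;
- import org.springframework.beans.factory.annotation.Autowired;
- import org.springframework.beans.factory.annotation.Qualifier;
- import org.springframework.context.annotation.Bean;
- import org.springframework.context.annotation.Configuration;
- import org.springframework.core.env.Environment;
- import org.springframework.jdbc.core.JdbcTemplate;
- import com.alibaba.druid.pool.DruidDataSource;
- /**
-  * Hive data source configuration
-  * @author chenty
-  */
- @Configuration
- public class HiveDataSource {
-     @Autowired
-     private Environment env;
-     // Druid-pooled DataSource built from the hive.* properties in application.yml
-     @Bean(name = "hiveJdbcDataSource")
-     @Qualifier("hiveJdbcDataSource")
-     public DataSource dataSource() {
-         DruidDataSource dataSource = new DruidDataSource();
-         dataSource.setUrl(env.getProperty("hive.url"));
-         dataSource.setDriverClassName(env.getProperty("hive.driver-class-name"));
-         dataSource.setUsername(env.getProperty("hive.username"));
-         dataSource.setPassword(env.getProperty("hive.password"));
-         return dataSource;
-     }
-     // JdbcTemplate bound to the Hive DataSource
-     @Bean(name = "hiveJdbcTemplate")
-     public JdbcTemplate hiveJdbcTemplate(@Qualifier("hiveJdbcDataSource") DataSource dataSource) {
-         return new JdbcTemplate(dataSource);
-     }
- }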
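Phoenix:
- import javax.sql.DataSource;
- import org.springframework.beans.factory.annotation.Autowired;
- import org.springframework.beans.factory.annotation.Qualifier;
- import org.springframework.context.annotation.Bean;
- import org.springframework.context.annotation.Configuration;
- import org.springframework.core.env.Environment;
- import org.springframework.jdbc.core.JdbcTemplate;
- import com.alibaba.druid.pool.DruidDataSource;
- /**
-  * Phoenix data source configuration
-  * @author chenty
-  */
- @Configuration
- public class PhoenixDataSource {
-     @Autowired
-     private Environment env;
-     // Druid-pooled DataSource built from the phoenix.* properties in application.yml
-     @Bean(name = "phoenixJdbcDataSource")
-     @Qualifier("phoenixJdbcDataSource")
-     public DataSource dataSource() {
-         DruidDataSource dataSource = new DruidDataSource();
-         dataSource.setUrl(env.getProperty("phoenix.url"));
-         dataSource.setDriverClassName(env.getProperty("phoenix.driver-class-name"));
-         dataSource.setUsername(env.getProperty("phoenix.username")); // Phoenix's user name is empty by default
-         dataSource.setPassword(env.getProperty("phoenix.password")); // Phoenix's password is empty by default
-         dataSource.setDefaultAutoCommit(Boolean.parseBoolean(env.getProperty("phoenix.default-auto-commit")));
-         return dataSource;
-     }
-     // JdbcTemplate bound to the Phoenix DataSource
-     @Bean(name = "phoenixJdbcTemplate")
-     public JdbcTemplate phoenixJdbcTemplate(@Qualifier("phoenixJdbcDataSource") DataSource dataSource) {
-         return new JdbcTemplate(dataSource);
-     }
- }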
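The YAML file lists Druid pool settings such as `max-active` and `maxWait`, but the configuration classes above read only the URL, driver, credentials, and auto-commit flag, so those pool values are never applied. A minimal sketch of reading them with fallbacks might look like the following. `PoolSettings` is a hypothetical holder, `java.util.Properties` stands in for Spring's `Environment`, the keys are assumed to sit under a prefix such as `phoenix.`, and the fallback values are illustrative rather than Druid's own defaults.

```java
import java.util.Properties;

// Hypothetical holder for some of the pool settings listed in application.yml.
// java.util.Properties stands in for Spring's Environment in this sketch.
final class PoolSettings {
    final int maxActive;
    final int initialSize;
    final int minIdle;
    final long maxWaitMillis;
    final boolean testWhileIdle;

    private PoolSettings(int maxActive, int initialSize, int minIdle,
                         long maxWaitMillis, boolean testWhileIdle) {
        this.maxActive = maxActive;
        this.initialSize = initialSize;
        this.minIdle = minIdle;
        this.maxWaitMillis = maxWaitMillis;
        this.testWhileIdle = testWhileIdle;
    }

    /**
     * Reads the settings under the given prefix (for example "phoenix.").
     * The fallback values are assumptions made for this sketch.
     */
    static PoolSettings from(Properties props, String prefix) {
        return new PoolSettings(
                Integer.parseInt(props.getProperty(prefix + "max-active", "10")),
                Integer.parseInt(props.getProperty(prefix + "initialSize", "1")),
                Integer.parseInt(props.getProperty(prefix + "minIdle", "1")),
                Long.parseLong(props.getProperty(prefix + "maxWait", "60000")),
                Boolean.parseBoolean(props.getProperty(prefix + "testWhileIdle", "true")));
    }
}
```

Inside a configuration class, values like these could then be handed to the matching `DruidDataSource` setters, for instance `setMaxActive`, `setInitialSize`, `setMinIdle`, `setMaxWait`, and `setTestWhileIdle`, before the data source is returned.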
4. Testing the data sources
Next, simply inject the Hive/Phoenix JdbcTemplate into a test class to interact with Hive/Phoenix:
Hive:
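- import org.junit.Test;
- import org.junit.runner.RunWith;
- import org.springframework.beans.factory.annotation.Autowired;
- import org.springframework.beans.factory.annotation.Qualifier;
- import org.springframework.boot.test.SpringApplicationConfiguration;
- import org.springframework.jdbc.core.JdbcTemplate;
- import org.springframework.test.context.junit4.SpringJUnit4ClassRunner;
- // Note: @SpringApplicationConfiguration was deprecated in Spring Boot 1.4 and removed in 1.5;
- // newer versions use @SpringBootTest instead.
- @RunWith(SpringJUnit4ClassRunner.class)
- @SpringApplicationConfiguration(HiveServiceApplication.class)
- public class MainTest {
-     @Autowired
-     @Qualifier("hiveJdbcTemplate")
-     JdbcTemplate hiveJdbcTemplate;
-     @Test
-     public void dataSourceTest() {
-         // create table
-         StringBuilder sql = new StringBuilder("create table IF NOT EXISTS ");
-         sql.append("HIVE_TEST1 ");
-         sql.append("(KEY INT, VALUE STRING) ");
-         sql.append("PARTITIONED BY (S_TIME DATE) "); // partitioned storage
-         sql.append("ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' LINES TERMINATED BY '\n' "); // field and line delimiters
-         sql.append("STORED AS TEXTFILE"); // store as plain text
-         // drop table
-         // StringBuilder sql = new StringBuilder("DROP TABLE IF EXISTS ");
-         // sql.append("HIVE_TEST1");
-         hiveJdbcTemplate.execute(sql.toString());
-     }
- }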
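When a statement is appended piece by piece as in the Hive test, it is easy to lose the space between two clauses. One alternative is to join the clauses with a single separator, sketched below with `String.join` (Java 8+). `HiveDdl` is a hypothetical helper, and the backslashes are doubled so that Hive receives the literal `\t` and `\n` escape sequences as they would appear in a HiveQL script; that is a deliberate difference from the test above, which sends actual tab and newline characters.

```java
// Hypothetical helper that assembles the CREATE TABLE statement used in the Hive test.
final class HiveDdl {

    private HiveDdl() {
    }

    /** Builds the DDL for a date-partitioned, tab-delimited text table. */
    static String createPartitionedTextTable(String table) {
        // Joining with a single space guarantees exactly one separator between clauses.
        return String.join(" ",
                "CREATE TABLE IF NOT EXISTS " + table,
                "(KEY INT, VALUE STRING)",
                "PARTITIONED BY (S_TIME DATE)",
                "ROW FORMAT DELIMITED FIELDS TERMINATED BY '\\t' LINES TERMINATED BY '\\n'",
                "STORED AS TEXTFILE");
    }
}
```

The result of `HiveDdl.createPartitionedTextTable("HIVE_TEST1")` could be passed to `hiveJdbcTemplate.execute(...)` in place of the hand-built string.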
Phoenix:
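- import org.junit.Test;
- import org.junit.runner.RunWith;
- import org.springframework.beans.factory.annotation.Autowired;
- import org.springframework.beans.factory.annotation.Qualifier;
- import org.springframework.boot.test.SpringApplicationConfiguration;
- import org.springframework.jdbc.core.JdbcTemplate;
- import org.springframework.test.context.junit4.SpringJUnit4ClassRunner;
- @RunWith(SpringJUnit4ClassRunner.class)
- @SpringApplicationConfiguration(HBaseServiceApplication.class)
- public class MainTest {
-     @Autowired
-     @Qualifier("phoenixJdbcTemplate")
-     JdbcTemplate phoenixJdbcTemplate;
-     @Test
-     public void dataSourceTest() {
-         // create a Phoenix table
-         phoenixJdbcTemplate.execute("create table IF NOT EXISTS PHOENIX_TEST2 (ID INTEGER not null primary key, Name varchar(20),Age INTEGER)");
-     }
- }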
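Once the table exists, the natural next step is writing rows. Phoenix uses `UPSERT` rather than `INSERT`, so a parameterized statement for `PHOENIX_TEST2` could be assembled as in the sketch below. `PhoenixSql` is a hypothetical helper, and the column list assumes the unquoted names from the table definition, which Phoenix normalizes to upper case.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical helper that builds a parameterized Phoenix UPSERT statement.
final class PhoenixSql {

    private PhoenixSql() {
    }

    /** Returns "UPSERT INTO table (c1, c2, ...) VALUES (?, ?, ...)". */
    static String upsert(String table, List<String> columns) {
        // One "?" placeholder per column, in the same order as the column list
        String placeholders = String.join(", ", Collections.nCopies(columns.size(), "?"));
        return "UPSERT INTO " + table
                + " (" + String.join(", ", columns) + ")"
                + " VALUES (" + placeholders + ")";
    }
}
```

With the template from the test, the statement could be run as `phoenixJdbcTemplate.update(sql, 1, "alice", 30)`, where the three values are made-up sample data. With auto-commit disabled on the data source, such a write would stay uncommitted until the connection commits it explicitly.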
5. The traditional approach
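Although Spring Boot itself discourages traditional XML configuration, practical constraints in production sometimes force us to bring XML configuration files in. So here is a brief look at how to configure Hive/Phoenix in XML and load that configuration under Spring Boot: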
application.xml:
- <!-- Hive JdbcTemplate -->
- <bean id="hiveTemplate" class="org.springframework.jdbc.core.JdbcTemplate">
-   <constructor-arg ref="hiveDataSource"/>
-   <qualifier value="hiveJdbcTemplate"/>
- </bean>
- <bean id="hiveDataSource" class="com.alibaba.druid.pool.DruidDataSource">
-   <property name="driverClassName" value="org.apache.hive.jdbc.HiveDriver"/>
-   <property name="url" value="jdbc:hive2://172.20.36.212:10000/default"/>
-   <property name="username" value="hive"/>
-   <property name="password" value="hive"/>
-   <!-- Initial number of connections -->
-   <property name="initialSize" value="0" />
-   <!-- Maximum number of active connections in the pool -->
-   <property name="maxActive" value="1500" />
-   <!-- Minimum number of idle connections in the pool -->
-   <property name="minIdle" value="0" />
-   <!-- Maximum wait time when acquiring a connection, in milliseconds -->
-   <property name="maxWait" value="60000" />
- </bean>
- <!-- Phoenix JdbcTemplate -->
- <bean id="phoenixTemplate" class="org.springframework.jdbc.core.JdbcTemplate">
-   <constructor-arg ref="phoenixDataSource"/>
-   <qualifier value="phoenixJdbcTemplate"/>
- </bean>
- <bean id="phoenixDataSource" class="com.alibaba.druid.pool.DruidDataSource">
-   <property name="driverClassName" value="org.apache.phoenix.jdbc.PhoenixDriver"/>
-   <property name="url" value="jdbc:phoenix:172.20.36.212"/>
-   <!-- Initial number of connections -->
-   <property name="initialSize" value="0" />
-   <!-- Maximum number of active connections in the pool -->
-   <property name="maxActive" value="1500" />
-   <!-- Minimum number of idle connections in the pool -->
-   <property name="minIdle" value="0" />
-   <!-- Maximum wait time when acquiring a connection, in milliseconds -->
-   <property name="maxWait" value="60000" />
-   <!-- Phoenix does not auto-commit data changes by default, so defaultAutoCommit must be set; otherwise changes are never committed -->
-   <property name="defaultAutoCommit" value="true"/>
- </bean>
Testing:
With the XML configuration in place, adding the following annotation to the class declaration of the test classes from step 4 is enough to load it:
- @ImportResource({"classpath:application.xml","..."})