How to Connect to Hive from Python via Thrift

This article shows how to connect to Hive from Python over Thrift: starting the required Hive services, installing the client module and its SASL dependencies, and running a test query.


Once Hive is installed, if you only need to use it locally, start the metastore service and then work from the hive CLI:

nohup hive --service metastore &

[hadoop@master1 usr]$ hive

Logging initialized using configuration in file:/data/usr/hive/conf/hive-log4j.properties
hive> use fmcm;
OK
Time taken: 0.874 seconds

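If Hive is going to be called from scripts, you need to start HiveServer2 and make sure port 10000 is being listened on (the port can be changed in hive-site.xml):

nohup hive --service hiveserver2 &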

[hadoop@master1 usr]$ netstat -an|grep 10000            
tcp        0      0 0.0.0.0:10000           0.0.0.0:*               LISTEN
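
The same check can be done from Python before a script tries to connect. The sketch below is only illustrative. It assumes the port is configured through the hive.server2.thrift.port property in hive-site.xml and falls back to 10000 when that property is absent. The helper names read_thrift_port and is_port_open are made up for this example.

```python
import socket
import xml.etree.ElementTree as ET

DEFAULT_PORT = 10000  # HiveServer2's default Thrift port


def read_thrift_port(hive_site_xml, default=DEFAULT_PORT):
    """Return hive.server2.thrift.port from hive-site.xml text, or the default."""
    root = ET.fromstring(hive_site_xml)
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip()
        if name == "hive.server2.thrift.port" and value:
            return int(value)
    return default


def is_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except socket.error:
        # Refused, unreachable, or timed out: treat all as "not listening".
        return False
    sock.close()
    return True
```

A script could read the contents of hive-site.xml, pass them to read_thrift_port, and then call is_port_open with the HiveServer2 host before attempting any query.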

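HiveServer2 provides an interface for clients to run Hive queries remotely. It is implemented over Thrift RPC and also supports multi-user concurrency and authentication. From Python, the pyhs2 module can connect to HiveServer2 to run queries and fetch the results.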

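However, pyhs2 is no longer maintained. If you want something current, consider the two Python packages below (pyhs2 has been reported to have performance bottlenecks, so it is best to switch to PyHive as soon as you can):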

https://github.com/dropbox/PyHive

https://github.com/cloudera/impyla
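
For readers moving to PyHive, a minimal sketch of an equivalent query might look like the following. It assumes PyHive is installed together with its Hive dependencies and that the server accepts PyHive's default authentication mode. The host, user, database, and table in the commented usage are the placeholder values from this article's test script. connect_hive and run_query are helper names invented here.

```python
def connect_hive(host, username, database="default", port=10000):
    """Open a PyHive connection to HiveServer2.

    The import is done lazily so the rest of this module can be used
    (and tested) on machines where PyHive is not installed.
    """
    from pyhive import hive  # requires the PyHive package
    return hive.Connection(host=host, port=port,
                           username=username, database=database)


def run_query(conn, sql):
    """Execute sql on any DB-API 2.0 connection and return all rows."""
    cursor = conn.cursor()
    try:
        cursor.execute(sql)
        return cursor.fetchall()
    finally:
        # Release the cursor even if execute() or fetchall() raises.
        cursor.close()


# Usage (needs a reachable HiveServer2, so it is left commented out):
# conn = connect_hive("10.24.33.3", "hadoop", database="fmcm")
# try:
#     print(run_query(conn, "select * from fm_news_newsaction limit 10"))
# finally:
#     conn.close()
```

Because run_query only relies on the DB-API cursor interface, the same helper works with other DB-API drivers.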

If installing sasl fails, install these packages first:
yum install gcc-c++ python-devel.x86_64 cyrus-sasl-devel.x86_64

The pyhs2 project is hosted on GitHub at https://github.com/BradRuderman/pyhs2, and it can also be downloaded directly from https://pypi.python.org/pypi/pyhs2/0.2

If the installation does not succeed, try installing the following components first:

yum install cyrus-sasl-plain
yum install cyrus-sasl-devel

If you run into this error during installation:

error: sasl/sasl.h: No such file or directory

try installing SASL first. On Ubuntu, use sudo apt-get install libsasl2-dev. On CentOS, you can install it with Anaconda's pip, or build it from source as follows:

curl -O -L ftp://ftp.cyrusimap.org/cyrus-sasl/cyrus-sasl-2.1.26.tar.gz
tar xzf cyrus-sasl-2.1.26.tar.gz
cd cyrus-sasl-2.1.26
./configure && make && make install
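
Once the system packages are in place, a quick way to confirm the Python side is to try importing the modules the client depends on. This is a small sketch rather than an official check. The module list in the usage comment (sasl, thrift, pyhs2) is an assumption about what the client needs, and missing_modules is a hypothetical helper name.

```python
import importlib


def missing_modules(names):
    """Return the subset of module names that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            # Covers both "not installed" and a C extension that fails to load.
            missing.append(name)
    return missing


# Usage: an empty list means every module imported successfully.
# print(missing_modules(["sasl", "thrift", "pyhs2"]))
```

If sasl shows up in the result, the header or library installation steps above most likely still need to be completed before reinstalling the Python package.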


Finally, here is the test code:

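# -*- coding:utf-8 -*-
'''
Connect to Hive through HiveServer2 (Thrift) using pyhs2.
This script targets Python 2: the reload/setdefaultencoding calls below
do not work this way in Python 3.
'''
import pyhs2
import sys

# Python 2 workaround so that non-ASCII text is handled as UTF-8 by default.
reload(sys)
sys.setdefaultencoding('utf8')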

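class HiveClient:
    """Thin wrapper around a pyhs2 connection to HiveServer2."""

    def __init__(self, db_host, user, password, database, port=10000, authMechanism="PLAIN"):
        # Open the Thrift connection to HiveServer2.
        self.conn = pyhs2.connect(host=db_host,
                                  port=port,
                                  authMechanism=authMechanism,
                                  user=user,
                                  password=password,
                                  database=database)

    def query(self, sql):
        """Run a statement and return all result rows."""
        with self.conn.cursor() as cursor:
            cursor.execute(sql)
            return cursor.fetch()

    def close(self):
        """Close the underlying connection."""
        self.conn.close()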


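def main():
    """Run a sample query against the fmcm database and print the rows."""
    hive_client = HiveClient(db_host='10.24.33.3', port=10000, user='hadoop', password='hadoop',
                             database='fmcm', authMechanism='PLAIN')
    try:
        result = hive_client.query('select * from fm_news_newsaction limit 10')
        # With a single argument, print(...) behaves the same in Python 2 and 3.
        print(result)
    finally:
        # Always release the connection, even if the query fails.
        hive_client.close()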


if __name__ == '__main__':
    main()
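
The query method above returns each row as a plain list of values, which can be awkward to read once a table has many columns. The sketch below pairs each row with column names supplied by the caller. The pyhs2 README shows a cursor.getSchema() call that could provide them, but the exact shape of its return value is not covered here, so the names are passed in explicitly. rows_to_dicts and the column names in the usage comment are hypothetical.

```python
def rows_to_dicts(column_names, rows):
    """Pair every row with the given column names, one dict per row."""
    result = []
    for row in rows:
        if len(row) != len(column_names):
            raise ValueError("row has %d values but %d column names were given"
                             % (len(row), len(column_names)))
        result.append(dict(zip(column_names, row)))
    return result


# Usage with made-up column names and data:
# rows_to_dicts(["id", "title"], [[1, "a"], [2, "b"]])
```

Raising on a length mismatch is a deliberate choice in this sketch: zip would otherwise silently drop values when the column list and the row disagree.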

That covers how to connect to Hive from Python via Thrift. Several of these points are likely to come up in day-to-day work.

When reposting, please cite: http://bzwzjz.com/article/pdpddo.html
