pyhive, pyspark configuration
pyhive
Check HiveServer2
$HIVE_HOME/bin/hiveserver2
$HIVE_HOME/bin/beeline
!connect jdbc:hive2://localhost:10000
If beeline fails with the error `User: xxx is not allowed to impersonate anonymous`, add the following configuration to Hadoop's core-site.xml:
<!-- Fix the insufficient-permission error when beeline connects to Hive -->
<!-- Allow the root user to log in -->
<property>
  <name>hadoop.proxyuser.root.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.root.groups</name>
  <value>*</value>
</property>
<!-- Allow the spark user to log in -->
<property>
  <name>hadoop.proxyuser.spark.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.spark.groups</name>
  <value>*</value>
</property>
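The four `<property>` blocks follow a fixed pattern (one `hosts` and one `groups` entry per proxy user), so editing them by hand is error-prone. As an optional sketch using only the Python standard library, the same elements can be generated programmatically; the `proxyuser_properties` helper and its user list are illustrative, not part of any Hadoop tooling:

```python
import xml.etree.ElementTree as ET

def proxyuser_properties(users=("root", "spark")):
    """Build one <property> element per (user, suffix) pair,
    matching the core-site.xml snippet above."""
    props = []
    for user in users:
        for suffix in ("hosts", "groups"):
            prop = ET.Element("property")
            ET.SubElement(prop, "name").text = f"hadoop.proxyuser.{user}.{suffix}"
            ET.SubElement(prop, "value").text = "*"
            props.append(prop)
    return props

for p in proxyuser_properties():
    print(ET.tostring(p, encoding="unicode"))
```

After changing core-site.xml, the new proxyuser settings take effect once Hadoop is restarted.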
Install pyhive
sudo apt-get install libsasl2-dev  # install first: pip's sasl build needs the SASL dev headers
pip install sasl
pip install thrift
pip install thrift-sasl
pip install pyhive
Connect with pyhive
from pyhive import hive

conn = hive.Connection(host='127.0.0.1', port=10000, auth='CUSTOM',
                       username='root', password='hive')
cursor = conn.cursor()
cursor.execute('select * from t limit 10')
for result in cursor.fetchall():
    print(result)
cursor.close()
conn.close()
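pyhive exposes the standard Python DB-API 2.0 interface, so the connect/cursor/execute/fetchall workflow above is the same one used by stdlib database drivers. As an offline illustration of that pattern (using the stdlib `sqlite3` module in place of a live HiveServer2, with a made-up table `t`):

```python
import sqlite3

# Same DB-API 2.0 workflow as the pyhive snippet above, but against an
# in-memory SQLite database so it runs without a HiveServer2.
conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
cursor.execute("CREATE TABLE t (id INTEGER, name TEXT)")
cursor.executemany("INSERT INTO t VALUES (?, ?)", [(1, "a"), (2, "b")])
cursor.execute("SELECT * FROM t LIMIT 10")
for result in cursor.fetchall():
    print(result)
cursor.close()
conn.close()
```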
pyspark
Add the following environment variables to your shell profile (e.g. ~/.bashrc):

export JAVA_HOME=/opt/java
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export PATH=${JAVA_HOME}/bin:$PATH
export HADOOP_HOME=/opt/hadoop
export CLASSPATH=$($HADOOP_HOME/bin/hadoop classpath):$CLASSPATH
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
export HIVE_HOME=/opt/hive
export PATH=$PATH:$HIVE_HOME/bin
export SCALA_HOME=/opt/scala
export PATH=${SCALA_HOME}/bin:$PATH
export SPARK_HOME=/opt/spark
export PATH=${SPARK_HOME}/bin:$PATH
export PYTHONPATH=$SPARK_HOME/python:$SPARK_HOME/python/lib/py4j-0.10.9-src.zip:$PYTHONPATH
export PYSPARK_PYTHON=/home/spark/envs/py3/bin/python3
export PYSPARK_DRIVER_PYTHON=/home/spark/envs/py3/bin/python3
export PATH=/opt/sbt/:$PATH
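Retyping the `export` lines above invites typos, and the `*_HOME`/PATH pairs all follow the same shape. As a small optional sketch (stdlib only; the paths and the `export_lines` helper are just the ones assumed in this article, not any official tool), the lines can be generated from a single mapping and then appended to ~/.bashrc:

```python
# Paths as assumed in this article; adjust to your installation.
HOMES = {
    "JAVA_HOME": "/opt/java",
    "HADOOP_HOME": "/opt/hadoop",
    "HIVE_HOME": "/opt/hive",
    "SCALA_HOME": "/opt/scala",
    "SPARK_HOME": "/opt/spark",
}

def export_lines(homes=HOMES):
    """Return `export VAR=path` lines plus one combined PATH entry
    that prepends each tool's bin directory."""
    lines = [f"export {var}={path}" for var, path in homes.items()]
    bins = ":".join(f"${{{var}}}/bin" for var in homes)
    lines.append(f"export PATH={bins}:$PATH")
    return lines

print("\n".join(export_lines()))
```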