CDH Tips and Tricks (1): Databases for Cloudera Manager and Managed Services
Published: 2025/3/21
豆豆
Background
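To meet business needs, our big data platform has to run Spark for machine learning, data mining, real-time computing, and similar workloads, so we decided to use Cloudera Manager 5.2.0 with CDH5. I had previously set up Cloudera Manager 4.8.2 with CDH4. While installing Cloudera Manager 5.2.0, I found that the Host Monitor and Service Monitor roles could not be configured to use an external database. At first I assumed I had misconfigured something, but it turned out that the newer Cloudera release had changed how these roles store their data. After reading through a lot of documentation, I confirmed it: in the new version, Host Monitor and Service Monitor do not need a database to be configured. They use a built-in storage mechanism by default, and this cannot be changed.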
Overview
Cloudera Manager uses databases to store information about the Cloudera Manager configuration, as well as information such as the health of the system or task progress. For quick, simple installations, Cloudera Manager can install and configure an embedded PostgreSQL database as part of the Cloudera Manager installation process. In addition, some CDH services use databases and are automatically configured to use a default database. If you plan to use the embedded and default databases provided during the Cloudera Manager installation, see Installation Path A - Automated Installation by Cloudera Manager.
Although the embedded database is useful for getting started quickly, you can also use your own PostgreSQL, MySQL, or Oracle database for the Cloudera Manager Server and services that use databases.
Required Databases
The Cloudera Manager Server, Activity Monitor, Reports Manager, Hive Metastore, Sentry Server, Cloudera Navigator Audit Server, and Cloudera Navigator Metadata Server all require databases. The type of data contained in the databases and their estimated sizes are as follows:
- Cloudera Manager - Contains all the information about services you have configured and their role assignments, all configuration history, commands, users, and running processes. This relatively small database (<100 MB) is the most important to back up.
- Activity Monitor - Contains information about past activities. In large clusters, this database can grow large. Configuring an Activity Monitor database is only necessary if a MapReduce service is deployed.
- Reports Manager - Tracks disk utilization and processing activities over time. Medium-sized.
- Hive Metastore - Contains Hive metadata. Relatively small.
- Sentry Server - Contains authorization metadata. Relatively small.
- Cloudera Navigator Audit Server - Contains auditing information. In large clusters, this database can grow large.
- Cloudera Navigator Metadata Server - Contains authorization, policies, and audit report metadata. Relatively small.
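The list above can be captured as a small catalog, which may help when planning backups or capacity. This is only a sketch: the `ManagedDatabase` class and the helper names are hypothetical, and the size labels simply restate the qualitative descriptions in the list.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ManagedDatabase:
    component: str  # role or service that owns the database
    contents: str   # what the database stores, per the list above
    size: str       # qualitative size label, not a measured value


DATABASES = [
    ManagedDatabase("Cloudera Manager",
                    "services, role assignments, configuration history, "
                    "commands, users, running processes",
                    "small (<100 MB)"),
    ManagedDatabase("Activity Monitor", "past activities", "can grow large"),
    ManagedDatabase("Reports Manager",
                    "disk utilization and processing activities over time",
                    "medium"),
    ManagedDatabase("Hive Metastore", "Hive metadata", "small"),
    ManagedDatabase("Sentry Server", "authorization metadata", "small"),
    ManagedDatabase("Cloudera Navigator Audit Server",
                    "auditing information", "can grow large"),
    ManagedDatabase("Cloudera Navigator Metadata Server",
                    "authorization, policies, and audit report metadata",
                    "small"),
]


def databases_needed(mapreduce_deployed: bool) -> List[str]:
    """List components needing a database.

    Activity Monitor is dropped when no MapReduce service is deployed,
    which is the only condition the list above states.
    """
    return [db.component for db in DATABASES
            if db.component != "Activity Monitor" or mapreduce_deployed]


def growth_candidates() -> List[str]:
    """Components whose databases are described as able to grow large."""
    return [db.component for db in DATABASES if db.size == "can grow large"]
```

Calling `growth_candidates()` highlights the two databases worth watching on large clusters, while the Cloudera Manager database stays small but is the most important one to back up.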
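The Cloudera Manager Service Host Monitor and Service Monitor roles have an internal datastore. (Note: this is the sentence that explains the behavior. In CM5, Host Monitor and Service Monitor cannot be configured with an external database and can only use the internal datastore, which differs from CM4.)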
Cloudera Manager offers three installation paths. Path A is an automated installation; Paths B and C are manual installations using RPM packages or tarballs:
- Path A automatically installs an embedded PostgreSQL database to meet the requirements of the services. This path reduces the number of installation tasks to complete and choices to make. In Path A you can optionally choose to create external databases for Activity Monitor, Reports Manager, Hive Metastore, Sentry Server, Cloudera Navigator Audit Server, and Cloudera Navigator Metadata Server.
- Path B and Path C require you to create databases for the Cloudera Manager Server, Activity Monitor, Reports Manager, Hive Metastore, Sentry Server, Cloudera Navigator Audit Server, and Cloudera Navigator Metadata Server.
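The difference between the paths comes down to which databases you have to create yourself. A minimal sketch of that rule, assuming a hypothetical `database_plan` helper (nothing here is a Cloudera API):

```python
from typing import Dict, List

# Roles and services that can use an external database in every path.
SERVICE_DATABASES = [
    "Activity Monitor",
    "Reports Manager",
    "Hive Metastore",
    "Sentry Server",
    "Cloudera Navigator Audit Server",
    "Cloudera Navigator Metadata Server",
]

# Roles with an internal datastore in CM5: never given an external database.
INTERNAL_DATASTORE_ROLES = ["Host Monitor", "Service Monitor"]


def database_plan(path: str) -> Dict[str, List[str]]:
    """Return which databases must or may be created for an installation path."""
    normalized = path.strip().upper()
    if normalized == "A":
        # The embedded PostgreSQL database covers everything by default;
        # external databases for the services are optional.
        return {"must_create": [], "may_create": list(SERVICE_DATABASES)}
    if normalized in ("B", "C"):
        # Manual paths: every database has to be created up front.
        return {
            "must_create": ["Cloudera Manager Server"] + SERVICE_DATABASES,
            "may_create": [],
        }
    raise ValueError(f"unknown installation path: {path!r}")
```

For example, `database_plan("B")["must_create"]` lists seven databases, while `database_plan("A")["must_create"]` is empty. Host Monitor and Service Monitor appear in neither result because they always use the internal datastore.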
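Using external databases requires more input and related work, but in exchange Cloudera offers greater compatibility and extensibility, so you can choose database types and configurations flexibly. You can deploy several different database types in a single environment, but doing so introduces many uncertainties, so Cloudera recommends using one database type throughout.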
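In most cases, you should install a service and its database on the same host, which reduces network I/O and improves overall efficiency. You can also install the service and the database on separate hosts. This is common in large deployments or when database administrators require it, for example when Oracle DBAs need to manage the databases independently.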
For database setup, refer to the official documentation, which gives detailed configuration steps:
- Setting up the Cloudera Manager Server database
- Setting up external databases for Activity Monitor, Reports Manager, Hive Metastore, Sentry Server, Cloudera Navigator Audit Server, and Cloudera Navigator Metadata Server
- Setting up external databases for Hue and Oozie
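As a rough illustration of what creating the external databases involves, the sketch below generates plain `CREATE DATABASE` statements. The database names in `EXAMPLE_DATABASES` are hypothetical and chosen only for this example, and the helper is not part of any Cloudera tool. The statement form shown is accepted by MySQL and PostgreSQL; Oracle is typically organized around users and schemas instead, so follow the official steps for users, privileges, and character sets.

```python
import re
from typing import Iterable, List

# Hypothetical database names, used only for illustration.
EXAMPLE_DATABASES = {
    "Activity Monitor": "amon",
    "Reports Manager": "rman",
    "Hive Metastore": "metastore",
}

# Accept only simple identifiers so the generated SQL stays well-formed.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def create_database_statements(names: Iterable[str]) -> List[str]:
    """Build one CREATE DATABASE statement per validated database name."""
    statements = []
    for name in names:
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError(f"unsafe database name: {name!r}")
        statements.append(f"CREATE DATABASE {name};")
    return statements


if __name__ == "__main__":
    for statement in create_database_statements(EXAMPLE_DATABASES.values()):
        print(statement)
```

Generating the statements from one mapping keeps the names consistent with whatever is later entered in the Cloudera Manager database configuration.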
In the next article, I will describe in detail how Cloudera Manager stores data in its databases, and how to configure and tune them.
Original article. Reposting is welcome; please credit the source.