
Integrating and Configuring Atlas 2.1.0 on CDH 6.3.2


Download the Atlas source package from the official site: http://atlas.apache.org/2.1.0/index.html#/Downloads
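A minimal sketch of fetching and unpacking the source; the archive mirror path and the extracted directory name are assumptions, so check the Downloads page first:

wget https://archive.apache.org/dist/atlas/2.1.0/apache-atlas-2.1.0-sources.tar.gz
tar -zxvf apache-atlas-2.1.0-sources.tar.gz
cd apache-atlas-sources-2.1.0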

I. Building Atlas from Source

1. Modify the pom file

Because we are integrating with CDH 6.3.2, add the following to the <repositories> section of the root pom.xml:

<repository>
    <id>cloudera</id>
    <url>https://repository.cloudera.com/artifactory/cloudera-repos</url>
    <releases>
        <enabled>true</enabled>
    </releases>
    <snapshots>
        <enabled>false</enabled>
    </snapshots>
</repository>

Next, update the component versions to match CDH. Watch the hyphen character: version strings copied straight from CDH pages may carry a dash-like character instead of a plain hyphen, which Maven will not resolve.

<lucene-solr.version>7.4.0-cdh6.3.2</lucene-solr.version>
<hadoop.version>3.0.0-cdh6.3.2</hadoop.version>
<hbase.version>2.1.0-cdh6.3.2</hbase.version>
<solr.version>7.4.0-cdh6.3.2</solr.version>
<hive.version>2.1.1-cdh6.3.2</hive.version>
<kafka.version>2.2.1-cdh6.3.2</kafka.version>
<kafka.scala.binary.version>2.11</kafka.scala.binary.version>
<calcite.version>1.16.0</calcite.version>
<zookeeper.version>3.4.5-cdh6.3.2</zookeeper.version>
<falcon.version>0.8</falcon.version>
<sqoop.version>1.4.7-cdh6.3.2</sqoop.version>
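To confirm Maven picks up the CDH versions after the edits, a quick hedged check (assumes Maven 3 with a recent maven-help-plugin):

# Print the effective hadoop.version; expect 3.0.0-cdh6.3.2
mvn -N help:evaluate -Dexpression=hadoop.version -q -DforceStdout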

2. Patch the Atlas source for Hive 2.1.1 compatibility

Atlas 2.1.0 targets Hive 3.1 by default; without these changes the Hive hook will throw errors at runtime.

Module to modify: atlas-release-2.1.0-rc3/addons/hive-bridge

① In src/main/java/org/apache/atlas/hive/bridge/HiveMetaStoreBridge.java, line 577, change

String catalogName = hiveDB.getCatalogName() != null ? hiveDB.getCatalogName().toLowerCase() : null;

to

String catalogName = null;

(Hive 2.1.1 predates metastore catalogs, so getCatalogName() does not exist in its Database class.)

② In src/main/java/org/apache/atlas/hive/hook/AtlasHiveHookContext.java, line 81, change

this.metastoreHandler = (listenerEvent != null) ? metastoreEvent.getIHMSHandler() : null;

to

this.metastoreHandler = null;
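Line numbers can drift between source drops; a quick way to locate both spots, run from the source root:

# Find the catalogName line in the metastore bridge
grep -n "getCatalogName" addons/hive-bridge/src/main/java/org/apache/atlas/hive/bridge/HiveMetaStoreBridge.java
# Find the metastoreHandler assignment in the hook context
grep -n "getIHMSHandler" addons/hive-bridge/src/main/java/org/apache/atlas/hive/hook/AtlasHiveHookContext.java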

3. Build

Note: build with the same Java version as your production environment, otherwise you will hit errors.

mvn clean -DskipTests package -Pdist

When the build finishes, the artifacts are in /home/software/atlas/distro/target; a number of tarballs are generated there.
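A quick way to list them (the exact file set may vary by build profile):

ls /home/software/atlas/distro/target/*.tar.gz
# expect apache-atlas-2.1.0-bin.tar.gz plus the per-component hook tarballs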

II. Installing Atlas

1. Extract

Extract apache-atlas-2.1.0-bin.tar.gz into the install directory. Do not use the server package the official docs mention; it lacks the various hook files.
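A minimal sketch, assuming the paths used in the rest of this article:

mkdir -p /home/software
tar -zxvf apache-atlas-2.1.0-bin.tar.gz -C /home/software/
mv /home/software/apache-atlas-2.1.0 /home/software/atlas   # extracted directory name is an assumption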

2. Edit atlas-env.sh

export HBASE_CONF_DIR=/etc/hbase/conf
export ATLAS_SERVER_HEAP="-Xms15360m -Xmx15360m -XX:MaxNewSize=5120m -XX:MetaspaceSize=100M -XX:MaxMetaspaceSize=512m"
export ATLAS_SERVER_OPTS="-server -XX:SoftRefLRUPolicyMSPerMB=0 -XX:+CMSClassUnloadingEnabled -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:+PrintTenuringDistribution -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=dumps/atlas_server.hprof -Xloggc:logs/gc-worker.log -verbose:gc -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=1m -XX:+PrintGCDetails -XX:+PrintHeapAtGC -XX:+PrintGCTimeStamps"
export MANAGE_LOCAL_HBASE=false
export MANAGE_LOCAL_SOLR=false
export MANAGE_EMBEDDED_CASSANDRA=false
export MANAGE_LOCAL_ELASTICSEARCH=false

3. Edit atlas-application.properties

This needs close attention: the HBase, Kafka, Solr, and ZooKeeper settings must all be updated. Note that the sample below mixes two hostname sets (hadoop-10x and master/core); substitute your own hosts consistently.

#
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements.  See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership.  The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License.  You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

#########  Graph Database Configs  #########

# Graph Database
# Configures the graph database to use.  Defaults to JanusGraph
#atlas.graphdb.backend=org.apache.atlas.repository.graphdb.janus.AtlasJanusGraphDatabase

# Graph Storage
# Set atlas.graph.storage.backend to the correct value for your desired storage
# backend. Possible values:
#
# hbase
# cassandra
# embeddedcassandra - Should only be set by building Atlas with -Pdist,embedded-cassandra-solr
# berkeleyje
#
# See the configuration documentation for more information about configuring the various storage backends.
#
atlas.graph.storage.backend=hbase
atlas.graph.storage.hbase.table=apache_atlas_janus

# Hbase
# For standalone mode, specify localhost
# for distributed mode, specify zookeeper quorum here
atlas.graph.storage.hostname=hadoop-101:2181,hadoop-102:2181,hadoop-103:2181
atlas.graph.storage.hbase.regions-per-server=1
atlas.graph.storage.lock.wait-time=10000

# In order to use Cassandra as a backend, comment out the hbase specific properties above, and uncomment the
# following properties
#atlas.graph.storage.clustername=
#atlas.graph.storage.port=

# Gremlin Query Optimizer
#
# Enables rewriting gremlin queries to maximize performance. This flag is provided as
# a possible way to work around any defects that are found in the optimizer until they
# are resolved.
#atlas.query.gremlinOptimizerEnabled=true

# Delete handler
#
# This allows the default behavior of doing "soft" deletes to be changed.
#
# Allowed Values:
# org.apache.atlas.repository.store.graph.v1.SoftDeleteHandlerV1 - all deletes are "soft" deletes
# org.apache.atlas.repository.store.graph.v1.HardDeleteHandlerV1 - all deletes are "hard" deletes
#
#atlas.DeleteHandlerV1.impl=org.apache.atlas.repository.store.graph.v1.SoftDeleteHandlerV1

# Entity audit repository
#
# This allows the default behavior of logging entity changes to hbase to be changed.
#
# Allowed Values:
# org.apache.atlas.repository.audit.HBaseBasedAuditRepository - log entity changes to hbase
# org.apache.atlas.repository.audit.CassandraBasedAuditRepository - log entity changes to cassandra
# org.apache.atlas.repository.audit.NoopEntityAuditRepository - disable the audit repository
#
#atlas.EntityAuditRepository.impl=org.apache.atlas.repository.audit.HBaseBasedAuditRepository

# if Cassandra is used as a backend for audit from the above property, uncomment and set the following
# properties appropriately. If using the embedded cassandra profile, these properties can remain
# commented out.
#atlas.EntityAuditRepository.keyspace=atlas_audit
#atlas.EntityAuditRepository.replicationFactor=1

# Graph Search Index
atlas.graph.index.search.backend=solr

# Solr
# Solr cloud mode properties
atlas.graph.index.search.solr.mode=cloud
atlas.graph.index.search.solr.zookeeper-url=master1:2181/solr,master2:2181/solr,core1:2181/solr
atlas.graph.index.search.solr.zookeeper-connect-timeout=60000
atlas.graph.index.search.solr.zookeeper-session-timeout=60000
atlas.graph.index.search.solr.wait-searcher=true

# Solr http mode properties
#atlas.graph.index.search.solr.mode=http
#atlas.graph.index.search.solr.http-urls=http://localhost:8983/solr

# ElasticSearch support (Tech Preview)
# Comment out above solr configuration, and uncomment the following two lines. Additionally, make sure the
# hostname field is set to a comma delimited set of elasticsearch master nodes, or an ELB that fronts the masters.
#
# Elasticsearch does not provide authentication out of the box, but does provide an option with the X-Pack product
# https://www.elastic.co/products/x-pack/security
#
# Alternatively, the JanusGraph documentation provides some tips on how to secure Elasticsearch without additional
# plugins: https://docs.janusgraph.org/latest/elasticsearch.html
#atlas.graph.index.search.hostname=localhost
#atlas.graph.index.search.elasticsearch.client-only=false

# Solr-specific configuration property
atlas.graph.index.search.max-result-set-size=150

#########  Import Configs  #########
#atlas.import.temp.directory=/temp/import

#########  Notification Configs  #########
atlas.notification.embedded=false
atlas.kafka.data=${sys:atlas.home}/data/kafka
atlas.kafka.zookeeper.connect=hadoop-101:2181,hadoop-102:2181,hadoop-103:2181
atlas.kafka.bootstrap.servers=master1:9092,master2:9092,core1:9092
atlas.kafka.zookeeper.session.timeout.ms=60000
atlas.kafka.zookeeper.connection.timeout.ms=60000
atlas.kafka.zookeeper.sync.time.ms=20
atlas.kafka.auto.commit.interval.ms=1000
atlas.kafka.hook.group.id=atlas
atlas.kafka.enable.auto.commit=false
atlas.kafka.auto.offset.reset=earliest
atlas.kafka.session.timeout.ms=30000
atlas.kafka.offsets.topic.replication.factor=1
atlas.kafka.poll.timeout.ms=1000

atlas.notification.create.topics=true
atlas.notification.replicas=1
atlas.notification.topics=ATLAS_HOOK,ATLAS_ENTITIES
atlas.notification.log.failed.messages=true
atlas.notification.consumer.retry.interval=500
atlas.notification.hook.retry.interval=1000

# Enable for Kerberized Kafka clusters
#atlas.notification.kafka.service.principal=kafka/_HOST@EXAMPLE.COM
#atlas.notification.kafka.keytab.location=/etc/security/keytabs/kafka.service.keytab

## Server port configuration
atlas.server.http.port=21000
#atlas.server.https.port=21443

#########  Security Properties  #########

# SSL config
atlas.enableTLS=false
#truststore.file=/path/to/truststore.jks
#cert.stores.credential.provider.path=jceks://file/path/to/credentialstore.jceks
# following only required for 2-way SSL
#keystore.file=/path/to/keystore.jks

# Authentication config
atlas.authentication.method.kerberos=false
atlas.authentication.method.file=true

#### ldap.type= LDAP or AD
atlas.authentication.method.ldap.type=none

#### user credentials file
atlas.authentication.method.file.filename=${sys:atlas.home}/conf/users-credentials.properties

### groups from UGI
#atlas.authentication.method.ldap.ugi-groups=true

######## LDAP properties #########
#atlas.authentication.method.ldap.url=ldap://<ldap server url>:389
#atlas.authentication.method.ldap.userDNpattern=uid={0},ou=People,dc=example,dc=com
#atlas.authentication.method.ldap.groupSearchBase=dc=example,dc=com
#atlas.authentication.method.ldap.groupSearchFilter=(member=uid={0},ou=Users,dc=example,dc=com)
#atlas.authentication.method.ldap.groupRoleAttribute=cn
#atlas.authentication.method.ldap.base.dn=dc=example,dc=com
#atlas.authentication.method.ldap.bind.dn=cn=Manager,dc=example,dc=com
#atlas.authentication.method.ldap.bind.password=<password>
#atlas.authentication.method.ldap.referral=ignore
#atlas.authentication.method.ldap.user.searchfilter=(uid={0})
#atlas.authentication.method.ldap.default.role=<default role>

######### Active directory properties #######
#atlas.authentication.method.ldap.ad.domain=example.com
#atlas.authentication.method.ldap.ad.url=ldap://<AD server url>:389
#atlas.authentication.method.ldap.ad.base.dn=(sAMAccountName={0})
#atlas.authentication.method.ldap.ad.bind.dn=CN=team,CN=Users,DC=example,DC=com
#atlas.authentication.method.ldap.ad.bind.password=<password>
#atlas.authentication.method.ldap.ad.referral=ignore
#atlas.authentication.method.ldap.ad.user.searchfilter=(sAMAccountName={0})
#atlas.authentication.method.ldap.ad.default.role=<default role>

#########  JAAS Configuration ########
#atlas.jaas.KafkaClient.loginModuleName=com.sun.security.auth.module.Krb5LoginModule
#atlas.jaas.KafkaClient.loginModuleControlFlag=required
#atlas.jaas.KafkaClient.option.useKeyTab=true
#atlas.jaas.KafkaClient.option.storeKey=true
#atlas.jaas.KafkaClient.option.serviceName=kafka
#atlas.jaas.KafkaClient.option.keyTab=/etc/security/keytabs/atlas.service.keytab
#atlas.jaas.KafkaClient.option.principal=atlas/_HOST@EXAMPLE.COM

#########  Server Properties  #########
atlas.rest.address=http://localhost:21000
# If enabled and set to true, this will run setup steps when the server starts
atlas.server.run.setup.on.start=false

#########  Entity Audit Configs  #########
atlas.audit.hbase.tablename=apache_atlas_entity_audit
atlas.audit.zookeeper.session.timeout.ms=1000
atlas.audit.hbase.zookeeper.quorum=hadoop-101:2181,hadoop-102:2181,hadoop-103:2181

#########  High Availability Configuration ########
atlas.server.ha.enabled=false
#### Enabled the configs below as per need if HA is enabled #####
#atlas.server.ids=id1
#atlas.server.address.id1=localhost:21000
#atlas.server.ha.zookeeper.connect=localhost:2181
#atlas.server.ha.zookeeper.retry.sleeptime.ms=1000
#atlas.server.ha.zookeeper.num.retries=3
#atlas.server.ha.zookeeper.session.timeout.ms=20000
## if ACLs need to be set on the created nodes, uncomment these lines and set the values ##
#atlas.server.ha.zookeeper.acl=<scheme>:<id>
#atlas.server.ha.zookeeper.auth=<scheme>:<authinfo>

#########  Atlas Authorization  #########
atlas.authorizer.impl=simple
atlas.authorizer.simple.authz.policy.file=atlas-simple-authz-policy.json

#########  Type Cache Implementation ########
# A type cache class which implements
# org.apache.atlas.typesystem.types.cache.TypeCache.
# The default implementation is org.apache.atlas.typesystem.types.cache.DefaultTypeCache which is a local in-memory type cache.
#atlas.TypeCache.impl=

#########  Performance Configs  #########
#atlas.graph.storage.lock.retries=10
#atlas.graph.storage.cache.db-cache-time=120000

#########  CSRF Configs  #########
atlas.rest-csrf.enabled=true
atlas.rest-csrf.browser-useragents-regex=^Mozilla.*,^Opera.*,^Chrome.*
atlas.rest-csrf.methods-to-ignore=GET,OPTIONS,HEAD,TRACE
atlas.rest-csrf.custom-header=X-XSRF-HEADER

############ KNOX Configs ################
#atlas.sso.knox.browser.useragent=Mozilla,Chrome,Opera
#atlas.sso.knox.enabled=true
#atlas.sso.knox.providerurl=https://<knox gateway ip>:8443/gateway/knoxsso/api/v1/websso
#atlas.sso.knox.publicKey=

############ Atlas Metric/Stats configs ################
# Format: atlas.metric.query.<key>.<name>
atlas.metric.query.cache.ttlInSecs=900
#atlas.metric.query.general.typeCount=
#atlas.metric.query.general.typeUnusedCount=
#atlas.metric.query.general.entityCount=
#atlas.metric.query.general.tagCount=
#atlas.metric.query.general.entityDeleted=
#
#atlas.metric.query.entity.typeEntities=
#atlas.metric.query.entity.entityTagged=
#
#atlas.metric.query.tags.entityTags=

#########  Compiled Query Cache Configuration  #########

# The size of the compiled query cache.  Older queries will be evicted from the cache
# when we reach the capacity.
#atlas.CompiledQueryCache.capacity=1000

# Allows notifications when items are evicted from the compiled query
# cache because it has become full.  A warning will be issued when
# the specified number of evictions have occurred.  If the eviction
# warning threshold <= 0, no eviction warnings will be issued.
#atlas.CompiledQueryCache.evictionWarningThrottle=0

#########  Full Text Search Configuration  #########

# Set to false to disable full text search.
#atlas.search.fulltext.enable=true

#########  Gremlin Search Configuration  #########

# Set to false to disable gremlin search.
atlas.search.gremlin.enable=false

########## Add http headers ###########
#atlas.headers.Access-Control-Allow-Origin=*
#atlas.headers.Access-Control-Allow-Methods=GET,OPTIONS,HEAD,PUT,POST
#atlas.headers.<headerName>=<headerValue>

#########  UI Configuration ########
atlas.ui.default.version=v1

#########  Hive Hook Configs #######
atlas.hook.hive.synchronous=false
atlas.hook.hive.numRetries=3
atlas.hook.hive.queueSize=10000
atlas.cluster.name=primary
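A quick sanity check after editing, to re-read the values you customized (install path assumed from earlier):

grep -E "^(atlas.graph.storage.hostname|atlas.graph.index.search.solr.zookeeper-url|atlas.kafka.bootstrap.servers|atlas.rest.address)=" /home/software/atlas/conf/atlas-application.properties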

4. Edit atlas-log4j.xml

Uncomment the following block:

<appender name="perf_appender" class="org.apache.log4j.DailyRollingFileAppender">
    <param name="file" value="${atlas.log.dir}/atlas_perf.log" />
    <param name="datePattern" value="'.'yyyy-MM-dd" />
    <param name="append" value="true" />
    <layout class="org.apache.log4j.PatternLayout">
        <param name="ConversionPattern" value="%d|%t|%m%n" />
    </layout>
</appender>

<logger name="org.apache.atlas.perf" additivity="false">
    <level value="debug" />
    <appender-ref ref="perf_appender" />
</logger>

5. Integrate with CDH HBase

Make the HBase cluster configuration available under /home/software/atlas/conf/hbase by linking it:

ln -s /etc/hbase/conf/ /home/software/atlas/conf/hbase
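To confirm the link resolves (the file name is assumed to exist in /etc/hbase/conf):

ls -l /home/software/atlas/conf/hbase/hbase-site.xml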

6. Integrate with CDH Solr

① Copy the apache-atlas-2.1.0/conf/solr directory into the Solr install directory and rename it atlas-solr.

② Create the collections.

First give the solr user a login shell so you can switch to it:

vi /etc/passwd    # change the solr user's shell from /sbin/nologin to /bin/bash

su - solr

/opt/cloudera/parcels/CDH-6.3.1-1.cdh6.3.1.p0.1470567/lib/solr/bin/solr create -c vertex_index -d /opt/cloudera/parcels/CDH-6.3.1-1.cdh6.3.1.p0.1470567/lib/solr/atlas-solr -shards 3 -replicationFactor 2
/opt/cloudera/parcels/CDH-6.3.1-1.cdh6.3.1.p0.1470567/lib/solr/bin/solr create -c edge_index -d /opt/cloudera/parcels/CDH-6.3.1-1.cdh6.3.1.p0.1470567/lib/solr/atlas-solr -shards 3 -replicationFactor 2
/opt/cloudera/parcels/CDH-6.3.1-1.cdh6.3.1.p0.1470567/lib/solr/bin/solr create -c fulltext_index -d /opt/cloudera/parcels/CDH-6.3.1-1.cdh6.3.1.p0.1470567/lib/solr/atlas-solr -shards 3 -replicationFactor 2

③ Verify the collections were created.

Log in to the Solr web console at http://xxxx:8983 and confirm the three collections exist.
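Alternatively, check from the shell via the Collections API (replace localhost with one of your Solr hosts):

curl "http://localhost:8983/solr/admin/collections?action=LIST"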

7. Integrate with CDH Kafka

① Create the Kafka topics.

Since atlas.notification.create.topics=true is set above, Atlas creates the ATLAS_HOOK and ATLAS_ENTITIES topics automatically on first start; they can also be created manually, as sketched below.
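A sketch using the CDH kafka-topics wrapper and the ZooKeeper quorum from the config above; the partition count is an assumption, size it for your cluster:

kafka-topics --create --zookeeper hadoop-101:2181,hadoop-102:2181,hadoop-103:2181 --topic ATLAS_HOOK --partitions 3 --replication-factor 1
kafka-topics --create --zookeeper hadoop-101:2181,hadoop-102:2181,hadoop-103:2181 --topic ATLAS_ENTITIES --partitions 3 --replication-factor 1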

② List the topics:

kafka-topics --list --zookeeper hadoop-101:2181,hadoop-102:2181,hadoop-103:2181

8. Start Atlas

cd /home/software/atlas
./bin/atlas_start.py

Log in to the Atlas web console at http://xxxxxx:21000 to verify it started successfully.

The default username and password are both admin.
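A hedged smoke test from the shell (assumes the default admin/admin credentials and port from the config above):

curl -u admin:admin http://localhost:21000/api/atlas/admin/version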

III. Atlas Integration Configuration

1) Integrating Atlas with Hive

1. Configuration changes

Modify the relevant Hive configuration through Cloudera Manager: open the CM web console and go to the Hive service's configuration page.

① Search for hive-site.xml.

In [Hive Service Advanced Configuration Snippet (Safety Valve) for hive-site.xml], add:

Name: hive.exec.post.hooks
Value: org.apache.atlas.hive.hook.HiveHook

② In [Hive Client Advanced Configuration Snippet (Safety Valve) for hive-site.xml], add the same entry:

Name: hive.exec.post.hooks
Value: org.apache.atlas.hive.hook.HiveHook

③ Search for the Hive Auxiliary JARs Directory and set it:

[Hive Auxiliary JARs Directory]
Value: /home/fusion_data/hive_auxlib/

Copy the Atlas hook files to the Hive configuration directory on every node:

cp /home/software/atlas/hook/hive/* /etc/hive/conf
scp /home/software/atlas/hook/hive/* root@hadoop-101:/etc/hive/conf
scp /home/software/atlas/hook/hive/* root@hadoop-103:/etc/hive/conf

Copy the Atlas configuration file atlas-application.properties into the Hive configuration directory on every node:

cp /home/software/atlas/conf/atlas-application.properties /etc/hive/conf
scp /home/software/atlas/conf/atlas-application.properties root@hadoop-101:/etc/hive/conf
scp /home/software/atlas/conf/atlas-application.properties root@hadoop-103:/etc/hive/conf
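To verify the hook end to end, one option is to watch the notification topic while running a test DDL in Hive (hostnames taken from the Kafka config above; the table name is just an example):

kafka-console-consumer --bootstrap-server master1:9092 --topic ATLAS_HOOK
# in another session: hive -e "CREATE TABLE atlas_hook_smoke_test (id INT);"
# a JSON notification for the new table should appear on the topic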

2. Import Hive metadata into Atlas

The default credentials are admin/admin.

cd /home/software/atlas
./bin/import-hive.sh

Enter username for atlas :- admin
Enter password for atlas :-

Hive Meta Data import was successful!!

2) Integrating Atlas with Sqoop

1. Modify the configuration

Set up the Atlas hook in sqoop-site.xml by adding the following:

<property>
    <name>sqoop.job.data.publish.class</name>
    <value>org.apache.atlas.sqoop.hook.SqoopHook</value>
</property>

Copy <atlas package>/conf/atlas-application.properties to <sqoop package>/conf/.

Link <atlas package>/hook/sqoop/*.jar into the Sqoop lib directory, or copy the jars there outright; a minimal sketch follows.
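Both paths below are assumptions; adjust them to your install (a standard CDH parcel layout is shown for the Sqoop lib):

cp /home/software/atlas/conf/atlas-application.properties /etc/sqoop/conf/
cp /home/software/atlas/hook/sqoop/*.jar /opt/cloudera/parcels/CDH/lib/sqoop/lib/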
