Issues integrating dss 0.5 with azkaban 2.5 #11

Open
xccoder opened this issue Dec 6, 2019 · 0 comments

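  1. Installing the linkis jobtypes
    I followed the official installation guide for the automated installation. The last step of sh install.sh failed with: {"error":"Missing required parameter 'execid'."}. The message promised by the guide ("if the installation succeeds, the last line printed is {"status":"success"}") never appeared, yet the installed linkis job plugin was present under azkaban's /plugins/jobtypes directory. Tracing the install script showed that its last step calls "curl http://azkaban_ip:executor_port/executor?action=reloadJobTypePlugins" to refresh the plugins. After restarting the azkaban executor, its log showed that the plugin had been loaded: INFO [JobTypeManager][Azkaban] Loaded jobtype linkis com.webank.wedatasphere.dss.plugins.azkaban.linkis.jobtype.AzkabanDssJobType. I could not find the cause at the time, so I moved on. Looking back after a linkis job had been published to azkaban and had run successfully, I am confident this error was a false alarm.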

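  2. Publishing a project from dss to azkaban

    Problem: the log reported that the current user does not exist in azkaban.

    Investigation: I confirmed that the user reported as nonexistent could log in to azkaban normally. The exception was caught, so the stack trace yielded little information. Remote debugging from my local machine showed that the call httpClient.execute(httpPost, context) in AzkabanSecurityService#getSession failed outright. Our azkaban has https enabled, and the current login call does not support https. As a temporary workaround we turned https off on azkaban.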

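  3. Follow-up to issue 2

    After resolving the previous problem, publishing still failed, but the response returned by response = httpClient.execute(httpPost, context); had changed to "incorrect login". It turned out that the password parameter in the azkaban login request had been written as userpwd. After changing it and repackaging, publishing was verified to work.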
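The parameter fix described above can be sketched as follows. This is a minimal illustration, not the actual dss code: the class name AzkabanLoginForm and the method build are hypothetical. It assumes the azkaban login call is a form-encoded POST whose fields are action=login, username and password, which matches the fix reported here (password rather than userpwd).

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper, not part of dss: builds the form body of an azkaban login request.
class AzkabanLoginForm {

    // Returns a URL-encoded body such as "action=login&username=...&password=...".
    static String build(String username, String password) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("action", "login");
        fields.put("username", username);
        // The field must be named "password"; "userpwd" led to "incorrect login".
        fields.put("password", password);

        StringJoiner body = new StringJoiner("&");
        for (Map.Entry<String, String> field : fields.entrySet()) {
            body.add(encode(field.getKey()) + "=" + encode(field.getValue()));
        }
        return body.toString();
    }

    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a compliant JVM.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(build("alice", "p@ss word"));
    }
}
```

Keeping the field names in one place makes a misnamed parameter easier to spot than when the names are scattered across the code that assembles the request.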

  4. Job published successfully but execution failed

    Problem:

     - azkaban.flow.start.month=12
    05-12-2019 21:15:42 CST sql INFO - azkaban.flow.start.hour=21
    05-12-2019 21:15:42 CST sql INFO - azkaban.flow.uuid=bbdc0985-4a2c-4dab-94ff-ab38e2bbde24
    05-12-2019 21:15:42 CST sql INFO - azkaban.flow.flowid=publish_demo_xc_
    05-12-2019 21:15:42 CST sql INFO - azkaban.flow.start.day=05
    05-12-2019 21:15:42 CST sql INFO - azkaban.job.metadata.file=_job.653.sql.meta
    05-12-2019 21:15:42 CST sql INFO - azkaban.flow.start.timestamp=2019-12-05T21:15:42.829+08:00
    05-12-2019 21:15:42 CST sql INFO - linkistype=linkis.spark.sql
    05-12-2019 21:15:42 CST sql INFO - ****** End Job properties  ******
    05-12-2019 21:15:42 CST sql ERROR - Job run failed!
    05-12-2019 21:15:42 CST sql ERROR - nullnull
    05-12-2019 21:15:42 CST sql INFO - Finishing job sql at 1575551742852 with status FAILED
    

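    Investigation showed that the submitting user was not being retrieved. The submitting user is read from

    azkaban.flow.submituser
    

    Comparing azkaban 3.8 with 2.5 showed that 2.5 does not have this parameter.
    Solution: as a temporary workaround, hardcode the two parameters azkaban.flow.submituser and user.to.proxy, then rebuild and replace the package. This is only for testing the workflow.
    Another option is to compile version 3.8 and redeploy azkaban.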
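A less invasive variant of the hardcoding workaround might look like the sketch below. It is an assumption-laden illustration, not the plugin's actual code: the class SubmitUserResolver, the method resolve and the defaultUser argument are all hypothetical. It assumes the job properties are available as a java.util.Properties object, tries azkaban.flow.submituser first, falls back to user.to.proxy, then to a configured default, and otherwise fails with an explicit message instead of an opaque one like "nullnull".

```java
import java.util.Properties;

// Hypothetical helper, not part of the dss plugin: resolves the submitting user
// from job properties, with fallbacks for azkaban versions that lack the key.
class SubmitUserResolver {

    static final String SUBMIT_USER_KEY = "azkaban.flow.submituser";
    static final String PROXY_USER_KEY = "user.to.proxy";

    static String resolve(Properties jobProps, String defaultUser) {
        // Try the keys in order of preference.
        for (String key : new String[] {SUBMIT_USER_KEY, PROXY_USER_KEY}) {
            String value = jobProps.getProperty(key);
            if (value != null && !value.trim().isEmpty()) {
                return value.trim();
            }
        }
        // Fall back to a configured default (an assumption of this sketch).
        if (defaultUser != null && !defaultUser.trim().isEmpty()) {
            return defaultUser.trim();
        }
        throw new IllegalStateException(
            "No submit user found: neither " + SUBMIT_USER_KEY + " nor "
                + PROXY_USER_KEY + " is set and no default user is configured");
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty(PROXY_USER_KEY, "hadoop");
        System.out.println(resolve(props, null));
    }
}
```

With a fallback of this kind the same build could run against both azkaban 2.5 and 3.8, although the default user would still have to come from somewhere, such as a plugin configuration file.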

renrr pushed a commit to renrr/DataSphereStudio that referenced this issue Mar 30, 2020
Adamyuanyuan pushed a commit that referenced this issue Jul 10, 2020
Dev 0.9.0
peacewong pushed a commit that referenced this issue Oct 16, 2020