news 2026/4/18 1:47:00

Notes on problems encountered while using Apache Doris

张小明

Front-end developer



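This post records the problems I ran into while using Apache Doris, including errors when creating tables, changed ownership of log files, full disks, enabling materialized views, importing Hive data, and failed LOAD jobs, together with the fixes that worked, such as adjusting memory, setting parameters, and repairing permissions.
1, Error while executing a CREATE TABLE statement: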
[Err] 1064 - errCode = 2, detailMessage = Failed to find enough host in all backends. need: 3

Cause:

The statement specified PROPERTIES("replication_num" = "3");

but only 2 BEs were available:

Check the log on the affected node:

==> ./be.WARNING.log.20200921-141304 <==
W1026 18:13:39.139992 19091 utils.cpp:101] fail to get master client from cache. host=192.168.6.143, port=9020, code=7
W1026 18:13:39.140386 19091 task_worker_pool.cpp:1185] finish report olap table state failed. status:-1, master host:192.168.6.143, port:9020
W1026 18:13:40.391201 19089 utils.cpp:101] fail to get master client from cache. host=192.168.6.143, port=9020, code=7
W1026 18:13:40.391471 19089 task_worker_pool.cpp:1060] finish report task failed. status:-1, master host:192.168.6.143port:9020
W1027 10:00:31.385262 2359 data_dir.cpp:128] open file filed, error: IO error: failed to open cluster id file /wyyt/software/doris/be/storage/cluster_id
W1027 10:00:31.385926 2359 data_dir.cpp:95] _init_cluster_id failed, error: IO error: failed to open cluster id file /wyyt/software/doris/be/storage/cluster_id
W1027 10:00:31.385958 2359 storage_engine.cpp:192] Store load failed, status=IO error: failed to open cluster id file /wyyt/software/doris/be/storage/cluster_id, path=/wyyt/software/doris/be/storage
W1027 10:00:31.386071 2353 storage_engine.cpp:148] _init_store_map failed, error: Internal error: init path failed, error=IO error: failed to open cluster id file /wyyt/software/doris/be/storage/cluster_id;
W1027 10:00:31.386106 2353 storage_engine.cpp:96] open engine failed, error: Internal error: init path failed, error=IO error: failed to open cluster id file /wyyt/software/doris/be/storage/cluster_id;
F1027 10:00:31.386186 2353 doris_main.cpp:189] fail to open StorageEngine, res=init path failed, error=IO error: failed to open cluster id file /wyyt/software/doris/be/storage/cluster_id;

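Once the cause is found, fix it. In my case the BE could not open the cluster_id file, so I tried setting its permissions to 755 and then restarted the BE node.

If the restart fails, delete be.pid and restart again.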
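
The fatal line is easy to miss among the warnings. Below is a minimal sketch that picks out FATAL entries from a BE log, assuming the glog-style prefix seen in the dump above (severity letter, MMDD, time, thread id, source location). The function name and the sample text are only for illustration.

```python
import re

# glog-style prefix: severity letter (I/W/E/F), MMDD, time, thread id, file:line]
GLOG_PREFIX = re.compile(
    r"^([IWEF])(\d{4}) (\d{2}:\d{2}:\d{2}\.\d+)\s+\d+ ([^\]]+)\] (.*)$"
)

def fatal_lines(log_text):
    """Return (date, time, source, message) for every line with severity F."""
    hits = []
    for line in log_text.splitlines():
        m = GLOG_PREFIX.match(line)
        if m and m.group(1) == "F":
            hits.append((m.group(2), m.group(3), m.group(4), m.group(5)))
    return hits

# Two lines copied from the log above.
sample = (
    "W1026 18:13:39.139992 19091 utils.cpp:101] fail to get master client "
    "from cache. host=192.168.6.143, port=9020, code=7\n"
    "F1027 10:00:31.386186 2353 doris_main.cpp:189] fail to open StorageEngine, "
    "res=init path failed, error=IO error: failed to open cluster id file "
    "/wyyt/software/doris/be/storage/cluster_id;"
)

for date, time, source, message in fatal_lines(sample):
    print(date, time, source, message)
```

Here the only fatal entry points at doris_main.cpp:189, which is the cluster_id failure that stopped the BE from starting.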

2, The owner of the log files changed


The log files belong to whichever user started the service.

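3, Error when creating a Doris table


Cause: the declared column lengths must not add up to more than 100,000. The limit can be raised through configuration, but that is not recommended.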

4, Disk full
ErrorReason{code=errCode = 2, msg='failed to create task: errCode = 2, detailMessage = disk 6189104187500640169 on backend 11001 exceed limit usage'

This paused all tasks.
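
To catch this before Doris starts rejecting tasks, disk usage on each BE storage path can be checked ahead of time. The sketch below is one way to do that with the standard library. The 95% threshold is a placeholder I chose for illustration, not a value taken from this cluster, and the storage path in the usage note is the one that appears in the BE log in section 1.

```python
import shutil

def usage_percent(total_bytes, used_bytes):
    """Disk usage as a percentage of total capacity."""
    return 100.0 * used_bytes / total_bytes

def over_limit(total_bytes, used_bytes, limit_percent=95.0):
    """True when usage has reached the (assumed) limit."""
    return usage_percent(total_bytes, used_bytes) >= limit_percent

def check_path(path, limit_percent=95.0):
    """Check the filesystem that holds `path` against the limit."""
    usage = shutil.disk_usage(path)
    return over_limit(usage.total, usage.used, limit_percent)
```

For example, `check_path("/wyyt/software/doris/be/storage")` could be run from a cron job on each BE host.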

5, Enabling materialized views
create materialized view test_p_user_view as select user_id,user_name from test_p_user limit 8;
ERROR 1064 (HY000): errCode = 2, detailMessage = The materialized view is coming soon

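Fix: run this command on the master: ADMIN SET FRONTEND CONFIG ("enable_materialized_view" = "true");

At present materialized views only support duplicate key tables. Version 0.12 supports them only partially, and 0.13 is expected to complete the feature.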

6, Workflow for importing Hive data into Doris
1, Create the corresponding table in Doris

2, Execute the statement

7,type:LOAD_RUN_FAIL; msg:errCode = 2, detailMessage = there is no scanNode Backend
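A BE node crashed while importing a large table from HDFS

Fix: set parameters on the FE

The job must specify its memory limit explicitly:

Check the BE log and the core file to see whether the crash was an OOM.

References: https://blog.csdn.net/weixin_42135997/article/details/80732658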

https://blog.csdn.net/qq_15437667/article/details/83934113?utm_medium=distribute.pc_aggpage_search_result.none-task-blog-2~all~sobaiduend~default-1-83934113.nonecase&utm_term=linux%20%E6%80%8E%E4%B9%88%E7%9C%8Bcore%E6%96%87%E4%BB%B6&spm=1000.2123.3001.4430

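8, Commands suddenly stopped executing


The BE nodes showed as Alive.

The be.INFO and be.WARN logs on the BE nodes showed nothing unusual.

It turned out that the disk on one node had failed. Now I know to check the disks when this happens again.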

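9, Rules for broker loads from HDFS
1) I tested a broker load of HDFS data into a table that uses the uniq model. Rows with the same key were not replaced in load order. Instead, the length of the second column decided: the row with the longest second column won, and among rows of equal length the one with the latest time won. If the second columns were identical, the third column's length was compared in the same way.

Resulting data: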

10, Doris broker load failed
type:LOAD_RUN_FAIL; msg:errCode = 2, detailMessage = all partitions have no load data

The source table was empty, so there was no data to load.

11, Running several broker jobs at once crashed a BE node
Cause: most likely the BE ran out of memory.

Fix: limit each broker load to 1 GB per node, or less.
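
One way to stay under that limit is to split the source files into batches before submitting them as separate loads. This is a rough sketch of that idea, not something Doris does for you. `batch_by_size` and the file names are hypothetical, and 1 GB is taken as 1024**3 bytes.

```python
GIB = 1024 ** 3

def batch_by_size(files, limit_bytes=GIB):
    """Greedily group (path, size_in_bytes) pairs, in order, so that each
    batch stays within limit_bytes. A single file larger than the limit
    ends up in a batch of its own."""
    batches, current, current_size = [], [], 0
    for path, size in files:
        # Close the current batch if adding this file would exceed the limit.
        if current and current_size + size > limit_bytes:
            batches.append(current)
            current, current_size = [], 0
        current.append(path)
        current_size += size
    if current:
        batches.append(current)
    return batches

MIB = 1024 ** 2
files = [("part-0", 600 * MIB), ("part-1", 500 * MIB), ("part-2", 300 * MIB)]
print(batch_by_size(files))
```

Each resulting batch would then be submitted as its own broker load, one at a time.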

12, routine load error: errCode = 2, detailMessage = failed to send task: errCode = 2, detailMessage = failed
By default the total task concurrency across the BEs is max_routine_load_task_num_per * number of BEs

For example, with 3 BE nodes the total concurrency is 5*3 = 15
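
The same arithmetic as a small sketch, useful for checking whether the routine load jobs you plan to run fit into the cluster. The per-BE value of 5 comes from the example above, and the function names are made up for illustration.

```python
def total_task_slots(per_be_limit, be_count):
    """Total concurrent routine load tasks the cluster accepts."""
    if per_be_limit < 0 or be_count < 0:
        raise ValueError("limits must be non-negative")
    return per_be_limit * be_count

def fits(requested_tasks, per_be_limit, be_count):
    """True when the requested number of concurrent tasks can be scheduled."""
    return requested_tasks <= total_task_slots(per_be_limit, be_count)

print(total_task_slots(5, 3))
print(fits(18, 5, 3))
```

If the sum of the concurrency requested by all routine load jobs exceeds the total, some tasks cannot be scheduled, which is one plausible reading of the "failed to send task" error.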

13, Via insert into


14, Load job failed


Not enough memory. Increase the memory limit.

15,ETL_QUALITY_UNSATISFIED; msg:quality not good enough to cancel
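What it means: the data quality is poor, so Doris cannot parse the data or the parse fails, and the load job is cancelled.

Possible causes:

1. A varchar value is too long, or there is a problem with the separator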

2. too_many_filtered_rows

Fixes

Do not import long text, or truncate it on import. Also check whether the data itself contains the separator.
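
Both causes can be checked on a sample of the source file before loading. The sketch below counts rows whose column count is wrong (a sign that the data contains the separator) or whose values exceed a length limit. The function name is hypothetical, and measuring length in UTF-8 bytes is my assumption about how the varchar limit applies.

```python
def filtered_ratio(lines, separator, max_lengths):
    """Fraction of rows that fail a simple column-count or length check.

    max_lengths holds one byte limit per expected column.
    """
    total = 0
    bad = 0
    for line in lines:
        total += 1
        fields = line.rstrip("\n").split(separator)
        if len(fields) != len(max_lengths):
            # Too many or too few columns: separator inside the data, or a broken row.
            bad += 1
        elif any(len(f.encode("utf-8")) > limit
                 for f, limit in zip(fields, max_lengths)):
            # A value is longer than its column allows.
            bad += 1
    return bad / total if total else 0.0

rows = ["1,abc", "2,abcdefghijk", "3,a,b"]
print(filtered_ratio(rows, ",", [10, 5]))
```

In this example the second row is too long and the third has an extra column, so two of three rows would be filtered.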

16, Memory was not released after a broker load into Doris

Fix:

Try upgrading Doris to 0.13.15 to verify this problem:

Link: https://cloud.baidu.com/doc/PALO/s/Ikivhcwb5

17, An error that occurred

The Doris version was the 0.13.11 patch release.

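18, The data directory on some BE nodes was very large, while on other BE nodes it was normal.

Initial diagnosis: a cluster load problem, with routine load writing too frequently

Check whether the table is healthy:

Change the routine load parameters and set the batch interval to 60s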

(
'desired_concurrent_number'='3',
'max_batch_interval' = '60',
'max_batch_rows' = '300000',
'max_batch_size' = '209715200',
'strict_mode' = 'false',
'format' = 'json'
)
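
The raw byte count is hard to read at a glance. As a quick sanity check, the sketch below keeps the same properties in a dict, confirms that max_batch_size is 200 MiB, and renders the block back in the quoting style used above. The helper is hypothetical and does not escape quotes inside values.

```python
props = {
    "desired_concurrent_number": "3",
    "max_batch_interval": "60",       # seconds
    "max_batch_rows": "300000",
    "max_batch_size": "209715200",    # bytes
    "strict_mode": "false",
    "format": "json",
}

def render_properties(properties):
    """Render a dict as a parenthesised 'key' = 'value' list."""
    body = ",\n".join(f"'{k}' = '{v}'" for k, v in properties.items())
    return "(\n" + body + "\n)"

print(int(props["max_batch_size"]) // (1024 * 1024), "MiB")
print(render_properties(props))
```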

19, Upgrading to Doris 0.14.7 resolved the earlier Too Many Tasks problem


20, With Doris 0.14.7 deployed with 3 FEs on the internal network, an FE node went down after data was written. The log:

2021-08-27 09:09:25,172 ERROR (heartbeat mgr|19) [BDBJEJournal.write():166] catch an exception when writing to database. sleep and retry. journal id 1526718
com.sleepycat.je.rep.InsufficientAcksException: (JE 7.3.7) Transaction: -16160910 VLSN: 31,775,195, initiated at: 09:09:22. Insufficient acks for policy:SIMPLE_MAJORITY. Need replica acks: 1. Missing replica acks: 1. Timeout: 2000ms. FeederState=192.168.7.5_9010_1625132780567(2)[MASTER]
Current feeds:
192.168.7.7_9010_1625192915300: feederVLSN=31,775,198 replicaTxnEndVLSN=31,775,193
192.168.7.4_9010_1625132697001: feederVLSN=31,775,198 replicaTxnEndVLSN=31,775,191

at com.sleepycat.je.rep.impl.node.DurabilityQuorum.ensureSufficientAcks(DurabilityQuorum.java:205) ~[je-7.3.7.jar:7.3.7]
at com.sleepycat.je.rep.stream.FeederTxns.awaitReplicaAcks(FeederTxns.java:189) ~[je-7.3.7.jar:7.3.7]
at com.sleepycat.je.rep.impl.RepImpl.postLogCommitHookInternal(RepImpl.java:1426) ~[je-7.3.7.jar:7.3.7]
at com.sleepycat.je.rep.impl.RepImpl.postLogCommitHook(RepImpl.java:1385) ~[je-7.3.7.jar:7.3.7]
at com.sleepycat.je.rep.txn.MasterTxn.postLogCommitHook(MasterTxn.java:226) ~[je-7.3.7.jar:7.3.7]
at com.sleepycat.je.txn.Txn.commit(Txn.java:772) ~[je-7.3.7.jar:7.3.7]
at com.sleepycat.je.txn.Txn.commit(Txn.java:625) ~[je-7.3.7.jar:7.3.7]
at com.sleepycat.je.txn.Txn.operationEnd(Txn.java:1803) ~[je-7.3.7.jar:7.3.7]
at com.sleepycat.je.Database.put(Database.java:1506) ~[je-7.3.7.jar:7.3.7]
at com.sleepycat.je.Database.put(Database.java:1556) ~[je-7.3.7.jar:7.3.7]
at org.apache.doris.journal.bdbje.BDBJEJournal.write(BDBJEJournal.java:159) [palo-fe.jar:3.4.0]
at org.apache.doris.persist.EditLog.logEdit(EditLog.java:849) [palo-fe.jar:3.4.0]
at org.apache.doris.persist.EditLog.logHeartbeat(EditLog.java:1265) [palo-fe.jar:3.4.0]
at org.apache.doris.system.HeartbeatMgr.runAfterCatalogReady(HeartbeatMgr.java:154) [palo-fe.jar:3.4.0]
at org.apache.doris.common.util.MasterDaemon.runOneCycle(MasterDaemon.java:58) [palo-fe.jar:3.4.0]
at org.apache.doris.common.util.Daemon.run(Daemon.java:116) [palo-fe.jar:3.4.0]
2021-08-27 09:09:27,884 WARN (Thread-49|192) [BDBJEMetricHandler.write():117] write metric data into bdb error, key:192.168.7.7:8030_query_err_rate_1630026555000
com.sleepycat.je.rep.InsufficientAcksException: (JE 7.3.7) Transaction: -16160912 VLSN: 31,775,198, initiated at: 09:09:23. Insufficient acks for policy:SIMPLE_MAJORITY. Need replica acks: 1. Missing replica acks: 1. Timeout: 2000ms. FeederState=192.168.7.5_9010_1625132780567(2)[MASTER]
Current feeds:
192.168.7.7_9010_1625192915300: feederVLSN=31,775,199 replicaTxnEndVLSN=31,775,196
192.168.7.4_9010_1625132697001: feederVLSN=31,775,199 replicaTxnEndVLSN=31,775,191

at com.sleepycat.je.rep.impl.node.DurabilityQuorum.ensureSufficientAcks(DurabilityQuorum.java:205) ~[je-7.3.7.jar:7.3.7]
at com.sleepycat.je.rep.stream.FeederTxns.awaitReplicaAcks(FeederTxns.java:189) ~[je-7.3.7.jar:7.3.7]
at com.sleepycat.je.rep.impl.RepImpl.postLogCommitHookInternal(RepImpl.java:1426) ~[je-7.3.7.jar:7.3.7]
at com.sleepycat.je.rep.impl.RepImpl.postLogCommitHook(RepImpl.java:1385) ~[je-7.3.7.jar:7.3.7]
at com.sleepycat.je.rep.txn.MasterTxn.postLogCommitHook(MasterTxn.java:226) ~[je-7.3.7.jar:7.3.7]
at com.sleepycat.je.txn.Txn.commit(Txn.java:772) ~[je-7.3.7.jar:7.3.7]
at com.sleepycat.je.txn.Txn.commit(Txn.java:625) ~[je-7.3.7.jar:7.3.7]
at com.sleepycat.je.txn.Txn.operationEnd(Txn.java:1803) ~[je-7.3.7.jar:7.3.7]
at com.sleepycat.je.Database.put(Database.java:1506) ~[je-7.3.7.jar:7.3.7]
at org.apache.doris.metric.collector.BDBJEMetricHandler.write(BDBJEMetricHandler.java:115) ~[palo-fe.jar:3.4.0]
at org.apache.doris.metric.collector.BDBJEMetricHandler.writeDouble(BDBJEMetricHandler.java:109) ~[palo-fe.jar:3.4.0]
at org.apache.doris.metric.collector.MetricCollector.parseFeMetricJsonAndWriteMetric(MetricCollector.java:217) ~[palo-fe.jar:3.4.0]
at org.apache.doris.metric.collector.MetricCollector.writeMetric(MetricCollector.java:105) ~[palo-fe.jar:3.4.0]
at org.apache.doris.metric.collector.MetricCollector.lambda$init$0(MetricCollector.java:77) ~[palo-fe.jar:3.4.0]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_162]
2021-08-27 09:09:33,338 WARN (Thread-49|192) [BDBJEMetricHandler.write():117] write metric data into bdb error, key:192.168.7.7:8030_quantile0.75_1630026555000
com.sleepycat.je.rep.InsufficientAcksException: (JE 7.3.7) Transaction: -16160913 VLSN: 31,775,200, initiated at: 09:09:27. Insufficient acks for policy:SIMPLE_MAJORITY. Need replica acks: 1. Missing replica acks: 1. Timeout: 2000ms. FeederState=192.168.7.5_9010_1625132780567(2)[MASTER]
Current feeds:
192.168.7.7_9010_1625192915300: feederVLSN=31,775,202 replicaTxnEndVLSN=31,775,198
192.168.7.4_9010_1625132697001: feederVLSN=31,775,202 replicaTxnEndVLSN=31,775,196

at com.sleepycat.je.rep.impl.node.DurabilityQuorum.ensureSufficientAcks(DurabilityQuorum.java:205) ~[je-7.3.7.jar:7.3.7]
at com.sleepycat.je.rep.stream.FeederTxns.awaitReplicaAcks(FeederTxns.java:189) ~[je-7.3.7.jar:7.3.7]
at com.sleepycat.je.rep.impl.RepImpl.postLogCommitHookInternal(RepImpl.java:1426) ~[je-7.3.7.jar:7.3.7]
at com.sleepycat.je.rep.impl.RepImpl.postLogCommitHook(RepImpl.java:1385) ~[je-7.3.7.jar:7.3.7]
at com.sleepycat.je.rep.txn.MasterTxn.postLogCommitHook(MasterTxn.java:226) ~[je-7.3.7.jar:7.3.7]
at com.sleepycat.je.txn.Txn.commit(Txn.java:772) ~[je-7.3.7.jar:7.3.7]
at com.sleepycat.je.txn.Txn.commit(Txn.java:625) ~[je-7.3.7.jar:7.3.7]
at com.sleepycat.je.txn.Txn.operationEnd(Txn.java:1803) ~[je-7.3.7.jar:7.3.7]
at com.sleepycat.je.Database.put(Database.java:1506) ~[je-7.3.7.jar:7.3.7]
at org.apache.doris.metric.collector.BDBJEMetricHandler.write(BDBJEMetricHandler.java:115) ~[palo-fe.jar:3.4.0]
at org.apache.doris.metric.collector.BDBJEMetricHandler.writeDouble(BDBJEMetricHandler.java:109) ~[palo-fe.jar:3.4.0]
at org.apache.doris.metric.collector.MetricCollector.parseFeMetricJsonAndWriteMetric(MetricCollector.java:247) ~[palo-fe.jar:3.4.0]
at org.apache.doris.metric.collector.MetricCollector.writeMetric(MetricCollector.java:105) ~[palo-fe.jar:3.4.0]
at org.apache.doris.metric.collector.MetricCollector.lambda$init$0(MetricCollector.java:77) ~[palo-fe.jar:3.4.0]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_162]
2021-08-27 09:09:37,283 ERROR (heartbeat mgr|19) [BDBJEJournal.write():166] catch an exception when writing to database. sleep and retry. journal id 1526718
com.sleepycat.je.rep.InsufficientAcksException: (JE 7.3.7) Transaction: -16160914 VLSN: 31,775,202, initiated at: 09:09:30. Insufficient acks for policy:SIMPLE_MAJORITY. Need replica acks: 1. Missing replica acks: 1. Timeout: 2000ms. FeederState=192.168.7.5_9010_1625132780567(2)[MASTER]
Current feeds:
192.168.7.7_9010_1625192915300: feederVLSN=31,775,205 replicaTxnEndVLSN=31,775,200
192.168.7.4_9010_1625132697001: feederVLSN=31,775,205 replicaTxnEndVLSN=31,775,196

at com.sleepycat.je.rep.impl.node.DurabilityQuorum.ensureSufficientAcks(DurabilityQuorum.java:205) ~[je-7.3.7.jar:7.3.7]
at com.sleepycat.je.rep.stream.FeederTxns.awaitReplicaAcks(FeederTxns.java:189) ~[je-7.3.7.jar:7.3.7]
at com.sleepycat.je.rep.impl.RepImpl.postLogCommitHookInternal(RepImpl.java:1426) ~[je-7.3.7.jar:7.3.7]
at com.sleepycat.je.rep.impl.RepImpl.postLogCommitHook(RepImpl.java:1385) ~[je-7.3.7.jar:7.3.7]
at com.sleepycat.je.rep.txn.MasterTxn.postLogCommitHook(MasterTxn.java:226) ~[je-7.3.7.jar:7.3.7]
at com.sleepycat.je.txn.Txn.commit(Txn.java:772) ~[je-7.3.7.jar:7.3.7]
at com.sleepycat.je.txn.Txn.commit(Txn.java:625) ~[je-7.3.7.jar:7.3.7]
at com.sleepycat.je.txn.Txn.operationEnd(Txn.java:1803) ~[je-7.3.7.jar:7.3.7]
at com.sleepycat.je.Database.put(Database.java:1506) ~[je-7.3.7.jar:7.3.7]
at com.sleepycat.je.Database.put(Database.java:1556) ~[je-7.3.7.jar:7.3.7]
at org.apache.doris.journal.bdbje.BDBJEJournal.write(BDBJEJournal.java:159) [palo-fe.jar:3.4.0]
at org.apache.doris.persist.EditLog.logEdit(EditLog.java:849) [palo-fe.jar:3.4.0]
at org.apache.doris.persist.EditLog.logHeartbeat(EditLog.java:1265) [palo-fe.jar:3.4.0]
at org.apache.doris.system.HeartbeatMgr.runAfterCatalogReady(HeartbeatMgr.java:154) [palo-fe.jar:3.4.0]
at org.apache.doris.common.util.MasterDaemon.runOneCycle(MasterDaemon.java:58) [palo-fe.jar:3.4.0]
at org.apache.doris.common.util.Daemon.run(Daemon.java:116) [palo-fe.jar:3.4.0]
2021-08-27 09:09:40,305 WARN (Thread-49|192) [BDBJEMetricHandler.write():117] write metric data into bdb error, key:192.168.7.7:8030_quantile0.95_1630026555000
com.sleepycat.je.rep.InsufficientAcksException: (JE 7.3.7) Transaction: -16160916 VLSN: 31,775,205, initiated at: 09:09:33. Insufficient acks for policy:SIMPLE_MAJORITY. Need replica acks: 1. Missing replica acks: 1. Timeout: 2000ms. FeederState=192.168.7.5_9010_1625132780567(2)[MASTER]

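My first guess was that the heartbeat timeout was too short, since I had not tuned any parameters when testing this version.
Later I suspected that a write failed while the FE was replicating metadata to its replicas, and that the retries failed as well.

It took 3 restarts before the FE came back up: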
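
To see which replica is falling behind, the "Current feeds" lines in the log can be reduced to one number per replica. This is a minimal sketch that subtracts replicaTxnEndVLSN from feederVLSN. Treating that difference as a rough lag indicator is my own reading of the log, not a documented BDB JE metric.

```python
import re

FEED = re.compile(r"^(\S+): feederVLSN=([\d,]+) replicaTxnEndVLSN=([\d,]+)")

def replica_lag(log_text):
    """Map replica name -> feederVLSN - replicaTxnEndVLSN.

    If a replica appears several times, the last occurrence wins.
    """
    lag = {}
    for line in log_text.splitlines():
        m = FEED.match(line.strip())
        if m:
            name, feeder, txn_end = m.groups()
            lag[name] = int(feeder.replace(",", "")) - int(txn_end.replace(",", ""))
    return lag

# Lines copied from the first exception in the log above.
sample = (
    "Current feeds:\n"
    " 192.168.7.7_9010_1625192915300: feederVLSN=31,775,198 replicaTxnEndVLSN=31,775,193\n"
    " 192.168.7.4_9010_1625132697001: feederVLSN=31,775,198 replicaTxnEndVLSN=31,775,191\n"
)

for name, gap in replica_lag(sample).items():
    print(name, gap)
```

In the excerpts above, 192.168.7.4 trails further than 192.168.7.7 in every snapshot, so that node's disk and network would be the first things I would look at.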

