Getting error: SCSI error: return code = 0x00010000

Environment

  • Red Hat Enterprise Linux 5
  • Red Hat Enterprise Linux 6
  • Red Hat Enterprise Linux 7

Issue

Getting error: 'kernel: sd h:c:t:l: SCSI error: return code = 0x00010000':

May 22 23:50:10 localhost kernel: Device sdb not ready.
May 22 23:50:10 localhost kernel: end_request: I/O error, dev sdb, sector 0
May 22 23:50:10 localhost kernel: SCSI error : <0 0 2 14> return code = 0x10000

A SAN access issue was reported on a RHEL server, and the following messages were observed in /var/log/messages:

May  5 04:15:00 localhost kernel: sd 3:0:1:42: Unhandled error code
May  5 04:15:00 localhost kernel: sd 3:0:1:42: SCSI error: return code = 0x00010000
May  5 04:15:00 localhost kernel: Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK

In device mapper multipath, some paths fail and go down. The logs contain many "tur checker reports path is down" events, and the multipath -ll output shows paths in the failed/faulty state:

[root@example ~]# multipath -ll
sdb: checker msg is "tur checker reports path is down"
mpath0 (2001738006296000b) dm-8 IBM,2810XIV
[size=200G][features=1 queue_if_no_path][hwhandler=0][rw]
\_ round-robin 0 [prio=3][active]
 \_ 2:0:0:1 sda 8:0   [active][ready]
 \_ 2:0:1:1 sdb 8:16  [failed][faulty]
 \_ 4:0:8:1 sdc 8:32  [active][ready]
 \_ 4:0:9:1 sdd 8:48  [active][ready]

We had a SAN director failure and the system panicked because lpfc lost devices.

Resolution

SCSI error: return code = 0x00010000 DID_NO_CONNECT

There is likely a hardware issue that is related to the connectivity problems. Contact storage hardware support for assistance in determining cause and addressing the problem.

Parallel SCSI

DID_NO_CONNECT = SCSI SELECTION failed because there was no device at the address specified

iSCSI

  • The iSCSI layer returns this if the replacement/recovery timeout has expired or if the user asked to shut down the session.

FC/SAN

  • Typically this will follow loss of a storage port, for example:
kernel:  rport-0:0-1: blocked FC remote port time out: saving binding
kernel: sd 0:0:1:22: Unhandled error code
kernel: sd 0:0:1:22: SCSI error: return code = 0x00010000
kernel: Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
  • In the above example, the remote port timed out and the controller lost connection to the devices behind that storage port.

Root Cause

The SCSI error return code of 0x00010000 is broken down into constituent parts and decoded as shown below:

0x00 01 00 00
-------------
           00   status byte : {likely} not valid, see other fields
        00         msg byte : {likely} not valid, see other fields
     01           host byte : DID_NO_CONNECT - no connection to device {possibly device doesn't exist or transport failed}
  00            driver byte : {likely} not valid, see other fields

The return code 0x00010000 is DID_NO_CONNECT: the hardware transport connection to the device is no longer available. This SCSI error indicates that the I/O command is being rejected because it cannot be sent to the device until the hardware transport becomes available again. These errors are a symptom of some other root cause, and that root cause needs to be found and addressed.
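
The byte breakdown above can be reproduced with a few shifts and masks. The following is a minimal sketch, not a supported tool: the function names are hypothetical, and the name table is only a subset of the host byte values defined in the Linux kernel's SCSI headers.

```python
# Subset of host byte values as defined in the Linux kernel SCSI headers.
HOST_BYTE_NAMES = {
    0x00: "DID_OK",
    0x01: "DID_NO_CONNECT",
    0x02: "DID_BUS_BUSY",
    0x03: "DID_TIME_OUT",
    0x04: "DID_BAD_TARGET",
    0x05: "DID_ABORT",
    0x07: "DID_ERROR",
    0x08: "DID_RESET",
}


def decode_scsi_result(code):
    """Split a 32-bit SCSI return code into its four constituent bytes."""
    return {
        "status_byte": code & 0xFF,          # lowest byte
        "msg_byte": (code >> 8) & 0xFF,
        "host_byte": (code >> 16) & 0xFF,
        "driver_byte": (code >> 24) & 0xFF,  # highest byte
    }


def host_byte_name(code):
    """Return the symbolic name of the host byte, if it is in the table."""
    host = (code >> 16) & 0xFF
    return HOST_BYTE_NAMES.get(host, "unknown (0x%02x)" % host)


if __name__ == "__main__":
    print(decode_scsi_result(0x00010000))
    print(host_byte_name(0x00010000))
```

For 0x00010000 only the host byte is non-zero, which is why the other three fields are treated as not valid in the table above.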

Either access to the device is temporarily unavailable or the device is no longer available within the configuration. A temporary service interruption can be caused by maintenance activity in the SAN, such as a switch reboot. Such activity can cause a link down condition, remote port timeouts, or both. It is possible that the hardware transport connectivity will return at some later time. A permanent loss of connectivity can occur if storage is reconfigured so that the LUN is no longer presented to the host.

Check for the following or similar event messages within the system logs:

Mar 28 19:52:53 hostname kernel: qla2xxx 0000:04:00.1: LOOP DOWN detected (4 4 0 0).
Mar 28 19:53:23 hostname kernel:  rport-3:0-0: blocked FC remote port time out: saving binding
Mar 28 19:53:23 hostname kernel: sd 3:0:0:1: SCSI error: return code = 0x00010000
:
.

Look at what was logged just before the DID_NO_CONNECT errors started.
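
One way to do that on a large log is to find the first DID_NO_CONNECT line and list the transport events that precede it. The sketch below is an illustration under assumptions: the function name and the 20-line window are hypothetical, and the event patterns cover only the two messages shown in this article (LOOP DOWN and the blocked FC remote port time out), so other drivers may log different text.

```python
import re

# Matches both the long (0x00010000) and short (0x10000) forms of the code.
NO_CONNECT = re.compile(r"return code = 0x0*10000\b")

# Transport events taken from the log excerpts in this article only.
TRANSPORT_EVENT = re.compile(
    r"LOOP DOWN|blocked FC remote port time out", re.IGNORECASE
)


def events_before_first_no_connect(lines, window=20):
    """Return transport event lines found in the `window` lines that
    precede the first DID_NO_CONNECT line, or [] if there is none."""
    for index, line in enumerate(lines):
        if NO_CONNECT.search(line):
            start = max(0, index - window)
            return [
                earlier
                for earlier in lines[start:index]
                if TRANSPORT_EVENT.search(earlier)
            ]
    return []


if __name__ == "__main__":
    # Reads the standard syslog location used in this article.
    with open("/var/log/messages", errors="replace") as handle:
        for event in events_before_first_no_connect(handle.read().splitlines()):
            print(event)
```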

Remote ports can also time out (loss of connectivity) without a link down event.

In the example above, there is a 30 second delay between the link down event at 19:52:53 and the remote port time out at 19:53:23. When a remote port is lost, the driver applies a delay called dev_loss_tmo (device loss timeout) before taking further action. If the port returns to the configuration before that timeout expires, I/O is immediately retried. If the port has not returned by that time, all I/O is immediately returned with DID_NO_CONNECT status, and any further I/O fails immediately once the remote port time out event has occurred. See the configuration guides for more information on dev_loss_tmo behavior. dev_loss_tmo can also be set outside of multipath.
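
To see which timeout is currently in effect, the value can be read per remote port from sysfs. This is a minimal sketch that assumes the host exposes FC remote ports through the FC transport class under /sys/class/fc_remote_ports; the function name is hypothetical, and the directory is passed as a parameter so the code can be tried against a copy of the tree.

```python
import glob
import os


def read_dev_loss_tmo(sysfs_root="/sys/class/fc_remote_ports"):
    """Map each FC remote port name to its dev_loss_tmo value in seconds."""
    values = {}
    pattern = os.path.join(sysfs_root, "rport-*", "dev_loss_tmo")
    for path in sorted(glob.glob(pattern)):
        rport = os.path.basename(os.path.dirname(path))
        try:
            with open(path) as handle:
                values[rport] = int(handle.read().strip())
        except (OSError, ValueError):
            # Skip ports that vanish mid-scan or report a non-numeric value.
            continue
    return values


if __name__ == "__main__":
    for rport, seconds in read_dev_loss_tmo().items():
        print("%s: dev_loss_tmo=%d" % (rport, seconds))
```

On a host without FC remote ports the function simply returns an empty mapping.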

Other symptoms can result from the host losing connectivity to storage. If no connectivity remains, issues such as file systems going read-only can result from being unable to commit necessary metadata changes to the disks hosting the file system.
