tf.nn.conv2d

Ne0inhk

14 Jan 2025 — 4 min read

1. API parameters

tf.nn.conv2d(input, filter, strides, padding, use_cudnn_on_gpu=None, data_format=None, name=None)

input defaults to the NHWC layout, i.e. a tensor of shape [batch, in_height, in_width, in_channels].

filter uses the HWIO layout, [filter_height, filter_width, in_channels, out_channels]; the number of filters equals out_channels.

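strides follows the layout given by data_format. It is a list of four integers (not a 4-D tensor), one step size per dimension of the input. With the default "NHWC" layout, strides is [1, stride, stride, 1], matching [batch, in_height, in_width, in_channels]: the second and third entries are the filter's step along the height and width of the feature map, the first is the step across samples in the batch, and the fourth is the step across channels. The first and fourth entries are almost always 1.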
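The mapping between data_format and the strides list can be sketched in a few lines of plain Python. The helper name make_strides is hypothetical (it is not part of TensorFlow), and the sketch assumes the same stride is wanted along height and width.

```python
def make_strides(stride, data_format="NHWC"):
    """Build the 4-element strides list for a square spatial stride.

    Hypothetical helper for illustration; not a TensorFlow function.
    """
    if data_format == "NHWC":
        # Order is [batch, height, width, channels].
        return [1, stride, stride, 1]
    if data_format == "NCHW":
        # Order is [batch, channels, height, width].
        return [1, 1, stride, stride]
    raise ValueError("unsupported data_format: %r" % (data_format,))


print(make_strides(2))          # [1, 2, 2, 1]
print(make_strides(2, "NCHW"))  # [1, 1, 2, 2]
```

The returned list is what would be passed as the strides argument of tf.nn.conv2d.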

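padding is a string and must be either "SAME" or "VALID"; it selects how the image borders are handled. With "VALID", conv2d adds no pixels around the input, so the kernel only visits positions where it fits entirely inside the image. With "SAME", the input is zero-padded so that each spatial dimension of the output has ceil(input_size / stride) elements.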
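The effect of padding on the output size along one spatial dimension can be written down directly. The sketch below is plain Python, not TensorFlow code; the function name conv_output_size is an assumption made for illustration, and the formulas are the usual ones for "VALID" and "SAME" without dilation.

```python
import math


def conv_output_size(in_size, kernel_size, stride, padding):
    """Output length along one spatial dimension (hypothetical helper)."""
    if padding == "VALID":
        # The kernel must fit entirely inside the input.
        return math.ceil((in_size - kernel_size + 1) / stride)
    if padding == "SAME":
        # Zero padding is added so that only the stride shrinks the output.
        return math.ceil(in_size / stride)
    raise ValueError("padding must be 'SAME' or 'VALID'")


print(conv_output_size(5, 3, 1, "VALID"))  # 3
print(conv_output_size(5, 3, 1, "SAME"))   # 5
print(conv_output_size(5, 3, 2, "SAME"))   # 3
```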

use_cudnn_on_gpu is a bool that controls whether cuDNN acceleration is used on the GPU; it defaults to True.

2. Worked examples

The examples below show how TensorFlow's convolution behaves in practice.

Start with the simplest case: a 3×3 single-channel image (input shape [1, 3, 3, 1]) convolved with a 1×1 kernel (filter shape [1, 1, 1, 1]). The result is a 3×3 feature map, i.e. an output of shape [1, 3, 3, 1].

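Now increase the number of channels: take a 3×3 five-channel image (input shape [1, 3, 3, 5]) and a 1×1 kernel (filter shape [1, 1, 5, 1], since the filter's in_channels must match the input's channel count). The result is still a 3×3 feature map. At every pixel the kernel multiplies each channel by its own weight, and each filter sums those products over all channels.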

If out_channels is 1 and every weight is 1, as in the four-channel example below, the output is simply the sum of the channels.

import tensorflow as tf  # TensorFlow 1.x API

# One 3x3 image with 4 channels, all values 1
input = tf.Variable(tf.ones([1, 3, 3, 4]))
# One 1x1 filter spanning the 4 input channels
filter = tf.Variable(tf.ones([1, 1, 4, 1]))
op = tf.nn.conv2d(input, filter, strides=[1, 1, 1, 1], padding='VALID')
initializer = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(initializer)
    print(op.shape)
    print(sess.run(op))

Output:

(1, 3, 3, 1)
[[[[4.]
   [4.]
   [4.]]

  [[4.]
   [4.]
   [4.]]

  [[4.]
   [4.]
   [4.]]]]
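To make the channel sum explicit, here is a minimal sketch of the same computation in plain Python with nested lists instead of tensors. The function name conv1x1_single_output is hypothetical, and the batch dimension is dropped for brevity.

```python
def conv1x1_single_output(image, weights):
    """1x1 convolution with a single output channel.

    image:   nested lists shaped [height][width][in_channels]
    weights: one weight per input channel
    """
    return [
        [[sum(v * w for v, w in zip(pixel, weights))] for pixel in row]
        for row in image
    ]


# 3x3 image with 4 channels of ones, and an all-ones filter.
image = [[[1.0] * 4 for _ in range(3)] for _ in range(3)]
weights = [1.0] * 4
out = conv1x1_single_output(image, weights)
print(out[0][0])  # [4.0]
```

Each output pixel is the weighted sum of that pixel's four channel values, which is why every entry equals 4.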

If out_channels is greater than 1, each filter is applied separately and the per-filter results are stacked along the last dimension, giving a larger tensor.

input = tf.Variable(tf.ones([1, 3, 3, 4]))
# Two 1x1 filters, each spanning the 4 input channels
filter = tf.Variable(tf.ones([1, 1, 4, 2]))
op = tf.nn.conv2d(input, filter, strides=[1, 1, 1, 1], padding='VALID')
initializer = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(initializer)
    print(op.shape)
    print(sess.run(op))

Output:

(1, 3, 3, 2)
[[[[4. 4.]
   [4. 4.]
   [4. 4.]]

  [[4. 4.]
   [4. 4.]
   [4. 4.]]

  [[4. 4.]
   [4. 4.]
   [4. 4.]]]]
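With all-ones filters the two output channels look identical, which hides the fact that each filter is computed on its own. The plain-Python sketch below uses different weights for the second filter (an assumption made here for illustration; the example above uses ones everywhere). The function name conv1x1 is hypothetical.

```python
def conv1x1(image, kernel):
    """1x1 convolution with several output channels.

    image:  nested lists shaped [height][width][in_channels]
    kernel: nested lists shaped [in_channels][out_channels]
            (an HWIO filter with its 1x1 spatial dimensions dropped)
    """
    out_channels = len(kernel[0])
    return [
        [
            [
                sum(pixel[i] * kernel[i][o] for i in range(len(pixel)))
                for o in range(out_channels)
            ]
            for pixel in row
        ]
        for row in image
    ]


image = [[[1.0] * 4 for _ in range(3)] for _ in range(3)]
# Filter 0 has weight 1 on every channel, filter 1 has weight 2.
kernel = [[1.0, 2.0] for _ in range(4)]
out = conv1x1(image, kernel)
print(out[0][0])  # [4.0, 8.0]
```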

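Next, take a 3×3 five-channel image (input shape [1, 3, 3, 5]) and a 3×3 kernel (filter shape [3, 3, 5, 1]). With 'VALID' padding the kernel fits at exactly one position, so the output is a single value: all 3×3×5 products of input values and kernel weights are summed, which extends the per-pixel channel sum of the previous case over every pixel of the image.

input = tf.Variable(tf.random_normal([1, 3, 3, 5]))
filter = tf.Variable(tf.random_normal([3, 3, 5, 1]))
op = tf.nn.conv2d(input, filter, strides=[1, 1, 1, 1], padding='VALID')
print(op.shape)

Output: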

(1, 1, 1, 1)
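The sliding-window computation behind this result can be sketched as a naive loop in plain Python. This is an illustration of 'VALID' convolution for one image and one output channel, not TensorFlow's implementation; the name conv2d_valid is hypothetical, and all-ones data replaces the random values so the result is easy to check.

```python
def conv2d_valid(image, kernel, stride=1):
    """Naive 'VALID' convolution for one image and one output channel.

    image:  nested lists shaped [height][width][channels]
    kernel: nested lists shaped [kernel_height][kernel_width][channels]
    Returns nested lists shaped [out_height][out_width].
    """
    height, width = len(image), len(image[0])
    k_h, k_w = len(kernel), len(kernel[0])
    out = []
    # Only visit positions where the kernel fits entirely inside the image.
    for top in range(0, height - k_h + 1, stride):
        row = []
        for left in range(0, width - k_w + 1, stride):
            total = 0.0
            for dy in range(k_h):
                for dx in range(k_w):
                    pixel = image[top + dy][left + dx]
                    for v, w in zip(pixel, kernel[dy][dx]):
                        total += v * w
            row.append(total)
        out.append(row)
    return out


image = [[[1.0] * 5 for _ in range(3)] for _ in range(3)]
kernel = [[[1.0] * 5 for _ in range(3)] for _ in range(3)]
print(conv2d_valid(image, kernel))  # [[45.0]]
```

The single output value is the sum of 3×3×5 = 45 products, matching the (1, 1, 1, 1) shape once the batch and channel dimensions are added back.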

Now use a larger image: enlarge the input to 5×5, keep the 3×3 kernel, and set the stride to 1. The output is a 3×3 feature map.

input = tf.Variable(tf.random_normal([1, 5, 5, 5]))
filter = tf.Variable(tf.random_normal([3, 3, 5, 1]))
op = tf.nn.conv2d(input, filter, strides=[1, 1, 1, 1], padding='VALID')
print(op.shape)

Output:

(1, 3, 3, 1)

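So far padding has always been 'VALID'. With 'SAME', the input is zero-padded so that the kernel can be centered on pixels at the image border; with stride 1 the output is a 5×5 feature map, the same size as the input.

input = tf.Variable(tf.random_normal([1, 5, 5, 5]))
filter = tf.Variable(tf.random_normal([3, 3, 5, 1]))
op = tf.nn.conv2d(input, filter, strides=[1, 1, 1, 1], padding='SAME')
print(op.shape)

Output: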

(1, 5, 5, 1)
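How much zero padding 'SAME' adds along one dimension can be worked out from the target output size. The sketch below follows the convention described in TensorFlow's padding notes, where any odd leftover pixel goes to the end (bottom or right); the function name same_padding is hypothetical.

```python
import math


def same_padding(in_size, kernel_size, stride):
    """Return (pad_before, pad_after) for 'SAME' along one dimension."""
    out_size = math.ceil(in_size / stride)
    # Total padding needed so the last window still fits.
    total = max((out_size - 1) * stride + kernel_size - in_size, 0)
    before = total // 2
    return before, total - before


print(same_padding(5, 3, 1))  # (1, 1)
print(same_padding(5, 3, 2))  # (1, 1)
print(same_padding(4, 3, 2))  # (0, 1)
```

For the 5×5 input and 3×3 kernel above, one row or column of zeros is added on each side, so the kernel can sit on every border pixel.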

With several kernels (here out_channels is 7), the output has one channel per kernel.

input = tf.Variable(tf.random_normal([1, 5, 5, 5]))
filter = tf.Variable(tf.random_normal([3, 3, 5, 7]))
op = tf.nn.conv2d(input, filter, strides=[1, 1, 1, 1], padding='SAME')
print(op.shape)

Output:

(1, 5, 5, 7)

When the stride is not 1: an image has only two spatial dimensions, so, as the documentation notes, strides is usually [1, stride, stride, 1].

input = tf.Variable(tf.random_normal([1, 5, 5, 5]))
filter = tf.Variable(tf.random_normal([3, 3, 5, 7]))
op = tf.nn.conv2d(input, filter, strides=[1, 2, 2, 1], padding='SAME')
print(op.shape)

Output:

(1, 3, 3, 7)

If batch is not 1, for example when 10 images are fed in at once, each image is in effect convolved independently with the same filters.

input = tf.Variable(tf.random_normal([10, 5, 5, 5]))
filter = tf.Variable(tf.random_normal([3, 3, 5, 7]))
op = tf.nn.conv2d(input, filter, strides=[1, 2, 2, 1], padding='SAME')
print(op.shape)

Output:

(10, 3, 3, 7)
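All of the shapes printed in this section follow from the same bookkeeping: the batch size passes through unchanged, the spatial sizes depend on kernel, stride and padding, and the channel count becomes out_channels. A plain-Python sketch, assuming NHWC input, an HWIO filter and no dilation (the function name conv2d_output_shape is hypothetical):

```python
import math


def conv2d_output_shape(input_shape, filter_shape, strides, padding):
    """Predict the NHWC output shape of a 2-D convolution."""
    batch, in_h, in_w, in_c = input_shape
    k_h, k_w, k_in, k_out = filter_shape
    if in_c != k_in:
        raise ValueError(
            "filter in_channels %d != input channels %d" % (k_in, in_c))
    _, s_h, s_w, _ = strides

    def out_size(n, k, s):
        if padding == "VALID":
            return math.ceil((n - k + 1) / s)
        if padding == "SAME":
            return math.ceil(n / s)
        raise ValueError("padding must be 'SAME' or 'VALID'")

    # Batch is untouched; channels become the number of filters.
    return (batch, out_size(in_h, k_h, s_h), out_size(in_w, k_w, s_w), k_out)


print(conv2d_output_shape((10, 5, 5, 5), (3, 3, 5, 7), [1, 2, 2, 1], "SAME"))
# (10, 3, 3, 7)
```

The channel check mirrors the requirement that the filter's in_channels equal the input's channel count.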

3. h = tf.nn.bias_add(conv, b)
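tf.nn.bias_add adds a 1-D bias to a tensor along its channel dimension, so each output channel of the convolution gets its own offset. A minimal plain-Python sketch of that behaviour for a single NHWC image follows; the bias values are made up for illustration, and the local function bias_add only mimics the idea and is not the TensorFlow op.

```python
def bias_add(feature_map, bias):
    """Add one bias value per channel.

    feature_map: nested lists shaped [height][width][channels]
    bias:        one value per channel
    """
    return [
        [[v + b for v, b in zip(pixel, bias)] for pixel in row]
        for row in feature_map
    ]


# A 3x3 feature map with 2 channels, as produced by the two-filter example.
conv = [[[4.0, 4.0] for _ in range(3)] for _ in range(3)]
b = [0.5, -1.0]  # hypothetical bias values
h = bias_add(conv, b)
print(h[0][0])  # [4.5, 3.0]
```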

https://blog.ZEEKLOG.net/herosunly/article/details/90340152

References:
https://www.bilibili.com/video/av38229543/?p=25
https://blog.ZEEKLOG.net/mieleizhi0522/article/details/80412804
https://www.cnblogs.com/qggg/p/6832342.html
