PyTorch Tensor: A Complete Guide from Basic Concepts to Advanced Operations

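Abstract: This article systematically covers the core concepts of the PyTorch Tensor: creation, mathematical operations, shape manipulation, indexing and slicing, and automatic differentiation, with code examples throughout.
Intended readers: deep learning newcomers and PyTorch beginners | Difficulty: beginner to intermediate | Estimated reading time: 30 minutes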

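Table of Contents

- 1. Introduction
- 2. Creating Tensors
  - 2.1 Creating directly from Python data
  - 2.2 Creating tensors of a given shape
  - 2.3 Creating random tensors
  - 2.4 Creating sequences
  - 2.5 Specifying the device (CPU/GPU)
  - 2.6 Sparse tensors
- 3. Basic Mathematical Operations
  - 3.1 Arithmetic operations
  - 3.2 Trigonometric and other math functions
  - 3.3 Statistical functions
  - 3.4 Comparison operations
- 4. Shape Operations
  - 4.1 Inspecting the shape
  - 4.2 reshape / view
  - 4.3 Transposing and permuting dimensions
  - 4.4 Adding and removing dimensions
  - 4.5 Concatenating and splitting
- 5. Indexing and Slicing
  - 5.1 clamp (value clamping)
- 6. Converting between Tensor and NumPy
- 7. Device Management
- 8. Automatic Differentiation (Autograd)
  - 8.1 Basic usage
  - 8.2 Disabling gradient tracking
  - 8.3 Custom autograd Function
  - 8.4 Gradient accumulation and zeroing
- 9. Random Seeds
- 10. Summary
- References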

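1. Introduction

The Tensor is the core data structure in PyTorch and can be thought of as a multi-dimensional array. It closely resembles NumPy's ndarray, but it additionally supports GPU acceleration and automatic differentiation (autograd), which is what makes it the foundation of the framework.

In short:

- A scalar is a 0-dimensional tensor
- A vector is a 1-dimensional tensor
- A matrix is a 2-dimensional tensor
- Data with more dimensions is generally called an N-dimensional tensor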
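The dimension counts listed above can be checked directly. The following is a minimal sketch; the variable names are chosen only for illustration.

```python
import torch

scalar = torch.tensor(3.14)                       # 0-D tensor
vector = torch.tensor([1.0, 2.0, 3.0])            # 1-D tensor
matrix = torch.tensor([[1.0, 2.0], [3.0, 4.0]])   # 2-D tensor
cube = torch.zeros(2, 3, 4)                       # 3-D tensor

# Print the number of dimensions and the shape of each tensor
for name, t in [("scalar", scalar), ("vector", vector),
                ("matrix", matrix), ("cube", cube)]:
    print(name, t.dim(), tuple(t.shape))
```

Note that a scalar tensor has an empty shape, `()`, which is different from a 1-D tensor holding a single element, whose shape is `(1,)`.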

This article walks through tensor creation, basic operations, shape manipulation, indexing and slicing, and automatic differentiation in turn.

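2. Creating Tensors

2.1 Creating directly from Python data

    import torch

    # Create from a nested list
    a = torch.tensor([[1, 2], [3, 4]])
    print(a)
    # tensor([[1, 2],
    #         [3, 4]])

    # Note: torch.tensor() (lowercase t) is the recommended factory function.
    # torch.Tensor() (uppercase T) also works, but behaves slightly differently.

Tip: Prefer torch.tensor() (lowercase). It infers the dtype from the data. torch.Tensor() (uppercase) is the tensor class constructor and produces the default floating-point type (FloatTensor unless the default has been changed), regardless of the input values.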
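The difference between the two constructors shows up in the resulting dtype. The sketch below assumes PyTorch's default settings, where the default floating-point dtype is float32.

```python
import torch

ints = torch.tensor([1, 2, 3])      # dtype inferred from Python ints
floats = torch.tensor([1.0, 2.0])   # dtype inferred from Python floats
legacy = torch.Tensor([1, 2, 3])    # class constructor: default float dtype

print(ints.dtype)     # torch.int64
print(floats.dtype)   # torch.float32 (with default settings)
print(legacy.dtype)   # torch.float32 (with default settings)

# An explicit dtype overrides the inference
explicit = torch.tensor([1, 2, 3], dtype=torch.float64)
print(explicit.dtype)  # torch.float64
```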

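2.2 Creating tensors of a given shape

    import torch

    # All-zeros tensor
    zeros = torch.zeros(2, 3)
    print(zeros)
    # tensor([[0., 0., 0.],
    #         [0., 0., 0.]])

    # All-ones tensor
    ones = torch.ones(4, 5)
    print(ones)

    # Identity matrix
    eye = torch.eye(3, 3)
    print(eye)

    # Tensors shaped like another tensor (zeros_like / ones_like)
    b = torch.empty(2, 3)    # uninitialized values; only the shape matters here
    f = torch.zeros_like(b)  # all zeros, same shape as b
    g = torch.ones_like(b)   # all ones, same shape as b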

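2.3 Creating random tensors

    import torch

    # Uniform distribution on [0, 1)
    rand = torch.rand(2, 3)
    print(rand)

    # Standard normal distribution (mean 0, standard deviation 1)
    randn = torch.randn(2, 3)
    print(randn)

    # Normal distribution with a scalar mean and a per-element standard deviation
    normal = torch.normal(mean=0.0, std=torch.rand(5))
    print(normal)

    # Mean and standard deviation both given as tensors
    normal2 = torch.normal(mean=torch.rand(5), std=torch.rand(5))
    print(normal2)

    # Uniform distribution over a chosen range, filled in place
    uniform = torch.empty(2, 2).uniform_(-1, 1)
    print(uniform)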

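2.4 Creating sequences

    import torch

    # Arithmetic sequence: from 0 up to (but not including) 10, step 1
    arange = torch.arange(0, 10, 1)
    print(arange)
    # tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

    # Evenly spaced values: 4 points from 2 to 10, both ends included
    linspace = torch.linspace(2, 10, 4)
    print(linspace)
    # tensor([ 2.0000,  4.6667,  7.3333, 10.0000])

    # Random permutation of 0..9
    randperm = torch.randperm(10)
    print(randperm)

Note: arange (when called with integer arguments) and randperm return LongTensor (int64). Most other factory functions return FloatTensor (the default floating-point dtype).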
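The dtype rule in the note can be verified with a few one-liners. This is a small sketch assuming default settings (default float dtype float32).

```python
import torch

int_range = torch.arange(0, 10)              # integer arguments
float_range = torch.arange(0.0, 1.0, 0.25)   # float arguments
perm = torch.randperm(5)
zeros = torch.zeros(2)

print(int_range.dtype)    # torch.int64
print(float_range.dtype)  # torch.float32
print(perm.dtype)         # torch.int64
print(zeros.dtype)        # torch.float32

# Pass dtype explicitly when a float sequence of whole numbers is needed
as_float = torch.arange(0, 10, dtype=torch.float32)
print(as_float.dtype)     # torch.float32
```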

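2.5 Specifying the device (CPU/GPU)

    import torch

    # Use the first GPU if one is available; otherwise fall back to the CPU
    dev = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
    a = torch.tensor([2, 2], dtype=torch.float32, device=dev)
    print(a)
    print(a.type())  # torch.cuda.FloatTensor on a GPU, torch.FloatTensor on the CPU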

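2.6 Sparse tensors

When most elements of the data are zero, a sparse tensor can save memory:

    import torch

    # Coordinates of the non-zero elements:
    # the first row holds row indices, the second row holds column indices
    i = torch.tensor([[0, 1, 2], [0, 1, 2]])
    v = torch.tensor([1, 2, 3])  # values of the non-zero elements

    # Build the sparse tensor, then convert it to a dense one
    sparse = torch.sparse_coo_tensor(i, v, (4, 4), dtype=torch.float32)
    dense = sparse.to_dense()
    print(dense)
    # tensor([[1., 0., 0., 0.],
    #         [0., 2., 0., 0.],
    #         [0., 0., 3., 0.],
    #         [0., 0., 0., 0.]])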

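3. Basic Mathematical Operations

3.1 Arithmetic operations

    import torch

    a = torch.tensor([[1, 2], [3, 4]], dtype=torch.float32)
    b = torch.tensor([[5, 6], [7, 8]], dtype=torch.float32)

    # Addition
    print(a + b)
    print(torch.add(a, b))

    # Subtraction
    print(a - b)

    # Multiplication (element-wise)
    print(a * b)
    print(torch.mul(a, b))

    # Division (element-wise)
    print(a / b)

    # Matrix multiplication
    print(torch.mm(a, b))      # 2-D matrices only
    print(torch.matmul(a, b))  # general matrix product, supports broadcasting
    print(a @ b)               # operator form of matmul

    # Power
    print(torch.pow(a, 2))

    # Element-wise square root
    print(torch.sqrt(a))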
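To make "supports broadcasting" concrete, here is a short sketch with shapes chosen only for illustration. torch.matmul broadcasts a 2-D matrix across the leading batch dimension, and element-wise operators broadcast size-1 or missing dimensions.

```python
import torch

# Batched matrix product: the (4, 5) matrix is applied to each of the 10 (3, 4) matrices
batch = torch.rand(10, 3, 4)
weight = torch.rand(4, 5)
out = torch.matmul(batch, weight)
print(out.shape)  # torch.Size([10, 3, 5])

# Element-wise broadcasting: shapes (3,) and (2, 1) expand to (2, 3)
row = torch.tensor([1.0, 2.0, 3.0])
col = torch.tensor([[10.0], [20.0]])
total = row + col
print(total)
# tensor([[11., 12., 13.],
#         [21., 22., 23.]])
```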

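3.2 Trigonometric and other math functions

    import torch

    a = torch.tensor([0.0, 0.5, 1.0])

    # Trigonometric functions
    print(torch.sin(a))
    print(torch.cos(a))
    print(torch.tan(a))

    # Rounding and absolute value
    b = torch.tensor([1.2, 2.7, -0.3])
    print(torch.floor(b))  # round down
    print(torch.ceil(b))   # round up
    print(torch.round(b))  # round to nearest (exact halves go to the nearest even integer)
    print(torch.abs(b))    # absolute value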

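3.3 Statistical functions

    import torch

    a = torch.tensor([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])

    # Sum
    print(torch.sum(a))         # sum of all elements
    print(torch.sum(a, dim=0))  # reduce over rows: one sum per column
    print(torch.sum(a, dim=1))  # reduce over columns: one sum per row

    # Maximum / minimum
    print(torch.max(a))         # global maximum
    print(torch.max(a, dim=0))  # per-column maximum, returns (values, indices)
    print(torch.min(a))

    # Mean / standard deviation
    print(torch.mean(a))
    print(torch.std(a))         # sample standard deviation by default

    # argmax / argmin
    print(torch.argmax(a, dim=1))  # index of the maximum in each row

Tip: Statistical functions take a dim argument that selects the dimension to reduce over. The bincount function accepts only 1-D tensors of non-negative integers.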
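The sketch below illustrates the dim argument, together with keepdim (which keeps the reduced dimension with size 1 so the result broadcasts against the input), and a small bincount call. The example data is made up for illustration.

```python
import torch

a = torch.tensor([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])

col_sum = a.sum(dim=0)                 # shape (3,): tensor([5., 7., 9.])
row_sum = a.sum(dim=1)                 # shape (2,): tensor([ 6., 15.])
row_sum_keep = a.sum(dim=1, keepdim=True)  # shape (2, 1)

# keepdim makes row-wise normalization a one-liner: every row now sums to 1
normalized = a / row_sum_keep
print(normalized.sum(dim=1))

# bincount counts occurrences of each non-negative integer in a 1-D tensor
labels = torch.tensor([0, 1, 1, 3])
counts = torch.bincount(labels)
print(counts)  # tensor([1, 2, 0, 1])
```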

3.4 Comparison operations

    import torch

    a = torch.tensor([1.0, 2.0, 3.0])
    b = torch.tensor([1.0, 3.0, 2.0])

    print(torch.eq(a, b))  # element-wise equality
    print(torch.gt(a, b))  # greater than
    print(torch.lt(a, b))  # less than
    print(torch.ge(a, b))  # greater than or equal
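Element-wise comparisons return boolean tensors, which is easy to confuse with whole-tensor comparison. A brief sketch of the difference, using torch.equal and torch.allclose (both standard PyTorch functions):

```python
import torch

a = torch.tensor([1.0, 2.0, 3.0])
b = torch.tensor([1.0, 3.0, 2.0])

mask = torch.eq(a, b)          # element-wise: tensor([ True, False, False])
same = torch.equal(a, b)       # whole-tensor: a single Python bool (False here)
n_greater = (a > b).sum().item()  # boolean masks can be counted by summing

# For floating-point results, compare with a tolerance instead of exact equality
close = torch.allclose(torch.tensor(0.1) + torch.tensor(0.2), torch.tensor(0.3))

print(mask, same, n_greater, close)
```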

4. Shape Operations

4.1 Inspecting the shape

    import torch

    a = torch.rand(2, 3, 4)
    print(a.shape)    # torch.Size([2, 3, 4])
    print(a.size())   # torch.Size([2, 3, 4])
    print(a.dim())    # 3 (number of dimensions)
    print(a.numel())  # 24 (total number of elements)

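4.2 reshape / view

    import torch

    a = torch.rand(2, 6)

    # reshape: returns a view when possible, otherwise a copy
    b = a.reshape(3, 4)
    print(b.shape)  # torch.Size([3, 4])

    # view: always returns a view (shared memory); requires a compatible,
    # typically contiguous, memory layout
    c = a.view(3, 4)
    print(c.shape)  # torch.Size([3, 4])

    # Infer one dimension automatically with -1
    d = a.view(4, -1)  # the second dimension is computed automatically
    print(d.shape)  # torch.Size([4, 3])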
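The contiguity requirement of view is easiest to see on a transposed tensor. The following sketch uses a small integer tensor so the element order is visible; the flag name view_failed is only for illustration.

```python
import torch

a = torch.arange(6).reshape(2, 3)
t = a.t()                    # transposed view of a: shape (3, 2), not contiguous
print(t.is_contiguous())     # False

# view cannot reinterpret a non-contiguous layout and raises RuntimeError
view_failed = False
try:
    t.view(6)
except RuntimeError:
    view_failed = True

# reshape falls back to copying, so it succeeds
flat = t.reshape(6)
print(flat)                  # tensor([0, 3, 1, 4, 2, 5])

# Equivalent explicit form: make a contiguous copy first, then view
flat2 = t.contiguous().view(6)

# A real view shares memory with its source
v = a.view(3, 2)
v[0, 0] = 100
print(a[0, 0])               # tensor(100)
```

Because `flat` was produced by a copy, the write through `v` does not affect it.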

4.3 Transposing and permuting dimensions

    import torch

    a = torch.rand(2, 3)

    # Transpose (2-D)
    print(a.t())

    # Reorder all dimensions
    b = torch.rand(2, 3, 4)
    print(b.permute(2, 0, 1).shape)  # torch.Size([4, 2, 3])

    # Swap two dimensions
    print(b.transpose(0, 1).shape)   # torch.Size([3, 2, 4])

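4.4 Adding and removing dimensions

    import torch

    a = torch.rand(3, 4)

    # Add a dimension
    b = a.unsqueeze(0)  # insert at position 0
    print(b.shape)      # torch.Size([1, 3, 4])
    c = a.unsqueeze(1)  # insert at position 1
    print(c.shape)      # torch.Size([3, 1, 4])

    # Remove a dimension (only dimensions of size 1 are removed)
    d = b.squeeze(0)
    print(d.shape)      # torch.Size([3, 4])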
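A few edge cases of squeeze and unsqueeze are worth seeing side by side. In this sketch (shapes chosen for illustration), squeezing a dimension whose size is not 1 leaves the tensor unchanged rather than raising an error.

```python
import torch

x = torch.rand(1, 3, 1, 4)

all_squeezed = x.squeeze()    # removes every size-1 dimension
first_only = x.squeeze(0)     # removes only dimension 0
unchanged = x.squeeze(1)      # dimension 1 has size 3: nothing is removed
appended = x.unsqueeze(-1)    # negative index: add a trailing dimension

print(all_squeezed.shape)  # torch.Size([3, 4])
print(first_only.shape)    # torch.Size([3, 1, 4])
print(unchanged.shape)     # torch.Size([1, 3, 1, 4])
print(appended.shape)      # torch.Size([1, 3, 1, 4, 1])
```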

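4.5 Concatenating and splitting

    import torch

    a = torch.rand(2, 3)
    b = torch.rand(2, 3)

    # Concatenate
    cat_0 = torch.cat([a, b], dim=0)  # stack rows
    print(cat_0.shape)  # torch.Size([4, 3])
    cat_1 = torch.cat([a, b], dim=1)  # stack columns
    print(cat_1.shape)  # torch.Size([2, 6])

    # stack: join along a new dimension
    stacked = torch.stack([a, b], dim=0)
    print(stacked.shape)  # torch.Size([2, 2, 3])

    # chunk: split into a given number of pieces
    chunks = torch.chunk(cat_0, 2, dim=0)
    print(chunks[0].shape)  # torch.Size([2, 3])

    # split: the size of each piece can be given explicitly
    splits = torch.split(cat_0, [1, 3], dim=0)
    print(splits[0].shape)  # torch.Size([1, 3])
    print(splits[1].shape)  # torch.Size([3, 3])

Tip: When split is given a list, the numbers in the list must sum to the size of the tensor along that dimension.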
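The sketch below contrasts the two calling styles of split with chunk, and shows what happens when the list does not sum to the dimension size. The flag name split_failed is only for illustration.

```python
import torch

x = torch.arange(10)

# An int means "pieces of this size"; the last piece may be smaller
parts = torch.split(x, 3)
sizes = [p.numel() for p in parts]
print(sizes)  # [3, 3, 3, 1]

# chunk takes the number of pieces instead of the piece size
chunk_sizes = [c.numel() for c in torch.chunk(x, 3)]
print(chunk_sizes)  # [4, 4, 2]

# A list must sum exactly to the dimension size (10 here)
split_failed = False
try:
    torch.split(x, [3, 3])
except RuntimeError:
    split_failed = True
print(split_failed)  # True
```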

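5. Indexing and Slicing

Tensor indexing works much like NumPy indexing:

    import torch

    a = torch.tensor([[1, 2, 3], [4, 5, 6], [7, 8, 9]], dtype=torch.float32)

    # Basic indexing
    print(a[0, 1])      # tensor(2.), row 0, column 1
    print(a[0:2, 1:3])  # slicing: rows 0-1, columns 1-2

    # Boolean-mask indexing
    mask = a > 5
    print(a[mask])  # tensor([6., 7., 8., 9.])

    # gather: pick values by index along a dimension
    index = torch.tensor([[0, 1], [1, 2]])
    print(torch.gather(a, 1, index))  # per row, take the listed columns
    # tensor([[1., 2.],
    #         [5., 6.]])

    # index_select: select whole slices along a dimension
    indices = torch.tensor([0, 2])
    print(torch.index_select(a, 0, indices))  # rows 0 and 2
    print(torch.index_select(a, 1, indices))  # columns 0 and 2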
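Of the functions above, gather is usually the least intuitive. For dim=1 it computes out[i][j] = a[i][index[i][j]]. The sketch below checks that rule against an explicit loop; the index values are chosen only for illustration.

```python
import torch

a = torch.tensor([[1.0, 2.0, 3.0],
                  [4.0, 5.0, 6.0],
                  [7.0, 8.0, 9.0]])
index = torch.tensor([[0, 1],
                      [1, 2],
                      [2, 0]])

picked = torch.gather(a, 1, index)

# Reference implementation of the dim=1 rule: out[i][j] = a[i][index[i][j]]
manual = torch.empty_like(picked)
for i in range(index.shape[0]):
    for j in range(index.shape[1]):
        manual[i, j] = a[i, index[i, j]]

print(picked)
# tensor([[1., 2.],
#         [5., 6.],
#         [9., 7.]])
print(torch.equal(picked, manual))  # True
```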

5.1 clamp (value clamping)

Restrict the values of a tensor to a given range:

    import torch

    a = torch.tensor([0.5, 1.5, -0.3, 2.0])

    # Limit the values to the range [0, 1]
    clamped = torch.clamp(a, 0, 1)
    print(clamped)  # tensor([0.5000, 1.0000, 0.0000, 1.0000])

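6. Converting between Tensor and NumPy

    import torch
    import numpy as np

    # Tensor → NumPy
    a = torch.tensor([1, 2, 3])
    b = a.numpy()
    print(type(b))  # <class 'numpy.ndarray'>

    # NumPy → Tensor
    c = np.array([4, 5, 6])
    d = torch.from_numpy(c)
    print(type(d))  # <class 'torch.Tensor'>

Note: A tensor created with torch.from_numpy() shares memory with the NumPy array, so modifying one changes the other. The same holds for the array returned by .numpy() on a CPU tensor.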
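The memory sharing can be demonstrated in a few lines. In this sketch, torch.tensor(arr) is used as the copying alternative for comparison; variable names are illustrative.

```python
import numpy as np
import torch

arr = np.array([1.0, 2.0, 3.0])
shared = torch.from_numpy(arr)   # shares memory with arr
copied = torch.tensor(arr)       # independent copy of the data

arr[0] = 100.0
print(shared[0].item())  # 100.0: the change is visible through the tensor
print(copied[0].item())  # 1.0: the copy is unaffected

# The same sharing applies in the other direction for CPU tensors
t = torch.ones(3)
view = t.numpy()
t.add_(1)                # in-place update of the tensor
print(view)              # the NumPy array now holds 2.0 in every position
```

When an independent array or tensor is needed, copy explicitly, for example with `tensor.clone()` or `array.copy()`.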

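7. Device Management

In deep learning, data preprocessing usually runs on the CPU while model training runs on the GPU. Tensors must be on the same device to take part in the same operation.

    import torch

    # Pick the GPU if one is available
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

    # Move a tensor to the selected device
    a = torch.rand(3, 3)
    a = a.to(device)

    # Move it back from the GPU to the CPU
    a = a.cpu()

    # Or use the .to() method
    a = a.to('cpu')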
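A common device-agnostic pattern is to choose the device once and pass it everywhere. The sketch below runs on either CPU or GPU; the tensor names are illustrative.

```python
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Allocate directly on the target device instead of creating on CPU and moving
x = torch.rand(3, 3, device=device)
w = torch.rand(3, 3).to(device)   # moving an existing tensor also works

# Both operands live on the same device, so the product is valid
y = x @ w
print(y.device)

# NumPy conversion only works for CPU tensors, so move back first
result = y.cpu().numpy()
print(result.shape)  # (3, 3)
```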

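8. Automatic Differentiation (Autograd)

Automatic differentiation is one of PyTorch's core features: it records operations in a computation graph as they run and uses that graph to compute gradients automatically.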
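The computation graph can be inspected through the grad_fn attribute that each non-leaf tensor carries. A minimal sketch:

```python
import torch

x = torch.tensor([2.0, 3.0], requires_grad=True)
y = x * 2
z = y.sum()

# x was created by the user: it is a leaf and has no grad_fn
print(x.is_leaf, x.grad_fn)   # True None

# y and z were produced by operations: each records the function that made it
print(y.grad_fn)              # a MulBackward0 node
print(z.grad_fn)              # a SumBackward0 node

# backward() walks the graph from z back to the leaves
z.backward()
print(x.grad)                 # tensor([2., 2.])
```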

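8.1 Basic usage

    import torch

    # Create a tensor that tracks gradients
    x = torch.tensor([2.0, 3.0], requires_grad=True)
    y = x ** 2 + 2 * x + 1

    # Backpropagate (backward() needs a scalar, so reduce with sum first)
    y.sum().backward()

    # Inspect the gradient
    print(x.grad)  # dy/dx = 2x + 2 → tensor([6., 8.])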

8.2 Disabling gradient tracking

    import torch

    x = torch.tensor([1.0, 2.0], requires_grad=True)

    # Option 1: with torch.no_grad()
    with torch.no_grad():
        y = x * 2
    print(y.requires_grad)  # False

    # Option 2: detach()
    z = x.detach()
    print(z.requires_grad)  # False

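8.3 Custom autograd Function

PyTorch lets you define your own forward and backward logic:

    import torch

    class LinearFunction(torch.autograd.Function):
        @staticmethod
        def forward(ctx, w, x, b):
            # Save the tensors needed in backward (b is not needed for any gradient)
            ctx.save_for_backward(w, x)
            return w * x + b

        @staticmethod
        def backward(ctx, grad_out):
            w, x = ctx.saved_tensors
            grad_w = grad_out * x  # d(w*x + b)/dw = x
            grad_x = grad_out * w  # d(w*x + b)/dx = w
            grad_b = grad_out      # d(w*x + b)/db = 1
            # Gradients must be returned in the same order as the forward inputs
            return grad_w, grad_x, grad_b

    # Use the custom function
    w = torch.rand(2, 2, requires_grad=True)
    x = torch.rand(2, 2, requires_grad=True)
    b = torch.rand(2, 2, requires_grad=True)
    out = LinearFunction.apply(w, x, b)
    out.backward(torch.ones(2, 2))
    print(w.grad)  # equals x
    print(x.grad)  # equals w
    print(b.grad)  # all ones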
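A hand-written backward is easy to get wrong, for example by returning gradients in a different order than the forward inputs. One way to catch this is torch.autograd.gradcheck, which compares the analytic gradients against numerical finite differences. The sketch below uses a hypothetical MulAdd function of the same form (w * x + b); gradcheck expects double-precision inputs.

```python
import torch

class MulAdd(torch.autograd.Function):
    """Hypothetical example function computing w * x + b element-wise."""

    @staticmethod
    def forward(ctx, w, x, b):
        ctx.save_for_backward(w, x)
        return w * x + b

    @staticmethod
    def backward(ctx, grad_out):
        w, x = ctx.saved_tensors
        # One gradient per forward input, in the same order: (w, x, b)
        return grad_out * x, grad_out * w, grad_out

# gradcheck needs float64 inputs that require gradients
inputs = tuple(
    torch.rand(2, 2, dtype=torch.float64, requires_grad=True) for _ in range(3)
)
ok = torch.autograd.gradcheck(MulAdd.apply, inputs, eps=1e-6, atol=1e-4)
print(ok)  # True when analytic and numerical gradients agree
```

If the gradients do not match, gradcheck raises an error describing the mismatch by default, which makes it a convenient unit test for custom functions.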

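8.4 Gradient accumulation and zeroing

    import torch

    x = torch.tensor([1.0], requires_grad=True)

    # First computation
    y = x ** 2
    y.backward()
    print(x.grad)  # tensor([2.])

    # Gradients accumulate!
    y = x ** 2
    y.backward()
    print(x.grad)  # tensor([4.]), the new gradient was added to the old one

    # Zero the gradient manually
    x.grad.zero_()
    y = x ** 2
    y.backward()
    print(x.grad)  # tensor([2.])

Important: In a training loop, call optimizer.zero_grad() once per iteration before backward(). Otherwise gradients from earlier iterations are added to the current ones and the update is wrong (unless accumulation over several batches is intended).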
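For reference, here is where zero_grad() sits in a minimal training loop. The model, data, and hyperparameters below are made-up placeholders for illustration, not a recommended setup.

```python
import torch

torch.manual_seed(0)

# Toy regression problem: the target is the sum of the three input features
inputs = torch.rand(16, 3)
targets = inputs.sum(dim=1, keepdim=True)

model = torch.nn.Linear(3, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

losses = []
for step in range(20):
    optimizer.zero_grad()                      # clear gradients from the last step
    loss = torch.nn.functional.mse_loss(model(inputs), targets)
    loss.backward()                            # compute fresh gradients
    optimizer.step()                           # update the parameters
    losses.append(loss.item())

print(losses[0], losses[-1])  # the loss should decrease over the steps
```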

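9. Random Seeds

Setting the random seed makes experiments reproducible:

    import torch

    # Set the random seed
    torch.manual_seed(42)

    # Random operations from here on produce the same results on every run
    a = torch.rand(3)
    b = torch.rand(3)
    print(a)
    print(b)

    # Set the same seed again
    torch.manual_seed(42)
    c = torch.rand(3)
    print(c)  # identical to a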
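torch.manual_seed() sets the global seed. When only one part of a program needs a reproducible stream, a local torch.Generator can be used instead, as in this sketch (generator names are illustrative):

```python
import torch

# Two independent generators seeded identically produce identical streams
gen_a = torch.Generator().manual_seed(42)
gen_b = torch.Generator().manual_seed(42)

a = torch.rand(3, generator=gen_a)
b = torch.rand(3, generator=gen_b)
print(torch.equal(a, b))  # True

# Drawing again from gen_a continues its stream, so the values differ from a
c = torch.rand(3, generator=gen_a)
print(torch.equal(a, c))  # False
```

This leaves the global random state untouched, so other code that relies on it is not affected.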

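10. Summary

Key points:

- Tensor creation: torch.tensor(), torch.zeros(), torch.rand(), torch.arange(), and the other common factory functions
- Math operations: arithmetic, statistical functions, and comparisons, with usage similar to NumPy
- Shape operations: reshape/view, cat/stack, chunk/split, unsqueeze/squeeze
- Indexing and slicing: basic indexing, boolean-mask indexing, gather, index_select, clamp
- Device management: use .to(device) to move data between the CPU and the GPU
- Autograd: requires_grad=True, backward(), custom Function
- Random seeds: torch.manual_seed() for reproducible experiments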