Dora原理及代码讲解
相关的完整训练应用及验证代码在:
 https://github.com/mst272/LLM-Dojo 项目的dora部分:Dora部分
前验证知识:Lora
Lora估计大家已经很熟悉了,在这里就不详细介绍Lora的一些原理的,简而言之就是冻结已经训练好的模型权重,在其线性层或其它层中增加一些参数,只训练这些新增的参数层。
 可以直接用这一个图来表示:
 
直接用代码来解释一下:
创建LoraLayer
import torch.nn as nn
import torch
# 构建LoraLayer
class LoRALayer(nn.Module):
    def __init__(self, in_dim, out_dim, rank,  alpha):
        super().__init__()
        std_dev = 1 / torch.sqrt(torch.tensor(rank).float())
        self.A = nn.Parameter(torch.rand(in_dim, rank)*std_dev)
        self.B = nn.Parameter(torch.zeros(rank, out_dim))
        self.alpha = alpha
    def forward(self, x):
        x = self.alpha * (x @ self.A @ self.B)
        return x
将Lora合并到线性层
class LinearWithLoRA(nn.Module):
    def __init__(self, linear, rank, alpha):
        super().__init__()
        self.linear = linear
        self.lora = LoRALayer(
            linear.in_features,
            linear.out_features,
            rank,
            alpha
        )
    def forward(self,x):
        return self.linear(x) + self.lora(x)
Lora的变体—Dora
具体原理可以在代码中理解,实际运用中将LinearWithLoRA替换为LinearWithDoRA即可使用。
class LinearWithDoRA(nn.Module):
    def __init__(self, linear, rank, alpha):
        super().__init__()
        self.linear = linear
        self.lora = LoRALayer(
            linear.in_features, linear.out_features, rank, alpha
        )
        self.m = nn.Parameter(torch.ones(1, linear.out_features))
    
    def forward(self, x):
        linear_out = self.linear(x)
        lora_out = self.lora(x)
        lora_out_norm = lora_out / (lora_out.norm(p=2, dim=1, keepdim=True) + 1e-9)
        dora_modification = self.m * lora_out_norm
        return linear_out + dora_modification   
具体如何使用我们可以自己简单的建一个Model进行测试,例如:
class TestMLP(nn.Module):
    def __init__(self, num_features, num_hidden1, num_hidden2, num_class):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(num_features, num_hidden1),
            nn.ReLU(),
            nn.Linear(num_hidden1, num_hidden2),
            nn.ReLU(),
            nn.Linear(num_hidden2, num_class)
        )
    def forward(self, x):
        x = self.layers(x)
        return x
具体完整的模型训练集验证可见文章开头的代码库,其中有完整的代码演示。









