search.json

[{"title":"Pytorch深度学习教程(1):数据加载之Dataset和DataLoader使用","url":"/2022/05/22/Pytorch%E6%B7%B1%E5%BA%A6%E5%AD%A6%E4%B9%A0%E6%95%99%E7%A8%8B-1-%E6%95%B0%E6%8D%AE%E5%8A%A0%E8%BD%BD%E4%B9%8BDataset%E5%92%8CDataLoader%E4%BD%BF%E7%94%A8/","content":"深度学习模型，区别于其他的机器学习模型，一方面，模型训练所需的数据量通常是非常大的，无法一次性加载到内存中；另一方面，模型训练多采用基于梯度下降的优化方法对模型的权重和偏置进行逐步调整，不可能一次性地在模型中完成正向传播和反向传播。通常，我们需要进行数据加载和预处理，将其封装成适合迭代训练的形式，具体会对整个数据进行随机打乱，然后将原始数据处理成一个一个的Batch，再送到模型中进行训练。\n深度学习模型流程中一般都是先解决数据加载问题，包括数据的输入问题和预处理问题，数据加载处理在深度学习链路中起着非常重要的基础作用。这篇文章将介绍Pytorch对自定义数据集进行封装的方法。\nDataset、Batch、Iteration和Epoch的关系在介绍如何使用Pytorch加载数据前，简单介绍下，Dataset，Batch，Iteration 和 Epoch 的区别和关系。\n\n\n\n\n名词\n解释\n\n\n\n\n  Dataset\n待训练的全量数据集\n\n\n   Batch\n待训练全量数据集的一小部分样本对模型进行一次反向传播参数更新，这一小部分样本称为“一个Batch”\n\n\n Iteration\n使用一个Batch数据对模型进行一次参数更新的过程，称之为“一次Iteration”\n\n\n   Epoch\n待训练全量数据集对模型进行一次完整的参数更新，称之为“一个Epoch”\n\n\n\n\n假设DatasetSize=10，BatchSize=3，那么每个Epoch会执行4个Iteration，对应四个Batch，这四个Batch分别包含3，3，3和1个样本。\n\nPytorch数据处理：Dataset和DataLoaderPytorch提供了几个有用的工具：torch.utils.data.Dataset类和torch.utils.data.DataLoader类，用于数据读取和预处理。基本流程是先把原始数据转变成torch.utils.data.Dataset类，随后把得到的torch.utils.data.Dataset类当作一个参数传递给torch.utils.data.DataLoader类，得到一个数据加载器，这个数据加载器每次可以返回一个Batch的数据供模型训练使用。\ntorch.utils.data.Dataset类torch.utils.data.Dataset是代表这一数据的抽象类，你可以自己定义数据类，继承和重写这个抽象类，只需要定义__init__，__len__和__getitem__这三个魔法函数,其中：\n\n__init__()：用于初始化原始数据的路径和文件名等。\n__len__()：用于返回数据集中的样本总个数。\n__getitem__()：用于返回指定索引的样本所对应的输入变量与输出变量。\n\n# class CustomDataset(torch.utils.data.Dataset):#需要继承data.Dataset#     def __init__(self):#         # TODO#         # 1. Initialize file path or list of file names.#         pass#     def __getitem__(self, index):#         # TODO#         # 1. Read one data from file (e.g. using numpy.fromfile, PIL.Image.open).#         # 2. Preprocess the data (e.g. torchvision.Transform).#         # 3. Return a data pair (e.g. 
image and label).#         #这里需要注意的是，第一步：read one data，是一个data#         pass#     def __len__(self):#         # You should change 0 to the total size of your dataset.#         pass\n用原始数据构造出来的Dataset子类可以理解成一个静态数据池，这个数据池使得我们可以用索引得到某个样本数据，而想要该数据池流动起来，源源不断地输出Batch供给给模型训练，还需要下一个工具DataLoader类。所以我们把创建的Dataset子类当参数传入即将构建的DataLoader类才是使用Dataset子类最终目的。\ntorch.utils.data.DataLoader类DataLoader(object)可用参数:\n\ndataset(Dataset): 传入的数据集。\nbatch_size(int, optional): 每个batch有多少样本。\nshuffle(bool, optional): 在每个epoch开始时，对数据进行打乱排序。\nsampler(Sampler, optional): 自定义从数据集中取样本的策略，如果指定这个参数，那么shuffle必须为False。\nbatch_sampler(Sampler, optional): 与sampler类似，但是一次只返回一个batch的indices（索引），需要注意的是，一旦指定了这个参数，那么batch_size,shuffle,sampler,drop_last就不能再配置（互斥）。\nnum_workers (int, optional): 决定了有几个进程来处理data loading。0意味着所有的数据都会被load进主进程。（默认为0）\ncollate_fn (callable, optional): 将一个list的sample组成一个mini-batch的函数。\npin_memory (bool, optional)： 如果设置为True，那么data loader将会在返回它们之前，将tensors拷贝到CUDA中的固定内存中。\ndrop_last (bool, optional):如果设置为True：这个是对最后的未完成的batch来说的，比如batch_size设置为64，而一个epoch只有100个样本，那么训练时后面36个样本会被丢弃。 如果为False（默认），那么会继续正常执行，只是最后的batch_size会小一点。\ntimeout(numeric, optional):如果是正数，表明等待从worker进程中收集一个batch等待时间，若超出设定时间还没有收集到，那就不收集这个内容。这个numeric应总是大于等于0。默认为0\nworker_init_fn (callable, optional): 每个worker初始化函数。\n\n实例txt数据读取使用个人创建的txt文件数据，进行数据读取操作。\nimport torchimport numpy as npfrom torch.utils.data import Dataset, DataLoaderclass SampleTxtDataset(Dataset):    def __init__(self):        # 数据文件地址        self.txt_file_path = &quot;./sample_easy_data.txt&quot;    def __getitem__(self, item):        txt_data = np.loadtxt(self.txt_file_path, delimiter=&quot;,&quot;)        self._x = torch.from_numpy(txt_data[:, :2])        self._y = torch.from_numpy(txt_data[:, 2])        return self._x[item], self._y[item]    def __len__(self):        txt_data = np.loadtxt(self.txt_file_path, delimiter=&quot;,&quot;)        self._len = len(txt_data)        return self._lensample_txt_dataset = SampleTxtDataset()print(&quot;Data 
Size:&quot;,len(sample_txt_dataset))print(&quot;First Sample:&quot;,next(iter(sample_txt_dataset)))print(&quot;First Sample&#x27;s Type:&quot;,type(next(iter(sample_txt_dataset))[0]))sample_dataloader = DataLoader(dataset=sample_txt_dataset, batch_size=3, shuffle=True)num_epochs = 4for epoch in range(num_epochs):    for iteration, (batch_x, batch_y) in enumerate(sample_dataloader):        print(&#x27;Epoch: &#x27;, epoch, &#x27;| Iteration: &#x27;, iteration, &#x27;| batch x: &#x27;, batch_x.numpy(), &#x27;| batch y: &#x27;, batch_y.numpy())\nDataset的示例结果：\n\nDataLoader的示例结果：\n\ncsv文件读取使用常见离线数据csv文件进行数据加载和预处理。\nimport torchimport pandas as pdfrom torch.utils.data import Dataset, DataLoaderclass SampleCsvDataset(Dataset):    def __init__(self):        self.csv_file_path = &quot;./sample_boston.csv&quot;    def __getitem__(self, item):        raw_data = pd.read_csv(self.csv_file_path)        raw_data_shape = raw_data.shape        self._x  = torch.from_numpy(raw_data.iloc[:,:raw_data_shape[1]-1].values)        self._y  = torch.from_numpy(raw_data.iloc[:,raw_data_shape[1]-1].values)        return self._x[item], self._y[item]    def __len__(self):        raw_data = pd.read_csv(self.csv_file_path)        raw_data_shape = raw_data.shape        self._len = raw_data_shape[0]        return self._lensample_csv_dataset = SampleCsvDataset()print(&quot;Data Size:&quot;,len(sample_csv_dataset))print(&quot;First Sample:&quot;,next(iter(sample_csv_dataset)))print(&quot;First Sample&#x27;s Type:&quot;,type(next(iter(sample_csv_dataset))[0]))sample_dataloader = DataLoader(dataset=sample_csv_dataset, batch_size=3, shuffle=True)num_epochs = 4for epoch in range(num_epochs):    for iteration, (batch_x, batch_y) in enumerate(sample_dataloader):        print(&#x27;Epoch: &#x27;, epoch, &#x27;| Iteration: &#x27;, iteration, &#x27;| batch x: &#x27;, batch_x.numpy(), &#x27;| batch y: &#x27;, 
batch_y.numpy())\nmysql数据读取生产落地数据多为数据库，本文也针对常见Mysql数据库进行了数据加载，使用的是MYSQL8.0数据库的示例数据库sakila.payment表进行数据读取演示。\nimport torchimport pandas as pdimport pymysqlfrom torch.utils.data import Dataset, DataLoaderclass SampleMysqlDataset(Dataset):    def __init__(self):        # 初始化MySQL数据库连接配置参数        self.mysql_host = &quot;localhost&quot;        self.mysql_port = 3307        self.mysql_user = &quot;utest&quot;        self.mysql_password = &quot;123456xyq&quot;        self.mysql_db = &quot;sakila&quot;        self.mysql_table = &quot;payment&quot;        self.mysql_charset = &quot;utf8&quot;        self.mysql_sql_data = &quot;select payment_id, customer_id, staff_id, rental_id, amount from sakila.payment&quot;        self.mysql_sql_cnt = &quot;select count(*) from sakila.payment&quot;    def __getitem__(self, item):        # 创建数据库连接        conn = pymysql.connect(host=self.mysql_host,                        port=self.mysql_port,                        user=self.mysql_user,                        password=self.mysql_password,                        db=self.mysql_db,                        charset=self.mysql_charset)        raw_dataframe = pd.read_sql(self.mysql_sql_data, conn)        raw_dataframe_shape = raw_dataframe.shape        self._x  = torch.from_numpy(raw_dataframe.iloc[:,:raw_dataframe_shape[1]-1].values)        self._y  = torch.from_numpy(raw_dataframe.iloc[:,raw_dataframe_shape[1]-1].values)        return self._x[item], self._y[item]    def __len__(self):        # 创建数据库连接        conn = pymysql.connect(host=self.mysql_host,                        port=self.mysql_port,                        user=self.mysql_user,                        password=self.mysql_password,                        db=self.mysql_db,                        charset=self.mysql_charset)        raw_dataframe = pd.read_sql(self.mysql_sql_data, conn)        raw_dataframe_shape = raw_dataframe.shape        self._len = raw_dataframe_shape[0]        return self._lensample_mysql_dataset = 
SampleMysqlDataset()print(&quot;Data Size:&quot;,len(sample_mysql_dataset))print(&quot;First Sample:&quot;,next(iter(sample_mysql_dataset)))print(&quot;First Sample&#x27;s Type:&quot;,type(next(iter(sample_mysql_dataset))[0]))sample_dataloader = DataLoader(dataset=sample_mysql_dataset, batch_size=3, shuffle=True)num_epochs = 4for epoch in range(num_epochs):    for iteration, (batch_x, batch_y) in enumerate(sample_dataloader):        print(&#x27;Epoch: &#x27;, epoch, &#x27;| Iteration: &#x27;, iteration, &#x27;| batch x: &#x27;, batch_x.numpy(), &#x27;| batch y: &#x27;, batch_y.numpy())\n使用pytorch自带数据集为方便快速试验，Pytorch也集成了常见的数据集在torchaudio，torchtext和torchvision中，本代码使用torchvision读取常用的图像算法数据集MNIST，具体代码如下。\nfrom torchvision import datasets, transformsfrom torch.utils.data import DataLoader# 导入训练集sample_mnist_dataset = datasets.MNIST(root=r&#x27;./data&#x27;,                              transform=transforms.ToTensor(),                              train=True,                              download=True)print(&quot;Data Size:&quot;,len(sample_mnist_dataset))print(&quot;First Sample:&quot;,next(iter(sample_mnist_dataset)))print(&quot;First Sample&#x27;s Type:&quot;,type(next(iter(sample_mnist_dataset))[0]))sample_dataloader = DataLoader(dataset=sample_mnist_dataset, batch_size=3, shuffle=True)num_epochs = 4for epoch in range(num_epochs):    for iter, (batch_x, batch_y) in enumerate(sample_dataloader):        print(&#x27;Epoch: &#x27;, epoch, &#x27;| Iteration: &#x27;, iter, &#x27;| batch x: &#x27;, batch_x.numpy(), &#x27;| batch y: &#x27;, batch_y.numpy())\n探索完整代码已上传Github，有需要的可以自行下载代码，如果对你有帮助，请Star，哈哈哈哈！\n\n生产读取大量数据无法一次加载到内存该如何操作呢？\n\n如何使用TorchData进行数据读取和预处理？\n\n\n参考More info: pan_jinquan：Dataset, DataLoader产生自定义的训练数据\nMore info: 夜和大帝：Dataset类的使用\nMore info: setail：pytorch_tutorial\nMore info: Ericam_：十分钟搞懂Pytorch如何读取MNIST数据集\nMore info: Chenllliang：两文读懂PyTorch中Dataset与DataLoader（一）打造自己的数据集\nMore Info: cici_iii：大数据量下如何使用Dataset和IterDataset构建数据集\nMore Info: csdn-WJW: 
如何划分训练集，测试集和验证集\n","categories":["深度学习"],"tags":["Pytorch_Tutorial"]},{"title":"Pytorch深度学习教程(2):“Hello World”之Mnist图像分类","url":"/2022/05/29/Pytorch%E6%B7%B1%E5%BA%A6%E5%AD%A6%E4%B9%A0%E6%95%99%E7%A8%8B-2-%E2%80%9CHello-World%E2%80%9D%E4%B9%8BMnist%E5%9B%BE%E5%83%8F%E5%88%86%E7%B1%BB/","content":"简介本章节主要通过深度学习“Hello World”之Mnist图像分类，学会深度学习的基本链路，快速搭建个人的Baseline模型，包括数据加载、数据可视化、模型构建、模型训练评估和模型结果指标展示。\nMNIST是手写数字0~9的数据集，包含60,000个用于训练的样本和10,000个用于测试的样本。这些数字已被大小归一化，并在固定尺寸的图像中居中。对于尝试学习深度学习技术的人来说，这是一个非常棒的数据集，同时只需花费很少的时间进行预处理和格式化。详细：Mnist官网\n环境配置环境配置主要包括：操作系统，显卡，深度学习Pytorch等配置如下。\n系统：Windows 11 - WSL2 系统显卡：NVIDIA GeForce RTX 3060python: 3.6.13pytorch: 1.7.1cudatoolkit: 11.0.221torchvision: 0.8.2\n导入包环境本文使用的Python Package 主要有Pytorch的深度学习框架，torchvision用于加载Mnist数据集，matplotlib用于可视化展示数据集和展示训练结果等。\nfrom torchvision import datasets, transformsfrom torch.utils.data import DataLoaderimport torchimport torch.optim as optimimport torch.nn as nnimport torch.nn.functional as Fimport numpy as npimport randomimport matplotlib.pyplot as pltfrom torchinfo import summary\n构建baseline模型核心是快速跑通完整链路，同时在Baseline的基础上迭代更新有合适的对比标准，方便验证算法模型的稳定性和正确性。为保证训练模型的可重复性，我们需要进行一些配置。全部配置主要分为三部分，具体如下：\n\nCUDNN配置，CUDNN会对卷积操作进行优化，其中部分算法是非确定性的，因此需要关闭benchmark并开启deterministic。\nPytorch在运行中会有很多随机初始化操作，所以需要固定随机种子。\nPython &amp; Numpy也需要固定对应的随机种子。\n注意，如果Dataloader采用了多进程(num_workers &gt; 0), 那么由于读取数据的顺序不同，最终运行结果也会有差异。也就是说，改变num_workers参数，也会对实验结果产生影响。\n\n# 保证试验结果的稳定性seed = 0random.seed(seed)np.random.seed(seed)torch.manual_seed(seed)torch.cuda.manual_seed_all(seed)torch.backends.cudnn.benchmark = Falsetorch.backends.cudnn.deterministic = True# 配置Cuda参数is_cuda = Falseif torch.cuda.is_available():    is_cuda = True# 模型参数batch_size = 32num_epochs = 2\n模型数据本章节就不详细讲解，大家可以查看前一章博客讲解。\n# 构建Datasetmnist_train_dataset = datasets.MNIST(root=r&#x27;./data&#x27;,                              transform=transforms.ToTensor(),                              train=True,                              download=True)mnist_test_dataset = 
datasets.MNIST(root=r&#x27;./data&#x27;,                              transform=transforms.ToTensor(),                              train=False,                              download=True)# 构建Dataloadermnist_train_loader = DataLoader(mnist_train_dataset,batch_size=32,shuffle=True)mnist_test_loader = DataLoader(mnist_test_dataset,batch_size=32,shuffle=True)\n为更好地看到我们需要处理什么样的数据，我们可以用一些可视化的手段展示我们的数据。\n# 图片查看def plot_image(image,batch_size=32):    x_batch,y_batch = image[0],image[1]    f = plt.figure(figsize=(300,300))    # mean = 0.1307    # std = 0.3081    # image = ((image * mean) + std)    # plt.imshow(image, cmap=&#x27;gray&#x27;)    for i in range(batch_size):        image_tx = x_batch.numpy()[i]        # image_ty = y_batch.numpy()[i]        np.math.sqrt(32)        # Debug, plot figure        sub_size = int(np.math.sqrt(32))+1        f.add_subplot(sub_size,sub_size, i + 1)        # plt.subplot_mosaic        plt.imshow(image_tx[0], cmap=&#x27;gray&#x27;)    plt.show()sample_image_batch = next(iter(mnist_train_loader))plot_image(sample_image_batch)\n图片展示结果如下：\n\n模型构建# 构建网络class MnistNet(nn.Module):    def __init__(self):        super().__init__()        self.conv1 = nn.Conv2d(in_channels=1,out_channels=32,kernel_size= 3)        self.conv2 = nn.Conv2d(in_channels= 32,out_channels= 64,kernel_size= 3)        self.dropout = nn.Dropout2d(0.25)        self.fc1 = nn.Linear(9216, 128)        self.fc2 = nn.Linear(128, 10)    def forward(self, x):        x = self.conv1(x)        x = F.relu(x)        x = self.conv2(x)        x = F.relu(x)        x = F.max_pool2d(x, 2)        x = self.dropout(x)        x = x.view(-1, 9216)        x = self.fc1(x)        x = F.relu(x)        x = self.fc2(x)        x = F.log_softmax(x, dim=1)        return x\n在进行模型训练过程中，我们可以使用以下代码，查看构建的模型结构，目前torchinfo仅能展示 torch.nn 模块下的模型结构，采用 torch.nn.functional 构造的函数是无法显示的。\nsummary(model, input_size=(batch_size, 1, 28, 28),depth=4)\n\n模型训练模型训练函数如果模型中有BN层(Batch 
Normalization）和Dropout，需要在训练时添加model.train()，在测试时添加model.eval()。其中model.train()是保证BN层用每一批数据的均值和方差，而model.eval()是保证BN用全部训练数据的均值和方差；而对于Dropout，model.train()是随机取一部分网络连接来训练更新参数，而model.eval()是利用到了所有网络连接。\n# 模型训练函数def fit_train(model,data_loader):    model.train()    running_loss = 0    running_correct = 0    for batch_idx, (data, target) in enumerate(data_loader):        if is_cuda:            data, target = data.cuda(), target.cuda()        optimizer.zero_grad()        output = model(data)        loss = F.nll_loss(output, target)        running_loss += F.nll_loss(output ,target ,reduction=&#x27;sum&#x27;).item()        preds = output.data.max(1, keepdim=True)[1]        running_correct += preds.eq(target.data.view_as(preds)).cpu().sum()        loss.backward()        optimizer.step()    loss = running_loss/len(data_loader.dataset)    accuracy = 100. * running_correct/len(data_loader.dataset)    print(f&quot;Train loss is &#123;loss:&#123;5&#125;.&#123;2&#125;&#125; and Train accuracy is &#123;accuracy:&#123;10&#125;.&#123;4&#125;&#125; %&quot;)    return loss, accuracy# 模型评估函数def fit_eval(model,data_loader):    model.eval()    running_loss = 0    running_correct = 0    for batch_idx, (data, target) in enumerate(data_loader):        if is_cuda:            data, target = data.cuda(), target.cuda()        output = model(data)        loss = F.nll_loss(output, target)        running_loss += F.nll_loss(output ,target ,reduction=&#x27;sum&#x27;).item()        preds = output.data.max(1, keepdim=True)[1]        running_correct += preds.eq(target.data.view_as(preds)).cpu().sum()    loss = running_loss/len(data_loader.dataset)    accuracy = 100. 
* running_correct/len(data_loader.dataset)    print(f&quot;Eval loss is &#123;loss:&#123;5&#125;.&#123;2&#125;&#125; and Eval accuracy is &#123;accuracy:&#123;10&#125;.&#123;4&#125;&#125; %&quot;)    return loss, accuracy\n初始化模型和配置优化函数。\nmodel = MnistNet()if is_cuda:    model.cuda()optimizer = optim.SGD(model.parameters(),lr=0.01)\n模型评估train_losses, train_accuracy = [], []val_losses, val_accuracy = [], []for epoch in range(1, 20):    epoch_loss, epoch_accuracy = fit_train(model, mnist_train_loader)    val_epoch_loss, val_epoch_accuracy = fit_eval(model, mnist_test_loader)    train_losses.append(epoch_loss)    train_accuracy.append(epoch_accuracy)    val_losses.append(val_epoch_loss)    val_accuracy.append(val_epoch_accuracy)\n\nplt.plot(range(1,len(train_losses)+1),train_losses,&#x27;bo&#x27;,label = &#x27;training loss&#x27;)plt.plot(range(1,len(val_losses)+1),val_losses,&#x27;r&#x27;,label = &#x27;validation loss&#x27;)plt.legend()plt.show()\n\nplt.plot(range(1,len(train_accuracy)+1),train_accuracy,&#x27;bo&#x27;,label = &#x27;train accuracy&#x27;)plt.plot(range(1,len(val_accuracy)+1),val_accuracy,&#x27;r&#x27;,label = &#x27;val accuracy&#x27;)plt.legend()plt.show()\n\n探索完整代码已上传Github，有需要的可以自行下载代码，如果对你有帮助，请Star，哈哈哈哈！\n到此为止，已经可以使用自己数据玩耍各种Demo，快（苦）乐（逼）地进行炼丹之路。道路阻且长，行则将至，但行好事莫问前程。\n\nDemo模型还可以修改Loss函数，优化函数和模型结构？\n除了Torchinfo，是否可以使用Tensorboard等可视化工具管理模型数据的可视化数据呢？\nDemo模型离生产上线还有多远距离？如何上线部署模型呢？如何实现从Mysql等数据库到线上模型调用呢？任何的代码是否都应该以能够终端使用为目的？\n\n正如，人往往会对未知的事情产生恐惧，因为结局是未知的。所以当一切不再未知的时候，那么是不是就不会产生恐惧呢？\n","categories":["深度学习"],"tags":["Pytorch_Tutorial"]},{"title":"Pytorch深度学习教程(3):Mnist图像分类TorchServe部署","url":"/2022/06/06/Pytorch%E6%B7%B1%E5%BA%A6%E5%AD%A6%E4%B9%A0%E6%95%99%E7%A8%8B-3-Mnist%E5%9B%BE%E5%83%8F%E5%88%86%E7%B1%BBTorchServe%E9%83%A8%E7%BD%B2/","content":"简介经过前 两篇博客 
学习，我们已可使用CNN模型完成Mnist手写数字分类，对于算法从数据处理、模型构建、模型训练和评估链路有初步认知。但工业场景中往往还需要部署离线或在线模型用于提供模型推理服务，所谓模型推理服务是指在系统中部署训练完成的机器学习模型，以便其可接受新的输入并将推理结果返回给系统。\n其次，虽然很多大厂都会有封装好的部署平台供算法人员便捷配置，但是在学习中完成完整的工程链路开发，对于个人能力建设也是非常重要的，而不是仅仅作为一颗螺丝钉。如何实现从Demo模型转换成线上模型推理服务部署，对于个人的正向反馈也是非常有意义的。\n那么针对 PyTorch 训练好的 Demo 模型，如何部署到生产环境用于提供模型推理服务呢？部署形式非常多样，其中 TorchServe 是 PyTorch开源项目 的一部分，是AWS和Facebook合作开发的用于部署PyTorch模型的工具，对于算法工程师是相当友好的。本章介绍如何使用TorchServe完成PyTorch模型的部署和调用。\nTorchServe简介Torchserve是PyTorch的首选模型部署解决方案。它允许为模型公开一个可供直接访问或者应用程序访问的WebAPI，借助TorchServe，PyTorch用户可以更快地将其模型应用于生产，而无需编写自定义代码，此外，TorchServe将工程开发和算法开发进行解耦，算法工程师主要完成数据Process和模型构建这一擅长领域，其他的多模型服务、A/B测试的版本控制、监视指标以及应用程序集成RESTful都已封装好。\n\nTorchServe is a performant, flexible and easy to use tool for serving PyTorch eager mode and torchscripted models.\n\n官网介绍可看出：TorchServe是一款性能好、灵活性好、易使用的工具，其次可部署模型类型是Pytorch的Eager模式和Script模式模型。\nTorchServe框架主要分为四个部分：Frontend是TorchServe的请求和响应的处理部分；WorkerProcess 指的是一组运行的模型实例，可以由管理API设定运行的数量；Model Store是模型存储加载的地方；Backend用于管理Worker Process，具体可参考下图。\n\n环境安装本人使用 Windows11+WSL2+Ubuntu 环境进行部署。\nConda配置官网要求：Python Version &gt;= 3.8，本文使用Conda管理深度学习环境，具体使用可参考之前的博文：深度学习管理配置。\n\n可使用下述命令创建Conda的Python环境（python版本为3.8，环境名为ts_env）和激活指定环境(ts_env)。\nconda create --name ts_env python=3.8conda activate ts_env\nTS源码安装可参考官网TS安装文档。\ngit clone https://github.com/pytorch/serve.gitcd serve./ts_scripts/setup_wsl_ubuntuexport PATH=$HOME/.local/bin:$PATHpython ./ts_scripts/install_from_src.pyecho &#x27;export PATH=$HOME/.local/bin:$PATH&#x27; &gt;&gt; ~/.bashrcsource ~/.bashrc\n模型打包TorchServe 的一个关键特性是能够将所有模型工件打包到单个模型存档文件中。TorchServe 提供了一个单独的命令行工具 (CLI) torch-model-archiver，它可以接收模型检查点，或模型定义文件加 state_dict，并将它们打包成 .mar 文件。 然后，任何使用 TorchServe 的人都可以重新分发该文件并用它提供服务。它包含以下模型工件：TorchScript 模式下的模型检查点文件，或 Eager 模式下的模型定义文件和 state_dict 文件，以及服务模型可能需要的其他可选资产。 CLI 创建一个 .mar 文件，TorchServe 的服务器 CLI 使用该文件为模型提供服务。\n使用 torch-model-archiver 命令打包模型，需要提供以下三个文件。\n第 1 步：创建一个新的模型架构文件，其中包含继承自 torch.nn.Module 的模型类。在这个例子中，我们创建了mnist模型文件mnist_model.py文件。\nimport torchfrom torch import nnimport 
torch.nn.functional as F# 构建网络class MnistClassificationNet(nn.Module):    def __init__(self):        super().__init__()        self.conv1 = nn.Conv2d(in_channels=1, out_channels=32, kernel_size=3)        self.conv2 = nn.Conv2d(in_channels=32, out_channels=64, kernel_size=3)        self.max_pool2d = nn.MaxPool2d(kernel_size=2)        self.dropout = nn.Dropout2d(p=0.25)        # self.relu = nn.ReLU()        self.fc1 = nn.Linear(in_features=9216, out_features=128)        self.fc2 = nn.Linear(in_features=128, out_features=10)        self.log_softmax = nn.LogSoftmax(dim=1)    def forward(self, x):        x = self.conv1(x)        x = F.relu(x)        x = self.conv2(x)        x = F.relu(x)        x= self.max_pool2d(x)        x = self.dropout(x)        x = x.view(-1, 9216)        x = self.fc1(x)        x = F.relu(x)        x = self.fc2(x)        x= self.log_softmax(x)        return x\n第 2 步：使用 mnist_sd 训练 MNIST 数字识别模型并保存模型的状态字典。\ntorch.save(model.state_dict(), &quot;./checkpoints/model_pth/mnist_sd.pt&quot;)\n第 3 步：编写自定义处理程序以在您的模型上运行推理。 在此示例中，我们添加了一个 mnist_handler.py 文件，它使用上述模型对输入灰度图像进行推理并识别图像中的数字。\nfrom torchvision import transformsfrom ts.torch_handler.image_classifier import ImageClassifierfrom torch.profiler import ProfilerActivityclass MNISTDigitClassifier(ImageClassifier):    &quot;&quot;&quot;    MNISTDigitClassifier handler class. This handler extends class ImageClassifier from image_classifier.py, a    default handler. This handler takes an image and returns the number in that image.    Here method postprocess() has been overridden while others are reused from parent class.    
&quot;&quot;&quot;    image_processing = transforms.Compose([        transforms.ToTensor()        # transforms.Normalize((0.1307,), (0.3081,))    ])    def __init__(self):        super(MNISTDigitClassifier, self).__init__()        self.profiler_args = &#123;            &quot;activities&quot; : [ProfilerActivity.CPU],            &quot;record_shapes&quot;: True,        &#125;    def postprocess(self, data):        &quot;&quot;&quot;The post process of MNIST converts the predicted output response to a label.        Args:            data (list): The predicted output from the Inference with probabilities is passed            to the post-process function        Returns:            list : A list of dictionaries with predictions and explanations is returned        &quot;&quot;&quot;        return data.argmax(1).tolist()        # return data.tolist()\n第 4 步：使用 torch-model-archiver 程序创建一个 Torch 模型存档以存档上述文件。\n$ torch-model-archiver -husage: torch-model-archiver [-h] --model-name MODEL_NAME  --version MODEL_VERSION_NUMBER                      --model-file MODEL_FILE_PATH --serialized-file MODEL_SERIALIZED_PATH                      --handler HANDLER [--runtime &#123;python,python2,python3&#125;]                      [--export-path EXPORT_PATH] [-f] [--requirements-file]Model Archiver Tooloptional arguments:  -h, --help            show this help message and exit  --model-name MODEL_NAME                        Exported model name. Exported file will be named as                        model-name.mar and saved in current working directory                        if no --export-path is specified, else it will be                        saved under the export path  --serialized-file SERIALIZED_FILE                        Path to .pt or .pth file containing state_dict in                        case of eager mode or an executable ScriptModule                        in case of TorchScript.  --model-file MODEL_FILE                        Path to python file containing model architecture.   
                     This parameter is mandatory for eager mode models.                        The model architecture file must contain only one                        class definition extended from torch.nn.modules.  --handler HANDLER     TorchServe&#x27;s default handler name  or handler python                        file path to handle custom TorchServe inference logic.  --extra-files EXTRA_FILES                        Comma separated path to extra dependency files.  --runtime &#123;python,python2,python3&#125;                        The runtime specifies which language to run your                        inference code on. The default runtime is                        RuntimeType.PYTHON. At the present moment we support                        the following runtimes python, python2, python3  --export-path EXPORT_PATH                        Path where the exported .mar file will be saved. This                        is an optional parameter. If --export-path is not                        specified, the file will be saved in the current                        working directory.  --archive-format &#123;tgz,default&#125;                        The format in which the model artifacts are archived.                        &quot;tgz&quot;: This creates the model-archive in &lt;model-name&gt;.tar.gz format.                        If platform hosting requires model-artifacts to be in &quot;.tar.gz&quot;                        use this option.                        &quot;no-archive&quot;: This option creates an non-archived version of model artifacts                        at &quot;export-path/&#123;model-name&#125;&quot; location. As a result of this choice,                        MANIFEST file will be created at &quot;export-path/&#123;model-name&#125;&quot; location                        without archiving these model files                        &quot;default&quot;: This creates the model-archive in &lt;model-name&gt;.mar format.                        
This is the default archiving format. Models archived in this format                        will be readily hostable on TorchServe.  -f, --force           When the -f or --force flag is specified, an existing                        .mar file with same name as that provided in --model-                        name in the path specified by --export-path will                        overwritten  -v, --version         Model&#x27;s version.  -r, --requirements-file                        Path to requirements.txt file containing a list of model specific python                        packages to be installed by TorchServe for seamless model serving.\ntorch-model-archiver --model-name mnist --version 1.0 --model-file mnist_model.py --serialized-file mnist_sd.pt --export-path ./model_store --handler mnist_handler.py -f\n模型部署$ torchserve --helpusage: torchserve [-h] [-v | --version]                          [--start]                          [--stop]                          [--ts-config TS_CONFIG]                          [--model-store MODEL_STORE]                          [--workflow-store WORKFLOW_STORE]                          [--models MODEL_PATH1 MODEL_NAME=MODEL_PATH2... [MODEL_PATH1 MODEL_NAME=MODEL_PATH2... ...]]                          [--log-config LOG_CONFIG]torchservemandatory arguments:  --model-store MODEL_STORE                        Model store location where models can be loaded  optional arguments:  -h, --help            show this help message and exit  -v, --version         Return TorchServe Version  --start               Start the model-server  --stop                Stop the model-server  --ts-config TS_CONFIG                        Configuration file for TorchServe  --models MODEL_PATH1 MODEL_NAME=MODEL_PATH2... [MODEL_PATH1 MODEL_NAME=MODEL_PATH2... ...]                        Models to be loaded using [model_name=]model_location                        format. 
Location can be a HTTP URL, a model archive                        file or directory contains model archive files in                        MODEL_STORE.  --log-config LOG_CONFIG                        Log4j configuration file for TorchServe  --ncs, --no-config-snapshots                                 Disable snapshot feature  --workflow-store WORKFLOW_STORE                        Workflow store location where workflow can be loaded. Defaults to model-store\n启动torchserve服务torchserve --start --ncs --model-store model_store --models mnist.mar\n模型启动日志如下截图：\n\n推理健康检查APIcurl http://localhost:8080/ping\n如果server正常运行, 响应会如截图所示：\n\n推理curl http://127.0.0.1:8080/predictions/mnist -T ./data/test.png\ntest.png为数字为0的图片，通过上述的调用推理，可以看出结果是能正常返回的，是可以作为下游应用调用。\n\n停止torchserve服务torchserve --stop\n探索完整代码已上传Github，有需要的可以自行下载代码，如果对你有帮助，请Star，哈哈哈哈！\n到此为止，已经可以使用自己数据玩耍各种Demo，快（苦）乐（逼）地进行炼丹之路。道路阻且长，行则将至，但行好事莫问前程。\n\n除了使用TorchServe部署模型，还有其他的解决方案吗？\n除了使用提供这种Web API的形式，是否可以构建一个GUI的形式提供呢？例如 PYQT5 ？这里放一张PYQT5的图，后面会填坑。\n\n\n正如，人往往会对未知的事情产生恐惧，因为结局是未知的。所以当一切不再未知的时候，那么是不是就不会产生恐惧呢？\n参考More info: Chenglu：如何部署PyTorch模型More info: TorchServeMore info: TorchServe_Mnist ExampleMore info: 随便写点笔记More info: PyTorch Eager mode and Script modeMore info: Self-host your 🤗HuggingFace Transformer NER model with Torchserve + StreamlitMore info: TorchServe搭建codeBERT分类模型服务More info: torchserver模型本地部署和docker部署\n","tags":["Pytorch_Tutorial"]},{"title":"Pytorch深度学习教程(4):序列模型","url":"/2022/06/12/Pytorch%E6%B7%B1%E5%BA%A6%E5%AD%A6%E4%B9%A0%E6%95%99%E7%A8%8B-4-%E5%BA%8F%E5%88%97%E6%A8%A1%E5%9E%8B/","content":"序列模型介绍循环神经网络（Recurrent Neural Network，RNN）是一种用于处理序列数据的神经网络。相比一般的神经网络来说，它能够处理序列变化的数据。比如某个单词的意思会因为上文提到的内容不同而有不同的含义，RNN就能够很好地解决这类问题。RNN也可以堆叠在一起。需要注意的是，堆中的每个RNN都有自己的权重矩阵。因此，权重矩阵在水平轴（时间轴）上共享，而不是在垂直轴（RNN的数量）上共享。\n\n因RNN模型训练采用BPTT算法(BackPropagation Through 
Time)，导致RNN有一个主要的缺点，即所谓的梯度消失和梯度爆炸，而梯度爆炸问题可以通过梯度裁剪，去抑制，梯度消失的问题是随着反向传播，梯度越来越小，在这些网络中，随着时间步长的增加，反向传播使梯度变得越来越小，导致梯度消失。进而导致RNN长期依赖问题。注意，裁剪仅限制梯度的大小，而不限制其方向。所以，学习仍然朝着正确的方向前进。\nhttps://www.zhihu.com/question/279046805/answer/1153623199\n","tags":["Pytorch_Tutorial"]},{"title":"Windows系统使用Conda配置深度学习环境","url":"/2022/05/15/Windows%E7%B3%BB%E7%BB%9F%E4%BD%BF%E7%94%A8Conda%E9%85%8D%E7%BD%AE%E6%B7%B1%E5%BA%A6%E5%AD%A6%E4%B9%A0%E7%8E%AF%E5%A2%83/","content":"Conda环境管理Conda是一个开源软件包管理系统和环境管理系统，可以管理不同Python版本环境，不同的环境之间是互相隔离，互不影响的。\n查看环境# 查看当前环境conda info --env\n克隆环境# 假设已有环境名为A，需要生成的环境名为B：conda create -n B --clone A# 如果特殊环境为Base，需要采用以下方式conda update condaconda create -n &lt;my_env&gt; --clone rootconda create -n torch_env --clone rootconda install pytorch=0.4.0 cuda90 -c pytorch# 用于复制环境到新的机器conda list --explicit &gt; spec-file.txtconda create --name &lt;my_env&gt; --file spec-file.txt\n创建环境# 创建一个环境名为py34，指定Python版本是3.4 #（不用管是3.4.x，conda会为我们自动寻找3.4.x中的最新版本） conda create --name py34 python=3.4 # 通过创建环境，我们可以使用不同版本的Python conda create --name py27 python=2.7\n激活环境# 在windows环境下使用activate激活 activate py34# 在Linux &amp; Mac中使用source activate激活 source activate py34 \n退出环境# 在windows环境下使用deactivate &lt;my_env&gt;# 在Linux &amp; Mac中使用source deactivate &lt;my_env&gt;\n删除环境# 如果你不想要这个名为py34的环境，可以通过以下命令删除这个环境。 conda remove -n py34 --all # 可以通过以下命令查看已有的环境列表，现在py34已经不在这个列表里。 conda info -e\n配置镜像# 显示目前的channels conda config --show channels # 切换默认镜像源 conda config --remove-key channels# 删除指定channel conda config --remove channels_URL # Windows 用户无法直接创建名为 .condarc 的文件，使用以下命令 C:\\Users\\用户名\\.condarcconda config --set show_channel_urls yes# 中科大镜像源 conda config --add channels https://mirrors.ustc.edu.cn/anaconda/pkgs/main/ conda config --add channels https://mirrors.ustc.edu.cn/anaconda/pkgs/free/conda config --add channels https://mirrors.ustc.edu.cn/anaconda/cloud/conda-forge/ conda config --add channels https://mirrors.ustc.edu.cn/anaconda/cloud/msys2/ conda config --add channels 
https://mirrors.ustc.edu.cn/anaconda/cloud/bioconda/ conda config --add channels https://mirrors.ustc.edu.cn/anaconda/cloud/menpo/ conda config --add channels https://mirrors.ustc.edu.cn/anaconda/cloud/  # 北京外国语大学源 conda config --add channels https://mirrors.bfsu.edu.cn/anaconda/pkgs/main conda config --add channels https://mirrors.bfsu.edu.cn/anaconda/pkgs/free conda config --add channels https://mirrors.bfsu.edu.cn/anaconda/pkgs/r conda config --add channels https://mirrors.bfsu.edu.cn/anaconda/pkgs/pro conda config --add channels https://mirrors.bfsu.edu.cn/anaconda/pkgs/msys2# 清华源 conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/mainconda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/r conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/pro conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/msys2 \nTensorflow环境安装查看Tensorflow版本和安装指定版本。\n# 查询tensorflow-gpu的版本conda search tensorflow-gpu# 指定版本进行安装conda install tensorflow-gpu==1.13.1\n\n安装过程中会安装cudatoolkit-10.0.130和cudnn-7.6.5。\n\n执行下述命令查看tf版本和GPU是否生效。\n查看是否安装成功import tensorflow as tf# 打印Tensorflow版本print(tf.__version__)# 打印是否支持GPUprint(tf.test.is_gpu_available())\n根据图片打印结果，成功安装tf 1.13.1 版本，同时GPU安装生效。\n\nPytorch环境安装执行下述命令安装Pytorch。\nconda install pytorch==1.7.0 torchvision==0.8.0 torchaudio==0.7.0 cudatoolkit=10.1 -c pytorch\n执行下述命令查看torch版本和GPU是否生效。\nimport torchprint(torch.__version__)print(torch.cuda.is_available())\n根据图片打印结果，成功安装Pytorch 1.7.0 版本，同时GPU安装生效。\n\n参考\nMore info: 阿尔发go：conda常用命令\n\nMore info: 卖萌哥：conda的安装与使用\n\nMore info: 无聊就看书：Tensorflow-gpu1.13.1 和 Tensorflow-gpu2.0.0共存之安装教程\n\n\n","tags":["Conda"]},{"title":"snippets","url":"/2022/05/08/snippets/","content":"本博客的目的主要是为了收集一些常用代码，或者一些有意思的代码，方便后续的工作学习使用。\nSklearn的dataset转为dataframeimport pandas as pdfrom sklearn import datasetsdef 
sklearn_to_df(sklearn_dataset):\n    df = pd.DataFrame(sklearn_dataset.data, columns=sklearn_dataset.feature_names)\n    df[&#x27;TARGET&#x27;] = pd.Series(sklearn_dataset.target)\n    return df\n\n# Note: load_boston was removed in scikit-learn 1.2, so this call needs an older release\ndf_boston = sklearn_to_df(datasets.load_boston())\nprint(df_boston.head())\nReading MySQL data into a dataframe with pymysql\nimport pymysql\nimport pandas as pd\n\ndef load_data_frame_from_mysql():\n    conn = pymysql.connect(host=&quot;127.0.0.1&quot;,\n                           port=3307,\n                           user=&quot;utest&quot;,\n                           password=&quot;123456xyq&quot;,\n                           db=&quot;sakila&quot;,\n                           charset=&quot;utf8&quot;)\n    sql = &quot;SELECT * FROM sakila.payment&quot;\n    data_frame = pd.read_sql(sql, conn)\n    return data_frame\n\npdata = load_data_frame_from_mysql()\nprint(pdata.head())\nImage processing code\n# -*-coding: utf-8 -*-\n&quot;&quot;&quot;\n    @Project: IntelligentManufacture\n    @File   : image_processing.py\n    @Author : panjq\n    @E-mail : pan_jinquan@163.com\n    @Date   : 2019-02-14 15:34:50\n&quot;&quot;&quot;\nimport os\nimport glob\nimport cv2\nimport numpy as np\nimport matplotlib.pyplot as plt\n\ndef show_image(title, image):\n    &#x27;&#x27;&#x27;\n    Show an RGB image with matplotlib\n    :param title: image title\n    :param image: image data\n    :return:\n    &#x27;&#x27;&#x27;\n    # plt.figure(&quot;show_image&quot;)\n    # print(image.dtype)\n    plt.imshow(image)\n    plt.axis(&#x27;on&#x27;)  # use off to hide the axes\n    plt.title(title)  # image title\n    plt.show()\n\ndef cv_show_image(title, image):\n    &#x27;&#x27;&#x27;\n    Show an RGB image with OpenCV\n    :param title: image title\n    :param image: input RGB image\n    :return:\n    &#x27;&#x27;&#x27;\n    channels=image.shape[-1]\n    if channels==3:\n        image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)  # convert RGB to BGR for OpenCV\n    cv2.imshow(title,image)\n    cv2.waitKey(0)\n\ndef read_image(filename, resize_height=None, resize_width=None, normalization=False):\n    &#x27;&#x27;&#x27;\n    Read image data; returns uint8 in [0,255] by default\n    :param filename:\n    :param resize_height:\n    :param resize_width:\n    :param 
normalization: whether to normalize to [0.,1.0]\n    :return: the RGB image data\n    &#x27;&#x27;&#x27;\n\n    bgr_image = cv2.imread(filename)\n    # bgr_image = cv2.imread(filename,cv2.IMREAD_IGNORE_ORIENTATION|cv2.IMREAD_COLOR)\n    if bgr_image is None:\n        print(&quot;Warning: file does not exist: &#123;&#125;&quot;.format(filename))\n        return None\n    if len(bgr_image.shape) == 2:  # convert a grayscale image to three channels\n        print(&quot;Warning:gray image&quot;, filename)\n        bgr_image = cv2.cvtColor(bgr_image, cv2.COLOR_GRAY2BGR)\n\n    rgb_image = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2RGB)  # convert BGR to RGB\n    # show_image(filename,rgb_image)\n    # rgb_image=Image.open(filename)\n    rgb_image = resize_image(rgb_image,resize_height,resize_width)\n    rgb_image = np.asanyarray(rgb_image)\n    if normalization:\n        # do not write: rgb_image=rgb_image/255\n        rgb_image = rgb_image / 255.0\n    # show_image(&quot;src resize image&quot;,image)\n    return rgb_image\n\ndef fast_read_image_roi(filename, orig_rect, ImreadModes=cv2.IMREAD_COLOR, normalization=False):\n    &#x27;&#x27;&#x27;\n    Fast way to read an image\n    :param filename: image path\n    :param orig_rect: region of interest (rect) in the original image\n    :param ImreadModes: IMREAD_UNCHANGED\n                        IMREAD_GRAYSCALE\n                        IMREAD_COLOR\n                        IMREAD_ANYDEPTH\n                        IMREAD_ANYCOLOR\n                        IMREAD_LOAD_GDAL\n                        IMREAD_REDUCED_GRAYSCALE_2\n                        IMREAD_REDUCED_COLOR_2\n                        IMREAD_REDUCED_GRAYSCALE_4\n                        IMREAD_REDUCED_COLOR_4\n                        IMREAD_REDUCED_GRAYSCALE_8\n                        IMREAD_REDUCED_COLOR_8\n                        IMREAD_IGNORE_ORIENTATION\n    :param normalization: whether to normalize\n    :return: the region of interest (ROI)\n    &#x27;&#x27;&#x27;\n    # With an IMREAD_REDUCED mode, the rect must be scaled accordingly\n    scale=1\n    if ImreadModes == cv2.IMREAD_REDUCED_GRAYSCALE_2 or ImreadModes == cv2.IMREAD_REDUCED_COLOR_2:\n        scale=1/2\n    elif ImreadModes == cv2.IMREAD_REDUCED_GRAYSCALE_4 or ImreadModes == cv2.IMREAD_REDUCED_COLOR_4:\n        
scale=1/4\n    elif ImreadModes == cv2.IMREAD_REDUCED_GRAYSCALE_8 or ImreadModes == cv2.IMREAD_REDUCED_COLOR_8:\n        scale=1/8\n    rect = np.array(orig_rect)*scale\n    rect = rect.astype(int).tolist()\n    bgr_image = cv2.imread(filename,flags=ImreadModes)\n\n    if bgr_image is None:\n        print(&quot;Warning: file does not exist: &#123;&#125;&quot;.format(filename))\n        return None\n    if len(bgr_image.shape) == 3:  # color image\n        rgb_image = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2RGB)  # convert BGR to RGB\n    else:\n        rgb_image=bgr_image  # grayscale image\n    rgb_image = np.asanyarray(rgb_image)\n    if normalization:\n        # do not write: rgb_image=rgb_image/255\n        rgb_image = rgb_image / 255.0\n    roi_image=get_rect_image(rgb_image , rect)\n    # show_image_rect(&quot;src resize image&quot;,rgb_image,rect)\n    # cv_show_image(&quot;reROI&quot;,roi_image)\n    return roi_image\n\ndef resize_image(image,resize_height, resize_width):\n    &#x27;&#x27;&#x27;\n    :param image:\n    :param resize_height:\n    :param resize_width:\n    :return:\n    &#x27;&#x27;&#x27;\n    image_shape=np.shape(image)\n    height=image_shape[0]\n    width=image_shape[1]\n    if (resize_height is None) and (resize_width is None):  # wrong form: resize_height and resize_width is None\n        return image\n    if resize_height is None:\n        resize_height=int(height*resize_width/width)\n    elif resize_width is None:\n        resize_width=int(width*resize_height/height)\n    image = cv2.resize(image, dsize=(resize_width, resize_height))\n    return image\n\ndef scale_image(image,scale):\n    &#x27;&#x27;&#x27;\n    :param image:\n    :param scale: (scale_w,scale_h)\n    :return:\n    &#x27;&#x27;&#x27;\n    image = cv2.resize(image,dsize=None, fx=scale[0],fy=scale[1])\n    return image\n\ndef get_rect_image(image,rect):\n    &#x27;&#x27;&#x27;\n    :param image:\n    :param rect: [x,y,w,h]\n    :return:\n    &#x27;&#x27;&#x27;\n    x, y, w, h=rect\n    cut_img = image[y:(y+ h),x:(x+w)]\n    return cut_img\n\ndef scale_rect(orig_rect,orig_shape,dest_shape):\n    &#x27;&#x27;&#x27;\n    When an image is scaled, the corresponding rectangle must be scaled too\n    :param orig_rect: 
rect of the original image, [x,y,w,h]\n    :param orig_shape: shape of the original image, [h,w]\n    :param dest_shape: shape of the scaled image, [h,w]\n    :return: the scaled rectangle\n    &#x27;&#x27;&#x27;\n    new_x=int(orig_rect[0]*dest_shape[1]/orig_shape[1])\n    new_y=int(orig_rect[1]*dest_shape[0]/orig_shape[0])\n    new_w=int(orig_rect[2]*dest_shape[1]/orig_shape[1])\n    new_h=int(orig_rect[3]*dest_shape[0]/orig_shape[0])\n    dest_rect=[new_x,new_y,new_w,new_h]\n    return dest_rect\n\ndef show_image_rect(win_name,image,rect):\n    &#x27;&#x27;&#x27;\n    :param win_name:\n    :param image:\n    :param rect:\n    :return:\n    &#x27;&#x27;&#x27;\n    x, y, w, h=rect\n    point1=(x,y)\n    point2=(x+w,y+h)\n    cv2.rectangle(image, point1, point2, (0, 0, 255), thickness=2)\n    cv_show_image(win_name, image)\n\ndef rgb_to_gray(image):\n    image = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)\n    return image\n\ndef save_image(image_path, rgb_image,toUINT8=True):\n    if toUINT8:\n        rgb_image = np.asanyarray(rgb_image * 255, dtype=np.uint8)\n    if len(rgb_image.shape) == 2:  # convert a grayscale image to three channels\n        bgr_image = cv2.cvtColor(rgb_image, cv2.COLOR_GRAY2BGR)\n    else:\n        bgr_image = cv2.cvtColor(rgb_image, cv2.COLOR_RGB2BGR)\n    cv2.imwrite(image_path, bgr_image)\n\ndef combime_save_image(orig_image, dest_image, out_dir,name,prefix):\n    &#x27;&#x27;&#x27;\n    Naming convention: out_dir/name_prefix.jpg\n    :param orig_image:\n    :param dest_image:\n    :param out_dir:\n    :param name:\n    :param prefix:\n    :return:\n    &#x27;&#x27;&#x27;\n    dest_path = os.path.join(out_dir, name + &quot;_&quot;+prefix+&quot;.jpg&quot;)\n    save_image(dest_path, dest_image)\n\n    dest_image = np.hstack((orig_image, dest_image))\n    save_image(os.path.join(out_dir, &quot;&#123;&#125;_src_&#123;&#125;.jpg&quot;.format(name,prefix)), dest_image)\nUninstalling software with apt-get\n# Remove the package together with its configuration files\nsudo apt-get --purge remove openjdk-11-jdk\n# Remove dependencies that are no longer needed\nsudo apt-get autoremove openjdk-11-jdk\n# The dpkg list may still contain packages in the rc state; run the following command for a final cleanup:\ndpkg -l |grep ^rc|awk &#x27;&#123;print $2&#125;&#x27; |sudo xargs dpkg 
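-P\n","categories":["Snippet"],"tags":["Snippet","Python"]},{"title":"Building a Personal Static Blog with GitHub Pages","url":"/2022/05/01/%E4%BD%BF%E7%94%A8GithubPages%E6%90%AD%E5%BB%BA%E4%B8%AA%E4%BA%BA%E9%9D%99%E6%80%81%E5%8D%9A%E5%AE%A2/","content":"How it works\nGitHub Pages hosts static websites from a GitHub repository. With standard technologies such as YAML and Markdown, anyone can build and maintain a site in a few minutes, yet it is more than a collection of static files. Using site generators such as Jekyll and Liquid, developers can define dynamic templates for a complete static site. Every time a change is committed to the source branch associated with the site, the static pages are regenerated from the latest code and configuration on that branch, and GitHub publishes them to the target URL automatically. Feel free to follow my blog.\nEnvironment\nOperating system: Windows11 -&gt; WSL2 -&gt; Ubuntu 22.04 LTS\nNode.js: v16.15.0\nNpm: 8.5.5\nSetup and deployment\nInstalling NVM\nBefore installing nvm, here is a brief overview of how nvm, nodejs and npm relate to each other.\n\nnvm: the version manager for nodejs.\nnodejs: the JavaScript runtime the project is built on.\nnpm: the nodejs package manager, which manages third-party packages for nodejs.\n\nFollowing the nvm repository on GitHub, install nvm with the following command.\nwget -qO- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.1/install.sh | bash\nA screenshot of the installation output follows:\n\nTo check whether nvm was installed successfully, print its version with the following command.\nnvm -v\n\nInstalling NODEJS\nThe nvm install command installs a specific version of NodeJs. This setup uses v16, so run the following command.\nnvm install 16\nCheck the node and npm versions with the following commands.\nnode -v\nnpm -v\nThe Node and npm versions are printed.\n\nInstalling and configuring Hexo\nInstalling the Hexo command line\nHexo is a fast, simple and efficient blog framework. It ships an official command-line tool for quickly creating projects and pages and for building and deploying a Hexo blog. Install the command-line tool.\nnpm install -g hexo-cli\nInitializing and running locally\nNext, use the Hexo command line to create a project and run it locally to check that everything works end to end.\nFirst create the project with hexo init {blog_name}, where blog_name is the name of the blog. I want a blog called hexo_blog, so I name the project hexo_blog and the command is: hexo init hexo_blog.\nhexo init &#123;blog_name&#125;\ncd &#123;blog_name&#125;\nnpm install\nhexo server\nThe hexo_blog folder now contains the Hexo scaffolding, including the themes, scaffolds and source folders.\n\nNext, enter the newly created folder and run the Hexo generate command, which builds the site into HTML: hexo generate\nThe output includes js, css, font and other assets, all placed in the public folder under the project root.\nThen use the Hexo server command to run the blog locally: hexo server\nAfter it starts, the command line prints:\nINFO  Start processing\nINFO  Hexo is running at http://localhost:4000 . 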
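Press Ctrl+C to stop.\nThe blog site is now available on local port 4000, as shown in the screenshot:\n\nConfiguring a theme\nGo to the official Hexo site, open the Themes page and pick a theme you like. I chose the keep theme. Running the install command places the theme files in the node_modules folder.\nnpm install hexo-theme-keep --save\n\nRun hexo server to preview the site locally.\nThe site now uses the keep theme.\nDeploying to Github\nDeploying to Github requires the hexo-deployer-git plugin.\nnpm install hexo-deployer-git --save\nOnce hexo-deployer-git is installed, edit the configuration file. The configuration is as follows:\n\nBefore this step, create the corresponding repository on github. My repository is named yqstar.github.io.\n\n\nReferences\nMoreInfo: the hexo-asset-image plugin\nMoreInfo: how to insert local images\n"}]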