TDengine + OpenVINO + AIxBoard: Powering Time-Series Data Classification

Feng Wei, Software Architect, Intel

2023-09-28 / Technical Articles - Time-Series Database, Featured

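Time-series data analysis is widely used in industry, energy, healthcare, transportation, finance, retail, and many other fields, and classifying time-series data is one of its most common tasks. Through a concrete example, this article describes how an Intel team used TDengine as the underlying software to store experimental data, relied on TDengine's efficient queries to feed a deep learning model deployed with OpenVINO, and finally ran the classification task in real time on an AIxBoard developer board.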

Model Overview

In recent years, machine learning and deep learning have made significant progress on time-series classification, and both HIVE-COTE and InceptionTime have achieved good results. Compared with HIVE-COTE, which is based on Nearest Neighbor and DTW algorithms, InceptionTime, which is based on one-dimensional convolutions (Conv1D), stands out: it reaches classification accuracy comparable to HIVE-COTE while greatly reducing computational complexity.
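To give a feel for why DTW-based nearest-neighbor comparison is costly, here is a minimal sketch of the classic dynamic-programming DTW distance. The function name `dtw_distance` and the absolute-difference cost are illustrative choices and are not taken from HIVE-COTE. Each comparison fills a table of size len(a) × len(b), so the cost of comparing two series grows quadratically with their length.

```python
def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic-time-warping distance (illustrative)."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

print(dtw_distance([0, 0, 1], [0, 1, 1]))  # 0.0: a time-shifted copy aligns perfectly
print(dtw_distance([0, 0], [1, 1]))        # 2.0: every aligned pair differs by 1
```

A nearest-neighbor classifier built on such a distance has to repeat this table fill for every training series it compares against, which is the kind of cost a single forward pass through a convolutional model avoids.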

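As shown in the figure below, the Inception module is the basic building block of the InceptionTime model. It is formed by stacking several one-dimensional convolution (Conv1D) operations and combining them with residual connections.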

[Figure: structure of the Inception module]

The complete InceptionTime model is built by chaining multiple Inception modules.

For more details on InceptionTime, please refer to the InceptionTime paper.
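A minimal sketch of why the parallel convolutions inside an Inception module can be concatenated: with stride 1 and "same" padding, a 1D convolution preserves the sequence length regardless of kernel size. The helper `conv1d_same` is a hypothetical pure-Python stand-in for a single-channel Conv1D; the kernel sizes 40, 20, and 10 and the series length 152 match the values used in the code later in this article.

```python
def conv1d_same(signal, kernel):
    """Single-channel 1D cross-correlation, stride 1, zero 'same' padding."""
    k = len(kernel)
    pad_total = k - 1
    pad_left = pad_total // 2          # for even k the extra zero goes on the right
    pad_right = pad_total - pad_left
    padded = [0.0] * pad_left + list(signal) + [0.0] * pad_right
    return [
        sum(padded[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal))
    ]

series = [float(t % 7) for t in range(152)]   # dummy series of length 152
for k in (40, 20, 10):
    out = conv1d_same(series, [1.0 / k] * k)  # moving-average kernel
    print(k, len(out))                        # the length stays 152 for every k
```

Because every branch produces an output of the same length, the branch outputs differ only in their channel dimension and can be stacked along it.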

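Dataset

The dataset used in this article comes from the UCR time-series dataset collection, which consists of 128 time-series classification tasks. Its Wafer dataset contains 1,000 training samples and 6,164 test samples; each sample consists of a label and a time series of length 152. The data is written into TDengine ahead of time by a program.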
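As a minimal sketch of the record layout described above, the snippet below parses tab-separated rows in which the first field is the label and the remaining fields are the series values. The two short rows are made-up stand-ins (real Wafer rows carry 152 values), and `parse_row` is a hypothetical helper, not part of any TDengine tooling.

```python
def parse_row(line):
    """Split one tab-separated row into (label, values)."""
    fields = line.strip().split("\t")
    label = int(float(fields[0]))
    values = [float(v) for v in fields[1:]]
    return label, values

# Made-up sample rows; real Wafer rows have 152 values after the label.
rows = ["1\t-1.10\t-1.10\t0.52\t0.61", "-1\t-0.98\t-0.97\t1.40\t1.43"]
for label, values in map(parse_row, rows):
    print(label, len(values), values[:2])
```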

The time series described here were recorded by a single sensor on a single tool during wafer production. The figure below shows example series for the two labels, normal (class 1) and abnormal (class 0).

Clearly, this is a standard supervised classification task. We want a model that, given a time series of length 152, outputs 0 or 1, indicating whether the wafer corresponding to the input series experienced an anomaly during production.

Model Training

In this article we train an InceptionTime model on the Wafer dataset. From the sensor time series recorded while a wafer is produced, the trained model can determine whether that wafer's production process was abnormal.

The authors of InceptionTime open-sourced an implementation based on tensorflow.keras. The model code in this article is based on that open-source version, with TDengine support integrated.

First, load the Python libraries.

from os import path

import matplotlib.pyplot as plt  # used below to plot the training history
import numpy as np
import pandas as pd  # used below to read query results from TDengine
from sklearn import preprocessing

from tensorflow import keras
from tensorflow.keras.layers import (
    Activation, Add, BatchNormalization, Concatenate,
    Conv1D, Dense, Input, GlobalAveragePooling1D, MaxPool1D
)

from sqlalchemy import create_engine, text

Then use TDengine's SQLAlchemy driver to load the Wafer dataset and preprocess it.

def readucr(conn, dbName, tableName):
    # Read the whole table into a DataFrame, then convert it to a NumPy array
    # so that it can be sliced by position.
    data = pd.read_sql(
        text(
            "select * from " + dbName + "." + tableName
        ),
        conn,
    ).to_numpy()
    # Column 0 is treated as the label, the remaining columns as the series.
    y = data[:, 0]
    x = data[:, 1:]
    return x, y

def load_data(db):
    engine = create_engine("taos://root:taosdata@localhost:6030/" + db)
    try:
        conn = engine.connect()
    except Exception as e:
        print(e)
        exit(1)

    if conn is not None:
        print("Connected to the TDengine ...")
    else:
        print("Failed to connect to taos")
        exit(1)

    # Table names mirror the UCR file names; adjust them to match your schema.
    x_train, y_train = readucr(conn, db, db + '_TRAIN.tsv')
    x_test, y_test = readucr(conn, db, db + '_TEST.tsv')
    n_classes = len(np.unique(y_train))
    # One-hot encode the labels using every label seen in train and test.
    enc = preprocessing.OneHotEncoder()
    y = np.concatenate((y_train, y_test), axis=0).reshape(-1,1)
    enc.fit(y)
    y_tr = enc.transform(y_train.reshape(-1,1)).toarray()
    y_te = enc.transform(y_test.reshape(-1,1)).toarray()
    # Add a trailing channel axis: (samples, timesteps) -> (samples, timesteps, 1).
    x_tr, x_te = map(lambda x: x.reshape(x.shape[0], x.shape[1], 1), [x_train, x_test])
    return x_tr, y_tr, x_te, y_te, n_classes

x_tr, y_tr, x_te, y_te, n_classes = load_data('Wafer')
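The preprocessing in `load_data` can be checked on a toy example. The sketch below reproduces the two transformations with plain NumPy instead of scikit-learn's `OneHotEncoder`: labels become one-hot rows (columns ordered by sorted label value), and each series gains a trailing channel axis so that Conv1D receives input of shape (samples, timesteps, 1). The toy labels and series are invented for illustration; the -1/1 label convention is the one that the deployment code later remaps to 0/1.

```python
import numpy as np

y = np.array([1, -1, 1, 1])                      # toy labels in the raw -1/1 convention
x = np.arange(12, dtype=float).reshape(4, 3)     # 4 toy series of length 3

classes, idx = np.unique(y, return_inverse=True) # classes are sorted: [-1, 1]
y_onehot = np.eye(len(classes))[idx]             # one one-hot row per sample
x_3d = x.reshape(x.shape[0], x.shape[1], 1)      # add the channel axis

print(classes)        # [-1  1]
print(y_onehot[0])    # [0. 1.]
print(x_3d.shape)     # (4, 3, 1)
```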

Next, implement InceptionTime with tensorflow.keras and create the model.

def inception_module(input_tensor, filters, kernel_size, bottleneck_size,
                     activation='relu', use_bottleneck=True):
    if use_bottleneck and int(input_tensor.shape[-1]) > 1:
        input_inception = Conv1D(filters=bottleneck_size, kernel_size=1, padding='same',
                                 activation=activation, use_bias=False)(input_tensor)
    else:
        input_inception = input_tensor
    kernel_size_s = [kernel_size // (2 ** i) for i in range(3)] # [40, 20, 10]
    conv_list = []
    for i in range(len(kernel_size_s)):
        conv = Conv1D(filters=filters, kernel_size=kernel_size_s[i],
                      strides=1, padding='same', activation=activation,
                      use_bias=False)(input_inception)
        conv_list.append(conv)
    max_pool = MaxPool1D(pool_size=3, strides=1, padding='same')(input_tensor)
    conv_6 = Conv1D(filters=filters, kernel_size=1, padding='same',
                      activation=activation, use_bias=False)(max_pool)
    conv_list.append(conv_6)
    x = Concatenate(axis=2)(conv_list)
    x = BatchNormalization()(x)
    x = Activation(activation='relu')(x)
    return x

def shortcut_layer(input_tensor, output_tensor):
    y = Conv1D(filters=int(output_tensor.shape[-1]), kernel_size=1,
               padding='same', use_bias=False)(input_tensor)
    y = BatchNormalization()(y)
    x = Add()([y, output_tensor])
    x = Activation(activation='relu')(x)
    return x

def build_model(input_shape, n_classes, depth=6,
                filters=32, kernel_size=40, bottleneck_size=32,
                use_residual=True):
    input_layer = Input(input_shape)
    x = input_layer
    input_res = input_layer
    for d in range(depth):
        x = inception_module(x, filters, kernel_size, bottleneck_size)
        if use_residual and d % 3 == 2:
            x = shortcut_layer(input_res, x)
            input_res = x
    gap_layer = GlobalAveragePooling1D()(x)
    output_layer = Dense(n_classes, activation="softmax")(gap_layer)
    model = keras.Model(input_layer, output_layer)
    return model

model = build_model(x_tr.shape[1:], n_classes)

model.compile(
    optimizer='adam',
    loss='categorical_crossentropy',
    metrics=['accuracy']
)
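To make the architecture defined by `build_model` easier to picture, the following sketch tabulates, in plain Python, the kernel sizes of each Inception module, the number of channels after concatenation, and the depths at which a residual shortcut is added. It only mirrors the arithmetic of the code above with its default arguments; no TensorFlow is required.

```python
depth, filters, kernel_size = 6, 32, 40   # defaults of build_model above

# Three parallel convolutions with successively halved kernel sizes.
kernel_sizes = [kernel_size // (2 ** i) for i in range(3)]

# Three conv branches plus the max-pool branch, each with `filters` channels,
# are concatenated along the channel axis.
channels_out = (len(kernel_sizes) + 1) * filters

# A shortcut is added after every third module (d % 3 == 2).
shortcut_depths = [d for d in range(depth) if d % 3 == 2]

print(kernel_sizes)      # [40, 20, 10]
print(channels_out)      # 128
print(shortcut_depths)   # [2, 5]
```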

Train the model:

ckpt_path = path.sep.join(['.', 'models', 'inception_wafer.h5'])

callbacks = [
    keras.callbacks.ReduceLROnPlateau(
        monitor='val_loss', factor=0.5, patience=20, min_lr=0.0001
    ),
    keras.callbacks.EarlyStopping(monitor='val_loss', patience=20, verbose=1),
    keras.callbacks.ModelCheckpoint(
        filepath=ckpt_path, monitor='val_loss', save_best_only=True
    )
]

batch_size = 32
epochs = 500

history = model.fit(x_tr, y_tr, batch_size, epochs, verbose='auto', shuffle=True, validation_split=0.2, callbacks=callbacks)
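For orientation, the sketch below works out what the `fit` call above implies for the Wafer training set: Keras holds out the final fraction of the samples given by `validation_split` for validation, and the remaining samples are processed in batches of `batch_size`. The figure of 1,000 training samples comes from the dataset description; the variable names are illustrative.

```python
import math

n_samples = 1000          # Wafer training set size
validation_percent = 20   # corresponds to validation_split=0.2
batch_size = 32

n_val = n_samples * validation_percent // 100
n_train = n_samples - n_val
steps_per_epoch = math.ceil(n_train / batch_size)

print(n_train, n_val, steps_per_epoch)   # 800 200 25
```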

Plot the training history:

metric = 'accuracy'
plt.figure(figsize=(10, 5))
plt.plot(history.history[metric])
plt.plot(history.history['val_'+metric])
plt.title("model " + metric)
plt.ylabel(metric, fontsize='large')
plt.xlabel('epoch', fontsize='large')
plt.legend(["train", "val"], loc="best")
plt.show()
plt.close()

Use the test data to verify the model's inference accuracy.

classifier = keras.models.load_model(ckpt_path)
test_loss, test_acc = classifier.evaluate(x_te, y_te)
print("Test accuracy: ", test_acc)
print("Test loss: ", test_loss)

193/193 [==============================] - 2s 11ms/step - loss: 0.0142 - accuracy: 0.9958
Test accuracy:  0.9957819581031799
Test loss:  0.014155667275190353

Our model achieved an accuracy of 99.58% on the Wafer test data.
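The reported accuracy can be translated back into sample counts. The sketch below assumes the test set size of 6,164 given in the dataset section and simply rounds accuracy × size; it also checks that 6,164 samples at the default `evaluate` batch size of 32 give the 193 steps seen in the log.

```python
import math

test_acc = 0.9957819581031799   # value printed above
n_test = 6164                   # Wafer test set size
eval_batch_size = 32            # Keras default for evaluate()

n_correct = round(test_acc * n_test)
n_wrong = n_test - n_correct
steps = math.ceil(n_test / eval_batch_size)

print(n_correct, n_wrong, steps)   # 6138 26 193
```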

Model Conversion

To run inference with OpenVINO Runtime, we need to convert the TensorFlow model to the OpenVINO IR format.

from pathlib import Path
from openvino.runtime import serialize  # needed to write the IR files below
from openvino.tools import mo
from tensorflow import keras

model = keras.models.load_model('models/inception_wafer.h5')

model_path = Path('models/inception.0_float')
model.save(model_path)

model_dir = Path("ov")
model_dir.mkdir(exist_ok=True)
ir_path = Path("ov/inception.xml")

input_shape = [1, 152, 1]

if not ir_path.exists():
    print("Exporting TensorFlow model to IR...")
    ov_model = mo.convert_model(saved_model_dir=model_path, input_shape=input_shape, compress_to_fp16=True)
    serialize(ov_model, ir_path)
else:
    print(f"IR model {ir_path} already exists.")

After the conversion, the IR model is stored as the model definition file inception.xml and the binary file inception.bin.

Model Deployment

Next, we deploy the InceptionTime model we just trained on the AIxBoard developer board. First, copy inception.bin, inception.xml, and Wafer_TEST.tsv to the AIxBoard.

Load the Python libraries.

from pathlib import Path
import numpy as np
from openvino.runtime import Core, serialize

Run the Inception model with OpenVINO.

ir_path = Path("inception.xml")
core = Core()
model = core.read_model(ir_path)

import ipywidgets as widgets

device = widgets.Dropdown(
    options=core.available_devices + ["AUTO"],
    value='AUTO',
    description='Device:',
    disabled=False
)

device

def readucr(filename, delimiter='\t'):
    data = np.loadtxt(filename, delimiter=delimiter)
    y = data[:, 0]
    x = data[:, 1:]
    y[y==-1] = 0
    return np.expand_dims(x, axis=2), y

X, y = readucr('Wafer_TEST.tsv')

compiled_model = core.compile_model(model, device_name=device.value)

input_key = compiled_model.input(0)
output_key = compiled_model.output(0)
network_input_shape = input_key.shape

counter = 0
for idx, i in enumerate(X):
    i = np.expand_dims(i, axis=0)
    r = compiled_model(i)[output_key]
    counter += 1 if r.argmax() == y[idx] else 0

print('{:.6f}'.format(counter/len(y)))

0.995782

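The inference accuracy with OpenVINO matches that of the TensorFlow model, again reaching 99.58%. During model conversion we compressed the original model's data format to FP16, and this did not reduce accuracy.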
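One way to build intuition for why FP16 compression is often harmless: half precision keeps 11 significant bits, so each value in its normal range is stored with a relative error of at most 2^-11 (about 0.05%). The sketch below round-trips a few made-up weight values through IEEE 754 half precision using Python's `struct` module. It illustrates the number format only and does not reproduce what the OpenVINO converter does internally.

```python
import struct

def to_fp16_and_back(x):
    """Round a Python float to IEEE 754 half precision and back."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

weights = [0.1234, -1.5, 0.003, 12.75]   # made-up example weights
for w in weights:
    w16 = to_fp16_and_back(w)
    rel_err = abs(w16 - w) / abs(w)
    print(f"{w:>8} -> {w16:<22} relative error {rel_err:.2e}")
```

Whether such small per-weight errors matter for a given model still has to be checked empirically, as done above by comparing the accuracy before and after conversion.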

Performance Testing

OpenVINO's bundled benchmark tool makes it easy to run performance tests on the AIxBoard.

benchmark_app -m inception.xml -hint latency -d CPU

[ INFO ] First inference took 8.59 ms
[Step 11/11] Dumping statistics report
[ INFO ] Execution Devices:['CPU']
[ INFO ] Count:            8683 iterations
[ INFO ] Duration:         60012.27 ms
[ INFO ] Latency:
[ INFO ]    Median:        6.44 ms
[ INFO ]    Average:       6.81 ms
[ INFO ]    Min:           6.34 ms
[ INFO ]    Max:           37.13 ms
[ INFO ] Throughput:   144.69 FPS

benchmark_app -m inception.xml -hint latency -d GPU

[ INFO ] First inference took 10.58 ms
[Step 11/11] Dumping statistics report
[ INFO ] Execution Devices:['GPU.0']
[ INFO ] Count:            7151 iterations
[ INFO ] Duration:         60026.34 ms
[ INFO ] Latency:
[ INFO ]    Median:        7.50 ms
[ INFO ]    Average:       8.23 ms
[ INFO ]    Min:           7.04 ms
[ INFO ]    Max:           21.78 ms
[ INFO ] Throughput:   119.13 FPS

The results above show that InceptionTime inference on the AIxBoard CPU takes 6.81 ms on average, while inference on the integrated GPU takes 8.23 ms on average.
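The throughput figures in the benchmark reports follow directly from the iteration counts and durations. The sketch below recomputes them from the logged values and also derives the GPU-to-CPU ratio of average latencies; the dictionary layout is just an illustrative way to hold the numbers.

```python
runs = {
    "CPU": {"count": 8683, "duration_ms": 60012.27, "avg_latency_ms": 6.81},
    "GPU": {"count": 7151, "duration_ms": 60026.34, "avg_latency_ms": 8.23},
}

throughput = {}
for name, r in runs.items():
    # Throughput = completed iterations per second of wall-clock time.
    throughput[name] = r["count"] / (r["duration_ms"] / 1000.0)
    print(f"{name}: {throughput[name]:.2f} FPS")   # CPU: 144.69 FPS, GPU: 119.13 FPS

ratio = runs["GPU"]["avg_latency_ms"] / runs["CPU"]["avg_latency_ms"]
print(f"GPU/CPU average latency: {ratio:.2f}x")    # 1.21x
```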

Summary

This article showed how to use TDengine as the underlying store for time-series data and how to train the InceptionTime classification model on the Wafer classification task from the UCR time-series dataset. Finally, we used OpenVINO to deploy the model on the AIxBoard developer board, achieving efficient real-time time-series classification. We hope this article helps you use TDengine, OpenVINO, and AIxBoard to solve more time-series analysis problems in your own projects.

About AIxBoard

The Intel Developer Kit AIxBoard is designed to support entry-level edge AI applications and devices, and is suited to AI learning, development, and training scenarios. The board is a Raspberry Pi-like x86 host that supports Linux Ubuntu and the full version of Windows. It carries a 4-core Intel processor running at up to 2.9 GHz with an integrated GPU (iGPU), 64 GB of onboard eMMC storage, and LPDDR4x 2933 MHz memory (4 GB/6 GB/8 GB). It has built-in Bluetooth and Wi-Fi modules and supports USB 3.0, HDMI video output, a 3.5 mm audio jack, and a 1000 Mbps Ethernet port, so it can be treated as a complete mini PC. It can also integrate an Arduino Leonardo microcontroller for attaching a variety of sensor modules. In addition, its interface is compatible with the Jetson Nano carrier board and its GPIO is compatible with the Raspberry Pi, so resources from the Raspberry Pi and Jetson Nano ecosystems can be reused as far as possible. Whether the workload is camera-based object recognition, 3D printing, or real-time CNC interpolation control, the board runs stably. It can serve as an edge computing engine for validating and developing AI products, or as a domain control core for developing robotics products.


About TDengine

At its core, TDengine is a high-performance, open-source, cloud-native time-series database (TSDB) with clustering support. It is designed and optimized for IoT, Industrial IoT, power, and IT operations scenarios, and offers strong elastic scalability. It also comes with built-in caching, stream processing, and data subscription, which greatly reduces system design complexity and lowers development and operating costs, making it a high-performance, distributed platform for IoT and industrial big data. TDengine is currently offered in two main editions: TDengine Enterprise, which supports private deployment, and TDengine Cloud, a fully managed cloud service platform for IoT and Industrial IoT. Both add further enhancements on top of the open-source time-series database TDengine OSS, and users can choose an edition according to their business scale and needs.

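About the Author

Feng Wei is a software architect at Intel with 16 years of software development experience spanning browsers, computer vision, virtual machines, and other areas. Having joined Intel in 2015, Feng has focused in recent years on edge computing, bringing deep learning models into production, and time-series data analysis.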
