Open
Labels
question: Further information is requested
Description
You must follow the issue template and provide as much information as possible. Otherwise, this issue will be closed.
Please fill in the information required by the issue template. Issues that do not follow the template will be ignored and closed.
Check List
Thanks for considering opening an issue. Before you submit your issue, please confirm these boxes are checked.
You can post pictures, but if specific text or code is required to reproduce the issue, please provide the text in a plain text format for easy copy/paste.
- [x] I have searched in existing issues but did not find the same one.
- [x] I have read the documents
Environment
- Debian 11
- Python 3.6.8
- requirements.txt:
cudatoolkit 10.0.130 0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
cudnn 7.6.5 cuda10.0_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
kashgari 1.1.5 pypi_0 pypi
keras 2.3.1 pypi_0 pypi
keras-applications 1.0.8 py_1 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
keras-bert 0.89.0 pypi_0 pypi
keras-embed-sim 0.10.0 pypi_0 pypi
keras-gpt-2 0.17.0 pypi_0 pypi
keras-layer-normalization 0.16.0 pypi_0 pypi
keras-multi-head 0.29.0 pypi_0 pypi
keras-pos-embd 0.13.0 pypi_0 pypi
keras-position-wise-feed-forward 0.8.0 pypi_0 pypi
keras-preprocessing 1.1.2 pyhd3eb1b0_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
keras-self-attention 0.51.0 pypi_0 pypi
keras-transformer 0.40.0 pypi_0 pypi
numpy 1.16.4 pypi_0 pypi
numpy-base 1.19.2 py36hfa32c7d_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
tensorboard 1.14.0 py36hf484d3e_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
tensorboard-plugin-wit 1.8.1 pypi_0 pypi
tensorflow 1.14.0 gpu_py36h57aa796_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
tensorflow-addons 0.9.1 pypi_0 pypi
tensorflow-estimator 1.14.0 py_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
tensorflow-gpu 1.14.0 pypi_0 pypi
Output of nvidia-smi:
Tue Jan 10 11:08:55 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.161.03   Driver Version: 470.161.03   CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  On   | 00000000:01:00.0  On |                  N/A |
|  0%   39C    P8    19W / 220W |    568MiB /  7979MiB |      1%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1288      G   /usr/lib/xorg/Xorg                223MiB |
|    0   N/A  N/A      1402      G   /usr/bin/gnome-shell               71MiB |
|    0   N/A  N/A      1698      G   ...b/firefox-esr/firefox-esr      171MiB |
|    0   N/A  N/A      1969      G   ...b/firefox-esr/firefox-esr        3MiB |
|    0   N/A  N/A      2217      G   ...RendererForSitePerProcess       90MiB |
|    0   N/A  N/A      9821      G   ...b/firefox-esr/firefox-esr        3MiB |
+-----------------------------------------------------------------------------+
My model:
import pandas as pd
import kashgari
from kashgari.embeddings import BERTEmbedding
from kashgari.tasks.classification import BiLSTM_Model
import numpy
import os
BERT_PATH = r'/chinese_L-12_H-768_A-12'
# Initialize the embeddings
embed = BERTEmbedding(BERT_PATH,
                      task=kashgari.CLASSIFICATION,
                      sequence_length=64, layer_nums=4)
tokenizer = embed.tokenizer
df = pd.read_excel('data.xlsx')
# Tokenize the text
df['cutted'] = df['review'].apply(lambda x: tokenizer.tokenize(x))
df["label"] = df['label'].astype("str")
# Prepare the training, validation, and test sets
train_x = list(df['cutted'][:int(len(df)*0.7)])
train_y = list(df['label'][:int(len(df)*0.7)])
valid_x = list(df['cutted'][int(len(df)*0.7):int(len(df)*0.85)])
valid_y = list(df['label'][int(len(df)*0.7):int(len(df)*0.85)])
test_x = list(df['cutted'][int(len(df)*0.85):])
test_y = list(df['label'][int(len(df)*0.85):])
# Initialize the model with the embedding
model = BiLSTM_Model(embed)
# Train for only one epoch at first
model.fit(train_x, train_y, valid_x, valid_y, batch_size=12, epochs=1)
model.evaluate(test_x, test_y, batch_size=12)
Question
When I train the model, I get the following error:
Traceback (most recent call last):
  File "train_model.py", line 41, in <module>
    model.fit(train_x, train_y, valid_x, valid_y, batch_size=12, epochs=1)
  File "/home/interstellar/.conda/envs/bert/lib/python3.6/site-packages/kashgari/tasks/base_model.py", line 321, in fit
    **fit_kwargs)
  File "/home/interstellar/.conda/envs/bert/lib/python3.6/site-packages/tensorflow/python/keras/engine/training.py", line 1433, in fit_generator
    steps_name='steps_per_epoch')
  File "/home/interstellar/.conda/envs/bert/lib/python3.6/site-packages/tensorflow/python/keras/engine/training_generator.py", line 264, in model_iteration
    batch_outs = batch_function(*batch_data)
  File "/home/interstellar/.conda/envs/bert/lib/python3.6/site-packages/tensorflow/python/keras/engine/training.py", line 1175, in train_on_batch
    outputs = self.train_function(ins) # pylint: disable=not-callable
  File "/home/interstellar/.conda/envs/bert/lib/python3.6/site-packages/tensorflow/python/keras/backend.py", line 3292, in __call__
    run_metadata=self.run_metadata)
  File "/home/interstellar/.conda/envs/bert/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1458, in __call__
    run_metadata_ptr)
tensorflow.python.framework.errors_impl.InternalError: 2 root error(s) found.
  (0) Internal: Blas xGEMMBatched launch failed : a.shape=[144,64,64], b.shape=[144,64,64], m=64, n=64, k=64, batch_size=144
    [[{{node Encoder-1-MultiHeadSelfAttention/Encoder-1-MultiHeadSelfAttention-Attention/MatMul}}]]
    [[metrics/acc/Identity/_1711]]
  (1) Internal: Blas xGEMMBatched launch failed : a.shape=[144,64,64], b.shape=[144,64,64], m=64, n=64, k=64, batch_size=144
    [[{{node Encoder-1-MultiHeadSelfAttention/Encoder-1-MultiHeadSelfAttention-Attention/MatMul}}]]
0 successful operations.
0 derived errors ignored.
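For anyone decoding the shapes in the error message, here is a minimal sketch of where `a.shape=[144,64,64]` probably comes from. It assumes the attention layer folds the heads into the batch dimension, and it reads the hidden size (768) and head count (12) from the checkpoint name `chinese_L-12_H-768_A-12`. The function name `attention_matmul_shapes` is hypothetical and only used for illustration; it is not part of kashgari or keras-bert.

```python
def attention_matmul_shapes(batch_size, seq_len, hidden_size, num_heads):
    """Return the operand shapes of the batched Q x K^T matmul.

    Assumption: the heads are merged into the batch dimension, so each
    operand has shape (batch_size * num_heads, seq_len, head_dim).
    """
    head_dim = hidden_size // num_heads       # 768 // 12 = 64
    batched = batch_size * num_heads          # 12 * 12 = 144
    a_shape = (batched, seq_len, head_dim)    # queries
    b_shape = (batched, seq_len, head_dim)    # keys (transposed inside the matmul)
    return a_shape, b_shape


# Values taken from the script above: batch_size=12, sequence_length=64.
a_shape, b_shape = attention_matmul_shapes(
    batch_size=12, seq_len=64, hidden_size=768, num_heads=12
)
print(a_shape, b_shape)
```

Under these assumptions the reported `batch_size=144` is the training batch of 12 multiplied by 12 attention heads, and `m`, `n`, and `k` are all 64 because the sequence length and the per-head dimension happen to coincide. The tensors involved are small, so the shapes themselves look unremarkable.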