@Inokinoki
Member

@Inokinoki Inokinoki commented Jun 29, 2023

Description

In an internal worker, artifacts are now logged to the local Giskard home, where the server can locate them and serve them to the frontend.

This is no longer a common case in Giskard 2.0, because only external workers can run models and log artifacts. However, it fixes inspections of projects inherited from the previous version, which have mlWorkerType == MLWorkerType.INTERNAL.
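
To illustrate the idea, here is a minimal sketch of the fallback, not the code in this PR. `StubClient`, `StubTunnel`, `store_artifact`, and the directory layout under the Giskard home are all hypothetical names chosen for the example; the only thing taken from the report is that an internal worker has no tunnel.

```python
import shutil
import tempfile
from pathlib import Path
from typing import List, Optional, Tuple


class StubClient:
    """Hypothetical stand-in for the client behind an external worker's tunnel."""

    def __init__(self) -> None:
        self.uploaded: List[Tuple[str, str]] = []

    def log_artifact(self, local_file: str, artifact_path: str) -> None:
        # A real client would upload the file to the server here.
        self.uploaded.append((local_file, artifact_path))


class StubTunnel:
    """Hypothetical tunnel object; an internal worker has none (tunnel is None)."""

    def __init__(self, client: StubClient) -> None:
        self.client = client


def store_artifact(
    tunnel: Optional[StubTunnel],
    local_file: Path,
    artifact_path: str,
    giskard_home: Path,
) -> Optional[Path]:
    """Upload through the tunnel if there is one, else copy into the local home.

    Returns the local destination when the file was copied, None when uploaded.
    """
    if tunnel is not None:
        tunnel.client.log_artifact(str(local_file), artifact_path)
        return None
    # Internal worker: no tunnel, so keep the artifact where the server can read it.
    target_dir = giskard_home / artifact_path
    target_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy(local_file, target_dir / local_file.name))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "predictions.csv"
        source.write_text("prediction\n0\n")
        home = Path(tmp) / "giskard-home"  # assumed location, for the demo only
        print(store_artifact(None, source, "artifacts/inspection-1", home))
```

Branching on `tunnel is None` keeps the external-worker path unchanged and only adds behaviour for the inherited internal-worker projects.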

In the frontend, dataset.categoryFeature can be null for reasons not yet identified, so the Inspector now checks it before use.

Related Issue

Type of Change

  • 📚 Examples / docs / tutorials / dependencies update
  • 🔧 Bug fix (non-breaking change which fixes an issue)
  • 🥂 Improvement (non-breaking change which improves an existing feature)
  • 🚀 New feature (non-breaking change which adds functionality)
  • 💥 Breaking change (fix or feature that would cause existing functionality to change)
  • 🔐 Security fix

Checklist

  • I've read the CODE_OF_CONDUCT.md document.
  • I've read the CONTRIBUTING.md guide.
  • I've updated the code style using make codestyle.
  • I've written tests for all new methods and classes that I created.
  • I've written the docstring in Google format for all the methods and classes that I used.

@Inokinoki Inokinoki requested a review from andreybavt June 29, 2023 10:35
@github-actions

Please add the 'safe for build' label in order to perform the sonar analysis!

@Inokinoki
Member Author

Do not hesitate to share any suggestions on either the code or the code style.

@Inokinoki Inokinoki changed the title from "Fix internal worker exception due to artifact logging" to "[GSK-1384] Fix internal worker exception due to artifact logging" Jun 30, 2023
@linear

linear bot commented Jun 30, 2023

GSK-1384 Internal worker exception during model inspection

Issue Type

Bug

Source

source

Giskard Library Version

2.0.0b10

Giskard Server Version

2.0.0b10

OS Platform and Distribution

macOS 13.4.1

Python version

3.9.6

Installed python packages

absl-py==1.4.0
aiohttp==3.8.4
aiosignal==1.3.1
alabaster==0.7.13
anyascii==0.3.2
anyio==3.7.0
appnope==0.1.3
argon2-cffi==21.3.0
argon2-cffi-bindings==21.2.0
arrow==1.2.3
astroid==2.15.5
asttokens==2.2.1
astunparse==1.6.3
async-lru==2.0.2
async-timeout==4.0.2
attrs==23.1.0
Babel==2.12.1
backcall==0.2.0
bandit==1.7.5
beautifulsoup4==4.12.2
bert-score==0.3.13
black==23.3.0
bleach==6.0.0
blinker==1.6.2
CacheControl==0.13.1
cachetools==5.3.1
catboost==1.2
certifi==2023.5.7
cffi==1.15.1
cfgv==3.3.1
chardet==5.1.0
charset-normalizer==3.1.0
click==8.1.3
cloudpickle==2.2.1
colorama==0.4.6
comm==0.1.3
contourpy==1.1.0
coverage==7.2.7
cycler==0.11.0
darglint==1.8.1
databricks-cli==0.17.7
dataclasses-json==0.5.8
datasets==2.13.0
debugpy==1.6.7
decorator==5.1.1
defusedxml==0.7.1
deptry==0.11.0
dill==0.3.6
distlib==0.3.6
docker==6.1.3
docutils==0.18.1
dparse==0.6.2
eli5==0.13.0
entrypoints==0.4
evaluate==0.4.0
exceptiongroup==1.1.1
execnet==1.9.0
executing==1.2.0
fastjsonschema==2.17.1
filelock==3.12.2
findpython==0.2.5
flatbuffers==1.12
fonttools==4.40.0
fqdn==1.5.1
frozenlist==1.3.3
fsspec==2023.6.0
furo==2023.5.20
gast==0.4.0
giskard==2.0.0b10
gitdb==4.0.10
GitPython==3.1.31
google-auth==2.20.0
google-auth-oauthlib==0.4.6
google-pasta==0.2.0
googleapis-common-protos==1.59.1
graphviz==0.20.1
grpcio==1.51.1
grpcio-status==1.48.2
grpcio-tools==1.48.2
h5py==3.8.0
httpretty==1.1.4
huggingface-hub==0.15.1
identify==2.5.24
idna==3.4
imagesize==1.4.1
imbalanced-learn==0.10.1
importlib-metadata==6.6.0
importlib-resources==5.12.0
iniconfig==2.0.0
installer==0.7.0
ipykernel==6.23.2
ipython==8.12.2
ipython-genutils==0.2.0
ipywidgets==8.0.6
isoduration==20.11.0
isort==5.12.0
jedi==0.18.2
Jinja2==3.1.2
joblib==1.2.0
json5==0.9.14
jsonpointer==2.4
jsonschema==4.17.3
jupyter==1.0.0
jupyter_client==8.2.0
jupyter-console==6.6.3
jupyter_core==5.3.1
jupyter-events==0.6.3
jupyter-lsp==2.2.0
jupyter_server==2.6.0
jupyter_server_terminals==0.4.4
jupyterlab==4.0.2
jupyterlab-pygments==0.2.2
jupyterlab_server==2.23.0
jupyterlab-widgets==3.0.7
keras==2.9.0
Keras-Preprocessing==1.1.2
kiwisolver==1.4.4
langchain==0.0.202
langchainplus-sdk==0.0.10
langdetect==1.0.9
lazy-object-proxy==1.9.0
libclang==16.0.0
lightgbm==3.3.5
livereload==2.6.3
llvmlite==0.40.1rc1
lockfile==0.12.2
Markdown==3.4.3
markdown-it-py==3.0.0
MarkupSafe==2.1.3
marshmallow==3.19.0
marshmallow-enum==1.5.1
matplotlib==3.7.1
matplotlib-inline==0.1.6
mccabe==0.7.0
mdit-py-plugins==0.4.0
mdurl==0.1.2
mistune==2.0.5
mixpanel==4.10.0
mlflow-skinny==2.4.1
mpmath==1.3.0
msgpack==1.0.5
multidict==6.0.4
multiprocess==0.70.14
mypy==1.3.0
mypy-extensions==1.0.0
mypy-protobuf==3.3.0
myst-parser==2.0.0
nbclassic==1.0.0
nbclient==0.8.0
nbconvert==7.5.0
nbformat==5.9.0
nbsphinx==0.9.2
nest-asyncio==1.5.6
networkx==3.1
nltk==3.8.1
nodeenv==1.8.0
notebook==6.5.4
notebook_shim==0.2.3
numba==0.57.0
numexpr==2.8.4
numpy==1.23.5
oauthlib==3.2.2
openapi-schema-pydantic==1.2.4
opt-einsum==3.3.0
overrides==7.3.1
packaging==23.1
pandas==1.5.3
pandocfilters==1.5.0
parso==0.8.3
pathspec==0.11.1
pbr==5.11.1
pdm==2.7.2
pexpect==4.8.0
pickleshare==0.7.5
Pillow==9.5.0
pip==23.1.2
platformdirs==3.5.3
plotly==5.15.0
pluggy==1.0.0
pockets==0.9.1
portalocker==2.7.0
pre-commit==3.3.3
prometheus-client==0.17.0
prompt-toolkit==3.0.38
protobuf==3.19.6
psutil==5.9.5
ptyprocess==0.7.0
pure-eval==0.2.2
pyarrow==12.0.1
pyasn1==0.5.0
pyasn1-modules==0.3.0
pycparser==2.21
pycryptodome==3.18.0
pydantic==1.10.9
pydocstyle==6.3.0
Pygments==2.15.1
PyJWT==2.7.0
pylint==2.17.4
pyngrok==6.0.0
pyparsing==3.0.9
pyproject_hooks==1.0.0
pyrsistent==0.19.3
pytest==7.3.2
pytest-cov==4.1.0
pytest-xdist==3.3.1
python-daemon==2.3.2
python-dateutil==2.8.2
python-dotenv==1.0.0
python-json-logger==2.0.7
pytz==2023.3
pyupgrade==3.6.0
PyYAML==6.0
pyzmq==25.1.0
qtconsole==5.4.3
QtPy==2.3.1
regex==2023.6.3
requests==2.31.0
requests-mock==1.11.0
requests-oauthlib==1.3.1
requests-toolbelt==1.0.0
resolvelib==1.0.1
responses==0.18.0
rfc3339-validator==0.1.4
rfc3986-validator==0.1.1
rich==13.4.2
rsa==4.9
ruamel.yaml==0.17.32
ruamel.yaml.clib==0.2.7
ruff==0.0.272
safety==2.3.4
scikit-learn==1.0.2
scipy==1.8.1
Send2Trash==1.8.2
setuptools==67.8.0
shap==0.41.0
shellingham==1.5.0.post1
six==1.16.0
slicer==0.0.7
smmap==5.0.0
sniffio==1.3.0
snowballstemmer==2.2.0
soupsieve==2.4.1
Sphinx==6.2.1
sphinx-autoapi==2.1.1
sphinx-autobuild==2021.3.14
sphinx-basic-ng==1.0.0b1
sphinx-click==4.4.0
sphinx-copybutton==0.5.2
sphinx_design==0.4.1
sphinx-rtd-theme==1.2.2
sphinx-tabs==3.4.1
sphinxcontrib-applehelp==1.0.4
sphinxcontrib-devhelp==1.0.2
sphinxcontrib-htmlhelp==2.0.1
sphinxcontrib-jquery==4.1
sphinxcontrib-jsmath==1.0.1
sphinxcontrib-napoleon==0.7
sphinxcontrib-qthelp==1.0.3
sphinxcontrib-serializinghtml==1.1.5
SQLAlchemy==2.0.16
sqlparse==0.4.4
stack-data==0.6.2
stevedore==5.1.0
sympy==1.12
tabulate==0.9.0
tenacity==8.2.2
tensorboard==2.9.1
tensorboard-data-server==0.6.1
tensorboard-plugin-wit==1.8.1
tensorflow-estimator==2.9.0
tensorflow-hub==0.13.0
tensorflow-macos==2.9.2
termcolor==2.3.0
terminado==0.17.1
threadpoolctl==3.1.0
tinycss2==1.2.1
tokenize-rt==5.1.0
tokenizers==0.13.3
toml==0.10.2
tomli==2.0.1
tomlkit==0.11.8
torch==2.0.1
torchdata==0.6.1
torchtext==0.15.2
tornado==6.3.2
tqdm==4.65.0
traitlets==5.9.0
transformers==4.29.2
types-protobuf==4.23.0.1
typing_extensions==4.6.3
typing-inspect==0.9.0
unearth==0.9.1
uri-template==1.2.0
urllib3==1.26.16
virtualenv==20.23.1
wcwidth==0.2.6
webcolors==1.13
webencodings==0.5.1
websocket-client==1.6.0
Werkzeug==2.3.6
wheel==0.40.0
widgetsnbextension==4.0.7
wrapt==1.15.0
xgboost==1.7.5
xxhash==3.2.0
yarl==1.9.2
zipp==3.15.0
zstandard==0.21.0

Current Behaviour?

A bug happened when inspecting/debugging a model:

'NoneType' object has no attribute 'client'
AttributeError
Traceback (most recent call last):
  File "/Users/***/Builds/giskard/python-client/giskard/ml_worker/utils/request_interceptor.py", line 50, in wrapper
    res = await loop.run_in_executor(pool, behavior, request, context)
  File "/Applications/Xcode.app/Contents/Developer/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/concurrent/futures/thread.py", line 52, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/Users/***/Builds/giskard/python-client/giskard/ml_worker/server/ml_worker_service.py", line 432, in runModel
    self.ml_worker.tunnel.client.log_artifact(
AttributeError: 'NoneType' object has no attribute 'client'
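
The traceback above boils down to an attribute access on `None`. A minimal sketch that reproduces the same message in isolation, assuming only that an internal worker's `tunnel` is `None` (the `SimpleNamespace` object is a hypothetical stand-in for the worker):

```python
from types import SimpleNamespace

# Hypothetical stand-in for an internal worker: it has no tunnel.
ml_worker = SimpleNamespace(tunnel=None)

try:
    # Same access chain as in runModel, minus the actual call.
    ml_worker.tunnel.client
except AttributeError as exc:
    message = str(exc)

print(message)
```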


Standalone code OR list down the steps to reproduce the issue

Worker related issue:
1. Create a project that would be run by the internal worker (`mlWorkerType == MLWorkerType.INTERNAL`).
2. Upload the models and the datasets.
3. Inspect/Debug a dataset on a model.
4. Get an exception from the internal worker.

Relevant log output

2023-06-30 10:19:49,283 pid:2142 MainThread giskard.commands.cli_worker INFO     Starting ML Worker server
2023-06-30 10:19:49,283 pid:2142 MainThread giskard.commands.cli_worker INFO     Python: /Users/***/Builds/giskard/python-client/.venv/bin/python3.9 (3.9.6)
2023-06-30 10:19:49,283 pid:2142 MainThread giskard.commands.cli_worker INFO     Giskard Home: /Users/***/giskard-home
2023-06-30 10:19:54,135 pid:2142 MainThread giskard.ml_worker.ml_worker INFO     Started ML Worker server on localhost:50051
2023-06-30 10:21:52,661 pid:2142 ml_worker_thread_0 giskard.ml_worker.server.ml_worker_service INFO     Collecting ML Worker info
2023-06-30 10:22:14,166 pid:2142 ml_worker_thread_0 giskard.ml_worker.server.ml_worker_service INFO     Collecting ML Worker info
2023-06-30 10:24:39,087 pid:2142 ml_worker_thread_0 giskard.datasets.base INFO     Casting dataframe columns from {'account_check_status': 'object', 'duration_in_month': 'int64', 'credit_history': 'object', 'purpose': 'object', 'credit_amount': 'int64', 'savings': 'object', 'present_employment_since': 'object', 'installment_as_income_perc': 'int64', 'sex': 'object', 'personal_status': 'object', 'other_debtors': 'object', 'present_residence_since': 'int64', 'property': 'object', 'age': 'int64', 'other_installment_plans': 'object', 'housing': 'object', 'credits_this_bank': 'int64', 'job': 'object', 'people_under_maintenance': 'int64', 'telephone': 'object', 'foreign_worker': 'object', 'default': 'object'} to {'account_check_status': 'object', 'age': 'int64', 'credit_amount': 'int64', 'credit_history': 'object', 'credits_this_bank': 'int64', 'default': 'object', 'duration_in_month': 'int64', 'foreign_worker': 'object', 'housing': 'object', 'installment_as_income_perc': 'int64', 'job': 'object', 'other_debtors': 'object', 'other_installment_plans': 'object', 'people_under_maintenance': 'int64', 'personal_status': 'object', 'present_employment_since': 'object', 'present_residence_since': 'int64', 'property': 'object', 'purpose': 'object', 'savings': 'object', 'sex': 'object', 'telephone': 'object'}
Feature 'people_under_maintenance' is declared as 'numeric' but has 2 (<= category_threshold=2) distinct values. Are you sure it is not a 'category' feature?
2023-06-30 10:24:39,422 pid:2142 ml_worker_thread_0 giskard.datasets.base INFO     Your 'pandas.DataFrame' is successfully wrapped by Giskard's 'Dataset' wrapper class.
2023-06-30 10:24:39,432 pid:2142 ml_worker_thread_0 giskard.datasets.base INFO     Your 'pandas.DataFrame' is successfully wrapped by Giskard's 'Dataset' wrapper class.
2023-06-30 10:27:19,831 pid:2142 ml_worker_thread_1 giskard.datasets.base INFO     Casting dataframe columns from {'account_check_status': 'object', 'duration_in_month': 'int64', 'credit_history': 'object', 'purpose': 'object', 'credit_amount': 'int64', 'savings': 'object', 'present_employment_since': 'object', 'installment_as_income_perc': 'int64', 'sex': 'object', 'personal_status': 'object', 'other_debtors': 'object', 'present_residence_since': 'int64', 'property': 'object', 'age': 'int64', 'other_installment_plans': 'object', 'housing': 'object', 'credits_this_bank': 'int64', 'job': 'object', 'people_under_maintenance': 'int64', 'telephone': 'object', 'foreign_worker': 'object', 'default': 'object'} to {'account_check_status': 'object', 'age': 'int64', 'credit_amount': 'int64', 'credit_history': 'object', 'credits_this_bank': 'int64', 'default': 'object', 'duration_in_month': 'int64', 'foreign_worker': 'object', 'housing': 'object', 'installment_as_income_perc': 'int64', 'job': 'object', 'other_debtors': 'object', 'other_installment_plans': 'object', 'people_under_maintenance': 'int64', 'personal_status': 'object', 'present_employment_since': 'object', 'present_residence_since': 'int64', 'property': 'object', 'purpose': 'object', 'savings': 'object', 'sex': 'object', 'telephone': 'object'}
Feature 'people_under_maintenance' is declared as 'numeric' but has 2 (<= category_threshold=2) distinct values. Are you sure it is not a 'category' feature?
2023-06-30 10:27:19,997 pid:2142 ml_worker_thread_1 giskard.datasets.base INFO     Your 'pandas.DataFrame' is successfully wrapped by Giskard's 'Dataset' wrapper class.
2023-06-30 10:27:19,999 pid:2142 ml_worker_thread_1 giskard.datasets.base INFO     Your 'pandas.DataFrame' is successfully wrapped by Giskard's 'Dataset' wrapper class.
2023-06-30 10:29:55,958 pid:2142 ml_worker_thread_1 giskard.datasets.base INFO     Casting dataframe columns from {'account_check_status': 'object', 'duration_in_month': 'int64', 'credit_history': 'object', 'purpose': 'object', 'credit_amount': 'int64', 'savings': 'object', 'present_employment_since': 'object', 'installment_as_income_perc': 'int64', 'sex': 'object', 'personal_status': 'object', 'other_debtors': 'object', 'present_residence_since': 'int64', 'property': 'object', 'age': 'int64', 'other_installment_plans': 'object', 'housing': 'object', 'credits_this_bank': 'int64', 'job': 'object', 'people_under_maintenance': 'int64', 'telephone': 'object', 'foreign_worker': 'object', 'default': 'object'} to {'account_check_status': 'object', 'age': 'int64', 'credit_amount': 'int64', 'credit_history': 'object', 'credits_this_bank': 'int64', 'default': 'object', 'duration_in_month': 'int64', 'foreign_worker': 'object', 'housing': 'object', 'installment_as_income_perc': 'int64', 'job': 'object', 'other_debtors': 'object', 'other_installment_plans': 'object', 'people_under_maintenance': 'int64', 'personal_status': 'object', 'present_employment_since': 'object', 'present_residence_since': 'int64', 'property': 'object', 'purpose': 'object', 'savings': 'object', 'sex': 'object', 'telephone': 'object'}
Feature 'people_under_maintenance' is declared as 'numeric' but has 2 (<= category_threshold=2) distinct values. Are you sure it is not a 'category' feature?
2023-06-30 10:29:56,057 pid:2142 ml_worker_thread_1 giskard.datasets.base INFO     Your 'pandas.DataFrame' is successfully wrapped by Giskard's 'Dataset' wrapper class.
2023-06-30 10:29:56,119 pid:2142 ml_worker_thread_1 giskard.datasets.base INFO     Your 'pandas.DataFrame' is successfully wrapped by Giskard's 'Dataset' wrapper class.
2023-06-30 10:29:56,122 pid:2142 ml_worker_thread_1 giskard.datasets.base INFO     Your 'pandas.DataFrame' is successfully wrapped by Giskard's 'Dataset' wrapper class.
2023-06-30 10:29:56,127 pid:2142 ml_worker_thread_1 giskard.datasets.base INFO     Casting dataframe columns from {'account_check_status': 'object', 'duration_in_month': 'int64', 'credit_history': 'object', 'purpose': 'object', 'credit_amount': 'int64', 'savings': 'object', 'present_employment_since': 'object', 'installment_as_income_perc': 'int64', 'sex': 'object', 'personal_status': 'object', 'other_debtors': 'object', 'present_residence_since': 'int64', 'property': 'object', 'age': 'int64', 'other_installment_plans': 'object', 'housing': 'object', 'credits_this_bank': 'int64', 'job': 'object', 'people_under_maintenance': 'int64', 'telephone': 'object', 'foreign_worker': 'object'} to {'account_check_status': 'object', 'duration_in_month': 'int64', 'credit_history': 'object', 'purpose': 'object', 'credit_amount': 'int64', 'savings': 'object', 'present_employment_since': 'object', 'installment_as_income_perc': 'int64', 'sex': 'object', 'personal_status': 'object', 'other_debtors': 'object', 'present_residence_since': 'int64', 'property': 'object', 'age': 'int64', 'other_installment_plans': 'object', 'housing': 'object', 'credits_this_bank': 'int64', 'job': 'object', 'people_under_maintenance': 'int64', 'telephone': 'object', 'foreign_worker': 'object'}
2023-06-30 10:29:56,131 pid:2142 ml_worker_thread_1 giskard.ml_worker.utils.logging INFO     Predicted dataset with shape (200, 22) executed in 0:00:00.072331
2023-06-30 10:29:56,169 pid:2142 MainThread giskard.ml_worker.utils.request_interceptor ERROR    'NoneType' object has no attribute 'client'
Traceback (most recent call last):
  File "/Users/***/Builds/giskard/python-client/giskard/ml_worker/utils/request_interceptor.py", line 50, in wrapper
    res = await loop.run_in_executor(pool, behavior, request, context)
  File "/Applications/Xcode.app/Contents/Developer/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/concurrent/futures/thread.py", line 52, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/Users/***/Builds/giskard/python-client/giskard/ml_worker/server/ml_worker_service.py", line 432, in runModel
    self.ml_worker.tunnel.client.log_artifact(
AttributeError: 'NoneType' object has no attribute 'client'

@Inokinoki
Member Author

Inokinoki commented Jun 30, 2023

Fixes #1214 and provides a partial workaround for #1215.
The origin of the null in #1215 still needs to be investigated.

@Inokinoki Inokinoki added the "feature" and "bug" (Something isn't working) labels and removed the "feature" label Jun 30, 2023
@Inokinoki
Member Author

Fix DB migration problem from DB connection.

@sonarqubecloud

Kudos, SonarCloud Quality Gate passed!

Bugs: 0 (rating A)
Vulnerabilities: 0 (rating A)
Security Hotspots: 0 (rating A)
Code Smells: 0 (rating A)

Coverage: 0.0%
Duplication: 0.0%

Labels

bug Something isn't working safe for build

4 participants