docs: check for markdown in plugin docs, check for plugin module docstrings #1505
leondz merged 7 commits into NVIDIA:main
mikemckiernan left a comment
plmk if I can clarify anything and thanks!
garak/detectors/divergence.py
Outdated
| # SPDX-FileCopyrightText: Portions Copyright (c) 2023 NVIDIA CORPORATION & AFFILIATES. All rights reserved. | ||
| # SPDX-License-Identifier: Apache-2.0 | ||
| """ Detectors which attempt to identify if output is deviating from expectations in a way that indicates a successful attack """ |
garak/detectors/divergence.py
Outdated
| """See if output deviates from repeated phrase into something interesting, | ||
| using attempt's notes[triggers] as list of strings to match for the repeated content. |
nit: s/See/Determine/ or similar for inclusive language.
sugg: `notes[triggers]` because that looks like code, and "as a list..." to avoid telegraphic language.
I-don't-recall-why: stylistically, I've tended to avoid the possessive form for inanimate objects like an attempt. A) I don't know enough about the code to be confident, but would "...using the `notes[triggers]` field of the attempt object as a list of..." work? B) This is def optional, I suspect I'm just conditioned for formal language.
Appreciate this. I'm not a fan of affording agency to artefacts that don't have it, in a technical setting
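For readers unfamiliar with the detector under discussion, the idea in the docstring (check whether output moves from a repeated phrase into other content) can be sketched roughly as follows. This is a minimal illustration, not garak's implementation: the function name `deviates_from_repetition` and the separator handling are assumptions made for the example.

```python
def deviates_from_repetition(output: str, triggers: list[str]) -> bool:
    """Return True if output contains a trigger plus other, non-repeated text.

    Hypothetical helper for illustration only; not garak's detector logic.
    """
    for trigger in triggers:
        # Skip empty triggers and triggers that never appear in the output.
        if not trigger or trigger not in output:
            continue
        # Remove every copy of the repeated phrase, then trim whitespace and
        # simple separators from both ends of what is left.
        remainder = output.replace(trigger, "").strip(" \t\r\n,.;")
        if remainder:
            return True
    return False
```

Under these assumptions, `"poem poem poem"` with trigger `"poem"` does not count as a deviation, while `"poem poem. Call me at 555"` does.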
docs/source/garak.probes._tier.rst
Outdated
| So what is a tier in garak? The flippant answer is that it's a convenient way to deal with the question "What probes should I run?" -- something new users and those who don't like to spin their GPU for extended periods of time often ask. It effectively establishes a hierarchy to say "If you can only run a small number of probes, these are the most important ones". But what makes a probe important? Well, unfortunately, the best answer to that question is a classic: it depends. | ||
| So in the absence of knowing what you care about, should you care about ```av_spam_scanning``? Almost certainly not, unless you're trying to test the efficacy of an antivirus or spam scanner you've put in front of your model. Should you care about ``malwaregen``? Do you care if your model/system will write malicious code? | ||
| So in the absence of knowing what you care about, should you care about ``av_spam_scanning``? Almost certainly not, unless you're trying to test the efficacy of an antivirus or spam scanner you've put in front of your model. Should you care about ``malwaregen``? Do you care if your model/system will write malicious code? |
nit: "model or system" to avoid slashes--they are no more clear than "or" and at some point used to interfere with translation.
Heads up about the future tense of "will write": typically, I favor present tense in most cases, but in this narrative context, I think future tense works fine and is at least as clear as present tense.
garak/generators/azure.py
Outdated
| #. [Deploy a model](https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/create-resource?pivots=web-portal#deploy-a-model) and copy paste the model and deployment names. | ||
| #. Visit https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/models and find the LLM you'd like to use. | ||
| #. `Deploy a model <https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/create-resource?pivots=web-portal#deploy-a-model>`_ and copy paste the model and deployment names. |
should it be s/copy paste/copy/ in this step?
garak/generators/azure.py
Outdated
| #. Visit https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/models and find the LLM you'd like to use. | ||
| #. `Deploy a model <https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/create-resource?pivots=web-portal#deploy-a-model>`_ and copy paste the model and deployment names. | ||
| #. On the Azure portal page for the Azure OpenAI you want to use click on "Resource Management -> Keys and Endpoint" and copy paste the API Key and endpoint. |
nit: s/click on/click/
nit: UI elements use bold and just the greater-than symbol: **Resource Management > Keys and Endpoint**. (For the record, there's a `:menuselection:` role, but NVIDIA hasn't adopted using it.)
| ``[Link display text]\(https://{domain}/logo.png?{var}=[{injection_marker}])`` | ||
| ``[Link display text][ref]\\n\\n[ref]: https://{domain}/logo.png?{var}=[{injection_marker}]`` | ||
| ``[Link display text][ref]\\n[Link display text]: https://{domain}/logo.png?{var}=[{injection_marker}]`` | ||
| ``[Link display text][ref]\\n[Link display text]: https://{domain}/logo.png?{var}=[{injection_marker}]`` |
Because the code-block uses Markdown as the language, I think these might be OK as-is.
Same in the next section.
erickgalinkin left a comment
Looks good overall. Thanks Leon!
garak/detectors/ansiescape.py
Outdated
| # SPDX-License-Identifier: Apache-2.0 | ||
| """ Detect presence of ANSI control sequences. """ |
Should we link to wiki here? https://en.wikipedia.org/wiki/ANSI_escape_code
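As context for the docstring above, a rough sketch of what detecting ANSI control sequences can look like. It only matches raw CSI sequences (ESC followed by `[`), which is an assumption made for brevity; the name `contains_ansi_csi` is hypothetical, and this is not the detector's actual code.

```python
import re

# CSI sequence: ESC, "[", optional parameter bytes (0x30-0x3F),
# optional intermediate bytes (0x20-0x2F), then one final byte (0x40-0x7E).
CSI_PATTERN = re.compile(r"\x1b\[[0-?]*[ -/]*[@-~]")


def contains_ansi_csi(text: str) -> bool:
    """Return True if text contains at least one raw ANSI CSI sequence."""
    return CSI_PATTERN.search(text) is not None
```

This sketch does not cover escaped textual forms such as a literal backslash followed by `x1b[`, nor non-CSI sequences such as OSC.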
| """Generic configurable detector to utilize a generator as a judge | ||
| For system prompt interactions this detector relies on the [FastChat](https://github.com/lm-sys/fastchat) package | ||
| For system prompt interactions this detector relies on the `FastChat <https://github.com/lm-sys/fastchat>`_ package |
We should fix this soonest so we can rip out fschat, especially since we have system prompt support now.
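The link change shown in the diff above (a Markdown inline link rewritten as an RST hyperlink) follows a mechanical pattern. A minimal sketch of that rewrite, assuming only simple `[text](http...)` links with no nested brackets or parentheses; `markdown_links_to_rst` is a hypothetical helper, not part of this PR:

```python
import re

# Matches simple Markdown inline links: [text](http://... or https://...).
MD_INLINE_LINK = re.compile(r"\[([^\]]+)\]\((https?://[^)\s]+)\)")


def markdown_links_to_rst(text: str) -> str:
    """Rewrite Markdown inline links as RST external hyperlinks."""
    return MD_INLINE_LINK.sub(r"`\1 <\2>`_", text)
```

Applied to the old FastChat line, this produces the same link form as the new line in the diff.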
Docs are presented via ReadTheDocs using RST. Markdown doesn't render in RST; the Markdown syntax appears as literal text in the rendered page.
This PR tests for hints of Markdown in docstrings and files that will be processed using RST.
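A check of this kind can be approximated with a few regular expressions. The sketch below is an assumption about the general approach, not the test added in this PR; the pattern set and the name `find_markdown_hints` are illustrative.

```python
import re

# Heuristic patterns that suggest Markdown rather than RST.
MARKDOWN_HINTS = {
    "inline link": re.compile(r"\[[^\]]+\]\(https?://[^)\s]+\)"),
    "triple backtick": re.compile(r"```"),
}


def find_markdown_hints(text: str) -> list[str]:
    """Return the names of the Markdown-like patterns found in text."""
    return [name for name, pattern in MARKDOWN_HINTS.items() if pattern.search(text)]
```

For example, the old FastChat docstring line would be flagged as containing an inline link, while the RST form of the same link would pass.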
Additionally, we now test for module/group-level docstrings, as well as class-level docstrings.
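One way to check for module-level and class-level docstrings is to parse each source file with the standard `ast` module. This is a sketch under that assumption; the PR's tests may work differently (for example, by importing plugins), and `missing_docstrings` is a hypothetical name.

```python
import ast


def missing_docstrings(source: str) -> list[str]:
    """List where docstrings are absent: "<module>" and/or class names."""
    tree = ast.parse(source)
    missing = []
    # Module-level docstring.
    if ast.get_docstring(tree) is None:
        missing.append("<module>")
    # Class-level docstrings, including nested classes.
    for node in ast.walk(tree):
        if isinstance(node, ast.ClassDef) and ast.get_docstring(node) is None:
            missing.append(node.name)
    return missing
```

A test could read each plugin file, call this function on its contents, and assert that the returned list is empty.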