docs/content/en/working_with_findings/finding_deduplication/deduplication_tuning_os.md
30 additions & 3 deletions
@@ -103,15 +103,42 @@ Notes:
103
103
104
104
## After changing deduplication settings
105
105
106
-
- Changes to dedupe configuration (e.g., `HASHCODE_FIELDS_PER_SCANNER`, `HASH_CODE_FIELDS_ALWAYS`, `DEDUPLICATION_ALGORITHM_PER_PARSER`) trigger background processing via Celery.
107
-
- Hashes for findings of the affected test types are recalculated asynchronously; deduplication relationships can update over time.
108
-
- Allow some time after changes or imports before evaluating results, as updates are not instantaneous.
106
+
- Changes to dedupe configuration (e.g., `HASHCODE_FIELDS_PER_SCANNER`, `HASH_CODE_FIELDS_ALWAYS`, `DEDUPLICATION_ALGORITHM_PER_PARSER`) are not automatically applied to existing findings. To re-evaluate existing findings, you must run the management command below.
107
+
108
+
Run inside the uwsgi container. Example (hash codes only, no dedupe):
--parser PARSER List of parsers for which hash_code needs recomputing
118
+
(defaults to all parsers)
119
+
--hash_code_only Only compute hash codes
120
+
--dedupe_only Only run deduplication
121
+
--dedupe_sync Run dedupe in the foreground, default false
122
+
```
123
+
124
+
If you submit dedupe to Celery (without `--dedupe_sync`), allow time for tasks to complete before evaluating results.
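The flags listed above can be combined in a few ways, so a small helper that assembles the argument list may make the options easier to reason about. This is a minimal sketch, not part of the project: `build_dedupe_args` is a hypothetical helper, `<management-command>` is a placeholder for the command name (not shown in this excerpt), and the sketch assumes `--parser` can be repeated once per parser name.

```python
def build_dedupe_args(base_cmd, parsers=(), hash_code_only=False,
                      dedupe_only=False, dedupe_sync=False):
    """Return an argv list for the deduplication management command."""
    if hash_code_only and dedupe_only:
        # Sketch-level guard (an assumption, not documented behavior):
        # "only hashes" and "only dedupe" together is contradictory.
        raise ValueError("choose at most one of hash_code_only and dedupe_only")
    args = list(base_cmd)
    for parser in parsers:
        # Assumes --parser may be repeated, once per parser name.
        args += ["--parser", parser]
    if hash_code_only:
        args.append("--hash_code_only")
    if dedupe_only:
        args.append("--dedupe_only")
    if dedupe_sync:
        args.append("--dedupe_sync")
    return args


# "<management-command>" is a placeholder, not a real command name.
# "Example Scan" is a hypothetical parser name.
BASE = ["python", "manage.py", "<management-command>"]
example = build_dedupe_args(BASE, parsers=["Example Scan"], hash_code_only=True)
print(example)
```

Passing `dedupe_sync=True` corresponds to running deduplication in the foreground; leaving it off corresponds to the Celery case, where results should only be evaluated once the tasks have finished.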
109
125
110
126
## Where to configure
111
127
112
128
- Prefer environment variables in deployments. For local development or advanced overrides, use `local_settings.py`.
113
129
- See `configuration.md` for details on how to set environment variables and configure local overrides.
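As a rough illustration of a local override, the following sketch shows what a `local_settings.py` fragment might look like. It rests on several assumptions: that the file is evaluated after the base settings so the dictionaries already exist, that both settings are dictionaries keyed by parser name, and that `"Example Scan"` (a hypothetical parser name) and the chosen fields are only placeholders. Check `configuration.md` for the actual mechanism before relying on this.

```python
# local_settings.py (sketch). "Example Scan" is a hypothetical parser name.
# setdefault keeps any mapping already defined by the base settings and
# falls back to an empty dict when this file is run on its own.
HASHCODE_FIELDS_PER_SCANNER = globals().setdefault(
    "HASHCODE_FIELDS_PER_SCANNER", {}
)
DEDUPLICATION_ALGORITHM_PER_PARSER = globals().setdefault(
    "DEDUPLICATION_ALGORITHM_PER_PARSER", {}
)

# Assumed override: hash only on title and severity for this one parser.
HASHCODE_FIELDS_PER_SCANNER["Example Scan"] = ["title", "severity"]
# Assumed algorithm value; confirm the valid names in the algorithms docs.
DEDUPLICATION_ALGORITHM_PER_PARSER["Example Scan"] = "hash_code"
```

Updating a single key rather than reassigning the whole dictionary is a deliberate choice here, so that defaults for other parsers are left untouched.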
114
130
131
+
### Troubleshooting
132
+
133
+
To help troubleshoot deduplication, use the following tools:
134
+
135
+
- Observe log output in the `dojo.specific-loggers.deduplication` category. This is a class-independent logger that outputs details about the deduplication process and settings when processing findings.
136
+
- Observe the `unique_id_from_tool` and `hash_code` values by hovering over the `ID` field or `Status` column:
137
+
138
+

139
+
140
+

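To see output from the logger named above, its level has to be low enough for the messages to pass through. The following is a minimal sketch using only the Python standard library `logging` module; `enable_dedupe_debug_logging` is a hypothetical helper, and a real deployment would more likely adjust the level through the Django `LOGGING` setting.

```python
import logging

DEDUPE_LOGGER = "dojo.specific-loggers.deduplication"


def enable_dedupe_debug_logging(level=logging.DEBUG):
    """Lower the deduplication logger's level and make sure it has a handler."""
    logger = logging.getLogger(DEDUPE_LOGGER)
    logger.setLevel(level)
    if not logger.handlers:
        # Attach a stream handler only if none is configured yet,
        # so repeated calls do not duplicate output.
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger


dedupe_logger = enable_dedupe_debug_logging()
dedupe_logger.debug("deduplication debug logging enabled")
```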
141
+
115
142
## Related documentation
116
143
117
144
- [Deduplication Algorithms](deduplication_algorithms): conceptual overview and endpoint behavior.