_process_names in class DicomImageRedactorEngine unconditionally adds all DICOM metadata strings as PHI #1726

@atinm

Describe the bug
_process_names in class DicomImageRedactorEngine unconditionally adds all DICOM metadata strings as PHI

To Reproduce
Found by code inspection: the first line is phi_list = text_metadata.copy(), so every metadata value is already in the list, and the loop only extends it with extra name variants when is_name is True:

    def _process_names(cls, text_metadata: list, is_name: list) -> list:
        """Process names to have multiple iterations in our PHI list.

        :param text_metadata: List of all the instance's element values
        (excluding pixel data).
        :param is_name: True if the element is specified as being a name.

        :return: Metadata text with additional name iterations appended.
        """
        phi_list = text_metadata.copy()

        for i in range(0, len(text_metadata)):
            if is_name[i] is True:
                original_text = str(text_metadata[i])
                phi_list += cls.augment_word(original_text)

        return phi_list

A second issue is that MultiValue values are converted directly to strings, so you end up with a list that looks like ["['1000', '234566']", 'Joe'] instead of a flat list of strings: ['1000', '234566', 'Joe'].
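The stringification problem can be reproduced without pydicom. The sketch below is only an illustration: it uses a plain Python list as a stand-in for a MultiValue, assumes the conversion is a plain str() call, and the helper names naive_to_strings and flatten_to_strings are hypothetical, not part of the library.

```python
def naive_to_strings(values):
    """Convert each element with str(), as a direct conversion would."""
    return [str(v) for v in values]


def flatten_to_strings(values):
    """Expand list-like elements so each inner item becomes its own string."""
    flat = []
    for v in values:
        if isinstance(v, (list, tuple)):
            # Stand-in check; real code would test for pydicom's MultiValue.
            flat.extend(str(item) for item in v)
        else:
            flat.append(str(v))
    return flat


# A plain list stands in for a multi-valued DICOM element here.
metadata = [["1000", "234566"], "Joe"]

print(naive_to_strings(metadata))    # ["['1000', '234566']", 'Joe']
print(flatten_to_strings(metadata))  # ['1000', '234566', 'Joe']
```

In the real engine the isinstance check would presumably target pydicom.multival.MultiValue, which is a sequence type but not a list subclass.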

Expected behavior
_process_names should only add words where is_name is True and should handle MultiValue metadata fields correctly.
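One possible reading of this expected behavior is sketched below. It is not a proposed patch: process_names is a standalone function rather than the classmethod, augment_word is a hypothetical stand-in for the engine's own method, and plain lists again stand in for MultiValue.

```python
def augment_word(word):
    """Hypothetical stand-in for the engine's augment_word (case variants only)."""
    return [word, word.lower(), word.upper()]


def process_names(text_metadata, is_name):
    """Return name variants only for elements flagged as names.

    Multi-valued elements are expanded so that each inner value is
    augmented on its own instead of being stringified as one list.
    """
    phi_list = []
    for value, name_flag in zip(text_metadata, is_name):
        if not name_flag:
            continue  # non-name metadata is not added as PHI
        parts = value if isinstance(value, (list, tuple)) else [value]
        for part in parts:
            phi_list += augment_word(str(part))
    return phi_list


print(process_names([["1000", "234566"], "Joe"], [False, True]))
# ['Joe', 'joe', 'JOE']
```

Starting from an empty list rather than text_metadata.copy() is the key difference from the current code. Whether non-name elements that are still PHI (for example a patient ID) should be added elsewhere is left open here.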

Additional context
Too many words are being redacted when only PHI should be removed, and multi-value metadata strings are being added to the PHI list incorrectly.
