Observatory UI incorrectly detects data collections from any text mention #105

## Problem

The `_extract_collection_refs` method in `ui/app/dataloader.py` (lines 519-521) uses a simple substring check to detect which BERDL collections a project uses:

```python
def _extract_collection_refs(self, readme_content: str) -> list[str]:
    """Extract BERDL collection IDs mentioned in README text."""
    return [cid for cid in self._COLLECTION_IDS if cid in readme_content]
```

This scans the **entire concatenated text** of README.md, RESEARCH_PLAN.md, and REPORT.md (lines 453-458). Any mention of a collection ID, even in Future Directions, Literature Context, or a passing reference, causes that collection to appear as a "Data Collection" on the project's observatory page.
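The false positive can be reproduced outside the UI with a minimal sketch. The ID list and README text here are hypothetical stand-ins (`example_pangenome` is invented for illustration; the real `_COLLECTION_IDS` lives in `dataloader.py`), and the method is rewritten as a free function so the snippet runs on its own:

```python
# Hypothetical stand-in for _COLLECTION_IDS; the real list is in dataloader.py.
COLLECTION_IDS = ["example_pangenome", "kescience_fitnessbrowser"]


def extract_collection_refs(text: str) -> list[str]:
    """Same substring check as the current method, as a free function."""
    return [cid for cid in COLLECTION_IDS if cid in text]


# Illustrative README: one collection is a real data source,
# the other is only mentioned as future work.
readme = """# Example project

## Data Sources

- `example_pangenome`

## Future Directions

- Query `kescience_fitnessbrowser` for mutant fitness phenotypes.
"""

refs = extract_collection_refs(readme)
print(refs)  # both IDs are returned, including the one from Future Directions
```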

## Example

The `phb_granule_ecology` project mentioned `kescience_fitnessbrowser` only in a Future Directions bullet ("Query the BERDL Fitness Browser (`kescience_fitnessbrowser`) for phaC mutant fitness phenotypes..."). The observatory displayed Fitness Browser as a data source even though the project never queried it.

**Workaround applied**: Removed the backtick-quoted collection ID from the text (commit 49cb304).

## Suggested Fix

Instead of scanning all text, restrict collection detection to the **Data Sources** section of README.md or the **Data** section of REPORT.md. For example:

```python
import re

def _extract_collection_refs(self, all_text: str) -> list[str]:
    """Extract BERDL collection IDs listed in Data Sources / Data sections."""
    data_sections = []
    for section_name in ["Data Sources", "Data"]:
        # Capture the body of the first matching H2 section,
        # up to the next H2 heading or the end of the text.
        match = re.search(
            rf"^## {re.escape(section_name)}\s*$\n(.*?)(?=^## |\Z)",
            all_text, re.MULTILINE | re.DOTALL,
        )
        if match:
            data_sections.append(match.group(1))
    # Fall back to the full text when neither section exists.
    search_text = "\n".join(data_sections) if data_sections else all_text
    return [cid for cid in self._COLLECTION_IDS if cid in search_text]
```

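With this change, only collections listed under a data section are flagged; mentions elsewhere in the text are ignored. Projects that have neither section fall back to the current full-text scan, so they would still show passing mentions.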
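A quick way to check the intended behavior is to run the suggested logic as a free function against two small inputs. This is only a sketch: the ID list, the `example_pangenome` ID, and both sample texts are hypothetical, and the regex is copied from the suggestion above.

```python
import re

# Hypothetical stand-in for _COLLECTION_IDS; the real list is in dataloader.py.
COLLECTION_IDS = ["example_pangenome", "kescience_fitnessbrowser"]


def extract_collection_refs(all_text: str) -> list[str]:
    """Suggested section-scoped detection, as a free function."""
    data_sections = []
    for section_name in ["Data Sources", "Data"]:
        match = re.search(
            rf"^## {re.escape(section_name)}\s*$\n(.*?)(?=^## |\Z)",
            all_text, re.MULTILINE | re.DOTALL,
        )
        if match:
            data_sections.append(match.group(1))
    # Fall back to the full text when neither section exists.
    search_text = "\n".join(data_sections) if data_sections else all_text
    return [cid for cid in COLLECTION_IDS if cid in search_text]


# Text with a Data Sources section plus a passing mention elsewhere.
with_section = (
    "## Data Sources\n\n- `example_pangenome`\n\n"
    "## Future Directions\n\n- Query `kescience_fitnessbrowser` later.\n"
)
# Text with no data section at all.
no_section = "## Future Directions\n\n- Query `kescience_fitnessbrowser` later.\n"

scoped = extract_collection_refs(with_section)
fallback = extract_collection_refs(no_section)
print(scoped)    # only the ID listed under Data Sources
print(fallback)  # the full-text fallback still picks up the passing mention
```

The second case shows the trade-off in the fallback: returning an empty list when no data section is found would be stricter, at the cost of hiding collections for projects that do not use these headings.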

## Affected Code

- `ui/app/dataloader.py`, lines 197-210 (`_COLLECTION_IDS`), 453-458 (text concatenation), 519-521 (`_extract_collection_refs`)
