Add search_people_with_past_company tool for advanced people filtering #205

guykwan wants to merge 2 commits into stickerdaniel:main
Conversation
**File placed in wrong directory — tool never registered**
Path: `tools/person.py`, line 1 (`@@ -0,0 +1,347 @@`)

This file is added to `tools/person.py` at the repository root, but the MCP server imports from `linkedin_mcp_server.tools.person` (see `linkedin_mcp_server/server.py`, line 20):

```python
from linkedin_mcp_server.tools.person import register_person_tools
```

The module that is actually loaded and registered is `linkedin_mcp_server/tools/person.py`. The new file at `tools/person.py` is never imported by anything, so `search_people_with_past_company` will never be registered as an MCP tool and is completely dead code. The new tool and helper functions need to be added to `linkedin_mcp_server/tools/person.py` instead.
**Wrong keyword argument name causes `TypeError` at runtime**
Path: `tools/person.py`, lines 218–222

`scrape_person` is defined with the parameter name `requested` (see `linkedin_mcp_server/scraping/extractor.py`, line 254):

```python
async def scrape_person(self, username: str, requested: set[str]) -> dict[str, Any]:
```

Calling it with the keyword argument `requested_sections` will raise `TypeError: scrape_person() got an unexpected keyword argument 'requested_sections'` at runtime, causing every profile check to fail.

Suggested change:

```python
profile_result = await extractor.scrape_person(
    username, requested={"experience"}
)
```
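The failure mode described above can be reproduced in isolation. A minimal sketch, using a hypothetical `scrape_person` stub that mirrors only the signature quoted in the review (the stub body is illustrative, not the project's actual code):

```python
import asyncio
from typing import Any

# Hypothetical stub with the same parameter name as the real method ("requested")
async def scrape_person(username: str, requested: set[str]) -> dict[str, Any]:
    return {"url": f"https://www.linkedin.com/in/{username}/", "sections": {}}

async def main() -> None:
    # Correct keyword name: works
    ok = await scrape_person("jane-doe", requested={"experience"})
    print(ok["url"])

    # The PR's spelling: the call fails before any scraping happens
    try:
        await scrape_person("jane-doe", requested_sections={"experience"})
    except TypeError as e:
        print(type(e).__name__)  # TypeError

asyncio.run(main())
```

Because Python binds keyword arguments when the coroutine function is called, the `TypeError` is raised immediately, not when the coroutine is awaited.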
**URL extraction from `innerText` will always return an empty list**
Path: `tools/person.py`, lines 196–198

`extractor.search_people()` calls `extract_page()`, which returns `main.innerText` — plain text with no HTML markup. LinkedIn profile URLs (e.g. `https://www.linkedin.com/in/username`) are rendered as hyperlinks in the DOM, not printed as visible text, so they never appear in the `innerText` string. As a result, `_extract_profile_urls` always returns `[]`, the second-step filtering never runs, and the function always returns zero matches.

To reliably extract profile URLs, the extractor would need to read `href` attributes directly from the DOM (similar to how `_extract_job_ids` does it via `page.evaluate`) rather than parsing plain text.
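The distinction the reviewer draws can be demonstrated without a browser. A minimal sketch with sample markup (the HTML snippet and regex are illustrative, not the project's actual code); the real fix would collect `href` attributes via `page.evaluate`:

```python
import re
from html.parser import HTMLParser

SAMPLE_HTML = '<main><a href="https://www.linkedin.com/in/jane-doe/">Jane Doe</a> Founder</main>'
INNER_TEXT = "Jane Doe Founder"  # what main.innerText yields: the URL is not visible text

# Scanning the visible text for profile URLs finds nothing
assert re.findall(r"linkedin\.com/in/[\w\-]+", INNER_TEXT) == []

# Reading href attributes from the markup does find the profile
class HrefCollector(HTMLParser):
    def __init__(self) -> None:
        super().__init__()
        self.hrefs: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and "/in/" in value:
                    self.hrefs.append(value)

p = HrefCollector()
p.feed(SAMPLE_HTML)
print(p.hrefs)  # ['https://www.linkedin.com/in/jane-doe/']
```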
**`profile_result.get("username")` always returns `None`**
Path: `tools/person.py`, lines 270–272

`scrape_person` returns `{"url": ..., "sections": ...}` — there is no `"username"` key in its return dict. This means every profile in `matching_profiles` and `partial_matches` will have `"username": None`, making it impossible for callers to look up or identify the matching profiles.

Suggested change:

```python
"username": url.split("/in/")[-1].rstrip("/") if url else None,
```

Or, more cleanly, pass the `username` variable (already extracted on line 210) into `_parse_profile_for_filters`.
**Non-deterministic URL ordering from `set()` deduplication**
Path: `tools/person.py`, lines 290–294

`_extract_profile_urls` returns `[f"https://linkedin.com/in/{username}" for username in set(matches)]`. The `set` conversion removes duplicates but destroys the original ordering from the search results page (where LinkedIn orders results by relevance), so each call may iterate profiles in a different order and produce inconsistent results. Use `dict.fromkeys` to preserve insertion order while deduplicating:

```python
seen = dict.fromkeys(matches)
return [f"https://linkedin.com/in/{username}" for username in seen]
```
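The behavioral difference is easy to see in isolation. A short sketch with illustrative usernames:

```python
matches = ["alice", "bob", "alice", "carol", "bob"]

# dict.fromkeys deduplicates while keeping first-seen order
# (dicts preserve insertion order since Python 3.7)
ordered = list(dict.fromkeys(matches))
print(ordered)  # ['alice', 'bob', 'carol']

# Applied as in the suggested fix
urls = [f"https://linkedin.com/in/{u}" for u in dict.fromkeys(matches)]
print(urls[0])  # https://linkedin.com/in/alice
```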
**Non-English inline comment**
Path: `tools/person.py`, line 124

The comment `# 更长超时,因为需要获取多个档案` ("longer timeout, because multiple profiles need to be fetched") is in Chinese, while the rest of the codebase uses English exclusively for comments and documentation. Please translate it to keep the codebase consistent:

```python
timeout=TOOL_TIMEOUT_SECONDS * 3,  # Longer timeout because multiple profiles need to be fetched
```
**`import re` inside function body**
Path: `tools/person.py`, lines 287–295

`re` is imported inside both `_extract_profile_urls` (line 288) and `_extract_username_from_url` (line 298). While Python caches module imports, the convention in this codebase (and in Python generally) is to place all imports at the top of the module. Move `import re` to the module-level imports alongside `import asyncio` and `import logging`.
Summary
This PR introduces a new tool, `search_people_with_past_company`, that enables advanced people search with filtering by past companies and current job titles.

New Feature: `search_people_with_past_company`

Use Cases

Parameters

- `keywords` (required): Search keywords (e.g., "founder", "CEO")
- `location` (optional): Location filter (e.g., "Beijing", "San Francisco")
- `past_companies` (optional): Comma-separated company names (e.g., "Alibaba,ByteDance,Tencent")
- `current_title` (optional): Current job title filter (e.g., "founder", "CEO")
- `max_results` (optional): Maximum results (default: 10)

Example

```shell
mcporter call linkedin.search_people_with_past_company \
  keywords="founder" \
  location="Beijing" \
  past_companies="Alibaba,ByteDance" \
  current_title="founder"
```

Implementation

Changes

- `search_people_with_past_company()` tool
- `asyncio` import

Testing

Related

Useful for talent acquisition, investment research, and competitive intelligence.
Greptile Summary
This PR introduces a `search_people_with_past_company` tool that performs a two-step search: first fetching LinkedIn people search results, then iterating over each profile to filter by past company and current title. Unfortunately, the implementation has several blocking issues that prevent it from functioning at all.

Key issues found:

- The tool lives in `tools/person.py` (repository root) instead of `linkedin_mcp_server/tools/person.py` (the actual module). The server only imports from `linkedin_mcp_server.tools.person`, so the new tool is never registered.
- `extractor.scrape_person(username, requested_sections={"experience"})` uses a non-existent parameter name — the actual parameter is `requested`. This raises a `TypeError` on every profile fetch.
- `_extract_profile_urls` searches for full `https://linkedin.com/in/...` URLs inside `innerText`, but `extract_page` returns plain text (no HTML). Profile URLs appear only in `href` attributes, never as visible text, so this function always returns an empty list.
- `username` field always `None`: `scrape_person` returns `{"url": ..., "sections": ...}` — no `"username"` key — so every matched profile's `username` field will be `None`.
- `set()` in `_extract_profile_urls` loses LinkedIn's relevance-ranked ordering.
- `import re` style issues.

Confidence Score: 1/5

- `tools/person.py` — all changes are in this single file, which needs to be moved to `linkedin_mcp_server/tools/person.py` and the logic bugs fixed before any of the new functionality can work.

Important Files Changed

- `tools/person.py` — Added in the wrong directory (`tools/` instead of `linkedin_mcp_server/tools/`), making the new tool completely unreachable. Contains multiple critical bugs: wrong keyword argument name on `scrape_person`, URL extraction from `innerText` that will always return empty, and `username` always being `None` in output.

Sequence Diagram

```mermaid
sequenceDiagram
    participant Client
    participant MCP as MCP Server
    participant Tool as search_people_with_past_company
    participant Extractor as LinkedInExtractor
    Client->>MCP: call search_people_with_past_company(keywords, location, past_companies, current_title)
    MCP->>Tool: invoke
    Tool->>Extractor: search_people(keywords, location)
    Extractor-->>Tool: {url, sections: {search_results: innerText}}
    Note over Tool: _extract_profile_urls(innerText)<br/>⚠️ Always returns [] — URLs not in plain text
    loop For each profile URL (up to max_results × 3)
        Tool->>Extractor: scrape_person(username, requested_sections={"experience"})<br/>⚠️ TypeError: wrong kwarg name (should be 'requested')
        Extractor-->>Tool: {url, sections: {experience: text}}
        Note over Tool: _parse_profile_for_filters()<br/>profile_result.get("username") → None always
        alt matches_all
            Tool->>Tool: append to matching_profiles
        else matches_partial
            Tool->>Tool: append to partial_matches
        end
        Note over Tool: asyncio.sleep(1.5)
    end
    Tool-->>Client: {search_url, total_checked, filters, matching_profiles, partial_matches}
```

Last reviewed commit: 0192ab1