
feat: batch scrape #150

Draft
zizzfizzix wants to merge 29 commits into main from 33-allow-scraping-a-list-of-pages

Conversation

@zizzfizzix
Owner

No description provided.

zizzfizzix linked an issue on Dec 28, 2025 that may be closed by this pull request
Signed-off-by: Kuba Serafinowski <kuba.serafinowski@gmail.com>
zizzfizzix force-pushed the 33-allow-scraping-a-list-of-pages branch from 7c941dc to 5a3f97c on December 28, 2025 13:36
Signed-off-by: Kuba Serafinowski <kuba.serafinowski@gmail.com>
…tistics handling to avoid polling

Signed-off-by: Kuba Serafinowski <kuba.serafinowski@gmail.com>
- Added a pause signal mechanism to allow immediate response when a batch scrape is paused.
- Updated batch scrape completion logic to account for paused state.
- Modified the startBatchScrape and resumeBatchScrape functions to provide immediate feedback to the UI without waiting for completion.
- Enhanced error handling for asynchronous operations related to batch scraping.

Signed-off-by: Kuba Serafinowski <kuba.serafinowski@gmail.com>
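
The pause signal described in the commit above could look roughly like the following. This is a minimal sketch, not the code from this PR: `PauseSignal`, `runBatch`, and the `scrape` callback are hypothetical names, and the real `startBatchScrape` and `resumeBatchScrape` may be structured differently. The idea is to race each in-flight scrape against a promise that resolves when the batch is paused, so the loop reacts right away instead of waiting for the current URL to finish.

```typescript
// Hypothetical sketch: names and shapes are assumptions, not taken from the PR.
type BatchStatus = 'running' | 'paused' | 'completed'

class PauseSignal {
  private paused = false
  private waiters: Array<() => void> = []

  /** Mark the batch as paused and wake everything waiting on the signal. */
  pause(): void {
    this.paused = true
    for (const wake of this.waiters) wake()
    this.waiters = []
  }

  get isPaused(): boolean {
    return this.paused
  }

  /** Resolves as soon as pause() is called (immediately if already paused). */
  whenPaused(): Promise<'paused'> {
    if (this.paused) return Promise.resolve('paused' as const)
    return new Promise<'paused'>((resolve) => {
      this.waiters.push(() => resolve('paused'))
    })
  }
}

async function runBatch(
  urls: string[],
  scrape: (url: string) => Promise<void>,
  signal: PauseSignal,
): Promise<{ status: BatchStatus; processed: string[] }> {
  const processed: string[] = []
  for (const url of urls) {
    if (signal.isPaused) return { status: 'paused', processed }
    const work = scrape(url).then(() => 'done' as const)
    // Avoid an unhandled rejection if the scrape fails after a pause won the race.
    work.catch(() => {})
    // Race the scrape against the pause signal so a pause is noticed promptly.
    const outcome = await Promise.race([work, signal.whenPaused()])
    if (outcome === 'paused') return { status: 'paused', processed }
    processed.push(url)
  }
  return { status: 'completed', processed }
}
```

To give the UI immediate feedback, a caller would typically not await the whole run: something like `void runBatch(urls, scrape, signal).catch(console.error)` returns control at once, and errors from the background work are still reported.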
- Introduced a new BatchSettingsDialog component for configuring batch settings.
- Updated BatchActionButtons to conditionally display the settings button based on batch status.
- Refactored BatchSettings to separate form controls and added a reset button option.

Signed-off-by: Kuba Serafinowski <kuba.serafinowski@gmail.com>
…y params

Signed-off-by: Kuba Serafinowski <kuba.serafinowski@gmail.com>
- Added a feature flag for batch scraping, allowing conditional rendering of related UI components.
- Integrated feature flag checks in the settings and side panel to manage batch scrape visibility.
- Updated context menu and background handlers to respect the batch scrape feature flag.
- Enhanced settings to allow users to toggle batch scrape options and manage overrides.

Signed-off-by: Kuba Serafinowski <kuba.serafinowski@gmail.com>
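
One plausible shape for the feature flag check with user overrides is sketched below. The flag name `batchScrape`, the `FlagSettings` shape, and the `visibleMenuItems` helper are assumptions made for illustration; the PR does not show how flags are actually stored or resolved.

```typescript
// Hypothetical sketch: flag names and data shapes are assumptions.
type FeatureFlag = 'batchScrape'

interface FlagSettings {
  /** Defaults shipped with the build. */
  defaults: Record<FeatureFlag, boolean>
  /** Optional per-user overrides; a missing key means "use the default". */
  overrides: Partial<Record<FeatureFlag, boolean>>
}

/** An explicit override wins; otherwise fall back to the default. */
function isFeatureEnabled(flag: FeatureFlag, settings: FlagSettings): boolean {
  return settings.overrides[flag] ?? settings.defaults[flag]
}

interface MenuItem {
  id: string
  /** Flag that must be enabled for the item to be shown, if any. */
  requires?: FeatureFlag
}

/** Keep only the items whose required flag (if any) is enabled. */
function visibleMenuItems(items: MenuItem[], settings: FlagSettings): MenuItem[] {
  return items.filter(
    (item) => item.requires === undefined || isFeatureEnabled(item.requires, settings),
  )
}
```

Routing the settings page, side panel, context menu, and background handlers through one helper such as `isFeatureEnabled` keeps the flag from being interpreted differently in each place.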
- Added a new ButtonGroup component for better button organization in the UI.
- Updated BatchActionButtons to include settings management and starting functionality.
- Introduced UrlProgressTable for displaying live URL scraping progress with filtering options.
- Refactored ConfigForm to support recent selectors and conditional button visibility.
- Integrated visual picker functionality for selecting main selectors during batch scraping.
- Removed deprecated BatchConfig and BatchSettingsComponent components to streamline the codebase.

Signed-off-by: Kuba Serafinowski <kuba.serafinowski@gmail.com>
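
The filtering behind a table like UrlProgressTable can be kept as a pure function, separate from rendering. The sketch below is an assumption about how that might be done: the `UrlProgressRow` shape, the status values, and `filterRows` are hypothetical names, not the component's actual props.

```typescript
// Hypothetical sketch: row shape and status values are assumptions.
type UrlStatus = 'pending' | 'running' | 'completed' | 'failed'
type StatusFilter = UrlStatus | 'all'

interface UrlProgressRow {
  url: string
  status: UrlStatus
  error?: string
}

/** Filter rows by status and by a case-insensitive substring of the URL. */
function filterRows(
  rows: UrlProgressRow[],
  filter: StatusFilter,
  search: string = '',
): UrlProgressRow[] {
  const needle = search.trim().toLowerCase()
  return rows.filter(
    (row) =>
      (filter === 'all' || row.status === filter) &&
      (needle === '' || row.url.toLowerCase().includes(needle)),
  )
}
```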
- Removed redundant completion checks from startBatchScrape function.
- Moved batch completion logic to updateBatchStatistics to ensure accurate status updates when all URLs are processed.
- Enhanced handling of batch status to account for both normal and paused states.

Signed-off-by: Kuba Serafinowski <kuba.serafinowski@gmail.com>
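
Deriving completion from the statistics, as the commit above describes, might look like this. It is a sketch under assumptions: `computeStats` and `nextBatchStatus` are hypothetical helpers, and the actual `updateBatchStatistics` may track different states or counters.

```typescript
// Hypothetical sketch: state names and helpers are assumptions.
type UrlState = 'pending' | 'running' | 'completed' | 'failed'
type BatchStatus = 'running' | 'paused' | 'completed'

interface BatchStats {
  total: number
  pending: number
  running: number
  completed: number
  failed: number
}

/** Count URLs per state. */
function computeStats(states: UrlState[]): BatchStats {
  const stats: BatchStats = { total: states.length, pending: 0, running: 0, completed: 0, failed: 0 }
  for (const state of states) stats[state] += 1
  return stats
}

/**
 * A batch is complete once every URL has either completed or failed,
 * whether it was running or paused at that moment. Otherwise the
 * current status is kept.
 */
function nextBatchStatus(current: BatchStatus, stats: BatchStats): BatchStatus {
  const allProcessed = stats.total > 0 && stats.completed + stats.failed === stats.total
  return allProcessed ? 'completed' : current
}
```

Keeping the decision next to the statistics update means every path that changes a URL's state (normal run, pause, resume) reaches the same completion check.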
…e screens

- Removed the navigateToDuplicate function and replaced it with URL generation for duplicating batches.
- Updated DuplicateBatchButton to use getDuplicateBatchUrl for navigation.
- Refactored BatchScrapeApp and BatchScrapeHistoryApp to utilize centralized URL functions for batch history and new batch creation.
- Introduced batch-urls utility for consistent URL management across components.

Signed-off-by: Kuba Serafinowski <kuba.serafinowski@gmail.com>
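
A centralized URL helper in the spirit of the batch-urls utility could be as small as the sketch below. Only the name `getDuplicateBatchUrl` appears in the commit message; the page name `app.html`, the hash-based paths, the `duplicate` query parameter, and the other function names are assumptions for illustration.

```typescript
// Hypothetical sketch: page name, paths, and parameter names are assumptions.
const APP_PAGE = 'app.html'

/** Build an in-extension URL with a hash path and optional query parameters. */
function buildAppUrl(
  path: string,
  params: Record<string, string | number | undefined> = {},
): string {
  const query = new URLSearchParams()
  for (const [key, value] of Object.entries(params)) {
    if (value !== undefined) query.set(key, String(value))
  }
  const qs = query.toString()
  return `${APP_PAGE}#${path}${qs ? `?${qs}` : ''}`
}

function getBatchHistoryUrl(): string {
  return buildAppUrl('/batch/history')
}

function getNewBatchUrl(): string {
  return buildAppUrl('/batch/new')
}

/** URL that opens the "new batch" screen pre-filled from an existing batch. */
function getDuplicateBatchUrl(batchId: string): string {
  return buildAppUrl('/batch/new', { duplicate: batchId })
}
```

Returning a URL string instead of navigating lets a component such as DuplicateBatchButton render a plain link, which also supports opening in a new tab.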
- Added tabId prop to ConfigForm for session storage linking.
- Updated BatchScrapeApp to load configuration from sidepanel session storage using tab ID.
- Refactored URL generation in batch-urls utility to include loading from tab ID.
- Removed unused batch scrape message handlers from background UI handlers.

Signed-off-by: Kuba Serafinowski <kuba.serafinowski@gmail.com>
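
Linking side panel state to a tab usually comes down to namespacing the storage key by tab ID. The sketch below assumes a small `SessionStore` interface loosely modeled on a promise-based key/value session store; the key format, the `BatchConfigDraft` shape, and all function names are hypothetical, and an in-memory store stands in for the real one.

```typescript
// Hypothetical sketch: the store interface, key format, and config shape are assumptions.
interface SessionStore {
  get(key: string): Promise<Record<string, unknown>>
  set(items: Record<string, unknown>): Promise<void>
}

interface BatchConfigDraft {
  mainSelector: string
  urls: string[]
}

/** One storage key per tab, so drafts from different tabs never collide. */
const configKey = (tabId: number): string => `sidepanel-config:${tabId}`

async function saveConfigForTab(
  store: SessionStore,
  tabId: number,
  config: BatchConfigDraft,
): Promise<void> {
  await store.set({ [configKey(tabId)]: config })
}

/** Returns undefined when nothing was stored for this tab. */
async function loadConfigForTab(
  store: SessionStore,
  tabId: number,
): Promise<BatchConfigDraft | undefined> {
  const key = configKey(tabId)
  const items = await store.get(key)
  return items[key] as BatchConfigDraft | undefined
}

/** In-memory stand-in for the real session storage, used for demonstration. */
function createMemoryStore(): SessionStore {
  const data = new Map<string, unknown>()
  return {
    async get(key) {
      return data.has(key) ? { [key]: data.get(key) } : {}
    },
    async set(items) {
      for (const [key, value] of Object.entries(items)) data.set(key, value)
    },
  }
}
```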
… picker

- Added state management for the origin tab ID to return focus after closing the visual picker tab.
- Updated closePicker and openPicker functions to handle tab focus correctly.
- Improved error handling for tab operations to ensure robustness.

Signed-off-by: Kuba Serafinowski <kuba.serafinowski@gmail.com>
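
Returning focus to the originating tab can be sketched as follows. The `TabsApi` interface here is a hand-written stand-in with `create`, `update`, and `remove` operations, and `PickerSession` is a hypothetical class; the PR's `openPicker` and `closePicker` may hold this state differently. Tab operations are wrapped in `try`/`catch` because either tab may already have been closed by the user.

```typescript
// Hypothetical sketch: TabsApi and PickerSession are assumptions for illustration.
interface TabsApi {
  create(props: { url: string }): Promise<{ id?: number }>
  update(tabId: number, props: { active: boolean }): Promise<unknown>
  remove(tabId: number): Promise<void>
}

class PickerSession {
  private tabs: TabsApi
  private originTabId: number | null = null
  private pickerTabId: number | null = null

  constructor(tabs: TabsApi) {
    this.tabs = tabs
  }

  /** Remember where the user came from, then open the picker in a new tab. */
  async openPicker(url: string, originTabId: number): Promise<void> {
    this.originTabId = originTabId
    const tab = await this.tabs.create({ url })
    this.pickerTabId = tab.id ?? null
  }

  /** Close the picker tab and re-activate the origin tab, tolerating failures. */
  async closePicker(): Promise<void> {
    const pickerTabId = this.pickerTabId
    const originTabId = this.originTabId
    this.pickerTabId = null
    this.originTabId = null
    try {
      if (pickerTabId !== null) await this.tabs.remove(pickerTabId)
    } catch {
      // The picker tab may already be closed; nothing to do.
    }
    try {
      if (originTabId !== null) await this.tabs.update(originTabId, { active: true })
    } catch {
      // The origin tab may no longer exist; focus stays where the browser puts it.
    }
  }
}
```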
- Implemented a reset button in BatchSettingsDialog to restore default settings.
- Added logic to check if current settings are default in BatchSettingsDialog.
- Removed reset button from BatchSettingsForm to streamline the component.
- Updated imports to include necessary icons and constants for new functionality.

Signed-off-by: Kuba Serafinowski <kuba.serafinowski@gmail.com>
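
The "are the current settings default?" check that drives the reset button could be a field-by-field comparison against a defaults constant. In this sketch the `BatchSettings` fields and their default values are invented for illustration; the PR does not list the actual settings.

```typescript
// Hypothetical sketch: the settings fields and default values are assumptions.
interface BatchSettings {
  maxConcurrency: number
  delayBetweenRequestsMs: number
  maxRetries: number
}

const DEFAULT_BATCH_SETTINGS: BatchSettings = {
  maxConcurrency: 3,
  delayBetweenRequestsMs: 1000,
  maxRetries: 2,
}

/** True when every field matches its default (all fields are primitives here). */
function isDefaultSettings(settings: BatchSettings): boolean {
  const keys = Object.keys(DEFAULT_BATCH_SETTINGS) as Array<keyof BatchSettings>
  return keys.every((key) => settings[key] === DEFAULT_BATCH_SETTINGS[key])
}

/** Return a fresh copy so callers cannot mutate the shared defaults. */
function resetSettings(): BatchSettings {
  return { ...DEFAULT_BATCH_SETTINGS }
}
```

A dialog could then disable or hide its reset button whenever `isDefaultSettings(current)` is true.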
- Added a new entry point for the app with a centralized HTML file and main TypeScript file.
- Implemented routing using @tanstack/react-router for navigation between different app pages.
- Created new routes for onboarding, scrapes, and data views, enhancing the user experience.
- Refactored URL generation for navigation to utilize centralized utility functions.
- Removed deprecated batch scrape components and their associated files to streamline the codebase.

Signed-off-by: Kuba Serafinowski <kuba.serafinowski@gmail.com>

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Allow scraping a list of pages

1 participant