Commit 0ba04d1
refactor!: Update the crawlers & storage clients structure (#828)
## Description
Update the directory structure of the crawlers & storage clients, as discussed
earlier on Slack.
I decided not to export anything at the 2nd level, because of the extras and
because the export list would be pretty huge (taking into account that the
models live there as well).
E.g. for the BeautifulSoup crawler:
```diff
- from crawlee.beautifulsoup_crawler import BeautifulSoupCrawler, BeautifulSoupCrawlingContext
+ from crawlee.crawlers import BeautifulSoupCrawler, BeautifulSoupCrawlingContext
```
Or for memory storage client:
```diff
- from crawlee.memory_storage_client import MemoryStorageClient
+ from crawlee.storage_clients import MemoryStorageClient
```
This should be generally better aligned with the concepts of Crawlee. It is,
of course, quite a breaking change, but better to do it now than later.
This will not be applied to the JS version, because there sub-packages like
`PlaywrightCrawler` are separate packages.
## Issue
- Closes: #764
## Breaking changes
### Crawlers & CrawlingContexts
- All crawler and crawling context classes have been consolidated into a
single sub-package called `crawlers`.
- The affected classes include: `AbstractHttpCrawler`,
`AbstractHttpParser`, `BasicCrawler`, `BasicCrawlerOptions`,
`BasicCrawlingContext`, `BeautifulSoupCrawler`,
`BeautifulSoupCrawlingContext`, `BeautifulSoupParserType`,
`ContextPipeline`, `HttpCrawler`, `HttpCrawlerOptions`,
`HttpCrawlingContext`, `HttpCrawlingResult`,
`ParsedHttpCrawlingContext`, `ParselCrawler`, `ParselCrawlingContext`,
`PlaywrightCrawler`, `PlaywrightCrawlingContext`,
`PlaywrightPreNavCrawlingContext`.
Example update:
```diff
- from crawlee.beautifulsoup_crawler import BeautifulSoupCrawler, BeautifulSoupCrawlingContext
+ from crawlee.crawlers import BeautifulSoupCrawler, BeautifulSoupCrawlingContext
```
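For code that has to run against versions from both before and after this change, one option is to try the new import location first and fall back to the old one. The following is only a minimal sketch: `import_first` is a hypothetical helper, not part of crawlee, and the module paths in the usage comment are the ones from the example above.

```python
from importlib import import_module
from typing import Any


def import_first(attr: str, *module_names: str) -> Any:
    """Return `attr` from the first listed module that provides it.

    Hypothetical helper for illustration; it is not part of crawlee.
    """
    errors = []
    for name in module_names:
        try:
            return getattr(import_module(name), attr)
        except (ImportError, AttributeError) as exc:
            # Remember why this candidate failed and try the next one.
            errors.append(f'{name}: {exc!r}')
    raise ImportError(f'{attr!r} not found in any of: ' + '; '.join(errors))


# Intended usage (requires crawlee with the relevant extra installed):
# BeautifulSoupCrawler = import_first(
#     'BeautifulSoupCrawler',
#     'crawlee.crawlers',               # new location
#     'crawlee.beautifulsoup_crawler',  # old location
# )
```

A plain `try`/`except ImportError` around the two `from ... import ...` lines does the same job and is friendlier to type checkers; the helper only saves repetition when many names are affected.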
### Storage clients
- All storage client classes have been moved into a single sub-package
called `storage_clients`.
- The affected classes include: `MemoryStorageClient`,
`BaseStorageClient`.
Example update:
```diff
- from crawlee.memory_storage_client import MemoryStorageClient
+ from crawlee.storage_clients import MemoryStorageClient
```
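For larger code bases, the import updates can be scripted. The sketch below rewrites `from ... import ...` lines using only the three renames shown in the examples of this description; `RENAMES` and `rewrite_imports` are hypothetical names, and any other pre-refactor module would need its own entry in the mapping.

```python
import re

# Old module path -> new module path, taken from the examples in this
# description. This mapping is an assumption-free subset, not a complete list.
RENAMES = {
    'crawlee.beautifulsoup_crawler': 'crawlee.crawlers',
    'crawlee.memory_storage_client': 'crawlee.storage_clients',
    'crawlee.http_clients.curl_impersonate': 'crawlee.http_clients',
}

# Match `from <old module> import` at the start of a line. The trailing
# `\s+import` ensures only the exact module path is rewritten.
_PATTERN = re.compile(
    r'^([ \t]*from[ \t]+)('
    + '|'.join(re.escape(name) for name in sorted(RENAMES, key=len, reverse=True))
    + r')([ \t]+import\b)',
    flags=re.MULTILINE,
)


def rewrite_imports(source: str) -> str:
    """Rewrite `from <old> import ...` lines to the new module paths."""
    return _PATTERN.sub(
        lambda m: m.group(1) + RENAMES[m.group(2)] + m.group(3),
        source,
    )
```

The function works on file contents as a string, so it can be applied with `Path.read_text()` and `Path.write_text()`. It deliberately leaves `import x.y` statements and imports from private submodules alone; those would need manual review.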
### CurlImpersonateHttpClient
- The `CurlImpersonateHttpClient` changed its import location.
Example update:
```diff
- from crawlee.http_clients.curl_impersonate import CurlImpersonateHttpClient
+ from crawlee.http_clients import CurlImpersonateHttpClient
```
1 parent c58e973 commit 0ba04d1
File tree
175 files changed, +479 -345 lines changed
- docs
- deployment/code/apify
- examples/code
- guides/code
- http_clients
- proxy_management
- request_storage
- result_storage
- scaling_crawlers
- introduction/code
- quick-start/code
- upgrading
- src/crawlee
- _utils
- base_storage_client
- basic_crawler
- browsers
- crawlers
- _abstract_http
- _basic
- _beautifulsoup
- _http
- _parsel
- _playwright
- http_clients
- memory_storage_client
- project_template/templates
- request_loaders
- storage_clients
- _base
- _memory
- storages
- templates
- beautifulsoup/{{cookiecutter.project_name}}/{{cookiecutter.project_name}}
- playwright/{{cookiecutter.project_name}}/{{cookiecutter.project_name}}
- tests/unit
- _utils
- crawlers
- _basic
- _beautifulsoup
- _http
- _parsel
- _playwright
- http_clients
- storage_clients/_memory