Added Functionality to Scrape Instructions
Add donna hay
Added Functionality to Scrape Ingredients
Thanks all! And apologies for taking a while here; I plan to review this within the next 24h or so.
jayaddison
left a comment
This is looking pretty good! I have two requests after reading through the code:
- Could we try retrieving the recipe title from one of the other elements on the page, and filtering out the pipe (|) and subsequent content from that? I think that would make for more readable recipe titles.
- To confirm that the ingredient_groups functionality works as expected, could we add another test case for a recipe that involves ingredient groupings?
Can do, I'll use https://www.donnahay.com.au/recipes/snacks-and-sides/smoky-eggplant-dip-with-hand-cut-potato-chips as the target if that works?
Sounds good - thanks, @a1831319!
Update based on feedback
Additional tests for testing the ingredient groups
Resolved, thank you @a1831319 @mlduff!
This isn't completely resolved yet - could we use the HTML |
Retrieve recipe names from title element
  def title(self):
-     return self.soup.find("h1", class_="recipe-title__mobile").text
+     return self.soup.find("title").text.split("|")[0].strip()
- return self.soup.find("title").text.split("|")[0].strip()
+ html_title = self.soup.find("title")
+ recipe_title, _, _ = html_title.text.partition("|")
+ return recipe_title.strip()
Edit: call str.partition instead of str.rpartition
Ah, nope, not quite correct: rpartition would leave the first element (the recipe title here) empty when | is not found in the string.
(updated/fixed to use str.partition instead)
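To illustrate the difference discussed above, here is a minimal sketch comparing str.partition and str.rpartition on a page title. The helper name strip_site_suffix and the sample titles are hypothetical, chosen only for illustration; they are not taken from the scraper or the site.

```python
def strip_site_suffix(html_title: str) -> str:
    """Return the text before the first "|" in a page title, trimmed.

    Hypothetical helper for illustration; not part of the scraper itself.
    """
    recipe_title, _, _ = html_title.partition("|")
    return recipe_title.strip()


# Sample titles are made up for this example.
with_pipe = "Smoky Eggplant Dip | Donna Hay"
without_pipe = "Smoky Eggplant Dip"

# partition keeps the whole string in the FIRST element when "|" is absent,
# so the title survives either way.
print(with_pipe.partition("|"))      # ('Smoky Eggplant Dip ', '|', ' Donna Hay')
print(without_pipe.partition("|"))   # ('Smoky Eggplant Dip', '', '')

# rpartition puts the whole string in the LAST element when "|" is absent,
# so unpacking the first element would yield an empty title.
print(without_pipe.rpartition("|"))  # ('', '', 'Smoky Eggplant Dip')

print(strip_site_suffix(with_pipe))     # Smoky Eggplant Dip
print(strip_site_suffix(without_pipe))  # Smoky Eggplant Dip
```

Both split("|")[0] and partition("|")[0] give the same title here; partition simply avoids building a list of every segment when the title contains several pipes.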
jayaddison
left a comment
Looks good to me! Thank you @a1831319 @heathrampazis @mlduff @Mooree003!
Ready to merge once the merge conflict in __init__.py is resolved; the str.partition usage suggestion is optional.
# mypy: allow-untyped-defs
This pull request is generally ready I think - just some merge conflicts to resolve.
There's a small cleanup opportunity here too - after #1174 we don't need these allow-untyped-defs mypy directives, so this can be removed from the file header.
@@ -0,0 +1,62 @@
{
Please note: we've begun checking for a preferred ordering of the JSON key names (not alphabetical; more like priority/review-aid based).
After merging recent changes into your branch, one of the unit tests may begin complaining about the JSON files because of that. There is however a script provided that can automatically fix them -- running python scripts/reorder_json_keys.py should do that for you.
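For readers unfamiliar with the idea, a priority-based key ordering can be sketched roughly as follows. This is not the project's scripts/reorder_json_keys.py; the PRIORITY_KEYS list and the reorder_keys function are assumptions made up for illustration, and the real preferred ordering may differ.

```python
import json

# Hypothetical priority order, for illustration only; the project's actual
# preferred ordering is defined by its own script and tests.
PRIORITY_KEYS = ["host", "title", "ingredients", "instructions"]


def reorder_keys(data: dict) -> dict:
    """Return a copy of data with priority keys first, other keys after.

    Keys not listed in PRIORITY_KEYS keep their original relative order.
    Relies on dicts preserving insertion order (Python 3.7+).
    """
    prioritized = {key: data[key] for key in PRIORITY_KEYS if key in data}
    remaining = {key: value for key, value in data.items() if key not in prioritized}
    return {**prioritized, **remaining}


# Made-up sample data, not taken from the test files in this pull request.
sample = {"instructions": "Mix.", "yields": "4 servings", "title": "Dip"}
print(json.dumps(reorder_keys(sample), indent=2))
```

Because json.dumps writes keys in dictionary order, reordering the dictionary is enough to change the order in the serialized file.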
Resolves #1150
No schema support. Most functions are supported (except for times).
Worked on collaboratively by myself, @heathrampazis, @a1831319 and @Mooree003.