Support for GUI Toolkit/Toga owned pyscript configurations and insertions (Phase 2 + 3)#2442
Conversation
Draft to replace code loading pyscript.toml from the web-template files with a process to gather them from toga_web or 3rd party GUI toolkit
|
Draft work and currently untested. The commented out code is from Kavi's original work that I've tried to repurpose. Kavi will likely merge more work into this PR before we would like to merge it. |
|
@freakboy3742 Hi Russell, we’ve updated the PR with the phase 3 changes. We’d appreciate it if you could review them before we move forward. Thanks! |
freakboy3742
left a comment
There was a problem hiding this comment.
This is headed in the right direction, but there's a bunch of simplifications and idiomatic code usage that can be cleaned up.
I'm also not 100% certain the code can work as written - there's at least 2 areas I've flagged that, if they work for you, are likely accidental at best.
Also - I can see you've tried to preserve at least of the historical "static" handling; that's good, but (a) we only need to retain legacy functionality in the form it historically supported - we don't need to add hybrid handling of legacy and new layout; and (b) we need to retain all the functionality, including the banner/separator content.
| if len(path.parts) < 2: | ||
| continue | ||
|
|
||
| is_inserts = (path.parts[1] == "inserts") or ( |
There was a problem hiding this comment.
Why handle both inserts and deploy/inserts? There isn't a legacy consideration here; is there some other use case I'm missing?
I'd also recommend rewriting this check has path.parts[:2] == ["deploy", "inserts"]
| and path.parts[1] == "deploy" | ||
| and path.parts[2] == "inserts" | ||
| ) | ||
| if is_inserts and path.name: |
There was a problem hiding this comment.
What's the case for path.name not existing? And why pre-compute is_inserts if it's not going to be used later?
| and path.parts[1] == "deploy" | ||
| and path.parts[2] == "static" | ||
| ) | ||
| if is_static: |
There was a problem hiding this comment.
Similar to above - what's the use case for deploy/static? It's not a historical usage.
| deploy_parent = Path(deploy_file).parent | ||
| if ( | ||
| deploy_parent == deploy_dir | ||
| and Path(deploy_file).name == f"{backend}.toml" |
There was a problem hiding this comment.
While I understand the intention to make this pluggable, there's no reason to believe that any other framework will adopt framework.toml as its configuration file. This would be much better handled as a specific pyscript configuration loader.
There was a problem hiding this comment.
that's understandable, this wasn't very thought out. Will leave this out for now and leave it for when other backends are supported.
| and Path(deploy_file).name == f"{backend}.toml" | ||
| ): | ||
| backend_counter += 1 | ||
| # Raise an error if more than one pyscript.toml file is found. |
There was a problem hiding this comment.
This isn't quite right - it's going to search all files for backend.toml. So if one backend has a pyscript.toml, and another has foobar.toml, this error won't be triggered.
The real constraint is the config counter (which is implemented above), not a count of pyscript configurations.
| with target_path.open("r", encoding="utf-8") as f: | ||
| file_text = f.read() |
There was a problem hiding this comment.
| with target_path.open("r", encoding="utf-8") as f: | |
| file_text = f.read() | |
| target_path.read_text(encoding="utf-8") |
There was a problem hiding this comment.
You can also combine this statement with the "if exists" check - try to read the content, and catch the FileNotFound exception. That's a general principle in Python of "it's better to ask forgiveness than permission".
| css_body = "\n".join( | ||
| css_banner.format(package=pkg, content=packages[pkg]) | ||
| for pkg in sorted(packages.keys()) | ||
| ) |
There was a problem hiding this comment.
This doesn't seem right - it's using the same content to render both HTML and CSS...
There was a problem hiding this comment.
This makes sense now; but organisationally, this should come before the marker bodies. As is, the "body" handling is split in half.
This would also allow using the computed body directly, rather than needing to use strings like html as a proxy for "selecting" the right body.
| replaced = False | ||
| for pattern_tmpl, repl_tmpl, kind in marker_styles: | ||
| pattern = re.compile( | ||
| pattern_tmpl.format(nsert=insert), flags=re.MULTILINE | re.DOTALL |
There was a problem hiding this comment.
Well that can't work...
| pattern_tmpl.format(nsert=insert), flags=re.MULTILINE | re.DOTALL | |
| pattern_tmpl.format(insert=insert), flags=re.MULTILINE | re.DOTALL |
| f.write(line) | ||
|
|
||
| def _process_wheel(self, wheelfile, css_file): | ||
| def _merge_insert_content(self, inserts, key, path): |
There was a problem hiding this comment.
Is this method actually used?
There was a problem hiding this comment.
This method was brought over from the old PR#1285 to handle and merge multiple CSS files into one insert block, but it was overlooked accidentally when wiring it to the rest of the code. We’ve left it commented out for now until we get the main wheels up and running...
| file_text, | ||
| ) | ||
| replaced = True | ||
| break |
There was a problem hiding this comment.
I'm very wary of breaks inside deeply nested for loops - if you want to know why, read up on the causes of the 1990 Martin Luther King Jr Day crash of the US phone system.
To me, either if not replaced checks, or restructuring this loop so it's a while not replaced would make more sense.
… finding _gather_backend_config_file function might be obsolete. This now attempts to find pyscript.toml in the same directory as config.toml using wheel.open()
41fb595 to
2504757
Compare
Refactor wheel processing, simplify deploy/* checks, drop unused cases, and sort inserts/packages for consistent output
|
Hi @freakboy3742, the updates are in place now, and the changes are ready for your review. |
freakboy3742
left a comment
There was a problem hiding this comment.
A few comments inline, but this is looking good. Most of the comments are about high level structural things where you've got the right logic, but there are ways to reorder the code so that it requires less nesting (and/or less confusing error paths).
The only other issue is whether we actually need to do "content type" separation on inserts. It's entirely legal for a single .html page to contain CSS, JS and HTML inserts (or for a JS page to potentially include HTML content); so the substitution should be entirely based on the insert key name and full filename, not qualified by content type.
| f.write(line) | ||
|
|
||
| def _process_wheel(self, wheelfile, css_file): | ||
| def _write_inserts(self, app: AppConfig, filename: Path, inserts: dict): |
There was a problem hiding this comment.
As a type annotation, dict should be qualified by what it is a dictionary of - in this case, I think it's dict[str, dict[str, str]]?
|
|
||
| # Each insert slot and its package contributions are processed in sorted order | ||
| for insert, packages in sorted(inserts.items()): | ||
| packages = inserts[insert] |
There was a problem hiding this comment.
This line isn't required if you're iterating over items().
| css_body = "\n".join( | ||
| css_banner.format(package=pkg, content=packages[pkg]) | ||
| for pkg in sorted(packages.keys()) | ||
| ) |
There was a problem hiding this comment.
This makes sense now; but organisationally, this should come before the marker bodies. As is, the "body" handling is split in half.
This would also allow using the computed body directly, rather than needing to use strings like html as a proxy for "selecting" the right body.
| # Find first matching marker in file | ||
| matched_style = None | ||
| for pattern_tmpl, repl_tmpl, kind in marker_styles: | ||
| pattern = re.compile( |
There was a problem hiding this comment.
Since this pattern doesn't vary with anything in the loop, it can be defined outside the loop (which will be faster - regex compilation is a mildly expensive operation) and then reused on each iteration.
| if matched_style is None and pattern.search(file_text): | ||
| matched_style = (pattern, repl_tmpl, kind) | ||
|
|
||
| if matched_style is not None: |
There was a problem hiding this comment.
Why is this being done as a separate if block? Isn't this the behavior that should be done if a match is found?
Also - a page could contain HTML and CSS inserts on the same page, as it would be legal to have a <style> element on a HTML page. I'm not sure we can use the "bail on first insert found" approach here.
Also
| self.console.warning( | ||
| f" {source}: missing ':<insert>'; skipping insert." | ||
| ) | ||
| continue |
There was a problem hiding this comment.
I don't mind having "top of loop shortcut" continue statements; but once you're in body logic, you should be working out how to use if statements. continue is a GOTO wearing lipstick, and has all the same problems.
| backend = config_data.get("backend") | ||
|
|
||
| # Currently, only pyscript is supported, will raise an error if another backend is found. | ||
| if backend != "pyscript": |
There was a problem hiding this comment.
The logic here is all valid AFAICT, but it's getting a bit deep in nesting.
Rather than process the package when it is found, It might be worth organising this so that you:
- Find all packages that have config files, accumulating a list.
- If the accumulated list has >1 entry, raise an error (reporting all the packages that have config files)
- If the accumulated list has 1 entry, processing the config in that package
- If the accumulated list has no entries, perform default handling.
| ) from e | ||
| except FileNotFoundError: | ||
| config = {} | ||
| config = self._gather_backend_config(self.wheel_path(app).glob("*.whl")) |
There was a problem hiding this comment.
This is bike shedding, but I'd go with extract_backend_config; and it doesn't need to be a private method.
| f.write(line) | ||
|
|
||
| def _process_wheel(self, wheelfile, css_file): | ||
| def _write_inserts(self, app: AppConfig, filename: Path, inserts: dict): |
There was a problem hiding this comment.
Also - this doesn't need to be a private method.
| pkg_entry[kind] += text | ||
| continue | ||
|
|
||
| # Handle static files under deploy/static |
There was a problem hiding this comment.
This static handling is different to the historical handling. I flagged this last time, but if we're going to preserve support for static, we need to preserve the historical handling. There's no point preserving historical syntax if we're not going to preserve historical behavior.
There was a problem hiding this comment.
Hi @freakboy3742, I believe you’re referring to the new deploy/static/ directory structure introduced here. That actually came from our proposed solutions, to move the static folder inside the deploy directory. At the moment, the static folder lives inside the toga_web directory. In the old Toga PR #1945, both the index.html and toga.css files were located in the inserts folder.
Caydn and I discussed this and agreed to remove the static folder entirely, placing both index.html and toga.css in the deploy/inserts directory, as in PR1945 and preserve that directory structure and legacy functionality including the old CSS banners. I just wanted to confirm if this approach would address the issue before I push the new changes.
There was a problem hiding this comment.
I evidently misread the significance of that /static handling part of your proposal, so apologies for the confusion on this.
There's three options on the table when it comes to /static handling:
- Preserving the historical functionality of the /static directory
- Preserving the old functionality, but requiring the content to be in a new location
- Completely removing the old functionality
(1) is the backwards compatible approach. (3) is the backwards incompatible approach, where we tell users they need to transition to inserts. (2) is what you've proposed - something that isn't compatible for old users, and adds new functionality that doesn't really have much utility when the option for using inserts exists.
Based purely on functionality, my inclination would be to just move to (3). toga-web is sufficiently immature that I'm comfortable introducing a backwards incompatibility, and toga-web will the only user of this functionality in practice.
However, the issue is the transition as we release this set of changes. If we land this code in Briefcase, we won't be able to deploy a published version of toga-web; if we land an updated toga-web PR, we don't be able to use a published version of Briefcase to deploy it. We'd essentially be in a position where there's a need to tightly coordinate releases of 2 projects, and ensure all users update both simultaneously - which isn't something we can guarantee.
So - we need to do (1).
By way of implementation: my suggestion is to treat the backwards compatibility path as inserts - but inserts with fixed names/behaviors.
So - if a wheel provides CSS content in a /static folder, that content is read, and treated as an insert into the default CSS file in the template - but if static content is read in this way, a warning is raised that it's legacy handling that will be removed in the future.
Similarly, if the project configuration defines style_framework, that triggers the addition of specific inserts that match what the template currently includes automatically, with a warning to the user that the feature will go away in the future.
Does that make sense?
There was a problem hiding this comment.
Hi Russell, thank you for the detailed explanation. I understand and agree with you here. :)
We will return the functionality to handle static files under toga_web/static/ or similar.
In the event that the version of toga-web does not supply shoelace inserts and uses static files but the version of the Web Template has insert markers and no hard-coded pyscript version or shoelace declarations, would we need functionality to account for this? Currently, the pyscript version should default to 2024.11.1 in the case it is not supplied but I'm not particularly sure of shoelace.
There was a problem hiding this comment.
We essentially need to handle 2 cases:
- A legacy layout using
/static, with no special inserts - A "modern" layout using the deploy folder, with all the inserts.
There won't have the case of a package that has both; and we only need to maintain the legacy layout for transition purposes. I'd imagine that when this PR lands, we'll open a PR to remind us to remove the legacy support in 6-12 months.
So - if it's a legacy layout, then we can hardcode the behavior of the current template into Briefcase - that is, the specific Shoelace (or Bootstrap) and PyScript versions that are currently defined in the template can be put into Briefcase.
Refactor web static build to remove fragile type inference, apply all marker styles per file and streamline wheel processing
e31b35d to
a520633
Compare
|
Hi @freakboy3742, I believe this is pretty much done along with our other PRs (beeware/toga#3666 and beeware/briefcase-web-static-template#21). Whenever you have a chance, we would like to have another review :) Feel free to try out the features using the method outlined in our issue post (#2337) |
|
Also, I believe the failing tests for 3.14 might be false negatives as I have definitely included those lines in |
ACK - I'm currently snowed under with a work deadline; hopefully I'll get a chance to look at this in the next day or two.
Hrm - that's definitely an odd one. It's possibly a bug in the coverage tool itself; as part of my review, I'll see if I can work out what is going on. |
freakboy3742
left a comment
There was a problem hiding this comment.
This is looking really good - I've flagged a few things (and fixed most of them in a review pass auditing the error messages); the few that remain are mostly cosmetic.
The biggest impact change is renaming backend to implementation; this is one of those things that doesn't really stand out until you look at the code in action, but pyscript isn't really a "backend" in this context - it's a Python implementation. That obviously requires a change in the template as well, but it's a fairly minor one.
There's also some minor test re-organization stuff - but the tests themselves look great.
I've resolved the weird coverage issue; it definitely looks like a bug in coverage, born out of a triple nested context manager where the innermost is in an exception. I've worked around the problem by collapsing the syntax for the outermost 2 context managers, which means you also gain an extra level of indentation.
One last thing - it's a notable gap that the asset-gathering strategy isn't currently documented at all. If you want to earn some major OSS contribution points, adding some initial documentation to the web backend detailing what a wheel can do to include HTML/CSS content, that would be amazing. It's probably not a hard requirement for merging though, so if you don't have time, I can add those docs myself at some point.
| :param path: Path object of the file inside the wheel. | ||
| :param filename: Filename string inside the wheel. |
There was a problem hiding this comment.
You shouldn't need to pass both the path and filename. The two are functionally interchangeable.
There was a problem hiding this comment.
I've added the filename back as using path was causing issues in finding files as forward slashes would be converted to back slashes
There was a problem hiding this comment.
Ok - that shouldn't be a problem... where exactly is this manifesting? If the problem is that we need to be able to guarantee forward slashes, you can use PosixPath instead of Path - that's the subclass of Path that is used on Linux/macOS, but you can also use it on Windows.
Also - is this something that is revealed by a test? If not... it should be.
There was a problem hiding this comment.
It was revealed when running test_build_app at first. It seemed the path would become malformed at some point as str(path) would be "dependency\\\\static\\\\style.css". So when attempting the line css_text = wheel.read(str(path)).decode("utf-8") it would result in KeyError.
This function is tested in test_build__handle_legacy_css.py, however this test only covers the function itself and not process_wheel as a whole.
There was a problem hiding this comment.
I should also note that I switched back to filename to match _write_insert, which also uses filename as well as 'parts' to create 'rel_inside'. If PosxPath were to resolve this issue as well, I'd be happy to switch both functions over to using that :)
There was a problem hiding this comment.
Ok - that's how the test is failing, but what's the actual manifestation in practice? The \\\\ format suggests that something is getting double escaped - something is interpreting a Windows path as a unix path, and then back to a windows path (or similar).
The core of what I'm getting at here is that Path() as an object contains all the information we need. We need to be careful about the exact format - but that's a code hygiene issue. You see the same thing with unicode and byte strings - you need to know, at every step of the process, if you have an array of unicode code points, or an array of bytes - and if it's an array of bytes, it is utf-8, or utf-16, or something else exotic. You don't fix the problem by passing both bytes and str around - you work out what format you have, what format you need, find out the representation that is richest and requires the least conversion (or the least risky covnersion), and then convert as needed.
In this case - does Zip on Windows return / or \ paths? Can you read a / path out of a Windows zip file? We need to establish the ground truth here, and then work a clean solution from that.
There was a problem hiding this comment.
I see what you mean. Zip should always return / regardless of OS.
After re-reading these functions, it looks like _handle_insert doesn't even need 'parts' as it immediately converts them to a PosixPath with rel_inside = "/".join(parts[3:]) and _handle_legacy_css does a similar conversion with rel_inside = "/".join(path.parts[2:])
So, I think the best solution with the most control would be to replace the first conversion at Path(filename) in _process_wheel with a PosixPath, as you initially suggested. This would eliminate the need for parts and filename and in both handle functions.
There was a problem hiding this comment.
This has now been implemented and has resolved any issues with backslash in paths :)
Minor comment update warn on each legacy /static/*.css, not once per wheel
Create and update test_build__handle_legacy_css.py that houses the test cases for _handle_legacy_css method
Create and update test_build__handle_insert.py that houses the test cases for _handle_insert method
Log invalid insert files as warnings instead of debug in _handle_insert()
debugs have been changed to warning, monkey patch no longer necessary
|
Hi @freakboy3742, we have addressed your review comments and fixed up our test cases. This should be good for another review :) On the note of documentation, I'm unsure if Kavi or I will have time to complete that this week. If we do, we can submit it as another PR if that works for you |
freakboy3742
left a comment
There was a problem hiding this comment.
This all looks great. One last question/tweak possibly required around passing in the filename; details inline.
Update Briefcase’s build system to support toolkit-owned pyscript/backend configuration and insertion content.
Modified platforms/web/static.py to:
index.htmlandbriefcase.csswith corresponding insert content.Refs #2337
PR Checklist: