@phil-opp
Collaborator

We forgot to create the working directory in a few cases in #901. This PR fixes this by adding the proper create_dir_all calls.
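For readers unfamiliar with the call, here is a minimal sketch of the pattern, not the actual change in this PR. The helper name `ensure_working_dir` and the `base` parameter are hypothetical; the `_work/<session id>` layout is assumed from the paths quoted later in this conversation. It relies only on `std::fs::create_dir_all`, which creates missing parent directories and succeeds if the directory already exists.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical helper, not taken from the dora code base: makes sure that
/// `base/_work/<session_id>` exists and returns its path.
fn ensure_working_dir(base: &Path, session_id: &str) -> io::Result<PathBuf> {
    let dir = base.join("_work").join(session_id);
    // `create_dir_all` creates all missing parent directories and does not
    // fail when the directory is already present.
    fs::create_dir_all(&dir)?;
    Ok(dir)
}
```

Calling such a helper before every spawn or build that uses the directory as its working directory avoids "No such file or directory" failures caused by a missing directory.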

Alternative to #1064

@phil-opp phil-opp requested a review from haixuanTao July 15, 2025 14:50
@haixuanTao
Collaborator

No, this does not work. When I run build, the spawn command fails with:

[ERROR]
build failed: failed to build node `camera`

Caused by:
   0: build command failed
   1: build command `pip install ../../node-hub/opencv-video-capture` returned exit status: 2

Location:
    libraries/core/src/build/build_command.rs:81:24

Location:
    /Users/xaviertao/Documents/work/dora/binaries/coordinator/src/listener.rs:119:54

Location:
    binaries/cli/src/command/build/distributed.rs:102:25

@haixuanTao
Collaborator

This happens even though the _work/01980d54-7d39-74e3-bca4-f6e82a9af347 directory exists.

You can reproduce by trying to build:

  • Spawning 2 daemons (named encoder and decoder) or modifying the underlying dataflow with a single daemon
  • Run the build command with
dora build examples/av1-encoding/dataflow.yml --uv

@phil-opp
Collaborator Author

Thanks for the details! We should definitely improve the error reporting so that we also see the stdout/stderr instead of just the exit code.

Ensures that we see all error messages instead of just the exit code.
@phil-opp
Collaborator Author

I pushed 6073c9c to ensure that the stdout and stderr are printed before we print the exit code.
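As an illustration of the idea only (this is not the code from the commit), a sketch of a build-command runner that reports stdout and stderr before the exit status could look as follows. The function name `run_build_command` is hypothetical, and the sketch assumes a Unix-like system where `sh` is available.

```rust
use std::path::Path;
use std::process::Command;

/// Hypothetical helper, not the actual dora implementation: runs `command`
/// through `sh -c` in `working_dir`. On failure, the returned error message
/// lists stdout and stderr first and the exit status last.
fn run_build_command(command: &str, working_dir: &Path) -> Result<(), String> {
    let output = Command::new("sh")
        .arg("-c")
        .arg(command)
        .current_dir(working_dir)
        .output()
        .map_err(|err| format!("failed to spawn `{command}`: {err}"))?;

    if output.status.success() {
        return Ok(());
    }

    // Include the captured output so that the cause of the failure is
    // visible, not just the exit code.
    Err(format!(
        "stdout:\n{}\nstderr:\n{}\nbuild command `{}` returned {}",
        String::from_utf8_lossy(&output.stdout),
        String::from_utf8_lossy(&output.stderr),
        command,
        output.status
    ))
}
```

With this shape, a failing `pip install` would surface pip's own error message alongside the exit status.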

@phil-opp
Collaborator Author

> You can reproduce by trying to build:
>
> * Spawning 2 daemons (named encoder and decoder) or modifying the underlying dataflow with a single daemon
> * Run the build command with
>   dora build examples/av1-encoding/dataflow.yml --uv

Looks like this dataflow requires a specific working directory. You can set it using a working_dir key:

- id: dav1d-remote
  _unstable_deploy:
    machine: decoder
    working_dir: /path/to/your/clone/of/dora-rs
  path: dora-dav1d
  build: cargo build -p dora-dav1d --release

As an alternative, you can do one of the following:

  • use absolute paths in your build key: cargo build --manifest-path /path/to/your/dora/clone/Cargo.toml -p dora-dav1d --release
  • if you want to rely on the directory where the dora daemon is started, you can provide a relative path that goes two directories up: cargo build --manifest-path ../../Cargo.toml -p dora-dav1d --release
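To see why two `..` segments are needed, here is a small sketch of how such a relative path resolves. It assumes that build commands run in `<daemon directory>/_work/<session id>`, as the paths reported in this thread suggest. The function `resolve_lexically` and the example paths are hypothetical and only illustrate the path arithmetic.

```rust
use std::path::{Component, Path, PathBuf};

/// Joins `relative` onto `working_dir` and resolves `.` and `..` lexically,
/// without touching the file system or following symlinks.
fn resolve_lexically(working_dir: &Path, relative: &Path) -> PathBuf {
    let mut resolved = PathBuf::new();
    for component in working_dir.join(relative).components() {
        match component {
            // `.` refers to the current directory, so it changes nothing.
            Component::CurDir => {}
            // `..` removes the last path segment collected so far.
            Component::ParentDir => {
                resolved.pop();
            }
            other => resolved.push(other.as_os_str()),
        }
    }
    resolved
}
```

For a working directory of `/path/to/dora/_work/01980d54`, resolving `../../Cargo.toml` yields `/path/to/dora/Cargo.toml`, while a plain `Cargo.toml` would be looked up inside the `_work/01980d54` directory.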

@phil-opp
Collaborator Author

I also opened #1067 to improve the log output. With that PR, you should see the exact command and working directory for each build command.

@theol0403

theol0403 commented Jul 16, 2025

I ran into this issue today when trying to set up distributed nodes.

Using just a local python node works fine:

  - id: keyboard_controller
    build: pip install -e keyboard_controller
    path: keyboard_controller/src/main.py
    inputs:
      tick: dora/timer/millis/100

  - id: rust_node
    build: cargo build -p rust_node
    path: target/debug/rust_node
    inputs:
      tick: dora/timer/millis/100

Then, dora build and dora start work as expected and start the python and rust nodes locally.

However, making just the rust node distributed:

  - id: rust_node
    _unstable_deploy:
      machine: other
    path: dynamic
    inputs:
      tick: dora/timer/millis/100

Causes keyboard_controller on the LOCAL daemon to fail.

keyboard_controller on default daemon : stdout    /opt/homebrew/Cellar/python@3.12/3.12.11/Frameworks/Python.framework/Versions/3.12/Resources/Python.app/Contents/MacOS/Python: can't open file '/Users/theol/Documents/github/rust-ethercat/_work/01981518-3271-7b98-9fcc-97fd5d61297c/keyboard_controller/src/main.py': [Errno 2] No such file or directory
keyboard_controller on default daemon : stdout    
keyboard_controller on default daemon : stdout    
keyboard_controller on default daemon : ERROR  daemon    exited with code 2 with stderr output:
---------------------------------------------------------------------------------
/opt/homebrew/Cellar/python@3.12/3.12.11/Frameworks/Python.framework/Versions/3.12/Resources/Python.app/Contents/MacOS/Python: can't open file '/Users/theol/Documents/github/rust-ethercat/_work/01981518-3271-7b98-9fcc-97fd5d61297c/keyboard_controller/src/main.py': [Errno 2] No such file or directory
---------------------------------------------------------------------------------

Then, changing the python node to:

  - id: keyboard_controller
    build: pip install -e ../../keyboard_controller
    path: ../../keyboard_controller/src/main.py
    inputs:
      tick: dora/timer/millis/100

solves the problem. However, I find this very unintuitive, and it was very difficult to troubleshoot. Now, when I comment out the distributed rust node, the python node fails because the paths are wrong!

It would be good if the nodes were spawned in the working directory of their daemons.

@phil-opp
Collaborator Author

phil-opp commented Jul 17, 2025

Thanks for your feedback!

Let's discuss this in a separate issue since this is not directly relevant to this PR. I opened #1074 for this.

@phil-opp phil-opp enabled auto-merge July 22, 2025 10:48
@phil-opp
Collaborator Author

Adding this PR to the merge queue, given that the changes are just small bugfixes. We want to have them either way, even if we decide on another default working directory in #1074.

@phil-opp phil-opp disabled auto-merge July 22, 2025 11:51
@phil-opp phil-opp merged commit 2f610f9 into main Jul 22, 2025
200 of 203 checks passed
@phil-opp phil-opp deleted the create-working-dir branch July 22, 2025 11:51
4 participants