Description
Self Checks
To make sure we get to you in time, please check the following :)
- I have searched for existing issues, including closed ones.
- I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
- [FOR CHINESE USERS] Please be sure to submit the issue in English, otherwise it will be closed. Thank you! :)
- Please do not modify this template :) and fill in all the required fields.
Versions
- dify-plugin-daemon Version: 0.1.3-local
- dify-api Version: 1.4.3
Describe the bug
Plugin daemon experiences network connection failures when ECS tasks are redeployed, causing plugins to become inaccessible. Two distinct but related issues occur:
Important Question: Is deploying new versions and restarting ECS tasks an expected/supported use case for dify-plugin-daemon? We want to understand if this is a supported deployment pattern or if we're using the plugin daemon outside of its intended scope.
- S3 Storage Issue (Previous): When using S3 for plugin storage, plugins fail to load after container restart with error "plugin_unique_identifier is not valid"
- Network Connection Issue (Current): After switching to EFS + local storage and forcing new deployment, plugin daemon fails to connect to itself using old IP addresses
To Reproduce
Steps to reproduce the behavior:
Issue 1: S3 Storage Problem
- Configure dify-plugin-daemon with PLUGIN_STORAGE_TYPE=aws_s3
- Install plugins (OpenAI, Bedrock) via the Dify web interface
- Restart/redeploy ECS service
- Observe error logs:
2025/07/17 04:50:14 watcher.go:73: [ERROR]list installed plugins failed: plugin_unique_identifier is not valid:
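A quick way to check whether this is a metadata/storage mismatch is to compare what the database believes is installed with what actually exists in the bucket. This is only a sketch: the bucket name is a placeholder, DATABASE_URL is assumed to point at the Dify database, and the table name is taken from the workaround section further down:
# list the plugin packages actually present in S3
aws s3 ls s3://<plugin-storage-bucket>/ --recursive | head -n 20
# list what the database believes is installed (schema not documented here, hence SELECT *)
psql "$DATABASE_URL" -c "SELECT * FROM plugin_installations LIMIT 5;"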
Issue 2: Network Connection Problem (After EFS Migration)
- Configure dify-plugin-daemon with PLUGIN_STORAGE_TYPE=local and an EFS mount
- Install plugins via the Dify web interface
- Force a new deployment via an ECS service update
- Check the logs and observe connection errors to the old task IP addresses (CLI sketch below)
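Steps 3-4 were driven with the standard AWS CLI; cluster, service, and log-group names below are placeholders:
# force a new deployment of the plugin daemon service
aws ecs update-service --cluster <cluster-name> --service <plugin-daemon-service> --force-new-deployment
# follow the daemon logs for redirect/connection errors
aws logs tail <plugin-daemon-log-group> --follow --since 15m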
Expected behavior
Plugin daemon should:
- Successfully reload installed plugins after container restarts/redeployments
- Handle IP address changes gracefully without requiring manual intervention
- Maintain plugin state consistency across deployments
Screenshots
Error Logs - S3 Storage Issue:
2025/07/17 04:50:14 watcher.go:73: [ERROR]list installed plugins failed: plugin_unique_identifier is not valid:
Error Logs - Network Connection Issue:
2025/07/18 05:14:00 middleware.go:131: [ERROR]redirect request failed: Post "http://169.254.172.2:5002/plugin/a5df51ca-fba9-4170-8369-4ae0eff4f543/dispatch/model/schema": dial tcp 169.254.172.2:5002: connect: cannot assign requested address
2025/07/18 05:14:00 factory.go:28: [ERROR]PluginDaemonInternalServerError: redirect request failed: Post "http://169.254.172.2:5002/plugin/a5df51ca-fba9-4170-8369-4ae0eff4f543/dispatch/model/schema": dial tcp 169.254.172.2:5002: connect: cannot assign requested address
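To confirm that 169.254.172.2 no longer corresponds to the running task, the ECS task metadata endpoint can be queried from inside the container (this sketch assumes curl and jq are available in the image):
# print the IP addresses of the current task and compare with the address in the errors above
curl -s "${ECS_CONTAINER_METADATA_URI_V4}/task" | jq -r '.Containers[].Networks[].IPv4Addresses[]'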
Environment Configuration:
SERVER_HOST=0.0.0.0
SERVER_PORT=5002
PLUGIN_WORKING_PATH=/app/shared_plugins
PLUGIN_INSTALLED_PATH=installed
PLUGIN_PACKAGE_CACHE_PATH=packages
PLUGIN_STORAGE_TYPE=local
Additional context
Infrastructure Details:
- Platform: AWS ECS Fargate
- Network: awsvpc mode with dynamic IP assignment
- Storage:
- Initial setup: S3 for plugin storage
- Current setup: EFS (NFSv4.1) mounted at /app/shared_plugins
- Database: Aurora PostgreSQL 15.10
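To rule the EFS mount itself out as a factor, the attachment can be verified both inside the container and against the registered task definition; the task definition family name below is a placeholder:
# inside the container: confirm /app/shared_plugins is an NFSv4.1 mount
df -hT /app/shared_plugins
mount | grep /app/shared_plugins
# from the AWS CLI: confirm the EFS volume and mount point in the task definition
aws ecs describe-task-definition --task-definition <dify-plugin-daemon-family> \
  --query 'taskDefinition.[volumes,containerDefinitions[0].mountPoints]'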
Analysis:
- S3 Issue: Suggests that plugin metadata in the database becomes inconsistent with what is actually stored in S3
- Network Issue: The plugin daemon appears to cache or store IP addresses internally, causing connection failures when the container IP changes
Potential Root Causes:
- The plugin daemon may be storing absolute network references (task IPs) instead of using localhost/relative addressing
- Database plugin metadata may contain stale network configuration (see the check after this list)
- Plugin state management may not be designed for ephemeral container environments
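A crude check of the "stale network configuration" hypothesis, before resorting to the destructive cleanup below, is to dump the plugin tables used in the workaround and count occurrences of the dead IP (DATABASE_URL is a placeholder; psql -At prints unaligned tuples only):
for t in plugins plugin_installations plugin_declarations; do
  echo "== $t =="
  # count rows containing the stale address; grep exits non-zero on no match, hence || true
  psql "$DATABASE_URL" -At -c "SELECT * FROM $t;" | grep -c '169.254.172.2' || true
done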
Workaround Applied:
Complete plugin cleanup and reinstallation after each deployment:
# Clear EFS plugin data
rm -rf /app/shared_plugins/langgenius/
# Clear database plugin references
DELETE FROM plugins;
DELETE FROM plugin_installations;
DELETE FROM plugin_declarations;
Request:
Is there a configuration option or best practice to make plugin daemon more resilient to container restarts and IP address changes in containerized environments like ECS Fargate?