This project implements real-time object detection and tracking using multiple Intel RealSense cameras and YOLO object detection models. The cameras are placed in a surround configuration to detect objects from multiple perspectives, calculate their 3D coordinates, and merge similar detections across the cameras. This system is meant for robotics and autonomous systems that require a 360-degree field of view.
- Multi-camera setup: Uses multiple RealSense cameras for complete surround detection.
- YOLOv10 and YOLOv9_seg models: Support for both box-based and mask-based detection.
- Real-time 3D tracking: Converts 2D detections into 3D coordinates using camera depth data.
- Object merging: Detects and merges similar objects across different camera views.
- Live visualization: Real-time display of detection results in both 2D and 3D.
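The 2D-to-3D conversion mentioned above amounts to pinhole back-projection of a pixel plus its depth value. A minimal NumPy sketch of that step (the intrinsics `fx`, `fy`, `ppx`, `ppy` below are made-up illustrative values, not taken from a real device; on live hardware they come from the RealSense stream profile):

```python
import numpy as np

def deproject(pixel, depth, fx, fy, ppx, ppy):
    """Back-project a 2D pixel and its depth (metres) into a 3D point in the
    camera frame. This is the no-distortion pinhole model, the same math
    pyrealsense2's rs2_deproject_pixel_to_point performs."""
    x = (pixel[0] - ppx) / fx
    y = (pixel[1] - ppy) / fy
    return np.array([x * depth, y * depth, depth])

# Hypothetical detection centre at pixel (400, 300) with 2.5 m of depth
p = deproject((400.0, 300.0), 2.5, fx=600.0, fy=600.0, ppx=320.0, ppy=240.0)
print(p)  # -> [0.33333333 0.25       2.5       ]
```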
- Python 3.x
- Intel RealSense SDK (pyrealsense2)
- OpenCV
- YOLO models from Ultralytics
- NumPy
- Matplotlib
You can install the required dependencies using the following command:

```shell
pip install --no-cache-dir -r requirements.txt
```

- One or multiple Intel RealSense depth cameras (D455 or similar)
- USB 3.0 ports for camera connectivity
- A system with CUDA support is recommended for YOLO detection
Create a JSON configuration file `CAMERAS.json` with the serial numbers of the RealSense cameras:

```json
{
  "serials": [
    "1234567890",
    ...
  ],
  "extrinsics": {
    "1234567890": [[1,0,0,0], [0,1,0,0], [0,0,1,0], [0,0,0,1]],
    ...
  }
}
```

Each camera's extrinsic transformation matrix must be provided to align it with the others in 3D space.
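Each 4x4 extrinsic maps points from that camera's frame into the shared world frame via a homogeneous transform. A short NumPy sketch (the 0.2 m translation is a made-up value for illustration):

```python
import numpy as np

# Hypothetical extrinsic: camera offset 0.2 m along world x, no rotation
T = np.array([[1.0, 0.0, 0.0, 0.2],
              [0.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0, 1.0]])

p_cam = np.array([0.0, 0.0, 2.5])          # 3D point in the camera frame
p_world = (T @ np.append(p_cam, 1.0))[:3]  # apply the homogeneous transform
print(p_world)  # -> [0.2 0.  2.5]
```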
The configuration can be verified using plot_cameras.py. A camera config file for Jackal 04 with 5 cameras is included.
To test object detection using multiple cameras:
```shell
python multidetector.py
```

This will:
- Initialize the RealSense cameras
- Run YOLO object detection on each camera in separate threads
- Optionally, visualize the 2D detection results
- Optionally, display the detections in a 3D plot
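The one-thread-per-camera pattern above can be sketched with Python's standard threading module (the `detect_loop` body and the second serial number are placeholders, not the actual MultiDetector internals):

```python
import threading

def detect_loop(serial, results):
    # Placeholder for the real per-camera capture + YOLO inference loop
    results[serial] = f"detections from {serial}"

serials = ["1234567890", "0987654321"]  # hypothetical camera serials
results = {}
threads = [threading.Thread(target=detect_loop, args=(s, results)) for s in serials]
for t in threads:
    t.start()
for t in threads:
    t.join()  # wait for every camera thread to finish
print(sorted(results))
```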
You can modify the behavior of the detectors with the following arguments in multidetector.py and merger.py:
- `camera_config_path`: Path to the JSON file containing camera serials and extrinsics.
- `detection_type`: Choose between `'box'` for bounding box detection and `'mask'` for segmentation mask detection.
- `show`: Enable/disable showing the real-time video feed.
- `draw`: Enable/disable annotations on the video feed.
- `plot`: Enable/disable 3D plotting of object positions.
- `verbose`: Enable/disable verbose output for debugging.
Example:
```python
detector = MultiDetector(camera_config_path='CAMERAS.json', detection_type='box', show=True, draw=True, plot=False, verbose=False)
detector.start()
```

To merge detections across multiple cameras and visualize the results:
```python
merger = Merger(camera_config_path='CAMERAS_Jackal04.json', threshold=0.35, show=True, plot=True)
merger.merge_and_plot_detections()
```

This will:
- Start YOLO detection on all cameras in separate threads using the MultiDetector class
- Merge similar objects detected from different perspectives
- Optionally, display both individual and merged detections in a 2D plot and camera feed
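One simple way to merge similar detections across cameras is greedy grouping by 3D distance, replacing each group with its centroid. This sketch illustrates the idea only; it is not the actual Merger algorithm, and treating `threshold` as a distance in metres is an assumption:

```python
import numpy as np

def merge_detections(points, threshold):
    """Greedily group 3D points closer than `threshold` and return the
    centroid of each group."""
    merged, used = [], [False] * len(points)
    for i, p in enumerate(points):
        if used[i]:
            continue
        group, used[i] = [p], True
        for j in range(i + 1, len(points)):
            if not used[j] and np.linalg.norm(points[j] - p) < threshold:
                group.append(points[j])
                used[j] = True
        merged.append(np.mean(group, axis=0))
    return merged

# Two cameras seeing the same object ~2 cm apart, plus one distinct object
pts = [np.array([1.0, 0.0, 2.0]), np.array([1.02, 0.0, 2.0]), np.array([4.0, 0.0, 1.0])]
print(merge_detections(pts, threshold=0.35))  # -> two merged points
```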
- `multidetector.py`: Main class that runs YOLO object detection on multiple RealSense cameras and outputs 3D coordinates for each detected object.
- `merger.py`: Class for merging object detections across cameras and plotting the results.
- `CAMERAS.json`: Configuration file specifying the serial numbers and extrinsics for each camera.
- `plot_cameras.py`: Visualizes the camera extrinsics specified in `CAMERAS.json` for verification.