Methods
In our project, we use ROS2 as our middleware, handling communication between the different components of our robotics system.
ROS2 provides reliable message exchange between our nodes, which form the backbone of our client-server architecture.
Our architecture consists of dedicated server and client nodes. The server node manages tasks, handles the core logic, and coordinates actions, while the client node sends
commands and processes the responses. This separation of concerns improves modularity and keeps the system responsive at runtime.
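The server/client split can be sketched as follows. This is a transport-agnostic illustration: in the actual system this logic would sit inside ROS2 service callbacks, and the command names (`fetch_fruit`, `go_home`) and response fields are placeholder assumptions, not the project's real interface.

```python
# Minimal sketch of server-side command dispatch. Command names and
# the Response type are illustrative placeholders; in the real system
# this would be wrapped in a ROS2 service server.
from dataclasses import dataclass


@dataclass
class Response:
    success: bool
    message: str


class CommandServer:
    """Coordinates tasks in response to client commands."""

    def __init__(self):
        self.handlers = {
            "fetch_fruit": self._fetch_fruit,
            "go_home": self._go_home,
        }

    def handle(self, command: str, *args) -> Response:
        handler = self.handlers.get(command)
        if handler is None:
            return Response(False, f"unknown command: {command}")
        return handler(*args)

    def _fetch_fruit(self, name: str) -> Response:
        # Placeholder for the perception + navigation + retrieval pipeline.
        return Response(True, f"fetching {name}")

    def _go_home(self) -> Response:
        return Response(True, "returning to home pose")


# The client side simply sends a command and inspects the response:
server = CommandServer()
print(server.handle("fetch_fruit", "apple").message)  # -> fetching apple
```

The dispatch-table pattern keeps the server extensible: adding a capability means registering one more handler, without touching the client.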
The core modules of our system include:
- State Machine: Manages the logical flow with eight defined states to control the robot’s behavior.
- Language: Processes audio input with a speech-to-text pipeline, transcribing voice commands into actionable tasks.
- Perception: Integrates YOLOv8 for real-time object detection (tasks 1 and 3) and MediaPipe Hands for gesture recognition (task 3 only). The goal is to enable the robot to detect fruits and their locations, and to determine which fruit a person is pointing at.
- Pose Estimation: Employs transformation algorithms and sensor data to determine the robot’s position and orientation, guiding precise movements.
- Navigation: Implements motion planning algorithms for safe and efficient pathfinding, facilitating object retrieval and obstacle avoidance.
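The state machine module can be sketched as below. The eight state names and the transition table are illustrative assumptions (the actual states are not enumerated here); the sketch only shows the mechanism of constraining the robot's logical flow to permitted transitions.

```python
# Illustrative eight-state machine; state names and transitions are
# assumptions for the sketch, not the project's actual definitions.
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    LISTENING = auto()
    PARSING = auto()
    PERCEIVING = auto()
    PLANNING = auto()
    NAVIGATING = auto()
    GRASPING = auto()
    RETURNING = auto()


# Allowed transitions: each state maps to the states it may enter next.
TRANSITIONS = {
    State.IDLE: {State.LISTENING},
    State.LISTENING: {State.PARSING, State.IDLE},
    State.PARSING: {State.PERCEIVING, State.IDLE},
    State.PERCEIVING: {State.PLANNING},
    State.PLANNING: {State.NAVIGATING},
    State.NAVIGATING: {State.GRASPING},
    State.GRASPING: {State.RETURNING},
    State.RETURNING: {State.IDLE},
}


class StateMachine:
    def __init__(self):
        self.state = State.IDLE

    def transition(self, target: State) -> bool:
        """Move to `target` if allowed from the current state."""
        if target in TRANSITIONS[self.state]:
            self.state = target
            return True
        return False
```

Encoding the transitions explicitly makes illegal jumps (e.g. grasping before perceiving) impossible by construction, which simplifies debugging the robot's behavior.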
This modular approach, powered by ROS2, allows our system to be scalable and flexible, making it easier to incorporate future enhancements and adapt to diverse application scenarios.
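As a concrete illustration of the task-3 perception logic (deciding which fruit a person is pointing at), one can score each detected fruit by its angular distance from the pointing ray. The inputs below are simplified stand-ins: 2D points in place of MediaPipe Hands wrist/fingertip landmarks and YOLOv8 bounding-box centroids.

```python
# Sketch of pointing-target selection: given a pointing ray (wrist to
# index fingertip, as MediaPipe Hands landmarks would supply) and fruit
# centroids (as YOLOv8 box centers would supply), pick the fruit whose
# direction best aligns with the ray. Coordinates are simplified 2D
# image coordinates for illustration.
import math


def pointed_fruit(wrist, fingertip, fruits):
    """Return the label of the fruit best aligned with the pointing ray.

    wrist, fingertip: (x, y) points defining the pointing direction.
    fruits: dict mapping label -> (x, y) centroid.
    """
    dx, dy = fingertip[0] - wrist[0], fingertip[1] - wrist[1]
    ray = math.atan2(dy, dx)

    def angular_error(centroid):
        cx, cy = centroid[0] - fingertip[0], centroid[1] - fingertip[1]
        diff = math.atan2(cy, cx) - ray
        # Wrap to [-pi, pi] so angles near +/-180 degrees compare correctly.
        return abs(math.atan2(math.sin(diff), math.cos(diff)))

    return min(fruits, key=lambda label: angular_error(fruits[label]))


# Example: pointing to the right; the apple lies along the ray.
fruits = {"apple": (10.0, 0.1), "banana": (2.0, 5.0)}
print(pointed_fruit((0.0, 0.0), (1.0, 0.0), fruits))  # -> apple
```

Angular distance rather than Euclidean distance is the natural score here, since a pointing gesture selects a direction, not a location.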