The "Data Quality Assurance Tool" (DQAT) project is a solution for data quality management and control in Internet of Things (IoT) environments. Designed for applications such as aquaculture, this repository implements an event-driven software architecture in Python.
- Event-Driven Architecture: The repository implements an event-driven architecture to handle real-time data collection, processing, and analysis generated by IoT sensors.
- Modular Layers: Organized into distinct modules, the project includes components for event generation (producers), event processing (processors), and event consumption (consumers).
- Efficient Persistence: The persistence layer uses a dedicated database to store events and quality metrics, ensuring durability and query capability.
- Flexible Integration: Integration modules allow seamless connection with other systems and services, featuring adapters for external APIs and connectors for databases.
- Robust Security: It includes authentication and authorization features to ensure system security, controlling access to data and events.
- Monitoring and Logs: A monitoring layer provides tools to trace event flow, analyze system performance, and log information for auditing.
- Centralized Configuration: A configuration module enables easy adjustment of project settings to meet specific environment needs.
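The producer, processor, and consumer roles described above can be illustrated with a minimal, self-contained sketch. It is not the repository's implementation: the names `SensorEvent`, `QualityEvent`, and `run_pipeline` are hypothetical, in-memory queues stand in for the message broker, and the pH range used as the validity rule is an assumption chosen only for the example.

```python
from dataclasses import dataclass
from queue import Queue
from typing import List, Optional


@dataclass
class SensorEvent:
    """Hypothetical raw event emitted by a producer."""
    sensor_id: str
    parameter: str
    value: Optional[float]  # None models a missing reading


@dataclass
class QualityEvent:
    """Hypothetical event emitted by the processor after a quality check."""
    source: SensorEvent
    is_valid: bool


def produce(readings: List[SensorEvent], out_queue: Queue) -> None:
    """Producer: publish each reading, then a None sentinel to end the stream."""
    for reading in readings:
        out_queue.put(reading)
    out_queue.put(None)


def process(in_queue: Queue, out_queue: Queue, low: float, high: float) -> None:
    """Processor: flag readings that are missing or outside [low, high]."""
    while True:
        event = in_queue.get()
        if event is None:
            out_queue.put(None)  # forward the sentinel downstream
            break
        is_valid = event.value is not None and low <= event.value <= high
        out_queue.put(QualityEvent(source=event, is_valid=is_valid))


def consume(in_queue: Queue) -> List[QualityEvent]:
    """Consumer: collect processed events until the sentinel arrives."""
    results = []
    while True:
        event = in_queue.get()
        if event is None:
            break
        results.append(event)
    return results


def run_pipeline(readings: List[SensorEvent],
                 low: float = 0.0, high: float = 14.0) -> List[QualityEvent]:
    """Wire the three stages together; the defaults assume a pH sensor."""
    raw, checked = Queue(), Queue()
    produce(readings, raw)
    process(raw, checked, low, high)
    return consume(checked)


if __name__ == "__main__":
    sample = [
        SensorEvent("pond-1", "ph", 7.2),
        SensorEvent("pond-1", "ph", None),
        SensorEvent("pond-1", "ph", 15.3),
    ]
    for result in run_pipeline(sample):
        print(result.source.value, result.is_valid)
```

Each stage only knows about the queue it reads from and the queue it writes to, which is the property that lets the stages be developed, replaced, or scaled independently in an event-driven design.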
- Clone the repository.
- Configure and start the different modules according to your specific requirements.
- Customize the settings and modify the adapters as needed.
- Monitor and review logs to ensure system integrity.
- Python ^3.8 (that is, 3.8 or any later 3.x release)
- A running Kafka broker (localhost:9092)
- InfluxDB
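Before starting the modules, it can help to confirm that the required services are accepting connections. The following is a minimal sketch using only the standard library; the helper name `is_port_open` is hypothetical, and the InfluxDB port 8086 is an assumption (it is InfluxDB's usual default, but this README does not state which port the project uses). A successful TCP connection only shows that something is listening on the port, not that the service is healthy.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    # Kafka address taken from the requirements list above;
    # the InfluxDB port is an assumed default.
    services = {
        "Kafka": ("localhost", 9092),
        "InfluxDB": ("localhost", 8086),
    }
    for name, (host, port) in services.items():
        status = "reachable" if is_port_open(host, port) else "not reachable"
        print(f"{name} at {host}:{port} is {status}")
```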
The dataset used in this project was obtained from Kaggle: Sensor-Based Aquaponics Fish Pond Datasets.
The dataset includes sensor readings for parameters such as temperature, turbidity, ammonia, nitrate, pH, dissolved oxygen (DO), and geographic coordinates. The data is stored in a CSV file located at QualityTool/data/.
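As an illustration of the kind of quality metrics that can be computed over such readings, the sketch below measures completeness (share of rows with a parseable value) and validity (share of parsed values inside an expected range) for one column. It is not the project's own metric code: the function name `column_quality`, the column names `temperature` and `ph`, the inline sample rows, and the ranges are all assumptions made for the example, and the real CSV headers may differ.

```python
import csv
import io
from typing import Dict, Iterable


def column_quality(rows: Iterable[Dict[str, str]], column: str,
                   low: float, high: float) -> Dict[str, float]:
    """Compute completeness and range validity for one CSV column."""
    total = present = in_range = 0
    for row in rows:
        total += 1
        raw = (row.get(column) or "").strip()
        try:
            value = float(raw)
        except ValueError:
            continue  # empty or non-numeric cell counts as missing
        present += 1
        if low <= value <= high:
            in_range += 1
    return {
        "completeness": present / total if total else 0.0,
        "validity": in_range / present if present else 0.0,
    }


# Made-up sample rows with hypothetical headers, used only for illustration.
SAMPLE_CSV = """temperature,ph
24.5,7.1
,6.9
25.0,15.2
24.8,7.0
"""

if __name__ == "__main__":
    rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
    print("temperature:", column_quality(rows, "temperature", 0.0, 40.0))
    print("ph:", column_quality(rows, "ph", 0.0, 14.0))
```

To run the same check against the real dataset, replace the `io.StringIO` source with an opened file from `QualityTool/data/` and adjust the column names and ranges to match its headers.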
This project was created during the Master's program in Information Engineering at the Federal University of ABC (UFABC). It represents a culmination of academic research and practical implementation, addressing challenges in data quality management within IoT environments.
If DQAT has been useful to you and you would like to cite it in a scientific publication, please cite the paper published at SBrT 2024:
@inproceedings{romero2024dqat,
title={DQAT: An Online Machine Learning Framework for Real-Time Data Quality Assurance in IoT},
author={Romero, Marcos Lima and Suyama, Ricardo},
booktitle={XLII Brazilian Symposium on Telecommunications and Signal Processing (SBrT 2024)},
year={2024},
organization={SBrT}
}

Feel free to contribute to this open-source project. Your contributions are valuable for further research, collaboration, and learning.
This project is licensed under the GNU General Public License v3.0.
