This open-source project analyzes Reddit data related to the Ukraine war. It extracts hashtags and keywords from Reddit posts and comments, then visualizes them as graphs to reveal insights and trends in the discussion. The analysis is written in Python and uses the Reddit API together with several data processing and visualization libraries.

- Data Extraction from Reddit:
  - Utilizes the Reddit API to collect data related to the Ukraine war, including posts and comments from relevant subreddits.
  - Provides the ability to specify the subreddits of interest and the time frame for data retrieval.
- Hashtag and Keyword Extraction:
  - Extracts hashtags and keywords from Reddit posts and comments using natural language processing techniques.
  - Categorizes and counts the most frequently used hashtags and keywords to identify trending topics and discussions.
- Network Graph Visualization:
  - Creates network graphs based on the relationships between hashtags, keywords, and Reddit users.
  - Visualizes the graphs to illustrate the connections and interactions within the Reddit community regarding the Ukraine war.
- GitHub Repository:
  - Organized and well-documented codebase available on GitHub for easy access and collaboration.
  - Contains step-by-step instructions and tutorials on setting up the project environment and running the analysis.
- Data Insights and Reporting:
  - Provides insights and findings based on the analysis, offering a deeper understanding of the Reddit community's engagement with the Ukraine war.

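
The extraction and graph features listed above can be illustrated with a minimal sketch. It is not the project's actual code: the sample texts, the stop-word list, and the function names (`extract_hashtags`, `extract_keywords`, `count_terms`, `cooccurrence_edges`) are all hypothetical, and simple regular expressions stand in for the natural language processing the project uses. The sketch counts hashtags and keywords, then builds weighted edges between hashtags that appear in the same text.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical sample texts standing in for Reddit posts and comments.
SAMPLE_TEXTS = [
    "Aid convoy reaches Kharkiv #Ukraine #HumanitarianAid",
    "New sanctions announced today #Ukraine #Sanctions",
    "Debate over sanctions continues #Sanctions #Ukraine #EU",
]

HASHTAG_RE = re.compile(r"#(\w+)")
WORD_RE = re.compile(r"[a-z]+")
# Tiny illustrative stop-word list (an assumption); a real run needs a fuller one.
STOP_WORDS = {"the", "a", "over", "new", "today"}


def extract_hashtags(text):
    """Return the lowercase hashtags in a text, without the leading '#'."""
    return [tag.lower() for tag in HASHTAG_RE.findall(text)]


def extract_keywords(text):
    """Return lowercase words longer than two letters that are not stop words."""
    # Remove hashtags first so they are not also counted as keywords.
    stripped = HASHTAG_RE.sub(" ", text).lower()
    return [w for w in WORD_RE.findall(stripped)
            if len(w) > 2 and w not in STOP_WORDS]


def count_terms(texts):
    """Count hashtag and keyword frequencies across all texts."""
    hashtags, keywords = Counter(), Counter()
    for text in texts:
        hashtags.update(extract_hashtags(text))
        keywords.update(extract_keywords(text))
    return hashtags, keywords


def cooccurrence_edges(texts):
    """Count how often each pair of hashtags appears in the same text."""
    edges = Counter()
    for text in texts:
        # Sort so each pair has one canonical orientation.
        tags = sorted(set(extract_hashtags(text)))
        edges.update(combinations(tags, 2))
    return edges


hashtag_counts, keyword_counts = count_terms(SAMPLE_TEXTS)
edge_weights = cooccurrence_edges(SAMPLE_TEXTS)
print(hashtag_counts.most_common(3))
print(edge_weights.most_common(3))
```

Each `(tag_a, tag_b)` pair and its count could then be loaded into a NetworkX graph, for example with `Graph.add_edge(tag_a, tag_b, weight=count)`, before drawing it.
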
To use this project, follow these steps:

- Clone the Repository:
  - Clone this GitHub repository to your local machine using the provided repository URL.
- Setup Environment:
  - Create a virtual environment for the project using Python.
  - Install the required Python libraries specified in the project's `requirements.txt` file to ensure a consistent environment.
- Configure Reddit API Credentials:
  - Obtain Reddit API credentials by creating a Reddit Developer Application as instructed in the project documentation.
  - Configure the credentials in your project to enable access to Reddit's data.
- Run the Analysis:
  - Execute the Python scripts in the project to perform data extraction, keyword and hashtag analysis, and graph visualization.
  - Customize the parameters to focus on specific subreddits or time periods for analysis.
- Explore Results:
  - Explore the generated network graphs and insights to gain a deeper understanding of the Ukraine war discussions on Reddit.

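
As a rough illustration of the credential and extraction steps above, the sketch below reads the Reddit API credentials from environment variables and uses PRAW to list recent post titles from one subreddit. The environment variable names, the helper names (`load_credentials`, `fetch_post_titles`), and the subreddit in the usage line are assumptions made for this example; the project's own scripts and configuration may work differently.

```python
import os

# Assumed environment variable names; the project may store credentials elsewhere.
REQUIRED_VARS = ("REDDIT_CLIENT_ID", "REDDIT_CLIENT_SECRET", "REDDIT_USER_AGENT")


def load_credentials(env=None):
    """Read Reddit API credentials from a mapping (os.environ by default)."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing Reddit credentials: " + ", ".join(missing))
    return {
        "client_id": env["REDDIT_CLIENT_ID"],
        "client_secret": env["REDDIT_CLIENT_SECRET"],
        "user_agent": env["REDDIT_USER_AGENT"],
    }


def fetch_post_titles(subreddit_name, limit=10):
    """Return the titles of the newest posts in a subreddit (read-only access)."""
    # PRAW is a third-party library; it is imported here so the credential
    # helper above can be used and tested without it installed.
    import praw

    reddit = praw.Reddit(**load_credentials())
    return [post.title for post in reddit.subreddit(subreddit_name).new(limit=limit)]
```

With the three variables set, a call such as `fetch_post_titles("ukraine", limit=5)` would return up to five recent titles; the subreddit name here is only an example.
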
Contributions to this project are welcome! Feel free to open issues, submit pull requests, or provide feedback to help improve the project. Follow the guidelines outlined in the project's documentation to contribute effectively.

This project was developed by Sina Tavakoli and is made possible through the use of the Reddit API and various Python libraries, including PRAW, NetworkX, and others.

This project is licensed under the MIT license.