Atheneo combines Reddit sentiment analysis with sports market data to identify market sentiment patterns and opportunities. Developed as a graduate-level machine learning project, it processes social media signals, matches them with market movements, and generates insights from the combined data. A Streamlit app visualizes the resulting insights and sentiment analysis.
- Data Processing: Handles 500+ Reddit posts daily across 9 European football leagues
- Team Recognition: Achieves >90% accuracy in identifying team mentions (validated on 1000+ test posts)
- Analysis Speed: Reduces manual sentiment analysis from hours to minutes through automation
- Visualization: Features 6+ interactive visualizations tracking 20+ teams simultaneously
- Overview Metrics: Total signals, active matches, and confidence scores
- Sentiment Distribution: Visual breakdown of soccer-related sentiment across subreddits (strongly positive to strongly negative)
- Team Analysis: Track trending teams and market sentiment
- Latest Insights: Real-time market sentiment recommendations with match details
- Multi-level Classification:
  - Strongly Positive: High confidence positive sentiment
  - Positive: Good sentiment opportunities
  - Neutral: Balanced or unclear signals
  - Negative: Poor sentiment or high risk
  - Strongly Negative: Strong negative sentiment signals
- Confidence Scoring:
  - Automated confidence assessment (0-1 scale)
  - Based on signal strength and market consensus
  - Weighted by source reliability
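The five-level classification and the 0-1 confidence score described above could be sketched as follows. This is a minimal illustration, not the project's implementation: the score range of [-1, 1], the cut-off values, the equal weighting of signal strength and market consensus, and the function names `classify` and `confidence` are all assumptions made for the example.

```python
# Hypothetical cut-offs on a sentiment score in [-1, 1].
# The real thresholds are configured in config.py and may differ.
LABEL_THRESHOLDS = [
    (0.6, "Strongly Positive"),
    (0.2, "Positive"),
    (-0.2, "Neutral"),
    (-0.6, "Negative"),
]


def classify(score: float) -> str:
    """Map a sentiment score in [-1, 1] to one of the five labels."""
    for lower_bound, label in LABEL_THRESHOLDS:
        if score >= lower_bound:
            return label
    return "Strongly Negative"


def confidence(signal_strength: float,
               market_consensus: float,
               source_reliability: float) -> float:
    """Combine three inputs, each expected in [0, 1], into a 0-1 confidence.

    Assumed formula: average signal strength and market consensus,
    then scale the result by source reliability.
    """
    base = 0.5 * signal_strength + 0.5 * market_consensus
    weighted = base * source_reliability
    # Clamp so out-of-range inputs cannot push the result outside [0, 1].
    return max(0.0, min(1.0, weighted))


if __name__ == "__main__":
    print(classify(0.75), confidence(0.8, 0.6, 0.5))
```

Scaling by reliability rather than adding it means a strong signal from an unreliable source still ends up with low confidence, which is one plausible reading of "weighted by source reliability".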
- Mention Tracking: Monitor team discussion frequency
- Pattern Recognition: Identify team aliases and nicknames
- Context Analysis: Understand team references in various formats
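Alias-based team recognition and mention counting, as listed above, might look like the sketch below. The alias table is invented for illustration (the project's own patterns live in `team_aliases.py`), and `build_patterns` and `count_mentions` are hypothetical helper names.

```python
import re
from collections import Counter

# Hypothetical alias table; replace with the entries from team_aliases.py.
TEAM_ALIASES = {
    "Manchester United": ["manchester united", "man utd", "man united", "red devils"],
    "Arsenal": ["arsenal", "gunners"],
    "Real Madrid": ["real madrid", "los blancos"],
}


def build_patterns(aliases):
    """Compile one case-insensitive, word-bounded regex per team."""
    patterns = {}
    for team, names in aliases.items():
        # Try longer aliases first so the most specific form wins.
        ordered = sorted(names, key=len, reverse=True)
        alternatives = "|".join(re.escape(name) for name in ordered)
        patterns[team] = re.compile(rf"\b(?:{alternatives})\b", re.IGNORECASE)
    return patterns


PATTERNS = build_patterns(TEAM_ALIASES)


def count_mentions(posts):
    """Count alias matches per team across an iterable of post texts."""
    counts = Counter()
    for post in posts:
        for team, pattern in PATTERNS.items():
            counts[team] += len(pattern.findall(post))
    return counts


if __name__ == "__main__":
    sample = [
        "Man Utd were poor again, the Red Devils need a striker",
        "Gunners look sharp. Arsenal to win?",
    ]
    print(count_mentions(sample))
```

The word boundaries keep short aliases from matching inside longer words, at the cost of missing misspellings or hashtags such as "#MUFC" unless they are added to the table.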
- Reddit Integration:
  - Monitors key subreddits for sports sentiment signals
  - Tracks team news, injuries, and lineups
  - Filters relevant sports discussions
- Signal Processing:
  - Team identification using pattern matching
  - Market data matching with current conditions
  - Signal validation and reliability scoring
- Market Analysis:
  - Real-time market data aggregation via API
  - Implied probability calculations
  - Expected value analysis
  - Risk assessment for market movements
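To make the implied probability and expected value steps concrete, here is a small worked sketch. It assumes the market data arrives as decimal prices (a price of 2.0 returns twice the stake), which this README does not state; the function names and the sample numbers are illustrative only.

```python
def implied_probability(decimal_price: float) -> float:
    """Implied probability of an outcome quoted at a decimal price."""
    if decimal_price <= 1.0:
        raise ValueError("decimal price must be greater than 1.0")
    return 1.0 / decimal_price


def normalize(probabilities):
    """Rescale implied probabilities so they sum to 1.

    Raw implied probabilities usually sum to more than 1 because the
    quoted prices include a margin.
    """
    total = sum(probabilities)
    return [p / total for p in probabilities]


def expected_value(estimated_probability: float,
                   decimal_price: float,
                   stake: float = 1.0) -> float:
    """Expected profit: win (price - 1) * stake with probability p,
    lose the stake otherwise. Simplifies to stake * (p * price - 1)."""
    return stake * (estimated_probability * decimal_price - 1.0)


if __name__ == "__main__":
    prices = [2.0, 3.5, 4.0]  # hypothetical home / draw / away prices
    raw = [implied_probability(p) for p in prices]
    print("raw:", raw, "sum:", sum(raw))
    print("normalized:", normalize(raw))
    print("EV at p=0.55, price=2.0:", expected_value(0.55, 2.0))
```

A positive expected value here only means the model's estimated probability exceeds the market's implied one; it says nothing about whether that estimate is well calibrated.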
- Python 3.8+
- Reddit API credentials (see below for setup)
- OpenAI API key
- Git LFS (for model files)
- Clone the repository:

      git clone https://github.com/asadadnan11/atheneo.git
      cd atheneo

- Install dependencies:

      pip install -r requirements.txt

- Set up environment variables in `.env`:

      REDDIT_CLIENT_ID=your_client_id
      REDDIT_CLIENT_SECRET=your_client_secret
      OPENAI_API_KEY=your_api_key

- Initialize the system:

      python reddit_harvester.py     # Start data collection
      python gpt_signal_matcher.py   # Process signals

- Launch the Streamlit app:

      streamlit run streamlit_app.py
- Go to https://www.reddit.com/prefs/apps
- Click "Create App" or "Create Another App"
- Select "script"
- Fill in the required information
- Copy the client ID and client secret to your `.env` file
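Before starting the harvester, it can help to confirm that all three credentials are actually set. The standalone check below is a sketch and not part of the repository: it reads a simple `KEY=value` file using only the standard library, and `parse_env_file` and `missing_keys` are hypothetical helper names. The project itself may load `.env` differently.

```python
import os
from pathlib import Path

# The three variables named in the setup instructions.
REQUIRED_KEYS = ("REDDIT_CLIENT_ID", "REDDIT_CLIENT_SECRET", "OPENAI_API_KEY")


def parse_env_file(text: str) -> dict:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values: dict) -> list:
    """Return the required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not values.get(key)]


if __name__ == "__main__":
    env_path = Path(".env")
    values = parse_env_file(env_path.read_text()) if env_path.exists() else {}
    # Variables already exported in the shell take precedence over the file.
    values.update({k: os.environ[k] for k in REQUIRED_KEYS if k in os.environ})
    missing = missing_keys(values)
    print("Missing:", ", ".join(missing) if missing else "none")
```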
- View key metrics and sentiment distribution
- Monitor active signals and confidence scores
- Track overall market sentiment
- Explore team-specific analysis
- View trending teams and market movement
- Analyze historical patterns
- See latest market sentiment recommendations
- View detailed match information
- Track confidence levels and reasoning
- `config.py`: Adjust sentiment metrics and thresholds
- `team_aliases.py`: Customize team identification patterns
- `.env`: Set up API credentials and environment variables
- `.streamlit/config.toml`: Customize Streamlit app appearance
- API Rate Limits:
  - Implement proper delays between Reddit API calls
  - Use caching for frequent requests
- Model Loading Errors:
  - Ensure Git LFS properly downloaded model files
  - Check model file paths in config
- Sentiment Analysis Issues:
  - Verify confidence thresholds in config
  - Check pattern matching rules
- Dashboard Performance:
  - Enable caching for heavy computations
  - Optimize data loading patterns
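For the rate-limit and caching advice above, one possible pattern is a throttling decorator combined with an in-memory cache. The sketch uses only the standard library; `throttled`, `fetch_subreddit`, `fetch_log` and the 0.05 second interval are placeholders for illustration, and the real Reddit request would replace the function body.

```python
import time
from functools import lru_cache, wraps


def throttled(min_interval_seconds: float):
    """Ensure at least min_interval_seconds between calls to the function."""
    def decorator(func):
        last_call = [None]  # monotonic timestamp of the previous call

        @wraps(func)
        def wrapper(*args, **kwargs):
            if last_call[0] is not None:
                wait = min_interval_seconds - (time.monotonic() - last_call[0])
                if wait > 0:
                    time.sleep(wait)
            last_call[0] = time.monotonic()
            return func(*args, **kwargs)

        return wrapper
    return decorator


fetch_log = []  # records which requests actually went out


@lru_cache(maxsize=128)   # repeated requests are served from memory
@throttled(0.05)          # cache misses are spaced at least 50 ms apart
def fetch_subreddit(name: str) -> str:
    # Placeholder for the real Reddit API call.
    fetch_log.append(name)
    return f"posts from r/{name}"


if __name__ == "__main__":
    for sub in ["soccer", "PremierLeague", "soccer"]:
        print(fetch_subreddit(sub))
    print("requests sent:", fetch_log)
```

Placing `lru_cache` outside `throttled` means cache hits return immediately and never pay the delay. A cache without expiry suits a short batch run; a long-running collector would need entries to expire so that new posts are picked up.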
- Fork the repository
- Create a feature branch
- Commit your changes
- Push to the branch
- Create a Pull Request
MIT License
- Implement ML-based team identification
- Add signal reliability scoring
- Develop market movement prediction
- Enhance sentiment analysis
- Optimize risk assessment
- Add strategy adaptation
For issues and feature requests, please use the GitHub issue tracker.