Determine the rooting interest of the person who tweeted.
Get the most frequently referenced team names, player names, etc.
Those are strong indicators of passion for a team.
Word associations
Lakers - Kobe Bryant
Lakers - Lakeshow
Lakers - lake show
Lakers - showtime
etc.
Some of these are basically synonyms, but that might not be worth exploring.
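As a rough sketch of how these word associations could be used, the snippet below maps each alias back to its team and counts how often a tweet mentions them. The alias table holds only the examples listed above, and the names `TEAM_ALIASES`, `normalize`, and `count_team_mentions` are made up for illustration. Hashtags and punctuation are stripped, so "#Lakeshow" and "lake show" both count toward the Lakers without any special synonym handling.

```python
import re
from collections import Counter

# Hypothetical alias table built from the examples above; a real one would be larger.
TEAM_ALIASES = {
    "Lakers": ["lakers", "kobe bryant", "lakeshow", "lake show", "showtime"],
}

def normalize(text):
    """Lowercase the text and collapse anything that is not a letter or digit into one space."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()

def count_team_mentions(tweet, aliases=TEAM_ALIASES):
    """Return a Counter mapping team -> number of whole-word alias matches in the tweet."""
    text = normalize(tweet)
    counts = Counter()
    for team, names in aliases.items():
        for name in names:
            pattern = r"\b" + re.escape(normalize(name)) + r"\b"
            hits = len(re.findall(pattern, text))
            if hits:
                counts[team] += hits
    return counts

mentions = count_team_mentions("Kobe Bryant is on fire! #Lakeshow")
```

Treating near-synonyms as separate aliases of the same team keeps the table simple, which fits the note that exploring synonyms further may not be worth it.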
---
Another way to get a list of good users is to take the followers of certain Twitter accounts like @NFL, @NBA, @Lakers, etc.
---
Algorithm:
For each team:
Find the keywords that are related to a team
Team name, acronym, nicknames
Also get the most commonly associated keywords
(which might include player names)
For the followers of each team (like @Lakers, @Clippers, etc.)
Find the followers who mention the keywords the most (maybe the top 100)
For each follower
Get a collection of tweets
Perhaps a certain number
Perhaps a collection of tweets around the time of a game event
See how many of the keywords associated with each team appear in their tweets
Have a confidence score for each team that this person might root for.
Collect the highest scoring individuals for a certain team
Manually read their Twitter feeds, determine if they are a fan of the team or not.
This generates a statistic: how many of the people our system found were actually fans of the given team?
Then, to compare:
From the followers of a certain team, randomly pick users.
Manually inspect these users, determine if they are a fan of the team they follow.
Get the same statistic for this set.
Compare statistics.
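The scoring and comparison steps above could be sketched roughly as follows. This is only one possible reading of the notes: the confidence score is assumed to be each team's share of all keyword hits in a user's tweets, the statistic is assumed to be the fraction of manually inspected users judged to be fans, and the function names and sample data are invented for illustration. Fetching tweets and followers from Twitter is left out entirely.

```python
from collections import Counter

def team_confidence(tweets, team_keywords):
    """Score each team by its share of all keyword hits across a user's tweets.

    Uses plain substring matching, which is a simplification.
    """
    hits = Counter()
    for tweet in tweets:
        text = tweet.lower()
        for team, keywords in team_keywords.items():
            hits[team] += sum(text.count(k.lower()) for k in keywords)
    total = sum(hits.values())
    if total == 0:
        # No keyword appeared at all, so there is no evidence for any team.
        return {team: 0.0 for team in team_keywords}
    return {team: hits[team] / total for team in team_keywords}

def fan_fraction(labels):
    """Fraction of manually inspected users who were judged to be real fans."""
    return sum(labels) / len(labels) if labels else 0.0

# Illustrative data only; the labels stand in for the manual inspection step.
keywords = {"Lakers": ["lakers", "kobe"], "Clippers": ["clippers"]}
scores = team_confidence(["Lakers win! Kobe was great", "Clippers lost"], keywords)
system_picks = fan_fraction([True, True, False, True])
random_picks = fan_fraction([True, False, False, True])
```

If `system_picks` comes out clearly higher than `random_picks` on real labels, that would suggest the keyword scoring finds fans better than simply sampling a team's followers.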