Delete Old Tweets Selectively Using Python and Tweepy
For some time I’ve used an online service to delete tweets that are more than one week old. I do this because I use Twitter for levity, for throwaway comments and retweets on issues of the day, and I don’t really want those saved for posterity. Thanks to search crawlers and caches I can never be certain that tweets are gone forever, but this is a small step in that direction.
When I joined Keybase I discovered that I needed to prevent my ‘proof’ tweet from being deleted, and the simple method used by the online deletion service was no longer an option. My solution uses an exception list containing the IDs of the tweets I wish to save, and these are ignored when their contemporaries are merged with the infinite.
I’ve written a Python script that uses Tweepy to scan the contents of my timeline and delete any tweet that meets two criteria: it is more than seven days old and it is not in my exception list. It’s very simple, and there are probably better ways of doing it (please let me know), but it works well for me as a nightly cron job.
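The two criteria can be sketched as a small predicate that is independent of Tweepy. This is only an illustration: the function name should_delete and the sample dates are my own, and the one real ID is the Keybase proof from the exception list in the full script.

```python
from datetime import datetime, timedelta


def should_delete(tweet_id, created_at, save_ids, cutoff_date):
    """Return True when a tweet is older than the cutoff and not on the save list."""
    return tweet_id not in save_ids and created_at < cutoff_date


# Illustrative values: pretend "now" is 10 March 2015 and keep seven days.
cutoff = datetime(2015, 3, 10) - timedelta(days=7)
save_ids = {573245340398170114}  # keybase proof

print(should_delete(1, datetime(2015, 3, 1), save_ids, cutoff))  # True: old, not saved
print(should_delete(573245340398170114, datetime(2015, 3, 1), save_ids, cutoff))  # False: saved
print(should_delete(2, datetime(2015, 3, 9), save_ids, cutoff))  # False: too recent
```

Keeping the test in one place like this makes it easy to check the rule against sample dates before letting anything delete real tweets.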
Please note that because I’ve been deleting my old tweets this way for some time, each run has little to remove, and I’ve never had issues with the Twitter API rate limits. Every deletion is an API call, so if you have many tweets you may need to limit the number returned at first by passing a count to the .items() method. This is demonstrated in the Tweepy cursor tutorial.
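Besides passing a count to .items(), another way to stay under the rate limits is to cap the number of deletion calls made in a single run. This is a different technique from the one mentioned above, shown here as a rough sketch that runs without Tweepy. The names delete_capped, FakeTweet and max_calls are hypothetical, and the destroy argument stands in for api.destroy_status.

```python
from collections import namedtuple

# Stand-in for a Tweepy status object; only the id field is used here.
FakeTweet = namedtuple("FakeTweet", ["id"])


def delete_capped(timeline, is_deletable, destroy, max_calls):
    """Call destroy() on deletable tweets, stopping after max_calls calls."""
    calls = 0
    for tweet in timeline:
        if calls >= max_calls:
            break
        if is_deletable(tweet):
            destroy(tweet.id)
            calls += 1
    return calls


deleted_ids = []
timeline = [FakeTweet(i) for i in range(1, 11)]
made = delete_capped(
    timeline,
    is_deletable=lambda t: t.id % 2 == 0,  # pretend even ids are old tweets
    destroy=deleted_ids.append,            # with Tweepy this would be api.destroy_status
    max_calls=3,
)
print(made, deleted_ids)  # 3 [2, 4, 6]
```

Run nightly, a cap like this would work through a large backlog a little at a time instead of all at once.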
To get the required authentication keys you will need to register a Twitter application.
Update
Since my initial post I’ve added functionality to unfavor (or ‘unfavorite’) tweets, too. I’ve included the full script below.
#!/usr/bin/env python
import tweepy
from datetime import datetime, timedelta

# options
test_mode = False
verbose = False
delete_tweets = True
delete_favs = True
days_to_keep = 7

tweets_to_save = [
    573245340398170114,  # keybase proof
    573395137637662721,  # a tweet to this very post
]
favs_to_save = [
    362469775730946048,  # tony this is icac
]

# auth and api
consumer_key = 'XXXXXXXX'
consumer_secret = 'XXXXXXXX'
access_token = 'XXXXXXXX'
access_token_secret = 'XXXXXXXX'
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth)

# set cutoff date, use utc to match twitter
cutoff_date = datetime.utcnow() - timedelta(days=days_to_keep)

# delete old tweets
if delete_tweets:
    # get all timeline tweets
    print("Retrieving timeline tweets")
    timeline = tweepy.Cursor(api.user_timeline).items()
    deletion_count = 0
    ignored_count = 0

    for tweet in timeline:
        # where tweets are not in save list and older than cutoff date
        if tweet.id not in tweets_to_save and tweet.created_at < cutoff_date:
            if verbose:
                print("Deleting %d: [%s] %s" % (tweet.id, tweet.created_at, tweet.text))
            # in test mode the tweet is counted but no API call is made
            if not test_mode:
                api.destroy_status(tweet.id)
            deletion_count += 1
        else:
            ignored_count += 1

    print("Deleted %d tweets, ignored %d" % (deletion_count, ignored_count))
else:
    print("Not deleting tweets")

# unfavor old favorites
if delete_favs:
    # get all favorites
    print("Retrieving favorite tweets")
    favorites = tweepy.Cursor(api.favorites).items()
    unfav_count = 0
    kept_count = 0

    for tweet in favorites:
        # where tweets are not in save list and older than cutoff date
        if tweet.id not in favs_to_save and tweet.created_at < cutoff_date:
            if verbose:
                print("Unfavoring %d: [%s] %s" % (tweet.id, tweet.created_at, tweet.text))
            # in test mode the favorite is counted but no API call is made
            if not test_mode:
                api.destroy_favorite(tweet.id)
            unfav_count += 1
        else:
            kept_count += 1

    print("Unfavored %d tweets, ignored %d" % (unfav_count, kept_count))
else:
    print("Not unfavoring tweets")
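
One thing to check before relying on the date comparison is whether created_at carries a timezone. The script compares it with a naive UTC datetime, and Python raises a TypeError when a naive datetime is compared with a timezone-aware one. I am assuming here that some Tweepy versions return aware values, which you should confirm against your installed version. The sketch below normalises both sides to aware UTC so that either form works. It needs Python 3, and the helper names as_utc and is_older_than are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def as_utc(dt):
    """Treat naive datetimes as UTC; convert aware ones to UTC."""
    if dt.tzinfo is None:
        return dt.replace(tzinfo=timezone.utc)
    return dt.astimezone(timezone.utc)


def is_older_than(created_at, days, now=None):
    """Return True when created_at is more than `days` days before `now`."""
    now = as_utc(now) if now is not None else datetime.now(timezone.utc)
    return as_utc(created_at) < now - timedelta(days=days)


now = datetime(2015, 3, 10, tzinfo=timezone.utc)
print(is_older_than(datetime(2015, 3, 1), 7, now))                       # True: naive input
print(is_older_than(datetime(2015, 3, 9, tzinfo=timezone.utc), 7, now))  # False: aware input
```

In the script, the condition tweet.created_at < cutoff_date could then become is_older_than(tweet.created_at, days_to_keep).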
Georges-Henry PORTEFAIT