Twitter's broken data model becomes slightly less broken

September 19, 2009 in Data, Internet

I don't usually have anything nice to say about Twitter (though I still ignore my mother's advice and say it anyway), but the company is finally taking steps to improve one of the most glaring faults with its service: retweets.

Previously, retweets were simply new tweets that happened to contain old information. This created clutter and noise; you may read my full thoughts here. Now, Twitter developer Marcel Molina admits:

One of the main confusions and criticisms about the retweet API was 
around what happens when a given tweet is retweeted multiple times. 
The explanation was that developers need to do their own retweet 
collapsing. If N people retweet a given tweet, you'd get N instances 
of that same tweet in the appropriate retweet timeline and the home 
timeline. You would then have to do your own internal book keeping 
about whether that tweet had already come in. If it hadn't you'd 
display it for the first time. If it had you'd update the already 
displayed tweet.

Asking developers to collapse retweets in timelines is onerous, 
complicated and confusing. We're not going to do it that way. We are 
going to add a resource that gives you all retweets for a given tweet. 
In timelines you will get only the first retweet. You can then request 
all retweets for that tweet at any time to get up to 100 retweets that 
have been created for it.
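
To see why the old approach was a burden, here is a rough sketch of the bookkeeping a client would have had to do for itself. It is only an illustration: the field names `id`, `user`, `text` and `retweet_of` are my own assumptions, not Twitter's actual payload.

```python
def collapse_retweets(timeline):
    """Collapse N retweets of one tweet into a single displayed entry.

    `timeline` is a list of dicts. The keys used here (`id`, `user`,
    `text`, `retweet_of`) are hypothetical, chosen for illustration.
    """
    collapsed = {}  # original tweet id -> displayed entry (dicts keep insertion order)
    for item in timeline:
        is_retweet = item.get("retweet_of") is not None
        original_id = item["retweet_of"] if is_retweet else item["id"]

        entry = collapsed.get(original_id)
        if entry is None:
            # First time this tweet comes in: display it.
            entry = {"id": original_id, "text": item["text"], "retweeted_by": []}
            collapsed[original_id] = entry
        elif not is_retweet:
            # The original arrived after one of its retweets: prefer its text.
            entry["text"] = item["text"]

        if is_retweet:
            # Already displayed (or just created): update the existing entry.
            entry["retweeted_by"].append(item["user"])
    return list(collapsed.values())


timeline = [
    {"id": 1, "user": "alice", "text": "hello", "retweet_of": None},
    {"id": 2, "user": "bob", "text": "RT hello", "retweet_of": 1},
    {"id": 3, "user": "carol", "text": "RT hello", "retweet_of": 1},
    {"id": 4, "user": "dave", "text": "other", "retweet_of": None},
]
result = collapse_retweets(timeline)
```

Every client had to carry some version of this loop, and each could get the edge cases wrong in its own way.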

So we are on our way toward a real data model! No longer will data be spuriously copied and clutter the Twittersphere. Now each piece of information will retain one form, with its retweets recorded as many instances of that same tweet. This is a mature approach and how it should have been handled in the first place. Bravo to the Twitter team: late is infinitely better than never.
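
For the curious, this is roughly what I mean by one form with many instances. It is a minimal sketch under my own assumptions, not Twitter's actual schema: the class names `Tweet`, `Retweet` and `Store` are hypothetical, and the only detail taken from the announcement is the cap of 100 retweets per request.

```python
from dataclasses import dataclass, field


@dataclass
class Tweet:
    id: int
    author: str
    text: str  # the content is stored exactly once, here


@dataclass
class Retweet:
    tweet_id: int  # a reference to the original tweet, not a copy of its text
    user: str


@dataclass
class Store:
    tweets: dict = field(default_factory=dict)
    retweets: list = field(default_factory=list)

    def post(self, tweet):
        self.tweets[tweet.id] = tweet

    def retweet(self, tweet_id, user):
        if tweet_id not in self.tweets:
            raise KeyError(tweet_id)
        self.retweets.append(Retweet(tweet_id, user))

    def retweets_of(self, tweet_id, limit=100):
        # Mirrors the announced resource: up to 100 retweets for a given tweet.
        matches = [r for r in self.retweets if r.tweet_id == tweet_id]
        return matches[:limit]


store = Store()
store.post(Tweet(id=1, author="alice", text="hello"))
store.retweet(1, "bob")
store.retweet(1, "carol")
retweeters = [r.user for r in store.retweets_of(1)]
```

The text of the tweet lives in one place, and a retweet is just a pointer to it, so nothing needs collapsing after the fact.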
