Following on from my last post about the significance of preserving access to Tweets, I’ve spent the evening experimenting with the framework I briefly touched on in that post. I wanted to take some time to discuss not only what I’ve done so far, but also some issues and questions that have arisen along the way.
Components
As indicated, the syndication framework I’ve set up incorporates WordPress as the blogging platform, FeedWordPress for aggregation and syndication, and a WordPress theme called P2, which is designed as “a group blog theme for short update messages.” Add in Akismet for spam protection and WordPress.com Stats, which “tracks views, post/page views, referrers, and clicks”, and that’s everything.
From a technical standpoint it’s a very, very basic WordPress framework; however, as I’ve discovered, it works amazingly well for the task I had set out to achieve.
I’ve christened the site “Tweets in Perpetuity” and it is available at: http://p2.techticker.net/
Syndication
In terms of setting up the syndication elements, I’m currently experimenting with a couple of use cases – the feed for my Twitter updates, as well as the search results for #CCK09 and #ECI831. The latter two feeds were important to test out, because I wasn’t sure exactly how WordPress would handle the fact the feed contained contributions from different authors.
As it turns out, the FeedWordPress settings give you control over a variety of different elements – including Authors, Posts, Categories and Tags.
For the Author setting, I’ve configured FeedWordPress to create a new WordPress account for Authors who haven’t been syndicated before. This makes it easier to filter out the contributions of different people so they’re not all lumped into one general category. From the standpoint of attribution and citing sources this is very important – I don’t own the content and do not wish to have it appear as such.
Following on this logic, I’ve configured the permalinks to point back to the source on Twitter, rather than the local instance syndicated in the WordPress blog.
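As a rough illustration of the attribution idea, here is a minimal Python sketch of how a syndicated entry might be mapped to a local post. It is not FeedWordPress's own code (FeedWordPress is a WordPress plugin and does this internally); the sample feed, the `to_post` helper and the field names are all assumptions made up for the example.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

# Hypothetical two-entry Atom feed standing in for a hashtag search feed.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <title>Reading for this week #cck09</title>
    <link rel="alternate" href="http://twitter.com/alice/statuses/1"/>
    <author><name>alice</name></author>
  </entry>
  <entry>
    <title>Notes from the session #cck09</title>
    <link rel="alternate" href="http://twitter.com/bob/statuses/2"/>
    <author><name>bob</name></author>
  </entry>
</feed>"""


def to_post(entry, known_authors):
    """Map one feed entry to a post dict, recording the author if unseen."""
    author = entry.findtext(f"{ATOM}author/{ATOM}name")
    known_authors.add(author)  # a set, so repeat authors are not duplicated
    link = entry.find(f"{ATOM}link").get("href")
    return {
        "author": author,
        "content": entry.findtext(f"{ATOM}title"),
        "permalink": link,  # point back to the source, not the local copy
    }


authors = set()
root = ET.fromstring(SAMPLE_FEED)
posts = [to_post(e, authors) for e in root.findall(f"{ATOM}entry")]
for post in posts:
    print(post["author"], "->", post["permalink"])
```

The point of the sketch is simply that each post keeps its original author and a link to the original Tweet, so the archive never presents someone else's content as the site owner's.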
Categorising the Feeds
Due to the fact I currently have 3 different feed sources coming into the blog, it made sense to provide additional organisational structures to differentiate one stream from the others. To accommodate this I’ve set FeedWordPress to automatically assign each post in a feed to a specific category as soon as it’s syndicated and have used a term that is clearly related to the feed – in this case @mbogle, #cck09 and #eci831 respectively.
So now, using the category links in the right menu, you’re able to quickly filter the updates to one stream.
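The per-feed category rule can be sketched in a few lines of Python. This is only an illustration of the logic, not how FeedWordPress stores its settings; the feed identifiers, the `categorise` and `filter_by_category` helpers and the sample posts are hypothetical.

```python
# Hypothetical mapping from feed source to the category assigned on arrival.
FEED_CATEGORIES = {
    "user:mbogle": "@mbogle",
    "search:#CCK09": "#cck09",
    "search:#ECI831": "#eci831",
}


def categorise(post, feed_id):
    """Return a copy of the post tagged with its feed's category."""
    return {**post, "category": FEED_CATEGORIES[feed_id]}


def filter_by_category(posts, category):
    """Keep only the posts from a single stream."""
    return [p for p in posts if p["category"] == category]


incoming = [
    ({"content": "Setting up FeedWordPress tonight"}, "user:mbogle"),
    ({"content": "Week 3 readings are up #CCK09"}, "search:#CCK09"),
    ({"content": "Live session starting #ECI831"}, "search:#ECI831"),
]
archive = [categorise(post, feed_id) for post, feed_id in incoming]
cck09_only = filter_by_category(archive, "#cck09")
print(cck09_only)
```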
Search Indexing
One benefit of syndicating through WordPress that I’d failed to remember is the fact that, once the content has been pulled in and syndicated, it is also indexed and therefore searchable. So now you’re able to take advantage of WordPress’s awesome full-text indexing and locate information far more easily than you can with the native Twitter search options.
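To show why a local copy makes searching easier, here is a deliberately naive Python sketch of keyword search over archived posts. It uses plain case-insensitive substring matching and makes no claim about how WordPress implements its own search; the `search` helper and the sample archive are invented for the example.

```python
def search(posts, query):
    """Return posts containing every word of the query, ignoring case."""
    terms = query.lower().split()
    return [p for p in posts if all(t in p["content"].lower() for t in terms)]


archive = [
    {"author": "alice", "content": "Connectivism reading list for #cck09"},
    {"author": "bob", "content": "Great session today #eci831"},
    {"author": "carol", "content": "My reading notes on networks #cck09"},
]
hits = search(archive, "reading #cck09")
print([p["author"] for p in hits])
```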
Comment Fracturing
One factor that I’m still grappling with is how to mitigate comment fracturing, and ensure that any discussion that takes place on the WordPress blog is channeled back into Twitter and made available to the author of the content.
One of the settings in FeedWordPress concerns whether to allow commenting or not. In the case of syndicated blogs it’s best to deactivate commenting and refer people back to the point of origin to engage in discussion there.
At the moment I have the commenting option activated, and people are allowed to post replies on the blog – however I’m in two minds about whether that should remain the case or not.
I spent quite a while trying to locate a plugin for WordPress that would enable people posting comments to also send their comments to Twitter, but the one that looked promising does not appear to be compatible with the P2 theme I’m using.
The other option is to try incorporating a commenting framework such as Disqus, which includes this option as well – but again, I’m not sure whether P2 would be compatible or not.
Implications of Copyright on Syndication
Much more significant a topic than comment fracturing though is that of copyright, and the implications of copyright law on syndication frameworks such as this one. I am not really all that cognizant of what the legal implications are of syndicating other people’s Tweets, and need to investigate this further.
For example, I’m uncertain about where fair use comes into play versus the need to seek approval prior to replication or reproduction of content.
If it is true that the moment a piece of content is created, it’s immediately copyrighted by the author, then theoretically I am violating the copyright of everyone I’ve syndicated so far who has not explicitly released their material under an open license.
For the moment what I’m going to do is include a note on the site providing my email address and requesting that anyone who does not want their content reproduced contact me, and I will ensure their material is removed from the site and not syndicated again in the future.
Recap and Looking Forward
All in all I’m quite pleased with this little experiment so far. In only a few hours I managed to set up a syndication framework for preserving and indexing content that would otherwise have become harder and harder to locate, and indeed start to establish a more efficient means of filtering and organising the data.
The fact the framework regularly checks for new content means the system is largely automated, and therefore appears to require a relatively low degree of maintenance.
That said there are issues to contend with, in the form of legal questions of copyright versus reuse, comment fracturing and preservation of existing conversational threads – and it may turn out there are additional issues (or benefits) that I have not yet considered. It is my hope these topics will become clearer the more I contemplate and experiment.

Very interesting work, Mike. I don't suppose you're thinking of offering this as an archiving service more broadly, are you? I'd love to volunteer as an alpha tester if so.
All my tweets are CC licensed: http://tweetcc.com/results/?username=edwebb
Hi Ed,
I'm not so sure about releasing this widely as a service necessarily, however I am quite happy to add you to the list – in fact I've just begun to aggregate and syndicate your updates.
I think my preference would be to err on the side of caution in the short term, while I assess how well things are working. So limiting the number of feeds coming through (at least initially) seems like the safest plan.
That said I'm happy to assist/advise others on how they can set up similar systems if they desire and will continue to document the process I'm working through – including issues or pleasant surprises. It's not very difficult and yet lets you retain control over your data (whether you choose to release it under open licenses or not).
One of the main considerations at this stage is load and capacity. I have no idea what sort of strain syndication will exert on the system – or what thresholds may exist after which things start to have trouble. I seem to recall that one of my colleagues at UNSW ran into problems syndicating the feeds of many blogs, which ultimately started bringing down the server. I don't expect it would happen in this instance – because he was pulling in several hundred blogs-worth of data – however it is a possibility to bear in mind.
I've also just discovered that the feeds from individual users are being handled differently from those of hashtags. @ references and weblinks are being formatted as clickable links in the instance of hashtags, but not for individual users – and I have no idea why that's the case.
Realistically there's a fair amount of work and investigation I want to do to ensure the system is reliable and effective. So as long as you're happy for things to be occasionally unpredictable you're more than welcome aboard.
I’ve investigated this issue of clickable links further and it turns out the differences are visible in the feeds themselves. When the RSS feed is rendered in a browser, the @ replies and links are clickable in the feed coming through the search results for the hashtag, but not in the feed that exists for each user. So barring another RSS source for individual Twitter users or a change from the Twitter developers, there may not be a way to fix this.
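One possible workaround, offered only as an untested idea and not something set up on the site, would be to post-process the plain-text entries after they arrive. The Python sketch below shows the general technique; the `linkify` helper, the regular expression and the `http://twitter.com/<user>` profile URL pattern are assumptions for illustration, and the pattern is simplistic (for instance, trailing punctuation after a URL would be swallowed into the link).

```python
import html
import re

# URL first, then @mention; a single pass avoids re-linking text inside URLs.
# The lookbehind stops the "@" in an email address being treated as a mention.
TOKEN = re.compile(r"(?P<url>https?://[^\s<]+)|(?<!\w)@(?P<user>\w+)")


def linkify(text):
    """Wrap URLs and @mentions in a plain-text tweet with anchor tags."""
    def repl(match):
        if match.group("url"):
            url = match.group("url")
            return f'<a href="{url}">{url}</a>'
        user = match.group("user")
        # Assumed profile URL pattern, used here purely for illustration.
        return f'<a href="http://twitter.com/{user}">@{user}</a>'

    # Escape first so any markup in the tweet text is rendered inert.
    return TOKEN.sub(repl, html.escape(text))


print(linkify("@edwebb see http://p2.techticker.net/ for the archive"))
```

Whether something like this could be hooked into the syndication step is a separate question that would need checking against what FeedWordPress and the P2 theme actually allow.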
Pingback: I’d like to borrow this Tweet from the library, please | TechTicker