Saving wiki pages from spammers: how to Revert to a previous Revision - Zer Netmouse
November 14th, 2011
06:47 pm


I wrote a brief article last month for issue 300 of The Drink Tank (which isn't out yet -- Edited to add: here's that article) on how SF topics on Wikipedia and other SF-related, MediaWiki-based wikis could really use the attention of more fans. The non-Wikipedia wikis especially need help right now because they are under attack by spammers. If a larger pool of people gave them a little bit of attention, it could help a lot. This week I was working on the Carl Brandon Society wiki. For the society mailing list, I wrote up a bit on how to recover past content that has been replaced by spam, and I thought I'd post it here as well:

Retrieving real content that's been deleted by spammers is always possible on a MediaWiki wiki, because the software tracks the complete history of edits on each page. Basically, if you find yourself looking at a page that shows only nonsense where you would have expected real content, click on the "History" tab and look for the last edit by someone who was really working on the article, not a vandal. Then click into the version of the article you want to return to, hit "edit" for that version, and save. That will overwrite the current version of the article with the old version.

Alternatively, if only a single edit was vandalism, you can click "undo", a link at the far right of that entry in the page's history list, then save the page.
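To make the "nothing is ever really lost" point concrete, here is a minimal toy model of a page history in Python. It is only a sketch of the idea, not how MediaWiki is actually implemented; the `Page` and `Revision` classes and the `revert_to` method are names I made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Revision:
    text: str
    summary: str


@dataclass
class Page:
    # Oldest revision first, newest last (hypothetical toy model).
    revisions: list = field(default_factory=list)

    def save(self, text, summary):
        # Every save appends a new revision; older ones are never removed.
        self.revisions.append(Revision(text, summary))

    def current(self):
        return self.revisions[-1].text

    def revert_to(self, index, summary):
        # "Edit an old version and save": the old text becomes the newest
        # revision, and the spam revision stays in the history.
        self.save(self.revisions[index].text, summary)


page = Page()
page.save("Real article text", "initial version")
page.save("buy cheap pills", "yKtdjhtRb")
page.revert_to(0, "reverting to last complete version")
print(page.current())
print(len(page.revisions))
```

After the revert the page shows the real text again, and the history holds three revisions, including the spam one, so anyone can still see what happened.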

Interpreting the history list is easier when real people register recognizable usernames and always log in to edit. It also helps that spambots often leave an edit summary that's complete gibberish, like yKtdjhtRb. Writing a detailed Summary comment about the changes you made whenever you edit also makes it easier to tell real edits from vandalism. If you're not sure about an entry in the history list, though, you can easily compare it to the previous entry. Just select the radio buttons for the two entries you want to compare, then hit the "compare selected revisions" button at the top to see the changes highlighted.
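If you wanted a script to do a first pass over edit summaries, the gibberish ones could be guessed at with a rough heuristic like the sketch below. The function name and the thresholds (at least 8 letters, under 30% vowels, two or more lower-to-upper case flips) are my own assumptions, tuned only against the few examples in this post, so expect both misses and false alarms.

```python
import re

VOWELS = set("aeiouAEIOU")


def looks_like_gibberish(summary):
    """Rough guess at whether an edit summary is spambot noise."""
    text = summary.strip()
    # Real summaries usually contain spaces or punctuation; the spam
    # summaries quoted here are one unbroken run of letters.
    if not re.fullmatch(r"[A-Za-z]{8,}", text):
        return False
    vowel_ratio = sum(ch in VOWELS for ch in text) / len(text)
    # Count places where a lowercase letter is followed by an uppercase one.
    case_flips = sum(
        1 for a, b in zip(text, text[1:]) if a.islower() and b.isupper()
    )
    return vowel_ratio < 0.3 or case_flips >= 2


for summary in ["yKtdjhtRb", "reverting to last complete version"]:
    print(summary, looks_like_gibberish(summary))
```

A check like this only narrows down which entries to look at. A spammer who leaves a plausible-looking summary, or none at all, sails right past it, so comparing revisions by hand is still the real test.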

For instance, yesterday I noticed the K. Tempest Bradford article in the Carl Brandon wiki had effectively been erased by vandals. When I looked at the history, it looked like this (plus two columns of radio buttons, which run between (prev) and the date/time stamp):
(cur) (prev)  14:55, 13 November 2011 (Talk) (46 bytes) (rRHjrefbzPRVc) (undo)
(cur) (prev)  08:04, 13 November 2011 (Talk) (59 bytes) (PzlFqAZYKGfVNrcgy) (undo)
(cur) (prev)  07:40, 13 November 2011 (Talk) (2,184 bytes) (→Editorial Positions) (undo)
(cur) (prev)  08:07, 29 October 2011 (Talk) (2,335 bytes) (→Projects) (undo)
(cur) (prev)  05:17, 18 July 2011 (Talk) (2,427 bytes) (undo)
(cur) (prev)  17:06, 9 October 2009 WendyDawson (Talk | contribs) (2,424 bytes) (→External Sources) (undo)
(cur) (prev)  09:12, 14 September 2009 Zeborah (Talk | contribs) (2,268 bytes) (→Works: link to Enmity) (undo)
(cur) (prev)  18:27, 27 July 2009 Sparkymonster (Talk | contribs) m (2,217 bytes) (undo)

The changes made that day were obviously spam, but I wasn't sure about the previous two. Using the "compare selected revisions" function I determined that the July edit was just a category change, but everything after it was spam. So I clicked on the date, 05:17, 18 July 2011, then clicked the edit tab. I saw a warning: "Warning: You are editing an out-of-date revision of this page. If you save it, any changes made since this revision will be lost." Perfect. The changes made since that revision were all spam. I entered "reverting to last complete version" in the Summary comment box at the bottom, checked "Watch this page", and hit Save Page.
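The byte counts in that history list are a useful clue too. As a sketch, the Python below parses the listing as pasted above and flags any edit that left the page at less than half its previous size. The regular expression, the function names, and the "less than half" threshold are assumptions of mine, not anything MediaWiki provides.

```python
import re

HISTORY = """\
(cur) (prev)  14:55, 13 November 2011 (Talk) (46 bytes) (rRHjrefbzPRVc) (undo)
(cur) (prev)  08:04, 13 November 2011 (Talk) (59 bytes) (PzlFqAZYKGfVNrcgy) (undo)
(cur) (prev)  07:40, 13 November 2011 (Talk) (2,184 bytes) (→Editorial Positions) (undo)
(cur) (prev)  08:07, 29 October 2011 (Talk) (2,335 bytes) (→Projects) (undo)
(cur) (prev)  05:17, 18 July 2011 (Talk) (2,427 bytes) (undo)
(cur) (prev)  17:06, 9 October 2009 WendyDawson (Talk | contribs) (2,424 bytes) (→External Sources) (undo)
(cur) (prev)  09:12, 14 September 2009 Zeborah (Talk | contribs) (2,268 bytes) (→Works: link to Enmity) (undo)
(cur) (prev)  18:27, 27 July 2009 Sparkymonster (Talk | contribs) m (2,217 bytes) (undo)
"""

# Timestamp such as "14:55, 13 November 2011", then the "(N bytes)" field.
ENTRY = re.compile(r"(\d{2}:\d{2}, \d{1,2} \w+ \d{4}).*?\(([\d,]+) bytes\)")


def parse_history(listing):
    """Return (timestamp, size_in_bytes) pairs, newest first."""
    entries = []
    for line in listing.splitlines():
        match = ENTRY.search(line)
        if match:
            size = int(match.group(2).replace(",", ""))
            entries.append((match.group(1), size))
    return entries


def sudden_shrinks(entries):
    """Timestamps of edits that left the page under half its prior size."""
    flagged = []
    for (stamp, size), (_, older_size) in zip(entries, entries[1:]):
        if size * 2 < older_size:
            flagged.append(stamp)
    return flagged


entries = parse_history(HISTORY)
flagged = sudden_shrinks(entries)
print(flagged)
```

On this listing it prints `['08:04, 13 November 2011']`, the edit that took the page from 2,184 bytes down to 59. It does not catch the 07:40 and 29 October edits, which only trimmed the page a little, so the size check finds the blanking but comparing revisions is still what tells you how far back to go.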

It takes a little bit of time, but it's pretty easy. If more people could watch and patrol pages, wikis like the Carl Brandon Wiki, the SF Editors Wiki and the SF Artists Wiki would have a much better chance of survival. I think these could be valuable resources for the community, so I hope people will.

(5 comments)

Date:November 15th, 2011 03:18 am (UTC)
I'll add to this that there are ways to configure MediaWiki that can help to control spam. Some of these include:
- Requiring CAPTCHAs for certain operations such as anonymous edits or new account creation.
- Limiting the creation of new pages to users with accounts older than a certain age.
- Blocking anonymous edits.

This page contains information on some such measures.
Date:November 15th, 2011 03:29 am (UTC)
MediaWiki, the software that these wikis run, has a bunch of extensions available to help prevent and manage spam and vandalism, as nicegeek pointed out. And there are people who have rolled their own solutions as well -- check out https://www.noisebridge.net/wiki/Secretaribot . And it really helps to have upgraded to the latest version of MediaWiki, which right now is 1.17.0. If you have trouble, please visit the #mediawiki channel on the freenode IRC network -- this page explains how:

Date:November 15th, 2011 03:39 am (UTC)
Sadly, when we upgraded sfartistwatch.com to the newest version of MediaWiki, it broke. It looks OK at first, aside from the fact that the reference to the logo we had was destroyed in the upgrade, but then if you log in, the formatting gets messed up and the tabs and sidebar go away. We still haven't figured that one out.

This sort of thing is not encouraging so far as upgrading regularly goes.
Date:November 15th, 2011 01:23 pm (UTC)
Argh! That is very discouraging! Please email me at sumanah at wikimedia dot org with a description of what happened -- what the old version was, what operating system and database and version of PHP and server you are using, whether you had a custom skin, etc. -- and a list of the problems you're facing. I'll get it to the mediawiki-l list https://lists.wikimedia.org/mailman/listinfo/mediawiki-l and get you some help, and maybe we can file appropriate bug reports and make things better.
Date:November 15th, 2011 03:35 am (UTC)
Unfortunately, some of the spammers create new user accounts and then come back and use them later, but these measures do get rid of a lot.

The Carl Brandon site has one of the security methods that requires simple math to add links, which may be why recently one of the primary modes of attack it is suffering is the "replace content with a stupid comment" attack, which seems so pointlessly destructive to me that I really wish I had the skills to hunt those people down and kick them off the internet permanently.