Preventing and Cleaning Up Spam

If you run a public-facing Confluence site, it may be targeted by spammers.

Stopping Spammers

To prevent spammers:

  1. Enable Captcha. See Configuring Captcha for Spam Prevention.
  2. Run Confluence behind an Apache webserver and create rules to block the spammer's IP address.

Blocking Spam at Apache or System Level

If a spam bot is attacking your Confluence site, it is probably coming from a single IP address or a small range of IP addresses. To find the attacker's IP address, follow the Apache access log in real time and filter for a page that is being attacked.

For example, if the spammers are creating users, you can look for signup.action:

$ tail -f confluence.atlassian.com.log | grep signup.action
1.2.3.4 - - [13/Jan/2010:00:14:51 -0600] "GET /signup.action HTTP/1.1" 200 9956 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)" 37750

Compare the actual spam users being created with the log entries to make sure you do not block legitimate users. By default, Apache logs the client's IP address in the first field of the log line.
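
When the requests come from several addresses, a per-address count can be easier to read than a live tail. The following is a minimal sketch, assuming the default log format with the client IP in the first field. The function name count_hits_by_ip is hypothetical, and the log file name in the usage comment is the one from the example above.

```shell
# Count requests per client IP for a given page, busiest first.
# Assumes the client IP is the first whitespace-separated field.
count_hits_by_ip() {
    log_file="$1"   # path to the Apache access log
    page="$2"       # fixed string to look for, e.g. signup.action
    grep -F -- "$page" "$log_file" | awk '{print $1}' | sort | uniq -c | sort -rn
}

# Example: count_hits_by_ip confluence.atlassian.com.log signup.action
```

Each output line is a request count followed by an IP address. An address with a very high count is a candidate for blocking, but it should still be checked against the spam accounts that were actually created.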

Once you have the offender's IP address or IP range, you can add it to your firewall's blacklist. For example, using the popular Shorewall firewall for Linux you can simply do this:

# echo "1.2.3.4" >> /etc/shorewall/blacklist
# /etc/init.d/shorewall reload

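To block an IP address at the Apache level, add this line to your Apache vhost config (this is the Apache 2.2 access-control syntax; Apache 2.4 still accepts it through mod_access_compat):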

Deny from 1.2.3.4

After saving the change, restart Apache with the "graceful" command, which applies the new configuration without dropping any current sessions.
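
As a sketch of that restart, assuming the apachectl control script is on the PATH (on some distributions it is named apache2ctl), the configuration can be checked first so that a typo in the vhost file does not take the site down. The function name reload_apache_gracefully and the APACHECTL override variable are hypothetical.

```shell
# Check the configuration, then restart Apache gracefully.
# APACHECTL is a hypothetical override for systems where the
# control script has another name (for example apache2ctl).
reload_apache_gracefully() {
    ctl="${APACHECTL:-apachectl}"
    # configtest exits non-zero on a syntax error, so the
    # graceful restart only runs if the config parses.
    "$ctl" configtest && "$ctl" graceful
}

# Example (as root): reload_apache_gracefully
```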

If this still does not stop the spam, then consider turning off public signup.

Deleting Spam

Profile Spam

By 'profile spam', we mean spammers who create accounts on Confluence and post links to their profile page.

If you have had many such spam profiles created, it is easier to delete them via SQL, as described below.

To delete a spam profile:

  1. Shut down Confluence and back up your database. Note: This step is essential before you run any SQL commands on your Confluence database.
  2. Find the last real profile:

    SELECT bodycontentid,body FROM bodycontent WHERE contentid IN 
      (SELECT contentid FROM content WHERE contenttype='USERINFO') 
      ORDER BY bodycontentid DESC; 
    
  3. Look through the bodies of the profile pages until you find where the spammer starts. You may have to identify a number of ranges.
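  4. Create the killset, a temporary table listing the spam profiles in a range you identified, then delete the matching rows. Replace BOTTOM_OF_SPAM_RANGE and TOP_OF_SPAM_RANGE with the bodycontentid values that bound the spam, and run all the statements in the same database session, because the temporary table exists only for that session. Repeat for each range: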

    CREATE TEMP TABLE killset AS SELECT bc.bodycontentid,c.contentid,c.username FROM
      bodycontent bc JOIN content c ON bc.contentid=c.contentid WHERE
      bc.bodycontentid >= BOTTOM_OF_SPAM_RANGE AND bc.bodycontentid <= TOP_OF_SPAM_RANGE
      AND c.contenttype='USERINFO';
    
    DELETE FROM bodycontent WHERE bodycontentid IN (SELECT bodycontentid FROM killset);
    
    DELETE FROM links WHERE contentid IN (SELECT contentid FROM killset);
    
    DELETE FROM content WHERE prevver IN (SELECT contentid FROM killset);
    
    DELETE FROM attachments WHERE pageid IN (SELECT contentid FROM killset);
    
    DELETE FROM content WHERE contentid IN (SELECT contentid FROM killset);
    
    DELETE FROM os_user_group WHERE user_id IN (SELECT id FROM killset k JOIN os_user o ON o.username=k.username);
    
    DELETE FROM os_user WHERE username IN (SELECT username FROM killset);
    
  5. Once the spam has been deleted, restart Confluence and rebuild the index. This will remove any references to the spam from the search index.

Notes

  • See CONF-1469. Your comments on that issue are very much appreciated.
