Non-Bayesian filtering may be good for known spam, but it falls short when:

  1. Updates can't be accessed, meaning that your filter's reference data isn't current.
  2. New spam is encountered that hasn't yet reached the centralised reference - will your filter be able to identify it as spam?
  3. Spam contains random content, making each occurrence different from any other - this may get around filters that rely on unique 'fingerprints' of spam.
  4. Good email is falsely identified as spam because the filters can't be personalised for every individual using them - this means that the reference dataset needs to be as large and encompassing as possible.
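
To illustrate the third weakness, here is a minimal sketch of an exact-match 'fingerprint' filter. It assumes the fingerprint is simply a SHA-256 hash of the message text; the names fingerprint, known_spam and is_known_spam are made up for this example, and real fingerprinting products may well use more tolerant schemes.

```python
import hashlib


def fingerprint(message: str) -> str:
    """Return a SHA-256 digest of the message text (assumed fingerprint scheme)."""
    return hashlib.sha256(message.encode("utf-8")).hexdigest()


# Hypothetical centralised reference: fingerprints of spam already reported.
known_spam = {fingerprint("Buy cheap watches now!")}


def is_known_spam(message: str) -> bool:
    """True only if this exact message has been fingerprinted before."""
    return fingerprint(message) in known_spam


# An exact copy of the known spam matches the reference.
print(is_known_spam("Buy cheap watches now!"))
# A few random characters change the hash, so the same spam no longer matches.
print(is_known_spam("Buy cheap watches now! xq7zt"))
```

Because any change to the text produces a different hash, a spammer only needs to append a few random characters to each copy to avoid an exact-match reference of this kind.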

Bayesian filtering steps in to tackle these weaknesses by:

  1. Removing the need for program updates except for bug fixes and extra functionality - the filters automatically update themselves.
  2. Removing the need for a centralised reference source - filters are tuned to filter the spam that individual users receive.
  3. Scoring email by statistical analysis of email received by individual users.
  4. Being able to recognise the characteristics that make up 'good email' as well as spam.

Adaptable

Bayesian filtering works by adapting its filters to an individual user. It does this by looking at the email you have already received and classed as either good email or spam. By analysing these two groups of emails, a Bayesian filter can assign scores to content and use this scoring data to analyse any new email that is put through the filter. Because this scoring data is vital for classifying emails, an 'untrained' filter that has not yet built it up can make some incorrect decisions when it is first put into use, i.e. good email identified as spam and spam identified as good email (false positives and false negatives respectively).
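
The scoring described above can be sketched roughly as follows. This is only an illustration, assuming a simple naive Bayes calculation with add-one smoothing over the words in each email; the class BayesFilter, its method names and the sample emails are invented for the example, and actual products may score content quite differently.

```python
import math
import re
from collections import Counter


def tokens(text):
    """Split an email into a set of lower-case words (assumed tokenisation)."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


class BayesFilter:
    def __init__(self):
        # Per group: how many emails contained each word, and how many emails in total.
        self.counts = {"good": Counter(), "spam": Counter()}
        self.totals = {"good": 0, "spam": 0}

    def train(self, text, label):
        """Record an email the user has classed as 'good' or 'spam'."""
        self.counts[label].update(tokens(text))
        self.totals[label] += 1

    def spam_probability(self, text):
        """Combine per-word evidence into a score between 0 (good) and 1 (spam)."""
        n_good, n_spam = self.totals["good"], self.totals["spam"]
        log_odds = math.log((n_spam + 1) / (n_good + 1))
        for tok in tokens(text):
            # Add-one smoothing so unseen words never give a zero probability.
            p_spam = (self.counts["spam"][tok] + 1) / (n_spam + 2)
            p_good = (self.counts["good"][tok] + 1) / (n_good + 2)
            log_odds += math.log(p_spam / p_good)
        log_odds = max(-50.0, min(50.0, log_odds))  # keep exp() in a safe range
        return 1 / (1 + math.exp(-log_odds))


f = BayesFilter()
print(f.spam_probability("cheap pills"))  # untrained: no evidence either way

f.train("win cheap pills now", "spam")
f.train("cheap watches win prize", "spam")
f.train("meeting agenda for monday", "good")
f.train("lunch on monday?", "good")
print(f.spam_probability("cheap pills"))     # leans towards spam
print(f.spam_probability("monday meeting"))  # leans towards good email
```

Before any training the sketch has no evidence and returns a neutral score for every email, which is why an untrained filter tends to make mistakes until it has seen enough of your good email and spam.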

Training a Bayesian filter is often just a case of clicking a button to swap the group in which an email has been placed. Some Bayesian filters are supplied pre-trained and so will be effective out of the box.
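
What such a button might do behind the scenes can be sketched as below. This assumes the filter keeps simple per-group word counts; TrainingStore and its methods are hypothetical names used for illustration, not the interface of any particular product.

```python
from collections import Counter


class TrainingStore:
    """Hypothetical store of word counts for the 'good' and 'spam' groups."""

    def __init__(self):
        self.counts = {"good": Counter(), "spam": Counter()}
        self.totals = {"good": 0, "spam": 0}

    def add(self, text, label):
        """Count the email's words towards the given group."""
        self.counts[label].update(set(text.lower().split()))
        self.totals[label] += 1

    def remove(self, text, label):
        """Undo an earlier add() for the same email and group."""
        self.counts[label].subtract(set(text.lower().split()))
        self.counts[label] += Counter()  # drop words whose count fell to zero
        self.totals[label] -= 1

    def swap(self, text, current_label):
        """Move an email to the other group and return the new group name."""
        other = "good" if current_label == "spam" else "spam"
        self.remove(text, current_label)
        self.add(text, other)
        return other


store = TrainingStore()
store.add("project update attached", "spam")            # filter guessed wrongly
new_group = store.swap("project update attached", "spam")  # user clicks the button
print(new_group, store.totals)
```

After the swap, the words from the misfiled email count as evidence for good email rather than spam, so similar messages are less likely to be misclassified in future.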