How it works

As an example of how Bayesian filters work, let's say that we have two well-established groups of several hundred emails each. One group contains all of our emails that have been identified as good (either by the filter or by us in the training process); the other contains all of our emails that have been identified as spam (again, either by the filter or by us in the training process).

Let's assume that the word 'diploma' appears only in the bad email group and never in the good email group. Let's also assume that the word 'afternoon' occurs only in the good email group. The Bayesian filter could then surmise that any email containing the word 'diploma' is very likely to be spam and assign it a high score of almost 1 (100% probability of being spam). Likewise, the probability of any email containing the word 'afternoon' being spam would be quite low, and so an appropriately low score of almost 0 (0% probability of being spam) would be assigned.
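
The scoring just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular filter is implemented: the function name, the sample emails, the equal weighting of the two groups and the 0.01/0.99 limits (which produce the "almost 0" and "almost 1" scores) are all hypothetical choices made for this example.

```python
def word_spam_probability(word, spam_emails, good_emails, floor=0.01, ceiling=0.99):
    """Estimate how likely an email containing `word` is to be spam.

    Assumes both groups are non-empty and weighs them equally.
    """
    # Fraction of emails in each group that contain the word.
    spam_rate = sum(word in email.lower().split() for email in spam_emails) / len(spam_emails)
    good_rate = sum(word in email.lower().split() for email in good_emails) / len(good_emails)

    if spam_rate + good_rate == 0:
        return 0.5  # word never seen before: no evidence either way

    score = spam_rate / (spam_rate + good_rate)
    # Clamp the score so a single word is never treated as absolute proof.
    return min(ceiling, max(floor, score))


# Hypothetical training groups, far smaller than the several hundred emails in the text.
spam_emails = ["get your diploma today", "cheap diploma online"]
good_emails = ["meeting this afternoon", "see you tomorrow afternoon"]

diploma_score = word_spam_probability("diploma", spam_emails, good_emails)
afternoon_score = word_spam_probability("afternoon", spam_emails, good_emails)
unseen_score = word_spam_probability("invoice", spam_emails, good_emails)
```

With these sample groups, 'diploma' is clamped to the upper limit, 'afternoon' to the lower limit, and a word that appears in neither group receives the neutral score of 0.5.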

This is a very simple example to illustrate a point. In practice, a filter takes many other characteristics into account before coming to a final decision, for example good content scores as well as spam content scores, information contained in the email headers, etc.
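
As a rough sketch of how several individual scores might be merged into one decision, the Python snippet below uses the common "naive" Bayesian combination, which assumes the pieces of evidence are independent of one another. The function name and the sample scores are hypothetical, and a real filter may weigh its evidence differently.

```python
import math


def combine_scores(scores):
    """Combine individual spam scores (each strictly between 0 and 1) into one.

    Computes prod(p) / (prod(p) + prod(1 - p)) in log space so that a long
    list of scores does not underflow to zero.
    """
    log_spam = sum(math.log(p) for p in scores)
    log_good = sum(math.log(1.0 - p) for p in scores)
    x = log_spam - log_good

    # Numerically stable logistic function of x.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


# Hypothetical scores for the words and header features of three emails.
mixed_email = combine_scores([0.99, 0.01])     # one strong spam sign, one strong good sign
spammy_email = combine_scores([0.99, 0.9])     # two spam signs reinforce each other
friendly_email = combine_scores([0.01, 0.2])   # two good signs reinforce each other
```

Evidence pointing in opposite directions cancels out, leaving a score near 0.5, while evidence pointing the same way pushes the combined score closer to 0 or 1 than any single score would.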

Any time the filter arrives at an incorrect decision, the user can make a manual correction and the filter automatically adjusts itself accordingly. Therein lies one of the biggest benefits of Bayesian filtering: being able to easily adapt the filter according to your own experience.
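
The effect of a manual correction can be illustrated with a small Python sketch. It assumes the filter simply keeps per-group word counts, so a correction means removing the email's words from the wrong group and adding them to the right one. The class, its method names and the sample emails are all hypothetical.

```python
from collections import Counter


class TrainableFilter:
    """Toy filter that keeps word counts for a 'spam' and a 'good' group."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "good": Counter()}
        self.email_counts = {"spam": 0, "good": 0}

    def train(self, text, label):
        # Count each distinct word once per email.
        self.word_counts[label].update(set(text.lower().split()))
        self.email_counts[label] += 1

    def untrain(self, text, label):
        self.word_counts[label].subtract(set(text.lower().split()))
        self.email_counts[label] -= 1

    def correct(self, text, wrong_label, right_label):
        # A user correction moves the email from one group to the other.
        self.untrain(text, wrong_label)
        self.train(text, right_label)

    def word_score(self, word):
        spam_rate = self.word_counts["spam"][word] / max(self.email_counts["spam"], 1)
        good_rate = self.word_counts["good"][word] / max(self.email_counts["good"], 1)
        if spam_rate + good_rate == 0:
            return 0.5  # unknown word: no evidence either way
        return spam_rate / (spam_rate + good_rate)


spam_filter = TrainableFilter()
spam_filter.train("cheap diploma online", "spam")
spam_filter.train("meeting this afternoon", "good")

# Suppose the filter wrongly filed this legitimate email as spam.
misfiled = "your diploma ceremony is this afternoon"
spam_filter.train(misfiled, "spam")
score_before = spam_filter.word_score("afternoon")

# The user marks it as good, and the word scores shift accordingly.
spam_filter.correct(misfiled, "spam", "good")
score_after = spam_filter.word_score("afternoon")
```

After the correction, 'afternoon' no longer counts as evidence of spam, and 'diploma' becomes a weaker spam indicator because it has now been seen in a good email.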