$$ \textbf{Bayesian Credibility for CS1 Actuarial Statistics} $$

$$ \textbf{1. Introduction and Motivation} $$

In this section, we look at a direct application of the Bayesian framework for estimation. We adopt the idea of basing our estimates on some combination of observed data and prior beliefs. Why do we do this? Because estimating things is hard! Sometimes you have no data and need to make predictions for the outcomes of some experiment. Other times, you might have data but question its relevance to your specific objective. The idea of credibility theory, as we examine it here, is to provide a structured approach for handling such situations and making the best possible estimates with the information available.

Let's imagine a scenario. Suppose you're running a car insurance business and Uber approaches you to insure all their vehicles operating in Johannesburg. Before setting a price, you decide to estimate the expected cost of claims per vehicle for the coming year. Uber tells you that over the past five years in Johannesburg, they've experienced an average cost of R30,000 per car. They also mention that across the rest of their global operations (Europe, Asia, and America), they've averaged R25,000 per car over a longer history. You recognize that the Johannesburg data is more relevant to your pricing task, but the global data is statistically more reliable due to its larger sample size. How do you combine these two sources of information? A Bayesian credibility approach would suggest taking a weighted average (a linear combination) of the two estimates. In other words, you predict a value somewhere between R25,000 and R30,000, using the formula:

$$ \text{Expected Cost} = 30{,}000 \times Z + 25{,}000 \times (1-Z), \quad Z \in [0,1] $$

$$ \textbf{Z, called the credibility factor, must be between 0 and 1.} $$

This combines both sources of information through a credibility factor, $Z$, which reflects how much you "trust" the local Johannesburg data relative to the broader data. The value $Z$ captures the credibility of your direct data: the more relevant and abundant it is, the higher $Z$ becomes. The remaining weight, $(1-Z)$, is assigned to the external or prior data, which is useful but less directly applicable. Initially, if you have little Johannesburg data, $Z$ might be small, reflecting greater reliance on the global average. As you gather more local data over time, $Z$ increases, and your estimates shift closer to the observed Johannesburg experience. This is the core idea of credibility: adapting your estimates dynamically as more relevant data becomes available.
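To make the arithmetic concrete, here is a minimal Python sketch of this weighted average. The R30,000 and R25,000 figures come from the example above; the values of $Z$ in the loop are purely illustrative.

```python
def credibility_estimate(local_mean, global_mean, Z):
    """Linear credibility estimate: Z * local + (1 - Z) * global."""
    if not 0 <= Z <= 1:
        raise ValueError("the credibility factor Z must lie in [0, 1]")
    return Z * local_mean + (1 - Z) * global_mean

# Illustrative values of Z only; the R30,000 / R25,000 figures are from the example above.
for Z in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"Z = {Z:.2f}: expected cost = R{credibility_estimate(30_000, 25_000, Z):,.0f}")
```

As expected, $Z = 0$ reproduces the global average of R25,000, $Z = 1$ reproduces the Johannesburg average of R30,000, and intermediate values of $Z$ land in between.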

$$ \textbf{2. Applying the idea of Bayesian Credibility} $$

In this section, we begin combining the ideas introduced above with the statistical models we've been working with. The only real difference now lies in how we derive values like R30,000, R25,000, and the credibility weight $Z$. To make this connection, let's draw the parallels between the earlier insurance example and the more formal statistical context to which we're now applying it. As you've hopefully seen, the Bayesian framework begins with a prior distribution for an unknown (random!) parameter. When we incorporate observed data, we update this prior to produce a posterior distribution that reflects both our prior beliefs and the new evidence. From this posterior, we can then derive Bayesian estimates (depending on the choice of loss function) to guide our decisions. Once we've obtained such an estimate, we can often express it in the familiar weighted form from earlier, i.e. $$ \mathit{something} \, Z + \mathit{other\ thing} \,(1 - Z) $$ where the interpretation of "something" and "other thing" will be given as we go. Locked in? Let's go!!

$$ \textbf{2.1 Exponential-Gamma model} $$

$\textbf{Assumptions:}$

We now apply the Bayesian estimation framework using a specific probabilistic model. Assume that the individual claim sizes for Uber cars in Johannesburg are denoted by $X_i$, where each claim $X_i$, conditional on the parameter $\lambda$, independently follows an exponential distribution:

$X_i \mid \lambda \sim \text{Exponential}(\lambda), \quad i = 1, 2, \dots, n$

Further to this, we assume the prior $\Lambda \sim \text{Gamma}(\alpha, \beta)$.
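Before moving on, it may help to see the assumed hierarchy in action. The following sketch simulates it in Python: a rate is drawn from the Gamma prior, then claim sizes are drawn from the conditional exponential distribution. The hyperparameters, sample size, and seed are my own illustrative choices, not values from these notes.

```python
import numpy as np

# Hypothetical values (not from the notes): prior hyperparameters, sample size, seed.
alpha, beta = 3.0, 60_000.0          # Lambda ~ Gamma(alpha, beta), with beta a rate parameter
n = 50                               # number of observed claims
rng = np.random.default_rng(1)

# NumPy parameterises the gamma by shape/scale (scale = 1/rate) and the
# exponential by its mean (= 1/lambda), so we convert accordingly.
lam = rng.gamma(shape=alpha, scale=1.0 / beta)        # draw a claim rate from the prior
claims = rng.exponential(scale=1.0 / lam, size=n)     # claim sizes given that rate

print(f"drawn lambda = {lam:.6f}  (implied mean claim = {1 / lam:,.0f})")
print(f"sample mean claim = {claims.mean():,.0f}")
```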

This setup — exponential likelihood with a gamma prior — gives us a conjugate prior structure, meaning the posterior distribution will also be a Gamma distribution. We will work under the quadratic loss function, which, as you've seen, leads us to use the posterior mean of 𝜆 as our Bayesian estimator. We'll soon see how this estimator takes on a credibility-weighted form, consistent with our earlier intuition.

From the distributional assumptions above, we can derive the form of the posterior distribution. Using Bayes’ theorem, we know that the posterior density is proportional to the product of the likelihood and the prior:

$f_{\text{post}}(\lambda) \propto \underbrace{\lambda^{n} e^{-\lambda \sum x_i}}_{\text{likelihood}} \times \underbrace{\lambda^{\alpha-1} e^{-\beta \lambda}}_{\text{prior}} = \lambda^{\alpha + n - 1} \, e^{-(\sum x_i + \beta)\lambda}$

We recognize this as the kernel of a Gamma distribution. Hence, the posterior distribution is $\text{Gamma}(\alpha + n, \ \sum x_i + \beta)$.

The Bayesian estimate under the quadratic loss function is given by the posterior mean, which in our case is just the mean of a $\text{Gamma}(\alpha + n, \ \sum x_i + \beta)$ distribution.

And so,

$ E[\Lambda| \underline{x}] = \frac{\alpha +n}{\sum{x_i}+\beta} $
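As a quick sanity check, here is a short Python sketch of this conjugate update. The prior hyperparameters and the five claim values are made up purely for illustration.

```python
import numpy as np

# Hypothetical prior and a made-up sample of claim sizes (in rand), for illustration only.
alpha, beta = 3.0, 60_000.0
claims = np.array([18_000, 35_000, 27_500, 41_000, 22_000], dtype=float)
n, total = len(claims), claims.sum()

# Conjugate update: the posterior is Gamma(alpha + n, beta + sum(x_i)).
alpha_post, beta_post = alpha + n, beta + total

posterior_mean = alpha_post / beta_post      # Bayes estimate of lambda under quadratic loss
print(f"posterior: Gamma({alpha_post:.0f}, {beta_post:,.0f})")
print(f"posterior mean of lambda = {posterior_mean:.6f}")
print(f"implied expected claim size = {1 / posterior_mean:,.0f}")
```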

Almost there!

The final thing we need to do is decompose the posterior mean we calculated above and get it into the form:

$$ \mathit{something} \, Z + \mathit{other\ thing} \,(1 - Z) $$

I promised I'd tell you what "something" and "other thing" are, and now I will. Recall that in the Uber example the expected cost was a weighted average of the mean cost from our own data (Johannesburg) and the mean cost from the more reliable global data. Well, that's exactly what "something" and "other thing" are! They are, respectively, the maximum likelihood estimate of $\lambda$ based on our own data and the mean of the prior.

$$\textbf{Rewriting the posterior mean:}$$

First, we know that the MLE of the rate $\lambda$ is $\frac{1}{\overline{x}}$ and that the prior mean is $\frac{\alpha}{\beta}$.

These are the "something" and the "other thing", respectively.

Thus, we have:

$\frac{\alpha +n}{\sum x_i+\beta} = \frac{\alpha}{\sum x_i+\beta} + \frac{n}{\sum x_i+\beta}$

$\frac{\alpha +n}{\sum x_i+\beta} = \frac{\alpha}{\sum x_i+\beta} + \frac{n}{\sum x_i} \cdot \frac{\sum x_i}{\sum x_i+\beta}$

Noting that $\frac{\alpha}{\sum x_i+\beta} = \frac{\alpha}{\beta} \cdot \frac{\beta}{\sum x_i+\beta} = \frac{\alpha}{\beta}\left(1-\frac{\sum x_i}{\sum x_i+\beta}\right)$, this becomes:

$\frac{\alpha +n}{\sum x_i+\beta} = \frac{\alpha}{\beta}\left(1-\frac{\sum x_i}{\sum x_i+\beta}\right) + \frac{n}{\sum x_i} \cdot \frac{\sum x_i}{\sum x_i+\beta}$

$\frac{\alpha +n}{\sum x_i+\beta} = \frac{\alpha}{\beta}(1-Z) + \frac{n}{\sum x_i}\, Z$

$\frac{\alpha +n}{\sum x_i+\beta} = \frac{1}{\overline{x}}\, Z + \frac{\alpha}{\beta}(1-Z) \quad \text{for} \quad Z = \frac{\sum x_i}{\sum x_i+\beta}$
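If you'd like to see this decomposition hold numerically, the sketch below checks it using the same hypothetical prior and claims as the earlier sketch.

```python
import numpy as np

# Same hypothetical numbers as in the previous sketch.
alpha, beta = 3.0, 60_000.0
claims = np.array([18_000, 35_000, 27_500, 41_000, 22_000], dtype=float)
n, total = len(claims), claims.sum()

posterior_mean = (alpha + n) / (beta + total)

Z = total / (total + beta)        # credibility factor
mle = n / total                   # MLE of lambda, i.e. 1 / x_bar
prior_mean = alpha / beta         # prior mean of Lambda

credibility_form = Z * mle + (1 - Z) * prior_mean
print(f"Z = {Z:.4f}")
print(f"posterior mean   = {posterior_mean:.8f}")
print(f"credibility form = {credibility_form:.8f}")
assert np.isclose(posterior_mean, credibility_form)
```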

This set of notes is intended to complement your primary study materials. For a fuller and more rigorous treatment of the topic, please refer to the official CS1 notes, especially for additional examples involving other models and loss functions. These notes aim to give you intuition and a working understanding of how Bayesian credibility fits into the broader statistical framework.

I hope you found them helpful and engaging.

— Sim