# Value-of-information

Value-of-information (VOI) methods determine the worth of acquiring extra information to help the decision-maker. From a decision analysis perspective, acquiring extra information is only useful if it has a significant probability of changing the decision-maker's currently preferred strategy. The penalty of acquiring more information is usually valued as the cost of that extra information, and sometimes also the delay incurred in waiting for the information.

VOI techniques are based on analyzing the revised estimates of model inputs that come with extra data, together with the costs of acquiring the extra data and a decision rule that can be converted into a mathematical formula to analyze whether the decision would alter. The ideas are well-developed (see the reference list below) but the probability algebra can be somewhat complex, and simulation is more flexible and a lot easier for most VOI calculations.

The usual starting point of a VOI analysis is to consider the value of perfect information (VOPI), i.e. answering the question "What would be the benefit, in the terms we are focusing on (usually money, but it could be lives saved, etc.), of being able to know some parameter(s) perfectly?" If perfect knowledge would not change the decision, the extra information is worthless; if it would change the decision, its value is the difference between the expected net benefit of the newly selected option and that of the previously favoured one. VOPI is a useful limiting tool, because it tells us the maximum value that any data may have in better evaluating the input parameter of concern. If the information costs more than that maximum value, we know not to pursue it any further.

After a VOPI check, one then looks at the value of imperfect information (VOII). Usually, the collection of more data will decrease, not eliminate, uncertainty about an input parameter, so VOII focuses on whether the decrease in uncertainty is worth the cost of collecting extra information. In fact, if new data are inconsistent with previous data or beliefs that were used to estimate the parameter, new data may even increase the uncertainty.

If the data being used are n random observations (e.g. survey or experimental results), the uncertainty about the value of a parameter has a width roughly proportional to 1/√n. So if you already have n observations and would like to halve the uncertainty, you will need a total of 4n observations (an increase of 3n). If you want to decrease uncertainty by a factor of 10, you will need a total of 100n observations (an increase of 99n). In other words, a decrease in uncertainty about a parameter value becomes progressively more expensive the closer the uncertainty gets to zero, since the required sample size grows with the square of the desired precision. Thus, even if a VOPI analysis shows that it is economically justified to collect more information before making a decision, there will certainly be a point in the data collection where the cost of collecting data outweighs their benefit.
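This sample-size arithmetic is easy to check in a few lines of Python (using n = 200 observations for illustration):

```python
# The width of the uncertainty about a parameter estimated from n random
# observations shrinks roughly like 1/sqrt(n), so cutting the width by a
# factor k needs about k**2 times as many observations in total.
n = 200  # observations already collected (illustrative)

needed = {k: k**2 * n for k in (2, 10)}            # total observations for a k-fold reduction
extra = {k: total - n for k, total in needed.items()}  # additional observations required

print(needed)   # {2: 800, 10: 20000}
print(extra)    # {2: 600, 10: 19800}
```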

## VOPI analysis method

1. Consider the range of possible values for the parameter(s) for which you could collect more information;

2. Determine whether there are possible values for these parameters that, if known, would make the decision-maker select a different option from the one currently deemed to be best; and

3. Calculate the extra value (e.g. expected profit) that the more informed decision would give. This is the VOPI.
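These three steps can be sketched as a small Monte Carlo calculation. The decision problem and payoffs below are invented purely for illustration: a two-option choice where option "A"'s payoff depends on an uncertain parameter x and option "B" pays a flat amount.

```python
import random

random.seed(0)

# Illustrative two-option decision (payoffs invented for this sketch):
# option "A" pays 25*x, where x is uncertain; option "B" pays a flat 10.
def payoff(option, x):
    return 25.0 * x if option == "A" else 10.0

samples = [random.random() for _ in range(100_000)]   # prior for x: Uniform(0, 1)

# Steps 1-2: expected value of the best single option under current information.
ev_now = max(sum(payoff(opt, x) for x in samples) / len(samples)
             for opt in ("A", "B"))

# Step 3: expected value if x could be learned perfectly before choosing,
# averaged over the prior; the difference is the VOPI.
ev_perfect = sum(max(payoff(opt, x) for opt in ("A", "B"))
                 for x in samples) / len(samples)

vopi = ev_perfect - ev_now
print(round(vopi, 2))   # close to the analytic value of 2.0
```

With perfect information one would pick "A" only when 25x > 10, which is what the per-scenario `max` inside `ev_perfect` captures.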

## VOII analysis method

1. Describe the prior uncertainty about the parameter(s) of concern;

2. Model what observations might be made with new data using the prior belief;

3. Determine the decision rule that would be affected by these new data;

4. Calculate any improvement in the decision capability given the new data: the measure of improvement requires some valuation and comparison of possible outcomes, which is usually taken to be expected monetary or utility value though this is rather restrictive; and

5. Determine whether any improvement in the decision capability exceeds the cost of the extra information.

## VOI example

Your company wants to develop a new cosmetic but there is some concern that people will have a minor adverse skin reaction to the product. The cost of development of the product to market is \$1.8 million. The revenue NPV (including the cost of development) if the product is of the required quality is \$3.7 million.

Cosmetic regulations state that you will have to withdraw the product if 2% or more of consumers have an adverse reaction to it. You have already performed some preliminary trials on 200 people randomly selected from the target demographic, at a cost per person of \$500. Three of those people had an adverse reaction to the product.

Management decide that the product will only be developed if they can be 85% confident that it will affect less than the required 2% of the population. Decision question: should we test more people, or abandon the product development now? And if we should test more people, how many more?

### VOPI analysis

Having observed 3 affected people out of 200, our belief about the prevalence p of adverse reactions can be modeled as Beta(3+1, 200-3+1) = Beta(4, 198), which gives a 57.24% confidence that 2% or less of the target demographic will be affected.

Thus, at the current level of information management would not pursue development of the product, with no resultant cost or revenue, i.e. a net revenue of \$0. However, the Beta distribution shows that it is quite possible that p is less than 2%, and we could be losing a good opportunity by quitting now. If p were known perfectly, the company would develop the product whenever p < 2% (a profit of \$3.7 million) and abandon it otherwise (a net \$0), so the VOPI = \$3.7 million \* 57.24% + \$0 million \* 42.76% = \$2.12 million. Since each test costs only \$500, it is certainly possible that more information could be worth the expense.
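This calculation can be reproduced with only the Python standard library, computing the Beta CDF through the identity P(Beta(a, b) ≤ x) = P(Binomial(a+b−1, x) ≥ a), which holds for integer a and b:

```python
from math import comb

def beta_cdf(x, a, b):
    # P(Beta(a, b) <= x) for integer a, b, via the binomial identity
    # P(Beta(a, b) <= x) = 1 - P(Binomial(a + b - 1, x) <= a - 1)
    n = a + b - 1
    return 1.0 - sum(comb(n, k) * x**k * (1 - x)**(n - k) for k in range(a))

# Posterior after 3 reactions in 200 tests: Beta(3 + 1, 200 - 3 + 1)
conf = beta_cdf(0.02, 4, 198)        # P(p < 2%), about 0.5724

# With perfect knowledge of p the company develops only when p < 2%
# (profit $3.7M) and abandons otherwise (net $0); the current decision
# is to abandon, worth $0, so:
vopi = 3.7e6 * conf + 0.0 * (1 - conf)
print(f"confidence = {conf:.2%}, VOPI = ${vopi / 1e6:.2f}M")
# → confidence = 57.24%, VOPI = $2.12M
```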

### VOII analysis

Two VOII model solutions are provided below: one using Crystal Ball functions and one using @RISK functions.

##### The Crystal Ball model performs the VOII steps described above:
1. The parameter of concern is the fraction of people (prevalence) p in the target demographic (women 18-65) who would have an adverse reaction, with a prior uncertainty described by Beta(4,198), cell C17.

2. The people in the study are randomly sampled from this demographic, so if we test m extra people (cell C27) we can assume that the number of people who would be adversely affected, s, follows a Binomial(m,p) distribution (cell C29);

3. The revised estimate for p would then become Beta(4+s,198+(m-s)). The confidence we then have that p is < 2% is given by BETADIST(2%,4+s,198+(m-s),1), cell C32. If this confidence exceeds 85% management would take the decision to develop the product (cells C36:C37);

4. The model simulates different possible values of p from the prior. By varying the number of extra tests m and simulating the extra data generated (s affected out of m), it evaluates the expected return of the resultant decision. Of course, although one may have reached the required confidence for p, the true value of p does not change, and a bad decision may still be taken. The value of information is calculated for each iteration, and the CB.GetForeStatFN(x,2) function is used to calculate the expected value of information.

##### The @RISK model performs the VOII steps described above:
1. The parameter of concern is the fraction of people (prevalence) p in the target demographic (women 18-65) who would have an adverse reaction, with a prior uncertainty described by Beta(4,198), cell C17.

2. The people in the study are randomly sampled from this demographic, so if we test m extra people (cell C27) we can assume that the number of people who would be adversely affected, s, follows a Binomial(m,p) distribution (cell C29);

3. The revised estimate for p would then become Beta(4+s,198+(m-s)). The confidence we then have that p is < 2% is given by BETADIST(2%,4+s,198+(m-s),1), cell C32. If this confidence exceeds 85% management would take the decision to develop the product (cells C36:C37);

4. The model simulates different possible values of p from the prior. By varying the number of extra tests m and simulating the extra data generated (s affected out of m), it evaluates the expected return of the resultant decision. Of course, although one may have reached the required confidence for p, the true value of p does not change, and a bad decision may still be taken. The value of information is calculated for each iteration, and the RiskMean(x) function is used to calculate the expected value of information.
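The simulation described in these steps can be sketched in plain Python. This is an illustration of the approach, not the spreadsheet model itself; it assumes the full \$1.8 million development cost is lost if the product is developed but must then be withdrawn (the source does not state the withdrawal loss explicitly).

```python
import random
from math import comb

random.seed(42)

COST_PER_TEST = 500.0
PROFIT = 3.7e6        # net revenue NPV if the product succeeds
LOSS = -1.8e6         # assumed loss if developed but withdrawn
THRESHOLD = 0.02
CONFIDENCE_REQUIRED = 0.85

def beta_cdf(x, a, b):
    # P(Beta(a, b) <= x) for integer a, b, via the binomial identity
    n = a + b - 1
    return 1.0 - sum(comb(n, k) * x**k * (1 - x)**(n - k) for k in range(a))

def expected_voii(m, iterations=2000):
    """Expected value of testing m more people, net of the testing cost."""
    total = 0.0
    for _ in range(iterations):
        p = random.betavariate(4, 198)                   # scenario: "true" prevalence
        s = sum(random.random() < p for _ in range(m))   # extra reactions observed
        # Decision rule: develop if the posterior confidence that p < 2%
        # reaches the required 85%
        if beta_cdf(THRESHOLD, 4 + s, 198 + m - s) >= CONFIDENCE_REQUIRED:
            total += PROFIT if p < THRESHOLD else LOSS
    return total / iterations - m * COST_PER_TEST
```

Scanning `expected_voii(m)` over a grid of m values (e.g. 100 to 3,000) reproduces the search for the most valuable test size; with no extra testing (m = 0) the confidence stays at 57.24%, the product is abandoned, and the VOII is zero.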

Note that for this example, the question being posed is how many more people to test in one go. A better strategy would be to test a smaller number, review the results, and perform another VOII analysis. This iterative process will either achieve the required confidence at a smaller test cost, or lead one to abandon further testing because one is fairly sure that the required performance will not be achieved.

It might at first seem that we are getting something for nothing here. After all, we don't actually know anything more until we perform the extra tests. However, the decision that would be made depends on the results of those extra tests, and those results depend on what the true value of p actually is. Thus, the analysis is based on our prior for p (i.e. what we know to date about p) and the decision rule.

When the model generates a scenario it selects a value from the prior for p. It is saying: "Let's imagine that this is the true value for p". If that value is <2% we should develop the product of course, but we'll never know the value of p (until we have launched the product and have enough customer history to know its value). However, extra tests will get us closer to knowing its true value, and so we end up taking less of a gamble. When the model picks a small value for p, it will probably generate a small number of affected people in our new tests, and our interpretation of this small number as meaning p is small will often be correct. The danger is that a high p value could by chance result in an unrepresentatively small fraction of m being affected, which will be misinterpreted as a small p, and lead management to make the wrong decision. However, as m gets bigger that risk diminishes. The balance that needs to be struck is that the tests cost money.

The model simulates twenty scenarios where m is varied between 100 and 3000, with the following results:

The results tell us that the optimal strategy, i.e. the one with the greatest expected VOII, is to perform about another 700 tests. The saw-tooth effect in these plots occurs because of the discrete nature of the number of affected people one would observe in the new data. Note that if the tests had no cost, the graph above would look very different:

Now it is continually worth collecting more information (provided it is actually feasible to do so) because there is no penalty to be paid in running more tests (except perhaps time, which is not included as part of this problem). In this case the value of information asymptotically approaches the VOPI (= \$2.12 million) as the number of people tested approaches infinity.