PDT Overview

You are here because you want to optimize a complex experiment. You have some way of judging the quality of your experiments by measuring an experimental response. Your experiment is complex because it has several experimental parameters (e.g., between four and twenty) that interact with each other in unknown, unpredictable ways. You want to optimize these parameters, i.e., find values of the experimental parameters that lead to desired experimental responses. You also have some way of executing many experiments (e.g., using a high-throughput experimental platform).

We will consider your experiments as being grouped into generations. PDT is a tool for designing experiments for each generation, based on sophisticated statistical modeling of data from all past generations, with the aim of optimizing experimental responses. Your use of PDT entails three basic phases, two for initialization, and one that is repeated over and over with each generation.

1. Experimental space definition

You start by specifying the experimental space for your experimental campaign, that is, a list of experimental parameters together with the discrete values each parameter may take when you execute an experiment. Each parameter defines a different dimension of your experimental space.
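As an illustration, an experimental space of this kind can be sketched as a mapping from parameter names to their allowed discrete values. This is only a sketch, not PDT's actual input format: the parameter names, the values, and the helper functions space_size and enumerate_experiments are all hypothetical.

```python
from itertools import product
from math import prod

# Hypothetical parameters and discrete values, for illustration only.
experimental_space = {
    "temperature_C": [20, 30, 40, 50],
    "pH": [5.0, 6.0, 7.0, 8.0],
    "concentration_mM": [1, 10, 100],
    "buffer": ["A", "B"],
}


def space_size(space):
    """Number of distinct experiments: the product of the value counts."""
    return prod(len(values) for values in space.values())


def enumerate_experiments(space):
    """Yield every experiment as a dict with one value per parameter."""
    names = list(space)
    for combo in product(*(space[name] for name in names)):
        yield dict(zip(names, combo))


n_candidates = space_size(experimental_space)
first_experiment = next(enumerate_experiments(experimental_space))
```

The size of the space grows multiplicatively with every added parameter, which is why exhaustive testing quickly becomes impractical for the four to twenty parameters mentioned above.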

You then specify the generation parameters: Np, the population size, and Nr, the number of repeats. These two numbers determine how many experiments there will be in a generation: Nexp = Np × (Nr + 1).
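The generation size formula can be checked with a few lines of Python. The function name and the example numbers are hypothetical. The comments read the formula as "each of the Np population members is run once plus Nr repeats", which is an interpretation rather than something stated above.

```python
def experiments_per_generation(population_size, n_repeats):
    """Return Nexp = Np * (Nr + 1).

    Assumed reading: each of the Np population members is executed
    once, plus Nr repeats.
    """
    if population_size < 1:
        raise ValueError("population_size (Np) must be at least 1")
    if n_repeats < 0:
        raise ValueError("n_repeats (Nr) must not be negative")
    return population_size * (n_repeats + 1)


# Example: Np = 24 and Nr = 3 fill a 96-well plate exactly.
n_exp = experiments_per_generation(population_size=24, n_repeats=3)
```

Working backwards from the capacity of your experimental platform (for instance, the number of wells on a plate) is one practical way to pick Np and Nr.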

2. Initial experiments

Before launching a PDT campaign, you may want to enter experimental results (experiments and response measurements) that you have gathered in preliminary studies, e.g., calibration runs, as initial experiments. To be usable by PDT, these experiments must lie in the expanded experimental space determined in the previous experimental space definition phase. This means that each experiment must have a specified value for each of the parameters in the experimental space definition, but it is not necessary that these values be among the discrete values listed in the experimental space definition.
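The rule for initial experiments can be sketched as a small check: every parameter needs a value, but that value need not be one of the listed discrete values. The function names, the parameter names, and the treatment of parameters outside the definition as errors are assumptions made for this sketch, not PDT behavior.

```python
# Hypothetical experimental space, for illustration only.
space = {"temperature_C": [20, 30, 40], "pH": [6.0, 7.0, 8.0]}


def check_initial_experiment(experiment, space):
    """Return a list of problems; an empty list means the experiment
    lies in the expanded experimental space (under the assumed rule)."""
    problems = []
    for name in space:
        if experiment.get(name) is None:
            problems.append(f"missing value for parameter '{name}'")
    for name in experiment:
        if name not in space:
            # Assumption: parameters outside the definition are rejected.
            problems.append(f"unknown parameter '{name}'")
    return problems


def off_grid_parameters(experiment, space):
    """Parameters whose value is set but is not one of the listed values."""
    return [
        name
        for name in space
        if experiment.get(name) is not None and experiment[name] not in space[name]
    ]


calibration_run = {"temperature_C": 35, "pH": 7.0}  # 35 is not a listed value
incomplete_run = {"temperature_C": 30}  # pH is missing

calibration_problems = check_initial_experiment(calibration_run, space)
calibration_off_grid = off_grid_parameters(calibration_run, space)
incomplete_problems = check_initial_experiment(incomplete_run, space)
```

In this sketch the calibration run is accepted even though 35 is not a listed temperature, while the incomplete run is rejected because it gives no pH value.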

3. Experiment design and response measurements

PDT iteratively chooses designs, called generations, each consisting of Nexp experiments selected from your experimental space. The first generation is typically chosen at random, while taking into account any initial experiments you may have entered in the previous phase. At the end of every generation, PDT uses all available experimental results to build a model that predicts which of the untried experiments are most likely to have a good experimental response, and chooses the design for the following generation accordingly.
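The generational loop can be sketched as follows. This is a toy illustration of the workflow only. The predictor is a deliberately simple placeholder (the average response of the most similar past experiments) and is not PDT's statistical model. All function names, the parameter space, and the measure function standing in for a laboratory measurement are hypothetical. The sketch assumes that a higher response is better, and it omits repeats (Nr) for brevity.

```python
import random
from itertools import product


def all_experiments(space):
    names = list(space)
    return [dict(zip(names, combo)) for combo in product(*(space[n] for n in names))]


def key(experiment):
    """Hashable identity of an experiment, used to track what was tried."""
    return tuple(sorted(experiment.items()))


def random_design(space, tried, population_size, rng):
    """First generation: a random sample of untried experiments."""
    untried = [e for e in all_experiments(space) if key(e) not in tried]
    return rng.sample(untried, min(population_size, len(untried)))


def shared_value_predictor(candidate, results):
    """Placeholder model: mean response of the past experiments that
    share the most parameter values with the candidate."""
    def similarity(e):
        return sum(candidate[n] == e.get(n) for n in candidate)
    best = max(similarity(e) for e, _ in results)
    responses = [r for e, r in results if similarity(e) == best]
    return sum(responses) / len(responses)


def model_based_design(space, results, population_size, predict):
    """Later generations: the untried experiments with the best predictions."""
    tried = {key(e) for e, _ in results}
    untried = [e for e in all_experiments(space) if key(e) not in tried]
    ranked = sorted(untried, key=lambda e: predict(e, results), reverse=True)
    return ranked[:population_size]


# Hypothetical space and response function, for illustration only.
space = {"temperature_C": [20, 30, 40], "pH": [6.0, 7.0, 8.0], "buffer": ["A", "B"]}


def measure(e):
    bonus = 5 if e["buffer"] == "B" else 0
    return -abs(e["temperature_C"] - 30) - 10 * abs(e["pH"] - 7.0) + bonus


rng = random.Random(0)
results = []  # list of (experiment, response) pairs from all generations
design = random_design(space, set(), population_size=4, rng=rng)
for generation in range(3):
    results += [(e, measure(e)) for e in design]  # run the designed experiments
    design = model_based_design(space, results, 4, shared_value_predictor)
```

Initial experiments from the previous phase would simply be added to results before the first design is chosen, so that they inform every later generation.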

In addition to the experiments specified by the PDT design, you may optionally perform extra experiments, either to further validate results from previous generations (e.g., for calibration) or to explore your own intuition.