Those things about experimental design: Fisher's experimental logic
When it comes to the most famous family ties in the history of mathematics, the Bernoulli family comes to mind: it produced eight mathematicians, three of them world-class. Statistics has an equally remarkable pairing, the father-in-law and son-in-law Ronald Fisher (1890-1962) and George Box (1919-2013), two heavyweight statisticians who both made outstanding contributions to experimental design, an important branch of statistical optimization.

However, most people are probably unaware that experimental design, such an important optimization method, was born at an agricultural experimental station: Rothamsted, the oldest agricultural research station in the world. Its inventor was Ronald Fisher, a master of statistics, who set out his ideas about experimental design systematically in books such as Statistical Methods for Research Workers and The Design of Experiments.

George Box, Ronald Fisher's son-in-law, absorbed Fisher's profound statistical thinking, made great achievements in experimental design, time series models and other fields, and published a large number of important articles and books. One of his sayings is regarded as a classic by practitioners in statistics and the big data industry: "All models are wrong, but some are useful". Box set out his understanding of experimental design systematically in works such as Evolutionary Operation: A Statistical Method for Process Improvement, Statistics for Experimenters, and Empirical Model-Building and Response Surfaces. During his eight years of work at ICI, he also explored with his colleagues (chemists and chemical engineers) how to design and analyze experiments so as to make experimentation more efficient, and put forward his own recommendations on the subject.

Experimental design was first applied in agriculture to increase the yield per mu, and then spread quickly to production and R&D in many other fields, such as chemistry, medicine, electronics and machinery. Along the way, researchers in these fields came to recognize in practice the particular advantages of experimental design as an optimization method. This article discusses, in turn, why experimental design is needed, the three principles of experimental design, the workflow of experimental design, why fractional factorial experiments are used, and the response surface and sequential experimentation strategy, and briefly introduces the application scenarios of experimental design.

First, why experimental design? - Surface search or line search?

In his memoir An Accidental Statistician: The Life and Memories of George E. P. Box, Box wrote: "Statistics is about how to generate and use data to solve scientific problems. Therefore, it is very important to be familiar with science and scientific methods. In scientific and technological research, we often need to study many variables. Call the variables that you can change 'input variables' or 'factors', and those that you can only observe 'output variables' or 'response variables'. At one time, people thought that the correct way to study a system affected by multiple factors was to change only one factor at a time. But more than 80 years ago, R. A. Fisher showed the world that this method was too inefficient and wasted a great deal of experimental effort. In fact, you should change several factors at the same time according to an arrangement known as an 'experimental design'. Yet even now, the method of changing only one factor at a time is still taught in the classroom."

Even now, then, some researchers still search for the optimum by changing one factor at a time. This approach is also called COST (Change One Separate factor at a Time); experimental design, correspondingly, is called DOE (Design of Experiments). Changing only one factor at a time has clear shortcomings: it is inefficient, and because it cannot evaluate interaction effects, it can easily miss the optimum.

We can take a look at the case shown in Figure 1 first:

A team found that the yield of a chemical reaction for one of its company's products depended strongly on the reactor pressure and on the amount of catalyst added. To find the best process settings, it carried out the following experiments:

1) The amount of catalyst was fixed at 5 kg and the reactor pressure was varied over several runs; the team concluded that the yield was best at a reactor pressure of 75 MPa.

2) The reactor pressure was then fixed at 75 MPa and the amount of catalyst was varied over several runs; the team concluded that the yield was best with 3 kg of catalyst.

3) The team therefore concluded that the overall yield is best at a reactor pressure of 75 MPa with 3 kg of catalyst.

So, do the facts agree with the team's conclusion? The contour map on the right, obtained by experimental design, shows clearly that the optimum actually occurs at a reactor pressure of 65 MPa with 3 kg of catalyst, where the yield exceeds 91%, whereas the best yield found by the first method is estimated at around 90%. The first method therefore does run the risk of missing the optimum. As the figure shows, the first method is a line search, while experimental design is a surface search. Searching over a surface is more efficient than searching along a line, and it captures the optimum more easily. Experimental design also gives us an intuitive response surface and contour map relating the response variable to the factors, which helps us understand how the response changes with the factors.
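The difference between the two search styles can be sketched in a few lines of Python. The yield function below is invented purely for illustration (it is not the surface in Figure 1); the only assumption that matters is that it contains a pressure x catalyst interaction term. The grid evaluation stands in for the fitted response surface that a designed experiment would provide from far fewer runs.

```python
from itertools import product

def yield_pct(pressure, catalyst):
    """Hypothetical yield surface with a pressure x catalyst interaction."""
    d, e = pressure - 65, catalyst - 3
    return 91 - 0.02 * d**2 - 2 * e**2 - 0.3 * d * e

pressures = range(50, 105, 5)   # candidate pressures (MPa), assumed grid
catalysts = range(1, 7)         # candidate catalyst amounts (kg), assumed grid

# COST / line search: sweep pressure at a fixed catalyst amount,
# then sweep catalyst at the pressure that looked best.
p_best = max(pressures, key=lambda p: yield_pct(p, 5))
c_best = max(catalysts, key=lambda c: yield_pct(p_best, c))
cost_result = (p_best, c_best, yield_pct(p_best, c_best))

# Surface search: look at every pressure/catalyst combination together.
p_opt, c_opt = max(product(pressures, catalysts), key=lambda pc: yield_pct(*pc))
grid_result = (p_opt, c_opt, yield_pct(p_opt, c_opt))

print("COST   :", cost_result)
print("surface:", grid_result)
```

With these invented numbers the line search settles at 50 MPa and 4 kg with a yield of about 89%, while the surface search finds 65 MPa and 3 kg with 91%. The interaction term is what pulls the one-factor-at-a-time path away from the true optimum.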

Second, the three principles of experimental design - Fisher's farmland

The first section explained why DOE is more efficient than the COST method. How the experiment is designed is equally important, because it directly affects the efficiency and success of the experiment. During his 14 years (1919-1933) at Rothamsted Experimental Station, Fisher distilled three universal principles of experimental design from a large number of experimental studies, namely:

(1) replication;

(2) randomization;

(3) blocking.

However, the explanation of these three principles in some professional books can be rather obscure. Here we try to interpret them from another angle through a made-up story: Fisher's farmland.

As shown in Figure 2, the story goes that while working at Rothamsted Experimental Station, Fisher ran an experiment to compare the yield per mu of two kinds of rice seed. After careful thought, he arrived at the three principles of experimental design that academia has regarded as classics ever since:

① Fisher's first idea was to plant rice seeds A and B in two separate paddy fields and see which gave the higher yield per mu.

② However, Fisher was a great statistician, and after a little thought he decided that judging from a single observation of each seed was not rigorous. He therefore divided the left and right fields into four plots and planted A and B in them, so that he could obtain not only the average yield per mu of seeds A and B but also the standard deviation of their yields, which is more convincing.

③ The keen Fisher soon realized that this plan was still flawed. From his years of experience at the station he knew that soil fertility there was very uneven. Suppose the soil on the left was average while the soil on the right was more fertile. If the experiment concluded that seed B yielded more, was the higher yield caused by the seed or by the soil? The two factors were confounded. After some thought he redesigned the experiment: he planted both A and B in the left field and both in the right field, so that each seed was grown equally on fertile and on ordinary soil and the result would be more reasonable.

④ The plan could now have been carried out directly. At that moment, however, the station received a new task: to evaluate whether a newly invented mechanical seeder improved the yield per mu more than manual sowing. To reduce the number of experiments, the two evaluations had to be combined. Fisher was a genius, and he soon found a neat solution: within each field he sowed half of the plots with the mechanical seeder and the other half by hand, so that a separate experiment was not necessary.

In fact, in steps ②, ③ and ④ of the reasoning above, Fisher creatively applied the three principles of experimental design (replication, randomization and blocking) to the task of evaluating the yield per mu of two kinds of rice, which ensured the validity and soundness of the experimental results and gave the final evaluation a solid scientific footing.

Of course, there is a basic rule for blocking: "Block what you can, and randomize what you cannot."
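A minimal sketch of how the three principles combine into a field layout is given below. The function name, the block names and the choice of four plots per field are assumptions made for illustration, not details taken from Figure 2. Each field is a block, each seed is replicated within every block, and the position of the seeds inside a block is randomized.

```python
import random

def randomized_block_layout(blocks, treatments, reps_per_block, seed=None):
    """Assign treatments to plots: replicate inside each block, then shuffle."""
    rng = random.Random(seed)          # seeded only to make the sketch repeatable
    layout = {}
    for block in blocks:
        plots = list(treatments) * reps_per_block   # replication within the block
        rng.shuffle(plots)                          # randomization within the block
        layout[block] = plots
    return layout

# Two fields (blocks), two seeds, each seed planted twice per field.
layout = randomized_block_layout(["left", "right"], ["A", "B"],
                                 reps_per_block=2, seed=1)
for block, plots in layout.items():
    print(block, plots)
```

Because every block contains both seeds equally often, a fertility difference between the fields affects A and B alike, and the shuffle guards against unknown gradients inside a field.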

Third, the workflow of experimental design and analysis

So far we have seen the advantages of experimental design and its three principles. Here we introduce the workflow of experimental design and analysis through a full factorial design.

Figure 3 shows a typical schematic of a factorial design. The experimenter wants to study the influence of three factors, A, B and C, on the response variable. The designer therefore drew up the following experimental scheme, hoping to estimate from these experiments the coefficients of the following regression equation:
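For three two-level factors, a full model with eight coefficients would take the form below. The coefficient symbols \beta and the error term \varepsilon are notation assumed here; A, B and C denote the factors in coded units.

```latex
\[
y = \beta_0 + \beta_1 A + \beta_2 B + \beta_3 C
  + \beta_{12} AB + \beta_{13} AC + \beta_{23} BC
  + \beta_{123} ABC + \varepsilon
\]
```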

The regression equation has 8 coefficients to estimate in total, so at least $2^3 = 8$ experiments are needed.

According to the three principles of experimental design mentioned above, we still need replication, but to limit the number of runs we generally perform 3-4 replicate runs at the center point only. Another advantage of replicating at the center point is that it reveals whether the model shows curvature. If it does, higher-order terms of the factors must be added to form a response surface; curvature generally also indicates that we are already close to the optimum. Of course, when curvature is present, some experimental points need to be added to estimate the model parameters, which will be discussed later. The run order must also be randomized. Blocking is not required in this case, so we can run the experiments directly according to this design and collect the corresponding data.
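The run list described above can be generated as follows. This is a sketch in coded units (-1 for the low level, +1 for the high level, 0 for the center); the function name and the choice of three center points are assumptions for illustration.

```python
import random
from itertools import product

def full_factorial_with_centers(n_factors, n_center, seed=None):
    """Two-level full factorial plus replicated center points, in random run order."""
    corners = [list(levels) for levels in product((-1, 1), repeat=n_factors)]
    centers = [[0] * n_factors for _ in range(n_center)]
    runs = corners + centers
    random.Random(seed).shuffle(runs)   # randomize the run order
    return runs

runs = full_factorial_with_centers(n_factors=3, n_center=3, seed=0)
for run_number, levels in enumerate(runs, start=1):
    print(run_number, levels)
```

For three factors this gives 8 corner runs plus 3 center runs, 11 runs in total, instead of the 16 runs that replicating every corner would require.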

In addition, the high and low levels of each factor should be set as far apart as is practical; otherwise the noise in the experiment may drown out effects that are in fact significant. Placing the experimental points far apart also helps to explore unknown regions of the process, as shown in Figure 4.

After the experimental data have been obtained, the analysis of the experimental design begins; it basically follows the process below:

The first three steps of the above workflow have in fact been covered in detail in the discussion of simple linear regression. The slight difference is that this regression model has more than one factor (main effect) as well as second-order interaction terms, so each regression coefficient must be tested to determine whether its effect on the response variable is significant, and we also need to check whether the model shows curvature. For example, if the results show that a main effect or a second-order interaction effect is not significant, we remove that term and refit the regression.
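The estimation step can be sketched without any statistics library, because in a two-level full factorial the coded columns are mutually orthogonal and each least-squares coefficient reduces to sum(column * y) / N. The response values below are invented (generated from a known model without noise so the result is easy to check), and the significance test itself is not shown.

```python
from itertools import combinations, product
from math import prod

def fit_two_level(design, y):
    """Least-squares coefficients for a two-level full factorial in coded units."""
    n, k = len(design), len(design[0])
    coefs = {}
    for order in range(k + 1):
        for term in combinations(range(k), order):   # () is the constant term
            column = [prod(run[i] for i in term) for run in design]
            coefs[term] = sum(c * v for c, v in zip(column, y)) / n
    return coefs

design = list(product((-1, 1), repeat=3))             # factors A, B, C
y = [10 + 2 * a - b + 0.5 * a * b for a, b, c in design]   # hypothetical data
coefs = fit_two_level(design, y)

# Curvature check: compare hypothetical center-point runs with the corner mean.
center_y = [10.2, 9.9, 10.1]
curvature = sum(center_y) / len(center_y) - sum(y) / len(y)

for term, value in coefs.items():
    print(term, round(value, 3))
print("curvature estimate:", round(curvature, 3))
```

Here the keys are tuples of factor indices, so `(0, 1)` is the AB interaction. Terms whose coefficients are indistinguishable from noise (C and every interaction involving C in this example) are the ones that would be dropped before refitting.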

Once the refined model shows no abnormalities, we can move on to interpreting it. At this stage we need to do two things:

(1) use the main effect plots and interaction plots to further verify and confirm which factors are significant;

(2) use the contour map and the response surface to see more intuitively how the response variable changes with the independent variables, which helps in finding the best settings.

Next, we use the response optimizer to find the optimal settings and judge whether the optimum meets the original goal. If it does, the work is not yet finished: verification runs are still needed, usually three or more at the optimal point. If the original goal has not been reached, we continue with a new experimental design centered on the current optimum until the predetermined goal is met.

Fourth, why fractional factorial experiments? - The balance between resolution and experimental efficiency

As mentioned earlier, with n factors a full factorial design needs at least $2^n$ experiments. A simple calculation shows that with 5 factors the full factorial design needs 32 experiments (excluding center points), and with 6 factors it needs 64. Taking four factors as an example, the model equation of the experimental design is as follows:
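Written out for four two-level factors A, B, C and D in coded units, the full model would read as below. As before, the coefficient symbols \beta and the error term \varepsilon are notation assumed here.

```latex
\[
\begin{aligned}
y = {} & \beta_0 + \beta_1 A + \beta_2 B + \beta_3 C + \beta_4 D \\
       & + \beta_{12} AB + \beta_{13} AC + \beta_{14} AD
         + \beta_{23} BC + \beta_{24} BD + \beta_{34} CD \\
       & + \beta_{123} ABC + \beta_{124} ABD + \beta_{134} ACD + \beta_{234} BCD \\
       & + \beta_{1234} ABCD + \varepsilon
\end{aligned}
\]
```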

In a full factorial design, then, besides the constant term there are four main effects, six second-order interaction effects, four third-order interaction effects and one fourth-order interaction effect to estimate, 16 terms in total. The only parameters we really need to estimate, however, are the constant term, the main effects and the second-order interaction effects, 11 terms in total. This opens the possibility of running fewer experiments while still learning about the constant, first-order and second-order terms of the model equation.
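The term counts quoted above are binomial coefficients, which a short sketch can confirm. The function name is an assumption for illustration.

```python
from math import comb

def term_counts(n_factors):
    """Number of model terms of each order in a two-level full factorial."""
    return {order: comb(n_factors, order) for order in range(n_factors + 1)}

counts = term_counts(4)                      # order 0 is the constant term
total_terms = sum(counts.values())           # always 2 ** n_factors
needed_terms = counts[0] + counts[1] + counts[2]

print(counts, total_terms, needed_terms)
print([sum(term_counts(n).values()) for n in (5, 6)])   # runs for 5 and 6 factors
```

The gap between the total number of terms and the number actually needed widens quickly as factors are added, which is what makes fractional designs attractive.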

In real work, limits on resources and time make this need for efficiency and cost control very common. Taking the four factors (A, B, C, D) as an example again, the full factorial design needs 16 experiments, but suppose the constraints allow only 8. Which 8 should be chosen? Analysis shows that the most reasonable choice is to select the runs according to the generator D = ABC (equivalently the defining relation ABCD = 1, where ABCD is called a "word"). This keeps the experimental design orthogonal and ensures that main effects are not confounded with second-order interaction effects, although the second-order interactions are still aliased with one another in pairs.
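The half fraction defined by D = ABC can be built and checked directly. The sketch below works in coded units and uses hypothetical helper names (`col`, `dot`); it verifies that the main-effect columns are orthogonal to every two-factor interaction column and lists which two-factor interactions share a column.

```python
from itertools import combinations, product
from math import prod

# Full factorial in A, B, C; the level of D is set by the generator D = ABC.
design = [(a, b, c, a * b * c) for a, b, c in product((-1, 1), repeat=3)]

def col(runs, factors):
    """Coded column of an effect, e.g. factors=(0, 1) is the AB interaction."""
    return [prod(run[i] for i in factors) for run in runs]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

mains = [(i,) for i in range(4)]
twos = list(combinations(range(4), 2))

mains_orthogonal = all(dot(col(design, m1), col(design, m2)) == 0
                       for m1, m2 in combinations(mains, 2))
main_vs_two = all(dot(col(design, m), col(design, t)) == 0
                  for m in mains for t in twos)
aliased_pairs = [(s, t) for s, t in combinations(twos, 2)
                 if col(design, s) == col(design, t)]

print(len(design), mains_orthogonal, main_vs_two)
print(aliased_pairs)   # index pairs: 0=A, 1=B, 2=C, 3=D
```

The three aliased pairs correspond to AB = CD, AC = BD and AD = BC, so the 8 runs can separate main effects from two-factor interactions but cannot separate the members of each pair from one another.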

A fractional factorial experiment may have several generators (that is, several words). The length of the shortest word among all the words is then defined as the resolution of the whole design, usually written in Roman numerals such as III, IV, V, etc. In the previous example the eight runs were obtained from ABCD = 1, so the resolution is IV and the design is denoted $2_{IV}^{4-1}$. More generally, a fractional factorial design of resolution R is denoted $2_{R}^{k-p}$, where k is the number of factors and p is the number of generators or words.
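Finding the shortest word can be sketched as follows. One point the sketch makes explicit is that, with several generators, "all words" includes the products of the generator words as well; multiplying two words cancels any letter that appears in both, which is a symmetric difference of letter sets. The function name and the two extra example designs are assumptions for illustration.

```python
from itertools import combinations

def resolution(generator_words):
    """Length of the shortest word in the defining relation."""
    words = [frozenset(w) for w in generator_words]
    lengths = []
    for r in range(1, len(words) + 1):
        for subset in combinations(words, r):
            product_word = frozenset()
            for w in subset:
                product_word = product_word ^ w   # repeated letters cancel
            lengths.append(len(product_word))
    return min(lengths)

print(resolution(["ABCD"]))            # the half fraction discussed above
print(resolution(["ABCE", "BCDF"]))    # a quarter fraction of six factors
print(resolution(["ABD", "ACE"]))      # a quarter fraction of five factors
```

For the second design the product of the two generator words is ADEF, so all three words have length four; for the third, the generators themselves are the shortest words.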

For the convenience of researchers, statisticians have compiled the resolution table for fractional factorial experiments shown in Table 2, and at the same time, in minit.