| Glossary for Design of Experiments |
|||||||||||||
ANOVA -- An acronym for
"ANalysis Of VAriance," a statistical technique that separates the variation in
an experiment into categories relating to the causes of the variation. For example, ANOVA
will separate variation into categories for each factor and each combination of factor
interactions.
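
As a rough illustration of that separation, the sketch below splits the total variation for a single factor into a between-level part and a within-level part. The data and the function name `one_way_sums_of_squares` are hypothetical; a full ANOVA would go on to compute mean squares and F-ratios, and would add a category for each further factor and interaction.

```python
from statistics import mean

def one_way_sums_of_squares(groups):
    """Split total variation into between-group and within-group parts."""
    all_values = [y for g in groups for y in g]
    grand = mean(all_values)
    # Total variation of every measurement about the grand average.
    ss_total = sum((y - grand) ** 2 for y in all_values)
    # Variation caused by the factor: group averages about the grand average.
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    # Variation left over inside each group (replicate scatter).
    ss_within = sum((y - mean(g)) ** 2 for g in groups for y in g)
    return ss_total, ss_between, ss_within

# Hypothetical replicates at three levels of one factor.
groups = [[4.0, 5.0, 6.0], [7.0, 8.0, 9.0], [1.0, 2.0, 3.0]]
ss_total, ss_between, ss_within = one_way_sums_of_squares(groups)
print(ss_total, ss_between, ss_within)
```

The two parts add up to the total, which is the bookkeeping that ANOVA relies on.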
|
|||||||||||||
Average -- The center of a
bell-shaped pile of data. The average is calculated by adding all replicates of a
measurement together and dividing by the number of replicates.
|
|||||||||||||
b-coefficients -- Coefficients for
the mathematical model representing a response surface. When all factors are coded on the
-1 to 1 scale, the b coefficients can be ranked by magnitude to determine the importance
of a factor's or interaction's contribution to the total variation in an experiment.
|
|||||||||||||
Bell-Shaped Pile of Data -- A
common shape for data when it is "piled up." Another name for a pile of data is
a "Histogram." A Bell-Shaped pile of data is also called a "Normal
Distribution" or a "Gaussian Distribution."
|
|||||||||||||
block -- A group of experiments.
Blocks are often used to eliminate "nuisance factors," factors that influence a
response, but are of no interest in an experiment.
|
|||||||||||||
Box-Cox Plot -- A plot of the log
of the standard deviation (vertical axis) against the log of the average (horizontal axis)
for a set of replicates. This plot can be used to test for consistent, or "homogeneous,"
estimates of standard deviation. If the plot shows a slope, the standard deviation
estimates may not be consistent. The slope can be used to determine a logical transformation
of the data to make the standard deviation estimates consistent. See "Box-Cox Transformation."
|
|||||||||||||
Box-Cox Transformation -- A
transformation used to make standard deviation estimates consistent. The transformation is
made by raising each response to the power of a number called "lambda." Lambda
is equal to 1 minus the slope of the line in a Box-Cox Plot.
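
A minimal sketch of this calculation follows. It assumes the slope is that of log standard deviation plotted against log average, and it uses the common convention (not stated in this glossary) that a lambda of zero means taking the logarithm. The function names and data are hypothetical.

```python
import math

def fit_slope(xs, ys):
    """Least-squares slope of ys against xs."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx

def box_cox_lambda(averages, std_devs):
    """Lambda = 1 - slope of log(s) plotted against log(average)."""
    log_avg = [math.log10(a) for a in averages]
    log_s = [math.log10(s) for s in std_devs]
    return 1.0 - fit_slope(log_avg, log_s)

def transform(y, lam, tol=1e-6):
    """Raise y to the power lambda; assumed convention: log when lambda is near 0."""
    return math.log(y) if abs(lam) < tol else y ** lam

# Hypothetical trials whose standard deviation grows in proportion to the average.
averages = [1.0, 10.0, 100.0]
std_devs = [0.1, 1.0, 10.0]
lam = box_cox_lambda(averages, std_devs)
print(lam, transform(50.0, lam))
```

Here the slope is 1, so lambda is 0 and the sketch falls back to the log transformation.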
|
|||||||||||||
central composite -- Central
composite designs allow for the collection of data to fit full quadratic models. They have
spherical geometries. They are one type of "Interaction Plus Star" design.
|
|||||||||||||
coding -- As it applies to DOE,
coding changes factor levels from their natural units (such as time or temperature) to a
-1 to 1 scale. This provides greater accuracy during computations and allows
b-coefficients to be ranked by their magnitude for screening.
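
Coding can be sketched as a linear rescaling, assuming the low level maps to -1 and the high level maps to 1. The function names `code_level` and `decode_level` are hypothetical.

```python
def code_level(value, low, high):
    """Convert a level in natural units to the -1 to 1 scale."""
    center = (high + low) / 2.0
    half_range = (high - low) / 2.0
    return (value - center) / half_range

def decode_level(coded, low, high):
    """Convert a coded level back to natural units."""
    center = (high + low) / 2.0
    half_range = (high - low) / 2.0
    return center + coded * half_range

# Hypothetical temperature factor running from 50 to 100 degrees.
print(code_level(50, 50, 100), code_level(75, 50, 100), code_level(100, 50, 100))
print(decode_level(0.5, 50, 100))
```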
|
|||||||||||||
coefficients -- Coefficients are
constants that are multiplied by factors in a mathematical model. They are often referred
to as "b-coefficients" in DOE.
|
|||||||||||||
combined s -- When standard
deviation estimates for different trials in an experiment are consistent they can be
combined to give a better estimate of the standard deviation. The method used to combine
standard deviations is called "pooling." Other names for consistent standard
deviation are "pooled s" and "pure error."
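
Pooling is usually done by averaging the variances, weighted by their degrees of freedom, and taking the square root. The sketch below assumes that standard formula; the function name `pooled_s` and the numbers are hypothetical.

```python
import math

def pooled_s(std_devs, counts):
    """Combine per-trial standard deviations, weighting each by its df (N - 1)."""
    weighted = sum((n - 1) * s ** 2 for s, n in zip(std_devs, counts))
    df = sum(n - 1 for n in counts)
    return math.sqrt(weighted / df), df

# Hypothetical trials: two standard deviation estimates from 2 replicates each.
s_combined, df = pooled_s([3.0, 4.0], [2, 2])
print(s_combined, df)
```

The combined estimate carries the summed degrees of freedom, which is why it is a better estimate than any single trial's value.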
|
|||||||||||||
confidence limits -- Confidence
limits are numbers between which an average is expected to lie with a certain probability
(confidence). For example, "The long term average for Young's Modulus for hot melt
inks lies between 2.38 and 2.99 with 95% confidence."
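
A minimal sketch of one common way to compute such limits: average plus or minus t times s divided by the square root of N. The replicate values are hypothetical, and the t value is passed in by hand from a table (2.776 is the tabulated two-sided 95% value for 4 degrees of freedom).

```python
import math
from statistics import mean, stdev

def confidence_limits(replicates, t_value):
    """Return (lower, upper) limits for the long-term average."""
    n = len(replicates)
    avg = mean(replicates)
    # stdev() uses N - 1 in the denominator, matching n - 1 degrees of freedom.
    half_width = t_value * stdev(replicates) / math.sqrt(n)
    return avg - half_width, avg + half_width

# Hypothetical replicate measurements (5 replicates, so 4 degrees of freedom).
replicates = [2.5, 2.7, 2.6, 2.8, 2.9]
low, high = confidence_limits(replicates, t_value=2.776)
print(low, high)
```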
|
|||||||||||||
consistent standard deviation estimates -- When the estimates for standard deviation at different trials all estimate the same
underlying standard deviation, the estimates are consistent. In other words, the standard
deviation for all trials is the same -- the estimates only differ because of response
variation.
|
|||||||||||||
constrained mixture -- A
constrained mixture is a mixture whose components are not allowed to vary over the entire
range of 0% to 100%. For example, a constrained mixture might require water to vary from
50% to 75%.
|
|||||||||||||
constrained mixture design -- A
constrained mixture design restricts the upper and lower levels for one or more components
to a range less than 0% to 100% and requires that the components in each trial add to
100%.
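
The two requirements can be checked for a candidate trial as sketched below. The component names, bounds, and the function name `is_valid_mixture_trial` are hypothetical.

```python
def is_valid_mixture_trial(trial, bounds, tol=1e-9):
    """trial maps component -> percent; bounds maps component -> (low, high)."""
    # The components in each trial must add to 100%.
    if abs(sum(trial.values()) - 100.0) > tol:
        return False
    # Each component must stay inside its restricted range.
    return all(
        bounds[name][0] - tol <= pct <= bounds[name][1] + tol
        for name, pct in trial.items()
    )

# Hypothetical ink formulation with water restricted to 50% to 75%.
bounds = {"water": (50.0, 75.0), "pigment": (0.0, 30.0), "resin": (0.0, 50.0)}
print(is_valid_mixture_trial({"water": 60.0, "pigment": 10.0, "resin": 30.0}, bounds))
print(is_valid_mixture_trial({"water": 40.0, "pigment": 10.0, "resin": 50.0}, bounds))
```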
|
|||||||||||||
constraint -- A constraint is a
limitation. For example, in a mixture all of the components must add to 100%. This
constraint is imposed by nature. Some constraints are imposed by the experimenter, such as
"the sum of the laser pulses and the laser power must be less than 89."
|
|||||||||||||
continuous -- Continuous means
that you can always find a number between any other two numbers, no matter how close
together they are. Time is continuous -- you can always find a moment in time
between any two other moments in time.
|
|||||||||||||
contour plots -- Contour plots are
plots that show lines of equal value. In DOE, contour plots show lines of equal responses,
usually equally spaced in response values. They provide an easy way to determine a
response value for a response surface.
|
|||||||||||||
correlation -- Correlation is a
relationship between two factors.
|
|||||||||||||
cubical geometry -- When all
factors in an experiment may vary from highest to lowest regardless of the levels of the
other factors the geometry is cubical. The easiest way to see this is to consider the cube
formed by the high and low levels of three factors.
|
|||||||||||||
D-Optimal -- D-Optimal refers to a
type of experiment design that attempts to produce the most accurate b-coefficients for a
model. These designs are quite useful for screening experiments. The D stands for
"determinant," a property of a matrix that is useful in generating these
designs.
|
|||||||||||||
degrees of freedom -- Degrees of
freedom provide a measure of the quality of a standard deviation estimate -- the larger
the degrees of freedom, the better the quality of the standard deviation estimate.
|
|||||||||||||
design -- This is short for
"experiment design." Various types of experiment designs are described below:
|
|||||||||||||
Design of Experiments (DOE) -- DOE
is a statistical technique that allows you to run the minimum number of experiments to
optimize your product or process. It involves determining the best experiments to run to
fit a particular mathematical model.
|
|||||||||||||
designed experiment -- A designed
experiment is an experiment with trials chosen to meet specific goals, including fitting a
particular mathematical model.
|
|||||||||||||
df -- This is shorthand for
"degrees of freedom."
|
|||||||||||||
discrete -- Discrete indicates
that only specific levels are possible for a factor. For example, if you have catalyst A
and catalyst B there is no level in between.
|
|||||||||||||
effects -- An effect is the
average change in a response for a change from low to high level of a factor, interaction,
quadratic term, etc. Effects can be ranked by magnitude to determine the strongest to
weakest of factors in a screening experiment.
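
For a two-level design in coded units, an effect can be sketched as the average response at the high level minus the average response at the low level. The responses and the function name `effect` below are hypothetical. In such a design the b-coefficient for a term is half its effect, because the coded level moves two units from -1 to 1.

```python
from statistics import mean

def effect(column, responses):
    """Average response at the high level minus average at the low level."""
    high = [y for x, y in zip(column, responses) if x > 0]
    low = [y for x, y in zip(column, responses) if x < 0]
    return mean(high) - mean(low)

# Hypothetical two-factor, two-level experiment in coded units.
x1 = [-1, 1, -1, 1]
x2 = [-1, -1, 1, 1]
x1x2 = [a * b for a, b in zip(x1, x2)]  # interaction column
y = [20.0, 30.0, 25.0, 45.0]

effects = {"X1": effect(x1, y), "X2": effect(x2, y), "X1X2": effect(x1x2, y)}
# Rank from strongest to weakest by magnitude.
ranked = sorted(effects, key=lambda name: abs(effects[name]), reverse=True)
print(effects, ranked)
```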
|
|||||||||||||
experiment design -- An experiment
design is a plan for collecting and analyzing data.
|
|||||||||||||
Experiment Design Assistant -- The
Experiment Design Assistant is a program that helps you to use the I-Optimal Design
Library to generate designs for your experimentation. The
program is available for free.
|
|||||||||||||
experimental error -- Experimental
error is a term used to refer to all of the uncertainty in an experiment, including
systematic error and random response variation.
|
|||||||||||||
face-centered-cubic (FCC) design -- An FCC design is a type of interaction plus star design in which the star points are in
the centers of the faces of a cube or hyper cube.
|
|||||||||||||
factor -- A factor is a variable
over which you have direct control in an experiment. Some examples are time, temperature,
and pressure.
|
|||||||||||||
factorial -- Factorial, as used in
DOE, means "pertaining to factors." It is only used when referring to
interaction designs.
|
|||||||||||||
Fisher -- Fisher is better known
as the "Experiment Design Assistant." It was named in honor of Sir Ronald
Fisher, the father of modern experimentation.
|
|||||||||||||
fractional factorial -- Fractional
factorial refers to an interaction design in which only a portion of the trials is run.
These typically have interaction order 2.
|
|||||||||||||
Gaussian Distribution -- A
Gaussian Distribution is a bell-shaped pile of data. It was named in honor of Carl
Friedrich Gauss, the first man to write about random response variation.
|
|||||||||||||
Gosset -- Gosset is the consummate
computer program for creating experiment designs. It allows for the routine creation of
I-Optimal designs. Gosset was named after Thorold Gosset, the first mathematician to study
geometrical structures in six, seven and eight dimensions, and William Sealy Gosset,
inventor of the t-distribution. The Gosset algorithm was developed by Dr. N.J.A. Sloane
and it was coded by Ron Hardin, both at AT&T Bell Labs.
|
|||||||||||||
I-Optimal -- I-Optimal is a
property of a design that makes it very good at making precise predictions. The I stands
for "Integrated Variance," the average variance for a design over its region of
interest.
|
|||||||||||||
I-Optimal design -- An I-Optimal
design is a design that attempts to provide the best predictions for any given trial. The
I stands for "Integrated Variance," the average variance for a design. I-Optimal
designs minimize the average variance for responses throughout a region of interest.
|
|||||||||||||
I-Optimal Design Assistant -- The
I-Optimal Design Assistant has a collection of I-Optimal designs created using Gosset. The
designs provide for a wide variety of experiments that cannot be performed using classical
designs. Designs can include different types of factors in the same experiment, such as
mixture and process factors in the same experiment. The Assistant is available for free.
|
|||||||||||||
interaction -- An interaction is a
joint effect of factors. For instance, time and temperature interact when baking a cake.
Both must be set together to get good results. Another common example is drug interaction,
where two medicines taken together produce an effect that neither could produce by itself.
|
|||||||||||||
interaction design -- An
interaction design is a design to fit an interaction model. Other names are factorial
design and fractional factorial design.
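
As an illustration, a full two-level factorial design lists every combination of low (-1) and high (1) coded levels. The sketch below generates one with the standard library; the function name `full_factorial` is hypothetical, and a fractional factorial would run only a chosen subset of these trials.

```python
from itertools import product

def full_factorial(n_factors):
    """Every combination of low (-1) and high (+1) levels for n_factors."""
    return [list(levels) for levels in product((-1, 1), repeat=n_factors)]

design = full_factorial(3)
for trial in design:
    print(trial)
```

Three factors give 2 x 2 x 2 = 8 trials, and each factor appears equally often at its low and high level.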
|
|||||||||||||
interaction order -- The
interaction order is the highest number of factors for which an interaction term exists in
a model. For example, models that include only 2-factor interactions have interaction
order 2. Models that include 3-factor interactions, but no higher, have interaction order
3.
|
|||||||||||||
interaction plus star design --
The interaction plus star design is a design specifically intended to fit a full quadratic
model. This design uses the corners of a cube or hyper cube, the center point, and the
star points. Examples of this design type are the central composite design and the face
centered cubic design.
|
|||||||||||||
K -- K is a factor used to
calculate statistical tolerance limits using the equation Tolerance = Average ± Ks,
where s is the standard deviation.
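
A minimal sketch, assuming the equation reads average plus or minus K times s. The value of K must come from a statistical table for the chosen confidence, population proportion, and number of replicates; the K used below is only a placeholder, and the data and function name are hypothetical.

```python
from statistics import mean, stdev

def tolerance_limits(replicates, k):
    """Return (lower, upper) limits as average -/+ K * s."""
    avg = mean(replicates)
    s = stdev(replicates)
    return avg - k * s, avg + k * s

# Hypothetical replicates; k=3.0 is a placeholder, not a tabulated value.
lower, upper = tolerance_limits([9.0, 10.0, 11.0], k=3.0)
print(lower, upper)
```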
|
|||||||||||||
lambda -- Lambda (λ) is a Greek letter
used to symbolize the power in a Box-Cox transformation: Y* = Y^λ. λ is equal to 1 minus the slope of
the line in a Box-Cox plot.
|
|||||||||||||
level -- The value to which a
factor should be set in an experiment. For example, 6 hrs. is a level for time.
|
|||||||||||||
main effect -- The main effect for
a factor is the effect on a response due to that factor only.
|
|||||||||||||
mathematical model -- A
mathematical model is an equation that can be used to make predictions of experimental
results. In DOE mathematical models are typically polynomials.
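
For example, a full quadratic polynomial in two coded factors can be evaluated as sketched below. The coefficient values and the function name `predict` are hypothetical.

```python
def predict(b, x1, x2):
    """Full quadratic model in two factors (coded units)."""
    return (
        b["b0"]                 # constant term
        + b["b1"] * x1          # main effect terms
        + b["b2"] * x2
        + b["b12"] * x1 * x2    # interaction term
        + b["b11"] * x1 ** 2    # quadratic terms
        + b["b22"] * x2 ** 2
    )

# Hypothetical b-coefficients.
b = {"b0": 50.0, "b1": 5.0, "b2": 3.0, "b12": 2.0, "b11": -4.0, "b22": -1.0}
print(predict(b, 0, 0), predict(b, 1, 1), predict(b, -1, 1))
```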
|
|||||||||||||
mean -- The mean is a particular
average, the sum of a number of replicates divided by the number of replicates. Other
types of average are the "median," the number above which half of the replicates
lie and below which half of the replicates lie, and the "mode," the most
frequently occurring value among the replicates.
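
The three kinds of average can be compared with Python's standard `statistics` module. The replicate values are hypothetical.

```python
from statistics import mean, median, mode

# Hypothetical replicates with one unusually high value.
replicates = [4.0, 5.0, 5.0, 6.0, 10.0]

print(mean(replicates))    # sum divided by the number of replicates
print(median(replicates))  # middle value when sorted
print(mode(replicates))    # most frequently occurring value
```

The single high value pulls the mean upward while the median and mode stay put.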
|
|||||||||||||
mixture -- A mixture is a
combination of components. Examples of mixtures are gasolines, inks, and your favorite
soda.
|
|||||||||||||
model -- A model is a theory
expressed as a surface.
|
|||||||||||||
N -- N is a symbol indicating the
number of replicates for a trial.
|
|||||||||||||
Normal Distribution -- A Normal
Distribution is a bell-shaped pile of data. It is a very normal shape for a pile of
industrial data.
|
|||||||||||||
Normal Probability Plot -- A
Normal Probability Plot, or Normal Plot, is a plot that makes data from bell-shaped piles
plot as a straight line. It is used to test for the consistency of standard deviation
estimates.
|
|||||||||||||
OFAT -- This is short for
"One-Factor-at-a-Time," an experimental technique in which only one factor is
varied in any experiment, the remaining factors being held constant. It fails to look for
interactions among the factors.
|
|||||||||||||
orthogonal -- Orthogonality is a
mathematical property of a matrix. In DOE the term is often used to indicate a very good
design.
|
|||||||||||||
polynomial -- A polynomial is an
equation that is the sum of a number of terms.
|
|||||||||||||
practical test -- The practical
test checks to see if your Sweet Spot meets your goal. This is not a common name -- it is
a term used only by Math Options.
|
|||||||||||||
pooled standard deviation -- The
pooled standard deviation is the combined standard deviation for a number of trials. This
is another name for combined standard deviation.
|
|||||||||||||
Predictions vs. Residuals Plot --
This is a plot used to help determine if standard deviation estimates are consistent.
|
|||||||||||||
probability -- Probability is the
proportion of time a given event can be expected to happen. For example, if the
probability of confidence limits being correct is 0.95 (95%), then you can expect your
confidence limits to be correct 95 times out of every 100 times you report them.
|
|||||||||||||
pure error -- Pure error is
another name for combined standard deviation. It is called pure error because it only
includes random response variation that you measured in your experiments.
|
|||||||||||||
r -- r is the correlation
coefficient. It can vary from -1 to 1. Values near -1 or 1 indicate very good correlation.
Values near 0 indicate very poor correlation.
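
A sketch of the usual (Pearson) calculation of r, written out by hand so that it does not depend on a particular Python version. The data and the function name are hypothetical. Squaring the result gives the coefficient of determination.

```python
import math

def correlation_coefficient(xs, ys):
    """Pearson correlation coefficient r between two lists of numbers."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical paired measurements.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 1.0, 4.0, 3.0, 5.0]
r = correlation_coefficient(xs, ys)
print(r, r ** 2)
```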
|
|||||||||||||
r² -- r² is the coefficient of
determination. It can range from 0 to 1. It represents the fraction of the observed
variation explained by the correlation.
|
|||||||||||||
r-critical -- r-critical is a
tabulated value that helps you to judge if r is statistically significant. If your r is
greater than r-critical, your r is statistically significant, i.e. your correlation
appears to be real.
|
|||||||||||||
random -- Random means that no
pattern is followed. Given a series of random numbers, you cannot predict the next number.
|
|||||||||||||
regression -- Regression is a
mathematical technique used to fit data to a mathematical model.
|
|||||||||||||
region of interest -- The region
of interest is the set of all experiments you may wish to predict results for. It is
generally represented by a cube.
|
|||||||||||||
replicate -- A replicate is a
measurement. If one measurement is made, you have one replicate. If two measurements are
made, you have two replicates, etc.
|
|||||||||||||
residual -- A residual is the
difference between a prediction and an observation.
|
|||||||||||||
response -- A response is a
variable over which you do not have direct control. You have to vary factors to change a
response. For example, you must vary the ingredients (factors) in a cake to change its
flavor (response).
|
|||||||||||||
response surface -- A response
surface is a surface that represents predicted responses to variations in factors. It can
have any number of dimensions depending on the number of factors.
|
|||||||||||||
response surface methodology (RSM) -- RSM is a technique that uses response surfaces to analyze experimental data. It is very
powerful in that it allows you to predict the results of experiments you have never
performed. It can also be used to predict the Sweet Spot.
|
|||||||||||||
response variation -- Response
variation refers to differences in replicate measurements. It refers specifically to the
random variation that is a part of nature and cannot be entirely eliminated.
|
|||||||||||||
run -- A run is the execution of
an experimental trial. Multiple executions of the same trial count as separate runs.
|
|||||||||||||
run order -- The order in which
experimental trials should be executed. This order should be random whenever possible.
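
A simple random run order can be produced by shuffling the list of trials, as sketched below. The function name `random_run_order` and the optional `seed` parameter (included only so an order can be reproduced) are hypothetical.

```python
import random

def random_run_order(trials, seed=None):
    """Return the trials in a random order without changing the input."""
    rng = random.Random(seed)
    order = list(trials)
    rng.shuffle(order)
    return order

# Hypothetical experiment with eight trials numbered 1 to 8.
run_order = random_run_order(range(1, 9), seed=1)
print(run_order)
```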
|
|||||||||||||
s -- s is the symbol for standard
deviation.
|
|||||||||||||
scientific method -- The
scientific method is a means of learning about nature. It is composed of observation,
reason, and experimentation.
|
|||||||||||||
scrambled -- A scrambled run order
is a random run order. It has been chosen from a large group of random run orders to
provide the best protection against shifts, drifts, and cycles in your data.
|
|||||||||||||
screening -- Screening is a
technique used to determine which factors in a list of contenders are most important.
Screening generally neglects interactions to keep the design as small as possible.
|
|||||||||||||
simplex lattice design -- A
simplex lattice design is a mixture design that can include vertices, center point,
centers of edges and centers of faces. All components can range from 0% to 100% of the
mixture.
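
One way to sketch the lattice is to list every blend whose proportions are multiples of 1/m and add to 1. The function name `simplex_lattice` and the parameter `m` (the number of equal steps along each edge) are assumptions for illustration.

```python
from itertools import product

def simplex_lattice(n_components, m):
    """All blends whose proportions are multiples of 1/m and sum to 1."""
    points = []
    for counts in product(range(m + 1), repeat=n_components):
        if sum(counts) == m:
            points.append(tuple(c / m for c in counts))
    return points

# Three components, two steps per edge: vertices plus centers of edges.
points = simplex_lattice(3, 2)
for p in points:
    print(p)
```

With three components, `m=2` gives the three vertices and three edge centers; `m=3` adds the overall center point among its ten blends.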
|
|||||||||||||
slope -- Slope measures the degree
of tilt of a line. It is the rise of the line divided by the run of the line. Positive
slope indicates rising from left to right. Negative slope indicates falling from left to
right. Zero slope indicates a horizontal line.
|
|||||||||||||
small design -- "Small
design" is a term used by STATISTICA to indicate that not all of the trials from the
full design are being used. These designs typically have an interaction order of 2.
|
|||||||||||||
spherical geometry -- When some
factors in an experiment may vary from higher or lower levels when the levels of the other
factors are at their middle levels the geometry is spherical. The easiest way to see this
is to consider the sphere that encloses the cube formed by the high and low levels of
three factors. The star points on this sphere are above the centers of the cube faces.
|
|||||||||||||
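standard deviation -- The standard
deviation is a measure of the width of the bell-shaped pile, or Normal distribution. It is
the half-width of the pile at about 61% of its height (the half-width at half height is
about 1.18 standard deviations). This width can be used to estimate the full width of the pile.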
|
|||||||||||||
star points -- These are points
added to an interaction design, or factorial design, to allow fitting a full quadratic
model. They are in the centers of faces for cubical geometries and on the sphere
surrounding the cube for spherical geometries.
|
|||||||||||||
STATISTICA -- A full-featured
statistics software package that can be used for experiment design and analysis.
|
|||||||||||||
statistical test -- The
"statistical test" compares the prediction limits for a trial that was not used
to fit your model with the observation for that trial. If the observed value lies between
the prediction limits, the test passes. If not, it fails. This term is not universal: it
is a Math Options term only.
|
|||||||||||||
STRATEGY -- A complete DOE
software package that can analyze any Gosset design. It is also the only DOE software with
I-Optimal designs built in (called Hardin-Sloane designs).
|
|||||||||||||
Student's t value -- Student's t
is a correction factor for standard deviations calculated from small sample sizes. It
allows you to state confidence limits.
|
|||||||||||||
Sweet Spot -- The Sweet Spot is
the experimental trial that meets all of your response goals simultaneously.
|
|||||||||||||
systematic error -- Systematic
error is experimental error that can, at least theoretically, be eliminated. It includes
mistakes, instrument drift, external factors that are not held constant, etc.
|
|||||||||||||
t -- Student's t value. Student's
t is a correction factor for standard deviations calculated from small sample sizes. It
allows you to state confidence limits.
|
|||||||||||||
term -- A term is a part of an
equation that is separated by plus and / or minus signs. For example, in Y = b0 + b1X1, b0
and b1X1 are terms.
|
|||||||||||||
tolerance limits -- Short for
"statistical tolerance limits," they are limits between which a large proportion
of all individual measured responses will lie with some confidence. For example, 95%
tolerance limits on 99% of the population tell you the limits between which 99% of all
future response measurements should lie with 95% confidence.
|
|||||||||||||
transformation -- A transformation
is a mathematical operation performed on responses to make their standard deviation
estimates consistent.
|
|||||||||||||
trial -- A trial is a set of
factors and their associated levels that completely specifies an experiment to run. For
example, "Temp = 100 deg C, Time = 2 hours, and Pressure = 35 PSI" is a trial.
|
|||||||||||||
triangular plots -- Triangular
plots allow you to make contour plots for mixtures in which all of the components are
shown. These are referred to as "Type 1 plots" by Math Options because they were
the first type used historically.
|
|||||||||||||
type I model -- A type I model is
a model for mixtures that includes every component and has no constant term. The term is
not widely used -- Math Options uses it to emphasize that it came first historically.
|
|||||||||||||
type II model -- A type II model
is a model for mixtures that includes every component but 1 and includes a constant term.
The term is not widely used -- Math Options uses it to emphasize that it came second
historically.
|
|||||||||||||
type 1 plot -- Type 1 plots allow
you to make contour plots for mixtures in which all of the components are shown. These are
referred to as "Type 1 plots" by Math Options because they were the first type
used historically. They are also called triangular plots.
|
|||||||||||||
type 2 plot -- Type 2 plots allow
you to make contour plots for mixtures in which all of the components but one are shown.
These are referred to as "Type 2 plots" by Math Options because they were the
second type used historically.
|
|||||||||||||
variance -- Variance is the square
of the standard deviation. Variance is useful because variances can be added while
standard deviations cannot.
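
A short sketch of what "can be added" means in practice, assuming the sources of variation are independent (an assumption not stated in this glossary): square each standard deviation, add the variances, then take the square root. The function name and values are hypothetical.

```python
import math

def combined_std_dev(std_devs):
    """Combine independent sources of variation by adding their variances."""
    total_variance = sum(s ** 2 for s in std_devs)
    return math.sqrt(total_variance)

# Hypothetical sources: measurement s = 3, process s = 4.
print(combined_std_dev([3.0, 4.0]))  # adding variances: 9 + 16 = 25
print(3.0 + 4.0)                     # adding standard deviations overstates it
```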
|
|||||||||||||
variation -- Variation refers to
differences in replicate measured responses for the same trial.
|
|||||||||||||
X1, X2, etc. -- These are symbols
used to indicate factors in a mathematical model.
|
|||||||||||||
Y -- Y is used to indicate a
response in a mathematical model.
|
|||||||||||||
Y-bar -- This is a term that
refers to the average for a response. It is symbolized as Ȳ (Y with a bar over it). |