As a simple example, many bakers have noticed that the "fluffiness" of a loaf of bread seems to be related to how much humidity there is in the air when the dough is being made. This can be formalized as the hypothesis: "all other things being equal, the greater the humidity, the fluffier the bread".
While this hypothesis might arise naturally from baking many loaves over time, an experiment to determine whether or not this is really true would be to carefully prepare bread dough, as identically as possible, on two types of days: days when the humidity is high, and days when the humidity is low. If the hypothesis is true, then the bread prepared on the high humidity days should be fluffier.
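One way to judge whether the high-humidity loaves really are fluffier, rather than fluffier by chance, can be sketched as an exact permutation test. This is only an illustration: the fluffiness scores are invented, the function name permutation_p_value is hypothetical, and the text above does not prescribe any particular statistical method.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(high, low):
    """One-sided exact permutation test for mean(high) > mean(low).

    Returns the observed difference in means and the fraction of all
    possible regroupings whose difference is at least as large.
    """
    pooled = list(high) + list(low)
    n_high = len(high)
    observed = mean(high) - mean(low)
    at_least_as_large = 0
    total = 0
    # Enumerate every way of relabelling n_high loaves as "high humidity".
    for idx in combinations(range(len(pooled)), n_high):
        chosen = set(idx)
        group_a = [pooled[i] for i in range(len(pooled)) if i in chosen]
        group_b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        # Small tolerance guards against floating-point rounding.
        if mean(group_a) - mean(group_b) >= observed - 1e-12:
            at_least_as_large += 1
        total += 1
    return observed, at_least_as_large / total


# Hypothetical fluffiness scores (arbitrary units), invented for illustration.
high_humidity = [5.1, 5.4, 5.0, 5.6]
low_humidity = [4.2, 4.5, 4.4, 4.1]

observed_diff, p_value = permutation_p_value(high_humidity, low_humidity)
print(f"difference in means: {observed_diff:.3f}, p = {p_value:.4f}")
```

A small p-value means that few random relabellings of the loaves produce a difference as large as the one observed, which is what the hypothesis predicts if humidity matters.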
Several features of this experiment hold in general for all experiments:
Experimental design attempts to balance the requirements and limitations of the field of science in which one works so that the experiment can provide the best conclusion about the hypothesis being tested.
In some sciences, such as physics and chemistry, it is relatively easy to meet the requirements that all measurements be made objectively and that all conditions be kept controlled across experimental trials. In other fields, such as biology and medicine, it is often hard to ensure that the conditions of an experiment are kept consistent; and in the social sciences, it may even be difficult to determine a method for measuring the outcomes of an experiment objectively.
For this reason, sciences such as physics are often referred to as "hard sciences", while others such as sociology are referred to as "soft sciences", in an attempt to capture the idea that objective measurements are often far easier in the former and far more difficult in the latter.
In addition, in the soft sciences, the requirement for a "controlled situation" may actually work against the utility of the hypothesis in a more general situation. When the desire is to test a hypothesis that works "in general", an experiment may have a great deal of internal validity, in the sense that it is valid in a highly controlled situation, while at the same time lacking external validity when its results are applied to a real-world situation.
As a result of these considerations, experimental design in the "hard" sciences tends to focus on the elimination of extraneous effects (type of flour, impurities in the water), while experimental design in the "soft" sciences focuses more on the problems of external validity, often through the use of statistical methods.
Many hypotheses in sciences such as physics can establish causality by noting that, until some phenomenon occurs, nothing happens; then when the phenomenon occurs, a second phenomenon is observed. But often in science, this situation is difficult to obtain.
For example, in the old joke, someone claims that they are snapping their fingers "to keep the tigers away"; and justifies this behaviour by saying "see - it's working!". While this "experiment" does not falsify the hypothesis "snapping fingers keeps the tigers away", it does not really support the hypothesis - not snapping your fingers keeps the tigers away as well.
To demonstrate a cause and effect hypothesis, an experiment must often show that, for example, a phenomenon occurs after a certain treatment is given to a subject, and that the phenomenon does not occur in the absence of the treatment.
A controlled experiment generally compares the results obtained from an experimental sample against a control sample, which is practically identical to the experimental sample except for the one aspect whose effect is being tested.
Controlled experiments can be particularly useful when it is difficult to exactly control all the conditions in an experiment. The experiment begins by creating two or more sample groups that are probabilistically equivalent, which means that measurements of traits should be similar among the groups and that the groups should respond in the same manner if given the same treatment. This equivalency is determined by statistical methods that take into account the amount of variation between individuals and the number of individuals in each group. In fields such as microbiology and chemistry, where there is very little variation between individuals and the group size is easily in the millions, these statistical methods are often bypassed and simply splitting a solution into equal parts is assumed to produce identical sample groups.
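The usual way to form such groups is random assignment, which can be sketched as follows. This is a minimal illustration and not a full experimental protocol; the function name randomly_assign, the subject identifiers and the fixed seed are all assumptions made for the example.

```python
import random


def randomly_assign(subjects, n_groups=2, seed=None):
    """Shuffle the subjects and deal them into n_groups groups of near-equal size."""
    rng = random.Random(seed)  # a seed makes the assignment reproducible
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    # Deal the shuffled subjects round-robin, like cards, into the groups.
    return [shuffled[i::n_groups] for i in range(n_groups)]


# Hypothetical subject identifiers, invented for illustration.
subject_ids = list(range(1, 21))
control, treatment = randomly_assign(subject_ids, n_groups=2, seed=0)
print("control:  ", sorted(control))
print("treatment:", sorted(treatment))
```

Because chance alone decides which group a subject joins, any trait that varies between individuals is expected to be spread evenly across the groups, and the larger the groups, the closer that balance tends to be.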
Once equivalent groups have been formed, the experimenter tries to treat them identically except for the one variable that he or she wishes to isolate. Human experimentation requires special safeguards against outside variables such as the placebo effect. Such experiments are generally double blind, meaning that neither the volunteer nor the researcher knows which individuals are in the control group or the experimental group until after all of the data has been collected. This ensures that any effects on the volunteer are due to the treatment itself and are not a response to the knowledge that he or she is being treated.
Sometimes controlled experiments are prohibitively difficult, so researchers resort to natural experiments. Natural experiments take advantage of predictable natural changes in simple systems to measure the effect of that change on some phenomenon.
Much of astronomy relies on experiments of this type. It is clearly impractical, when trying to prove the hypothesis "suns are collapsed clouds of hydrogen", to start out with a giant cloud of hydrogen, and then perform the experiment of waiting a few billion years for it to form a sun. However, by observing various clouds of hydrogen in various states of collapse, and other implications of the hypothesis (for example, the presence of various spectral emissions from the light of stars), we can collect the experimental data we require to support the hypothesis.
An early example of this type of experiment was the first verification in the 1600s that light does not travel from place to place instantaneously, but instead has a measurable speed. The observed appearances of the moons of Jupiter were slightly delayed when Jupiter was far from Earth, compared with when Jupiter was closer to Earth; this phenomenon was used to demonstrate that the time delays were consistent with a measurable speed of light.
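The size of the delay can be estimated with a short calculation. The sketch below uses modern values for the astronomical unit and the speed of light, not the figures available in the 1600s, and it assumes circular, coplanar orbits, so that the Earth-Jupiter distance changes by roughly the diameter of Earth's orbit between closest and farthest approach. The constant and function names are chosen for illustration.

```python
# Modern defined values (not the 17th-century estimates).
ASTRONOMICAL_UNIT_M = 149_597_870_700   # metres in one astronomical unit
SPEED_OF_LIGHT_M_S = 299_792_458        # metres per second


def light_delay_seconds(extra_distance_m):
    """Extra travel time for light crossing an additional distance."""
    return extra_distance_m / SPEED_OF_LIGHT_M_S


# Assumption: the extra path is about the diameter of Earth's orbit (2 AU).
extra_distance_m = 2 * ASTRONOMICAL_UNIT_M
delay_s = light_delay_seconds(extra_distance_m)
print(f"delay: {delay_s:.0f} s, or about {delay_s / 60:.1f} minutes")
```

Under these assumptions the delay comes to roughly a quarter of an hour, large enough to be noticed in careful timings of the moons.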
Quasi-experiments are very much like controlled experiments except that they lack probabilistic equivalency between groups. These types of experiments often arise in the area of medicine where, for ethical reasons, it is not possible to create a true control group. For example, one would not want to withhold all forms of treatment for a life-threatening disease from one group of patients in order to evaluate the effectiveness of another treatment on a different group of patients. Researchers compensate for this with complicated statistical methods.
See also thought experiment