The Classroom | Empowering Students in Their College Journey

The Relationship Between Scientific Method & Critical Thinking

Scott Neuffer

What Is the Function of the Hypothesis?

Critical thinking, the mind's ability to analyze claims about the world, is the intellectual basis of the scientific method. The scientific method can be viewed as an extended, structured mode of critical thinking that involves hypothesis, experimentation and conclusion.

Critical Thinking

Broadly speaking, critical thinking is any analytical thought aimed at determining the validity of a specific claim. It can be as simple as a nine-year-old questioning a parent’s claim that Santa Claus exists, or as complex as physicists questioning the relativity of space and time. Critical thinking is the point when the mind turns in opposition to an accepted truth and begins analyzing its underlying premises. As American philosopher John Dewey said, it is the “active, persistent and careful consideration of a belief or supposed form of knowledge in light of the grounds that support it, and the further conclusions to which it tends.”

Critical thinking initiates the act of hypothesis. In the scientific method, the hypothesis is the initial supposition, or theoretical claim about the world, based on questions and observations. If critical thinking asks the question, then the hypothesis is the best attempt at the time to answer the question using observable phenomena. For example, an astrophysicist may question existing theories of black holes based on his own observations. He may posit a contrary hypothesis, arguing that black holes actually produce white light. It is not a final conclusion, however, as the scientific method requires specific forms of verification.

Experimentation

The scientific method uses formal experimentation to analyze any hypothesis. The rigorous and specific methodology of experimentation is designed to gather unbiased empirical evidence that either supports or contradicts a given claim. Controlled variables are used to provide an objective basis of comparison. For example, researchers studying the effects of a certain drug may provide half the test population with a placebo pill and the other half with the real drug. The effects of the real drug can then be assessed relative to the control group.
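The placebo-control logic described above can be sketched numerically. The sample sizes and effect sizes below are invented purely for illustration; the point is only that the drug's effect is read off as the difference between the two groups' average outcomes, with the control group supplying the objective basis of comparison.

```python
import random
import statistics

random.seed(1)  # fixed seed so the illustration is reproducible

# Hypothetical outcome scores: the control (placebo) group varies
# around a baseline of 50; the treated group around 55.
control = [random.gauss(50, 5) for _ in range(100)]
treated = [random.gauss(55, 5) for _ in range(100)]

# The estimated drug effect is the difference in group means;
# without the control group there would be no baseline to compare against.
effect = statistics.mean(treated) - statistics.mean(control)
print(f"estimated effect: {effect:.1f}")
```

A real trial would of course add randomization checks and a significance test, but the comparison against a control group is the core of the design.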

In the scientific method, conclusions are drawn only after tested, verifiable evidence supports them. Even then, conclusions are subject to peer review and often retested before general consensus is reached. Thus, what begins as an act of critical thinking becomes, in the scientific method, a complex process of testing the validity of a claim. English philosopher Francis Bacon put it this way: “If a man will begin with certainties, he shall end in doubts; but if he will be content to begin with doubts, he shall end in certainties.”


  • How We Think: John Dewey
  • The Advancement of Learning: Francis Bacon

Scott Neuffer is an award-winning journalist and writer who lives in Nevada. He holds a bachelor's degree in English and spent five years as an education and business reporter for Sierra Nevada Media Group. His first collection of short stories, "Scars of the New Order," was published in 2014.


Microbial Biotechnology, 16(10), October 2023; PMC10527184

Science, method and critical thinking

Antoine Danchin

1 School of Biomedical Sciences, Li Ka Shing Faculty of Medicine, Hong Kong University, Pokfulam, Hong Kong, China

Science is founded on a method based on critical thinking. A prerequisite for this is not only a sufficient command of language but also the comprehension of the basic concepts underlying our understanding of reality. This constraint implies an awareness of the fact that the truth of the World is not directly accessible to us, but can only be glimpsed through the construction of models designed to anticipate its behaviour. Because the relationship between models and reality rests on the interpretation of founding postulates and instantiations of their predictions (and is therefore deeply rooted in language and culture), there can be no demarcation between science and non‐science. However, critical thinking is essential to ensure that the link between models and reality is gradually made more adequate to reality, based on what has already been established, thus guaranteeing that science progresses on this basis and excluding any form of relativism.

Science accepts that we can reach the truth of the World only via the creation of models. The method for doing so, based on critical thinking, is named here the Critical Generative Method.


Before illustrating the key requirements for critical thinking, one point must be made clear from the outset: thinking involves using language, and the depth of thought is directly related to the ‘active’ vocabulary (Magyar,  1942 ) used by the thinker. A recent study of young students in France showed that a significant percentage of the population had a very limited vocabulary, an unfortunate situation shared by many countries (Fournier & Rakocevic,  2023 ). This omnipresent fact, which undermines any attempt to improve critical thinking in the general population, is very visible in a great many texts published on social networks. It is all the more concerning because science uses a vocabulary that lies well beyond that available to most people. A word such as ‘metabolism’, for example, is generally not understood. It is therefore essential to agree on a minimal vocabulary before teaching paths to critical thinking. This may look trivial, but it is an essential prerequisite. Typically, words such as analysis and synthesis must be understood (and the idea of what a ‘concept’ is is not widely shared). It must also be remembered that, in the most creative times of science, the scientific vocabulary created its neologisms from the Ancient Greek language, and for a good reason: a considerable advantage of that unsaid rule is that it makes scientific objects and concepts prominent for scientists from all over the world, while precluding implicit domination by any country over the others when science is at stake (Iliopoulos et al.,  2019 ). Unfortunately, and this demonstrates how the domination of an ignorant subset of the research community gains ground, this rule is now seldom followed. This also highlights the limited scientific background of the majority of researchers: the creation of new words now follows the rule of the self‐assertive.
Interestingly, the very observation that a neologism in a scientific paper does not follow the traditional rule provides us with a critical way to identify either ignorance of the scientific background of the work or the presence in the text of hidden agendas that have nothing to do with science.

In practice, the process of critical thinking ought to begin with a step similar to the ‘due diligence’ required of investors when they decide whether or not to invest in a start‐up company. The first expected action should be ‘verify’, ‘verify’, ‘verify’… any statement used as a basis for the reasoning that follows. This requires not only understanding what is said or written (hence the importance of language), but also checking the origins of the statement: investigating who is involved and making sure the historical context is well known.

Of course, nobody has complete knowledge of everything, or indeed of anything, which means that at some point people have to accept that they will base their reasoning on some kind of ‘belief’. This inevitable imperative forces future scientists asking a question about reality to resort to a set of assertions called ‘postulates’ in conventional science, that is, beliefs temporarily accepted without further discussion but understood as such. The way in which postulates are formulated is therefore key to their subsequent role in science. Similarly, the fact that they are temporary is essential to understanding their role. A fundamental feature of critical thinking is the ability to identify these postulates and then remember that they are provisional in nature. When needed, this enables anyone to return to the origins of a line of reasoning and decide whether it is reasonable to retain the postulates, or to modify or even abandon them.

Here is an example, illustrated by the famous greenhouse effect that keeps our planet from being a snowball (Arrhenius,  1896 ). Note that understanding this phenomenon requires a fair amount of basic physics, as well as a trait that is often forgotten: common sense. There is no doubt that carbon dioxide is a greenhouse gas (this rests on well‐established physics which, nevertheless, must be accepted as a postulate by the majority, who would not be able to demonstrate it). However, a straightforward question arises, which is almost never asked in its proper details. There are many gases in the atmosphere, and the obvious preliminary question should be what they all are, and what each contributes to the greenhouse effect. A fraction of the general public partially understands this as asking for the contribution of methane, and sometimes N2O and ozone. However, this is far from enough, because the gas which contributes the most to the greenhouse effect on our planet is … water vapour (about 60% of the total effect: https://www.acs.org/climatescience/climatesciencenarratives/its-water-vapor-not-the-co2.html )! This fact is seldom highlighted. Yet it is extremely important, because water is such a strange molecule. Around 300 K water can evolve rapidly to form a liquid, a gas or a solid (ice). The transitions between these states (with only the gas having a greenhouse effect, while water droplets in clouds generally have a cooling effect) mean that water cannot directly control the Earth's temperature. Worse, in fact, these phase transitions amplify the fluctuations around a given temperature, generally in a feedforward way. We know the situation in deserts very well: the night temperature is very low, while the daytime temperature is very high. This explains why ‘global warming’ (i.e. shifting the average temperature of the planet upwards) is paralleled by an amplification of weather extremes.
It is quite remarkable that the role of water, which is well established, does not belong to popular knowledge. Standard ‘due diligence’ would have made this knowledge widely shared.

Another straightforward example of the need for a clear knowledge of the thought of our predecessors follows. When we see expressions such as ‘paradigm change’, ‘change of paradigm’, ‘paradigm shift’ or ‘shift of paradigm’ (12,424 articles listed in PubMed as of June 26, 2023), we should be aware that the subject of these articles has nothing to do with a paradigm shift, simply because such a change of paradigm is extremely rare, occurring over centuries at best (Kuhn,  1962 ). Worse, the use of the word implies that the authors of these works have most probably never read Thomas Kuhn's work, and are merely repeating fashionable hearsay. As a consequence, critical thinking should lead authentic scientists to put aside all these works before further developing their investigation (Figure  1 ).

Figure 1. Number of articles identified in the PubMed database with the keywords ‘paradigm change’ or ‘change of paradigm’ or ‘paradigm shift’ or ‘shift of paradigm’. Before 1993, very few such articles were published, and they generally reported information consistent with the Kuhnian view of scientific revolutions. Between 1993 and 2000 a looser view of the term paradigm began to be used in a metaphoric way. Since then the word has become fashionable while entirely losing its original meaning, betraying a lack of epistemological knowledge. This example of common behaviour illustrates the decadence of contemporary science.
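The count behind Figure 1 can be checked against the live PubMed database through the NCBI E-utilities ESearch endpoint. The sketch below only constructs the query URL; OR-ing the four quoted phrases is an assumption about how the original search was composed, and a count retrieved today will of course differ from the June 2023 snapshot.

```python
from urllib.parse import urlencode

# The four phrase variants counted in Figure 1, OR-ed together.
# Combining them this way is an assumption about the original query.
PHRASES = [
    "paradigm change",
    "change of paradigm",
    "paradigm shift",
    "shift of paradigm",
]

def esearch_url(phrases, db="pubmed"):
    """Build an NCBI E-utilities esearch URL that counts articles
    containing any of the given quoted phrases."""
    term = " OR ".join(f'"{p}"' for p in phrases)
    query = urlencode({"db": db, "term": term, "rettype": "count"})
    return ("https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?"
            + query)

url = esearch_url(PHRASES)
print(url)
```

Fetching that URL returns an XML document whose Count element gives the current number of matching articles.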

This being understood, we can now explore the general way science proceeds. This was previously discussed at a conference meant to explain the scientific method to an audience of Chinese philosophers, anthropologists and scientists, held at Sun Yat Sen (Zhong Shan) University in Canton (Guangzhou) in 1991, and the discussion is expanded in The Delphic Boat (Danchin,  2002 ). For a variety of reasons, it would be useful to anticipate the future of our world. This raises an unlimited number of questions, and the aim of the scientific method is to try to answer them. The way in which questions emerge is a subject in itself; it is not addressed here, but it should also be the subject of critical thinking (Yanai & Lercher,  2019 ).

The basis of scientific investigation accepts that, while the truth of the world exists in itself (‘relativism’ is foreign to scientific knowledge, as science keeps building its progress on previous knowledge, even when changing its paradigms), we can only access it through the mediation of a representation. This was extensively debated 2500 years ago, when science and philosophy designed the common endeavour meant to generate knowledge (Frank,  1952 ). It was then apparent that we cannot escape this omnipresent limitation of human rationality, as Xenophanes of Colophon explicitly stated at the time [discussed in Popper,  1968 ]. This limitation comes from an inevitable constraint: contrary to what many keep saying, data do not speak . Reality must be interpreted within the frame of a particular representation that critical thinking aims at making visible. A sentence that we all forget to reject, such as ‘results show…’, is meaningless: results are interpreted as meaning this or that.

Accepting this limitation is a difficult attribute of scientific judgement. Yet the quality of thought progresses as the understanding of this constraint becomes more effective: to answer our questions we have to build models of the world, and be satisfied with this perspective. It is through our models of the world that we are able to explore it and act upon it. We can even become the creators of new behaviours of reality, including new artefacts such as a laser beam, a physics‐based device that is unlikely to exist in the universe except in places inhabited by agents with an ability similar to ours. Indeed, to create models is to introduce a distance, a mediation through some kind of symbolic coding, between ourselves and the world. It is worth pointing out that this feature highlights how science builds its strength from its very radical weakness, which is to know that it is incapable, in principle, of attaining truth. Furthermore, and fortunately, we do not have to begin with a tabula rasa . Science keeps progressing. The ideas and models we have received from our fathers form the basis of our first representation of the world. The critical question we all face, then, is: how well do these models match up with reality? How do they fare in answering our questions?

Many, over time, think they have achieved the ultimate understanding of reality (or force others to think so) and abide by the knowledge reached at the time, precluding any progress. A few persist in asking questions about what remains enigmatic in the way things behave. Until fairly recently (and this can still be seen in the fashion for ‘organic’ things, or in the idea, similar to that of the animating ‘phlogiston’, that things spontaneously organize themselves in certain elusive circumstances usually represented by fancy mathematical models), things were thought to combine four elements, fire, air, water and earth, in a variety of proportions and combinations. In China, wood, a fifth element with some link to life, was added to the list. Later on, the world was assumed to result from the combination of 10 categories (Danchin,  2009 ). It took time to develop a physics of reality involving space, time, mass and energy, and what this means is still far from fully understood. How, in our times when the successes of the applications of science are so prominent, is it still possible to question generally accepted knowledge and to progress in the construction of a new representation of reality?

This is where critical thinking comes in. The first step must be to try to simplify the problem, to abstract from the blurred set of inherited ideas a few foundational concepts that will not immediately be called into question, at least at a preliminary stage of investigation. We begin by isolating a phenomenon whose apparent clarity contrasts with its environment. A key point in the process is to be aware that the links between correlation and causation are not trivial (Altman & Krzywinski,  2015 ). The confusion between the two is probably the major anti‐science behaviour preventing the development of knowledge. In our time, a better understanding of what causality is has become essential to understanding the present development of Artificial Intelligence (Schölkopf et al.,  2021 ), as this is directly linked to the process of rational decision (Simon,  1996 ).

Subsequently, a set of undisputed rules, phenomenological criteria and postulates is associated with the phenomenon. It temporarily constitutes the founding dogma of the theory, made up of the phenomenon of interest, the postulates, the model, and the conditions and results of its application to reality. This epistemological attitude can legitimately be described as ‘dogmatic’, and it remains unchanged for a long time in the progression of scientific knowledge. This is well illustrated by the fact that the word ‘dogma’, a religious word par excellence, is often misused when referring to a scientific theory. Many still refer, for example, to ‘the central dogma of molecular biology’ to describe the rules for rewriting the genetic program from DNA to RNA and then to proteins (Crick,  1970 ). Of course, critical thinking understands that this is no dogma, and variations on the theme are omnipresent, as seen for instance in the role of the enzyme reverse transcriptase, which allows RNA to be rewritten into a DNA sequence.
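The ‘dogma’ and its best-known exception can be stated in a few lines of code: transcription rewrites DNA into RNA, and reverse transcriptase runs the arrow the other way. This is a toy sketch of the information flow only, not of the chemistry.

```python
# Transcription: DNA -> RNA. The sequence is rewritten with
# uracil (U) in place of thymine (T).
def transcribe(dna: str) -> str:
    return dna.upper().replace("T", "U")

# Reverse transcription (the reverse-transcriptase exception):
# RNA -> DNA, restoring thymine.
def reverse_transcribe(rna: str) -> str:
    return rna.upper().replace("U", "T")

gene = "ATGGCTTAA"
mrna = transcribe(gene)
assert mrna == "AUGGCUUAA"
# The "dogma" is no dogma: the arrow can run backwards.
assert reverse_transcribe(mrna) == gene
```

The ease with which the rule is inverted in code mirrors the biological point: what Crick described was a set of rewriting rules, not an inviolable article of faith.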

Yet, whereas isolating postulates is an important step, it does not permit one to give explanations or make predictions. To go further, one must therefore initiate a constructive process. The essential step is the constitution of a model (or, in weaker instances, a simulation) of the phenomenon (Figure  2 ).

Figure 2. The Critical Generative Method. Science is based on the premise that while we can seek the truth of reality, reaching it is in principle impossible. The only way out is to build models of reality (‘realistic models’) and find ways to compare their outcomes to the behaviour of reality [see an explicit example for genome sequences in Hénaut et al.,  1996 ]. The ultimate model is a mathematical model, but this is rarely possible to achieve. Other models are based on simulations, that is, models that mimic the behaviour of reality without trying to propose an explanation of that behaviour. A primitive version of this endeavour is seen when people manipulate figurines in the hope of anticipating the behaviour of their environment (e.g. ‘voodoo’); this is also frequent in borderline science (Friedman & Brown,  2018 ).

To this aim, the postulates are interpreted in the form of entities (concrete or abstract), or of relationships between entities, which are then manipulated by an independent set of processes. The perfect stage, generally considered the ultimate one, associates the manipulation of abstract entities, interpreting postulates into axioms and definitions manipulable according to the rules of logic. The construction of a model therefore begins with a process of abstraction , which allows one to go from the postulates to the axioms. Quite often, however, one will not be able to axiomatize the postulates. It will only be possible to represent them using analogies involving the founding elements of another phenomenon, better known and considered analogous. One can also change the scale of a phenomenon (this is the case when mock‐ups are used as models). In these families of approaches, the model is considered a simulation. For example, it is possible to simulate an electromagnetic phenomenon using a hydrodynamic phenomenon [for a general example in physics, see Vives & Ricou,  1985 ]. In recent times the simulation is generally performed numerically, using (super)computers [e.g. at the mesoscopic scale typical of cells (Huber & McCammon,  2019 )]. While all these approaches have important implications, in terms of diagnostics for example, they are generally purely phenomenological and descriptive. Critical thinking understands this, despite the general tendency to mistake the mimic for what it represents. Recent artificial intelligence approaches that use ‘neuronal networks’ are not, at least for the time being, models of the brain.

However useful and effective, the simulation of a phenomenon is clearly an admission of failure. A simulation represents behaviour that conforms to reality, but does not explain it. Yet science aims to do more than simply represent a phenomenon; it aims to anticipate what will happen in the near and distant future. To get closer to the truth, we need to understand and explain, that is, reduce the representation to elementary principles that are as simple and as few as possible, in order to escape the omnipresent anecdotes that parasitize our vision of the future. In the case of the study of genomes, for example, this will lead us to question their origin and evolution. It will also require us to understand the formal nature of the control processes (of which feedback, for example, is one) that they encode. As soon as possible, therefore, we would like to translate the postulates that enabled the model's construction into well‐formed statements that will constitute the axioms and definitions of an explanatory model. At a later stage, the axioms and definitions will be linked together to create a demonstration leading to a theorem or, more often than not, a simple conjecture.

When based on mathematics, the model is made up of its axioms and definitions, and of the demonstrations and theorems it conveys. It is an entirely autonomous entity, which can only be justified by its own rules. To be valid, it must be true according to the rules of mathematical logic. Here, then, is an essential truth criterion, but one that can say nothing about the truth of the phenomenon. A key feature of critical thinking is the understanding that the truth of the model is not the truth of the phenomenon. Conflating these two truths, a habit common in magical thinking, often results in the model (identified with a portion of the world) being given a sacred value, and changes the role of the scientist to that of a priest.

Having started from the phenomenon of interest to build the model, we now need to return from the model to the real world. A process symmetrical to the one that provided the basis for the model, an instantiation of the conclusions summarized in the theorem, is now required. This can take the form of predictions, observations or experiments, of which at least two broad types can be identified: the predictions are either existential (the object, process or relations predicted by the instantiation of the theorem must be discovered) or phenomenological, and therefore subject to verification and refutation. An experimental set‐up will have to be constructed to explore what has been predicted by the instantiations of the model's theorems and to support or falsify the predictions. In the case of hypotheses based on genes, for example, this will lead to experiments with synthetic biology constructs (Danchin & Huang,  2023 ), where genes are replaced by counterparts, even ones made of atoms that differ from the canonical ones.

The reaction of reality, either to simple (passive) observation or to the observation of phenomena triggered by experiments, will validate the model and measure the degree of adequacy between the model and reality. This follows a constructive path when the model's shortcomings are identified and when the predicted new objects are discovered, objects that must now be included in further models of reality. This process makes the falsification of certain instantiated conclusions a major driving force for the progression of the model in line with reality. This part of the thought process is essential to escape an infinite regression of confirmation experiments, one after the other, ad infinitum. Identifying this type of situation, based on the understanding that the behaviour of the model is not reality but an interpretation of reality, is essential to promote critical thinking.

It must also be stressed that, of course, the burden of proving the model's adequacy to reality lies with the authors of the model. It would be both contrary to the simplest rules of logic (the proof of non‐existence is only possible for finite sets) and totally inefficient, as well as sterile, to produce an unfalsifiable model. This is indeed a critical way to identify the many pretenders who plague science. They are easy to recognize, since they identify themselves precisely by asking the others: ‘repeat my experiments again and show me that they are wrong!’. Unfortunately, this old conjuring trick is still widespread, especially in a world dominated by mass media looking for scoops, not for truth.

When certain predictions of the model are not verified, critical thinking forces us to study its relationship with reality, and we must proceed in reverse, following the path that led to these inadequate predictions (Figure  2 ). In this reverse process, we go backwards until we reach the postulates on which the model was built, at which point we modify, refine and, if necessary, change them. The explanatory power of the model increases each time we can reduce the number of postulates on which it is built. This is another way of developing critical thinking skills: the more factors underlie an explanation, the less reliable the model. As an example in molecular biology, the selective model used by Monod and coworkers to account for allostery (Monod et al.,  1965 ) used far fewer adjustable parameters than Koshland's induced‐fit model (Koshland,  1959 ).
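This parsimony argument (fewer adjustable parameters, more reliable model) is what information criteria such as Akaike's AIC formalize. AIC is not invoked by the author, so the sketch below is just one standard way of penalizing parameter count, applied to two hypothetical least-squares fits with invented residuals.

```python
import math

def aic(n: int, rss: float, k: int) -> float:
    """Akaike information criterion for a least-squares fit:
    n data points, residual sum of squares rss, k fitted parameters.
    Lower is better; each extra parameter costs 2 units."""
    return n * math.log(rss / n) + 2 * k

n = 50  # hypothetical number of data points

# Two hypothetical models with nearly the same fit quality:
aic_simple = aic(n, rss=10.2, k=2)   # a lean, MWC-style two-parameter model
aic_complex = aic(n, rss=10.0, k=6)  # a six-parameter model fitting barely better

# The marginal gain in fit does not pay for four extra parameters,
# so the leaner model is preferred.
assert aic_simple < aic_complex
```

A complex model only wins under such a criterion when its extra parameters buy a substantial improvement in fit, which is precisely the sense in which Monod's selective model outperformed the induced-fit alternative.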

In real‐life situations, this reverse path is long and difficult to build. The model's resistance to change is quickly organized, if only because, lacking critical thinking, its creators cannot help thinking that, in fact, the model manifests, rather than represents, the truth of the world. It is only natural, then, to think that the lack of predictive power is primarily due not to the model's inadequacy, but to the inappropriate way in which its broad conclusions have been instantiated. This corresponds, in effect, to a stage where formal terms have been interpreted in terms of real behaviour, which involves a great deal of fine‐tuning. Because it is inherently difficult to identify the inadequacy of the model or its links with the phenomenon of interest, it is often the case that a model persists, sometimes for a very long time, despite numerous signs of imperfection.

During this critical process, the very nature of the model is questioned, and its construction, the meaning it represents, is clarified and refined under the constraint of contradictions. The very terms of the instantiations of predictions, or of the abstraction of the founding postulates, are made finer and finer. This is why the dogmatic stage plays such an essential role: a model that was too inadequate would have been quickly discarded and would not have been able to generate and advance knowledge, whereas a succession of improvements leads to an ever finer understanding, and hence a better representation, of the phenomenon of interest. Then comes a time when the very axioms on which the model is based are challenged, and when the most recent abstractions made from the initial postulates lead to the postulates themselves being called into question. This is of course very rare and difficult, and it is the source of those genuine scientific revolutions, those paradigm shifts (to use Thomas Kuhn's word), from which new models are born, develop and die, based on assumptions that differ profoundly from those of their predecessors. This manifests an ultimate, but extremely rare, success of critical thinking.

A final comment. Karl Popper, in his Logik der Forschung ( The Logic of Scientific Discovery ), tried to show that there was a demarcation separating science from non‐science (Keuth and Popper,  1934 ). This resulted from the implementation of a refutation process, which he named falsification, sufficient to tell the observer that a model was failing. However, as displayed in Figure  2 , refutation does not work directly on the model of interest, but on the interpretation of its predictions . This means that while science is associated with a method, its implementation in practice is variable, and its borders are fuzzy. In fact, trying to match models with reality allows us to progress by producing better adequacy with reality (Putnam,  1991 ). Nevertheless, because the separation between models and reality rests on interpretations (processes rooted in culture and language), establishing an explicit demarcation is impossible. This intrinsic difficulty, associated with a property that we could name ‘context associated with a research programme’ (Lakatos,  1976 , 1978 ), shows that the demarcation between science and non‐science is dominated by a particular currency of reality, which we have to consider under the name information , using the word with all its common (and accordingly fuzzy) connotations, and which operates in addition to the standard categories of mass, energy, space and time.

The first attempts to resolve contradictions between model predictions and observed phenomena do not immediately discard the model, as Popper would have it. The common practice is for the authors of a model to re‐interpret the instantiation process that coupled the theorem to reality. Typically: ‘exceptions make the rule’, or ‘this is not exactly what we meant, we need to focus more on this or that feature’, and so on. This polishing step is essential: it allows the frontiers of the model and its associated phenomena to be defined as accurately as possible. It marks the moment when technically arid efforts, such as defining a proper nomenclature or a database schema, play a central role. In contrast to the hopes of Popper, who sought a principle, refutation, telling us whether a particular creation of knowledge can be named Science, there is no ultimate demarcation between science and non‐science. Then comes a time when, despite all efforts to reconcile predictions and phenomena, the inadequacy between the model and reality becomes insoluble. Assuming no mistake in the demonstration (within the model), this contradiction implies that we need to reconsider the axioms and definitions upon which the model has been constructed. This is the time when critical thinking becomes imperative.

AUTHOR CONTRIBUTIONS

Antoine Danchin: Conceptualization (lead); writing – original draft (lead); writing – review and editing (lead).

CONFLICT OF INTEREST STATEMENT

This work belongs to efforts pertaining to epistemological thinking and does not imply any conflict of interest.

ACKNOWLEDGEMENTS

The general outline of the Critical Generative Method, presented at Zhong Shan University in Guangzhou, China in 1991, and discussed over the years in the Stanislas Noria seminar (https://www.normalesup.org/~adanchin/causeries/causeries-en.html), has previously been published in Danchin (2009) and in a variety of texts. Because scientific knowledge results from the accumulation of knowledge painstakingly created by the generations that preceded us, the present text purposely makes reference to work which is seldom cited, at a moment when scientists become amnesiac and tend to reinvent the wheel.

Danchin, A. (2023) Science, method and critical thinking. Microbial Biotechnology, 16, 1888–1894. https://doi.org/10.1111/1751-7915.14315

  • Altman, N. & Krzywinski, M. (2015) Association, correlation and causation. Nature Methods, 12, 899–900.
  • Arrhenius, S. (1896) XXXI. On the influence of carbonic acid in the air upon the temperature of the ground. The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science, 41, 237–276.
  • Crick, F. (1970) Central dogma of molecular biology. Nature, 227, 561–563.
  • Danchin, A. (2002) The Delphic boat: what genomes tell us. Cambridge, MA: Harvard University Press.
  • Danchin, A. (2009) Information of the chassis and information of the program in synthetic cells. Systems and Synthetic Biology, 3, 125–134.
  • Danchin, A. & Huang, J.D. (2023) synbio 2.0, a new era for synthetic life: neglected essential functions for resilience. Environmental Microbiology, 25, 64–78.
  • Fournier, Y. & Rakocevic, R. (2023) Objectifs éducation et formation 2030 de l'UE: où en est la France en 2023? Note d'Information, 23, 20.
  • Frank, P. (1952) The origin of the separation between science and philosophy. Proceedings of the American Academy of Arts and Sciences, 80, 115–139.
  • Friedman, H.L. & Brown, N.J.L. (2018) Implications of debunking the “critical positivity ratio” for humanistic psychology: introduction to special issue. Journal of Humanistic Psychology, 58, 239–261.
  • Hénaut, A., Rouxel, T., Gleizes, A., Moszer, I. & Danchin, A. (1996) Uneven distribution of GATC motifs in the Escherichia coli chromosome, its plasmids and its phages. Journal of Molecular Biology, 257, 574–585.
  • Huber, G.A. & McCammon, J.A. (2019) Brownian dynamics simulations of biological molecules. Trends in Chemistry, 1, 727–738.
  • Iliopoulos, I., Ananiadou, S., Danchin, A., Ioannidis, J.P., Katsikis, P.D., Ouzounis, C.A. et al. (2019) Hypothesis, analysis and synthesis, it's all Greek to me. eLife, 8, e43514.
  • Koshland, D.E. (1959) Enzyme flexibility and enzyme action. Journal of Cellular and Comparative Physiology, 54, 245–258.
  • Kuhn, T.S. (1962) The structure of scientific revolutions, 3rd edition. Chicago, IL: University of Chicago Press.
  • Lakatos, I. (1976) Proofs and refutations: the logic of mathematical discovery. Cambridge: Cambridge University Press.
  • Lakatos, I. (1978) The methodology of scientific research programmes. Cambridge: Cambridge University Press.
  • Magyar, F. (1942) The compilation of an active vocabulary. The German Quarterly, 15, 214–217.
  • Monod, J., Wyman, J. & Changeux, J.P. (1965) On the nature of allosteric transitions: a plausible model. Journal of Molecular Biology, 12, 88–118.
  • Popper, K.R. (1934) Logik der Forschung, 4th revised edition. Berlin: Akademie Verlag. Translation prepared by the author (1959): The logic of scientific discovery. London: Hutchinson & Co.
  • Popper, K.R. (1968) Conjectures and refutations: the growth of scientific knowledge. London; New York, NY: Routledge.
  • Putnam, H. (1991) Representation and reality. Cambridge, MA: MIT Press.
  • Schölkopf, B., Locatello, F., Bauer, S., Ke, N.R., Kalchbrenner, N., Goyal, A. et al. (2021) Towards causal representation learning. arXiv, 2021, 11107.
  • Simon, H.A. (1996) The sciences of the artificial, 3rd edition. Cambridge, MA: MIT Press.
  • Vives, C. & Ricou, R. (1985) Experimental study of continuous electromagnetic casting of aluminum alloys. Metallurgical Transactions B, 16, 377–384.
  • Yanai, I. & Lercher, M. (2019) What is the question? Genome Biology, 20, 1314–1315.

Enhancing Scientific Thinking Through the Development of Critical Thinking in Higher Education

  • First Online: 22 September 2019


Heidi Hyytinen, Auli Toom & Richard J. Shavelson


Contemporary higher education is committed to enhancing students’ scientific thinking in part by improving their capacity to think critically, a competence that forms a foundation for scientific thinking. We introduce and evaluate the characteristic elements of critical thinking (i.e. cognitive skills, affective dispositions, knowledge), problematising the domain-specific and general aspects of critical thinking and elaborating justifications for teaching critical thinking. Finally, we argue that critical thinking needs to be integrated into curriculum, learning goals, teaching practices and assessment. The chapter emphasises the role of constructive alignment in teaching and use of a variety of teaching methods for teaching students to think critically in order to enhance their capacity for scientific thinking.




Author information

Authors and affiliations.

University of Helsinki, Helsinki, Finland

Heidi Hyytinen & Auli Toom

Stanford University, Stanford, CA, USA

Richard J. Shavelson


Corresponding author

Correspondence to Heidi Hyytinen .

Editor information

Editors and affiliations.

Faculty of Education and Culture, Tampere University, Tampere, Finland

Mari Murtonen

Department of Higher Education, University of Surrey, Guildford, UK

Kieran Balloo


Copyright information

© 2019 The Author(s)

About this chapter

Hyytinen, H., Toom, A., Shavelson, R.J. (2019). Enhancing Scientific Thinking Through the Development of Critical Thinking in Higher Education. In: Murtonen, M., Balloo, K. (eds) Redefining Scientific Thinking for Higher Education. Palgrave Macmillan, Cham. https://doi.org/10.1007/978-3-030-24215-2_3


Print ISBN : 978-3-030-24214-5

Online ISBN : 978-3-030-24215-2



Accelerate Learning


Critical Thinking in Science: Fostering Scientific Reasoning Skills in Students

ALI Staff | Published  July 13, 2023

Thinking like a scientist is a central goal of all science curricula.

As students learn facts, methodologies, and methods, what matters most is that all their learning happens through the lens of scientific reasoning.

That way, when it comes time for them to take on a little science themselves, either in the lab or by theoretically thinking through a solution, they understand how to do it in the right context.

One component of this type of thinking is being critical. Because it is grounded in facts and evidence, critical thinking in science isn’t exactly the same as critical thinking in other subjects.

Students have to doubt the information they’re given until they can prove it’s right.

They have to truly understand what’s true and what’s hearsay. It’s complex, but with the right tools and plenty of practice, students can get it right.

What is critical thinking?

This particular style of thinking stands out because it requires reflection and analysis. Rooted in what's logical and rational, thinking critically is all about digging deep and going beyond the surface of a question to establish the quality of the question itself.

It ensures students put their brains to work when confronted with a question rather than taking every piece of information they’re given at face value.

It’s engaged, higher-level thinking that will serve them well in school and throughout their lives.

Why is critical thinking important?

Critical thinking is important when it comes to making good decisions.

It gives us the tools to think through a choice rather than quickly picking an option — and probably guessing wrong. Think of it as the all-important ‘why.’

Why is that true? Why is that right? Why is this the only option?

Finding answers to questions like these requires critical thinking. Questions like these require you to really analyze both the question itself and the possible solutions to establish validity.

Will that choice work for me? Does this feel right based on the evidence?

How does critical thinking in science impact students?

Critical thinking is essential in science.

It’s what naturally takes students in the direction of scientific reasoning since evidence is a key component of this style of thought.

It’s not just about whether evidence is available to support a particular answer but how valid that evidence is.

It’s about whether the information the student has fits together to create a strong argument and how to use verifiable facts to get a proper response.

Critical thinking in science helps students:

  • Actively evaluate information
  • Identify bias
  • Separate the logic within arguments
  • Analyze evidence

4 Ways to promote critical thinking

Figuring out how to develop critical thinking skills in science means looking at multiple strategies and deciding what will work best at your school and in your class.

Depending on your student population and their needs and abilities, not every option will be a home run.

These particular examples are all based on the idea that for students to really learn how to think critically, they have to practice doing it. 

Each focuses on engaging students with science in a way that will motivate them to work independently as they hone their scientific reasoning skills.

Project-Based Learning

Project-based learning centers on critical thinking.

Teachers can shape a project around the thinking style to give students practice with evaluating evidence or other critical thinking skills.

Critical thinking also happens during collaboration, evidence-based thought, and reflection.

For example, setting students up for a research project is not only a great way to get them to think critically, but it also helps motivate them to learn.

Allowing them to pick the topic (that isn’t easy to look up online), develop their own research questions, and establish a process to collect data to find an answer lets students personally connect to science while using critical thinking at each stage of the assignment.

They’ll have to evaluate the quality of the research they find and make evidence-based decisions.

Self-Reflection

Adding a question or two to any lab practicum or activity requiring students to pause and reflect on what they did or learned also helps them practice critical thinking.

At this point in an assignment, they’ll pause and assess independently. 

You can ask students to reflect on the conclusions they came up with for a completed activity, which really makes them think about whether there's any bias in their answer.

Addressing Assumptions

One way critical thinking aligns so perfectly with scientific reasoning is that it encourages students to challenge all assumptions. 

Evidence is king in the science classroom, but even when students work with hard facts, there comes the risk of a little assumptive thinking.

Working with students to identify assumptions in existing research, or asking them to address an issue where they suspend their own judgment and simply look at established facts, polishes that critical eye.

They’re getting practice tossing out opinions, unproven hypotheses, and speculation in exchange for real data and real results, just like a scientist has to do.

Lab Activities With Trial-And-Error

Another component of critical thinking (as well as thinking like a scientist) is figuring out what to do when you get something wrong.

Backtracking can mean you have to rethink a process, redesign an experiment, or reevaluate data because the outcomes don’t make sense, but it’s okay.

The ability to get something wrong and recover is not only a valuable life skill, but it’s where most scientific breakthroughs start. Reminding students of this is always a valuable lesson.

Labs that include comparative activities are one way to increase critical thinking skills, especially when introducing new evidence that might cause students to change their conclusions once the lab has begun.

For example, you provide students with two distinct data sets and ask them to compare them.

With only two choices, there is a finite number of conclusions to draw, but then what happens when you bring in a third data set? Will it void certain conclusions? Will it let students draw new conclusions, ones even more deeply rooted in evidence?
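The comparative activity above can be sketched in a few lines of Python. This is a hypothetical illustration, not part of any curriculum: the data sets (plant-growth measurements under different light conditions) and the helper `summarize` are invented for the example.

```python
from statistics import mean, stdev

# Hypothetical class data: plant growth (cm) under two light conditions.
full_sun = [12.1, 13.4, 11.8, 12.9, 13.0]
shade = [8.2, 7.9, 8.8, 8.4, 8.0]

def summarize(name, data):
    """One-line summary students can compare across data sets."""
    return f"{name}: mean={mean(data):.1f} cm, spread={stdev(data):.1f} cm"

print(summarize("full sun", full_sun))
print(summarize("shade", shade))
# With only these two sets, the tentative conclusion is
# "more sunlight means more growth".

# A third data set forces students to re-examine that conclusion:
artificial_light = [14.0, 13.6, 14.3, 13.9, 14.1]
print(summarize("artificial light", artificial_light))
# Growth is highest without any sun at all, so the earlier conclusion
# is voided; the evidence now supports a broader claim about light
# intensity rather than sunlight specifically.
```

The point of the exercise is exactly this moment: the new evidence does not fit the first conclusion, so students must revise it rather than defend it.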

Thinking like a scientist

When students get the opportunity to think critically, they’re learning to trust the data over their ‘gut,’ to approach problems systematically and make informed decisions using ‘good’ evidence.

When practiced enough, this ability will engage students in science in a whole new way, providing them with opportunities to dig deeper and learn more.

It can help enrich science and motivate students to approach the subject just like a professional would.




Develop Your Critical Thinking


Last updated on 3/17/22

Use the Scientific Method

Identify the Pillars of Science

Critical thinking isn’t a scientific discipline like mathematics, linguistics, or sociology. However, it can use scientific methods to reduce the risk of making a mistake, which we’ll look at in this chapter.

These four points often characterize science:

Scientists: those who participate in scientific research, produce knowledge by analyzing the raw results, and publish articles. This group can also include some of the experts who appear in the media; while they aren’t researchers themselves, they represent a source of potentially reliable information on a particular subject.

The sum of human knowledge: this knowledge is treated as fact and explains how the world works. While not necessarily the truth, it aims to get ever closer to it. Any piece of scientific knowledge can be revised or overturned by new evidence that calls previous results into question.

Technopolitics:  those who decide what or how resources are devoted to particular scientific research subjects. 

The scientific approach: a rigorous, demanding intellectual approach designed to make testable, credible, objective statements about the natural and social world. This approach attempts to exclude the biases that could affect the researcher, the subject (if a person), or the analyst who compiles the statistical results. Lastly, the research must be reproducible. However, there is no such thing as zero risk. Furthermore, the way the media presents and explains scientific results can lead to a biased interpretation.

Practice the Scientific Method

The scientific method is characterized by the following:

The systematic search for errors.

The consideration of potential biases before, during, and after the experiment.

The ability to reproduce  the experiment.

While this approach is used in the “hard” and human sciences, it can also be applied to critical thinking and uncovering the truth.

For example, you can use it in a market study or analyze marketing operations to guarantee reliability and minimize risks. Above all, it is about following the scientific method. It will help alert you to any errors in the decision-making process.
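For instance, a marketing A/B test can borrow the method directly: state a hypothesis, run a controlled comparison, and check whether the observed difference is bigger than chance alone would produce. The sketch below uses invented campaign numbers and a standard two-proportion z-test:

```python
# Two-proportion z-test on a hypothetical email campaign: did the new
# subject line (variant) really outperform the current one (control)?
from math import sqrt, erfc

# Hypothetical results: clicks out of emails sent
a_clicks, a_sent = 120, 2400   # control
b_clicks, b_sent = 156, 2400   # variant

p_a, p_b = a_clicks / a_sent, b_clicks / b_sent
p_pool = (a_clicks + b_clicks) / (a_sent + b_sent)  # pooled rate under H0
se = sqrt(p_pool * (1 - p_pool) * (1 / a_sent + 1 / b_sent))
z = (p_b - p_a) / se
p_value = erfc(abs(z) / sqrt(2))  # two-sided p-value from the normal tail

print(f"control rate {p_a:.1%}, variant rate {p_b:.1%}, p = {p_value:.3f}")
# A small p-value suggests the difference is unlikely to be pure chance,
# but the result still deserves replication before it drives a decision.
```

The mechanics matter less than the discipline: a pre-stated hypothesis, a controlled comparison, and an explicit check against chance before acting on the result.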

Remember, though: science is apolitical and amoral. It aims to explain the world as it is, not as it ought to be; it is humans who bend it toward good or bad ends.

The  media  can be tough on science. There are several reasons for this:

Scientists are not always great communicators. They are often highly specialized and lack the skills to communicate in layman's terms. Exceptions such as Neil deGrasse Tyson, the famous director of the Hayden Planetarium in New York, only confirm the rule.

A scientific article (research protocols, potential conflicts, results) is not  easy to read or understand.  It is not unusual for a blog or journal article to report research conclusions inaccurately or incorrectly. 

The scientific community is just like any other, with  biased,  sensitive, motivated, empathetic (or otherwise) individuals. In other words, they are fallible. 

Readers don't always understand that a scientific  article is not an opinion.  It includes statistical data and peer review. 

Even peer-reviewed studies don’t offer proof. Though most published studies report statistically significant findings, at the conventional 5% significance level some of those findings will be false positives. The truth lies in the replication of an investigation by independent researchers who arrive at the same conclusion.
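A quick simulation (illustrative only, not tied to any particular study) shows why replication helps: when there is no real effect, a single test at the 5% level still comes out "significant" about 5% of the time, while requiring an independent replication to agree drops the false-positive rate to roughly 0.25%.

```python
# Simulate studies of a nonexistent effect and count false positives,
# with and without requiring an independent replication.
import random

random.seed(1)
ALPHA = 0.05
TRIALS = 100_000

def null_study():
    """One study of a nonexistent effect: its p-value is uniform on
    [0, 1], so it is 'significant' with probability ALPHA."""
    return random.random() < ALPHA

single = sum(null_study() for _ in range(TRIALS)) / TRIALS
replicated = sum(null_study() and null_study() for _ in range(TRIALS)) / TRIALS

print(f"false positives, single study:     {single:.3%}")
print(f"false positives, with replication: {replicated:.3%}")
```

The replication requirement multiplies the two error probabilities (0.05 × 0.05 = 0.0025), which is exactly why independent confirmation is worth so much more than a second headline about the same study.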

When you hear, “A U.S. study shows that…,” treat it as low-quality evidence; a single study shouldn’t move your credibility gauge. A meta-analysis, on the other hand, gives you a fairly precise idea of the state of knowledge on a given subject.

Test Your Understanding of the Method


Try to rank the following statements in order of credibility:

My brother, a telecoms expert, tells me there's nothing to fear from 5G.

The Shift Project reports that online video streaming generates as high a level of greenhouse gases as an entire country such as Spain.

A Cochrane Review shows that when a company gives step trackers to its employees to increase their physical activity, it has no noticeable effect on health.

A BBC report shows that glyphosate has to be banned. Professor Séralini's studies in 2012 had the same conclusions. 

Are you finished?

Here is the order, from most credible to least:

C. Read this article "Do activity monitors help adults with stroke become more physically active?" from the Cochrane Library.

B. Read this page from The Shift Project.

A and D are equal.

Note that only statements C and D were the subject of a scientific publication.

The scientific method helps you avoid making mistakes when thinking of a situation. However, there are many ways to approach critical thinking; feel free to dig deeper into some of them:

A philosophical-historical approach from Greek philosophers or the Enlightenment (university studies, philosophy, etc.).

A popular approach (YouTube channels, such as  Big Think ,  Veritasium ,  Kevin deLaplante’s channel ,  Holy Koolaid , and  Captain Disillusion ).

Real-life applications, such as the study of paranormal phenomena (i.e.,  James Randi ’s debunking of psychics,  Mark Rober ’s scientific testing of common myths, MythBusters, resources on  Skeptic.org.uk ,  Questions for science , etc.).

Read dedicated texts:  Factfulness: Ten Reasons We’re Wrong About the World and Why Things Are Better Than You Think  by Hans Rosling, Ola Rosling, and Anna Rosling Rönnlund,  Black Box Thinking  by Matthew Syed,  The Art of Thinking Clearly  by Rolf Dobelli, and  The Demon-Haunted World: Science as a Candle in the Dark  by Carl Sagan.

Use a teaching resource approach (MOOC, TED-Ed, GoogleTalks, Khan Academy, Crash Course, The School of Life, etc.). 

Let's Recap!

In this chapter, you learned that the scientific method can be a useful tool in critical thinking. To apply it, you need to observe the following points:

Always look out for errors.

Look for any biases before, during, and after the experiment. 

Replicate the experiment.

Use a peer review system to avoid errors.

Bertrand Russell said, “Give me good reasons to think what you think.” The scientific method is precisely what supplies those good reasons, which is why it is so useful in critical thinking.

In the first part of this course, you explored the cognitive biases that can lead to error and looked at the scientific method’s key points. Take the quiz to test what you have learned!  😉  



Scientific Method

Science is an enormously successful human enterprise. The study of scientific method is the attempt to discern the activities by which that success is achieved. Among the activities often identified as characteristic of science are systematic observation and experimentation, inductive and deductive reasoning, and the formation and testing of hypotheses and theories. How these are carried out in detail can vary greatly, but characteristics like these have been looked to as a way of demarcating scientific activity from non-science, where only enterprises which employ some canonical form of scientific method or methods should be considered science (see also the entry on science and pseudo-science ). Others have questioned whether there is anything like a fixed toolkit of methods which is common across science and only science. Some reject privileging one view of method as part of rejecting broader views about the nature of science, such as naturalism (Dupré 2004); some reject any restriction in principle (pluralism).

Scientific method should be distinguished from the aims and products of science, such as knowledge, predictions, or control. Methods are the means by which those goals are achieved. Scientific method should also be distinguished from meta-methodology, which includes the values and justifications behind a particular characterization of scientific method (i.e., a methodology) — values such as objectivity, reproducibility, simplicity, or past successes. Methodological rules are proposed to govern method and it is a meta-methodological question whether methods obeying those rules satisfy given values. Finally, method is distinct, to some degree, from the detailed and contextual practices through which methods are implemented. The latter might range over: specific laboratory techniques; mathematical formalisms or other specialized languages used in descriptions and reasoning; technological or other material means; ways of communicating and sharing results, whether with other scientists or with the public at large; or the conventions, habits, enforced customs, and institutional controls over how and what science is carried out.

While it is important to recognize these distinctions, their boundaries are fuzzy. Hence, accounts of method cannot be entirely divorced from their methodological and meta-methodological motivations or justifications. Moreover, each aspect plays a crucial role in identifying methods. Disputes about method have therefore played out at the detail, rule, and meta-rule levels. Changes in beliefs about the certainty or fallibility of scientific knowledge, for instance (which is a meta-methodological consideration of what we can hope for methods to deliver), have meant different emphases on deductive and inductive reasoning, or on the relative importance attached to reasoning over observation (i.e., differences over particular methods). Beliefs about the role of science in society will affect the place one gives to values in scientific method.

The issue which has shaped debates over scientific method the most in the last half century is the question of how pluralist we need to be about method. Unificationists continue to hold out for one method essential to science; nihilism is a form of radical pluralism, which considers the effectiveness of any methodological prescription to be so context sensitive as to render it not explanatory on its own. Some middle degree of pluralism regarding the methods embodied in scientific practice seems appropriate. But the details of scientific practice vary with time and place, from institution to institution, across scientists and their subjects of investigation. How significant are the variations for understanding science and its success? How much can method be abstracted from practice? This entry describes some of the attempts to characterize scientific method or methods, as well as arguments for a more context-sensitive approach to methods embedded in actual scientific practices.

Entry contents:

  • 1. Overview and organizing themes
  • 2. Historical review: Aristotle to Mill
  • 3.1 Logical constructionism and operationalism
  • 3.2 H-D as a logic of confirmation
  • 3.3 Popper and falsificationism
  • 3.4 Meta-methodology and the end of method
  • 4. Statistical methods for hypothesis testing
  • 5.1 Creative and exploratory practices
  • 5.2 Computer methods and the ‘new ways’ of doing science
  • 6.1 “The scientific method” in science education and as seen by scientists
  • 6.2 Privileged methods and ‘gold standards’
  • 6.3 Scientific method in the court room
  • 6.4 Deviating practices
  • 7. Conclusion
  • Other Internet Resources
  • Related Entries

1. Overview and organizing themes

This entry could have been given the title Scientific Methods and gone on to fill volumes, or it could have been extremely short, consisting of a brief summary rejection of the idea that there is any such thing as a unique Scientific Method at all. Both unhappy prospects are due to the fact that scientific activity varies so much across disciplines, times, places, and scientists that any account which manages to unify it all will either consist of overwhelming descriptive detail, or trivial generalizations.

The choice of scope for the present entry is more optimistic, taking a cue from the recent movement in philosophy of science toward a greater attention to practice: to what scientists actually do. This “turn to practice” can be seen as the latest form of studies of methods in science, insofar as it represents an attempt at understanding scientific activity, but through accounts that are neither meant to be universal and unified, nor singular and narrowly descriptive. To some extent, different scientists at different times and places can be said to be using the same method even though, in practice, the details are different.

Whether the context in which methods are carried out is relevant, or to what extent, will depend largely on what one takes the aims of science to be and what one’s own aims are. For most of the history of scientific methodology the assumption has been that the most important output of science is knowledge and so the aim of methodology should be to discover those methods by which scientific knowledge is generated.

Science was seen to embody the most successful form of reasoning (but which form?) to the most certain knowledge claims (but how certain?) on the basis of systematically collected evidence (but what counts as evidence, and should the evidence of the senses take precedence, or rational insight?) Section 2 surveys some of the history, pointing to two major themes. One theme is seeking the right balance between observation and reasoning (and the attendant forms of reasoning which employ them); the other is how certain scientific knowledge is or can be.

Section 3 turns to 20th century debates on scientific method. In the second half of the 20th century the epistemic privilege of science faced several challenges and many philosophers of science abandoned the reconstruction of the logic of scientific method. Views changed significantly regarding which functions of science ought to be captured and why. For some, the success of science was better identified with social or cultural features. Historical and sociological turns in the philosophy of science were made, with a demand that greater attention be paid to the non-epistemic aspects of science, such as sociological, institutional, material, and political factors. Even outside of those movements there was an increased specialization in the philosophy of science, with more and more focus on specific fields within science. The combined upshot was very few philosophers arguing any longer for a grand unified methodology of science. Sections 3 and 4 survey the main positions on scientific method in 20th century philosophy of science, focusing on where they differ in their preference for confirmation or falsification or for waiving the idea of a special scientific method altogether.

In recent decades, attention has primarily been paid to scientific activities traditionally falling under the rubric of method, such as experimental design and general laboratory practice, the use of statistics, the construction and use of models and diagrams, interdisciplinary collaboration, and science communication. Sections 4–6 attempt to construct a map of the current domains of the study of methods in science.

As these sections illustrate, the question of method is still central to the discourse about science. Scientific method remains a topic for education, for science policy, and for scientists. It arises in the public domain where the demarcation or status of science is at issue. Some philosophers have recently returned, therefore, to the question of what it is that makes science a unique cultural product. This entry will close with some of these recent attempts at discerning and encapsulating the activities by which scientific knowledge is achieved.

Attempting a history of scientific method compounds the vast scope of the topic. This section briefly surveys the background to modern methodological debates. What can be called the classical view goes back to antiquity, and represents a point of departure for later divergences. [ 1 ]

We begin with a point made by Laudan (1968) in his historical survey of scientific method:

Perhaps the most serious inhibition to the emergence of the history of theories of scientific method as a respectable area of study has been the tendency to conflate it with the general history of epistemology, thereby assuming that the narrative categories and classificatory pigeon-holes applied to the latter are also basic to the former. (1968: 5)

To see knowledge about the natural world as falling under knowledge more generally is an understandable conflation. Histories of theories of method would naturally employ the same narrative categories and classificatory pigeon holes. An important theme of the history of epistemology, for example, is the unification of knowledge, a theme reflected in the question of the unification of method in science. Those who have identified differences in kinds of knowledge have often likewise identified different methods for achieving that kind of knowledge (see the entry on the unity of science ).

Different views on what is known, how it is known, and what can be known are connected. Plato distinguished the realms of things into the visible and the intelligible ( The Republic , 510a, in Cooper 1997). Only the latter, the Forms, could be objects of knowledge. The intelligible truths could be known with the certainty of geometry and deductive reasoning. What could be observed of the material world, however, was by definition imperfect and deceptive, not ideal. The Platonic way of knowledge therefore emphasized reasoning as a method, downplaying the importance of observation. Aristotle disagreed, locating the Forms in the natural world as the fundamental principles to be discovered through the inquiry into nature ( Metaphysics Z , in Barnes 1984).

Aristotle is recognized as giving the earliest systematic treatise on the nature of scientific inquiry in the western tradition, one which embraced observation and reasoning about the natural world. In the Prior and Posterior Analytics , Aristotle reflects first on the aims and then the methods of inquiry into nature. A number of features can be found which are still considered by most to be essential to science. For Aristotle, empiricism, careful observation (but passive observation, not controlled experiment), is the starting point. The aim is not merely recording of facts, though. For Aristotle, science ( epistêmê ) is a body of properly arranged knowledge or learning—the empirical facts, but also their ordering and display are of crucial importance. The aims of discovery, ordering, and display of facts partly determine the methods required of successful scientific inquiry. Also determinant is the nature of the knowledge being sought, and the explanatory causes proper to that kind of knowledge (see the discussion of the four causes in the entry on Aristotle on causality ).

In addition to careful observation, then, scientific method requires a logic as a system of reasoning for properly arranging, but also inferring beyond, what is known by observation. Methods of reasoning may include induction, prediction, or analogy, among others. Aristotle’s system (along with his catalogue of fallacious reasoning) was collected under the title the Organon . This title would be echoed in later works on scientific reasoning, such as Novum Organon by Francis Bacon, and Novum Organon Restorum by William Whewell (see below). In Aristotle’s Organon reasoning is divided primarily into two forms, a rough division which persists into modern times. The division, known most commonly today as deductive versus inductive method, appears in other eras and methodologies as analysis/​synthesis, non-ampliative/​ampliative, or even confirmation/​verification. The basic idea is there are two “directions” to proceed in our methods of inquiry: one away from what is observed, to the more fundamental, general, and encompassing principles; the other, from the fundamental and general to instances or implications of principles.

The basic aim and method of inquiry identified here can be seen as a theme running throughout the next two millennia of reflection on the correct way to seek after knowledge: carefully observe nature and then seek rules or principles which explain or predict its operation. The Aristotelian corpus provided the framework for a commentary tradition on scientific method independent of science itself (cosmos versus physics). During the medieval period, figures such as Albertus Magnus (1206–1280), Thomas Aquinas (1225–1274), Robert Grosseteste (1175–1253), Roger Bacon (1214/1220–1292), William of Ockham (1287–1347), Andreas Vesalius (1514–1564), Giacomo Zabarella (1533–1589) all worked to clarify the kind of knowledge obtainable by observation and induction, the source of justification of induction, and best rules for its application. [ 2 ] Many of their contributions we now think of as essential to science (see also Laudan 1968). As Aristotle and Plato had employed a framework of reasoning either “to the forms” or “away from the forms”, medieval thinkers employed directions away from the phenomena or back to the phenomena. In analysis, a phenomenon was examined to discover its basic explanatory principles; in synthesis, explanations of a phenomenon were constructed from first principles.

During the Scientific Revolution these various strands of argument, experiment, and reason were forged into a dominant epistemic authority. The 16 th –18 th centuries were a period of not only dramatic advance in knowledge about the operation of the natural world—advances in mechanical, medical, biological, political, economic explanations—but also of self-awareness of the revolutionary changes taking place, and intense reflection on the source and legitimation of the method by which the advances were made. The struggle to establish the new authority included methodological moves. The Book of Nature, according to the metaphor of Galileo Galilei (1564–1642) or Francis Bacon (1561–1626), was written in the language of mathematics, of geometry and number. This motivated an emphasis on mathematical description and mechanical explanation as important aspects of scientific method. Through figures such as Henry More and Ralph Cudworth, a neo-Platonic emphasis on the importance of metaphysical reflection on nature behind appearances, particularly regarding the spiritual as a complement to the purely mechanical, remained an important methodological thread of the Scientific Revolution (see the entries on Cambridge platonists ; Boyle ; Henry More ; Galileo ).

In Novum Organum (1620), Bacon was critical of the Aristotelian method for leaping from particulars to universals too quickly. The syllogistic form of reasoning readily mixed those two types of propositions. Bacon aimed at the invention of new arts, principles, and directions. His method would be grounded in methodical collection of observations, coupled with correction of our senses (and particularly, directions for the avoidance of the Idols, as he called them, kinds of systematic errors to which naïve observers are prone.) The community of scientists could then climb, by a careful, gradual and unbroken ascent, to reliable general claims.

Bacon’s method has been criticized as impractical and too inflexible for the practicing scientist. Whewell would later criticize Bacon for paying too little attention to the practices of scientists. It is hard to find convincing examples of Bacon’s method being put into practice in the history of science, but there are a few who have been held up as real examples of 17th century scientific, inductive method, even if not in the rigid Baconian mold: figures such as Robert Boyle (1627–1691) and William Harvey (1578–1657) (see the entry on Bacon ).

It is to Isaac Newton (1642–1727), however, that historians of science and methodologists have paid greatest attention. Given the enormous success of his Principia Mathematica and Opticks , this is understandable. The study of Newton’s method has had two main thrusts: the implicit method of the experiments and reasoning presented in the Opticks, and the explicit methodological rules given as the Rules for Philosophising (the Regulae) in Book III of the Principia . [ 3 ] Newton’s law of gravitation, the linchpin of his new cosmology, broke with explanatory conventions of natural philosophy, first for apparently proposing action at a distance, but more generally for not providing “true”, physical causes. The argument for his System of the World ( Principia , Book III) was based on phenomena, not reasoned first principles. This was viewed (mainly on the continent) as insufficient for proper natural philosophy. The Regulae counter this objection, re-defining the aims of natural philosophy by re-defining the method natural philosophers should follow. (See the entry on Newton’s philosophy .)

To his list of methodological prescriptions should be added Newton’s famous phrase “hypotheses non fingo” (commonly translated as “I frame no hypotheses”). The scientist was not to invent systems but infer explanations from observations, as Bacon had advocated. This would come to be known as inductivism. In the century after Newton, significant clarifications of the Newtonian method were made. Colin Maclaurin (1698–1746), for instance, reconstructed the essential structure of the method as having complementary analysis and synthesis phases, one proceeding away from the phenomena in generalization, the other from the general propositions to derive explanations of new phenomena. Denis Diderot (1713–1784) and editors of the Encyclopédie did much to consolidate and popularize Newtonianism, as did Francesco Algarotti (1712–1764). The emphasis was often the same, as much on the character of the scientist as on their process, a character which is still commonly assumed. The scientist is humble in the face of nature, not beholden to dogma, obeys only his eyes, and follows the truth wherever it leads. It was certainly Voltaire (1694–1778) and du Chatelet (1706–1749) who were most influential in propagating the latter vision of the scientist and their craft, with Newton as hero. Scientific method became a revolutionary force of the Enlightenment. (See also the entries on Newton , Leibniz , Descartes , Boyle , Hume , enlightenment , as well as Shank 2008 for a historical overview.)

Not all 18 th century reflections on scientific method were so celebratory. Famous also are George Berkeley’s (1685–1753) attack on the mathematics of the new science, as well as the over-emphasis of Newtonians on observation; and David Hume’s (1711–1776) undermining of the warrant offered for scientific claims by inductive justification (see the entries on: George Berkeley ; David Hume ; Hume’s Newtonianism and Anti-Newtonianism ). Hume’s problem of induction motivated Immanuel Kant (1724–1804) to seek new foundations for empirical method, though as an epistemic reconstruction, not as any set of practical guidelines for scientists. Both Hume and Kant influenced the methodological reflections of the next century, such as the debate between Mill and Whewell over the certainty of inductive inferences in science.

The debate between John Stuart Mill (1806–1873) and William Whewell (1794–1866) has become the canonical methodological debate of the 19 th century. Although often characterized as a debate between inductivism and hypothetico-deductivism, the role of the two methods on each side is actually more complex. On the hypothetico-deductive account, scientists work to come up with hypotheses from which true observational consequences can be deduced—hence, hypothetico-deductive. Because Whewell emphasizes both hypotheses and deduction in his account of method, he can be seen as a convenient foil to the inductivism of Mill. However, equally if not more important to Whewell’s portrayal of scientific method is what he calls the “fundamental antithesis”. Knowledge is a product of the objective (what we see in the world around us) and subjective (the contributions of our mind to how we perceive and understand what we experience, which he called the Fundamental Ideas). Both elements are essential according to Whewell, and he was therefore critical of Kant for too much focus on the subjective, and John Locke (1632–1704) and Mill for too much focus on the senses. Whewell’s fundamental ideas can be discipline relative. An idea can be fundamental even if it is necessary for knowledge only within a given scientific discipline (e.g., chemical affinity for chemistry). This distinguishes fundamental ideas from the forms and categories of intuition of Kant. (See the entry on Whewell .)

Clarifying fundamental ideas would therefore be an essential part of scientific method and scientific progress. Whewell called this process “Discoverer’s Induction”. It was induction, following Bacon or Newton, but Whewell sought to revive Bacon’s account by emphasising the role of ideas in the clear and careful formulation of inductive hypotheses. Whewell’s induction is not merely the collecting of objective facts. The subjective plays a role through what Whewell calls the Colligation of Facts, a creative act of the scientist, the invention of a theory. A theory is then confirmed by testing, where more facts are brought under the theory, called the Consilience of Inductions. Whewell felt that this was the method by which the true laws of nature could be discovered: clarification of fundamental concepts, clever invention of explanations, and careful testing. Mill, in his critique of Whewell, and others who have cast Whewell as a fore-runner of the hypothetico-deductivist view, seem to have under-estimated the importance of this discovery phase in Whewell’s understanding of method (Snyder 1997a,b, 1999). Down-playing the discovery phase would come to characterize methodology of the early 20 th century (see section 3 ).

Mill, in his System of Logic , put forward a narrower view of induction as the essence of scientific method. For Mill, induction is the search first for regularities among events. Among those regularities, some will continue to hold for further observations, eventually gaining the status of laws. One can also look for regularities among the laws discovered in a domain, i.e., for a law of laws. Which “law of laws” will hold is time and discipline dependent and open to revision. One example is the Law of Universal Causation, and Mill put forward specific methods for identifying causes—now commonly known as Mill’s methods. These five methods look for circumstances which are common among the phenomena of interest, those which are absent when the phenomena are, or those for which both vary together. Mill’s methods are still seen as capturing basic intuitions about experimental methods for finding the relevant explanatory factors ( System of Logic (1843); see the entry on Mill ). The methods advocated by Whewell and Mill, in the end, look similar. Both involve inductive generalization to covering laws. They differ dramatically, however, with respect to the necessity of the knowledge arrived at; that is, at the meta-methodological level (see the entries on Whewell and Mill ).

3. Logic of method and critical responses

The quantum and relativistic revolutions in physics in the early 20th century had a profound effect on methodology. Conceptual foundations of both theories were taken to show the defeasibility of even the most seemingly secure intuitions about space, time and bodies. Certainty of knowledge about the natural world was therefore recognized as unattainable. Instead a renewed empiricism was sought which rendered science fallible but still rationally justifiable.

Analyses of the reasoning of scientists emerged, according to which the aspects of scientific method of primary importance were the means of testing and confirming theories. A distinction in methodology was made between the contexts of discovery and justification. The distinction could be used as a wedge between the particularities of where and how theories or hypotheses are arrived at, on the one hand, and the underlying reasoning scientists use (whether or not they are aware of it) when assessing theories and judging their adequacy on the basis of the available evidence, on the other. By and large, for most of the 20th century, philosophy of science focused on the second context, although philosophers differed on whether to focus on confirmation or refutation as well as on the many details of how confirmation or refutation could or could not be brought about. By the mid-20th century these attempts at defining the method of justification, and the context distinction itself, came under pressure. During the same period, philosophy of science developed rapidly, and from section 4 this entry will therefore shift from a primarily historical treatment of the scientific method towards a primarily thematic one.

Advances in logic and probability held out promise of the possibility of elaborate reconstructions of scientific theories and empirical method, the best example being Rudolf Carnap’s The Logical Structure of the World (1928). Carnap attempted to show that a scientific theory could be reconstructed as a formal axiomatic system—that is, a logic. That system could refer to the world because some of its basic sentences could be interpreted as observations or operations which one could perform to test them. The rest of the theoretical system, including sentences using theoretical or unobservable terms (like electron or force), would then either be meaningful because they could be reduced to observations, or have purely logical meanings (called analytic, like mathematical identities). This has been referred to as the verifiability criterion of meaning. According to the criterion, any statement not either analytic or verifiable was strictly meaningless. Although the view was endorsed by Carnap in 1928, he would later come to see it as too restrictive (Carnap 1956). Another familiar version of this idea is the operationalism of Percy Williams Bridgman. In The Logic of Modern Physics (1927) Bridgman asserted that every physical concept could be defined in terms of the operations one would perform to verify the application of that concept. Making good on the operationalisation of a concept even as simple as length, however, can easily become enormously complex (for measuring very small lengths, for instance) or impractical (measuring large distances like light years).

Carl Hempel’s (1950, 1951) criticisms of the verifiability criterion of meaning had enormous influence. He pointed out that universal generalizations, such as most scientific laws, were not strictly meaningful on the criterion. Verifiability and operationalism both seemed too restrictive to capture standard scientific aims and practice. The tenuous connection between these reconstructions and actual scientific practice was criticized in another way. In both approaches, scientific methods are instead recast in methodological roles. Measurements, for example, were looked to as ways of giving meanings to terms. The aim of the philosopher of science was not to understand the methods per se, but to use them to reconstruct theories, their meanings, and their relation to the world. When scientists perform these operations, however, they will not report that they are doing them to give meaning to terms in a formal axiomatic system. This disconnect between methodology and the details of actual scientific practice would seem to violate the empiricism the Logical Positivists and Bridgman were committed to. The view that methodology should correspond to practice (to some extent) has been called historicism, or intuitionism. We turn to these criticisms and responses in section 3.4.

Positivism also had to contend with the recognition that a purely inductivist approach, along the lines of Bacon-Newton-Mill, was untenable. There was no pure observation, for starters. All observation was theory laden. Theory is required to make any observation; therefore not all theory can be derived from observation alone. (See the entry on theory and observation in science.) Even granting an observational basis, Hume had already pointed out that one could not deductively justify inductive conclusions without begging the question by presuming the success of the inductive method. Likewise, positivist attempts at analyzing how a generalization can be confirmed by observations of its instances were subject to a number of criticisms. Goodman (1965) and Hempel (1965) both point to paradoxes inherent in standard accounts of confirmation. Recent attempts at explaining how observations can serve to confirm a scientific theory are discussed in section 4 below.

The standard starting point for a non-inductive analysis of the logic of confirmation is known as the Hypothetico-Deductive (H-D) method. In its simplest form, a sentence of a theory which expresses some hypothesis is confirmed by its true consequences. As noted in section 2, this method had been advanced by Whewell in the 19th century, as well as by Nicod (1924) and others in the 20th century. Often, Hempel’s (1966) description of the H-D method, illustrated by the case of Semmelweis’ inferential procedures in establishing the cause of childbed fever, has been presented as a key account of H-D as well as a foil for criticism of the H-D account of confirmation (see, for example, Lipton’s (2004) discussion of inference to the best explanation; also the entry on confirmation). Hempel described Semmelweis’ procedure as examining various hypotheses explaining the cause of childbed fever. Some hypotheses conflicted with observable facts and could be rejected as false immediately. Others needed to be tested experimentally by deducing which observable events should follow if the hypothesis were true (what Hempel called the test implications of the hypothesis), then conducting an experiment and observing whether or not the test implications occurred. If the experiment showed the test implications to be false, the hypothesis could be rejected. If the experiment showed the test implications to be true, however, this did not prove the hypothesis true. The confirmation of a test implication does not verify a hypothesis, though Hempel did allow that “it provides at least some support, some corroboration or confirmation for it” (Hempel 1966: 8). The degree of this support then depends on the quantity, variety and precision of the supporting evidence.

Another approach that took off from the difficulties with inductive inference was Karl Popper’s critical rationalism or falsificationism (Popper 1959, 1963). Falsification is deductive and similar to H-D in that it involves scientists deducing observational consequences from the hypothesis under test. For Popper, however, the important point was not the degree of confirmation that successful prediction offered to a hypothesis. The crucial thing was the logical asymmetry between confirmation, based on inductive inference, and falsification, which can be based on a deductive inference. (This simple opposition was later questioned, by Lakatos, among others. See the entry on historicist theories of scientific rationality.)

Popper stressed that, regardless of the amount of confirming evidence, we can never be certain that a hypothesis is true without committing the fallacy of affirming the consequent. Instead, Popper introduced the notion of corroboration as a measure for how well a theory or hypothesis has survived previous testing—but without implying that this is also a measure for the probability that it is true.

Popper was also motivated by his doubts about the scientific status of theories like the Marxist theory of history or psycho-analysis, and so wanted to demarcate between science and pseudo-science. Popper saw this as an importantly different distinction from demarcating science from metaphysics. The latter demarcation was the primary concern of many logical empiricists. Popper used the idea of falsification to draw a line instead between pseudo-science and proper science. Science was science because its method involved subjecting theories to rigorous tests which offered a high probability of failing and thus refuting the theory.

A commitment to the risk of failure was important. Avoiding falsification could be done all too easily. If a consequence of a theory is inconsistent with observations, an exception can be added by introducing auxiliary hypotheses designed explicitly to save the theory, so-called ad hoc modifications. This Popper saw done in pseudo-science, where ad hoc theories appeared capable of explaining anything in their field of application. In contrast, science is risky. If observations showed the predictions from a theory to be wrong, the theory would be refuted. Hence, scientific hypotheses must be falsifiable. Not only must there exist some possible observation statement which could falsify the hypothesis or theory, were it observed (Popper called these the hypothesis’ potential falsifiers); it is also crucial to the Popperian scientific method that such falsifications be sincerely attempted on a regular basis.

The more potential falsifiers of a hypothesis, the more falsifiable it would be, and the more the hypothesis claimed. Conversely, hypotheses without falsifiers claimed very little or nothing at all. Originally, Popper thought that this meant the introduction of ad hoc hypotheses only to save a theory should not be countenanced as good scientific method. These would undermine the falsifiability of a theory. However, Popper later came to recognize that the introduction of modifications (immunizations, he called them) was often an important part of scientific development. Responding to surprising or apparently falsifying observations often generated important new scientific insights. Popper’s own example was the observed motion of Uranus, which originally did not agree with Newtonian predictions. The ad hoc hypothesis of an outer planet explained the disagreement and led to further falsifiable predictions. Popper sought to reconcile these views by blurring the distinction between falsifiable and not falsifiable, and speaking instead of degrees of testability (Popper 1985: 41f.).

From the 1960s on, sustained meta-methodological criticism emerged that drove philosophical focus away from scientific method. A brief look at those criticisms follows, with recommendations for further reading at the end of the entry.

Thomas Kuhn’s The Structure of Scientific Revolutions (1962) begins with a well-known shot across the bow for philosophers of science:

History, if viewed as a repository for more than anecdote or chronology, could produce a decisive transformation in the image of science by which we are now possessed. (1962: 1)

The image Kuhn thought needed transforming was the a-historical, rational reconstruction sought by many of the Logical Positivists, though Carnap and other positivists were actually quite sympathetic to Kuhn’s views. (See the entry on the Vienna Circle.) Kuhn shared with others of his contemporaries, such as Feyerabend and Lakatos, a commitment to a more empirical approach to philosophy of science. Namely, the history of science provides important data, and necessary checks, for philosophy of science, including any theory of scientific method.

The history of science reveals, according to Kuhn, that scientific development occurs in alternating phases. During normal science, the members of the scientific community adhere to the paradigm in place. Their commitment to the paradigm means a commitment to the puzzles to be solved and the acceptable ways of solving them. Confidence in the paradigm remains so long as steady progress is made in solving the shared puzzles. Method in this normal phase operates within a disciplinary matrix (Kuhn’s later concept of a paradigm) which includes standards for problem solving, and defines the range of problems to which the method should be applied. An important part of a disciplinary matrix is the set of values which provide the norms and aims for scientific method. The main values that Kuhn identifies are prediction, problem solving, simplicity, consistency, and plausibility.

An important by-product of normal science is the accumulation of puzzles which cannot be solved with resources of the current paradigm. Once accumulation of these anomalies has reached some critical mass, it can trigger a communal shift to a new paradigm and a new phase of normal science. Importantly, the values that provide the norms and aims for scientific method may have transformed in the meantime. Method may therefore be relative to discipline, time or place.

Feyerabend also identified progress as the aim of science, but argued that any methodological prescription would only stifle that progress (Feyerabend 1988). His arguments are grounded in re-examining accepted “myths” about the history of science. Heroes of science, like Galileo, are shown to be just as reliant on rhetoric and persuasion as they are on reason and demonstration. Others, like Aristotle, are shown to be far more reasonable and far-reaching in their outlooks than they are given credit for. As a consequence, the only rule that could provide what he took to be sufficient freedom was the vacuous “anything goes”. More generally, even the methodological restriction that science is the best way to pursue knowledge, and to increase knowledge, is too restrictive. Feyerabend suggested instead that science might, in fact, be a threat to a free society, because it and its myth had become so dominant (Feyerabend 1978).

An even more fundamental kind of criticism was offered by several sociologists of science from the 1970s onwards, who rejected the methodology of providing philosophical accounts of the rational development of science and sociological accounts of its irrational mistakes. Instead, they adhered to a symmetry thesis on which any causal explanation of how scientific knowledge is established needs to be symmetrical, explaining truth and falsity, rationality and irrationality, success and mistakes, by the same causal factors (see, e.g., Barnes and Bloor 1982, Bloor 1991). Movements in the Sociology of Science, like the Strong Programme, or in the social dimensions and causes of knowledge more generally, led to extended and close examination of detailed case studies in contemporary science and its history. (See the entries on the social dimensions of scientific knowledge and social epistemology.) Well-known examinations by Latour and Woolgar (1979/1986), Knorr-Cetina (1981), Pickering (1984), and Shapin and Schaffer (1985) seem to bear out that it was social ideologies (on a macro-scale) or individual interactions and circumstances (on a micro-scale) which were the primary causal factors in determining which beliefs gained the status of scientific knowledge. As they saw it, therefore, explanatory appeals to scientific method were not empirically grounded.

A late, and largely unexpected, criticism of scientific method came from within science itself. Beginning in the early 2000s, a number of scientists attempting to replicate the results of published experiments could not do so. There may be a close conceptual connection between reproducibility and method. For example, if reproducibility means that the same scientific methods ought to produce the same result, and all scientific results ought to be reproducible, then whatever it takes to reproduce a scientific result ought to be called scientific method. Space limits us to the observation that, insofar as reproducibility is a desired outcome of proper scientific method, it is not strictly a part of scientific method. (See the entry on reproducibility of scientific results.)

By the close of the 20th century the search for the scientific method was flagging. Nola and Sankey (2000b) could introduce their volume on method by remarking that “For some, the whole idea of a theory of scientific method is yester-year’s debate …”.

4. Statistical methods for hypothesis testing

Despite the many difficulties that philosophers encountered in trying to provide a clear methodology of confirmation (or refutation), important progress has still been made on understanding how observation can provide evidence for a given theory. Work in statistics has been crucial for understanding how theories can be tested empirically, and in recent decades a huge literature has developed that attempts to recast confirmation in Bayesian terms. Here these developments can be covered only briefly, and we refer to the entry on confirmation for further details and references.

Statistics has come to play an increasingly important role in the methodology of the experimental sciences from the 19th century onwards. At that time, statistics and probability theory took on a methodological role as an analysis of inductive inference, and attempts to ground the rationality of induction in the axioms of probability theory have continued throughout the 20th century and into the present. Developments in the theory of statistics itself, meanwhile, have had a direct and immense influence on the experimental method, including methods for measuring the uncertainty of observations such as the Method of Least Squares developed by Legendre and Gauss in the early 19th century, criteria for the rejection of outliers proposed by Peirce by the mid-19th century, and the significance tests developed by Gosset (a.k.a. “Student”), Fisher, Neyman & Pearson and others in the 1920s and 1930s (see, e.g., Swijtink 1987 for a brief historical overview; and also the entry on C.S. Peirce).
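The Method of Least Squares mentioned above can be sketched briefly: for a straight-line model, choose the slope and intercept that minimize the sum of squared residuals. The data points below are invented purely for illustration; the closed-form solution is standard.

```python
# Ordinary least squares for a line y = a + b*x: pick a, b to minimize
# the sum of squared residuals, as in Legendre's and Gauss's method.
# The data are made up for illustration.

def least_squares_line(xs, ys):
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Closed-form solution: b = cov(x, y) / var(x); a = mean_y - b * mean_x
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [0.1, 2.1, 3.9, 6.2, 7.9]   # roughly y = 2x plus observational noise
a, b = least_squares_line(xs, ys)
```

The fitted slope comes out close to 2, illustrating how the method extracts a trend from noisy observations.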

These developments within statistics then in turn led to a reflective discussion among both statisticians and philosophers of science on how to perceive the process of hypothesis testing: whether it was a rigorous statistical inference that could provide a numerical expression of the degree of confidence in the tested hypothesis, or whether it should be seen as a decision between different courses of action that also involved a value component. This led to a major controversy between Fisher on the one side and Neyman and Pearson on the other (see especially Fisher 1955, Neyman 1956 and Pearson 1955, and for analyses of the controversy, e.g., Howie 2002, Marks 2000, Lenhard 2006). On Fisher’s view, hypothesis testing was a methodology for deciding when to accept or reject a statistical hypothesis: a hypothesis should be rejected by evidence if that evidence would be unlikely, relative to other possible outcomes, were the hypothesis true. In contrast, on Neyman and Pearson’s view, the consequences of error also had to play a role when deciding between hypotheses. Introducing the distinction between the error of rejecting a true hypothesis (type I error) and that of accepting a false hypothesis (type II error), they argued that the consequences of each error determine whether it is more important to avoid rejecting a true hypothesis or accepting a false one. Hence, Fisher aimed for a theory of inductive inference that enabled a numerical expression of confidence in a hypothesis. To him, the important point was the search for truth, not utility. In contrast, the Neyman-Pearson approach provided a strategy of inductive behaviour for deciding between different courses of action. Here, the important point was not whether a hypothesis was true, but whether one should act as if it were.
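The contrast between the two views can be made concrete with a toy significance test. The coin-flipping numbers and the 0.05 threshold below are illustrative choices, not drawn from the entry.

```python
import math

# Fisher-style significance test, sketched on an invented example.
# Hypothesis under test: the coin is fair (p = 0.5).
# Evidence: 16 heads in 20 flips.
# Fisher's recipe: reject if the observed outcome, or one at least as
# extreme, would be unlikely were the hypothesis true.

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n trials with success prob p."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

n, heads = 20, 16
# Two-sided p-value: probability of a result at least as far from n/2
# as the one observed, computed under the fair-coin hypothesis.
extreme = abs(heads - n / 2)
p_value = sum(binom_pmf(k, n, 0.5) for k in range(n + 1)
              if abs(k - n / 2) >= extreme)

# Neyman-Pearson reframing: fix a significance level alpha in advance.
# Alpha is then the long-run rate of type I errors (rejecting a true
# hypothesis), and the test is a decision rule between courses of action,
# not a measure of the hypothesis's truth.
alpha = 0.05
reject = p_value < alpha
```

Here the p-value is about 0.012, so the decision rule rejects the fair-coin hypothesis at the 0.05 level; on Fisher's reading the p-value itself expresses how surprising the evidence would be were the hypothesis true.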

Similar discussions are found in the philosophical literature. On the one side, Churchman (1948) and Rudner (1953) argued that because scientific hypotheses can never be completely verified, a complete analysis of the methods of scientific inference includes ethical judgments: the scientist must decide whether the evidence is sufficiently strong, or the probability sufficiently high, to warrant acceptance of the hypothesis, and this in turn depends on the importance of making a mistake in accepting or rejecting it. Others, such as Jeffrey (1956) and Levi (1960), disagreed and instead defended a value-neutral view of science on which scientists should bracket their attitudes, preferences, temperament, and values when assessing the correctness of their inferences. For more details on this value-free ideal in the philosophy of science and its historical development, see Douglas (2009) and Howard (2003). For a broad set of case studies examining the role of values in science, see e.g. Elliott & Richards 2017.

In recent decades, philosophical discussions of the evaluation of probabilistic hypotheses by statistical inference have largely focused on Bayesianism, which understands probability as a measure of a person’s degree of belief in an event, given the available information, and frequentism, which instead understands probability as a long-run frequency of a repeatable event. Hence, for Bayesians probabilities refer to a state of knowledge, whereas for frequentists probabilities refer to frequencies of events (see, e.g., Sober 2008, chapter 1 for a detailed introduction to Bayesianism and frequentism as well as to likelihoodism). Bayesianism aims at providing a quantifiable, algorithmic representation of belief revision, where belief revision is a function of prior beliefs (i.e., background knowledge) and incoming evidence. Bayesianism employs a rule based on Bayes’ theorem, a theorem of the probability calculus which relates conditional probabilities. The probability that a particular hypothesis is true is interpreted as a degree of belief, or credence, of the scientist. There will also be a probability and a degree of belief that a hypothesis will be true conditional on a piece of evidence (an observation, say) being true. Bayesianism prescribes that it is rational for the scientist to update their belief in the hypothesis to that conditional probability should it turn out that the evidence is, in fact, observed (see, e.g., Sprenger & Hartmann 2019 for a comprehensive treatment of Bayesian philosophy of science). Originating in the work of Neyman and Pearson, frequentism aims at providing the tools for reducing long-run error rates, such as the error-statistical approach developed by Mayo (1996), which focuses on how experimenters can avoid both type I and type II errors by building up a repertoire of procedures that detect errors if and only if they are present. Both Bayesianism and frequentism have developed over time, they are interpreted in different ways by their various proponents, and their relations to previous criticisms of attempts at defining scientific method are seen differently by proponents and critics. The literature, surveys, reviews and criticism in this area are vast, and the reader is referred to the entries on Bayesian epistemology and confirmation.
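The Bayesian updating rule just described can be sketched in a few lines. The hypotheses, likelihoods and prior below are invented for illustration: H is "the coin is biased toward heads (p = 0.8)" against the alternative "the coin is fair (p = 0.5)".

```python
# Bayes' theorem as a belief-updating rule, a minimal sketch of the
# Bayesian picture described above. All numbers are invented for
# illustration.

def bayes_update(prior_h, likelihood_e_given_h, likelihood_e_given_not_h):
    """Return P(H | E) from P(H), P(E | H) and P(E | not-H)."""
    # Total probability of the evidence under both hypotheses.
    evidence = (likelihood_e_given_h * prior_h
                + likelihood_e_given_not_h * (1 - prior_h))
    return likelihood_e_given_h * prior_h / evidence

# Start agnostic (credence 0.5), then observe three heads in a row,
# updating the credence in H after each observation.
credence = 0.5
for _ in range(3):                       # each observation: one head
    credence = bayes_update(credence, 0.8, 0.5)
```

After three heads the credence in the bias hypothesis has risen from 0.5 to roughly 0.8, illustrating belief revision as a function of prior belief and incoming evidence.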

5. Method in Practice

Attention to scientific practice, as we have seen, is not itself new. However, the recent turn to practice in the philosophy of science can be seen as a correction to the pessimism with respect to method in philosophy of science in later parts of the 20th century, and as an attempted reconciliation between sociological and rationalist explanations of scientific knowledge. Much of this work sees methods as detailed, context-specific problem-solving procedures, and takes methodological analyses to be at the same time descriptive, critical and advisory (see Nickles 1987 for an exposition of this view). The following sections survey some of these practice focuses; from here on the entry is organized by topic rather than chronology.

5.1 Creative and exploratory practices

A problem with the distinction between the contexts of discovery and justification that figured so prominently in philosophy of science in the first half of the 20th century (see section 2) is that no such distinction can be clearly seen in scientific activity (see Arabatzis 2006). Thus, in recent decades, it has been recognized that the study of conceptual innovation and change should not be confined to psychology and sociology of science; these are also important aspects of scientific practice which philosophy of science should address (see also the entry on scientific discovery). Looking for the practices that drive conceptual innovation has led philosophers to examine both the reasoning practices of scientists and the wide realm of experimental practices that are not directed narrowly at testing hypotheses, that is, exploratory experimentation.

Examining the reasoning practices of historical and contemporary scientists, Nersessian (2008) has argued that new scientific concepts are constructed as solutions to specific problems by systematic reasoning, and that analogy, visual representation and thought-experimentation are among the important reasoning practices employed. These ubiquitous forms of reasoning are reliable—but also fallible—methods of conceptual development and change. On her account, model-based reasoning consists of cycles of construction, simulation, evaluation and adaptation of models that serve as interim interpretations of the target problem to be solved. Often, this process will lead to modifications or extensions, and a new cycle of simulation and evaluation. However, Nersessian also emphasizes that

creative model-based reasoning cannot be applied as a simple recipe, is not always productive of solutions, and even its most exemplary usages can lead to incorrect solutions. (Nersessian 2008: 11)

Thus, while she agrees with many previous philosophers that there is no logic of discovery, on her view discoveries can nevertheless derive from reasoned processes, such that a large and integral part of scientific practice is

the creation of concepts through which to comprehend, structure, and communicate about physical phenomena …. (Nersessian 1987: 11)

Similarly, work on heuristics for discovery and theory construction by scholars such as Darden (1991) and Bechtel & Richardson (1993) presents science as problem solving and investigates scientific problem solving as a special case of problem-solving in general. Drawing largely on cases from the biological sciences, much of their focus has been on reasoning strategies for the generation, evaluation, and revision of mechanistic explanations of complex systems.

Addressing another aspect of the context distinction, namely the traditional view that the primary role of experiments is to test theoretical hypotheses according to the H-D model, other philosophers of science have argued for additional roles that experiments can play. The notion of exploratory experimentation was introduced to describe experiments driven by the desire to obtain empirical regularities and to develop concepts and classifications in which these regularities can be described (Steinle 1997, 2002; Burian 1997; Waters 2007). However, the difference between theory-driven experimentation and exploratory experimentation should not be seen as a sharp distinction. Theory-driven experiments are not always directed at testing hypotheses, but may also be directed at various kinds of fact-gathering, such as determining numerical parameters. Vice versa, exploratory experiments are usually informed by theory in various ways and are therefore not theory-free. Instead, in exploratory experiments phenomena are investigated without first limiting the possible outcomes of the experiment on the basis of extant theory about the phenomena.

The development of high-throughput instrumentation in molecular biology and neighbouring fields has given rise to a special type of exploratory experimentation that collects and analyses very large amounts of data. These new ‘omics’ disciplines are often said to represent a break with the ideal of hypothesis-driven science (Burian 2007; Elliott 2007; Waters 2007; O’Malley 2007) and are instead described as data-driven research (Leonelli 2012; Strasser 2012) or as a special kind of “convenience experimentation” in which many experiments are done simply because they are extraordinarily convenient to perform (Krohs 2012).

5.2 Computer methods and ‘new ways’ of doing science

The field of omics just described is possible because of the ability of computers to process, in a reasonable amount of time, the huge quantities of data required. Computers allow for more elaborate experimentation (higher speed, better filtering, more variables, sophisticated coordination and control), but also, through modelling and simulations, might constitute a form of experimentation themselves. Here, too, we can pose a version of the general question of method versus practice: does the practice of using computers fundamentally change scientific method, or merely provide a more efficient means of implementing standard methods?

Because computers can be used to automate measurements, quantifications, calculations, and statistical analyses where, for practical reasons, these operations cannot be otherwise carried out, many of the steps involved in reaching a conclusion on the basis of an experiment are now made inside a “black box”, without the direct involvement or awareness of a human. This has epistemological implications, regarding what we can know, and how we can know it. To have confidence in the results, computer methods are therefore subjected to tests of verification and validation.

The distinction between verification and validation is easiest to characterize in the case of computer simulations. In a typical computer simulation scenario computers are used to numerically integrate differential equations for which no analytic solution is available. The equations are part of the model the scientist uses to represent a phenomenon or system under investigation. Verifying a computer simulation means checking that the equations of the model are being correctly approximated. Validating a simulation means checking that the equations of the model are adequate for the inferences one wants to make on the basis of that model.
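Verification in this sense can be sketched on a toy model whose analytic solution happens to be known, so the numerical approximation can be checked directly. The equation and numbers below are invented for illustration; in realistic simulations no analytic solution is available and verification relies on indirect checks such as the convergence test shown here.

```python
import math

# Verification sketch: numerically integrate dy/dt = -y with y(0) = 1
# using Euler's method, and check the approximation against the exact
# solution y(t) = exp(-t).

def euler_solve(step, n_steps):
    """Integrate dy/dt = -y from y(0) = 1 for n_steps Euler steps."""
    y = 1.0
    for _ in range(n_steps):
        y += step * (-y)     # one Euler step
    return y

exact = math.exp(-1.0)                        # y(1) for the exact solution
err_coarse = abs(euler_solve(0.01, 100) - exact)
err_fine = abs(euler_solve(0.005, 200) - exact)

# Euler's method is first-order: halving the step size should roughly
# halve the error. If it did not, the equations of the model would not
# be correctly approximated, i.e., verification would fail.
ratio = err_coarse / err_fine
```

Validation, by contrast, would ask a question this check cannot answer: whether dy/dt = -y is an adequate model of the phenomenon in the first place.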

A number of issues related to computer simulations have been raised. The identification of validity and verification as the testing methods has been criticized. Oreskes et al. (1994) raise concerns that “validation”, because it suggests deductive inference, might lead to over-confidence in the results of simulations. The distinction itself is probably too clean, since actual practice in the testing of simulations mixes and moves back and forth between the two (Weissert 1997; Parker 2008a; Winsberg 2010). Computer simulations do seem to have a non-inductive character, given that the principles by which they operate are built in by the programmers, and any results of the simulation follow from those in-built principles in such a way that those results could, in principle, be deduced from the program code and its inputs. The status of simulations as experiments has therefore been examined (Kaufmann and Smarr 1993; Humphreys 1995; Hughes 1999; Norton and Suppe 2001). This literature considers the epistemology of these experiments: what we can learn by simulation, and also the kinds of justifications which can be given in applying that knowledge to the “real” world (Mayo 1996; Parker 2008b). As pointed out, part of the advantage of computer simulation derives from the fact that huge numbers of calculations can be carried out without requiring direct observation by the experimenter/simulator. At the same time, many of these calculations are approximations to the calculations which would be performed first-hand in an ideal situation. Both factors introduce uncertainties into the inferences drawn from what is observed in the simulation.

For many of the reasons described above, computer simulations do not seem to belong clearly to either the experimental or the theoretical domain. Rather, they seem to crucially involve aspects of both. This has led some authors, such as Fox Keller (2003: 200), to argue that we ought to consider computer simulation a “qualitatively different way of doing science”. The literature in general tends to follow Kaufmann and Smarr (1993) in referring to computer simulation as a “third way” for scientific methodology (theoretical reasoning and experimental practice are the first two ways). It should also be noted that the debates around these issues have tended to focus on the form of computer simulation typical in the physical sciences, where models are based on dynamical equations. Other forms of simulation might not have the same problems, or might have problems of their own (see the entry on computer simulations in science).

In recent years, the rapid development of machine learning techniques has prompted some scholars to suggest that the scientific method has become “obsolete” (Anderson 2008; Carrol and Goodstein 2009). This has resulted in an intense debate on the relative merits of data-driven and hypothesis-driven research (for samples, see, e.g., Mazzocchi 2015 or Succi and Coveney 2018). For a detailed treatment of this topic, we refer to the entry on scientific research and big data.

6. Discourse on scientific method

Despite philosophical disagreements, the idea of the scientific method still figures prominently in contemporary discourse on many different topics, both within science and in society at large. Often, reference to scientific method is used in ways that either convey the legend of a single, universal method characteristic of all science, or grant privileged status to a particular method or set of methods as a special ‘gold standard’, often with reference to particular philosophers to vindicate the claims. Discourse on scientific method also typically arises when there is a need to distinguish science from other activities, or to justify the special status conveyed to science. In these areas, the philosophical attempts at identifying a set of methods characteristic of scientific endeavors are closely related to the philosophy of science’s classical problem of demarcation (see the entry on science and pseudo-science) and to the philosophical analysis of the social dimension of scientific knowledge and the role of science in democratic society.

One of the settings in which the legend of a single, universal scientific method has been particularly strong is science education (see, e.g., Bauer 1992; McComas 1996; Wivagg & Allchin 2002). [5] Often, ‘the scientific method’ is presented in textbooks and educational web pages as a fixed four- or five-step procedure starting from observations and description of a phenomenon, progressing through the formulation of a hypothesis which explains the phenomenon, the design and conduct of experiments to test the hypothesis, and the analysis of the results, and ending with the drawing of a conclusion. Such references to a universal scientific method can be found in educational material at all levels of science education (Blachowicz 2009), and numerous studies have shown that the idea of a general and universal scientific method often forms part of both students’ and teachers’ conception of science (see, e.g., Aikenhead 1987; Osborne et al. 2003). In response, it has been argued that science education needs to focus more on teaching about the nature of science, although views have differed on whether this is best done through student-led investigations, contemporary cases, or historical cases (Allchin, Andersen & Nielsen 2014).

Although occasionally phrased with reference to the H-D method, the legend in science education of a single, universal scientific method has important historical roots in the American philosopher and psychologist Dewey’s account of inquiry in How We Think (1910) and the British mathematician Karl Pearson’s account of science in Grammar of Science (1892). On Dewey’s account, inquiry is divided into the five steps of

(i) a felt difficulty, (ii) its location and definition, (iii) suggestion of a possible solution, (iv) development by reasoning of the bearing of the suggestions, (v) further observation and experiment leading to its acceptance or rejection. (Dewey 1910: 72)

Similarly, on Pearson’s account, scientific investigations start with the measurement of data and the observation of their correlation and sequence, from which scientific laws can be discovered with the aid of creative imagination. These laws have to be subjected to criticism, and their final acceptance will have equal validity for “all normally constituted minds”. Both Dewey’s and Pearson’s accounts should be seen as generalized abstractions of inquiry and not restricted to the realm of science, although both Dewey and Pearson referred to their respective accounts as ‘the scientific method’.

Occasionally, scientists make sweeping statements about a simple and distinct scientific method, as exemplified by Feynman’s simplified version of a conjectures-and-refutations method presented, for example, in the last of his 1964 Cornell Messenger lectures. [6] However, just as often scientists have come to the same conclusion as recent philosophy of science: that there is no unique, easily described scientific method. For example, the physicist and Nobel Laureate Weinberg described in the paper “The Methods of Science … And Those By Which We Live” (1995) how

The fact that the standards of scientific success shift with time does not only make the philosophy of science difficult; it also raises problems for the public understanding of science. We do not have a fixed scientific method to rally around and defend. (1995: 8)

Interview studies with scientists on their conception of method show that scientists often find it hard to figure out whether available evidence confirms their hypothesis, and that there are no direct translations between general ideas about method and specific strategies to guide how research is conducted (Schickore & Hangel 2019; Hangel & Schickore 2017).

Reference to the scientific method has also often been used to argue for the scientific nature or special status of a particular activity. Philosophical positions that argue for a simple and unique scientific method as a criterion of demarcation, such as Popperian falsification, have often attracted practitioners who felt that they had a need to defend their domain of practice. For example, references to conjectures and refutation as the scientific method are abundant in much of the literature on complementary and alternative medicine (CAM)—alongside the competing position that CAM, as an alternative to conventional biomedicine, needs to develop its own methodology different from that of science.

Also within mainstream science, reference to the scientific method is used in arguments regarding the internal hierarchy of disciplines and domains. A frequently seen argument is that research based on the H-D method is superior to research based on induction from observations because in deductive inferences the conclusion follows necessarily from the premises. (See, e.g., Parascandola 1998 for an analysis of how this argument has been used to downgrade epidemiology compared to the laboratory sciences.) Similarly, based on an examination of the practices of major funding institutions such as the National Institutes of Health (NIH), the National Science Foundation (NSF) and the Biotechnology and Biological Sciences Research Council (BBSRC) in the UK, O’Malley et al. (2009) have argued that funding agencies tend to adhere to the view that the primary activity of science is to test hypotheses, while descriptive and exploratory research is seen as merely preparatory activity that is valuable only insofar as it fuels hypothesis-driven research.

In some areas of science, scholarly publications are structured in a way that may convey the impression of a neat and linear process of inquiry, from stating a question, through devising the methods by which to answer it and collecting the data, to drawing a conclusion from the analysis of those data. For example, the codified format of publications in most biomedical journals known as the IMRAD format (Introduction, Methods, Results, and Discussion) is explicitly described by the journal editors as “not an arbitrary publication format but rather a direct reflection of the process of scientific discovery” (see the so-called “Vancouver Recommendations”, ICMJE 2013: 11). However, scientific publications do not in general reflect the process by which the reported scientific results were produced. For example, under the provocative title “Is the scientific paper a fraud?”, Medawar argued that scientific papers generally misrepresent how the results have been produced (Medawar 1963/1996). Similar views have been advanced by philosophers, historians and sociologists of science (Gilbert 1976; Holmes 1987; Knorr-Cetina 1981; Schickore 2008; Suppe 1998) who have argued that scientists’ experimental practices are messy and often do not follow any recognizable pattern. Publications of research results, they argue, are retrospective reconstructions of these activities that often do not preserve the temporal order or the logic of these activities, but are instead often constructed in order to screen off potential criticism (see Schickore 2008 for a review of this work).

Philosophical positions on the scientific method have also made it into the court room, especially in the US, where judges have drawn on philosophy of science in deciding when to confer special status on scientific expert testimony. A key case is Daubert v. Merrell Dow Pharmaceuticals (92–102, 509 U.S. 579, 1993). In this case, the Supreme Court argued in its 1993 ruling that trial judges must ensure that expert testimony is reliable, and that in doing this the court must look at the expert’s methodology to determine whether the proffered evidence is actually scientific knowledge. Further, referring to works of Popper and Hempel, the court stated that

ordinarily, a key question to be answered in determining whether a theory or technique is scientific knowledge … is whether it can be (and has been) tested. (Justice Blackmun, Daubert v. Merrell Dow Pharmaceuticals; see Other Internet Resources for a link to the opinion)

But as argued by Haack (2005a,b, 2010) and by Foster & Huber (1999), by equating the question of whether a piece of testimony is reliable with the question of whether it is scientific as indicated by a special methodology, the court produced an inconsistent mixture of Popper’s and Hempel’s philosophies, and this later led to considerable confusion in subsequent case rulings that drew on the Daubert case (see Haack 2010 for a detailed exposition).

The difficulties around identifying the methods of science are also reflected in the difficulties of identifying scientific misconduct in the form of improper application of the method or methods of science. One of the first and most influential attempts at defining misconduct in science was the US definition from 1989 that defined misconduct as

fabrication, falsification, plagiarism, or other practices that seriously deviate from those that are commonly accepted within the scientific community. (Code of Federal Regulations, part 50, subpart A., August 8, 1989; italics added)

However, the “other practices that seriously deviate” clause was heavily criticized because it could be used to suppress creative or novel science. For example, the National Academy of Sciences stated in its report Responsible Science (1992) that it

wishes to discourage the possibility that a misconduct complaint could be lodged against scientists based solely on their use of novel or unorthodox research methods. (NAS: 27)

This clause was therefore later removed from the definition. For an entry into the key philosophical literature on conduct in science, see Shamoo & Resnik (2009).

The question of the source of the success of science has been at the core of philosophy since the beginning of modern science. If viewed as a matter of epistemology more generally, scientific method is a part of the entire history of philosophy. Over that time, science and whatever methods its practitioners may employ have changed dramatically. Today, many philosophers have taken up the banners of pluralism or of practice to focus on what are, in effect, fine-grained and contextually limited examinations of scientific method. Others hope to shift perspectives in order to provide a renewed general account of what characterizes the activity we call science.

One such perspective has been offered recently by Hoyningen-Huene (2008, 2013), who argues from the history of philosophy of science that after three lengthy phases of characterizing science by its method, we are now in a phase where the belief in the existence of a positive scientific method has eroded and what has been left to characterize science is only its fallibility. First was a phase from Plato and Aristotle up until the 17th century, in which the specificity of scientific knowledge was seen in its absolute certainty established by proof from evident axioms; next was a phase up to the mid-19th century, in which the means to establish the certainty of scientific knowledge had been generalized to include inductive procedures as well. In the third phase, which lasted until the last decades of the 20th century, it was recognized that empirical knowledge was fallible, but it was still granted a special status due to its distinctive mode of production. But now in the fourth phase, according to Hoyningen-Huene, historical and philosophical studies have shown how “scientific methods with the characteristics as posited in the second and third phase do not exist” (2008: 168) and there is no longer any consensus among philosophers and historians of science about the nature of science. For Hoyningen-Huene, this is too negative a stance, and he therefore poses the question about the nature of science anew. His own answer to this question is that “scientific knowledge differs from other kinds of knowledge, especially everyday knowledge, primarily by being more systematic” (Hoyningen-Huene 2013: 14). Systematicity can have several different dimensions: among them are more systematic descriptions, explanations, predictions, defense of knowledge claims, epistemic connectedness, ideal of completeness, knowledge generation, representation of knowledge and critical discourse.
Hence, what characterizes science is the greater care in excluding possible alternative explanations, the more detailed elaboration with respect to data on which predictions are based, the greater care in detecting and eliminating sources of error, the more articulate connections to other pieces of knowledge, etc. On this position, what characterizes science is not that the methods employed are unique to science, but that the methods are more carefully employed.

Another, similar approach has been offered by Haack (2003). Like Hoyningen-Huene, she sets off from a dissatisfaction with the recent clash between what she calls Old Deferentialism and New Cynicism. The Old Deferentialist position is that science progressed inductively by accumulating true theories confirmed by empirical evidence, or deductively by testing conjectures against basic statements; the New Cynics’ position is that science has no epistemic authority and no uniquely rational method and is merely politics. Haack insists that, contrary to the views of the New Cynics, there are objective epistemic standards, and there is something epistemologically special about science, even though the Old Deferentialists pictured this in a wrong way. Instead, she offers a new Critical Commonsensist account on which standards of good, strong, supportive evidence and well-conducted, honest, thorough and imaginative inquiry are not exclusive to the sciences, but the standards by which we judge all inquirers. In this sense, science does not differ in kind from other kinds of inquiry, but it may differ in the degree to which it requires broad and detailed background knowledge and a familiarity with a technical vocabulary that only specialists may possess.

  • Aikenhead, G.S., 1987, “High-school graduates’ beliefs about science-technology-society. III. Characteristics and limitations of scientific knowledge”, Science Education , 71(4): 459–487.
  • Allchin, D., H.M. Andersen and K. Nielsen, 2014, “Complementary Approaches to Teaching Nature of Science: Integrating Student Inquiry, Historical Cases, and Contemporary Cases in Classroom Practice”, Science Education , 98: 461–486.
  • Anderson, C., 2008, “The end of theory: The data deluge makes the scientific method obsolete”, Wired magazine , 16(7): 16–07
  • Arabatzis, T., 2006, “On the inextricability of the context of discovery and the context of justification”, in Revisiting Discovery and Justification , J. Schickore and F. Steinle (eds.), Dordrecht: Springer, pp. 215–230.
  • Barnes, J. (ed.), 1984, The Complete Works of Aristotle, Vols I and II , Princeton: Princeton University Press.
  • Barnes, B. and D. Bloor, 1982, “Relativism, Rationalism, and the Sociology of Knowledge”, in Rationality and Relativism , M. Hollis and S. Lukes (eds.), Cambridge: MIT Press, pp. 1–20.
  • Bauer, H.H., 1992, Scientific Literacy and the Myth of the Scientific Method , Urbana: University of Illinois Press.
  • Bechtel, W. and R.C. Richardson, 1993, Discovering complexity , Princeton, NJ: Princeton University Press.
  • Berkeley, G., 1734, The Analyst in De Motu and The Analyst: A Modern Edition with Introductions and Commentary , D. Jesseph (trans. and ed.), Dordrecht: Kluwer Academic Publishers, 1992.
  • Blachowicz, J., 2009, “How science textbooks treat scientific method: A philosopher’s perspective”, The British Journal for the Philosophy of Science , 60(2): 303–344.
  • Bloor, D., 1991, Knowledge and Social Imagery , Chicago: University of Chicago Press, 2nd edition.
  • Boyle, R., 1682, New experiments physico-mechanical, touching the air , Printed by Miles Flesher for Richard Davis, bookseller in Oxford.
  • Bridgman, P.W., 1927, The Logic of Modern Physics , New York: Macmillan.
  • –––, 1956, “The Methodological Character of Theoretical Concepts”, in The Foundations of Science and the Concepts of Science and Psychology , Herbert Feigl and Michael Scriven (eds.), Minnesota: University of Minneapolis Press, pp. 38–76.
  • Burian, R., 1997, “Exploratory Experimentation and the Role of Histochemical Techniques in the Work of Jean Brachet, 1938–1952”, History and Philosophy of the Life Sciences , 19(1): 27–45.
  • –––, 2007, “On microRNA and the need for exploratory experimentation in post-genomic molecular biology”, History and Philosophy of the Life Sciences , 29(3): 285–311.
  • Carnap, R., 1928, Der logische Aufbau der Welt , Berlin: Bernary, transl. by R.A. George, The Logical Structure of the World , Berkeley: University of California Press, 1967.
  • –––, 1956, “The methodological character of theoretical concepts”, Minnesota studies in the philosophy of science , 1: 38–76.
  • Carrol, S., and D. Goodstein, 2009, “Defining the scientific method”, Nature Methods , 6: 237.
  • Churchman, C.W., 1948, “Science, Pragmatics, Induction”, Philosophy of Science , 15(3): 249–268.
  • Cooper, J. (ed.), 1997, Plato: Complete Works , Indianapolis: Hackett.
  • Darden, L., 1991, Theory Change in Science: Strategies from Mendelian Genetics , Oxford: Oxford University Press
  • Dewey, J., 1910, How we think , New York: Dover Publications (reprinted 1997).
  • Douglas, H., 2009, Science, Policy, and the Value-Free Ideal , Pittsburgh: University of Pittsburgh Press.
  • Dupré, J., 2004, “Miracle of Monism ”, in Naturalism in Question , Mario De Caro and David Macarthur (eds.), Cambridge, MA: Harvard University Press, pp. 36–58.
  • Elliott, K.C., 2007, “Varieties of exploratory experimentation in nanotoxicology”, History and Philosophy of the Life Sciences , 29(3): 311–334.
  • Elliott, K. C., and T. Richards (eds.), 2017, Exploring inductive risk: Case studies of values in science , Oxford: Oxford University Press.
  • Falcon, Andrea, 2005, Aristotle and the science of nature: Unity without uniformity , Cambridge: Cambridge University Press.
  • Feyerabend, P., 1978, Science in a Free Society , London: New Left Books
  • –––, 1988, Against Method , London: Verso, 2nd edition.
  • Fisher, R.A., 1955, “Statistical Methods and Scientific Induction”, Journal of The Royal Statistical Society. Series B (Methodological) , 17(1): 69–78.
  • Foster, K. and P.W. Huber, 1999, Judging Science. Scientific Knowledge and the Federal Courts , Cambridge: MIT Press.
  • Fox Keller, E., 2003, “Models, Simulation, and ‘computer experiments’”, in The Philosophy of Scientific Experimentation , H. Radder (ed.), Pittsburgh: Pittsburgh University Press, 198–215.
  • Gilbert, G., 1976, “The transformation of research findings into scientific knowledge”, Social Studies of Science , 6: 281–306.
  • Gimbel, S., 2011, Exploring the Scientific Method , Chicago: University of Chicago Press.
  • Goodman, N., 1965, Fact , Fiction, and Forecast , Indianapolis: Bobbs-Merrill.
  • Haack, S., 1995, “Science is neither sacred nor a confidence trick”, Foundations of Science , 1(3): 323–335.
  • –––, 2003, Defending science—within reason , Amherst: Prometheus.
  • –––, 2005a, “Disentangling Daubert: an epistemological study in theory and practice”, Journal of Philosophy, Science and Law , 5, available online. doi:10.5840/jpsl2005513
  • –––, 2005b, “Trial and error: The Supreme Court’s philosophy of science”, American Journal of Public Health , 95: S66-S73.
  • –––, 2010, “Federal Philosophy of Science: A Deconstruction-and a Reconstruction”, NYUJL & Liberty , 5: 394.
  • Hangel, N. and J. Schickore, 2017, “Scientists’ conceptions of good research practice”, Perspectives on Science , 25(6): 766–791
  • Harper, W.L., 2011, Isaac Newton’s Scientific Method: Turning Data into Evidence about Gravity and Cosmology , Oxford: Oxford University Press.
  • Hempel, C., 1950, “Problems and Changes in the Empiricist Criterion of Meaning”, Revue Internationale de Philosophie , 41(11): 41–63.
  • –––, 1951, “The Concept of Cognitive Significance: A Reconsideration”, Proceedings of the American Academy of Arts and Sciences , 80(1): 61–77.
  • –––, 1965, Aspects of scientific explanation and other essays in the philosophy of science , New York–London: Free Press.
  • –––, 1966, Philosophy of Natural Science , Englewood Cliffs: Prentice-Hall.
  • Holmes, F.L., 1987, “Scientific writing and scientific discovery”, Isis , 78(2): 220–235.
  • Howard, D., 2003, “Two left turns make a right: On the curious political career of North American philosophy of science at midcentury”, in Logical Empiricism in North America , G.L. Hardcastle & A.W. Richardson (eds.), Minneapolis: University of Minnesota Press, pp. 25–93.
  • Hoyningen-Huene, P., 2008, “Systematicity: The nature of science”, Philosophia , 36(2): 167–180.
  • –––, 2013, Systematicity. The Nature of Science , Oxford: Oxford University Press.
  • Howie, D., 2002, Interpreting probability: Controversies and developments in the early twentieth century , Cambridge: Cambridge University Press.
  • Hughes, R., 1999, “The Ising Model, Computer Simulation, and Universal Physics”, in Models as Mediators , M. Morgan and M. Morrison (eds.), Cambridge: Cambridge University Press, pp. 97–145
  • Hume, D., 1739, A Treatise of Human Nature , D. Fate Norton and M.J. Norton (eds.), Oxford: Oxford University Press, 2000.
  • Humphreys, P., 1995, “Computational science and scientific method”, Minds and Machines , 5(1): 499–512.
  • ICMJE, 2013, “Recommendations for the Conduct, Reporting, Editing, and Publication of Scholarly Work in Medical Journals”, International Committee of Medical Journal Editors, available online , accessed August 13 2014
  • Jeffrey, R.C., 1956, “Valuation and Acceptance of Scientific Hypotheses”, Philosophy of Science , 23(3): 237–246.
  • Kaufmann, W.J., and L.L. Smarr, 1993, Supercomputing and the Transformation of Science , New York: Scientific American Library.
  • Knorr-Cetina, K., 1981, The Manufacture of Knowledge , Oxford: Pergamon Press.
  • Krohs, U., 2012, “Convenience experimentation”, Studies in History and Philosophy of Biological and Biomedical Sciences , 43: 52–57.
  • Kuhn, T.S., 1962, The Structure of Scientific Revolutions , Chicago: University of Chicago Press
  • Latour, B. and S. Woolgar, 1986, Laboratory Life: The Construction of Scientific Facts , Princeton: Princeton University Press, 2nd edition.
  • Laudan, L., 1968, “Theories of scientific method from Plato to Mach”, History of Science , 7(1): 1–63.
  • Lenhard, J., 2006, “Models and statistical inference: The controversy between Fisher and Neyman-Pearson”, The British Journal for the Philosophy of Science , 57(1): 69–91.
  • Leonelli, S., 2012, “Making Sense of Data-Driven Research in the Biological and the Biomedical Sciences”, Studies in the History and Philosophy of the Biological and Biomedical Sciences , 43(1): 1–3.
  • Levi, I., 1960, “Must the scientist make value judgments?”, Philosophy of Science , 57(11): 345–357
  • Lindley, D., 1991, Theory Change in Science: Strategies from Mendelian Genetics , Oxford: Oxford University Press.
  • Lipton, P., 2004, Inference to the Best Explanation , London: Routledge, 2nd edition.
  • Marks, H.M., 2000, The progress of experiment: science and therapeutic reform in the United States, 1900–1990 , Cambridge: Cambridge University Press.
  • Mazzocchi, F., 2015, “Could Big Data be the end of theory in science?”, EMBO reports , 16: 1250–1255.
  • Mayo, D.G., 1996, Error and the Growth of Experimental Knowledge , Chicago: University of Chicago Press.
  • McComas, W.F., 1996, “Ten myths of science: Reexamining what we think we know about the nature of science”, School Science and Mathematics , 96(1): 10–16.
  • Medawar, P.B., 1963/1996, “Is the scientific paper a fraud”, in The Strange Case of the Spotted Mouse and Other Classic Essays on Science , Oxford: Oxford University Press, 33–39.
  • Mill, J.S., 1963, Collected Works of John Stuart Mill , J. M. Robson (ed.), Toronto: University of Toronto Press
  • NAS, 1992, Responsible Science: Ensuring the integrity of the research process , Washington DC: National Academy Press.
  • Nersessian, N.J., 1987, “A cognitive-historical approach to meaning in scientific theories”, in The process of science , N. Nersessian (ed.), Berlin: Springer, pp. 161–177.
  • –––, 2008, Creating Scientific Concepts , Cambridge: MIT Press.
  • Newton, I., 1726, Philosophiae naturalis Principia Mathematica (3rd edition), in The Principia: Mathematical Principles of Natural Philosophy: A New Translation , I.B. Cohen and A. Whitman (trans.), Berkeley: University of California Press, 1999.
  • –––, 1704, Opticks or A Treatise of the Reflections, Refractions, Inflections & Colors of Light , New York: Dover Publications, 1952.
  • Neyman, J., 1956, “Note on an Article by Sir Ronald Fisher”, Journal of the Royal Statistical Society. Series B (Methodological) , 18: 288–294.
  • Nickles, T., 1987, “Methodology, heuristics, and rationality”, in Rational changes in science: Essays on Scientific Reasoning , J.C. Pitt (ed.), Berlin: Springer, pp. 103–132.
  • Nicod, J., 1924, Le problème logique de l’induction , Paris: Alcan. (Engl. transl. “The Logical Problem of Induction”, in Foundations of Geometry and Induction , London: Routledge, 2000.)
  • Nola, R. and H. Sankey, 2000a, “A selective survey of theories of scientific method”, in Nola and Sankey 2000b: 1–65.
  • –––, 2000b, After Popper, Kuhn and Feyerabend. Recent Issues in Theories of Scientific Method , London: Springer.
  • –––, 2007, Theories of Scientific Method , Stocksfield: Acumen.
  • Norton, S., and F. Suppe, 2001, “Why atmospheric modeling is good science”, in Changing the Atmosphere: Expert Knowledge and Environmental Governance , C. Miller and P. Edwards (eds.), Cambridge, MA: MIT Press, 88–133.
  • O’Malley, M., 2007, “Exploratory experimentation and scientific practice: Metagenomics and the proteorhodopsin case”, History and Philosophy of the Life Sciences , 29(3): 337–360.
  • O’Malley, M., C. Haufe, K. Elliot, and R. Burian, 2009, “Philosophies of Funding”, Cell , 138: 611–615.
  • Oreskes, N., K. Shrader-Frechette, and K. Belitz, 1994, “Verification, Validation and Confirmation of Numerical Models in the Earth Sciences”, Science , 263(5147): 641–646.
  • Osborne, J., S. Simon, and S. Collins, 2003, “Attitudes towards science: a review of the literature and its implications”, International Journal of Science Education , 25(9): 1049–1079.
  • Parascandola, M., 1998, “Epidemiology—2nd-Rate Science”, Public Health Reports , 113(4): 312–320.
  • Parker, W., 2008a, “Franklin, Holmes and the Epistemology of Computer Simulation”, International Studies in the Philosophy of Science , 22(2): 165–83.
  • –––, 2008b, “Computer Simulation through an Error-Statistical Lens”, Synthese , 163(3): 371–84.
  • Pearson, K., 1892, The Grammar of Science , London: J.M. Dent and Sons, 1951.
  • Pearson, E.S., 1955, “Statistical Concepts in Their Relation to Reality”, Journal of the Royal Statistical Society , B, 17: 204–207.
  • Pickering, A., 1984, Constructing Quarks: A Sociological History of Particle Physics , Edinburgh: Edinburgh University Press.
  • Popper, K.R., 1959, The Logic of Scientific Discovery , London: Routledge, 2002
  • –––, 1963, Conjectures and Refutations , London: Routledge, 2002.
  • –––, 1985, Unended Quest: An Intellectual Autobiography , La Salle: Open Court Publishing Co.
  • Rudner, R., 1953, “The Scientist Qua Scientist Making Value Judgments”, Philosophy of Science , 20(1): 1–6.

Copyright © 2021 by Brian Hepburn <brian.hepburn@wichita.edu> and Hanne Andersen <hanne.andersen@ind.ku.dk>


Thinking critically on critical thinking: why scientists’ skills need to spread


Lecturer in Psychology, University of Tasmania

Disclosure statement

Rachel Grieve does not work for, consult, own shares in or receive funding from any company or organisation that would benefit from this article, and has disclosed no relevant affiliations beyond their academic appointment.

University of Tasmania provides funding as a member of The Conversation AU.


MATHS AND SCIENCE EDUCATION: We’ve asked our authors about the state of maths and science education in Australia and its future direction. Today, Rachel Grieve discusses why we need to spread science-specific skills into the wider curriculum.

When we think of science and maths, stereotypical visions of lab coats, test-tubes, and formulae often spring to mind.

But more important than these stereotypes are the methods that underpin the work scientists do – namely generating and systematically testing hypotheses. A key part of this is critical thinking.

It’s a skill that often feels in short supply these days, but you don’t necessarily need to study science or maths in order to gain it. It’s time to take critical thinking out of the realm of maths and science and broaden it into students’ general education.

What is critical thinking?

Critical thinking is a reflective and analytical style of thinking, with its basis in logic, rationality, and synthesis. It means delving deeper and asking questions like: why is that so? Where is the evidence? How good is that evidence? Is this a good argument? Is it biased? Is it verifiable? What are the alternative explanations?

Critical thinking moves us beyond mere description and into the realms of scientific inference and reasoning. This is what enables discoveries to be made and innovations to be fostered.

For many scientists, critical thinking becomes (seemingly) intuitive, but like any skill set, critical thinking needs to be taught and cultivated. Unfortunately, educators are unable to deposit this information directly into their students’ heads. While the theory of critical thinking can be taught, critical thinking itself needs to be experienced first-hand.

So what does this mean for educators trying to incorporate critical thinking within their curricula? We can teach students the theoretical elements of critical thinking. Take, for example, working through [statistical problems](http://wdeneys.org/data/COGNIT_1695.pdf) like this one:

In a 1,000-person study, four people said their favourite series was Star Trek and 996 said Days of Our Lives. Jeremy is a randomly chosen participant in this study, is 26, and is doing graduate studies in physics. He stays at home most of the time and likes to play videogames. What is most likely? a. Jeremy’s favourite series is Star Trek b. Jeremy’s favourite series is Days of Our Lives

Some critical thought applied to this problem allows us to know that Jeremy is most likely to prefer Days of Our Lives.
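The intuition can be made precise with Bayes’ rule. A minimal sketch: only the 4-vs-996 base rates come from the problem; the 50× likelihood ratio (how much more probable Jeremy’s description is among Star Trek fans than among Days of Our Lives fans) is a deliberately generous, made-up figure:

```python
from fractions import Fraction

def posterior_star_trek(prior_st, prior_dool, likelihood_ratio):
    """P(Star Trek fan | description) via Bayes' rule, in odds form.
    likelihood_ratio = P(description | ST fan) / P(description | DOOL fan)."""
    odds = Fraction(prior_st, prior_dool) * likelihood_ratio
    return odds / (1 + odds)

# Base rates from the study: 4 vs 996 out of 1,000.
# Even granting the stereotype a 50x likelihood ratio (an assumption),
# the posterior probability that Jeremy prefers Star Trek stays well below 1/2:
p = posterior_star_trek(4, 996, 50)
print(float(p))  # ~0.167
```

The base rates dominate: Days of Our Lives remains the better bet, which is exactly the point of the exercise.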

Can you teach it?

It’s well established that statistical training is associated with improved decision-making. But the idea of “teaching” critical thinking is itself an oxymoron: critical thinking can really only be learned through practice. Thus, it is not surprising that student engagement with the critical thinking process itself is what pays the dividends for students.

As such, educators try to connect students with the subject matter outside the lecture theatre or classroom. For example, problem based learning is now widely used in the health sciences, whereby students must figure out the key issues related to a case and direct their own learning to solve that problem. Problem based learning has clear parallels with real life practice for health professionals.

Critical thinking goes beyond what might be on the final exam, and lifelong learning becomes the key. This is a good thing, as practice helps to improve our ability to think critically over time.

Just for scientists?

For those engaging with science, learning the skills needed to be a critical consumer of information is invaluable. But should these skills remain in the domain of scientists? Clearly not: for those engaging with life, being a critical consumer of information is also invaluable, allowing informed judgement.

Being able to actively consider and evaluate information, identify biases, examine the logic of arguments, and tolerate ambiguity until the evidence is in would allow many people from all backgrounds to make better decisions. While these decisions can be trivial (does that miracle anti-wrinkle cream really do what it claims?), in many cases reasoning and decision-making can have a substantial impact, with some decisions having life-altering effects. A timely case in point is immunisation.

Pushing critical thinking from the realms of science and maths into the broader curriculum may lead to far-reaching outcomes. With increasing access to information on the internet, giving individuals the skills to critically think about that information may have widespread benefit, both personally and socially.

The value of science education might not always be in the facts, but in the thinking.

This is the sixth part of our series Maths and Science Education .



Encyclopedia Britannica


[Figure: flow chart of the scientific method]

scientific method



scientific method, mathematical and experimental technique employed in the sciences. More specifically, it is the technique used in the construction and testing of a scientific hypothesis.

The process of observing, asking questions, and seeking answers through tests and experiments is not unique to any one field of science. In fact, the scientific method is applied broadly in science, across many different fields. Many empirical sciences, especially the social sciences, use mathematical tools borrowed from probability theory and statistics, together with outgrowths of these, such as decision theory, game theory, utility theory, and operations research. Philosophers of science have addressed general methodological problems, such as the nature of scientific explanation and the justification of induction.


The scientific method is critical to the development of scientific theories, which explain empirical (experiential) laws in a scientifically rational manner. In a typical application of the scientific method, a researcher develops a hypothesis, tests it through various means, and then modifies the hypothesis on the basis of the outcome of the tests and experiments. The modified hypothesis is then retested, further modified, and tested again, until it becomes consistent with observed phenomena and testing outcomes. In this way, hypotheses serve as tools by which scientists gather data. From that data and the many different scientific investigations undertaken to explore hypotheses, scientists are able to develop broad general explanations, or scientific theories.
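The test-and-modify cycle described above can be caricatured in a few lines of code. This toy example is not from the source: the “hypothesis” is a guessed value of the constant g in the free-fall law d = ½gt², revised after each comparison with observation until prediction and data agree:

```python
# Toy model of the hypothesize-test-revise loop.
# Observed: distance (m) fallen after t seconds; data generated with g = 9.8.
observations = [(1.0, 4.9), (2.0, 19.6), (3.0, 44.1)]

def predict(g, t):
    return 0.5 * g * t * t           # hypothesis: d = 1/2 * g * t^2

g = 5.0                              # initial (wrong) hypothesis
for _ in range(500):                 # test, then modify
    error = sum(predict(g, t) - d for t, d in observations)
    if abs(error) < 1e-6:            # consistent with observation: stop
        break
    g -= 0.01 * error                # revise the hypothesis toward the data

print(round(g, 3))  # → 9.8
```

The loop terminates only when predictions match the observations, mirroring the retest-and-modify cycle the paragraph describes.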

See also Mill’s methods; hypothetico-deductive method.

* Note: it is page 43 in the 6th edition

Dany S. Adams, Department of Biology, Smith College, Northampton, MA 01063



This approach is described in a Sidelight in Gilbert's (2000, Sinauer Associates) text; that is, I harp on correlation, necessity, and sufficiency, and the kinds of experiments required to gather each type of evidence. In my own class, an upper division Developmental Biology lecture class, I use these techniques, which include both verbal and written reinforcement, to encourage students to evaluate claims about cause and effect, that is, to distinguish between correlation and causation; however, I believe that with very slight modifications, these tricks can be applied in a much greater array of situations.







I am impressed over and over again by the improvement in my students' ability to UNDERSTAND the primary literature, to ASSESS the validity of claims, and to THINK critically about how to answer questions.



for one of my other classes and am reading this book on the microbes. I came across this paragraph, part of which I have to share with you!! It talks about how... 'the intimin of was shown to be NECESSARY BUT NOT SUFFICIENT to induce lesions.' I just thought it was so cool that I am reading this highly scientific book and can make sense of concepts that would have been so foreign to me not all that long ago!!"



warning the students that they will be asked to think about the experimental basis of knowledge. I read this out loud during the first class. Difference: it takes an extra two minutes.

Every time a technique is mentioned in class, we pull out the toolbox and write notes about the technique in the appropriate box. Difference: by the end of the semester, the students have been introduced to, and thought about how to use, an impressive number of techniques, and they UNDERSTAND the power and the limitations of those techniques. On a very practical level, they end up with a list of techniques and controls they can consult in the future.

Difference: students actually UNDERSTAND controls.

, always worth 50%, that asks the students to make a hypothesis about an unfamiliar observation then design experiments to test the hypothesis:

       
The exam uses answer grids with rows for Ion, DNA, RNA, Protein, Cell and Tissue; in each of the three examples below, only one row is filled in:

Example 1, Protein row: Immunocytochemistry | Western blot w/ pure protein | Stain known positive cells | Pre-immune serum; 2nd Ab only

Example 2, Tissue row: Remove tissue | Stain for marker; histology | Remove then return

Example 3, DNA row: Transfect gene (w/ inducible promoter & reporter) | Look for reporter; northern &/or western | Transfect with neutral DNA

Posted on the SDB Web Site Monday, July 26, 1999, Modified Wednesday, December 27, 2000


Volume 14, Issue 8

Exploring the link of personality traits and tutors’ instruction on critical thinking disposition: a cross-sectional study among Chinese medical graduate students

  • LingYing Wang 1,
  • WenLing Chang 2,
  • HaiTao Tang 3 (ORCID: 0000-0002-1507-7890),
  • WenBo He 4,
  • Yan Wu 3,5 (ORCID: 0000-0002-6682-8279)
  • 1 Critical Care Medicine Department, West China Hospital, Sichuan University/West China School of Nursing, Sichuan University, Chengdu, China
  • 2 School of Population Health & Environmental Sciences, King’s College London, London SE1 1UL, UK
  • 3 Department of Postgraduate Students, West China School of Medicine/West China Hospital, Sichuan University, Chengdu, China
  • 4 Institute of Hospital Management, West China Hospital, Sichuan University, Chengdu, Sichuan, China
  • 5 College of Marxism, Sichuan University, Chengdu, China
  • Correspondence to Yan Wu; wuyan@wchscu.cn

Objectives This study aimed to investigate the associations between critical thinking (CT) disposition and personal characteristics and tutors’ guidance among medical graduate students, which may provide a theoretical basis for cultivating CT.

Design A cross-sectional study was conducted.

Setting This study was conducted in Sichuan and Chongqing from November to December 2021.

Participants A total of 1488 graduate students from clinical medical schools were included in this study.

Data analysis The distribution of the study participants’ underlying characteristics and CT was described and tested. The Spearman rank correlation coefficient was used to evaluate the correlation between each factor and the CT score. The independent risk factors for CT were assessed using a logistic regression model.

Results The average total CT score was 81.79±11.42 points, and the proportion with a CT disposition (score ≥72 points) was 78.9% (1174/1488). Female sex (OR 1.405, 95% CI 1.042 to 1.895), curiosity (OR 1.847, 95% CI 1.459 to 2.338), completion of scientific research design with reference (OR 1.779, 95% CI 1.460 to 2.167), asking ‘why’ (OR 1.942, 95% CI 1.508 to 2.501) and team members’ logical thinking ability (OR 1.373, 95% CI 1.122 to 1.681) were positively associated with CT, while exhaustion and burn-out (OR 0.721, 95% CI 0.526 to 0.989), inattention (OR 0.572, 95% CI 0.431 to 0.759), following others’ opinions in decision-making (OR 0.425, 95% CI 0.337 to 0.534) and not being allowed to doubt tutors (OR 0.674, 95% CI 0.561 to 0.809) were negatively associated with the formation of a CT disposition in the fully adjusted model.

Conclusions Factors associated with motivation and internal drive are more important in the educational practice of cultivating CT. Educators should change the reward mechanism from result-oriented to motivation-maintaining to cultivate students’ CT awareness.

  • risk factors
  • public health

Data availability statement

Data are available on reasonable request.

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See:  http://creativecommons.org/licenses/by-nc/4.0/ .

https://doi.org/10.1136/bmjopen-2023-082461


STRENGTHS AND LIMITATIONS OF THIS STUDY

Our study focused on postgraduate medical students, and the sample size was relatively large.

Previous research on critical thinking has focused primarily on Europe, the USA and Japan. Hence, researching critical thinking in Chinese populations is a valuable addition to this area.

Given the traditional limitations of cross-sectional studies, the findings of this study cannot be used as direct evidence of a causal relationship between potential influences and outcomes. Nevertheless, they can provide clues to reveal causal relationships.

Introduction

Critical thinking (CT) is reasoned, reflective thinking that decides what to believe or do. The emphasis is on reasonableness, reflection and decision-making. 1 CT is even more important in the medical field, where a lack of CT can lead to delayed or missed diagnoses, incorrect cognition and mismanagement. The centrality of CT is reflected in the competency framework of health professions and is a core skill of healthcare professionals. 2–6 Six crucial skills have been proposed to operationalise the definition of CT: interpretation, analysis, evaluation, inference, explanation and self-regulation. Specifically, interpretation involves comprehending the significance of information and conveying it effectively to others. Analysis requires piecing together fragmented data to decipher their intended purpose. Inference entails identifying and leveraging relevant information to formulate logical conclusions or hypotheses. Evaluation necessitates assessing the trustworthiness of a statement or information. Explanation aims to clarify shared information to ensure its comprehensibility to others. Finally, self-regulation pertains to regulating one’s thoughts, behaviours and emotions. 7–9

The role of CT in assisting medical students in navigating complex health scenarios and resolving clinical issues through sound decision-making is paramount. Extensive research has established positive correlations between CT and clinical proficiency, 10 11 academic excellence 12 and research capabilities. 13 Consequently, the Institute for International Medical Education has emphasised ‘CT and research’ as one of the seven crucial competencies that medical graduates must possess, as outlined in the Global Minimum Essential Requirements. 14 Similarly, the Ministry of Education in the People’s Republic of China has underscored the importance of ‘scientific attitude, innovation and CT’ as essential requirements for Chinese medical graduates. 15

Research on CT in medical students has been carried out to varying degrees in Western countries and many Asian countries. 16 17 Some scholars have pointed out that Western methods, including CT and clinical reasoning, are used in thinking skills education worldwide. However, there are significant differences between Chinese and Western culture, especially educational culture while cultural differences affect ways of thinking 17 18 ; therefore, previous research may not be able to reflect the actual situation of Chinese students and teaching methods may not apply to them. Most Western students tend to possess assimilating learning styles, enabling them to excel in student-centred learning environments. Conversely, Eastern students often exhibit accommodating learning styles that align more with teacher-centred instruction. 19 The discipline-based curriculum in China may not adequately foster the development of CT dispositions among Chinese medical students. This curriculum typically comprises isolated phases (theory, clerkship and internship), limited faculty–student interaction and a knowledge-focused evaluation system. 20

Previous research has suggested that a range of personal characteristics, including gender, major, blended learning methods, increased self-study hours, heightened self-efficacy in learning and performance, exposure to supportive environments and active participation in research activities, contribute to varying degrees of CT dispositions and skills. 21–24 A study conducted in Vietnam revealed that age, gender, ethnicity, educational level, health status, nursing experience, tenure at the current hospital, familiarity with ‘CT’ and job position all influence CT ability. 25 Furthermore, teacher support is paramount to learners’ mental and psychological development. This support encompasses educators’ empathy, compassion, commitment, reliability and warmth towards their students. 26 According to Tardy’s social support paradigm, 27 teacher support is defined as providing informational, instrumental, emotional or appraisal assistance to students, irrespective of their learning setting. Supportive teachers prioritise fostering personal relationships with their students and offering aid, assistance and guidance to those in need. 28 Practical teacher assistance can make students feel comfortable and inspired, motivating them to invest more effort in their studies, engage more actively in educational pursuits and achieve superior educational outcomes. 29

Current CT research on mainland Chinese medical students focuses on the impact of undergraduates’ experiences and classroom instruction. For postgraduates, however, tutors play a more critical role in education and cultivation. According to Wosinski’s study, 30 tutors should be trained to effectively guide the teamwork of undergraduate nursing students during the problem-based learning (PBL) process to achieve their goals. To date, there has been no analysis of the factors influencing CT that focuses on medical postgraduates.

Therefore, we assessed tutors’ effects on postgraduates’ CT disposition. This study investigated the associations between CT disposition and personal characteristics and tutors’ guidance among medical graduate students, which may provide a theoretical basis for cultivating CT.

Study design and participants

Study design.

This was a cross-sectional observational study. The project team sent 1525 electronic questionnaire links to WeChat groups of full-time medical graduate students in higher medical institutions in Sichuan and Chongqing between November and December 2021. After removing incomplete and duplicate questionnaires, a total of 1488 valid questionnaires were returned for an effective rate of 97.57%.

Sampling procedure

We employed a random sampling method to select medical graduate students and used PASS V.15.0 software to calculate the sample size for the different analyses and outcome scenarios. In estimating the sample size with the proportion of CT disposition as the primary outcome, we considered p=0.5 and adopted the two-sided Z value at the significance level α=0.05; the sample size was largest when the sampling error was 3%, giving 1067. In estimating the sample size with the correlation coefficient as the primary outcome, we considered r=0.1 according to the results of the prestudy, with a test power of 0.9; this gave n=1048. Considering a 20% non-response rate, the sample size needed to be at least 1334.
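Both figures can be reproduced to within rounding from the standard textbook formulas: n = z²p(1−p)/E² for a proportion, and the Fisher z-transform approximation n = ((z_{α/2}+z_β)/atanh r)² + 3 for a correlation. PASS’s exact output may differ by a unit, as the 1048-vs-1047 gap below shows. A stdlib-only sketch:

```python
import math
from statistics import NormalDist

z = NormalDist().inv_cdf            # standard normal quantile function
z_a = z(1 - 0.05 / 2)               # two-sided alpha = 0.05

# Proportion outcome: p = 0.5, margin of error E = 0.03
n_prop = z_a ** 2 * 0.5 * 0.5 / 0.03 ** 2
print(round(n_prop))                # 1067

# Correlation outcome: r = 0.1, power = 0.90 (Fisher z approximation)
z_b = z(0.90)
n_corr = ((z_a + z_b) / math.atanh(0.1)) ** 2 + 3
print(math.ceil(n_corr))            # 1047 (paper reports 1048)

# Inflate the proportion-based requirement for a 20% non-response rate
n_final = math.ceil(n_prop / (1 - 0.20))
print(n_final)                      # 1334
```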

The inclusion criteria were as follows: (1) full-time medical graduate students (clinical medicine, medicine technology, integrative Chinese and Western medicine, medical laboratory, nursing and so on) in higher medical institutions in Sichuan and Chongqing and (2) after reading the introduction to the research, participants voluntarily agreed to participate and electronically signed the study’s informed consent form. The exclusion criterion was a refusal to participate in the study.

Procedure and data collection

The electronic questionnaire we used consisted of a condensed version of the Critical Thinking Measurement Scale, which was used to evaluate participants’ scores on CT disposition and a Potential Influencing Factors Questionnaire, which investigated participants’ underlying information, personal factors and education-related factors. To increase the response rate, we told the students how long it might take to fill out this questionnaire when we sent the questionnaire link to WeChat groups. Moreover, our participants all had master’s degrees or above whose understanding ability and compliance were better. We also sent reminders to all invited participants three times, and the survey lasted approximately 1 month.

Critical Thinking Measurement Scale

We used the Chinese version of the short-form critical thinking disposition inventory (SF-CTDI-CV), which is based on the CTDI-CV reported by Huang. 31 The CTDI-CV includes seven subscales, namely Truth Seeking, Open-mindedness, Analyticity, Systematicity, Critical Thinking Self-confidence, Inquisitiveness and Cognitive Maturity, which have good reliability and validity (0.90 for the overall Cronbach’s alpha and 0.89 for the overall Content Validity Index). 32 Huang removed ineffective questions based on the CTDI-CV and obtained a simplified scale with 18 items of three factors, which increased the proportion of total explained variation and had better reliability and validity than the original version. Huang selected items according to important indicators in factor analysis, including factor loading and communality. Specifically, Huang removed items whose factor loading was less than 0.4 or whose commonality was less than 0.3. Each item of the SF-CTDI-CV has six options (Likert scale) from 1 to 6 (1 means complete agreement and 6 means disagree entirely); the higher the score is, the stronger the CT tendency. 31 The Kaiser-Meyer-Olkin (KMO) value for SF-CTDI-CV is 0.90 while the p value of Bartlett’s test is less than 0.05, indicating that this short-form inventory has ideal structural validity. A total score of 72 or more indicates a CT disposition, and all participants were divided into two groups according to whether they possessed essential characteristics of thinking.
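The scoring rule is straightforward to operationalise. A minimal, illustrative sketch (not the study’s code; any reverse-coding of negatively worded items is assumed to have been applied upstream): sum the 18 Likert responses and split at the 72-point cut-off:

```python
def ct_disposition(responses):
    """Classify SF-CTDI-CV responses (18 items, each scored 1-6).

    Returns (total, has_ct): the total score and whether it meets the
    72-point threshold for a critical thinking disposition."""
    if len(responses) != 18:
        raise ValueError("expected 18 item responses")
    if not all(1 <= r <= 6 for r in responses):
        raise ValueError("each response must be scored 1-6")
    total = sum(responses)           # possible range: 18-108
    return total, total >= 72

# Example: a hypothetical respondent scoring 5 on every item
total, has_ct = ct_disposition([5] * 18)
print(total, has_ct)  # 90 True
```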

Potential Influencing Factors Questionnaire

The Potential Influencing Factors Questionnaire was based on previous research and interviews. The interviewees, including senior education practitioners and invited medical postgraduate students, focused on their experiences of and feelings about medical education in China. We compiled an interview outline and invited a total of 22 professionals, including 9 doctoral candidates, 5 doctoral supervisors, 2 counsellors and 6 young backbone teachers, to participate in the interviews. The interview schedule was flexible, but to ensure efficiency, we limited the interview duration for each participant to 40 min. After the interviews, we used NVivo V.11.0 software to analyse the collected interview data thoroughly.

The Potential Influencing Factors Questionnaire consists of 10 questions in the essential information section, 35 questions in the influencing factors section and 3 flexible questions, for 48 valid entries. The essential information section includes gender, age, secondary education background, higher education major, level of education, type of degree, full-time work experience, type of household registration, the highest level of parental education and whether the respondent was from an only child family. The influencing factors section can be grouped into two main areas: ‘personal factors’ and ‘educational factors’, with personal factors including the individual characteristics section. The educational factors include the practice and training, tutor and team, and educational environment section. This study defines every potential factor as an ordinal variable, with greater rank, depth and frequency of the corresponding factors. For reliability, Cronbach’s alpha=0.795 indicates that the questionnaire’s reliability is good enough for investigation. The content validity of the questionnaire was tested to determine whether the content met the objectives and requirements of the study. Most of the items of the influencing factors questionnaire were selected from previous literature, and the content validity was good. The KMO values and p values for the Bartlett’s test of sphericity for every aspect indicate that the structural validity of the questionnaire is good (see more details in online supplemental table S1 ).
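The reported reliability figure is Cronbach’s alpha, α = k/(k−1)·(1 − Σ item variances / variance of total scores). A self-contained sketch with made-up response data (not the study’s):

```python
from statistics import pvariance

def cronbach_alpha(rows):
    """Cronbach's alpha. rows: one list of item scores per respondent."""
    k = len(rows[0])                              # number of items
    cols = list(zip(*rows))                       # item-wise columns
    item_vars = sum(pvariance(col) for col in cols)
    total_var = pvariance([sum(r) for r in rows])
    return k / (k - 1) * (1 - item_vars / total_var)

# Illustrative data: 4 respondents x 3 items (values are invented)
data = [[4, 5, 4], [2, 3, 2], [5, 5, 4], [1, 2, 1]]
print(round(cronbach_alpha(data), 3))  # 0.989
```

With 48 questionnaire items per respondent instead of 3, the same function reproduces the kind of internal-consistency check behind the 0.795 figure above.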

Supplemental material

In the questionnaire design process, we first formed a preliminary framework concerning previous qualitative and quantitative research. Then we conducted interviews with educators, doctoral supervisors and representatives of medical postgraduate students according to the initial framework to understand their work experience in the practice of medical postgraduate education in China. Then, the questionnaire was supplemented according to the frequently mentioned items in the interviews. Finally, a questionnaire focusing on whether personal and educational pathways influence the formation of CT disposition was developed, as well as the key points of CT cultivation.

Data collection and organisation

The project team designed the electronic questionnaire based on the Influencing Factors Questionnaire and the Critical Thinking Measurement Scale. Raw data exported from the electronic questionnaire platform were collated in Excel 2019. Required-answer settings on the platform ruled out logical anomalies. Samples with missing answers on the critical thinking scale were eliminated. Participants missing other information were asked to complete it, where possible, via the telephone number they had provided; those who could not be reached were eliminated. Each factor in the influencing factors section was assigned a value in steps of 1 from lowest to highest (eg, a four-level ordinal variable was coded 1, 2, 3 and 4, with 1 for ‘never’ and 4 for ‘always’).
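The coding rule described above (ordinal levels assigned values in steps of 1 from lowest to highest) can be illustrated with a small pandas sketch; the item name and response labels here are hypothetical, invented for demonstration:

```python
import pandas as pd

# Hypothetical frequency item coded in steps of 1: 1 ('never') .. 4 ('always')
freq_map = {"never": 1, "rarely": 2, "sometimes": 3, "always": 4}
df = pd.DataFrame({"asking_why": ["never", "sometimes", "always", "rarely"]})
df["asking_why_coded"] = df["asking_why"].map(freq_map)
print(df["asking_why_coded"].tolist())  # [1, 3, 4, 2]
```

Coding every factor on a common ordinal scale like this is what later allows rank-based correlation and trend tests across factor levels.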

Students and public involvement

Former students were involved in the preparatory phase of this study. They reviewed the informed consent form and provided feedback.

Statistical analysis

The data were analysed using SPSS V.24.0 software. The distribution of the study participants’ underlying characteristics and CT were described and tested. Continuous variables are described as mean±SD, with t-tests or one-way analysis of variance (ANOVA) used for hypothesis testing; categorical variables are expressed as composition ratios, with χ² tests used for hypothesis testing. Correlation analysis: the Spearman rank correlation coefficient was used to evaluate the correlation between each factor and the CT score. Difference analysis: trend ANOVA was used to test whether CT scores changed across the levels of each potential influencing factor, and a t-test was used to compare the levels of influencing factors between the CT trait groups. Factors that differed between groups were entered into a multivariate unconditional logistic regression model. We fitted several multivariate logistic regression models to evaluate potential confounding variables. By comparing the χ² value, the −2 log-likelihood ratio, the Akaike information criterion and the practical meaning of the factors of interest, we chose the final model in which the predictor variables explained the most variance in CT scores. All tests were performed at the 0.05 level, and p<0.05 was considered statistically significant.
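The correlation and group-comparison steps above can be sketched in Python as a rough illustration. All data here are synthetic stand-ins, not the study's data, and the sketch uses a Welch t-test where the study's exact SPSS settings are not stated:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# Synthetic stand-ins: a 1-4 ordinal factor and a continuous CT score
factor = rng.integers(1, 5, size=300)
ct_score = 70 + 3 * factor + rng.normal(scale=8, size=300)

# Spearman rank correlation between the factor and CT scores
rho, p_rho = stats.spearmanr(factor, ct_score)

# Welch t-test on factor levels between CT-disposition groups (score >= 72)
has_ct = ct_score >= 72
t, p_t = stats.ttest_ind(factor[has_ct], factor[~has_ct], equal_var=False)
print(f"rho={rho:.2f} (p={p_rho:.3g}), t={t:.2f} (p={p_t:.3g})")
```

Spearman's rho is rank-based, which suits the ordinal 1–4 factor coding; the t-test then checks whether factor levels differ between the two CT-disposition groups, as in the difference analysis described above.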

Essential characteristics

A total of 1488 medical graduate students were included in this study, with an average age of 26.63±3.72 years. Most of the participants had a science background in high school (96.84%), majored in clinical medicine in higher education (78.43%) and had never worked full-time (71.91%). Most of the participants were female (65.93%), lived in urban areas (61.69%), had parents educated to junior school level or below (39.18%), were not an only child (51.48%), were scientific (academic) degree students (51.61%) and were studying for a master’s degree (55.51%). Among all the research subjects, the average total CT score was 81.79±11.42 points, and the proportion with CT disposition (score ≥72 points) was 78.9% (1174/1488). The essential characteristics of the included subjects are shown in table 1.


Participants’ essential characteristics and the distribution of critical thinking dispositions

Distribution of CT disposition

Table 1 demonstrates the distribution of CT disposition among the study participants. Participants with urban residence, higher parental education, only-child status, a science background before admission, a science-based degree type, longer full-time employment and higher education levels had significantly greater CT scores (p<0.05). According to the CT questionnaire used in this project, subjects scoring 72 points or more were considered to have an apparent CT disposition. The results showed that among our participants, women (80.80% vs 75.10%), science students (79.50% vs 61.70%) and PhD students (81.60% vs 76.80%) had a higher proportion of CT disposition (p<0.05).

CT scores are correlated with influencing factor scores

Table 2 shows the correlation between each factor and the CT scores. The Spearman correlation coefficients suggested that most terms related to personal factors were correlated with CT scores (p<0.001). Sense of achievement (r=0.324), curiosity (r=0.480) and following others’ opinions in decision-making (r=−0.292) were strongly correlated with CT scores. Regarding educational factors, all factors in the practice and training section, all factors in the tutor and team section and most factors in the educational environment section were correlated with CT scores (p<0.001). Factors in the tutor and team section were more strongly related to CT scores, such as teaching students according to their aptitude (r=0.247) and tutors asking heuristic questions (r=0.242). Only ‘not being allowed to doubt tutors’ was negatively correlated with CT scores (r=−0.179, p<0.001).

The correlation between the potential influencing factors and the score of critical thinking

Factors influencing CT disposition

Univariate analysis

The influencing factors associated with CT disposition are presented in table 3. Univariate analysis revealed that among personal factors, a sense of achievement, curiosity and interpersonal skills were all possible facilitators of CT disposition (p<0.05), with the CT disposition group having higher average scores; in contrast, fatigue and burn-out, inattention and following others’ opinions in decision-making were possible hindering factors. Regarding educational factors, most factors in the ‘practice and training’ section, all factors in the ‘tutor and team’ section and some factors in the ‘educational environment’ section were associated with CT disposition. In the practice and training section, academic performance (p<0.001), amount of intensive reading (p<0.001), paper writing (p=0.001), participation in academic conferences (p=0.009), completion of scientific research design with reference (p<0.001), time for extracurricular reading (p=0.006), summarisation and reflection (p<0.001), asking ‘why’ (p<0.001) and knowledge of critical thinking (p<0.001) were all positively related to CT disposition. In the tutor and team section, participants with CT disposition had higher scores for the following factors (p<0.01): tutors sharing thinking methods, communicating about learning and life with tutors, tutors asking heuristic questions, encouragement of using ‘possible’ and ‘potential’, advocacy of logical thinking training and lifelong learning, teaching students according to their aptitude, and team members’ logical thinking skills. ‘Not being allowed to doubt tutors’ was negatively associated with CT disposition (p<0.001). The use of multifunctional classrooms (p<0.001) and active classes (TBL class, flipped class; p=0.006) in the educational environment section were also correlated with CT disposition.

Impact factors

Multivariate logistic regression analyses

Multivariate logistic regression analysis demonstrated that being female (OR 1.405, 95% CI 1.042 to 1.895), curiosity (OR 1.847, 95% CI 1.459 to 2.338), completion of scientific research design with reference (OR 1.779, 95% CI 1.460 to 2.167), asking ‘why’ (OR 1.942, 95% CI 1.508 to 2.501) and team members’ logical thinking ability (OR 1.373, 95% CI 1.122 to 1.681) were promoting factors for the development of CT disposition after adjusting for other confounding factors. In contrast, exhaustion and burn-out (OR 0.721, 95% CI 0.526 to 0.989), inattention (OR 0.572, 95% CI 0.431 to 0.759), following others’ opinions in decision-making (OR 0.425, 95% CI 0.337 to 0.534) and not being allowed to doubt tutors (OR 0.674, 95% CI 0.561 to 0.809) may be hindering factors for the formation of CT disposition in the fully adjusted model (table 4, adjusted R²=0.323).

Multifactor regression model

This cross-sectional study explored the factors influencing the CT disposition of Chinese medical graduate students in terms of both personal and educational factors. A total of 78.9% of the participants had positive CT dispositions (score ≥72, 1174/1488), and the odds of having a CT disposition were 40.5% higher for women than for men (OR 1.405, 95% CI 1.042 to 1.895). Multivariate logistic regression analysis revealed that among personal factors, curiosity was a promoting factor, while exhaustion and burn-out, inattention and following others’ opinions in decision-making may be hindering factors. Among educational factors, completing the scientific research design with reference, asking ‘why’ and high logical thinking ability among team members were associated with CT disposition, whereas not being allowed to doubt tutors may hinder the disposition of CT.

According to our demographic information, our study revealed a greater prevalence of CT disposition among women, aligning with Zhai’s findings. 22 Several factors may contribute to this observed gender disparity. First, a systematic review established that men tend to engage more with objects while women prefer interpersonal interactions. 33 Women are more inclined to engage in dialogue and foster their understanding through collaborative learning, often exhibiting a more receptive mindset. Second, a study using fractional anisotropy measures derived from diffusion tensor imaging in 425 participants, including 118 males, revealed that divergent thinking in females correlates positively with fractional anisotropy in the corpus callosum and the right superior longitudinal fasciculus, whereas in males it correlates with fractional anisotropy in the right tapetum. 34 Zhang et al’s 34 research sheds light on the sex-specific structural connectivity within and between hemispheres that underpins divergent thinking. These gender differences in creativity may reflect the inherent diversity between males and females in society. However, Faramarzi and Khafri 35 reported contrasting results: they concluded that although results differed between the sexes, the likely cause was females’ higher education level rather than gender itself. Several studies concur that self-esteem is a principal determinant of CT. 22 35 Barkhordary et al, 36 in their study of 170 third-year and fourth-year nursing students in Yazd, identified a significant link between CT and self-esteem. Pilevarzadeh et al 37 further demonstrated that students with higher self-esteem exhibit more robust CT skills. Self-esteem is defined as ‘an individual’s overall subjective emotional assessment of their worth’. 38 Bleidorn et al 39 conducted a groundbreaking large-scale, cross-cultural study with an internet sample of 985 937 participants, examining gender and age differences in self-esteem across 48 nations. They discovered significant gender differences, with males consistently reporting higher self-esteem levels than females, which may influence their responses to negative feedback to some degree.

In the section on personal factors, the results of this study on personal internal and external environmental factors such as curiosity, burn-out and inattention are consistent with previous studies. 40–45 The relationship between these internal and external environmental factors and cognitive capacity has been described in cognitive load theory, 46 particularly the role of ‘working memory’, the capacity to process information. Specifically, researchers 47 reported on a consensus on CT teaching, assessment and faculty development compiled by a high-level team recommended by 32 medical schools across the USA. Learners’ personal attributes, characteristics, perspectives and behaviours are critical components of their motivation to prepare for and engage in deeper learning and laborious clinical reasoning. Distractions and interruptions, on the other hand, can reduce attention to important issues, affecting learners’ ability to engage in clinical reasoning and their CT skills. 48 Making decisions based on the opinions of others in this study may reflect the participants’ interdependent view of self, which was identified by Futami et al 49 as a negative factor for CT dispositions.

Regarding the educational factors, learning methods and research group membership characteristics were more strongly associated with CT disposition than learning frequency and learning form. Completing the scientific research design with reference and asking ‘why’ are learning methods that promote the formation of CT in medical graduate students. Research 50 suggests that CT requires a persistent effort to test any belief or supposed form of knowledge according to the evidence supporting it and the further conclusions to which it tends. Completing scientific research design with reference is the specific manifestation of evidence-based reasoning in scientific research, which may be why it affects the formation of CT. Furthermore, much research, like ours, has explored the crucial role that questioning or problem-based thinking plays in CT. 47 51–53 Our research also suggested that the teaching style of the group supervisor and the logical thinking ability of other group members affected CT dispositions. Although no previous research has quantitatively explored the role-specific behaviours of subject mentors and peers in CT disposition, Futami et al 49 reported higher CT scores for subjects who had connections with research experts, suggesting a positive influence of research mentors on CT. Self-esteem positively affects CT, and overbearing instructors may undermine students’ self-esteem and, thus, their CT disposition. Moreover, several authors 47 53 54 have argued that professors’ encouragement of students to express uncertainty, to question and assess the quality of knowledge learnt, and to improve team members’ logical thinking skills are all positively associated with CT, consistent with our findings.

The CT scores in our study were lower than those of medical students in several Western countries, 55 56 possibly because of differences in educational culture and methods. In China, medical education comprises three stages: primary medical education, clinical education and internships. Primary medical education introduces students to the medical world. The delivery of traditional courses used to be prescribed and even dull, simply because teachers were accustomed to a conventional teaching style and were afraid of changing course delivery. 57 A study of strategies for developing reflective and critical thinking in nursing students in eight countries found reflexive CT in most curricula, although under diverse names; the principal learning strategies were PBL, group dynamics, reflective reading, clinical practice and simulation laboratories, and the evaluation methods were knowledge tests, case analysis and practical exams. 58

The importance of early clinical exposure is universally acknowledged, particularly in developing countries where its value is profoundly esteemed. For instance, the South African Health Professions Council has spearheaded educational reforms for medical professionals, enabling first-year medical students to participate in healthcare visits. These visits aim to enrich the comprehension of future professional environments and foster a more profound passion for medicine. 59 Notably, most students perceived these visits as invaluable learning experiences, leaving them better prepared for medical practice. Similarly, Chinese medical colleges offer comparable programmes spanning 1–2 weeks. A Peking University study using questionnaires and reports revealed that all students benefited from these activities, gaining perceptual knowledge of clinical work. Remarkably, 61.5% of students reported that their early clinical exposure had significantly assisted them. 60

Interestingly, a higher proportion of PhD students in our study had a CT disposition. This may be because doctoral research is more in-depth and complex, requiring students to engage in more detailed, rigorous and innovative thinking based on their existing knowledge. During the research process, doctoral students must constantly question, analyse, evaluate and reconstruct knowledge, which undoubtedly exercises and enhances their CT abilities. 61 However, this does not imply that master’s students possess lower CT skills than doctoral students: the master’s programme also emphasises cultivating CT, although possibly differing in depth and breadth, and both stages have unique development paths and manifestations of CT. Regardless of the stage, graduate students should focus on developing their CT skills to address challenges in academic research and life.

Our research revealed that factors influencing CT motivation appear to be more closely linked to CT tendencies in both personal and educational components. Miele and Wigfield 50 suggested that the factors affecting students’ critical analytical thinking motivation can be divided into two aspects: quantity and quality. The quantitative relationship between motivation and CT concerns whether students have sufficient motivation to make high-level mental efforts; in our study, this is reflected in curiosity, burn-out, distraction, an interdependent self-view and influence by research team members. The qualitative relationship is the willingness of students to engage in CT, which corresponds here to the desire to ask ‘why’ and to refer to existing evidence when completing a research design. This suggests that internal motivation may play an essential role in CT and that educators should focus more on maintaining students’ motivation and building awareness than on the frequency of rigid external research training and curriculum formats. Students should be actively encouraged to apply CT in practice. At the same time, the existing, overly outcome-oriented reward mechanism should be changed and assessment criteria enriched, for example, by including ‘whether you ask interesting questions’ as one of the criteria for classroom assessment, to motivate people to become more proactive learners. Medical education has recently garnered considerable attention and traditionally assumes that medical students are inherently motivated by their dedication to specialised training and a highly focused profession. However, motivation plays a crucial role in determining the quality of learning and ultimate success; its absence may explain why teachers occasionally encounter medical students who appear discouraged, have lost interest or abandon their studies, feeling a sense of powerlessness or resignation. 62

To foster CT among medical students, educational reform should encompass several key aspects: (1) Encouraging active learning and exploration: Teachers must urge students to engage actively in the learning process, providing resources and guidance to kindle their intellectual curiosity. This will empower students to seek out challenges, pose inquiries and address them through a critical lens. 63 (2) Implementing heuristic learning and case studies: Educators should incorporate case studies, enabling students to hone their CT, discriminatory skills and decision-making abilities by analysing authentic or hypothetical scenarios. 64 65 (3) Stressing the mastery of professional knowledge: It is imperative to ensure that students grasp the fundamental theories and principles of the medical field, along with proficiency in practical medical skills. 66 (4) Nurturing teamwork skills: Group discussions, collaborative projects and similar activities should be used to cultivate teamwork among medical students. This teaches them to listen attentively, manage team dynamics, and allocate resources effectively, enhancing their CT and problem-solving capabilities. 67 (5) Providing clinical practical experience: Early exposure to clinical practice is crucial in developing students’ analytical and problem-solving skills through firsthand observation and participation in real-life case management. 68 (6) Shifting teachers’ roles: Educators must evolve into mentors and role models for CT, leading by example and inspiring students through their practices and teachings. 69 Collectively, these recommendations for educational reform will empower medical students to address intricate issues they may encounter in their future medical careers, ultimately increasing the quality and safety of healthcare services.

It is worth noting that our questionnaire incorporated many potential entries with high reliability. Most of these also showed differences between the groups with and without CT disposition in the univariate analysis but were not ultimately retained in the regression models. These factors are meaningful for the development of CT, but for the sake of the simplicity and informativeness of the model, other entries in the model may have represented them. Our model explained more of the variance in CT than the regression models of previous studies. 49 70 71

Strengths and limitations

This study has particular strengths. First, the questionnaire for this study was scientific and practice based: the findings of previous studies on personal and educational factors were extensively referenced, and in-depth interviews were also conducted. Second, our study focused on postgraduate medical students and the sample size was relatively large. Postgraduate medical students are the key group for CT development, and findings obtained among them are more relevant and better reflect their thinking characteristics. Research from China also considerably enriches the worldwide sample of CT influencing factors: it has been suggested that cultural context strongly influences CT, 72 but previous research on CT has mostly focused on Europe, the USA and Japan, so researching CT in Chinese populations is a valuable addition to this area. In addition, this study is the first to quantitatively explore the impact of tutors and teams on CT disposition. For Chinese postgraduates, tutors and their scientific research teams are the people with whom they have the most contact during their studies. In our previous interviews, educators, tutors and postgraduates all recognised the vital role of tutors in postgraduate education, especially in the cultivation of thinking. Based on the interviews and literature extraction, we summarised the specific influence of tutors and teams and presented it as numerical indicators, refining the tutor-related educational factors to make them more comprehensive and exact.

There are several limitations to our study. First, given the inherent constraints of cross-sectional studies, the findings cannot serve as direct evidence of a causal relationship between potential influences and outcomes, though they can provide clues for revealing causal relationships. Second, some influencing factors, such as participation in project submissions, participation in CT courses, attempts at innovation and entrepreneurship, and exchange abroad, may be poorly measured because of limited educational resources: most students lacked the opportunity to participate in these activities even if they wished to, which may obscure the correlation between CT and these factors. Our regression models did not include other factors of the same type with higher coverage, such as article writing. This suggests that these specific formal factors do not significantly influence CT disposition and that this bias may not affect the overall results. In addition, we did not use the CTDI-CV scale. Given the busy workload of postgraduate medical students and the difficulty of monitoring and quality-controlling online surveys, and to limit the impact of an overlong questionnaire on study quality while increasing the response rate, we used a condensed version of the Critical Thinking Scale, which has a greater total explained variance than the CTDI-CV scale and good reliability and validity.

Conclusions

In conclusion, this study provides a comprehensive scientific assessment of the factors influencing the CT disposition of Chinese medical postgraduates in terms of personal and educational factors. Being curious, completing the scientific research design with reference, asking ‘why’ and having high logical thinking ability among team members were positively associated with CT. Exhaustion and burn-out, inattention, following others’ opinions in decision-making and not being allowed to doubt tutors were negatively associated with CT scores. These findings suggest paying more attention to factors related to motivation and internal drive in educational practice, shifting away from an overly outcome-focused reward mechanism and focusing on motivation maintenance to build students’ CT awareness.

Ethics statements

Patient consent for publication

Not applicable.

Ethics approval

The research team collected data after obtaining their consent and signatures on the study’s informed consent form. The Ethics Committee of West China Hospital (tertiary), Sichuan University, approved the study in 2021 (Ethics No. 980).

Acknowledgments

The authors want to acknowledge the medical students who participated in this study.


LW and WC contributed equally.

Contributors LW and WC were involved in designing the study, reviewing the literature, designing the protocol, developing the questionnaire, collecting the data, performing the statistical analysis and preparing the manuscript. TH and W-BH were involved in searching and collecting the data. YW was involved in interpreting the data and critically reviewed the manuscript. YW is responsible for the overall content as the guarantor . All the authors have read and approved the final manuscript.

Funding This study was supported by the Sichuan University Postgraduate Education Reform project (GSSCU2021038).

Competing interests None declared.

Patient and public involvement Patients and/or the public were involved in the design, or conduct, or reporting, or dissemination plans of this research. Refer to the Methods section for further details.

Provenance and peer review Not commissioned; externally peer reviewed.

Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.



  • Open access
  • Published: 31 August 2024

Development and validation of a higher-order thinking skills (HOTS) scale for major students in the interior design discipline for blended learning

  • Dandan Li 1 ,
  • Xiaolei Fan 2 &
  • Lingchao Meng 3  

Scientific Reports volume 14, Article number: 20287 (2024)


  • Environmental social sciences

Assessing and cultivating students’ HOTS are crucial for interior design education in a blended learning environment. However, current research has focused primarily on the impact of blended learning instructional strategies, learning tasks, and activities on the development of HOTS, whereas few studies have specifically addressed the assessment of these skills through dedicated scales in the context of blended learning. This study aimed to develop a comprehensive scale for assessing HOTS in interior design major students within the context of blended learning. Employing a mixed methods design, the research involved in-depth interviews with 10 education stakeholders to gather qualitative data, which informed the development of a 66-item soft skills assessment scale. The scale was administered to a purposive sample of 359 undergraduate students enrolled in an interior design program at a university in China. Exploratory and confirmatory factor analyses were also conducted to evaluate the underlying factor structure of the scale. The findings revealed a robust four-factor model encompassing critical thinking skills, problem-solving skills, teamwork skills, and practical innovation skills. The scale demonstrated high internal consistency (Cronbach's alpha = 0.948–0.966) and satisfactory convergent and discriminant validity. This scale provides a valuable instrument for assessing and cultivating HOTS among interior design major students in blended learning environments. Future research can utilize a scale to examine the factors influencing the development of these skills and inform instructional practices in the field.
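As a rough, assumption-laden illustration of one step in the factor-structure evaluation mentioned above, the sketch below applies the Kaiser eigenvalue-greater-than-one criterion to a synthetic two-factor item set. The item counts, loadings and sample size are invented; the study itself used full exploratory and confirmatory factor analyses, which this minimal check does not replicate:

```python
import numpy as np

rng = np.random.default_rng(3)
# Synthetic responses: two latent factors, each driving four of eight items
latent = rng.normal(size=(300, 2))
loadings = np.zeros((2, 8))
loadings[0, :4] = 0.8
loadings[1, 4:] = 0.8
items = latent @ loadings + rng.normal(scale=0.5, size=(300, 8))

# Kaiser criterion: retain factors whose correlation-matrix eigenvalue exceeds 1
eigvals = np.linalg.eigvalsh(np.corrcoef(items, rowvar=False))[::-1]
n_factors = int((eigvals > 1).sum())
print(n_factors)  # 2 under this two-factor simulation
```

In practice, eigenvalue screening of this kind typically precedes a full exploratory factor analysis with rotation, after which confirmatory factor analysis on a separate sample tests the retained structure, as the study describes.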


Introduction

In the contemporary landscape of the twenty-first century, students face numerous challenges that necessitate the development of competitive skills, with a particular emphasis on the cultivation of HOTS 1,2,3; this has become a crucial objective in educational reform. Notably, the National Education Association (NEA, 2012) has clearly identified critical thinking and problem-solving, communication, collaboration, creativity, and innovation as key competencies that students must possess in the current era, and these are considered important components of twenty-first century skills 4,5,6,7. As learners in the fields of creativity and design, students in the interior design profession also need HOTS to address complex design problems and the evolving demands of the industry 8,9.

Currently, blended learning has become an important instructional model in interior design education 10,11. It is a teaching approach that combines traditional face-to-face instruction with online learning, providing students with a more flexible and personalized learning experience 12,13. Indeed, several scholars have recognized the benefits of blended learning in providing students with diverse learning resources, activities, and opportunities for interaction, thereby fostering HOTS 14,15,16,17. For example, studies conducted by Anthony et al. 10 and Castro 11 have demonstrated the efficacy of blended learning in enhancing students' HOTS. The integration of online resources, virtual practices, and online discussions in blended learning fosters active student engagement and improves critical thinking, problem-solving, and creative thinking skills. Teachers therefore need to determine appropriate assessment methods and construct corresponding assessment tasks to assess students' expected learning outcomes. This requires teachers to have a clear understanding of students' learning progress and the development of various skills, whereas students typically know only their scores and lack awareness of their individual skill development 18,19.

Nevertheless, the precise assessment of students' HOTS in the blended learning milieu poses a formidable challenge. The dearth of empirically validated assessment tools impedes researchers from effectively discerning students' levels of cognitive aptitude and developmental growth within the blended learning realm 20,21,22. In addition, from the perspective of actual research topics, current studies on blended learning focus mainly on the "concept, characteristics, mechanisms, models, and supporting technologies of blended learning" 23. Research on measuring students' HOTS in blended learning is relatively limited, with most of the focus being on elementary, middle, and high school students 24,25. Few studies have specifically examined HOTS measurement in the context of university students 26,27, particularly in practical disciplines such as interior design. Moreover, Bervell et al. 28 suggested that the lack of high-quality assessment scales inevitably impacts the quality of research. Additionally, Schmitt 29 proposed the "Three Cs" principle for measurement, comprising clarity, coherence, and consistency: high-quality assessment scales should possess clear and specific measurement objectives, logically coherent items, and consistent measurement results to ensure the reliability and validity of the data. This underscores the importance of aligning the measurement goals of assessment scales with the research questions and the content of the discipline.

The development of an assessment scale within the blended learning environment is expected to address the existing gap in measuring and assessing HOTS scores in interior design education. This scale not only facilitates the assessment of students' HOTS but also serves as a guide for curriculum design, instructional interventions, and student support initiatives. Ultimately, the integration of this assessment scale within the blended learning environment has the potential to optimize the development of HOTS among interior design students, empowering them to become adept critical thinkers, creative problem solvers, and competent professionals in the field.

Therefore, this study follows a scientific scale development procedure to develop an assessment scale specifically designed to measure the HOTS of interior design students in blended learning environments. This endeavor aims to provide educators with a reliable instrument for assessing students' progress in cultivating and applying HOTS, thus enabling the implementation of more effective teaching strategies and enhancing the overall quality of interior design education. The research questions are as follows:

What key dimensions should be considered when developing a HOTS assessment scale to accurately capture students' HOTS in an interior design major blended learning environment?

How can a HOTS assessment scale for blended learning in interior design be developed?

How can the reliability and validity of the HOTS assessment scale be verified and ensured, and is the scale reliable and effective in the interior design major blended learning environment?

Key dimensions of HOTS assessment scale in an interior design major blended learning environment

The research results indicate that, in the blended learning environment of interior design, 16 initial codes represent key dimensions for enhancing students' HOTS. These codes were further grouped into 8 main categories and 4 overarching themes: critical thinking skills, problem-solving skills, teamwork skills, and practical innovation skills. They provide valuable insights for data comprehension and analysis, serving as a comprehensive framework for the HOTS scale. Analyzing category frequency and assessing its significance and universality in a qualitative dataset holds significant analytical value 30,31. High-frequency terms indicate the central position of specific categories in participants' narratives, texts, and other data forms 32. In the interviews with interior design experts and teachers, every core category was mentioned more than 20 times, providing compelling evidence of its universality and importance within the HOTS dimensions of interior design, as shown in Table 1.

Theme 1: critical thinking skills

Critical thinking skills constitute a key core category in blended learning environments for interior design and are crucial for cultivating students' HOTS. This finding underscores the importance of critical thinking in interior design learning. The theme mainly comprises the categories of logical reasoning and judgment, and skepticism and reflection, each with a frequency of more than 8, highlighting the importance of critical thinking skills. Therefore, a detailed discussion of each category is warranted, as shown in Table 2.

Category 1: logical reasoning and judgment

The research results indicate that in a blended learning environment for interior design, logical reasoning and judgment play a key role in cultivating critical thinking skills. Logical reasoning refers to inferring reasonable conclusions from information through analysis and evaluation 33 . Judgment is based on logic and evidence for decision-making and evaluation. The importance of these concepts lies in their impact on the development and enhancement of students' HOTS. According to the research results, interior design experts and teachers unanimously believe that logical reasoning and judgment are very important. For example, as noted by Interviewee 1, “For students, logical reasoning skills are still very important. Especially in indoor space planning, students use logical reasoning to determine whether the layout of different functional areas is reasonable”. Similarly, Interviewee 2 also stated that “logical reasoning can help students conduct rational analysis of various design element combinations during the conceptual design stage, such as color matching, material selection, and lighting application”.

As emphasized by interviewees 1 and 2, logical reasoning and judgment are among the core competencies of interior designers in practical applications. These abilities enable designers to analyze and evaluate design problems and derive reasonable solutions from them. In the interior design industry, being able to conduct accurate logical reasoning and judgment is one of the key factors for success. Therefore, through targeted training and practice, students can enhance their logical thinking and judgment, thereby better addressing design challenges and providing innovative solutions.

Category 2: skepticism and reflection

Skepticism and reflection play crucial roles in cultivating students' critical thinking skills in a blended learning environment for interior design. Skepticism prompts students to question and explore information and viewpoints, whereas reflection helps students think deeply about and evaluate their own thinking processes 34. These abilities are crucial for cultivating students' higher-order thinking skills. According to the research findings, most interior design experts and teachers agree that skepticism and reflection are crucial. For example, as noted by Interviewee 3, "Sometimes, when facing learning tasks, students will think about how to better meet the needs of users". Interviewee 4 also agreed with this viewpoint. As emphasized by Interviewees 3 and 4, skepticism and reflection are among the core competencies of interior designers in practical applications. These abilities enable designers to question existing perspectives and practices and propose innovative design solutions through in-depth thinking and evaluation. Therefore, in the interior design industry, designers with the ability to question and reflect are better able to respond to complex design needs and provide clients with unique and valuable design solutions.

Theme 2: problem-solving skills

The research findings indicate that problem-solving skills constitute a key core category in blended learning environments for interior design and are crucial for cultivating students' HOTS. This finding underscores the importance of problem-solving skills in interior design learning. Specifically, the categories of identifying and defining problems and of developing and implementing plans were each mentioned more than 8 times, highlighting the importance of problem-solving skills. Therefore, it is necessary to discuss each category in detail to better understand and cultivate students' problem-solving skills, as shown in Table 3.

Category 1: identifying and defining problems

The research findings indicate that in a blended learning environment for interior design, identifying and defining problems play a crucial role in fostering students' problem-solving skills. Identifying and defining problems require students to possess the ability to analyze and evaluate problems, enabling them to accurately determine the essence of the problems and develop effective strategies and approaches to solve them 35 . Interior design experts and teachers widely recognize the importance of identifying and defining problems as core competencies in interior design practice. For example, Interviewee 5 emphasized the importance of identifying and defining problems, stating, "In interior design, identifying and defining problems is the first step in addressing design challenges. Students need to be able to clearly identify the scope, constraints, and objectives of the problems to engage in targeted thinking and decision-making in the subsequent design process." Interviewee 6 also supported this viewpoint. As stressed by Interviewees 5 and 6, identifying and defining problems not only require students to possess critical thinking abilities but also necessitate broad professional knowledge and understanding. Students need to comprehend principles of interior design, spatial planning, human behavior, and other relevant aspects to accurately identify and define problems associated with design tasks.

Category 2: developing and implementing a plan

The research results indicate that in a blended learning environment for interior design, developing and implementing plans plays a crucial role in cultivating students' problem-solving abilities. Developing and implementing a plan refers to students identifying and defining problems, devising specific solutions, and translating them into concrete implementation plans. Specifically, after determining the design strategy, students refine it into specific implementation steps and timelines, including drawing design drawings, organizing PPT reports, and presenting design proposals. For example, Interviewee 6 noted, "Students usually break down design strategies into specific tasks and steps by refining them." The other interviewees unanimously supported this viewpoint. As emphasized by Interviewee 6, developing and implementing plans helps students remain organized, systematic, and goal-oriented when solving problems, thereby enhancing their problem-solving skills.

Theme 3: teamwork skills

The research results indicate that teamwork skills constitute a key core category in blended learning environments for interior design and are crucial for cultivating students' HOTS. This finding underscores the importance of teamwork skills in interior design learning. The theme mainly comprises communication and coordination, as well as division of labor and collaboration, both of which are mentioned frequently in the interview documents. Therefore, it is necessary to discuss each category in detail to better understand and cultivate students' teamwork skills, as shown in Table 4.

Category 1: communication and coordination

The research results indicate that communication and coordination play crucial roles in cultivating students' teamwork abilities in a blended learning environment for interior design. Communication and coordination refer to the ability of students to effectively share information, understand each other's perspectives, and work together to solve problems 36. Specifically, team members need to understand each other's resource advantages and to integrate and share these resources to improve work efficiency and project quality. For example, Interviewee 7 noted, "In interior design, one member may be skilled in spatial planning, while another member may be skilled in color matching. Through communication and collaboration, team members can collectively utilize this expertise to improve work efficiency and project quality." The other interviewees unanimously agreed that such communication promotes students' teamwork skills and thereby the development of their HOTS. As these interviewees emphasized, communication and coordination enable team members to collectively solve problems and overcome challenges. Through effective communication, team members can exchange opinions and suggestions, offer different solutions, and make joint decisions. Collaboration and cooperation among team members contribute to brainstorming and finding the best solution.

Category 2: division of labor and collaboration

The research results indicate that in the blended learning environment of interior design, division of labor and collaboration play crucial roles in cultivating students' teamwork ability. Division of labor and collaboration refer to the ability of team members to assign different tasks and roles in a project based on their respective expertise and responsibilities and to work together to complete the project 37. For example, Interviewee 8 noted, "In an interior design project, some students are responsible for space planning, some students are responsible for color matching, and some students are responsible for rendering production." The other interviewees also supported this viewpoint. As emphasized by Interviewee 8, division of labor and collaboration help team members fully utilize their respective expertise and abilities, promote resource integration and complementarity, and cultivate a spirit of teamwork, enabling team members to collaborate with, support, and trust each other in pursuit of shared project goals.

Theme 4: practical innovation skills

The research results indicate that practical innovation skills constitute a key core category in blended learning environments for interior design and are crucial for cultivating students' HOTS. This finding underscores the importance of practical innovation skills in interior design learning. The theme mainly comprises creative conception and design expression, as well as the innovative application of materials and technology, both of which are mentioned often in the interview documents. Therefore, it is necessary to discuss each category in detail to better understand and cultivate students' practical innovation skills, as shown in Table 5.

Category 1: creative conception and design expression

The research results indicate that in the blended learning environment of interior design, creative ideation and design expression play crucial roles in cultivating students' practical and innovative skills. Creative ideation and design expression refer to the ability of students to break free from traditional thinking frameworks and try different design ideas and methods; such ideation helps stimulate their creativity and cultivates their ability to think independently and solve problems. For example, Interviewee 10 noted that "blended learning environments combine online and offline teaching modes, allowing students to acquire knowledge and skills more flexibly. Through learning and practice, students can master various expression tools and techniques, such as hand-drawn sketches, computer-aided design software, model making, etc., thereby more accurately conveying their design concepts." Other interviewees endorsed this viewpoint, emphasizing that the role of creative ideation and design expression in blended learning environments cannot be ignored. As emphasized by Interviewee 10, creative ideation and design expression in the blended learning environment of interior design not only enhance students' creative thinking and problem-solving abilities but also strengthen their application skills in practical projects through diverse expression tools and techniques. The cultivation of these skills is crucial for students' success in their future careers.

Category 2: innovative application of materials and technology

Research findings indicate that the innovative application of materials and technology plays a crucial role in developing students' practical and creative skills within a blended learning environment for interior design. The innovative application of materials and technology refers to students' exploration and utilization of new materials and advanced technologies, enabling them to overcome the limitations of traditional design thinking and experiment with diverse design methods and approaches. This process not only stimulates their creativity but also significantly enhances their problem-solving skills. Specifically, it involves students gaining a deep understanding of the properties of new materials and their application methods in design, as well as becoming proficient in various advanced technological tools and equipment, such as 3D printing, virtual reality (VR), and augmented reality (AR). These skills enable students to more accurately realize their design concepts and effectively apply them in real-world projects.

For example, Interviewee 1 stated, "The blended learning environment combines online and offline teaching modes, allowing students to flexibly acquire the latest knowledge on materials and technology and apply these innovations in real projects." Other interviewees also emphasized this view; the importance of the innovative application of materials and technology in a blended learning environment should therefore not be underestimated. As Interviewee 1 emphasized, this process not only enables students to flexibly acquire the latest materials and technical knowledge but also enables them to apply these innovations in real-world projects, thereby improving their practical abilities and professional competence.

In summary, the investigation of research question 1 shows that the dimensions of the HOTS assessment scale in blended learning for interior design include four main aspects: critical thinking skills, problem-solving skills, teamwork skills, and practical innovation skills. Based on the assessment scales developed by previous scholars for these dimensions, the researcher developed a HOTS assessment scale suitable for blended learning environments in interior design and collected feedback from interior design experts through interviews.

Development of the HOTS assessment scale

The above research results indicate that the dimensions of the HOTS scale mainly include critical thinking skills, problem-solving skills, teamwork skills, and practical innovation skills. The dimensions of a scale represent the abstract characteristics and structure of the concept being measured. Since these dimensions are often abstract and difficult to measure directly, they need to be converted into several concrete indicators that can be directly observed or self-reported 38. These concrete indicators, known as dimension items, operationalize the abstract dimensions, allowing various aspects of the concept to be measured and evaluated. This process transforms the abstract dimensions into specific, measurable components. The following content draws on the results of research question 1 to develop the HOTS assessment scale for blended learning in interior design.

Dimension 1: critical thinking skills

The research results indicate that critical thinking skills constitute a key core category in blended learning environments for interior design and are crucial for cultivating students' HOTS. Critical thinking skills refer to the ability to analyze information objectively and make a reasoned judgment 39 . Scholars tend to emphasize this concept as a method of general skepticism, rational thinking, and self-reflection 7 , 40 . For example, Goodsett 26 suggested that it should be based on rational skepticism and careful thought about external matters as well as open self-reflection about internal thoughts and actions. Moreover, the California Critical Thinking Disposition Inventory (CCTDI) is widely used to measure critical thinking skills, including dimensions such as seeking truth, confidence, questioning and courage to seek truth, curiosity and openness, as well as analytical and systematic methods 41 . In addition, maturity means continuous adjustment and improvement of a person's cognitive system and learning activities through continuous awareness, reflection, and self-awareness 42 . Moreover, Nguyen 43 confirmed that critical thinking and cognitive maturity can be achieved through these activities, emphasizing that critical thinking includes cognitive skills such as analysis, synthesis, and evaluation, as well as emotional tendencies such as curiosity and openness.

In addition, in a blended learning environment for interior design, critical thinking skills help students better understand, evaluate, and apply design knowledge and skills, cultivating independent thinking and innovation abilities 44 . If students lack these skills, they may accept superficial information and solutions without sufficient thinking and evaluation, resulting in the overlooking of important details or the selection of inappropriate solutions in the design process. Therefore, for the measurement of critical thinking skills, the focus should be on cognitive skills such as analysis, synthesis, and evaluation, as well as curiosity and open mindedness. The specific items for critical thinking skills are shown in Table 6 .

Dimension 2: problem-solving skills

Problem-solving skills constitute a key core category in blended learning environments for interior design and are crucial for cultivating students' HOTS. Problem-solving skills involve the ability to analyze and solve problems by understanding them, identifying their root causes, and developing appropriate solutions 45. According to the 5E-based STEM education approach, problem-solving skills encompass the following abilities: problem identification and definition, formulation of problem-solving strategies, problem representation, resource allocation, and monitoring and evaluation of solution effectiveness 7,46. Moreover, D'Zurilla and Nezu 47 and Tan 48 indicated that problem solving is demonstrated through the attitudes, beliefs, and knowledge-based skills applied during the process, as well as through the quality of the proposed solutions and their observable outcomes. In addition, D'Zurilla and Nezu devised the Social Problem-Solving Inventory (SPSI), which comprises seven subscales: cognitive response, emotional response, behavioral response, problem identification, generation of alternative solutions, decision-making, and solution implementation. Based on these research results, the problem-solving skills items designed in this study are shown in Table 7.

Dimension 3: teamwork skills

The research results indicate that teamwork skills constitute a key core category in blended learning environments for interior design and are crucial for cultivating students' HOTS. Teamwork skills refer to the ability to effectively collaborate, coordinate, and communicate with others in a team environment 49. For example, the Teamwork Skills Assessment Tool (TWKSAT) developed by Stevens and Campion 50 identifies five core dimensions of teamwork: conflict management, collaborative problem-solving, communication, goal setting and performance management, and planning and task coordination. The design of this tool highlights the essential skills in teamwork and provides a structured approach for evaluating them. The authors further indicated that successful teams need a range of problem-solving-related skills, including situational control, conflict management, decision-making and coordination, monitoring and feedback, and an open mindset. These skills help team members effectively address complex challenges and demonstrate the team's collaboration and flexibility. Therefore, the assessment of learners' teamwork skills needs to cover the above aspects, as shown in Table 8.

Dimension 4: practical innovation skills

The research results indicate that practical innovation skills constitute a key core category in blended learning environments for interior design and are crucial for cultivating students' HOTS. Practical innovation skills encompass the utilization of creative cognitive processes and problem-solving strategies to generate original ideas, solutions, and approaches 51. This dimension places significant emphasis on two critical aspects: creative conception and design expression, and the innovative application of materials and technology. Tang et al. 52 indicated that creative conception and design expression involve the generation and articulation of imaginative and inventive ideas within a given context. With the introduction of concepts such as twenty-first-century learning skills, the "5C" competency framework, and core student competencies, blended learning has emerged as a goal and direction of educational reform. It aims to promote the development of students' HOTS, equipping them with the essential qualities and key abilities needed for lifelong development and societal advancement. Blended learning not only emphasizes the mastery of core learning content but also requires students to develop critical thinking, complex problem-solving, creative thinking, and practical innovation skills. To adapt to the changes and developments in the blended learning environment, this study designed 13 preliminary test items based on twenty-first-century learning skills, the "5C" competency framework, core student competencies, and the Torrance Tests of Creative Thinking (TTCT) developed by Torrance 53. These items aim to assess students' practical innovation skills within a blended learning environment, as shown in Table 9.

The results indicate a consensus among the interviewed expert participants that the structural integrity of the scale is satisfactory and does not require modification. However, certain measurement items were identified as problematic and required revision. The primary recommendations were as follows. Within the domain of problem-solving skills, the item "I usually conduct classroom and online learning with questions and clear goals" was deemed biased because of its emphasis on the "online" environment. Consequently, the evaluation panel advised splitting this item into two separate components: (1) "I am adept at frequently adjusting and reversing a negative team atmosphere" and (2) "I consistently engage in praising and encouraging others, fostering harmonious relationships." After these revisions and adjustments, the pilot test scale comprised 66 observable items, expanded from the original 65. In addition, there were other suggestions concerning linguistic formulation and phraseology, which are not expounded upon here.

Verify the effectiveness of the HOTS assessment scale

The research results indicate significant differences between the high- and low-score groups in the average scores on all four dimensions of the HOTS scale: critical thinking skills (items A1–A24), problem-solving skills (items B1–B13), teamwork skills (items C1–C16), and practical innovation skills (items D1–D13). This suggests that each item has discriminative power, as explained in the following sections.

Item analysis based on the CR value

The critical ratio (CR) method, which uses the CR value (decision value) to remove measurement items with poor discrimination, is the most commonly used method in item analysis. First, the modified pilot test scale data were aggregated and sorted by total score. Respondents in the top and bottom 27% of the distribution were then selected, constituting 66 respondents in each group: the high-score group comprised individuals with a total score of 127 or above, whereas the low-score group comprised individuals with a total score of 99 or below. Finally, an independent samples t test was conducted to determine whether the mean score for each item differed significantly between the high-score and low-score groups. The statistical results are presented in Table 10.

The above table shows that independent sample t tests were conducted for all the items; their t values were greater than 3, and their p values were less than 0.001, indicating that the difference between the highest and lowest 27% of the samples was significant and that each item had discriminative power.
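The critical-ratio procedure described above can be sketched in a few lines of Python. This is an illustrative sketch on simulated Likert-scale data, not the study's dataset: the function name, the sample size, and the data-generating model are assumptions, and `numpy`/`scipy` are assumed to be available.

```python
import numpy as np
from scipy import stats

def critical_ratio(scores, item, cut=0.27):
    """Independent samples t-test of one item's mean between the
    top and bottom `cut` fraction of respondents, ranked by total score."""
    totals = scores.sum(axis=1)
    order = np.argsort(totals)
    n = max(1, int(len(totals) * cut))
    low, high = order[:n], order[-n:]
    return stats.ttest_ind(scores[high, item], scores[low, item])

# Simulated 5-point Likert responses: 300 respondents x 10 items,
# all driven by one latent trait so every item should discriminate.
rng = np.random.default_rng(0)
trait = rng.normal(0.0, 1.0, 300)
scores = np.clip(np.round(3 + trait[:, None] + rng.normal(0.0, 1.0, (300, 10))), 1, 5)

t, p = critical_ratio(scores, item=0)
# Items are retained when t > 3 and p < 0.001, as in the study.
```

Because the simulated items share a common trait, the high/low extreme groups differ sharply on every item, which is exactly the pattern the CR criterion is designed to detect.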

In summary, based on previous research and relevant theories, the HOTS scale for interior design was revised. This revision process involved interviews with interior design experts, teachers, and students, followed by item examination and homogeneity testing via the critical ratio (CR) method. The results revealed significant correlations (p < 0.01) between all the items and the total score, with correlation coefficients (R) above 0.4. Therefore, the scale exhibits good accuracy and internal consistency in measuring HOTS. These findings provide a reliable foundation for further research and practical applications.
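The homogeneity checks reported here (item-total correlations above 0.4 and high internal consistency) can be illustrated with a minimal sketch on simulated data. One common variant, the corrected item-total correlation (an item against the sum of the remaining items), is shown; the study may have used the simple item-total correlation instead, and all names and the data-generating model below are assumptions.

```python
import numpy as np

def cronbach_alpha(scores):
    """alpha = k/(k-1) * (1 - sum(item variances) / variance(total score))."""
    k = scores.shape[1]
    item_var = scores.var(axis=0, ddof=1).sum()
    total_var = scores.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_var / total_var)

def corrected_item_total(scores, item):
    """Correlate one item with the sum of the remaining items."""
    rest = scores.sum(axis=1) - scores[:, item]
    return np.corrcoef(scores[:, item], rest)[0, 1]

# Simulated subscale: 300 respondents x 8 items sharing one trait.
rng = np.random.default_rng(1)
trait = rng.normal(0.0, 1.0, 300)
scores = trait[:, None] + rng.normal(0.0, 1.0, (300, 8))

alpha = cronbach_alpha(scores)            # internal consistency
r = corrected_item_total(scores, item=0)  # item homogeneity
```

On this kind of single-trait data, alpha is high and each item correlates well above 0.4 with the rest of the scale, mirroring the criteria applied in the study.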

Pilot study exploratory factor analysis

This study used SPSS (version 28) to conduct the KMO and Bartlett tests on the scale. KMO values and Bartlett's tests of sphericity were first calculated for the total HOTS scale and for each of the four subscales to ensure that the sample data were suitable for factor analysis 7. The overall KMO value is 0.946, indicating that the data are highly suitable for factor analysis. Additionally, Bartlett's test of sphericity was significant (p < 0.05), further supporting the appropriateness of conducting factor analysis. All subscale KMO values are above 0.7, indicating that the data for these subscales are also suitable for factor analysis. According to Javadi et al. 54, these results suggest the presence of shared factors among the items within the subscales, as shown in Table 11.
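For readers without SPSS, both suitability checks can be computed from their textbook formulas. The sketch below is a minimal NumPy/SciPy implementation run on simulated single-factor data; it is not the study's computation, and the function name and simulation parameters are assumptions chosen purely for illustration.

```python
import numpy as np
from scipy import stats

def kmo_and_bartlett(X):
    """Kaiser-Meyer-Olkin measure and Bartlett's test of sphericity."""
    n, p = X.shape
    R = np.corrcoef(X, rowvar=False)
    # Bartlett: chi2 = -((n - 1) - (2p + 5)/6) * ln|R|, df = p(p-1)/2
    chi2 = -(n - 1 - (2 * p + 5) / 6) * np.log(np.linalg.det(R))
    pval = stats.chi2.sf(chi2, p * (p - 1) / 2)
    # KMO: squared correlations vs squared partial (anti-image) correlations
    inv = np.linalg.inv(R)
    d = np.sqrt(np.outer(np.diag(inv), np.diag(inv)))
    partial = -inv / d
    off = ~np.eye(p, dtype=bool)
    kmo = (R[off] ** 2).sum() / ((R[off] ** 2).sum() + (partial[off] ** 2).sum())
    return kmo, chi2, pval

# Simulated data: 359 respondents, 6 items sharing a single factor.
rng = np.random.default_rng(2)
factor = rng.normal(0.0, 1.0, (359, 1))
X = factor + 0.6 * rng.normal(0.0, 1.0, (359, 6))

kmo, chi2, pval = kmo_and_bartlett(X)
# A KMO near 1 and a significant Bartlett test support factor analysis.
```

When items genuinely share a factor, as simulated here, the partial correlations shrink relative to the raw correlations, pushing the KMO well above the conventional 0.7 threshold.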

For each subscale, exploratory factor analysis was conducted to extract factors with eigenvalues greater than 1 while eliminating items with communalities less than 0.30, items with loadings less than 0.50, and items that loaded on more than one common factor 55 , 56 . Additionally, items that were inconsistent with the assumed structure of the measure were identified and eliminated to ensure the best structural validity. These principles were applied to the factor analysis of each subscale, ensuring that the extracted factor structure and observed items are consistent with the hypothesized measurement structure and analysis results 55 , 58 . In the exploratory factor analysis (EFA), the latent variables were clearly interpretable, and the cumulative explained variances of the common factors exceeded 60%. This finding confirms the alignment between the scale structure, comprising the remaining items, and the initial theoretical framework proposed in this study. Additionally, the items were systematically reorganized to construct the final questionnaire: items A1 to A24 measure the critical thinking skills dimension, items B25 to B37 measure problem-solving skills, items C38 to C53 measure teamwork skills, and items D54 to D66 measure practical innovation skills, as shown in Table 12 below.
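The three retention rules can be expressed as a small filter. The 0.30 communality and 0.50 loading thresholds come from the text; treating loadings of 0.40 or more as "salient" when checking for cross-loading is an assumption, borrowed from the 0.40 cutoff applied to the rotated loading matrix later in the analysis.

```python
def retain_item(loadings, communality,
                min_communality=0.30, min_loading=0.50, salient_cutoff=0.40):
    """Decide whether an item survives the pilot EFA screening.

    loadings: the item's loadings on each extracted common factor.
    An item is kept only if (1) its communality is at least 0.30,
    (2) its largest absolute loading is at least 0.50, and
    (3) it loads saliently on exactly one factor (no cross-loading)."""
    if communality < min_communality:
        return False
    if max(abs(l) for l in loadings) < min_loading:
        return False
    salient = [l for l in loadings if abs(l) >= salient_cutoff]
    return len(salient) == 1
```

For example, an item loading 0.72 on one factor and below 0.15 on the rest is kept, whereas an item loading 0.55 and 0.45 on two factors is dropped as a cross-loader.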

In addition, the criterion for extracting principal components in factor analysis is typically based on eigenvalues, with values greater than 1 indicating greater explanatory power than individual variables. The variance contribution ratio reflects the proportion of variance explained by each principal component relative to the total variance and signifies the ability of the principal component to capture comprehensive information. The cumulative variance contribution ratio measures the accumulated proportion of variance explained by the selected principal components, aiding in determining the optimal number of components to retain while minimizing information loss. The above table shows that four principal components can be extracted from the data, and their cumulative variance contribution rate reaches 59.748%.
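The variance contribution ratios described above follow directly from the eigenvalues. A small sketch, using hypothetical eigenvalues rather than the study's actual values:

```python
def variance_contribution(eigenvalues, n_items):
    """For standardized items the total variance equals the number of items,
    so each component's variance contribution ratio is eigenvalue / n_items;
    the cumulative ratio is the running sum over the retained components."""
    ratios = [ev / n_items for ev in eigenvalues]
    cumulative, running = [], 0.0
    for r in ratios:
        running += r
        cumulative.append(running)
    return ratios, cumulative
```

Components are retained while the eigenvalue exceeds 1 and the cumulative ratio keeps rising meaningfully, which is how a figure like the 59.748% reported above is obtained.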

However, from the scree plot (as shown in Fig.  1 ), the slope flattens starting from the fifth factor, indicating that no distinct factors can be extracted beyond that point. Therefore, retaining four factors seems more appropriate. The factor loading matrix is the core of factor analysis, and the values in the matrix represent the factor loading of each item on the common factors. Larger values indicate a stronger correlation between the item variable and the common factor. For ease of analysis, this study used the maximum variance method to rotate the initial factor loading matrix, redistributing the relationships between the factors and original variables and making the correlation coefficients range from 0 to 1, which facilitates interpretation. In this study, factor loadings with absolute values less than 0.4 were filtered out. According to the analysis results, the items of the HOTS assessment scale can be divided into four dimensions, which is consistent with theoretical expectations.

figure 1

Scree plot of factors.

Through the pretest of the scale and selection of measurement items, 66 measurement items were ultimately determined. On this basis, a formal scale for assessing HOTS in a blended learning environment was developed, and the reliability and validity of the scale were tested to ultimately confirm its usability.

Confirmatory factor analysis of final testing

For the final test, AMOS (version 26.0) was used to conduct a confirmatory factor analysis (CFA) on the retested sample data to validate the stability of the HOTS structural model obtained through exploratory factor analysis. This analysis assessed the fit between the measurement model and the actual data, confirming the robustness of the derived HOTS structure and its alignment with the empirical data. The model was constructed from the factor structure and observed variables of each component obtained through EFA, as shown in Fig.  2 (where A represents critical thinking skills, B represents problem-solving skills, C represents teamwork skills, and D represents practical innovation skills). The models strongly support the four-dimensional structure of HOTS, which comprises four first-order factors: critical thinking skills, problem-solving skills, teamwork skills, and practical innovation skills. Critical thinking skills play a pivotal role in the blended learning environment of interior design, connecting problem-solving skills, teamwork skills, and innovative practice. These four dimensions form the assessment structure of HOTS, with critical thinking skills serving as the core element, inspiring individuals to assess problems and propose innovative solutions. By providing appropriate learning resources, diverse learning activities, and learning tasks, as well as well-designed assessment items, it is possible to measure and develop HOTS in the field of interior design, providing guidance for educational and organizational practice. This comprehensive approach to learning and assessment helps cultivate students' HOTS and lays a solid foundation for their abilities in the field of interior design. Thus, the CFA structural models provide strong support for the hypothesized HOTS assessment structure proposed in this study, as shown in Fig.  2 .

figure 2

Confirmatory factor analysis based on 4 dimensions. *A represents the dimension of critical thinking. B represents the dimension of problem-solving skills. C represents the dimension of teamwork skills. D represents the dimension of practical innovation skills.

Additionally, the model fit indices were examined: the RMSEA and SRMR values are both below their thresholds, whereas the values of the other indices are all above their thresholds, indicating that the model fits well (as shown in Table 13 ).
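The "below for residual indices, above for incremental indices" logic can be made explicit in a short sketch. The cutoff values below are conventional rules of thumb in the Hu and Bentler tradition and are assumptions, not the exact criteria of Table 13:

```python
# Assumed conventional cutoffs: absolute-misfit indices should fall BELOW
# their cutoff, while incremental-fit indices should rise ABOVE theirs.
BELOW = {"RMSEA": 0.08, "SRMR": 0.08}
ABOVE = {"CFI": 0.90, "TLI": 0.90, "IFI": 0.90}

def failed_indices(fitted):
    """Return the names of fit indices that miss their cutoff
    (an empty list indicates acceptable model fit)."""
    return [name for name, value in fitted.items()
            if (name in BELOW and value >= BELOW[name])
            or (name in ABOVE and value <= ABOVE[name])]
```

A model with, say, RMSEA = 0.045, SRMR = 0.038, CFI = 0.95, and TLI = 0.94 passes every check, so the returned list is empty.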

Reliability and validity analysis

The reliability and validity of the scale need to be assessed after the model fit has been determined through confirmatory factor analysis 57 . Based on the findings of Marsh et al. 57 , the following conclusions can be drawn. Regarding the hierarchical and correlational model fit, the standardized factor loadings of the items range from 0.700 to 0.802, all of which are at least 0.7, indicating a strong correspondence between the observed items and each latent variable. Furthermore, the Cronbach's α coefficients, which assess the internal consistency of the scale, range from 0.948 to 0.966 across the dimensions, indicating a high level of data reliability and internal consistency. The composite reliabilities range from 0.948 to 0.967, exceeding the threshold of 0.6 and demonstrating a substantial level of consistency (as shown in Table 14 ).
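The two reliability statistics reported above can be reproduced from raw item scores and standardized loadings. A minimal pure-Python sketch; the data in the usage example are illustrative, not the study's:

```python
def cronbach_alpha(items):
    """Cronbach's alpha for a list of item-score columns (one list per item):
    alpha = k/(k-1) * (1 - sum(item variances) / variance of total scores)."""
    k = len(items)
    n = len(items[0])
    def var(xs):
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)
    totals = [sum(col[i] for col in items) for i in range(n)]
    return k / (k - 1) * (1 - sum(var(col) for col in items) / var(totals))

def composite_reliability(loadings):
    """Composite reliability from standardized factor loadings:
    CR = (sum lambda)^2 / ((sum lambda)^2 + sum(1 - lambda^2))."""
    s = sum(loadings) ** 2
    e = sum(1 - l * l for l in loadings)
    return s / (s + e)
```

For instance, four items all loading 0.8 on their factor yield a composite reliability of about 0.877, comfortably above the 0.6 threshold cited in the text.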

Additionally, the diagonal bold font represents the square root of the AVE for each dimension. All the dimensions have average variance extracted (AVE) values ranging from 0.551 to 0.589, all of which are greater than 0.5, indicating that the latent variables have strong explanatory power for their corresponding items. These results suggest that the scale structure constructed in this study is reliable and effective. Furthermore, according to the results presented in Table 15 , the square roots of the AVE values for each dimension are greater than the absolute values of the correlations with other dimensions, indicating discriminant validity of the data. Therefore, these four subscales demonstrate good convergent and discriminant validity, indicating that they are both interrelated and independent. This implies that they can effectively capture the content required to complete the HOTS test scale.
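The AVE and the square-root comparison above (the Fornell-Larcker criterion) reduce to simple arithmetic over standardized loadings and inter-dimension correlations. A sketch with hypothetical numbers:

```python
import math

def ave(loadings):
    """Average variance extracted: the mean squared standardized loading."""
    return sum(l * l for l in loadings) / len(loadings)

def discriminant_ok(aves, corr):
    """Fornell-Larcker criterion: discriminant validity holds when the square
    root of each dimension's AVE exceeds that dimension's absolute correlation
    with every other dimension. `corr` is the inter-dimension correlation
    matrix, `aves` the AVE value per dimension."""
    n = len(aves)
    return all(math.sqrt(aves[i]) > abs(corr[i][j])
               for i in range(n) for j in range(n) if i != j)
```

With AVE values around 0.55 (square roots near 0.74), the criterion holds as long as no inter-dimension correlation exceeds roughly 0.74, consistent with the pattern reported in Table 15.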

Discussion and conclusion

The assessment scale for HOTS in interior design blended learning encompasses four dimensions: critical thinking skills, problem-solving skills, teamwork skills, and practical innovation skills. The selection of these dimensions is based on the characteristics and requirements of the interior design discipline and aims to comprehensively evaluate the HOTS that students demonstrate in blended learning environments so as to better cultivate their ability to address complex design projects in practice. Notably, multiple studies have shown that HOTS include critical thinking, problem-solving skills, creative thinking, and decision-making skills, which are considered crucial in various fields, such as education, business, and engineering 20 , 59 , 60 , 61 . These dimensions largely mirror previous research outcomes, with notable distinctions in the emphasis on teamwork skills and practical innovation skills 62 , 63 . Teamwork skills underscore the critical importance of collaboration in contemporary design endeavors, particularly within the realm of interior design 64 . Effective communication and coordination among team members are imperative for achieving collective design objectives.

Moreover, practical innovation skills concern students' capacity to apply theoretical knowledge creatively in practical design settings. Innovation is a key driver of advancement in interior design, requiring students to possess innovative acumen and adaptability to evolving design trends in order to succeed in the industry. Evaluating practical innovation skills aims to motivate students toward innovative thinking, the exploration of novel concepts, and the development of unique design solutions, which is consistent with the dynamic and evolving nature of the interior design sector. Prior research suggests a close interplay among critical thinking, problem-solving abilities, teamwork competencies, and creative thinking, with teamwork skills acting as a regulatory factor for critical and creative thought processes 7 , 65 . This interconnected nature of HOTS provides theoretical support for the construction and validation of a holistic assessment framework for HOTS.

After examination by the interior design expert panel, one item needed to be split into two items. The results of the CR (critical ratio) analysis of the scale items, based on independent-samples t tests conducted on all the items, showed t values greater than 3 and p values less than 0.001, indicating significant differences between the top and bottom 27% of the sample and demonstrating the discriminative power of each item. This finding highlights the diversity and effectiveness of the scale's items and reveals the scale's power to discriminate among the study subjects. The high t values and significant p values reflect how well the items distinguish between different sample groups, further confirming their efficacy in evaluating the target characteristics. These results provide a robust basis for further refinement and optimization of the scale, offer guidance for future research, underscore the importance of scale design, and support data interpretation and analysis.

The measurement scale was then evaluated through EFA, which showed that the cumulative explained variance reached 59.748%; moreover, the CR, AVE, Cronbach's alpha, and Pearson correlation coefficient values of the total scale and subscales were satisfactory, which strongly demonstrates the structural, discriminant, and convergent validity of the scale 57 .

The scale structure and items of this study are reliable and effective, meaning that students in the field of interior design can use the scale to test their HOTS level and assess their qualities and abilities. In addition, scholars can use it to explore the relationships between students' HOTS and external factors, personality traits, and other variables to determine methods and strategies for developing and improving HOTS.

Limitations and future research

The developed blended learning HOTS assessment scale for interior design also has certain limitations that need to be addressed in future research. The first issue is that the practical innovation skills dimension requires students to have a certain amount of practical experience and innovative ability. First-year students usually have not yet had sufficient opportunities for learning and practice, so their abilities in this dimension may not be evaluated effectively. Therefore, when this scale is used for assessment, students' grade level and learning experience must be considered to ensure the applicability and accuracy of the assessment tool. For first-year students, it may be necessary to use other assessment tools suited to their developmental stage and learning experience to evaluate other aspects of their HOTS 7 . Future research should focus on expanding the scope of this dimension to ensure greater applicability.

The second issue is that the sample comes from an ordinary private undergraduate university in central China rather than from national public or key universities, so the data may reflect regional characteristics. The improved model should therefore be validated across a wider range of regions, a more comprehensive range of school tiers, and a larger sample. The third issue is that the findings of this study are derived from self-reported survey data; the literature suggests caution in relying heavily on such data, as perception does not always equate to action 66 . In addition, future research can draw on this scale to evaluate the HOTS of interior design students, explore the factors that affect their development, determine paths for training and improvement, and cultivate skilled talent for the twenty-first century.

This study adopts a mixed methods research approach, combining qualitative and quantitative methods to achieve a comprehensive understanding of the phenomenon 67 . By integrating qualitative and quantitative methods, mixed methods research provides a comprehensive and detailed exploration of research questions, using multiple data sources and analytical methods to obtain accurate and meaningful answers 68 . To increase the quality of the research, the entire study followed the guidelines for scale development procedures outlined by Professor Li, as shown in Fig.  3 .

figure 3

Scale development program.

Theoretical basis

This study is guided by educational objectives such as 21st-century learning skills, the "5C" competency framework, and students' core abilities 4 . The construction process of the scale is based on theoretical foundations, including Bloom's taxonomy. Drawing from existing research, such as the CCTDI 41 , SPSI 69 , and TWKSAT scales, the dimensions and preliminary items of the scale were developed. Additionally, to enhance the validity and reliability of the scale, dimensions related to HOTS in interior design were obtained through semi-structured interviews, and the preliminary items were primarily adapted or directly referenced from existing research findings. Based on existing research, such as the CCTDI, SPSI, TWKSAT, and twenty-first-century skills frameworks, this study takes "critical thinking skills, problem-solving skills, teamwork skills, and practical innovation skills" as the four basic dimensions of the scale.

Participants and procedures

This study builds on previous research to develop a HOTS assessment scale that measures the thinking levels of interior design students in blended learning. By investigating the challenges and opportunities students encounter in blended learning environments and exploring the complexity and diversity of their HOTS, this study aims to obtain comprehensive insights. For research question 1, purposive sampling was used to select 10 interior design experts to investigate the dimensions and evaluation indicators of HOTS in blended learning for interior design. The researcher employed a semi-structured interview method with these 10 senior experts and teachers in the field of interior design, all holding the rank of associate professor or above; they included 5 males and 5 females, as shown in Table 16 .

For research questions 2 and 3, the research was conducted at an undergraduate university in China, in the field of interior design and within a blended learning environment. All experimental protocols were approved by the authorized committee of Zhengzhou University of Finance and Economics, all methods were carried out in accordance with the relevant guidelines and regulations, and informed consent was obtained from all participants. The Interior Design Blended Learning HOTS assessment scale was developed based on sample data from 350 students who completed a pretest and a retest. The participants consisted of second-, third-, and fourth-year students who had taken at least one blended learning course, with sample sizes of 115, 118, and 117 for the respective grade levels, totaling 350 individuals. Among the participants were 218 male students and 132 female students, all within the age range of 19–22 years. Through purposive sampling, this study ensured the involvement of relevant participants and focused on a specific university environment with diverse demographic characteristics and rich educational resources.

This approach enhances the reliability and generalizability of the research and contributes to a deeper understanding of the research question (as shown in Table 17 ).

Instruments

The tools used in this study include semi-structured interview guidelines and the HOTS assessment scale developed by the researchers. For research question 1, the semi-structured interview guidelines were reviewed by interior design experts to ensure the accuracy and appropriateness of their content and questions. For research questions 2 and 3, the HOTS assessment scale developed by the researchers was checked via the critical ratio (CR) method to assess the consistency and reliability of the scale items and validate their effectiveness.

Data analysis

For research question 1, the researcher used the NVivo (version 14) software tool to conduct thematic analysis on the data obtained through semi-structured interviews. Thematic analysis is a commonly used qualitative research method that aims to identify and categorize the themes, concepts, and perspectives that emerge within a dataset 70 . NVivo allows researchers to effectively organize and manage large amounts of textual data and to extract themes and patterns from them.

For research question 2, the critical ratio (CR) method was employed to conduct item analysis and homogeneity testing on the items of the pilot test questionnaire. The CR method allows for the assessment of each item's contribution to the total score and the evaluation of the interrelationships among the items within the questionnaire. These analytical techniques served to facilitate the evaluation and validation of the scale's reliability and validity.

For research question 3, AMOS (version 26.0) was used to conduct confirmatory factor analysis (CFA) on the confirmatory sample data via maximum likelihood estimation. The purpose of this analysis was to verify whether the hypothesized factor structure of the questionnaire aligned with the actual survey data. Finally, several indices, including composite reliability (CR), average variance extracted (AVE), Cronbach's alpha coefficient, and the Pearson correlation coefficient, were computed to assess the reliability and validity of the developed scale.

In addition, exploratory factor analysis (EFA) and confirmatory factor analysis (CFA) are commonly utilized techniques in questionnaire development and adaptation research 31 , 70 . The statistical software packages SPSS and AMOS are frequently employed for implementing these analytical techniques 71 , 72 , 73 . EFA is a data-driven approach to factor generation that does not require a predetermined number of factors or specific relationships with observed variables; its focus lies in the numerical characteristics of the data. Therefore, prior to conducting CFA, survey questionnaires are typically constructed through EFA to reveal the underlying structure and the relationships between observed variables and the latent structure.

In contrast, CFA tests the hypothesized model structure under specific theoretical assumptions or structural hypotheses, including the interrelationships among factors and a known number of factors; its purpose is to validate the hypothesized model structure. Thus, the initial validity of the questionnaire structure, established through EFA, necessitates further confirmation through CFA 57 , 70 . Additionally, a sample size of at least 200 is recommended for confirmatory factor analysis; in this study, confirmatory factor analysis was performed on a sample of 317.

Data availability

All data generated or analyzed during this study are included in this published article. All the experimental protocols were approved by the Zhengzhou College of Finance and Economics licensing committee.

Hariadi, B. et al. Higher order thinking skills based learning outcomes improvement with blended web mobile learning Model. Int. J. Instr. 15 (2), 565–578 (2022).

Sagala, P. N. & Andriani, A. Development of higher-order thinking skills (HOTS) questions of probability theory subject based on bloom’s taxonomy. J. Phys. Conf. Ser. https://doi.org/10.1088/1742-6596/1188/1/012025 (2019).

Yudha, R. P. Higher order thinking skills (HOTS) test instrument: Validity and reliability analysis with the rasch model. Eduma Math. Educ. Learn. Teach. https://doi.org/10.24235/eduma.v12i1.9468 (2023).

Leach, S. M., Immekus, J. C., French, B. F. & Hand, B. The factorial validity of the Cornell critical thinking tests: A multi-analytic approach. Think. Skills Creat. https://doi.org/10.1016/j.tsc.2020.100676 (2020).

Noroozi, O., Dehghanzadeh, H. & Talaee, E. A systematic review on the impacts of game-based learning on argumentation skills. Entertain. Comput. https://doi.org/10.1016/j.entcom.2020.100369 (2020).

Supena, I., Darmuki, A. & Hariyadi, A. The influence of 4C (constructive, critical, creativity, collaborative) learning model on students’ learning outcomes. Int. J. Instr. 14 (3), 873–892. https://doi.org/10.29333/iji.2021.14351a (2021).

Zhou, Y., Gan, L., Chen, J., Wijaya, T. T. & Li, Y. Development and validation of a higher-order thinking skills assessment scale for pre-service teachers. Think. Skills Creat. https://doi.org/10.1016/j.tsc.2023.101272 (2023).

Musfy, K., Sosa, M. & Ahmad, L. Interior design teaching methodology during the global COVID-19 pandemic. Interiority 3 (2), 163–184. https://doi.org/10.7454/in.v3i2.100 (2020).

Yong, S. D., Kusumarini, Y. & Tedjokoesoemo, P. E. D. Interior design students’ perception for AutoCAD SketchUp and Rhinoceros software usability. IOP Conf. Ser. Earth Environ. Sci. https://doi.org/10.1088/1755-1315/490/1/012015 (2020).

Anthony, B. et al. Blended learning adoption and implementation in higher education: A theoretical and systematic review. Technol. Knowl. Learn. 27 (2), 531–578. https://doi.org/10.1007/s10758-020-09477-z (2020).

Castro, R. Blended learning in higher education: Trends and capabilities. Edu. Inf. Technol. 24 (4), 2523–2546. https://doi.org/10.1007/s10639-019-09886-3 (2019).

Alismaiel, O. Develop a new model to measure the blended learning environments through students’ cognitive presence and critical thinking skills. Int. J. Emerg. Technol. Learn. 17 (12), 150–169. https://doi.org/10.3991/ijet.v17i12.30141 (2022).

Gao, Y. Blended teaching strategies for art design major courses in colleges. Int. J. Emerg. Technol. Learn. https://doi.org/10.3991/ijet.v15i24.19033 (2020).

Banihashem, S. K., Kerman, N. T., Noroozi, O., Moon, J. & Drachsler, H. Feedback sources in essay writing: peer-generated or AI-generated feedback?. Int. J. Edu. Technol. Higher Edu. 21 (1), 23 (2024).

Ji, J. A Design on Blended Learning to Improve College English Students’ Higher-Order Thinking Skills. https://doi.org/10.18282/l-e.v10i4.2553 (2021).

Noroozi, O. The role of students’ epistemic beliefs for their argumentation performance in higher education. Innov. Edu. Teach. Int. 60 (4), 501–512 (2023).

Valero Haro, A., Noroozi, O., Biemans, H. & Mulder, M. First- and second-order scaffolding of argumentation competence and domain-specific knowledge acquisition: A systematic review. Technol. Pedag. Edu. 28 (3), 329–345. https://doi.org/10.1080/1475939x.2019.1612772 (2019).

Narasuman, S. & Wilson, D. M. Investigating teachers’ implementation and strategies on higher order thinking skills in school based assessment instruments. Asian J. Univ. Edu. https://doi.org/10.24191/ajue.v16i1.8991 (2020).

Valero Haro, A., Noroozi, O., Biemans, H. & Mulder, M. Argumentation competence: Students’ argumentation knowledge, behavior and attitude and their relationships with domain-specific knowledge acquisition. J. Constr. Psychol. 35 (1), 123–145 (2022).

Johansson, E. The Assessment of Higher-order Thinking Skills in Online EFL Courses: A Quantitative Content Analysis (2020).

Noroozi, O., Kirschner, P. A., Biemans, H. J. A. & Mulder, M. Promoting argumentation competence: Extending from first- to second-order scaffolding through adaptive fading. Educ. Psychol. Rev. 30 (1), 153–176. https://doi.org/10.1007/s10648-017-9400-z (2017).

Noroozi, O., Weinberger, A., Biemans, H. J. A., Mulder, M. & Chizari, M. Facilitating argumentative knowledge construction through a transactive discussion script in CSCL. Comput. Educ. 61 , 59–76. https://doi.org/10.1016/j.compedu.2012.08.013 (2013).

Noroozi, O., Weinberger, A., Biemans, H. J. A., Mulder, M. & Chizari, M. Argumentation-based computer supported collaborative learning (ABCSCL): A synthesis of 15 years of research. Educ. Res. Rev. 7 (2), 79–106. https://doi.org/10.1016/j.edurev.2011.11.006 (2012).

Setiawan, Khair, B. N., Ratnadi, R., Hakim, M. & Istiningsih, S. Developing HOTS-based assessment instrument for primary schools (2019).

Suparman, S., Juandi, D., & Tamur, M. Does Problem-Based Learning Enhance Students’ Higher Order Thinking Skills in Mathematics Learning? A Systematic Review and Meta-Analysis 2021 4th International Conference on Big Data and Education (2021).

Goodsett, M. Best practices for teaching and assessing critical thinking in information literacy online learning objects. J. Acad. Lib. https://doi.org/10.1016/j.acalib.2020.102163 (2020).

Putra, I. N. A. J., Budiarta, L. G. R., & Adnyayanti, N. L. P. E. Developing Authentic Assessment Rubric Based on HOTS Learning Activities for EFL Teachers. In Proceedings of the 2nd International Conference on Languages and Arts across Cultures (ICLAAC 2022) (pp. 155–164). https://doi.org/10.2991/978-2-494069-29-9_17 .

Bervell, B., Umar, I. N., Kumar, J. A., Asante Somuah, B. & Arkorful, V. Blended learning acceptance scale (BLAS) in distance higher education: Toward an initial development and validation. SAGE Open https://doi.org/10.1177/21582440211040073 (2021).

Byrne, D. A worked example of Braun and Clarke’s approach to reflexive thematic analysis. Qual. Quant. 56 (3), 1391–1412 (2022).

Xu, W. & Zammit, K. Applying thematic analysis to education: A hybrid approach to interpreting data in practitioner research. Int. J. Qual. Methods 19 , 1609406920918810 (2020).

Braun, V. & Clarke, V. Conceptual and design thinking for thematic analysis. Qual. Psychol. 9 (1), 3 (2022).

Creswell, A., Shanahan, M., & Higgins, I. Selection-inference: Exploiting large language models for interpretable logical reasoning. arXiv:2205.09712 (2022).

Baron, J. Thinking and Deciding 155–156 (Cambridge University Press, 2023).

Silver, N., Kaplan, M., LaVaque-Manty, D. & Meizlish, D. Using Reflection and Metacognition to Improve Student Learning: Across the Disciplines, Across the Academy (Taylor & Francis, 2023).

Oksuz, K., Cam, B. C., Kalkan, S. & Akbas, E. Imbalance problems in object detection: A review. IEEE Trans. Pattern Anal. Mach. Intell. 43 (10), 3388–3415 (2020).

Saputra, M. D., Joyoatmojo, S., Wardani, D. K. & Sangka, K. B. Developing critical-thinking skills through the collaboration of jigsaw model with problem-based learning model. Int. J. Instr. 12 (1), 1077–1094 (2019).

Imam, H. & Zaheer, M. K. Shared leadership and project success: The roles of knowledge sharing, cohesion and trust in the team. Int. J. Project Manag. 39 (5), 463–473 (2021).

DeCastellarnau, A. A classification of response scale characteristics that affect data quality: A literature review. Qual. Quant. 52 (4), 1523–1559 (2018).

Haber, J. Critical Thinking 145–146 (MIT Press, 2020).

Hanscomb, S. Critical Thinking: The Basics 180–181 (Routledge, 2023).

Sulaiman, W. S. W., Rahman, W. R. A. & Dzulkifli, M. A. Examining the construct validity of the adapted California critical thinking dispositions (CCTDI) among university students in Malaysia. Proc. Social Behav. Sci. 7 , 282–288 (2010).


Acknowledgements

Thanks to the editorial team and reviewers of Scientific Reports for their valuable comments.

Author information

Authors and Affiliations

Faculty of Education, SEGI University, 47810 Petaling Jaya, Selangor, Malaysia

Department of Art and Design, Zhengzhou College of Finance and Economics, Zhengzhou, 450000, Henan, China

Xiaolei Fan

Faculty of Humanities and Arts, Macau University of Science and Technology, Avenida Wai Long, 999078, Taipa, Macao, Special Administrative Region of China

Lingchao Meng


Contributions

D.L. conceptualized the experiment and wrote the main manuscript text. D.L. and X.F. conducted the experiments; D.L., X.F. and L.M. analyzed the results. L.M. contributed to the conceptualization, methodology, and editing, and critically reviewed the manuscript. All authors have reviewed the manuscript.

Corresponding author

Correspondence to Lingchao Meng.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/ .


About this article

Cite this article

Li, D., Fan, X. & Meng, L. Development and validation of a higher-order thinking skills (HOTS) scale for major students in the interior design discipline for blended learning. Sci. Rep. 14, 20287 (2024). https://doi.org/10.1038/s41598-024-70908-3


Received: 28 February 2024

Accepted: 22 August 2024

Published: 31 August 2024

DOI: https://doi.org/10.1038/s41598-024-70908-3


Keywords

  • Assessment scale
  • Higher-order thinking skills
  • Interior design
  • Blended learning


