Case Study Observational Research: A Framework for Conducting Case Study Research Where Observation Data Are the Focus

Affiliation

  • 1 University of Otago, Wellington, New Zealand.
  • PMID: 27217290
  • DOI: 10.1177/1049732316649160

Case study research is a comprehensive method that incorporates multiple sources of data to provide detailed accounts of complex research phenomena in real-life contexts. However, current models of case study research do not particularly distinguish the unique contribution observation data can make. Observation methods have the potential to reach beyond other methods that rely largely or solely on self-report. This article describes the distinctive characteristics of case study observational research, a modified form of Yin's 2014 model of case study research that the authors used in a study exploring interprofessional collaboration in primary care. In this approach, observation data are positioned as the central component of the research design. Case study observational research offers a promising approach for researchers in a wide range of health care settings seeking more complete understandings of complex topics, where contextual influences are of primary concern. Future research is needed to refine and evaluate the approach.

Keywords: New Zealand; appreciative inquiry; case studies; case study observational research; health care; interprofessional collaboration; naturalistic inquiry; observation; primary health care; qualitative; research design.


Non-Experimental Research

32 Observational Research

Learning Objectives

  • List the various types of observational research methods and distinguish between each.
  • Describe the strengths and weaknesses of each observational research method.

What Is Observational Research?

The term observational research is used to refer to several different types of non-experimental studies in which behavior is systematically observed and recorded. The goal of observational research is to describe a variable or set of variables. More generally, the goal is to obtain a snapshot of specific characteristics of an individual, group, or setting. As described previously, observational research is non-experimental because nothing is manipulated or controlled, and as such we cannot arrive at causal conclusions using this approach. The data that are collected in observational research studies are often qualitative in nature but they may also be quantitative or both (mixed-methods). There are several different types of observational methods that will be described below.

Naturalistic Observation

Naturalistic observation is an observational method that involves observing people’s behavior in the environment in which it typically occurs. Thus, naturalistic observation is a type of field research (as opposed to a type of laboratory research). Jane Goodall’s famous research on chimpanzees is a classic example of naturalistic observation. Dr. Goodall spent three decades observing chimpanzees in their natural environment in East Africa. She examined such things as chimpanzees’ social structure, mating patterns, gender roles, family structure, and care of offspring by observing them in the wild. However, naturalistic observation could more simply involve observing shoppers in a grocery store, children on a school playground, or psychiatric inpatients in their wards. Researchers engaged in naturalistic observation usually make their observations as unobtrusively as possible so that participants are not aware that they are being studied. Such an approach is called disguised naturalistic observation. Ethically, this method is considered acceptable if the participants remain anonymous and the behavior occurs in a public setting where people would not normally have an expectation of privacy. Grocery shoppers putting items into their shopping carts, for example, are engaged in public behavior that is easily observable by store employees and other shoppers. For this reason, most researchers would consider it ethically acceptable to observe them for a study. On the other hand, one of the arguments against the ethicality of the naturalistic observation of “bathroom behavior” discussed earlier in the book is that people have a reasonable expectation of privacy even in a public restroom and that this expectation was violated.

In cases where it is not ethical or practical to conduct disguised naturalistic observation, researchers can conduct undisguised naturalistic observation, where the participants are made aware of the researcher’s presence and monitoring of their behavior. However, one concern with undisguised naturalistic observation is reactivity. Reactivity refers to when a measure changes participants’ behavior. In the case of undisguised naturalistic observation, the concern with reactivity is that when people know they are being observed and studied, they may act differently than they normally would. This type of reactivity is known as the Hawthorne effect. For instance, you may act much differently in a bar if you know that someone is observing you and recording your behaviors, and this would invalidate the study. So disguised observation is less reactive and therefore can have higher validity because people are not aware that their behaviors are being observed and recorded. However, we now know that people often become used to being observed and with time they begin to behave naturally in the researcher’s presence. In other words, over time people habituate to being observed. Think about reality shows like Big Brother or Survivor where people are constantly being observed and recorded. While they may be on their best behavior at first, in a fairly short amount of time they are flirting, having sex, wearing next to nothing, screaming at each other, and occasionally behaving in ways that are embarrassing.

Participant Observation

Another approach to data collection in observational research is participant observation. In participant observation, researchers become active participants in the group or situation they are studying. Participant observation is very similar to naturalistic observation in that it involves observing people’s behavior in the environment in which it typically occurs. As with naturalistic observation, the data that are collected can include interviews (usually unstructured), notes based on their observations and interactions, documents, photographs, and other artifacts. The only difference between naturalistic observation and participant observation is that researchers engaged in participant observation become active members of the group or situations they are studying. The basic rationale for participant observation is that there may be important information that is only accessible to, or can be interpreted only by, someone who is an active participant in the group or situation. Like naturalistic observation, participant observation can be either disguised or undisguised. In disguised participant observation, the researchers pretend to be members of the social group they are observing and conceal their true identity as researchers.

In a famous example of disguised participant observation, Leon Festinger and his colleagues infiltrated a doomsday cult known as the Seekers, whose members believed that the apocalypse would occur on December 21, 1954. Interested in studying how members of the group would cope psychologically when the prophecy inevitably failed, they carefully recorded the events and reactions of the cult members in the days before and after the supposed end of the world. Unsurprisingly, the cult members did not give up their belief but instead convinced themselves that it was their faith and efforts that saved the world from destruction. Festinger and his colleagues later published a book about this experience, which they used to illustrate the theory of cognitive dissonance (Festinger, Riecken, & Schachter, 1956) [1] .

In contrast, in undisguised participant observation, the researchers become a part of the group they are studying and they disclose their true identity as researchers to the group under investigation. Once again there are important ethical issues to consider with disguised participant observation. First, no informed consent can be obtained and, second, deception is being used. The researcher is deceiving the participants by intentionally withholding information about their motivations for being a part of the social group they are studying. But sometimes disguised participation is the only way to access a protective group (like a cult). Further, disguised participant observation is less prone to reactivity than undisguised participant observation.

Rosenhan’s study (1973) [2] of the experience of people in a psychiatric ward would be considered disguised participant observation because Rosenhan and his pseudopatients were admitted into psychiatric hospitals on the pretense of being patients so that they could observe the way that psychiatric patients are treated by staff. The staff and other patients were unaware of their true identities as researchers.

Another example of participant observation comes from a study by sociologist Amy Wilkins on a university-based religious organization that emphasized how happy its members were (Wilkins, 2008) [3] . Wilkins spent 12 months attending and participating in the group’s meetings and social events, and she interviewed several group members. In her study, Wilkins identified several ways in which the group “enforced” happiness—for example, by continually talking about happiness, discouraging the expression of negative emotions, and using happiness as a way to distinguish themselves from other groups.

One of the primary benefits of participant observation is that the researchers are in a much better position to understand the viewpoint and experiences of the people they are studying when they are a part of the social group. The primary limitation with this approach is that the mere presence of the observer could affect the behavior of the people being observed. While this is also a concern with naturalistic observation, additional concerns arise when researchers become active members of the social group they are studying because they may change the social dynamics and/or influence the behavior of the people they are studying. Similarly, if the researcher acts as a participant observer there can be concerns with biases resulting from developing relationships with the participants. Concretely, the researcher may become less objective, resulting in more experimenter bias.

Structured Observation

Another observational method is structured observation. Here the investigator makes careful observations of one or more specific behaviors in a particular setting that is more structured than the settings used in naturalistic or participant observation. Often the setting in which the observations are made is not the natural setting. Instead, the researcher may observe people in the laboratory environment. Alternatively, the researcher may observe people in a natural setting (like a classroom setting) that they have structured in some way, for instance by introducing some specific task participants are to engage in or by introducing a specific social situation or manipulation.

Structured observation is very similar to naturalistic observation and participant observation in that in all three cases researchers are observing naturally occurring behavior; however, the emphasis in structured observation is on gathering quantitative rather than qualitative data. Researchers using this approach are interested in a limited set of behaviors. This allows them to quantify the behaviors they are observing. In other words, structured observation is less global than naturalistic or participant observation because the researcher engaged in structured observations is interested in a small number of specific behaviors. Therefore, rather than recording everything that happens, the researcher only focuses on very specific behaviors of interest.

Researchers Robert Levine and Ara Norenzayan used structured observation to study differences in the “pace of life” across countries (Levine & Norenzayan, 1999) [4]. One of their measures involved observing pedestrians in a large city to see how long it took them to walk 60 feet. They found that people in some countries walked reliably faster than people in other countries. For example, people in Canada and Sweden covered 60 feet in just under 13 seconds on average, while people in Brazil and Romania took close to 17 seconds. When structured observation takes place in the complex and even chaotic “real world,” the questions of when, where, and under what conditions the observations will be made, and who exactly will be observed are important to consider. Levine and Norenzayan described their sampling process as follows:

“Male and female walking speed over a distance of 60 feet was measured in at least two locations in main downtown areas in each city. Measurements were taken during main business hours on clear summer days. All locations were flat, unobstructed, had broad sidewalks, and were sufficiently uncrowded to allow pedestrians to move at potentially maximum speeds. To control for the effects of socializing, only pedestrians walking alone were used. Children, individuals with obvious physical handicaps, and window-shoppers were not timed. Thirty-five men and 35 women were timed in most cities.” (p. 186).

Precise specification of the sampling process in this way makes data collection manageable for the observers, and it also provides some control over important extraneous variables. For example, by making their observations on clear summer days in all countries, Levine and Norenzayan controlled for effects of the weather on people’s walking speeds. In Levine and Norenzayan’s study, measurement was relatively straightforward. They simply measured out a 60-foot distance along a city sidewalk and then used a stopwatch to time participants as they walked over that distance.
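
To make the arithmetic concrete, here is a minimal sketch in Python of how timed crossings translate into average walking speeds. The 60-foot distance comes from the study description above; the individual pedestrian times and city names are invented for illustration.

```python
# Hypothetical illustration of a walking-speed measure like Levine & Norenzayan's.
# The 60-foot distance is from the text; the per-pedestrian times are invented.

DISTANCE_FT = 60

# times (in seconds) for individual pedestrians to cover 60 feet, one list per city
times = {
    "City A": [12.4, 13.1, 12.8, 13.0],
    "City B": [16.8, 17.2, 16.5, 17.4],
}

for city, secs in times.items():
    mean_time = sum(secs) / len(secs)
    speed_ft_per_s = DISTANCE_FT / mean_time
    print(f"{city}: mean crossing time {mean_time:.1f} s, "
          f"average speed {speed_ft_per_s:.2f} ft/s")
```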

As another example, researchers Robert Kraut and Robert Johnston wanted to study bowlers’ reactions to their shots, both when they were facing the pins and then when they turned toward their companions (Kraut & Johnston, 1979) [5] . But what “reactions” should they observe? Based on previous research and their own pilot testing, Kraut and Johnston created a list of reactions that included “closed smile,” “open smile,” “laugh,” “neutral face,” “look down,” “look away,” and “face cover” (covering one’s face with one’s hands). The observers committed this list to memory and then practiced by coding the reactions of bowlers who had been videotaped. During the actual study, the observers spoke into an audio recorder, describing the reactions they observed. Among the most interesting results of this study was that bowlers rarely smiled while they still faced the pins. They were much more likely to smile after they turned toward their companions, suggesting that smiling is not purely an expression of happiness but also a form of social communication.

In yet another example (this one in a laboratory environment), Dov Cohen and his colleagues had observers rate the emotional reactions of participants who had just been deliberately bumped and insulted by a confederate after they dropped off a completed questionnaire at the end of a hallway. The confederate was posing as someone who worked in the same building and who was frustrated by having to close a file drawer twice in order to permit the participants to walk past them (first to drop off the questionnaire at the end of the hallway and once again on their way back to the room where they believed the study they signed up for was taking place). The two observers were positioned at different ends of the hallway so that they could read the participants’ body language and hear anything they might say. Interestingly, the researchers hypothesized that participants from the southern United States, which is one of several places in the world that has a “culture of honor,” would react with more aggression than participants from the northern United States, a prediction that was in fact supported by the observational data (Cohen, Nisbett, Bowdle, & Schwarz, 1996) [6] .

When the observations require a judgment on the part of the observers—as in the studies by Kraut and Johnston and Cohen and his colleagues—a process referred to as coding is typically required. Coding generally requires clearly defining a set of target behaviors. The observers then categorize participants individually in terms of which behavior they have engaged in and the number of times they engaged in each behavior. The observers might even record the duration of each behavior. The target behaviors must be defined in such a way that guides different observers to code them in the same way. This difficulty with coding illustrates the issue of interrater reliability, as mentioned in Chapter 4. Researchers are expected to demonstrate the interrater reliability of their coding procedure by having multiple raters code the same behaviors independently and then showing that the different observers are in close agreement. Kraut and Johnston, for example, video recorded a subset of their participants’ reactions and had two observers independently code them. The two observers showed that they agreed on the reactions that were exhibited 97% of the time, indicating good interrater reliability.
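
A short sketch of how agreement between two coders can be quantified. The reaction categories echo Kraut and Johnston's list, but the coded sequences below are invented; besides simple percent agreement (the kind of figure reported above), the sketch also computes Cohen's kappa, a common chance-corrected alternative.

```python
from collections import Counter

# Hypothetical codes assigned by two observers to the same 10 bowler reactions.
coder_a = ["open smile", "neutral", "look away", "closed smile", "neutral",
           "laugh", "neutral", "open smile", "look down", "neutral"]
coder_b = ["open smile", "neutral", "look away", "closed smile", "look down",
           "laugh", "neutral", "open smile", "look down", "neutral"]

n = len(coder_a)
observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n  # percent agreement

# Chance agreement: probability both coders pick the same category by accident,
# based on how often each coder used each category.
count_a, count_b = Counter(coder_a), Counter(coder_b)
expected = sum((count_a[c] / n) * (count_b[c] / n)
               for c in set(coder_a) | set(coder_b))

kappa = (observed - expected) / (1 - expected)
print(f"Percent agreement: {observed:.0%}, Cohen's kappa: {kappa:.2f}")
```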

One of the primary benefits of structured observation is that it is far more efficient than naturalistic and participant observation. Since the researchers are focused on specific behaviors, this reduces time and expense. Also, oftentimes the environment is structured to encourage the behaviors of interest, which again means that researchers do not have to invest as much time in waiting for the behaviors of interest to naturally occur. Finally, researchers using this approach can clearly exert greater control over the environment. However, when researchers exert more control over the environment it may make the environment less natural, which decreases external validity. It is less clear, for instance, whether structured observations made in a laboratory environment will generalize to a real world environment. Furthermore, since researchers engaged in structured observation are often not disguised, there may be more concerns with reactivity.

Case Studies

A case study is an in-depth examination of an individual. Sometimes case studies are also completed on social units (e.g., a cult) and events (e.g., a natural disaster). Most commonly in psychology, however, case studies provide a detailed description and analysis of an individual. Often the individual has a rare or unusual condition or disorder or has damage to a specific region of the brain.

Like many observational research methods, case studies tend to be more qualitative in nature. Case study methods involve an in-depth, and often longitudinal, examination of an individual. Depending on the focus of the case study, individuals may or may not be observed in their natural setting. If the natural setting is not what is of interest, then the individual may be brought into a therapist’s office or a researcher’s lab for study. Also, the bulk of the case study report will focus on in-depth descriptions of the person rather than on statistical analyses. With that said, some quantitative data may also be included in the write-up of a case study. For instance, an individual’s depression score may be compared to normative scores, or their score before and after treatment may be compared. As with other qualitative methods, a variety of different methods and tools can be used to collect information on the case. For instance, interviews, naturalistic observation, structured observation, psychological testing (e.g., IQ test), and/or physiological measurements (e.g., brain scans) may be used to collect information on the individual.

HM is one of the most famous case studies in psychology. HM suffered from intractable and very severe epilepsy. A surgeon localized HM’s epilepsy to his medial temporal lobe and in 1953 removed large sections of HM’s hippocampus in an attempt to stop the seizures. The treatment was a success, in that it resolved his epilepsy, and his IQ and personality were unaffected. However, the doctors soon realized that HM exhibited a strange form of amnesia, called anterograde amnesia. HM was able to carry out a conversation and he could remember short strings of letters, digits, and words. Basically, his short-term memory was preserved. However, HM could not commit new events to memory. He lost the ability to transfer information from his short-term memory to his long-term memory, something memory researchers call consolidation. So while he could carry on a conversation with someone, he would completely forget the conversation after it ended. This was an extremely important case study for memory researchers because it suggested that there is a dissociation between short-term memory and long-term memory, and that these are two different abilities sub-served by different areas of the brain. It also suggested that the temporal lobes are particularly important for consolidating new information (i.e., for transferring information from short-term memory to long-term memory).

The history of psychology is filled with influential case studies, such as Sigmund Freud’s description of “Anna O.” (see Note 6.1 “The Case of “Anna O.””) and John Watson and Rosalie Rayner’s description of Little Albert (Watson & Rayner, 1920) [7], who allegedly learned to fear a white rat—along with other furry objects—when the researchers repeatedly made a loud noise every time the rat approached him.

The Case of “Anna O.”

Sigmund Freud used the case of a young woman he called “Anna O.” to illustrate many principles of his theory of psychoanalysis (Freud, 1961) [8] . (Her real name was Bertha Pappenheim, and she was an early feminist who went on to make important contributions to the field of social work.) Anna had come to Freud’s colleague Josef Breuer around 1880 with a variety of odd physical and psychological symptoms. One of them was that for several weeks she was unable to drink any fluids. According to Freud,

She would take up the glass of water that she longed for, but as soon as it touched her lips she would push it away like someone suffering from hydrophobia.…She lived only on fruit, such as melons, etc., so as to lessen her tormenting thirst. (p. 9)

But according to Freud, a breakthrough came one day while Anna was under hypnosis.

[S]he grumbled about her English “lady-companion,” whom she did not care for, and went on to describe, with every sign of disgust, how she had once gone into this lady’s room and how her little dog—horrid creature!—had drunk out of a glass there. The patient had said nothing, as she had wanted to be polite. After giving further energetic expression to the anger she had held back, she asked for something to drink, drank a large quantity of water without any difficulty, and awoke from her hypnosis with the glass at her lips; and thereupon the disturbance vanished, never to return. (p.9)

Freud’s interpretation was that Anna had repressed the memory of this incident along with the emotion that it triggered and that this was what had caused her inability to drink. Furthermore, he believed that her recollection of the incident, along with her expression of the emotion she had repressed, caused the symptom to go away.

As an illustration of Freud’s theory, the case study of Anna O. is quite effective. As evidence for the theory, however, it is essentially worthless. The description provides no way of knowing whether Anna had really repressed the memory of the dog drinking from the glass, whether this repression had caused her inability to drink, or whether recalling this “trauma” relieved the symptom. It is also unclear from this case study how typical or atypical Anna’s experience was.

Figure 6.8 Anna O. “Anna O.” was the subject of a famous case study used by Freud to illustrate the principles of psychoanalysis. Source: http://en.wikipedia.org/wiki/File:Pappenheim_1882.jpg

Case studies are useful because they provide a level of detailed analysis not found in many other research methods and greater insights may be gained from this more detailed analysis. As a result of the case study, the researcher may gain a sharpened understanding of what might become important to look at more extensively in future more controlled research. Case studies are also often the only way to study rare conditions because it may be impossible to find a large enough sample of individuals with the condition to use quantitative methods. Although at first glance a case study of a rare individual might seem to tell us little about ourselves, they often do provide insights into normal behavior. The case of HM provided important insights into the role of the hippocampus in memory consolidation.

However, it is important to note that while case studies can provide insights into certain areas and variables to study, and can be useful in helping develop theories, they should never be used as evidence for theories. In other words, case studies can be used as inspiration to formulate theories and hypotheses, but those hypotheses and theories then need to be formally tested using more rigorous quantitative methods. The reason case studies shouldn’t be used to provide support for theories is that they suffer from problems with both internal and external validity. Case studies lack the proper controls that true experiments contain. As such, they suffer from problems with internal validity, so they cannot be used to determine causation. For instance, during HM’s surgery, the surgeon may have accidentally lesioned another area of HM’s brain (a possibility suggested by the dissection of HM’s brain following his death) and that lesion may have contributed to his inability to consolidate new information. The fact is, with case studies we cannot rule out these sorts of alternative explanations. So, as with all observational methods, case studies do not permit determination of causation. In addition, because case studies are often of a single individual, and typically an abnormal individual, researchers cannot generalize their conclusions to other individuals. Recall that with most research designs there is a trade-off between internal and external validity; with case studies, however, there are problems with both. So there are limits both to the ability to determine causation and to generalize the results. A final limitation of case studies is that ample opportunity exists for the theoretical biases of the researcher to color or bias the case description. Indeed, there have been accusations that the researcher who studied HM destroyed unpublished data, including data that contradicted her theory about how memories are consolidated. A fascinating New York Times article describes some of the controversies that ensued after HM’s death and the analysis of his brain: https://www.nytimes.com/2016/08/07/magazine/the-brain-that-couldnt-remember.html?_r=0

Archival Research

Another approach that is often considered observational research involves analyzing archival data that have already been collected for some other purpose. An example is a study by Brett Pelham and his colleagues on “implicit egotism”—the tendency for people to prefer people, places, and things that are similar to themselves (Pelham, Carvallo, & Jones, 2005) [9] . In one study, they examined Social Security records to show that women with the names Virginia, Georgia, Louise, and Florence were especially likely to have moved to the states of Virginia, Georgia, Louisiana, and Florida, respectively.

As with naturalistic observation, measurement can be more or less straightforward when working with archival data. For example, counting the number of people named Virginia who live in various states based on Social Security records is relatively straightforward. But consider a study by Christopher Peterson and his colleagues on the relationship between optimism and health using data that had been collected many years before for a study on adult development (Peterson, Seligman, & Vaillant, 1988) [10]. In the 1940s, healthy male college students had completed an open-ended questionnaire about difficult wartime experiences. In the late 1980s, Peterson and his colleagues reviewed the men’s questionnaire responses to obtain a measure of explanatory style—their habitual ways of explaining bad events that happen to them. More pessimistic people tend to blame themselves and expect long-term negative consequences that affect many aspects of their lives, while more optimistic people tend to blame outside forces and expect limited negative consequences. To obtain a measure of explanatory style for each participant, the researchers used a procedure in which all negative events mentioned in the questionnaire responses, and any causal explanations for them, were identified and written on index cards. These were given to a separate group of raters who rated each explanation in terms of three separate dimensions of optimism-pessimism. These ratings were then averaged to produce an explanatory style score for each participant. The researchers then assessed the statistical relationship between the men’s explanatory style as undergraduate students and archival measures of their health at approximately 60 years of age. The primary result was that the more optimistic the men were as undergraduate students, the healthier they were as older men. Pearson’s r was +.25.
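
As a worked illustration of the final step, the sketch below computes Pearson's r from paired scores. The explanatory-style and health values are invented; only the statistic itself mirrors the analysis described above (the study reported r = +.25).

```python
import math

# Hypothetical data: explanatory-style scores from the wartime questionnaires
# (higher = more pessimistic) and health ratings at ~age 60. Invented values.
style  = [3.2, 4.1, 2.5, 5.0, 3.8, 2.9, 4.4, 3.0]
health = [4.0, 3.2, 4.5, 2.8, 3.5, 4.2, 3.0, 4.1]

def pearson_r(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

print(f"Pearson's r = {pearson_r(style, health):+.2f}")
```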

This method is an example of content analysis—a family of systematic approaches to measurement using complex archival data. Just as structured observation requires specifying the behaviors of interest and then noting them as they occur, content analysis requires specifying keywords, phrases, or ideas and then finding all occurrences of them in the data. These occurrences can then be counted, timed (e.g., the amount of time devoted to entertainment topics on the nightly news show), or analyzed in a variety of other ways.
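
A minimal content-analysis pass might look like the following sketch: target keywords are specified in advance and every occurrence in the archival text is counted. The documents and keyword list are invented for illustration.

```python
import re
from collections import Counter

# Invented archival snippets and an invented keyword list.
documents = [
    "The shelling was constant, but we expected it would end soon.",
    "I blamed myself for the failed patrol; nothing ever goes right for me.",
]
keywords = ["blamed", "expected", "failed"]

counts = Counter()
for doc in documents:
    tokens = re.findall(r"[a-z']+", doc.lower())
    counts.update(t for t in tokens if t in keywords)

print(dict(counts))  # e.g. {'expected': 1, 'blamed': 1, 'failed': 1}
```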

Media Attributions

  • What happens when you remove the hippocampus? – Sam Kean by TED-Ed licensed under a standard YouTube License
  • Pappenheim 1882 by unknown is in the Public Domain.
Notes

  1. Festinger, L., Riecken, H., & Schachter, S. (1956). When prophecy fails: A social and psychological study of a modern group that predicted the destruction of the world. University of Minnesota Press.
  2. Rosenhan, D. L. (1973). On being sane in insane places. Science, 179, 250–258.
  3. Wilkins, A. (2008). “Happier than Non-Christians”: Collective emotions and symbolic boundaries among evangelical Christians. Social Psychology Quarterly, 71, 281–301.
  4. Levine, R. V., & Norenzayan, A. (1999). The pace of life in 31 countries. Journal of Cross-Cultural Psychology, 30, 178–205.
  5. Kraut, R. E., & Johnston, R. E. (1979). Social and emotional messages of smiling: An ethological approach. Journal of Personality and Social Psychology, 37, 1539–1553.
  6. Cohen, D., Nisbett, R. E., Bowdle, B. F., & Schwarz, N. (1996). Insult, aggression, and the southern culture of honor: An "experimental ethnography." Journal of Personality and Social Psychology, 70(5), 945–960.
  7. Watson, J. B., & Rayner, R. (1920). Conditioned emotional reactions. Journal of Experimental Psychology, 3, 1–14.
  8. Freud, S. (1961). Five lectures on psycho-analysis. New York, NY: Norton.
  9. Pelham, B. W., Carvallo, M., & Jones, J. T. (2005). Implicit egotism. Current Directions in Psychological Science, 14, 106–110.
  10. Peterson, C., Seligman, M. E. P., & Vaillant, G. E. (1988). Pessimistic explanatory style is a risk factor for physical illness: A thirty-five year longitudinal study. Journal of Personality and Social Psychology, 55, 23–27.

Glossary

Observational research: Research that is non-experimental because it focuses on recording systematic observations of behavior in a natural or laboratory setting without manipulating anything.

Naturalistic observation: An observational method that involves observing people’s behavior in the environment in which it typically occurs.

Disguised naturalistic observation: When researchers engage in naturalistic observation by making their observations as unobtrusively as possible so that participants are not aware that they are being studied.

Undisguised naturalistic observation: Where the participants are made aware of the researcher’s presence and monitoring of their behavior.

Reactivity: When a measure changes participants’ behavior.

Hawthorne effect: In the case of undisguised naturalistic observation, a type of reactivity in which people who know they are being observed and studied may act differently than they normally would.

Participant observation: When researchers become active participants in the group or situation they are studying.

Disguised participant observation: When researchers pretend to be members of the social group they are observing and conceal their true identity as researchers.

Undisguised participant observation: When researchers become a part of the group they are studying and disclose their true identity as researchers to the group under investigation.

Structured observation: When a researcher makes careful observations of one or more specific behaviors in a particular setting that is more structured than the settings used in naturalistic or participant observation.

Coding: A part of structured observation whereby observers use a clearly defined set of guidelines to “code” behaviors—assigning specific behaviors they are observing to a category—and count the number of times or the duration that the behavior occurs.

Case study: An in-depth examination of an individual.

Content analysis: A family of systematic approaches to measurement using qualitative methods to analyze complex archival data.

Research Methods in Psychology Copyright © 2019 by Rajiv S. Jhangiani, I-Chant A. Chiang, Carrie Cuttler, & Dana C. Leighton is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License , except where otherwise noted.

Observation Method in Psychology: Naturalistic, Participant and Controlled

Saul McLeod, PhD

Editor-in-Chief for Simply Psychology

BSc (Hons) Psychology, MRes, PhD, University of Manchester

Saul McLeod, PhD., is a qualified psychology teacher with over 18 years of experience in further and higher education. He has been published in peer-reviewed journals, including the Journal of Clinical Psychology.

Olivia Guy-Evans, MSc

Associate Editor for Simply Psychology

BSc (Hons) Psychology, MSc Psychology of Education

Olivia Guy-Evans is a writer and associate editor for Simply Psychology. She has previously worked in healthcare and educational sectors.

The observation method in psychology involves directly and systematically witnessing and recording measurable behaviors, actions, and responses in natural or contrived settings without attempting to intervene or manipulate what is being observed.

Used to describe phenomena, generate hypotheses, or validate self-reports, psychological observation can be either controlled or naturalistic with varying degrees of structure imposed by the researcher.

There are different types of observational methods, and distinctions need to be made between:

1. Controlled observations
2. Naturalistic observations
3. Participant observations

In addition to the above categories, observations can also be either overt/disclosed (the participants know they are being studied) or covert/undisclosed (the researcher keeps their real identity a secret from the research subjects, acting as a genuine member of the group).

In general, conducting observational research is relatively inexpensive, but it remains highly time-consuming and resource-intensive in data processing and analysis.

The considerable investments needed in terms of coder time commitments for training, maintaining reliability, preventing drift, and coding complex dynamic interactions place practical barriers on observers with limited resources.

Controlled Observation

Controlled observation is a research method for studying behavior in a carefully controlled and structured environment.

The researcher sets specific conditions, variables, and procedures to systematically observe and measure behavior, allowing for greater control and comparison of different conditions or groups.

The researcher decides where the observation will occur, at what time, with which participants, and in what circumstances, and uses a standardized procedure. Participants are randomly allocated to each independent variable group.

Rather than writing a detailed description of all behavior observed, it is often easier to code behavior according to a previously agreed scale using a behavior schedule (i.e., conducting a structured observation).

The researcher systematically classifies the behavior they observe into distinct categories. Coding might involve numbers or letters to describe a characteristic or the use of a scale to measure behavior intensity.

The categories on the schedule are coded so that the data collected can be easily counted and turned into statistics.

For example, Mary Ainsworth used a behavior schedule to study how infants responded to brief periods of separation from their mothers. During the Strange Situation procedure, the infant’s interaction behaviors directed toward the mother were measured, e.g.,

  • Proximity and contact-seeking
  • Contact maintaining
  • Avoidance of proximity and contact
  • Resistance to contact and comforting

The observer noted down the behavior displayed during 15-second intervals and scored the behavior for intensity on a scale of 1 to 7.
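
A sketch of how such a behavior schedule might be represented and summarized, assuming one coded record per 15-second interval with an intensity score from 1 to 7. The category labels paraphrase the list above; the scores are invented.

```python
# Interval-based behavior schedule in the spirit of the Strange Situation coding
# described above: one behavior per 15-second interval, rated 1-7 for intensity.
CATEGORIES = ["proximity seeking", "contact maintaining", "avoidance", "resistance"]

# one record per 15-second interval: (interval index, category, intensity 1-7)
observations = [
    (0, "proximity seeking", 5),
    (1, "contact maintaining", 6),
    (2, "avoidance", 2),
    (3, "proximity seeking", 4),
]

# turn the coded schedule into simple counts and mean intensities
for category in CATEGORIES:
    scores = [score for _, cat, score in observations if cat == category]
    if scores:
        print(f"{category}: {len(scores)} intervals, "
              f"mean intensity {sum(scores) / len(scores):.1f}")
```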

Sometimes participants’ behavior is observed through a two-way mirror, or they are secretly filmed. Albert Bandura used this method to study aggression in children (the Bobo doll studies ).

A lot of research has been carried out in sleep laboratories as well. Here, electrodes are attached to the scalp of participants. What is observed are the changes in electrical activity in the brain during sleep (the machine is called an EEG).

Controlled observations are usually overt as the researcher explains the research aim to the group so the participants know they are being observed.

Controlled observations are also usually non-participant as the researcher avoids direct contact with the group and keeps a distance (e.g., observing behind a two-way mirror).

Strengths

  • Controlled observations can be easily replicated by other researchers by using the same observation schedule. This means it is easy to test for reliability.
  • The data obtained from structured observations are easier and quicker to analyze as they are quantitative (i.e., numerical), making this a less time-consuming method compared to naturalistic observations.
  • Controlled observations are fairly quick to conduct, which means that many observations can take place within a short amount of time. This means a large sample can be obtained, resulting in findings that are representative and can be generalized to a large population.

Limitations

  • Controlled observations can lack validity due to the Hawthorne effect /demand characteristics. When participants know they are being watched, they may act differently.

Naturalistic Observation

Naturalistic observation is a research method in which the researcher studies behavior in its natural setting without intervention or manipulation.

It involves observing and recording behavior as it naturally occurs, providing insights into real-life behaviors and interactions in their natural context.

Naturalistic observation is a research method commonly used by psychologists and other social scientists.

This technique involves observing and studying the spontaneous behavior of participants in natural surroundings. The researcher simply records what they see in whatever way they can.

In unstructured observations, the researcher records all relevant behavior without a predetermined coding system. There may be too much to record, and the behaviors recorded may not necessarily be the most important, so the approach is usually used as a pilot study to see what type of behaviors would be recorded.

Compared with controlled observations, it is like the difference between studying wild animals in a zoo and studying them in their natural habitat.

With regard to human subjects, Margaret Mead used this method to research the way of life of different tribes living on islands in the South Pacific. Kathy Sylva used it to study children at play by observing their behavior in a playgroup in Oxfordshire.

Collecting Naturalistic Behavioral Data

Technological advances are enabling new, unobtrusive ways of collecting naturalistic behavioral data.

The Electronically Activated Recorder (EAR) is a digital recording device participants can wear to periodically sample ambient sounds, allowing representative sampling of daily experiences (Mehl et al., 2012).

Studies program EARs to record 30-50 second sound snippets multiple times per hour. Although coding the recordings requires extensive resources, EARs can capture spontaneous behaviors like arguments or laughter.

EARs minimize participant reactivity since sampling occurs outside of awareness. This reduces the Hawthorne effect, where people change behavior when observed.
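
A rough sketch of the kind of sampling schedule the text describes: short snippets recorded several times per hour at unpredictable moments. The exact parameters used in real EAR studies vary, so the values below are illustrative assumptions only.

```python
import random

# Illustrative EAR-style sampling schedule: a few 30-50 second recordings per
# waking hour at random offsets. Parameter values are assumptions, not the
# settings of any particular published study.
SNIPPETS_PER_HOUR = 4
random.seed(1)

def daily_schedule(waking_hours=16):
    schedule = []
    for hour in range(waking_hours):
        for _ in range(SNIPPETS_PER_HOUR):
            start = hour * 3600 + random.randint(0, 3600 - 50)  # second of day
            duration = random.randint(30, 50)                   # snippet length
            schedule.append((start, duration))
    return sorted(schedule)

print(daily_schedule()[:5])  # first few (start_second, duration_seconds) pairs
```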

The SenseCam is another wearable device that passively captures images documenting daily activities. Though primarily used in memory research currently (Smith et al., 2014), systematic sampling of environments and behaviors via the SenseCam could enable innovative psychological studies in the future.

Strengths

  • By being able to observe the flow of behavior in its own setting, studies have greater ecological validity.
  • Like case studies, naturalistic observation is often used to generate new ideas. Because it gives the researcher the opportunity to study the total situation, it often suggests avenues of inquiry not thought of before.
  • Naturalistic observation can capture actual behaviors as they unfold in real time, allowing researchers to analyze sequential patterns of interactions, measure base rates of behaviors, and examine socially undesirable or complex behaviors that people may not self-report accurately.

Limitations

  • These observations are often conducted on a micro (small) scale and may lack a representative sample (biased in relation to age, gender, social class, or ethnicity). This may result in findings that cannot be generalized to wider society.
  • Natural observations are less reliable as other variables cannot be controlled. This makes it difficult for another researcher to repeat the study in exactly the same way.
  • They are highly time-consuming and resource-intensive during the data coding phase (e.g., training coders, maintaining inter-rater reliability, preventing judgment drift).
  • With observations, we do not have manipulation of variables (or control over extraneous variables), meaning cause-and-effect relationships cannot be established.

Participant Observation

Participant observation is a variant of the above (natural observations) but here, the researcher joins in and becomes part of the group they are studying to get a deeper insight into their lives.

If it were research on animals, we would now not only be studying them in their natural habitat but be living alongside them as well!

Leon Festinger used this approach in a famous study into a religious cult that believed that the end of the world was about to occur. He joined the cult and studied how they reacted when the prophecy did not come true.

Participant observations can be either covert or overt. Covert is where the study is carried out “undercover.” The researcher’s real identity and purpose are kept concealed from the group being studied.

The researcher takes a false identity and role, usually posing as a genuine member of the group.

On the other hand, overt is where the researcher reveals his or her true identity and purpose to the group and asks permission to observe.

Limitations

  • It can be difficult to get time/privacy for recording. For example, researchers can’t take notes openly with covert observations as this would blow their cover. This means they must wait until they are alone and rely on their memory. This is a problem as they may forget details and are unlikely to remember direct quotations.
  • If the researcher becomes too involved, they may lose objectivity and become biased. There is always the danger that we will “see” what we expect (or want) to see. This is a problem because the researcher could selectively report information instead of noting everything they observe, thus reducing the validity of their data.

Recording of Data

With controlled/structured observation studies, an important decision the researcher has to make is how to classify and record the data. Usually, this will involve a method of sampling.

In most coding systems, codes or ratings are made either per behavioral event or per specified time interval (Bakeman & Quera, 2011).

The three main sampling methods are listed below (a small sketch after this list shows how each summarizes the same event log):

  • Event-based coding involves identifying and segmenting interactions into meaningful events rather than timed units. For example, parent-child interactions may be segmented into control or teaching events to code. Event recording allows counting event frequency and sequencing while also potentially capturing event duration through timed-event recording, which provides information on time spent on behaviors.
  • Interval recording involves dividing interactions into fixed time intervals (e.g., 6-15 seconds) and coding behaviors within each interval (Bakeman & Quera, 2011). It is common in microanalytic coding to sample discrete behaviors in brief time samples across an interaction. The time unit can range from seconds to minutes to whole interactions. Interval recording requires segmenting interactions based on timing rather than events.
  • Instantaneous sampling provides snapshot coding at certain moments rather than summarizing behavior within full intervals. This allows quicker coding but may miss behaviors in between target times.
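
The sketch below applies the three sampling methods to the same invented timed-event log, showing how each one summarizes the record differently. The behaviors, times, interval width, and snapshot moments are all illustrative assumptions.

```python
# One invented timed-event log: (behavior, onset_second, offset_second).
events = [("talk", 0, 12), ("look away", 12, 18), ("talk", 18, 40), ("smile", 40, 45)]

# 1. Event recording: frequency and total duration per behavior.
freq, dur = {}, {}
for behavior, onset, offset in events:
    freq[behavior] = freq.get(behavior, 0) + 1
    dur[behavior] = dur.get(behavior, 0) + (offset - onset)
print("event recording:", freq, dur)

# 2. Interval recording: which behaviors occur in each fixed 10-second interval.
intervals = [{b for b, on, off in events if on < end and off > start}
             for start, end in [(i, i + 10) for i in range(0, 50, 10)]]
print("interval recording:", intervals)

# 3. Instantaneous sampling: the behavior under way at each snapshot moment.
snapshots = [5, 15, 25, 35, 45]
print("instantaneous:", [next((b for b, on, off in events if on <= t < off), None)
                         for t in snapshots])
```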

Coding Systems

The coding system should focus on behaviors, patterns, individual characteristics, or relationship qualities that are relevant to the theory guiding the study (Wampler & Harper, 2014).

Codes vary in how much inference is required, from concrete observable behaviors like frequency of eye contact to more abstract concepts like degree of rapport between a therapist and client (Hill & Lambert, 2004). More inference may reduce reliability.

Coding schemes can vary in their level of detail or granularity. Micro-level schemes capture fine-grained behaviors, such as specific facial movements, while macro-level schemes might code broader behavioral states or interactions. The appropriate level of granularity depends on the research questions and the practical constraints of the study.

Another important consideration is the concreteness of the codes. Some schemes use physically based codes that are directly observable (e.g., “eyes closed”), while others use more socially based codes that require some level of inference (e.g., “showing empathy”). While physically based codes may be easier to apply consistently, socially based codes often capture more meaningful behavioral constructs.

Most coding schemes strive to create sets of codes that are mutually exclusive and exhaustive (ME&E). This means that for any given set of codes, only one code can apply at a time (mutual exclusivity), and there is always an applicable code (exhaustiveness). This property simplifies both the coding process and subsequent data analysis.

For example, a simple ME&E set for coding infant state might include: 1) Quiet alert, 2) Crying, 3) Fussy, 4) REM sleep, and 5) Deep sleep. At any given moment, an infant would be in one and only one of these states.
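
A small sketch of how the ME&E property can be checked in practice, using the infant-state codes from the example above; the coded sequence itself is invented.

```python
# Check that a sequence of moment-by-moment codes respects the ME&E property:
# exactly one code from the defined set applies at each moment.
INFANT_STATES = {"quiet alert", "crying", "fussy", "REM sleep", "deep sleep"}

coded_moments = ["quiet alert", "quiet alert", "fussy", "crying", "deep sleep"]

for i, code in enumerate(coded_moments):
    # exhaustiveness: every moment must receive a code from the defined set
    assert code in INFANT_STATES, f"moment {i}: '{code}' is not in the code set"

# Mutual exclusivity is enforced by the data structure itself: each moment holds
# exactly one code, never a set of simultaneous codes.
print("all moments coded with exactly one valid state")
```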

Macroanalytic coding systems

Macroanalytic coding systems involve rating or summarizing behaviors using larger coding units and broader categories that reflect patterns across longer periods of interaction rather than coding small or discrete behavioral acts. 

Macroanalytic coding systems focus on capturing overarching themes, global qualities, or general patterns of behavior rather than specific, discrete actions.

For example, a macroanalytic coding system may rate the overall degree of therapist warmth or level of client engagement globally for an entire therapy session, requiring the coders to summarize and infer these constructs across the interaction rather than coding smaller behavioral units.

These systems require observers to make more inferences (more time-consuming) but can better capture contextual factors, stability over time, and the interdependent nature of behaviors (Carlson & Grotevant, 1987).

Examples of Macroanalytic Coding Systems:

  • Emotional Availability Scales (EAS): This system assesses the quality of emotional connection between caregivers and children across dimensions like sensitivity, structuring, non-intrusiveness, and non-hostility.
  • Classroom Assessment Scoring System (CLASS): Evaluates the quality of teacher-student interactions in classrooms across domains like emotional support, classroom organization, and instructional support.

Microanalytic coding systems

Microanalytic coding systems involve rating behaviors using smaller, more discrete coding units and categories.

These systems focus on capturing specific, discrete behaviors or events as they occur moment-to-moment. Behaviors are often coded second-by-second or in very short time intervals.

For example, a microanalytic system may code each instance of eye contact or head nodding during a therapy session. These systems code specific, molecular behaviors as they occur moment-to-moment rather than summarizing actions over longer periods.

Microanalytic systems require less inference from coders and allow for analysis of behavioral contingencies and sequential interactions between therapist and client. However, they are more time-consuming and expensive to implement than macroanalytic approaches.
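
As a sketch of the sequential analysis such moment-to-moment codes permit, the example below counts which client behavior most often follows each therapist behavior. The behavior labels and the event stream are invented.

```python
from collections import defaultdict

# Invented microanalytic event stream: (speaker, behavior) in order of occurrence.
stream = [("therapist", "open question"), ("client", "elaboration"),
          ("therapist", "reflection"),    ("client", "elaboration"),
          ("therapist", "open question"), ("client", "topic change"),
          ("therapist", "reflection"),    ("client", "elaboration")]

# Count behavioral contingencies: which client behavior follows each therapist behavior.
transitions = defaultdict(lambda: defaultdict(int))
for (speaker1, behavior1), (speaker2, behavior2) in zip(stream, stream[1:]):
    if speaker1 == "therapist" and speaker2 == "client":
        transitions[behavior1][behavior2] += 1

for therapist_behavior, following in transitions.items():
    print(therapist_behavior, dict(following))
# open question {'elaboration': 1, 'topic change': 1}
# reflection {'elaboration': 2}
```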

Examples of Microanalytic Coding Systems:

  • Facial Action Coding System (FACS): Codes minute facial muscle movements to analyze emotional expressions.
  • Specific Affect Coding System (SPAFF): Used in marital interaction research to code specific emotional behaviors.
  • Noldus Observer XT: A software system that allows for detailed coding of behaviors in real-time or from video recordings.

Mesoanalytic coding systems

Mesoanalytic coding systems attempt to balance macro- and micro-analytic approaches.

In contrast to macroanalytic systems that summarize behaviors in larger chunks, mesoanalytic systems use medium-sized coding units that target more specific behaviors or interaction sequences (Bakeman & Quera, 2017).

For example, a mesoanalytic system may code each instance of a particular type of therapist statement or client emotional expression. However, mesoanalytic systems still use larger units than microanalytic approaches that code every speech onset/offset.

The goal of balancing specificity and feasibility makes mesoanalytic systems well-suited for many research questions (Morris et al., 2014). Mesoanalytic codes can preserve some sequential information while remaining efficient enough for studies with adequate but limited resources.

For instance, a mesoanalytic couple interaction coding system could target key behavior patterns like validation sequences without coding turn-by-turn speech.

In this way, mesoanalytic coding allows reasonable reliability and specificity without requiring extensive training or observation. The mid-level focus offers a pragmatic compromise between depth and breadth in analyzing interactions.

Examples of Mesoanalytic Coding Systems:

  • Feeding Scale for Mother-Infant Interaction: Assesses feeding interactions in 5-minute episodes, coding specific behaviors and overall qualities.
  • Couples Interaction Rating System (CIRS): Codes specific behaviors and rates overall qualities in segments of couple interactions.
  • Teaching Styles Rating Scale: Combines frequency counts of specific teacher behaviors with global ratings of teaching style in classroom segments.

Preventing Coder Drift

Coder drift is a source of measurement error caused by gradual shifts in how observations are rated relative to the operational definitions, especially when behavioral codes are not clearly specified.

This type of error creeps in when coders fail to regularly review what does and does not constitute the behaviors being measured.

Preventing drift refers to taking active steps to maintain consistency and minimize changes or deviations in how coders rate or evaluate behaviors over time. Specifically, some key ways to prevent coder drift include:
  • Operationalize codes : It is essential that code definitions unambiguously distinguish what interactions represent instances of each coded behavior. 
  • Ongoing training : Returning to those operational definitions through ongoing training serves to recalibrate coder interpretations and reinforce accurate recognition. Regular “check-in” sessions in which coders practice coding the same interactions make it possible to monitor whether they continue applying codes reliably, without gradual shifts in interpretation.
  • Using reference videos : Having coders periodically code the same “gold standard” reference videos anchors their judgments and recalibrates them against the original training. Without periodic anchoring to the original specifications, coder decisions tend to drift from initial measurement reliability.
  • Assessing inter-rater reliability : Statistically tracking that coders maintain high levels of agreement over the course of a study, not just at the start, flags any declines that indicate drift. Sustaining inter-rater agreement requires mitigating this common tendency for observer judgments to change during intensive, long-term coding tasks.
  • Recalibrating through discussion : Meetings in which coders discuss disagreements openly help uncover why judgments may be shifting over time and restore consensus on how codes should be applied.
  • Adjusting unclear codes : If reliability issues persist, revisiting and refining ambiguous code definitions or anchors can eliminate inconsistencies arising from coder confusion.

Essentially, the goal of preventing coder drift is to maintain standardization and minimize the unintentional biases that can slowly alter how observational data are rated over long periods of coding.

Through the upkeep of skills, continuing calibration to benchmarks, and monitoring consistency, researchers can notice and correct for any creeping changes in coder decision-making over time.
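As a rough illustration of this kind of monitoring, the sketch below (hypothetical codes, with an assumed 80% agreement threshold) tracks each check-in's agreement with a gold-standard coding and flags sessions where drift may be emerging.

```python
# Illustrative sketch (hypothetical data): tracking each coder's agreement with
# a "gold standard" reference coding at periodic check-ins to flag drift.

gold = ["praise", "neutral", "criticism", "praise", "neutral", "praise"]

checkins = {           # the same reference clip coded at three check-ins
    "week_1": ["praise", "neutral", "criticism", "praise", "neutral", "praise"],
    "week_4": ["praise", "neutral", "criticism", "neutral", "neutral", "praise"],
    "week_8": ["praise", "praise", "criticism", "neutral", "praise", "praise"],
}

THRESHOLD = 0.80  # retrain / recalibrate below this level of agreement

for week, codes in checkins.items():
    agreement = sum(g == c for g, c in zip(gold, codes)) / len(gold)
    flag = "" if agreement >= THRESHOLD else "  <-- possible drift, recalibrate"
    print(f"{week}: agreement with gold standard = {agreement:.2f}{flag}")
```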

Reducing Observer Bias

Observational research is prone to observer biases resulting from coders’ subjective perspectives shaping the interpretation of complex interactions (Burghardt et al., 2012). When coding, personal expectations may unconsciously influence judgments. However, rigorous methods exist to reduce such bias.

Coding Manual

A detailed coding manual minimizes subjectivity by clearly defining what behaviors and interaction dynamics observers should code (Bakeman & Quera, 2011).

High-quality manuals have strong theoretical and empirical grounding, laying out explicit coding procedures and providing rich behavioral examples to anchor code definitions (Lindahl, 2001).

Clear delineation of the frequency, intensity, duration, and type of behaviors constituting each code facilitates reliable judgments and reduces ambiguity for coders. Without clarity on how codes translate to observable interaction, application risks becoming inconsistent across raters.

Coder Training

Competent coders require both interpersonal perceptiveness and scientific rigor (Wampler & Harper, 2014). Training thoroughly reviews the theoretical basis for coded constructs and teaches the coding system itself.

Multiple “gold standard” criterion videos demonstrate code ranges that trainees independently apply. Coders then meet weekly to establish reliability, reaching 80% or higher agreement both among themselves and with the master criterion coding (Hill & Lambert, 2004).

Ongoing training manages coder drift over time. Revisions to unclear codes may also improve reliability. Both careful selection and investment in rigorous training increase quality control.

Blind Methods

To prevent bias, coders should remain unaware of specific study predictions and of participant details that could color their judgments (Burghardt et al., 2012). Using separate teams for data gathering and coding helps maintain this blinding.

In addition, scheduling procedures can prevent coders from rating data collected from participants with whom they have had personal contact. Maintaining coder independence and blinding enhances objectivity.

Data Analysis Approaches

Data analysis in behavioral observation aims to transform raw observational data into quantifiable measures that can be statistically analyzed.

It’s important to note that the choice of analysis approach is not arbitrary but should be guided by the research questions, study design, and nature of the data collected.

Interval data (where behavior is recorded at fixed time points), event data (where the occurrence of behaviors is noted as they happen), and timed-event data (where both the occurrence and duration of behaviors are recorded) may require different analytical approaches.

Similarly, the level of measurement (categorical, ordinal, or continuous) will influence the choice of statistical tests.

Researchers typically start with simple descriptive statistics to get a feel for their data before moving on to more complex analyses. This stepwise approach allows for a thorough understanding of the data and can often reveal unexpected patterns or relationships that merit further investigation.

Simple Descriptive Statistics

Descriptive statistics give an overall picture of behavior patterns and are often the first step in analysis.
  • Frequency counts tell us how often a particular behavior occurs, while rates express this frequency in relation to time (e.g., occurrences per minute).
  • Duration measures how long behaviors last, offering insight into their persistence or intensity.
  • Probability calculations indicate the likelihood of a behavior occurring under certain conditions, and relative frequency or duration statistics show the proportional occurrence of different behaviors within a session or across the study.

These simple statistics form the foundation of behavioral analysis, providing researchers with a broad picture of behavioral patterns. 

They can reveal which behaviors are most common, how long they typically last, and how they might vary across different conditions or subjects.

For instance, in a study of classroom behavior, these statistics might show how often students raise their hands, how long they typically stay focused on a task, or what proportion of time is spent on different activities.
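A minimal sketch of these calculations is shown below, using invented timed-event records from a hypothetical 10-minute classroom observation; the behavior names and values are illustrative only.

```python
# Hypothetical timed-event data: (behavior, onset_s, offset_s) for one
# 10-minute classroom observation. Names and values are illustrative only.
from collections import defaultdict

session_length_s = 600
events = [
    ("hand_raise", 12, 14), ("on_task", 0, 180), ("hand_raise", 200, 202),
    ("on_task", 210, 420), ("off_task", 420, 480), ("on_task", 480, 600),
]

freq, duration = defaultdict(int), defaultdict(float)
for behavior, onset, offset in events:
    freq[behavior] += 1
    duration[behavior] += offset - onset

for b in freq:
    rate_per_min = freq[b] / (session_length_s / 60)   # occurrences per minute
    rel_duration = duration[b] / session_length_s      # proportion of session
    print(f"{b}: n={freq[b]}, rate={rate_per_min:.2f}/min, "
          f"duration={duration[b]:.0f}s, relative duration={rel_duration:.2f}")
```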

Contingency Analyses

Contingency analyses help identify if certain behaviors tend to occur together or in sequence.
  • Contingency tables, also known as cross-tabulations, display the co-occurrence of two or more behaviors, allowing researchers to see if certain behaviors tend to happen together.
  • Odds ratios provide a measure of the strength of association between behaviors, indicating how much more likely one behavior is to occur in the presence of another.
  • Adjusted residuals in these tables can reveal whether the observed co-occurrences are significantly different from what would be expected by chance.

For example, in a study of parent-child interactions, contingency analyses might reveal whether a parent’s praise is more likely to follow a child’s successful completion of a task, or whether a child’s tantrum is more likely to occur after a parent’s refusal of a request.

These analyses can uncover important patterns in social interactions, learning processes, or behavioral chains.
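The sketch below illustrates these calculations on an invented 2×2 table of parental praise following child task outcomes; the counts are placeholders, and adjusted residuals beyond roughly ±1.96 would suggest that a cell departs from what chance alone would predict.

```python
# Hedged sketch: a 2x2 contingency table of hypothetical parent-child data,
# with an odds ratio and adjusted residuals. Counts are invented for illustration.
import numpy as np

#                 praise follows   no praise follows
# task success  [      40,               10        ]
# task failure  [      15,               35        ]
table = np.array([[40, 10], [15, 35]], dtype=float)

odds_ratio = (table[0, 0] * table[1, 1]) / (table[0, 1] * table[1, 0])

n = table.sum()
row = table.sum(axis=1, keepdims=True)
col = table.sum(axis=0, keepdims=True)
expected = row @ col / n
# Adjusted (standardized) residuals: values beyond ~|1.96| suggest a cell
# departs from independence more than chance would predict.
adj_resid = (table - expected) / np.sqrt(expected * (1 - row / n) * (1 - col / n))

print(f"odds ratio = {odds_ratio:.2f}")
print("adjusted residuals:\n", np.round(adj_resid, 2))
```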

Sequential Analyses

Sequential analyses are crucial for understanding processes and temporal relationships between behaviors.
  • Lag sequential analysis looks at the likelihood of one behavior following another within a specified number of events or time units.
  • Time-window sequential analysis examines whether a target behavior occurs within a defined time frame after a given behavior.

These methods are particularly valuable for understanding processes that unfold over time, such as conversation patterns, problem-solving strategies, or the development of social skills.
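As a simple illustration, the following sketch computes lag-1 transition probabilities from a short, invented stream of coded events; the codes and the resulting probabilities are hypothetical.

```python
# Minimal lag-1 sequential analysis sketch on a hypothetical coded event stream:
# how often does a "target" behavior immediately follow a "given" behavior?
from collections import Counter

stream = ["question", "answer", "praise", "question", "answer", "praise",
          "question", "off_task", "redirect", "question", "answer", "praise"]

pairs = Counter(zip(stream[:-1], stream[1:]))          # (given, target) at lag 1
given_totals = Counter(stream[:-1])

for (given, target), count in sorted(pairs.items()):
    p = count / given_totals[given]                    # P(target at t+1 | given at t)
    print(f"P({target} | {given}) = {p:.2f}  (n={count})")
```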

Observer Agreement

Since human observers often code behaviors, it’s important to check reliability. This is typically done through measures of observer agreement.
  • Cohen’s kappa is commonly used for categorical data, providing a measure of agreement between observers that accounts for chance agreement.
  • Intraclass correlation coefficient (ICC) is used for continuous data or ratings.

Good observer agreement is crucial for the validity of the study, as it demonstrates that the observed behaviors are consistently identified and coded across different observers or time points.
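For concreteness, the sketch below computes Cohen's kappa by hand for two observers' categorical codes; the codes are invented, and in practice researchers typically rely on established statistical packages.

```python
# Sketch of Cohen's kappa for two observers' categorical codes (hypothetical data).
from collections import Counter

obs1 = ["A", "A", "B", "B", "A", "C", "B", "A", "C", "A"]
obs2 = ["A", "B", "B", "B", "A", "C", "B", "A", "B", "A"]

n = len(obs1)
p_observed = sum(a == b for a, b in zip(obs1, obs2)) / n

# Chance agreement: product of each observer's marginal proportions, summed over codes
c1, c2 = Counter(obs1), Counter(obs2)
p_chance = sum((c1[k] / n) * (c2[k] / n) for k in set(obs1) | set(obs2))

kappa = (p_observed - p_chance) / (1 - p_chance)
print(f"observed agreement = {p_observed:.2f}, kappa = {kappa:.2f}")
```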

Advanced Statistical Approaches

As researchers delve deeper into their data, they often employ more advanced statistical techniques.
  • Group comparisons such as ANOVA can test whether behavior differs across conditions or populations. For instance, an ANOVA might reveal differences in the frequency of aggressive behaviors between children from different socioeconomic backgrounds or in different school settings.
  • Multilevel (hierarchical) models allow researchers to account for dependencies in the data and to examine how behaviors might be influenced by factors at different levels (e.g., individual characteristics, group dynamics, and situational factors).
  • Time series analysis can reveal trends, cycles, or patterns in behavior over time that might not be apparent from simpler analyses. For instance, in a study of animal behavior, time series analysis might uncover daily or seasonal patterns in feeding, mating, or territorial behaviors.

Representation Techniques

Representation techniques help organize and visualize data:
  • Many researchers use a code-unit grid, which represents the data as a matrix with behaviors as rows and time units as columns. This format facilitates many types of analyses and allows for easy visualization of behavioral patterns (a minimal sketch of such a grid appears below).
  • Standardized formats like the Sequential Data Interchange Standard (SDIS) help ensure consistency in data representation across studies and facilitate the use of specialized analysis software.

Indeed, the complexity of behavioral observation data often necessitates the use of specialized software tools. Programs like GSEQ, Observer, and INTERACT are designed specifically for the analysis of observational data and can perform many of the analyses described above efficiently and accurately.
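The following sketch shows one way such a code-unit grid might be built, with hypothetical behaviors and onset–offset times; it illustrates the idea rather than any particular software's data format.

```python
# Illustrative code-unit grid: behaviors as rows, 1-second time units as columns.
# Data and behavior names are hypothetical.
import numpy as np

session_length_s = 30
behaviors = ["smile", "vocalize", "gaze_at_partner"]
onsets_offsets = {                      # (onset_s, offset_s) intervals per behavior
    "smile": [(2, 6), (20, 24)],
    "vocalize": [(5, 8)],
    "gaze_at_partner": [(0, 10), (18, 27)],
}

grid = np.zeros((len(behaviors), session_length_s), dtype=bool)
for i, b in enumerate(behaviors):
    for onset, offset in onsets_offsets[b]:
        grid[i, onset:offset] = True

# Row sums recover total duration; column-wise logic finds co-occurrence.
print(dict(zip(behaviors, grid.sum(axis=1))))          # seconds per behavior
print("smile while gazing:", (grid[0] & grid[2]).sum(), "s")
```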

References

Bakeman, R., & Quera, V. (2017). Sequential analysis and observational methods for the behavioral sciences. Cambridge University Press.

Burghardt, G. M., Bartmess-LeVasseur, J. N., Browning, S. A., Morrison, K. E., Stec, C. L., Zachau, C. E., & Freeberg, T. M. (2012). Minimizing observer bias in behavioral studies: A review and recommendations. Ethology, 118(6), 511–517.

Hill, C. E., & Lambert, M. J. (2004). Methodological issues in studying psychotherapy processes and outcomes. In M. J. Lambert (Ed.), Bergin and Garfield’s handbook of psychotherapy and behavior change (5th ed., pp. 84–135). Wiley.

Lindahl, K. M. (2001). Methodological issues in family observational research. In P. K. Kerig & K. M. Lindahl (Eds.), Family observational coding systems: Resources for systemic research (pp. 23–32). Lawrence Erlbaum Associates.

Mehl, M. R., Robbins, M. L., & Deters, F. G. (2012). Naturalistic observation of health-relevant social processes: The electronically activated recorder methodology in psychosomatics. Psychosomatic Medicine, 74(4), 410–417.

Morris, A. S., Robinson, L. R., & Eisenberg, N. (2014). Applying a multimethod perspective to the study of developmental psychology. In H. T. Reis & C. M. Judd (Eds.), Handbook of research methods in social and personality psychology (2nd ed., pp. 103–123). Cambridge University Press.

Smith, J. A., Maxwell, S. D., & Johnson, G. (2014). The microstructure of everyday life: Analyzing the complex choreography of daily routines through the automatic capture and processing of wearable sensor data. In B. K. Wiederhold & G. Riva (Eds.), Annual Review of Cybertherapy and Telemedicine 2014: Positive Change with Technology (Vol. 199, pp. 62–64). IOS Press.

Traniello, J. F., & Bakker, T. C. (2015). The integrative study of behavioral interactions across the sciences. In T. K. Shackelford & R. D. Hansen (Eds.), The evolution of sexuality (pp. 119–147). Springer.

Wampler, K. S., & Harper, A. (2014). Observational methods in couple and family assessment. In H. T. Reis & C. M. Judd (Eds.), Handbook of research methods in social and personality psychology (2nd ed., pp. 490–502). Cambridge University Press.


Case study: Methods and observations of overwintering Eptesicus fuscus with White-Nose Syndrome in Ohio, USA

Research output: Contribution to journal › Article › peer-review

Simonis, M. C., Crow, R. A., & Rúa, M. A. (2018). Case study: Methods and observations of overwintering Eptesicus fuscus with White-Nose Syndrome in Ohio, USA. Journal of Wildlife Rehabilitation, 38(3), 11–16. ISSN 1071-2232. © 2018 International Wildlife Rehabilitation Council.

Keywords: Eptesicus fuscus; Pseudogymnoascus destructans; White-Nose Syndrome; Wildlife disease; Wildlife rehabilitation

ASJC Scopus subject areas: Animal Science and Zoology; General Veterinary

Links: publication in Scopus (http://www.scopus.com/inward/record.url?scp=85060529383&partnerID=8YFLogxK); citations in Scopus (http://www.scopus.com/inward/citedby.url?scp=85060529383&partnerID=8YFLogxK)

Comparative case study on NAMs: towards enhancing specific target organ toxicity analysis

  • Regulatory Toxicology
  • Open access
  • Published: 29 August 2024


Kristina Jochum, Andrea Miccoli, Cornelia Sommersdorf, Oliver Poetz, Albert Braeuning, Tewes Tralau & Philip Marx-Stoelting (ORCID: orcid.org/0000-0002-6487-2153)

Traditional risk assessment methodologies in toxicology have relied upon animal testing, despite concerns regarding interspecies consistency, reproducibility, costs, and ethics. New Approach Methodologies (NAMs), including cell culture and multi-level omics analyses, hold promise by providing mechanistic information rather than assessing organ pathology. However, NAMs face limitations, such as the lack of a whole organism and restricted toxicokinetic interactions. This is an inherent challenge when it comes to the use of omics data from in vitro studies for the prediction of organ toxicity in vivo. One solution in this context is comparative in vitro–in vivo studies, as they allow for a more detailed assessment of the transferability of the respective NAM data. Hence, hepatotoxic and nephrotoxic pesticide active substances were tested in human cell lines and the results subsequently related to the biology underlying established effects in vivo. To this end, substances were tested in HepaRG and RPTEC/tERT1 cells at non-cytotoxic concentrations and analyzed for effects on the transcriptome and parts of the proteome using quantitative real-time PCR arrays and multiplexed microsphere-based sandwich immunoassays, respectively. Transcriptomics data were analyzed using three bioinformatics tools. Where possible, in vitro endpoints were connected to in vivo observations. Targeted protein analysis revealed various affected pathways, with generally fewer effects present in RPTEC/tERT1. The strongest transcriptional impact was observed for Chlorotoluron in HepaRG cells (increased CYP1A1 and CYP1A2 expression). A comprehensive comparison of early cellular responses with data from in vivo studies revealed that transcriptomics outperformed targeted protein analysis, correctly predicting up to 50% of in vivo effects.


Introduction

Given the at times heated discussions about regulatory toxicology in the political and public domain, the quite remarkable track record of toxicological health protection sometimes tends to go unnoticed. Not only are chemical scares such as the chemically induced massive acute health impacts of the 1950s, 1960s and 1970s a thing of the past (Herzler et al. 2021), but in many parts of the world, there are now regulatory frameworks in place which aim at the early identification of potential health risks from chemicals. Within Europe, the most notable in terms of impact are probably REACH (EC 2006) and the regulations on pesticides (EC 2009), both of which still overwhelmingly rely on animal data for their risk assessments. This has manifold reasons, one being the historical reliability of animal-based systems for the prediction of adversity in humans. However, there are a number of challenges to this traditional approach. These comprise capacity issues when it comes to the testing of thousands of new or hitherto untested substances, the testing of mixtures, the ever-daunting question of species specificity, and the limitation of current in vivo studies regarding less accessible endpoints such as immunotoxicity or developmental neurotoxicity.

Over recent years, so-called New Approach Methodologies (NAMs) have thus attracted increased attention and importance for regulatory toxicology. The United States Environmental Protection Agency (US EPA 2018 ) defines NAM as ‘…a broadly descriptive reference to any technology, methodology, approach, or combination thereof that can be used to provide information on chemical hazard and risk assessment that avoids the use of intact animals… ’. One instance of an attempt to replace an animal test with an in vitro test system is the embryonic stem cell test in the area of developmental toxicology (Buesen et al. 2004 ; Seiler et al. 2006 ). This stand-alone test was first evaluated for assessing the embryotoxic potential of chemicals as early on as 2004 (Genschow et al. 2004 ). While its establishment as a regulatory prediction model took several more years, one major outcome was the realization that the use of NAMs in general is greatly improved when used as part of a biologically and toxicologically meaningful testing battery (Marx-Stoelting et al. 2009 ; Schenk et al. 2010 ). It should be noted that despite all the potential of such testing batteries a tentative one to one replacement of animal studies is neither practical nor straight forward. The reason is not only the complexity of the endpoints in question but also practical constraints. This was recently exemplified by Landsiedel et al. who pointed out that with the number of different organs and tissues tested during one sub-chronic rodent study, and assuming that 5 NAMs are needed to address the adverse outcomes in any of those organs, it would take decades just to replace this one study. Any regulatory use of NAMs should hence preferably rely on their direct use (Landsiedel et al. 2022 ).

An example from the field of hepatotoxicity testing is the in vitro toolbox for steatosis that was developed by Luckert et al. ( 2018 ) based on the adverse outcome pathway (AOP) concept by Vinken ( 2015 ). The authors employed five assays covering relevant key events from the AOP in HepaRG cells after incubation with the test substance Cyproconazole. Concomitantly, transcript and protein marker patterns for the identification of steatotic compounds were established in HepaRG cells (Lichtenstein et al. 2020 ). The findings were subsequently brought together in a proposed protocol for AOP-based analysis of liver steatosis in vitro (Karaca et al. 2023a ).

One promising use for such cell-based systems is their combination with multi-level omics. In conjunction with sufficient biological and mechanistic knowledge, the wealth of information provided by multi-omics data should potentially allow some prediction of substance-induced adversity. That said, any such prediction can of course only be reliable within the established limits of such systems, such as the lack of a whole organism, incomplete toxicokinetics, and restrictions on adequately capturing the effects of long-term exposure (Schmeisser et al. 2023). Regulatory use of and trust in cell-based systems will, therefore, strongly rely on how they compare to the outcome of studies based on systemic data (Schmeisser et al. 2023).

Pesticide active substances are a group of compounds with profound in vivo data. Some examples for active substances commonly used in PPPs are the fungicides Cyproconazole, Fluxapyroxad, Azoxystrobin and Thiabendazole, as well as the herbicide Chlorotoluron and the multi-purpose substance 2-Phenylphenol. For these compounds, several short- and long-term studies in rodents have been conducted and multiple adverse effects in target organs like liver or kidneys were observed (see Table  1 ). Liver steatosis, as one potential adverse health outcome, has been associated with triazole fungicides, such as Cyproconazole, but other active substances such as Azoxystrobin are suspected to interfere with the lipid metabolism as well (Gao et al. 2014 ; Luckert et al. 2018 ). Potential modes of action for adverse effects include the activation of nuclear receptors, such as the constitutive androstane receptor (CAR), which has been shown for Cyproconazole and Fluxapyroxad (Marx-Stoelting et al. 2017 ; Tamura et al. 2013 ; Zahn et al. 2018 ). Notably, even when an active substance is considered to be of low acute toxicity, e.g. Chlorotoluron, Thiabendazole and 2-Phenylphenol (EC 2015 ; US EPA 2002 ; WHO 1996 ), they might still exhibit adverse chronic effects (Mizutani et al. 1990 ; WHO 1996 ). This is the reason why pesticide active substances and plant protection products (PPP) are assessed extensively before their placing on the market (EC 2009 ).

The target organs most frequently affected by pesticide active ingredients are the liver and kidneys (Nielsen et al. 2012 ). Hence, an in vitro test system aimed at the prediction of pesticide organ toxicity should be able to model effects on these two target organs. One of the best options currently available for hepatotoxicity studies in vitro is the cell line HepaRG (Ashraf et al. 2018 ). Before their use in toxicological assays, the cells undergo a differentiation process resulting in CYP-dependent activities close to the levels in primary human hepatocytes (Andersson et al. 2012 ; Hart et al. 2010 ). They also feature the capability to induce or inhibit a variety of CYP enzymes (Antherieu et al. 2010 ; Hartman et al. 2020 ) and the expression of phase II enzymes, membrane transporters and transcription factors (Aninat et al. 2006 ). Antherieu et al. ( 2012 ) demonstrated that HepaRG cells can sustain various types of chemically induced hepatotoxicity following acute and repeated exposure. Hence, HepaRG cells have the potential to replace the use of primary human hepatocytes in the study of acute and chronic effects of xenobiotics in the liver. In 2012, the European Commission Joint Research Centre’s European Union Reference Laboratory for Alternatives to Animal Testing (EURL ECVAM) coordinated a validation study finding differentiated HepaRG cells as a reliable and relevant tool for CYP enzyme activity studies (EURL ECVAM 2012 ). This led to the proposal of a respective draft test guideline by the OECD in 2019 (OECD 2019 ). Additionally, as part of the US EPA Tox21 project, HepaRG cells were used for an assay assessing toxicogenomics (Franzosa et al. 2021 ).

A promising test system for investigations of nephrotoxicity is the tERT1 immortalized renal proximal tubular epithelial cell line RPTEC/tERT1 (further referred to as RPTEC). These non-cancerous cells have been found to closely resemble their primary counterparts, showing typical morphology and functionality (Shah et al. 2017; Wieser et al. 2008). Aschauer et al. (2015) demonstrated the applicability of RPTEC for the investigation of repeated-dose nephrotoxicity using a transcriptomic-based approach. Simon et al. (2014) showed similar toxicological responses of RPTEC and the target tissue to exposure to benzo[a]pyrene and cadmium. Taken together, these findings suggest that RPTEC can be a useful tool for toxicological studies.

In the present study, six pesticide active substances were analyzed in two cell lines, namely the liver cell line HepaRG and the kidney cell line RPTEC. Assays were performed following exposure to the highest non-cytotoxic concentration and comprised targeted protein and transcriptomics analysis. Triggered pathways were identified and compared with established results from in vivo experiments.

Materials and methods

All test substances were purchased in analytical grade (purity ≥ 98.0%) from Sigma-Aldrich, Pestanal® (Taufkirchen, Germany): Cyproconazole, CAS no. 94361–06-5, catalog no. 46068, batch no. BCCD4066; Fluxapyroxad, CAS no. 907204–31-3, catalog no. 37047, batch no. BCCF6749; Azoxystrobin, CAS no. 131860–33-8, catalog no. 31697, batch no. BCCF6593; Chlorotoluron, CAS no. 15545–48-9, catalog no. 45400, batch no. BCBW1414; Thiabendazole, CAS no. 148–79-8, catalog no. 45684, batch no. BCBV5436; 2-Phenylphenol, CAS no. 90–43-7, catalog no. 45529, batch no. BCCF1784. William’s E medium, fetal calf serum (FCS) good forte (catalog no. P40-47500, batch no. P131102), recombinant human insulin and l -glutamine were acquired from PAN-Biotech GmbH (Aidenbach, Germany), FCS superior (catalog no. S0615, batch no. 0001659021) from Bio&Sell (Feucht bei Nürnberg, Germany). Dimethyl sulfoxide (DMSO, purity ≥ 99.8%), hydrocortisone-hemisuccinate (HC/HS), hydrocortisone, epidermal growth factor (EGF) and neutral red (NR) were purchased from Sigma-Aldrich (Taufkirchen, Germany). Dulbecco’s modified eagle medium (DMEM) and Ham’s F Nutrition mix were obtained from Gibco® Life Technologies (Karlsruhe, Germany), trypsin–EDTA, Penicillin–Streptomycin and insulin-transferrin-selenium from Capricorn Scientific GmbH (Ebsdorfergrund, Germany).

Cell culture

HepaRG cells were obtained from Biopredic International (Sant Grégoire, France) and kept in 75 cm 2 flasks under humid conditions at 37 °C and 5% CO 2 . Cells were grown in proliferation medium consisting of William’s E medium with 2 mM l -glutamine, supplemented with 10% FCS good forte, 100 U mL −1 penicillin, 100 µg mL −1 streptomycin, 0.05% human insulin and 50 µM HC/HS for 2 weeks. Then, HepaRG cells were passaged using trypsin–EDTA solution and seeded in 75 cm 2 flasks, 6-well, 12-well and 96-well plates at a density of 20 000 cells per cm 2 . Cells in cell culture dishes were maintained in proliferation medium for another 2 weeks before the medium was changed to differentiation medium (i.e., proliferation medium supplemented by 1.7% DMSO) and cells were cultured for another 2 weeks. Thereafter, cells were used in experiments within 4 weeks, while media was changed to treatment media (i.e., proliferation media supplemented by 0.5% DMSO and 2% FCS) 2 days prior to the experiments.

The RPTEC cell line was obtained from Evercyte GmbH (Vienna, Austria) and cultivated as previously described (Aschauer et al. 2013 ; Wieser et al. 2008 ). Cells were grown in a 1:1 mixture of DMEM and Ham’s F-12 Nutrient Mix, supplemented with 2.5% FCS superior, 100 U mL −1 penicillin, 100 µg mL −1 streptomycin, 2 mM l -glutamine, 36 ng mL −1 hydrocortisone, 10 ng mL −1 EGF, 5 µg mL −1 insulin, 5 µg mL −1 transferrin and 5 ng mL −1 selenium. RPTEC were cultivated in 75 cm 2 flasks until they reached near confluence. Then, cells were passaged using trypsin–EDTA and seeded at 30% density in 75 cm 2 flasks for further sub-cultivation and 6-well, 12-well and 96-well plates for experiments. To obtain complete differentiation, cells in cell culture dishes were maintained for 14 days before they were used in experiments.

Test concentrations

All substances were dissolved in DMSO and diluted in the respective medium to a final DMSO concentration of 0.5% before incubation. HepaRG treatment medium and 0.5% DMSO in RPTEC medium served as solvent controls for HepaRG cells and RPTEC, respectively. At least 3 biological replicates, i.e., independent experiments, were performed for each assay.

Cell viability

Cell viability was investigated with the WST-1 assay (Immunservice, Hamburg, Germany), according to the manufacturer’s protocol and subsequent NR uptake assay according to Repetto et al. ( 2008 ). HepaRG cells and RPTEC were seeded in 96-well plates and incubated with the test substances for 72 h. Triton X-100 (0.01%, Thermo Fisher Scientific, Darmstadt, Germany) was used as positive control for reduced cell viability. At the end of the incubation period, 10 µL WST-1 solution was added to each well and incubated for 30 min at 37 °C. The tetrazolium salt WST-1 is metabolized by cellular mitochondrial dehydrogenases of living cells to a formazan derivative, the absorbance of which was measured at 450 nm with an Infinite M200 PRO plate reader (Tecan, Maennedorf, Switzerland). The reading of each well was related to the absorbance value at the reference wavelength of 620 nm, and blank values were subtracted before the relation to the solvent control.

Afterwards, the NR uptake assay was performed, in which the incorporation of NR into lysosomes of viable cells is measured. One day prior to the assay, NR medium was prepared by diluting a 4 mg mL−1 NR stock solution in PBS 1:100 with the respective cell culture medium for HepaRG cells and RPTEC, and incubated at 37 °C overnight. After the WST-1 measurement, the incubation medium was removed and cells were washed twice with PBS. Subsequently, 100 µL NR medium, previously centrifuged for 10 min at 600 × g, was added and incubated for 2 h. Afterwards, cells were washed twice with PBS, and 100 µL destaining solution (49.5:49.5:1 ethanol absolute, distilled water, glacial acetic acid) per well was added. Plates were shaken at 500 rotations min−1 for 10 min and fluorescence of NR was measured with an Infinite M200 PRO plate reader (Tecan, Maennedorf, Switzerland) at 530 nm excitation and 645 nm emission. The blank value was subtracted from each reading, and the result was normalized to the solvent control.
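As a simplified illustration of this normalization step (ignoring the 620 nm reference-wavelength correction and using invented absorbance values), viability relative to the solvent control might be computed as follows.

```python
# Hedged sketch of the blank correction and solvent-control normalization
# described above; absorbance values are invented for illustration.
blank = 0.06
solvent_control_wells = [1.21, 1.18, 1.25]
treated_wells = [0.98, 1.02, 0.95]

control_mean = sum(a - blank for a in solvent_control_wells) / len(solvent_control_wells)
viability = [100 * (a - blank) / control_mean for a in treated_wells]

print([f"{v:.1f}%" for v in viability])   # e.g., values > 80% = non-cytotoxic here
```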

Multiplexed microsphere-based sandwich immunoassays

Marker proteins and protein modifications were analyzed by Signatope GmbH (Tübingen, Germany) with a multiplexed microsphere-based sandwich immunoassay. Cells were seeded in 6-well plates and incubated with the test substances for 36 and 72 h. Protein extraction was performed by adding 250 µL pre-cooled extraction buffer, supplied by the company, to the cells in each well and subsequent incubation for 30 min at 4 °C. Cell lysates were transferred to 1.5 mL reaction tubes and centrifuged for 30 min at 4 °C and 15 000 ×  g . The supernatant was aliquoted in 60 µL batches and stored at -80 °C until shipment. After thawing, aliquots were directly used and not frozen again. Samples were analyzed for 8 proteins and protein modifications, each representing a marker for a certain form of toxicity (Table  2 ).

Quantitative real-time PCR and PCR profiler arrays

RT-qPCR was conducted to ensure well performing RNA for subsequent PCR profiler arrays. Cells were seeded in 12-well plates and incubated with the test substances for 36 h. RNA extraction was performed with the RNA easy Mini Kit (Qiagen, Venlo, Netherlands) according to the manufacturer’s manual. Yield RNA concentration and purity were analyzed with a Nanodrop spectrometer (NanoDrop 2000, Thermo Fischer Scientific, Darmstadt, Germany) and RNA samples were stored at -80 °C until further use. Reverse transcription to cDNA was conducted using the High-Capacity cDNA Reverse Transcription Kit (Applied Biosystems, Waltham, MA, USA) according to the manufacturer’s protocol with a GeneAmp ® PCR System 9700 (Applied Biosystems, Darmstadt, Germany) and cDNA samples were stored at – 20 °C. RT-qPCR was performed with Maxima SYBR Green/ROX Master Mix (Thermo Fisher Scientific, Darmstadt, Germany) according to manufacturer’s protocol. In brief, 9 µL master mix, consisting of 5 µL Maxima SYBR Green/ROX qPCR Master Mix, 0.6 µL each of forward and reverse primers (2.5 µM) and 2.8 µL nuclease-free water, was added to each well of a 384-well plate. Primer sequences are shown in Online Resource 1. Subsequently, 20 ng cDNA was added to each well to a final volume of 10 µL and RT-qPCR was performed with an ABI 7900HT Fast Real-Time PCR system instrument (Applied Biosystems, Darmstadt, Germany). In brief, activation took place at 95 °C for 15 min, followed by 40 cycles of 15 s at 95 °C and 60 s at 60 °C, followed by 15 min at 60 °C and default melting curve analysis. Data were processed using 7900 software v241 and Microsoft Excel 2021. Threshold cycle (C T ) was set to 0.5, melting curve was checked and manual baseline correction was performed for each gene individually. Yield C T -values were extracted to Microsoft Excel 2021 and relative gene expression was obtained with the 2 −ΔΔCt method according to Livak and Schmittgen ( 2001 ). GUSB and HPRT1 served as endogenous control genes for HepaRG cells, GUSB and GAPDH were used for RPTEC. Primer efficiency was tested beforehand according to Schmittgen and Livak ( 2008 ). Only RNA samples showing amplification in RT-qPCR were used for further analysis with PCR profiler arrays. For quality control purposes, yield 2 −ΔΔCt values from RT-qPCR and PCR profiler arrays were compared and had to be within the same range (Online Resource 1).
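For orientation, the following sketch applies the 2^−ΔΔCt calculation to invented Ct values; the gene and sample labels are placeholders, not data from this study.

```python
# Minimal sketch of the 2^-ddCt calculation (Livak & Schmittgen 2001) with
# invented Ct values; GeneX and the reference gene are placeholders.
ct = {
    "treated": {"GeneX": 24.1, "reference": 18.0},
    "control": {"GeneX": 26.3, "reference": 18.2},
}

d_ct_treated = ct["treated"]["GeneX"] - ct["treated"]["reference"]
d_ct_control = ct["control"]["GeneX"] - ct["control"]["reference"]
dd_ct = d_ct_treated - d_ct_control
fold_change = 2 ** (-dd_ct)

print(f"ddCt = {dd_ct:.2f}, relative expression = {fold_change:.2f}-fold")
```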

For performing the PCR profiler array, cDNA was synthesized from 1 µg RNA using the RT 2 First Strand Kit (Qiagen, Venlo, Netherlands) according to the manufacturer’s protocol with a GeneAmp® PCR System 9700 (Applied Biosystems, Darmstadt, Germany). Subsequently, the RT 2 Profiler™ PCR Array Human Molecular Toxicology Pathway Finder or Nephrotoxicity (Qiagen, Venlo, Netherlands) was conducted with RT 2 SYBR ® Green ROX qPCR Mastermix (Qiagen, Venlo, Netherlands) according to the manufacturer’s protocol. RT-qPCR was performed with an ABI 7900HT Fast Real-Time PCR system instrument (Applied Biosystems, Darmstadt, Germany), where activation of polymerase took place for 10 min at 95 °C, followed by 40 cycles of 15 s at 95 °C and 60 s at 60 °C and default melting curve analysis. Data were analyzed using 7900 software v241 and Excel 2021. C T was set to 0.2, melting curve was checked and manual baseline correction was performed. Yield C T -values were extracted and further analyzed.

Pathway analysis

Further evaluation of PCR array data was performed with functional class scoring methods such as Gene Ontology (GO) enrichment and Kyoto Encyclopedia of Genes and Genomes (KEGG), as well as with the bioinformatics analysis and search tool Ingenuity Pathway Analysis Software (IPA). Following the manufacturer’s instructions, the resulting CT-values were uploaded to the Qiagen Gene Globe Webportal and analyzed using the standard ΔΔCT method referring to an untreated control. A cut-off CT was set to 35, all 5 built-in housekeeping genes were manually selected as reference genes, and their arithmetic mean was used for normalization. Means of fold regulation and p-values were calculated and further evaluated with the bioinformatics tools following the protocol provided in Online Resource 2. The processed results from HepaRG cells and RPTEC were used as input data individually, as well as combined. For the combined analysis, duplicate genes that were present on both arrays were removed.

To generate a first overview, the percentage of differentially expressed genes (DEG) per pathway was determined as previously published (Heise et al. 2018). Genes were assigned to pathways as suggested on the manufacturer’s web page. The percentage of DEG was calculated as the number of genes whose expression differed significantly by a fold change of at least 2, as determined by Student’s t-test (p < 0.05), relative to the total number of genes in the pathway.
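A minimal sketch of this per-pathway summary is given below with hypothetical genes and array results; a twofold change in either direction (fold change ≥ 2 or ≤ 0.5) combined with p < 0.05 is assumed here as the DEG criterion.

```python
# Sketch of the "% differentially expressed genes per pathway" summary described
# above; gene-level results and the pathway assignment are hypothetical.
pathway_genes = {"fatty_acid_metabolism": ["G1", "G2", "G3", "G4", "G5"]}
gene_results = {  # gene -> (fold change, p-value) from the array analysis
    "G1": (2.4, 0.01), "G2": (1.1, 0.40), "G3": (0.4, 0.03),
    "G4": (3.0, 0.20), "G5": (2.2, 0.04),
}

for pathway, genes in pathway_genes.items():
    # Assumed DEG rule: at least a twofold change in either direction, p < 0.05
    deg = [g for g in genes
           if (gene_results[g][0] >= 2 or gene_results[g][0] <= 0.5)
           and gene_results[g][1] < 0.05]
    print(f"{pathway}: {100 * len(deg) / len(genes):.0f}% DEG")
```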

GO enrichment and KEGG analysis

The freely available web tools GOrilla and ShinyGO 0.80 were used for GO enrichment and KEGG analysis, respectively (Eden et al. 2007, 2009; Ge et al. 2020). Detailed protocols are provided in Online Resource 2 together with the R code for determining DEG and background genes (see Data availability), which was adapted from Feiertag et al. (2023).

Ingenuity pathway analysis

In addition to GO enrichment and KEGG analysis, further evaluation of PCR array data was performed with the bioinformatics analysis and search tool IPA (Qiagen, Hilden, Germany, analysis date: Nov. 2023) as previously published (Karaca et al. 2023b ). IPA is a commercial bioinformatics tool for analyzing RNA data, predicting pathway activation and functional interrelations using a curated pathway database. Using Fisher’s exact test, IPA identifies overrepresented pathways by measuring significant overlaps between user-provided gene lists and predefined gene sets. Means of fold regulation and p -values were uploaded to IPA following the protocol provided in Online Resource 2. Cut-off was set to – 1.5 and + 1.5 for fold regulation and 0.05 for the p -value. Fold regulation represents fold change results in a biologically meaningful way. In case the fold change is greater than 1, the fold regulation is equal to the fold change. For fold change values less than 1, the fold regulation is the negative inverse of the fold change. No further filtering was applied and an IPA core analysis was run. One Excel spread sheet per substance was obtained including all predicted diseases or functions annotations, the associated categories, the p-value of overlap as well as the number and names of the DEG found in the respective annotation (Online Resource 3). Predicted effects on other organs than the liver or the kidneys, such as heart or lungs, were discarded. For further comparison with in vivo data only the categories were used, combined with the p-value of the annotation, which was the highest.
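The conversion from fold change to fold regulation, together with the cut-offs used here, can be illustrated as follows; the gene names and values are invented.

```python
# Sketch of the fold-change to fold-regulation conversion described above,
# plus the cut-offs used here (|fold regulation| >= 1.5, p < 0.05); data invented.
def fold_regulation(fold_change: float) -> float:
    return fold_change if fold_change >= 1 else -1 / fold_change

results = {"GENE_A": (4.79, 0.003), "GENE_B": (0.40, 0.01), "GENE_C": (1.2, 0.30)}

for gene, (fc, p) in results.items():
    fr = fold_regulation(fc)
    kept = abs(fr) >= 1.5 and p < 0.05
    print(f"{gene}: fold change={fc}, fold regulation={fr:+.2f}, retained={kept}")
```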

Comparison with animal studies

The data obtained from targeted protein and transcriptomics analyses were compared with known in vivo observations from Draft Assessment Reports (DARs) of the pesticide active substances required for pesticide legislation. To facilitate the comparison of the data, the in vitro data was transformed into a more comprehensible form by applying evaluation matrices as shown in Table  3 .

The in vivo effects attributed to the pesticide active substances were taken from the publication by Nielsen et al. (2012). Additionally, the DARs of the two substances not reported in Nielsen et al. were analyzed and assigned accordingly. All in vivo effects identified by the authors for liver and kidneys can be found in Online Resource 1. Based on expert knowledge, descriptions of in vitro outcomes were combined with in vivo observations (see Tables 4 and 5).

Based on the combination of the in vitro and the in vivo data, it was possible to draw conclusions on the concordance of the predictions. In order to establish optimized thresholds for regarding an effect as in vitro positive, the analyses were performed by considering at least medium effects, strong and very strong effects, or very strong effects only (see Table  3 ) and comparing these to the corresponding in vivo effect. In case multiple in vitro predictors were connected to the same in vivo observation, a positive prediction from one was sufficient to be considered in vitro positive. For protein analyses, the comparison was performed for the data from HepaRG cells and RPTEC individually, as well as combined, where a positive prediction from one of the cell lines was considered sufficient and compared to hepatotoxic and nephrotoxic in vivo effects. For the gene transcription analysis, the categories obtained by IPA were compared to in vivo observations from DARs. A further evaluation integrating protein and transcriptional data was conducted, wherein a positive result from either data type was sufficient to classify a sample as in vitro positive. Online Resource 1 shows the combination of the results in detail. The percentage of concordance between in vitro prediction and in vivo observation was calculated. Indicative concordance was defined as percentage of in vivo positive observations that were predicted to be positive by the in vitro test system.
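The sketch below illustrates the indicative-concordance calculation on invented binary effect labels; it is not the study's data, only the arithmetic described above.

```python
# Hedged sketch of the indicative-concordance calculation defined above:
# the share of in vivo positive findings that the in vitro system also flags.
# The effect labels and values are placeholders, not the study's data.
in_vivo  = {"liver_hypertrophy": 1, "steatosis": 1, "kidney_lesions": 0, "bile_duct": 1}
in_vitro = {"liver_hypertrophy": 1, "steatosis": 0, "kidney_lesions": 0, "bile_duct": 1}

positives = [k for k, v in in_vivo.items() if v == 1]
hits = sum(in_vitro[k] == 1 for k in positives)
indicative_concordance = 100 * hits / len(positives)

# Complementary figure reported in the text: how often an in vivo negative
# is also negative in vitro.
negatives = [k for k, v in in_vivo.items() if v == 0]
neg_agreement = 100 * sum(in_vitro[k] == 0 for k in negatives) / len(negatives)

print(f"indicative concordance = {indicative_concordance:.0f}%")
print(f"agreement on in vivo negatives = {neg_agreement:.0f}%")
```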

Statistical analysis

Statistical analysis was performed using R 4.2.1 and RStudio 2023.09.1 + 494. Data evaluation was done with Microsoft Excel 2021.

All experiments were performed in at least three independent biological replicates. Technical replicates, when applicable, were averaged and subsequently mean and standard deviation values were calculated from biological replicates. For targeted protein analysis, statistical significance was calculated with bootstrap technique using R package boot (Canty and Ripley 2016 ; Davidson and Hinkley 1997 ) to account for the high variability that results when the protein expression is affected. Data visualization was done using ggplot2 package (Wickham 2016 ). Calculation of statistical significance of altered gene transcription was performed using Student’s t -test, and R package ComplexHeatmaps was used for data visualization (Gu 2022 ). All R scripts can be found using the link provided in the Data availability section.
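As a rough illustration of the bootstrap idea (not the authors' R code), one might resample replicate values to obtain a confidence interval, as in the following Python sketch with invented data.

```python
# Illustrative bootstrap sketch (not the authors' R code): resampling replicate
# values to get a confidence interval for a treatment/control ratio. Data invented.
import numpy as np

rng = np.random.default_rng(0)
replicates = np.array([1.9, 2.6, 1.4])      # e.g., normalized protein signal, n = 3

boot_means = np.array([
    rng.choice(replicates, size=replicates.size, replace=True).mean()
    for _ in range(10_000)
])
low, high = np.percentile(boot_means, [2.5, 97.5])

# A crude decision rule: the effect is treated as "significant" if the 95%
# interval excludes the control level of 1.0.
print(f"mean = {replicates.mean():.2f}, 95% bootstrap CI = [{low:.2f}, {high:.2f}]")
```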

Impairment of cell viability

Each substance was tested for its effect on the viability of HepaRG cells and RPTEC. Based on these results, the highest non-cytotoxic concentration was determined and employed in further experiments together with a second concentration (i.e., 0.33 × highest non-cytotoxic concentration). For HepaRG cells, published data were used as a starting point for cytotoxicity testing and confirmed with WST-1 and NR uptake assays. The highest non-cytotoxic concentration, defined as the concentration determining a cell viability greater than 80%, is shown in Table  6 .

For RPTEC, a relatively new cell line, little data was available. At least 3 biological replicates were performed in technical triplicates to determine the highest non-cytotoxic concentrations (Table 6). The bar graphs in Online Resource 4 depict the concentration-dependent course of all tested concentrations per substance, limited by solubility. Online Resource 1 provides a table with calculated approximations of substance concentrations in the target organ at LOAEL or NOAEL level based on in vivo toxicokinetic results from DARs. These approximations can be compared with the selected in vitro concentrations based on cytotoxicity experiments.

Effects on marker proteins

The result from multiplex microsphere-based sandwich immunoassays of treated HepaRG cells and RPTEC are shown in Figs.  1 and 2 , respectively. In HepaRG cells, incubation with the highest non-cytotoxic concentrations of Azoxystrobin, Chlorotoluron and Thiabendazole increased the expression of total LC3B, an indicator of autophagy, after 36 h (all three compounds) and 72 h (Chlorotoluron and Thiabendazole). Strong effects were observed on cleaved PARP, an indicator of apoptosis, after 36 h of incubation with 120 µM Cyproconazole (247 ± 147%) and 300 µM Thiabendazole (359 ± 204%). However, after 72 h incubation with 120 µM Cyproconazole, the level of cleaved PARP was strongly reduced. Expression of HIF 1-alpha, an indicator of hypoxia, was significantly increased after 36 h incubation with 45 µM Azoxystrobin (214 ± 24%). Fluxapyroxad and 2-Phenylphenol did not significantly increase the expression of any of the protein analytes.

Figure 1. Effects on protein abundance and protein modification of key proteins observed in HepaRG cells after 36 and 72 h of incubation with the test substances using a multiplexed microsphere-based sandwich immunoassay panel. Results are shown as means of 3 independent experiments, normalized to solvent controls. Statistical differences to the solvent control were calculated with bootstrapping (*p < 0.05).

Figure 2. Effects on protein abundance and protein modification of key proteins in RPTEC after 36 and 72 h of incubation with the test substances using a multiplexed microsphere-based sandwich immunoassay panel. Results are shown as means of 3 independent experiments, normalized to solvent controls. Statistical differences to the solvent control were calculated with bootstrapping (*p < 0.05).

In RPTEC, the abundance of p-elF4B, involved in eukaryotic translation initiation, was increased after 36 and 72 h incubation with 300 µM Cyproconazole (165 ± 45% and 201 ± 51%, respectively), all conditions of Fluxapyroxad, incubation with 3 µM Azoxystrobin for 36 h (166 ± 56%) and incubation with 900 µM Chlorotoluron for 36 and 72 h (238 ± 59% and 170 ± 44%, respectively). Thiabendazole exposure for 36 h resulted in an increase of cleaved PARP at both tested concentrations. Due to the high standard deviation, these results were not statistically significant.

Comparing the results from HepaRG cells and RPTEC, fewer effects were observed in RPTEC than in HepaRG cells. Effects of Azoxystrobin and Chlorotoluron on p-elF4B were observed in both cell lines, as well as increased levels of cleaved PARP after Thiabendazole exposure; yet these results were only significant in HepaRG cells. 2-Phenylphenol did not increase the expression of any of the tested proteins in either cell line, while Fluxapyroxad only affected p-elF4B in RPTEC.

A graphical representation of all data points from HepaRG and RPTEC including means and standard deviations can be found in Online Resource 4.

Changes at the gene transcription level

Changes at the protein level are often preceded by changes at the gene expression level. These were analyzed by RT 2 Profiler™ PCR arrays. Figures  3 and 4 show the results from HepaRG cells and RPTEC, respectively. The genes included in the array were assigned to certain pathways according to the information provided on the manufacturer’s web page. For data interpretation, the percentage of DEG was calculated. In HepaRG cells, most DEG were observed following the exposure to Chlorotoluron. Overall, genes categorized as CYPs and phase I were predominantly affected. Cyproconazole and Chlorotoluron exerted effects on genes associated with fatty acid metabolism (10 and 55%, respectively). Of all steatosis-associated genes, 47% were altered by Chlorotoluron. With regards to individual genes, the strongest increase was observed for CYP1A1 and CYP1A2 , both in the group of CYPs and phase I, after exposure to Chlorotoluron (479-fold and 57-fold, respectively) and Thiabendazole (330-fold and 215-fold, respectively).

Figure 3. Relative quantities of mRNA transcript levels observed after 36 h exposure of HepaRG cells to non-cytotoxic concentrations of the test substances using the Human Molecular Toxicology Pathway Finder RT2 Profiler™ PCR Array. Data evaluation was performed using the 2−ΔΔCt method, according to Livak and Schmittgen (2001). All target genes were normalized to 5 housekeeping genes. Results are shown as mean of 3 biological replicates and statistical analysis was performed by one sample Student’s t-test (*p < 0.05).

Figure 4. Relative quantities of mRNA transcript levels observed after 36 h exposure of RPTEC to non-cytotoxic concentrations of the test substances using the Human Nephrotoxicity RT2 Profiler™ PCR Array. Data evaluation was performed using the 2−ΔΔCt method, according to Livak and Schmittgen (2001). All target genes were normalized to 5 housekeeping genes. Results are shown as mean of 3 biological replicates and statistical analysis was performed by one sample Student’s t-test (*p < 0.05).

In RPTEC, the cluster encompassing most of the DEG was that associated with regulation of the cell cycle. Here, Cyproconazole, Fluxapyroxad, Azoxystrobin, and Chlorotoluron affected the expression of over 40% of the associated genes. Genes associated with apoptosis were altered following the exposure to all substances, particularly Cyproconazole and Chlorotoluron (47 and 37%, respectively). Cyproconazole additionally showed pronounced effects on genes encoding for extracellular matrix and tissue remodeling molecules (27 and 40%, respectively). All substances affected about 20% of all genes contained in the group of genes related to cell proliferation. Cyproconazole, Chlorotoluron and 2-Phenylphenol affected 25% of all oxidative stress-associated genes. In comparison to HepaRG cells, where CYPs and phase I was the most impacted group, in RPTEC only one of the DEG established for any of the substances belonged to the group of xenobiotic metabolism. At the level of individual genes, HMOX1, a nephrotoxicity marker, was induced over twofold after incubation with all substances, but highest for Cyproconazole (eightfold). Of all genes, the strongest induction was observed for IGFBP1 , a member of the insulin-like growth factor-binding protein family, which was increased 53-fold by incubation with Cyproconazole and over 52-fold after incubation with Chlorotoluron.

A graphical representation of all data points including means and standard deviations can be found in Online Resource 4 for HepaRG and RPTEC results.

Data analysis with GO enrichment and KEGG analysis

Gene expression results were analyzed with GO enrichment and KEGG analysis. All effects obtained in the analyses can be found in Online Resource 3.

The GO enrichment analysis of HepaRG DEG from the incubation with Cyproconazole pointed at changes in secondary and xenobiotic metabolic processes, and the combined analysis additionally resulted in significant enrichment of response to estrogen. DEG modulated by the exposure to Chlorotoluron were involved in 16 ontologies including metabolic, biosynthetic, and catabolic processes, with lipid metabolic process and organic hydroxyl compound metabolic process being the most statistically supported (p-values: 9.2 × 10−8 and 7.7 × 10−7, respectively). In RPTEC, nucleic acid metabolic process was the only significantly enriched GO term for Chlorotoluron, while the combined analysis revealed a total of 23. Analysis of DEG from incubation with Thiabendazole resulted, among others, in hits for xenobiotic, terpenoid, and isoprenoid metabolic processes in HepaRG and combined results. Although analysis of DEG from incubation with 2-Phenylphenol did not result in significantly enriched GO terms from the HepaRG or the RPTEC data, the combined data set showed 5 enriched terms, with NADP metabolic process and myeloid leukocyte migration having the lowest p-values (6.9 × 10−4 for both).

For KEGG analysis, the HepaRG data set for Fluxapyroxad and Chlorotoluron showed enrichment of drug metabolism-cytochrome P450 , as well as taurine and hypotaurine metabolism (Fluxapyroxad) and metabolic pathways (Chlorotoluron). Thiabendazole data revealed enrichment of steroid hormone biosynthesis , metabolism of xenobiotics by cytochrome P450 and chemical carcinogenesis-DNA adducts . RPTEC data set for Azoxystrobin and Chlorotoluron showed multiple cancer-related pathways. The combined data set only resulted in few pathways: hepatocellular carcinoma for Azoxystrobin, metabolic pathways for Chlorotoluron and mineral absorption for 2-Phenylphenol. All other analyses did not result in any significant enrichment.

Data analysis with ingenuity pathway analysis software

Gene expression data were further analyzed with the IPA software. In total 32 different categories of diseases or functions were predicted. Figure  5 shows the ten most frequently resulting categories. Liver Hyperplasia/Hyperproliferation is the only common category across all cell lines and substances. The statistical confidence of the pathway analysis was strongest for Chlorotoluron, which also induced most DEG. Comparing the three methodologies of input data, lower p-values were observed for HepaRG and combined analysis and most categories of diseases or functions were predicted by the combined analysis. Evidently, effects on the kidney were predicted from the input data from liver cells and vice versa.

Figure 5. Results obtained by analysis of transcriptomics data with Qiagen Ingenuity Pathway Analysis. The 10 categories most affected are represented. The x-axis shows the −log10 value of the p-value obtained for the respective effect.

In a final step, the data acquired from targeted protein and transcriptomics analyses were compared with known in vivo observations. Given that the comparison focused on aligning the responses from human cell lines with whole animal data, the analysis focused on the extent to which the omics-responses were indicative of the respective biological response in vivo (indicative concordance). To establish an optimized threshold for the evaluation of in vitro predictions, the in vitro data were transformed by applying evaluation matrices as shown in Table  3 . Based on that, activated key proteins and thus cellular functions were identified for each substance from targeted protein analyses. For the evaluation of gene transcripts, the p-values for the categories obtained by IPA were considered. Indicative concordance with known in vivo results is shown in Table  7 .

For the protein analysis, the indicative concordance ranged from 18 to 47% for the single cell lines and their combination, respectively. In contrast to the results from targeted protein analyses, the indicative concordance for the transcriptomic response was much stronger with greatest values of 55, 63 and 76% for the single cell lines and their combination, respectively. Likewise, for those cases where no effect was seen in vivo, no adverse indications were seen in vitro in 80, 91 and 78% of cases, respectively. For protein analysis, this value ranged from 78 to 86% and was 50% for the combined analysis of protein and transcriptional data. It should be noted, however, that these values decreased when the evaluation criteria were less strict (medium or strong instead of very strong).

In the present study, the pathways triggered by non-cytotoxic concentrations of six pesticide active substances were examined, employing targeted protein and transcriptomics analyses in the liver cell line HepaRG and the kidney cell line RPTEC. Utilizing evaluation matrices and prediction software tools, the observed cellular responses were interpreted and compared with outcomes from established in vivo experiments, in order to assess the relevance of our in vitro model systems in predicting the impact of pesticide exposure on human hepatic and renal cellular function. The primary emphasis of this investigation did not lie in delineating discrete effects attributable to individual substances; rather, it centered on discerning the predictive capacity of the system and serving as a case study to highlight the current challenges in the regulatory adoption of NAMs.

When targeted protein data were used to predict in vivo impacts in rodents, the best result was achieved by the combined analysis and setting the evaluation criteria to medium effects (47%). Regarding the indicative concordance based on transcriptional data, medium effects in HepaRG cells seemed the most promising resulting in a 55% match. This is notable given the systemic as well as species differences between the corresponding test systems. It also highlights that the “gold standard”, i.e., the reference standard used for comparison, is in fact not necessarily indisputable (Trevethan 2017 ). Various studies pointed at the shortcomings of traditional animal studies, such as interspecies concordance, poor reproducibility and unsatisfactory extrapolation to humans (Goodman 2018 ; Karmaus et al. 2022 ; Luijten et al. 2020 ; Ly Pham et al. 2020; Smirnova et al. 2018 ; Wang and Gray 2015 ). One example illustrating the difficulties in extrapolating data from rodents to humans is the question whether Cyproconazole causes neoplasms in the liver. Here, animal studies with CD-1 mice showed statistically significant positive trends for hepatocellular adenomas and combined tumors in male mice (EFSA 2010 ; Hester et al. 2012 ). Ensuing studies identified CAR activation by Cyproconazole as the underlying Mode of Action (MoA) (Peffer et al. 2007 ). Marx-Stoelting et al. ( 2017 ) investigated effects of Cyproconazole in mice with humanized CAR and PXR and demonstrated increased sensitivity of rodents to CAR agonist-induced effects, compared to humanized mice. In line with these observations the Joint FAO/WHO Meeting on Pesticide Residues (JMPR) concluded that Cyproconazole is unlikely to pose a carcinogenic risk to humans (JMPR 2010 ). Likewise, Cyproconazole was not considered to cause neoplasms in the liver when analyzed for this study. However, such detailed analysis of a substance’s MoA is scarce.

Another important factor impeding the comparison of in vitro and in vivo data is the use of different ontologies. The need for harmonized ontologies and reporting formats for in vivo data has been expressed by many researchers in the field of in silico toxicology and has been addressed in multiple projects (Hardy et al. 2012; Sanz et al. 2017). For example, uncertainty arises as to whether and why an effect for a particular organ was not reported. Depending on the case and study in question, this might be because absent effects were simply not explicitly reported as negative, because other organ toxicities occurred at lower doses and hence data for the remaining organs were omitted or not assessed, or because the focus of the study was another organ (Smirnova et al. 2018). While this does not pose a problem when such studies are used for risk assessment, it does affect the comparison with in vitro results. Another major obstacle is the retrospective combination of large and comprehensive sets of mechanistic in vitro data with systemic and histopathological in vivo observations. This issue has recently been picked up by the on-going European ONTOX project (“ontology-driven and artificial intelligence-based repeated dose toxicity testing of chemicals for next generation risk assessment”) and has led the consortium to reverse the strategy and build NAMs to predict systemic repeated-dose toxicity effects, enabling human risk assessment when combined with exposure assessment (Vinken et al. 2021). A recent publication by Jiang et al. (2023) as part of the ONTOX project identified transcriptomic signatures of drug-induced intrahepatic cholestasis with potential future use as a prediction model. However, not all pathologies have been analyzed so far, and those that have were often only studied for a limited number of chemicals, limiting their transferability. Hence, this study relied on computational tools such as IPA, GO enrichment and KEGG analysis to draw functional conclusions from transcriptomics data. While IPA yields categorized disease and function annotations, KEGG and GO analyses display enriched ontology terms. Therefore, while KEGG and GO results were too ambiguous to be related to distinct in vivo observations, it was feasible to combine IPA results with in vivo observations. It is noteworthy that even though GO enrichment and KEGG analysis appear fairly similar, the results varied widely between the predictions from the various software tools. Soh et al. (2010) analyzed the consistency, comprehensiveness, and compatibility of pathway databases and made several crucial findings, such as the inconsistency of the genes associated with the same biological pathway across different databases. Furthermore, common biological pathways shared across different databases were frequently labeled with names that provided limited indication of their interrelationships. Chen et al. (2023) demonstrated that using the same gene list with different analysis methods may result in non-concordant overrepresented, enriched or perturbed pathways. Taken together, these considerations may explain the divergent findings from the different transcriptomics analyses in the present study. Additionally, these findings underscore the challenges associated with integrating pathway data from diverse sources and emphasize the need for standardized and cohesive representation of biological pathways in databases.
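
As an aside on why tool choice matters here, the core statistic behind many GO- and KEGG-style overrepresentation analyses is a hypergeometric test on the overlap between a gene list and a pathway's annotated genes. The sketch below uses invented counts; real tools additionally differ in their background sets, annotation versions and multiple-testing corrections, which is exactly where divergent results of the kind described above can arise.

```python
# Hypergeometric overrepresentation test, the statistic underlying many GO/KEGG tools.
# All counts are hypothetical; real analyses also apply multiple-testing correction.
from scipy.stats import hypergeom

background_genes = 370   # e.g. all genes on a targeted panel
pathway_genes = 40       # background genes annotated to the pathway (database-dependent)
deg_count = 60           # differentially expressed genes (DEGs)
deg_in_pathway = 15      # overlap between DEGs and the pathway

# P(overlap >= deg_in_pathway) under random sampling without replacement
p_value = hypergeom.sf(deg_in_pathway - 1, background_genes, pathway_genes, deg_count)
print(f"enrichment p-value: {p_value:.3g}")

# The same DEG list tested against another database's version of the "same" pathway
# (different annotated genes, smaller overlap) can give a very different p-value.
p_value_alt = hypergeom.sf(10 - 1, background_genes, 25, deg_count)
print(f"alternative annotation: {p_value_alt:.3g}")
```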

Compared to the transcriptomic data, protein analyses from HepaRG and RPTEC cells resulted in a comparatively low indicative concordance. This challenges the notion that protein analysis may be superior in prediction (Wu et al. 2023). One likely explanation for this expectation is that proteins often reflect molecular functions and adverse effects more accurately, and diseases frequently involve dysregulated post-translational modifications, which are challenging to detect and may be poorly correlated with mRNA levels (Kannaiyan and Mahadevan 2018; Kelly et al. 2010; Zhao et al. 2020). However, due to the relatively low number of protein markers compared to the number of mRNA markers, the targeted transcriptomics analysis is associated with a higher likelihood of finding a match. In the gene transcription analysis with ensuing IPA evaluation, 370 genes were analyzed for HepaRG. In contrast, the protein analysis conducted in this study focused on eight proteins or modifications, each indicative of a particular cellular function, which were analyzed at two time points after incubation of the cells with two concentrations of the test substances. Consequently, a cellular response to a stressor can be observed over time, such as the different levels of cleaved PARP after 36 h and 72 h of incubation with Cyproconazole in HepaRG cells. While elevated levels of this apoptosis indicator were noted after 36 h, reduced levels were observed after 72 h. Possible explanations for this include a cellular feedback mechanism or an advanced stage of apoptosis.
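
The point about panel size can be illustrated with a back-of-the-envelope calculation: even if each marker had only a small probability of flagging a response by chance, the chance of at least one positive call grows rapidly with the number of markers. The 2% per-marker probability below is an arbitrary illustration, not an estimate derived from the study.

```python
# Chance of at least one (possibly spurious) positive call as the marker panel grows.
p_marker = 0.02  # arbitrary illustrative per-marker probability of a chance positive
for n_markers in (8, 370):
    p_any = 1 - (1 - p_marker) ** n_markers
    print(f"{n_markers:>3} markers -> P(at least one positive) = {p_any:.2f}")
# 8 markers -> 0.15; 370 markers -> essentially 1.00
```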

Another central observation is that the combination of cell lines and methods substantially increases the indicative concordance (up to 88%). In the case of the targeted protein analysis, combining results led to an overall value of 47%, compared to approximately 20% for each cell line. Similar trends were observed for the transcriptomic data, with 76% indicative concordance for combined results, albeit at the cost of fewer cases in which an in vivo negative effect corresponded to no adverse indication in vitro, as the total number of positive in vitro calls increased. Nonetheless, the idea that including omics data in the regulatory process will unreasonably increase positive findings and lead to overprotectiveness can be challenged, as strengthening the evaluation criteria reversed this trend. The shortcomings of stand-alone in vitro tests as replacements for animal experiments have long been known. For example, single tests do not cover all possible outcomes of interest or all modes of action possibly causing a toxicological effect (Hartung et al. 2013; Rovida et al. 2015). In the present study, reported in vivo effects such as lesions of the biliary epithelium or inflammation of the liver may not be fully represented by a single hepatic cell line. Hence, regulatory toxicologists strive to implement so-called integrated testing strategies (ITS) (Caloni et al. 2022). Results from projects in the fields of embryonic, developmental and reproductive, or acute oral toxicity have shown that test batteries increase the predictive value over individual assays (Piersma et al. 2013; Prieto et al. 2013; Sogorb et al. 2014). To share these novel methodologies in ITS for safety evaluations in the regulatory context, the OECD Integrated Approaches to Testing and Assessment (IATA) Case Studies Project offers a platform where comprehensive information on case studies, such as consideration documents capturing learnings and lessons from the review experience, can be found.
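
The trade-off noted here, higher indicative concordance from combining read-outs at the cost of more positive calls overall, is a generic property of OR-combined tests. The sketch below makes that explicit with invented calls and is not based on the study's data.

```python
# Combining two in vitro read-outs with a logical OR: sensitivity can only rise,
# while agreement on in vivo negatives can only fall. All calls are invented.
in_vivo   = [1, 1, 1, 1, 0, 0, 0, 0]
readout_a = [1, 0, 1, 0, 0, 0, 1, 0]
readout_b = [0, 1, 1, 0, 0, 1, 0, 0]
combined  = [a or b for a, b in zip(readout_a, readout_b)]

def sensitivity(truth, calls):
    pos = [c for t, c in zip(truth, calls) if t == 1]
    return sum(pos) / len(pos)

def specificity(truth, calls):
    neg = [1 - c for t, c in zip(truth, calls) if t == 0]
    return sum(neg) / len(neg)

for name, calls in [("A", readout_a), ("B", readout_b), ("A or B", combined)]:
    print(f"{name:>6}: sensitivity {sensitivity(in_vivo, calls):.2f}, "
          f"specificity {specificity(in_vivo, calls):.2f}")
```

Strengthening the evaluation criteria plays the opposite role: it removes weak positive calls and shifts the balance back toward specificity.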

While this publication’s scope did not extend to establishing a conclusive ITS for liver and kidney toxicity, it serves as a valuable starting point and source of insights for future analyses in this direction. Moving forward, it could prove beneficial to explore testing protocols that integrate protein and transcriptomics analyses, enhancing the comprehensiveness of safety evaluations in this domain.

Data availability

The data sets generated during the current study are available in the Jochum-et-al-2024 GitHub repository, https://github.com/KristinaJochum/Jochum-et-al-2024 .

https://geneglobe.qiagen.com/de/analyze , accessed last: 23.02.2024.

https://geneglobe.qiagen.com/us/product-groups/rt2-profiler-pcr-arrays , accessed last: 23.02.2024.

https://cbl-gorilla.cs.technion.ac.il/ , accessed last: 16.01.2024.

http://bioinformatics.sdstate.edu/go/ , accessed last: 16.01.2024.

https://ontox-project.eu/ , accessed last 26.02.2024.

https://www.oecd.org/chemicalsafety/risk-assessment/iata/ , accessed last: 26.04.2024.

Andersson TB, Kanebratt KP, Kenna JG (2012) The HepaRG cell line: a unique in vitro tool for understanding drug metabolism and toxicology in human. Expert Opin Drug Metab Toxicol 8(7):909–920. https://doi.org/10.1517/17425255.2012.685159

Andonegui-Elguera MA, Caceres-Gutierrez RE, Lopez-Saavedra A et al (2022) The Roles of Histone Post-Translational Modifications in the Formation and Function of a Mitotic Chromosome. Int J Mol Sci 23(15):8704. https://doi.org/10.3390/ijms23158704

Aninat C, Piton A, Glaise D et al (2006) Expression of cytochromes P450, conjugating enzymes and nuclear receptors in human hepatoma HepaRG cells. Drug Metab Dispos 34(1):75–83. https://doi.org/10.1124/dmd.105.006759

Antherieu S, Chesne C, Li R et al (2010) Stable expression, activity, and inducibility of cytochromes P450 in differentiated HepaRG cells. Drug Metab Dispos 38(3):516–525. https://doi.org/10.1124/dmd.109.030197

Antherieu S, Chesne C, Li R, Guguen-Guillouzo C, Guillouzo A (2012) Optimization of the HepaRG cell model for drug metabolism and toxicity studies. Toxicol in Vitro 26(8):1278–1285. https://doi.org/10.1016/j.tiv.2012.05.008

Aschauer L, Gruber LN, Pfaller W et al (2013) Delineation of the key aspects in the regulation of epithelial monolayer formation. Mol Cell Biol 33(13):2535–2550. https://doi.org/10.1128/MCB.01435-12

Aschauer L, Limonciel A, Wilmes A, et al. (2015) Application of RPTEC/TERT1 cells for investigation of repeat dose nephrotoxicity: A transcriptomic study. Toxicol In Vitro 30(1 Pt A):106–16 https://doi.org/10.1016/j.tiv.2014.10.005

Ashraf M, Asghar M, Rong Y, Doschak M, Kiang T (2018) Advanced In Vitro HepaRG Culture Systems for Xenobiotic Metabolism and Toxicity Characterization. Eur J Drug Metab Pharmacokinet 44:437–458. https://doi.org/10.1007/s13318-018-0533-3

Buesen R, Visan A, Genschow E, Slawik B, Spielmann H, Seiler A (2004) Trends in improving the embryonic stem cell test (EST): an overview. Altex 21(1):15–22

Caloni F, De Angelis I, Hartung T (2022) Replacement of animal testing by integrated approaches to testing and assessment (IATA): a call for in vivitrosi. Arch Toxicol 96(7):1935–1950. https://doi.org/10.1007/s00204-022-03299-x

Canty A, Ripley B (2016) boot: Bootstrap R (S-Plus) Functions. R Package Version 1:3–18

Chen JW, Shrestha L, Green G, Leier A, Marquez-Lago TT (2023) The hitchhikers' guide to RNA sequencing and functional analysis. Brief Bioinform 24(1):bbac529 https://doi.org/10.1093/bib/bbac529

Davidson AC, Hinkley DV (1997) Bootstrap Methods and Their Applications. Cambridge University Press, Cambridge

Duncan RF, Hershey JW (1989) Protein synthesis and protein phosphorylation during heat stress, recovery, and adaptation. J Cell Biol 109(4 Pt 1):1467–1481. https://doi.org/10.1083/jcb.109.4.1467

EC (2006) Regulation (EC) No 1907/2006 concerning the Registration, Evaluation, Authorisation and Restriction of Chemicals (REACH), establishing a European Chemicals Agency. In: Parliament E (ed) 1907/2006. Official Journal of the European Union, p 396/1

EC (2009) Regulation (EC) No 1107/2009 of the European Parliament and of the council of 21 October 2009 concerning the placing of plant protection products on the market and repealing Council Directives 79/117/EEC and 91/414/EEC. In: EC (ed) 1107/2009. Official Journal of the European Union

EC (2015) Opinion on o-Phenylphenol, Sodium o-phenylphenate and Potassium o-phenylphenate. European Commission, Directorate-General for Health, Food Safety

EURL ECVAM (2012) Multi-study validation trial for cytochrome P450 induction providing a reliable human metabolically competent standard model or method using the human cryopreserved primary hepatocytes and the human cryopreserved HepaRG cell line. European commission joint research center, p 164

Eden E, Lipson D, Yogev S, Yakhini Z (2007) Discovering motifs in ranked lists of DNA sequences. PLoS Comput Biol 3(3):e39. https://doi.org/10.1371/journal.pcbi.0030039

Eden E, Navon R, Steinfeld I, Lipson D, Yakhini Z (2009) GOrilla: a tool for discovery and visualization of enriched GO terms in ranked gene lists. BMC Bioinformatics 10:48. https://doi.org/10.1186/1471-2105-10-48

EFSA (2010) Conclusion on the peer review of the pesticide risk assessment of the active substance cyproconazole. EFSA J 8(11):1897. https://doi.org/10.2903/j.efsa.2010.1897

US EPA (2002) Reregistration Eligibility Decision Thiabendazole. In: US EPA OP, Pesticides and Toxic Substances (ed) Prevention, Pesticides and Toxic Substances. Washington D.C.

US EPA (2018) Strategic Plan to Promote the Development and Implementation of Alternative Test Methods Within the TSCA Program. In: U.S. Environmental Protection Agency OoCSaP, Prevention (eds). Washington, DC

Feiertag K, Karaca M, Fischer B, et al. (2023) Mixture effects of co-formulants and two plant protection products in a liver cell line. EXCLI J 22:221–236 https://doi.org/10.17179/excli2022-5648

Franzosa JA, Bonzo JA, Jack J et al (2021) High-throughput toxicogenomic screening of chemicals in the environment using metabolically competent hepatic cell cultures. NPJ Syst Biol Appl 7(1):7. https://doi.org/10.1038/s41540-020-00166-2

French ME, Koehler CF, Hunter T (2021) Emerging functions of branched ubiquitin chains. Cell Discov 7(1):6. https://doi.org/10.1038/s41421-020-00237-y

Gao A-H, Fu Y-Y, Zhang K-Z et al (2014) Azoxystrobin, a mitochondrial complex III Qo site inhibitor, exerts beneficial metabolic effects in vivo and in vitro. Biochim Biophys Acta 1840(7):2212–2221. https://doi.org/10.1016/j.bbagen.2014.04.002

Ge SX, Jung D, Yao R (2020) ShinyGO: a graphical gene-set enrichment tool for animals and plants. Bioinformatics 36(8):2628–2629. https://doi.org/10.1093/bioinformatics/btz931

Genschow E, Spielmann H, Scholz G et al (2004) Validation of the embryonic stem cell test in the international ECVAM validation study on three in vitro embryotoxicity tests. Altern Lab Anim 32(3):209–244. https://doi.org/10.1177/026119290403200305

Goodman JI (2018) Goodbye to the Bioassay. Toxicol Res (Camb) 7(4):558–564. https://doi.org/10.1039/c8tx00004b

Gu Z (2022) Complex heatmap visualization. iMeta 1(3) https://doi.org/10.1002/imt2.43

Hardy B, Apic G, Carthew P, et al. (2012) Toxicology Ontology Perspectives. ALTEX https://doi.org/10.14573/altex.2012.2.139

Hart SN, Li Y, Nakamoto K, Subileau EA, Steen D, Zhong XB (2010) A comparison of whole genome gene expression profiles of HepaRG cells and HepG2 cells to primary human hepatocytes and human liver tissues. Drug Metab Dispos 38(6):988–994. https://doi.org/10.1124/dmd.109.031831

Hartman GD, Kuduk SD, Espiritu C, Lam AM (2020) P450s under Restriction (PURE) Screen Using HepaRG and Primary Human Hepatocytes for Discovery of Novel HBV Antivirals. ACS Med Chem Lett 11(10):1919–1927. https://doi.org/10.1021/acsmedchemlett.9b00630

Hartung T, Luechtefeld T, Maertens A, Kleensang A (2013) Integrated testing strategies for safety assessments. ALTEX 30(1):3–18 https://doi.org/10.14573/altex.2013.1.003

Heise T, Schmidt F, Knebel C et al (2018) Hepatotoxic combination effects of three azole fungicides in a broad dose range. Arch Toxicol 92(2):859–872. https://doi.org/10.1007/s00204-017-2087-6

Herzler M, Marx-Stoelting P, Pirow R et al (2021) The “EU chemicals strategy for sustainability” questions regulatory toxicology as we know it: is it all rooted in sound scientific evidence? Arch Toxicol 95(7):2589–2601. https://doi.org/10.1007/s00204-021-03091-3

Hester S, Moore T, Padgett WT, Murphy L, Wood CE, Nesnow S (2012) The hepatocarcinogenic conazoles: cyproconazole, epoxiconazole, and propiconazole induce a common set of toxicological and transcriptional responses. Toxicol Sci 127(1):54–65. https://doi.org/10.1093/toxsci/kfs086

Jackson RJ, Hellen CU, Pestova TV (2010) The mechanism of eukaryotic translation initiation and principles of its regulation. Nat Rev Mol Cell Biol 11(2):113–127. https://doi.org/10.1038/nrm2838

Jiang J, van Ertvelde J, Ertaylan G et al (2023) Unraveling the mechanisms underlying drug-induced cholestatic liver injury: identifying key genes using machine learning techniques on human in vitro data sets. Arch Toxicol 97(11):2969–2981. https://doi.org/10.1007/s00204-023-03583-4

JMPR (2010) Pesticide residues in food 2010. Report of the Joint Meeting of the FAO Panel of Experts on Pesticide Residues in Food and the Environment and the WHO Core Assessment Group on Pesticide Residues FAO Plant Production and Protection Paper. vol 200, Rome

Kannaiyan R, Mahadevan D (2018) A comprehensive review of protein kinase inhibitors for cancer therapy. Expert Rev Anticancer Ther 18(12):1249–1270. https://doi.org/10.1080/14737140.2018.1527688

Karaca M, Fritsche K, Lichtenstein D et al (2023a) Adverse outcome pathway-based analysis of liver steatosis in vitro using human liver cell lines. STAR Protoc 4(3):102500. https://doi.org/10.1016/j.xpro.2023.102500

Karaca M, Willenbockel CT, Tralau T, Bloch D, Marx-Stoelting P (2023b) Toxicokinetic and toxicodynamic mixture effects of plant protection products: A case study. Regul Toxicol Pharmacol 141:105400. https://doi.org/10.1016/j.yrtph.2023.105400

Karmaus AL, Mansouri K, To KT et al (2022) Evaluation of Variability Across Rat Acute Oral Systemic Toxicity Studies. Toxicol Sci 188(1):34–47. https://doi.org/10.1093/toxsci/kfac042

Kelly TK, De Carvalho DD, Jones PA (2010) Epigenetic modifications as therapeutic targets. Nat Biotechnol 28(10):1069–1078. https://doi.org/10.1038/nbt.1678

Kiang JG, Tsokos GC (1998) Heat shock protein 70 kDa: molecular biology, biochemistry, and physiology. Pharmacol Ther 80(2):183–201. https://doi.org/10.1016/s0163-7258(98)00028-x

Landsiedel R, Birk B, Funk-Weyer D (2022) The Evolution of Regulatory Toxicology: Where is the Gardener? Altern Lab Anim 50(4):255–262. https://doi.org/10.1177/02611929221107617

Lee KA, Roth RA, LaPres JJ (2007) Hypoxia, drug therapy and toxicity. Pharmacol Ther 113(2):229–246. https://doi.org/10.1016/j.pharmthera.2006.08.001

Lichtenstein D, Mentz A, Schmidt FF et al (2020) Transcript and protein marker patterns for the identification of steatotic compounds in human HepaRG cells. Food Chem Toxicol 145:111690. https://doi.org/10.1016/j.fct.2020.111690

Lichtenstein D, Mentz A, Sprenger H et al (2021) A targeted transcriptomics approach for the determination of mixture effects of pesticides. Toxicology 460:152892. https://doi.org/10.1016/j.tox.2021.152892

Livak KJ, Schmittgen TD (2001) Analysis of relative gene expression data using real-time quantitative PCR and the 2(-Delta Delta C(T)) Method. Methods 25(4):402–408. https://doi.org/10.1006/meth.2001.1262

Luckert C, Braeuning A, de Sousa G et al (2018) Adverse Outcome Pathway-Driven Analysis of Liver Steatosis in Vitro: A Case Study with Cyproconazole. Chem Res Toxicol 31(8):784–798. https://doi.org/10.1021/acs.chemrestox.8b00112

Luijten M, Corvi R, Mehta J et al (2020) A comprehensive view on mechanistic approaches for cancer risk assessment of non-genotoxic agrochemicals. Regul Toxicol Pharmacol 118:104789. https://doi.org/10.1016/j.yrtph.2020.104789

Ly Pham L, Watford S, Pradeep P et al (2020) Variability in in vivo studies: defining the upper limit of performance for predictions of systemic effect levels. Comput Toxicol 15:100126. https://doi.org/10.1016/j.comtox.2020.100126

Marx-Stoelting P, Ganzenberg K, Knebel C et al (2017) Hepatotoxic effects of cyproconazole and prochloraz in wild-type and hCAR/hPXR mice. Arch Toxicol 91(8):2895–2907. https://doi.org/10.1007/s00204-016-1925-2

Marx-Stoelting P, Adriaens E, Ahr HJ, et al. (2009) A review of the implementation of the embryonic stem cell test (EST). The report and recommendations of an ECVAM/ReProTect Workshop. Altern Lab Anim 37(3):313–328 https://doi.org/10.1177/026119290903700314

Mennecozzi M, Landesmann B, Harris GA, Liska R, Whelan M (2012) Hepatotoxicity screening taking a mode-of-action approach using HepaRG cells and HCA. Altex Proc 1(12):193–204 https://doi.org/10.58847/ap.1201

Mizutani T, Ito K, Nomura H, Nakanishi K (1990) Nephrotoxicity of Thiabendazole in mice depleted of glutathione by treatment with DL-buthionine sulphoximine. Food Chem Toxicol 28:169–177. https://doi.org/10.1016/0278-6915(90)90005-8

Muniz L, Nicolas E, Trouche D (2021) RNA polymerase II speed: a key player in controlling and adapting transcriptome composition. EMBO J 40(15):e105740 https://doi.org/10.15252/embj.2020105740

Nelson DM, Ye X, Hall C et al (2002) Coupling of DNA synthesis and histone synthesis in S phase independent of cyclin/cdk2 activity. Mol Cell Biol 22(21):7459–7472. https://doi.org/10.1128/MCB.22.21.7459-7472.2002

Nielsen E, Nørhede P, Boberg J, et al. (2012) Identification of Cumulative Assessment Groups of Pesticides. EFSA Support Publ 9(4) https://doi.org/10.2903/sp.efsa.2012.EN-269

OECD (2019) Determination of Cytochrome P450 (CYP) enzyme activity induction using differentiated human hepatic cells.

Ohtake F (2022) Branched ubiquitin code: from basic biology to targeted protein degradation. J Biochem 171(4):361–366. https://doi.org/10.1093/jb/mvac002

Oliver FJ, de la Rubia G, Rolli V, Ruiz-Ruiz MC, de Murcia G, Murcia JM (1998) Importance of poly(ADP-ribose) polymerase and its cleavage in apoptosis. Lesson from an uncleavable mutant. J Biol Chem 273(50):33533–9 https://doi.org/10.1074/jbc.273.50.33533

Ozawa S, Ohta K, Miyajima A et al (2000) Metabolic activation of o-phenylphenol to a major cytotoxic metabolite, phenylhydroquinone: role of human CYP1A2 and rat CYP2C11/CYP2E1. Xenobiotica 30(10):1005–1017. https://doi.org/10.1080/00498250050200159

Peffer RC, Moggs JG, Pastoor T et al (2007) Mouse liver effects of cyproconazole, a triazole fungicide: role of the constitutive androstane receptor. Toxicol Sci 99(1):315–325. https://doi.org/10.1093/toxsci/kfm154

Piersma AH, Bosgra S, van Duursen MB et al (2013) Evaluation of an alternative in vitro test battery for detecting reproductive toxicants. Reprod Toxicol 38:53–64. https://doi.org/10.1016/j.reprotox.2013.03.002

Prieto P, Kinsner-Ovaskainen A, Stanzel S et al (2013) The value of selected in vitro and in silico methods to predict acute oral toxicity in a regulatory context: results from the European Project ACuteTox. Toxicol in Vitro 27(4):1357–1376. https://doi.org/10.1016/j.tiv.2012.07.013

Reggiori F, Klionsky DJ (2002) Autophagy in the eukaryotic cell. Eukaryot Cell 1(1):11–21. https://doi.org/10.1128/EC.01.1.11-21.2002

Repetto G, del Peso A, Zurita JL (2008) Neutral red uptake assay for the estimation of cell viability/cytotoxicity. Nat Protoc 3(7):1125–1131. https://doi.org/10.1038/nprot.2008.75

Rovida C, Alepee N, Api AM, et al. (2015) Integrated Testing Strategies (ITS) for safety assessment. ALTEX 32(1):25–40 https://doi.org/10.14573/altex.1411011

Sanz F, Pognan F, Steger-Hartmann T et al (2017) Legacy data sharing to improve drug safety assessment: the eTOX project. Nat Rev Drug Discov 16(12):811–812. https://doi.org/10.1038/nrd.2017.177

Schenk B, Weimer M, Bremer S et al (2010) The ReProTect Feasibility Study, a novel comprehensive in vitro approach to detect reproductive toxicants. Reprod Toxicol 30(1):200–218. https://doi.org/10.1016/j.reprotox.2010.05.012

Schmeisser S, Miccoli A, von Bergen M et al (2023) New approach methodologies in human regulatory toxicology - Not if, but how and when! Environ Int 178:108082. https://doi.org/10.1016/j.envint.2023.108082

Schmittgen TD, Livak KJ (2008) Analyzing real-time PCR data by the comparative C(T) method. Nat Protoc 3(6):1101–1108. https://doi.org/10.1038/nprot.2008.73

Seiler AE, Buesen R, Visan A, Spielmann H (2006) Use of murine embryonic stem cells in embryotoxicity assays: the embryonic stem cell test. Methods Mol Biol 329:371–395. https://doi.org/10.1385/1-59745-037-5:371

Shah H, Patel M, Shrivastava N (2017) Gene expression study of phase I and II metabolizing enzymes in RPTEC/TERT1 cell line: application in in vitro nephrotoxicity prediction. Xenobiotica 47(10):837–843. https://doi.org/10.1080/00498254.2016.1236299

Simon BR, Wilson MJ, Wickliffe JK (2014) The RPTEC/TERT1 cell line models key renal cell responses to the environmental toxicants, benzo[a]pyrene and cadmium. Toxicol Rep 1:231–242. https://doi.org/10.1016/j.toxrep.2014.05.010

Smirnova L, Kleinstreuer N, Corvi R, Levchenko A, Fitzpatrick SC, Hartung T (2018) 3S - Systematic, systemic, and systems biology and toxicology. ALTEX 35(2):139–162 https://doi.org/10.14573/altex.1804051

Sogorb MA, Pamies D, de Lapuente J, Estevan C, Estevez J, Vilanova E (2014) An integrated approach for detecting embryotoxicity and developmental toxicity of environmental contaminants using in vitro alternative methods. Toxicol Lett 230(2):356–367. https://doi.org/10.1016/j.toxlet.2014.01.037

Soh D, Dong D, Guo Y, Wong L (2010) Consistency, comprehensiveness, and compatibility of pathway databases. BMC Bioinformatics 11:449. https://doi.org/10.1186/1471-2105-11-449

Tamura K, Inoue K, Takahashi M et al (2013) Dose-response involvement of constitutive androstane receptor in mouse liver hypertrophy induced by triazole fungicides. Toxicol Lett 221(1):47–56. https://doi.org/10.1016/j.toxlet.2013.05.011

Trevethan R (2017) Sensitivity, Specificity, and Predictive Values: Foundations, Pliabilities, and Pitfalls in Research and Practice. Front Public Health 5:307. https://doi.org/10.3389/fpubh.2017.00307

Vinken M (2015) Adverse Outcome Pathways and Drug-Induced Liver Injury Testing. Chem Res Toxicol 28(7):1391–1397. https://doi.org/10.1021/acs.chemrestox.5b00208

Vinken M, Benfenati E, Busquet F et al (2021) Safer chemicals using less animals: kick-off of the European ONTOX project. Toxicology 458:152846. https://doi.org/10.1016/j.tox.2021.152846

Wang B, Gray G (2015) Concordance of Noncarcinogenic Endpoints in Rodent Chemical Bioassays. Risk Anal 35(6):1154–1166. https://doi.org/10.1111/risa.12314

WHO (1996) Guidelines for drinking-water quality. In: Water, Sanitation and Health (ed) Health criteria and other supporting information, vol 2, 2nd edn. Geneva

Wickham H (2016) ggplot2: Elegant Graphics for Data Analysis, 2nd edn. Springer Cham, New York

Wieser M, Stadler G, Jennings P et al (2008) hTERT alone immortalizes epithelial cells of renal proximal tubules without changing their functional characteristics. Am J Physiol Renal Physiol 295(5):F1365–F1375. https://doi.org/10.1152/ajprenal.90405.2008

Wu Y, Liu Q, Xie L (2023) Hierarchical multi-omics data integration and modeling predict cell-specific chemical proteomics and drug responses. Cell Rep Methods 3(4):100452. https://doi.org/10.1016/j.crmeth.2023.100452

Zahn E, Wolfrum J, Knebel C et al (2018) Mixture effects of two plant protection products in liver cell lines. Food Chem Toxicol 112:299–309. https://doi.org/10.1016/j.fct.2017.12.067

Zhao W, Li J, Chen MM, et al. (2020) Large-Scale Characterization of Drug Responses of Clinically Relevant Proteins in Cancer Cell Lines. Cancer Cell 38(6):829–843 e4 https://doi.org/10.1016/j.ccell.2020.10.008

Open Access funding enabled and organized by Projekt DEAL. This project was supported by BfR grant no. 1322–794.

Author information

Authors and Affiliations

Department of Pesticides Safety, German Federal Institute for Risk Assessment, Berlin, Germany

Kristina Jochum, Andrea Miccoli, Tewes Tralau & Philip Marx-Stoelting

Institute for Marine Biological Resources and Biotechnology (IRBIM), National Research Council, Ancona, Italy

Andrea Miccoli

Signatope GmbH, Tübingen, Germany

Cornelia Sommersdorf & Oliver Poetz

NMI Natural and Medical Sciences Institute at the University of Tübingen, Reutlingen, Germany

Oliver Poetz

Department of Food Safety, German Federal Institute for Risk Assessment, Berlin, Germany

Andrea Miccoli & Albert Braeuning

Contributions

Conceptualization: Oliver Poetz, Albert Braeuning, Philip Marx-Stoelting, Tewes Tralau; methodology: Kristina Jochum, Philip Marx-Stoelting, Oliver Poetz; formal analysis and investigation: Kristina Jochum, Andrea Miccoli, Cornelia Sommersdorf; writing—original draft preparation: Kristina Jochum, Philip Marx-Stoelting; writing—review and editing: Andrea Miccoli, Cornelia Sommersdorf, Oliver Poetz, Albert Braeuning, Tewes Tralau, Philip Marx-Stoelting; funding acquisition: Tewes Tralau, Philip Marx-Stoelting; resources: Tewes Tralau, Philip Marx-Stoelting, Oliver Poetz.

Corresponding author

Correspondence to Philip Marx-Stoelting .

Ethics declarations

Conflict of interest.

Oliver Poetz is a shareholder of SIGNATOPE GmbH. Cornelia Sommersdorf is an employee at SIGNATOPE GmbH. SIGNATOPE offers assay development and service using immunoassay technology.

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (PDF 828 KB)

Supplementary file2 (PDF 250 KB)

Supplementary file3 (XLSX 295 KB)

Supplementary file4 (PDF 4573 KB)

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .

About this article

Jochum, K., Miccoli, A., Sommersdorf, C. et al. Comparative case study on NAMs: towards enhancing specific target organ toxicity analysis. Arch Toxicol (2024). https://doi.org/10.1007/s00204-024-03839-7

Received : 03 July 2024

Accepted : 08 August 2024

Published : 29 August 2024

DOI : https://doi.org/10.1007/s00204-024-03839-7


What Is Qualitative Observation? | Definition & Examples

Published on March 18, 2023 by Tegan George . Revised on June 22, 2023.

Qualitative observation is a research method where the characteristics or qualities of a phenomenon are described without using any quantitative measurements or data. Rather, the observation is based on the observer’s subjective interpretation of what they see, hear, smell, taste, or feel.

Qualitative observations can be done using various methods, including direct observation, interviews , focus groups , or case studies . They can provide rich and detailed information about the behavior, attitudes, perceptions, and experiences of individuals or groups.

Table of contents

  • When to use qualitative observation
  • Examples of qualitative observation
  • Types of qualitative observations
  • Advantages and disadvantages of qualitative observations
  • Other interesting articles
  • Frequently asked questions

Qualitative observation is a type of observational study , often used in conjunction with other types of research through triangulation . It is often used in fields like social sciences, education, healthcare, marketing, and design. This type of study is especially well suited for gaining rich and detailed insights into complex and/or subjective phenomena.

A qualitative observation could be a good fit for your research if:

  • You are conducting exploratory research . If the goal of your research is to gain a better understanding of a phenomenon, object, or situation, qualitative observation is a good place to start.
  • Your research topic is complex, subjective, or cannot be examined numerically. Qualitative observation is often able to capture the complexity and subjectivity of human behavior, particularly for topics like emotions, attitudes, perceptions, or cultural practices. These may not be quantifiable or measurable through other methods.
  • You are relying on triangulation within your research approach. Qualitative observation is a solid addition to triangulation approaches, where multiple sources of data are used to validate and verify research findings.


Qualitative observation is commonly used in marketing to study consumer behavior, preferences, and attitudes towards products or services.

During the focus group, you focus particularly on qualitative observations, taking note of the participants’ facial expressions, body language, word choice, and tone of voice.

Qualitative observation is often also used in design fields, to better understand user needs, preferences, and behaviors. This can aid in the development of products and services that better meet user needs.

You are particularly focused on any usability issues that could impact customer satisfaction. You run a series of testing sessions, focusing on reactions like facial expressions, body language, and verbal feedback.

There are several types of qualitative observation. Here are some of the most common types to help you choose the best one for your work.

  • Naturalistic observation: The researcher observes how the participants respond to their environment in “real-life” settings but does not influence their behavior in any way. Example: observing monkeys in a zoo enclosure.
  • Participant observation: Also occurs in “real-life” settings. Here, the researcher immerses themself in the participant group over a period of time. Example: spending a few months in a hospital with patients suffering from a particular illness.
  • Covert observation: Hinges on the fact that the participants do not know they are being observed. Example: observing interactions in public spaces, like bus rides or parks.
  • Case study: Investigates a person or group of people over time, with the idea that close investigation can later be generalized to other people or groups. Example: observing a child or group of children over the course of their time in elementary school.

Qualitative observations are a great choice of research method for some projects, but they definitely have their share of disadvantages to consider.

Advantages of qualitative observations

  • Qualitative observations allow you to generate rich and nuanced qualitative data —aiding you in understanding a phenomenon or object and providing insights into the more complex and subjective aspects of human experience.
  • Qualitative observation is a flexible research method that can be adjusted based on research goals and timeline. It also has the potential to be quite non-intrusive, allowing observation of participants in their natural settings without disrupting or influencing their behavior.
  • Qualitative observation is often used in combination with other research methods, such as interviews or surveys , to provide a more complete picture of the phenomenon being studied. This triangulation can help improve the reliability and validity of the research findings.

Disadvantages of qualitative observations

  • Like many observational studies, qualitative observations are at high risk for many research biases, particularly on the side of the researcher in the case of observer bias. These biases can also bleed over to the participant side, in the case of the Hawthorne effect or social desirability bias.
  • Qualitative observations are typically based on a small sample size , which makes them very unlikely to be representative of the larger population. This greatly limits the generalizability of the findings if used as a standalone method, and the data collection process can be long and onerous.
  • Like other human subject research, qualitative observation has its share of ethical considerations to keep in mind and protect, particularly informed consent, privacy, and confidentiality.

If you want to know more about statistics , methodology , or research bias , make sure to check out some of our other articles with explanations and examples.

  • Student’s  t -distribution
  • Normal distribution
  • Null and Alternative Hypotheses
  • Chi square tests
  • Confidence interval
  • Quartiles & Quantiles
  • Cluster sampling
  • Stratified sampling
  • Data cleansing
  • Reproducibility vs Replicability
  • Peer review
  • Prospective cohort study

Research bias

  • Implicit bias
  • Cognitive bias
  • Placebo effect
  • Hawthorne effect
  • Hindsight bias
  • Affect heuristic
  • Social desirability bias

Data analysis in qualitative observation often involves searching for any recurring patterns, themes, and categories in your data. This process may involve coding the data, developing conceptual frameworks or models, and conducting thematic analysis . This can help you generate strong hypotheses or theories based on your data.
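
As a small illustration of what the coding step produces, the sketch below tallies how often hypothetical codes were applied across transcripts; frequent codes become candidates for grouping into broader themes. The codes, transcripts and counts are invented, and this is a toy aid rather than a substitute for full thematic analysis.

```python
# Toy tally of qualitative codes across transcripts (all names and codes invented).
from collections import Counter

coded_segments = {
    "interview_01":   ["trust in team", "time pressure", "trust in team"],
    "interview_02":   ["time pressure", "role ambiguity"],
    "focus_group_01": ["trust in team", "role ambiguity", "time pressure"],
}

code_counts = Counter(code for codes in coded_segments.values() for code in codes)
for code, n in code_counts.most_common():
    print(f"{code}: applied {n} times")
# Frequent codes are candidates for grouping into broader themes during analysis.
```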

Quantitative research deals with numbers and statistics, while qualitative research deals with words and meanings.

Quantitative methods allow you to systematically measure variables and test hypotheses . Qualitative methods allow you to explore concepts and experiences in more detail.

An observational study is a great choice for you if your research question is based purely on observations. If there are ethical, logistical, or practical concerns that prevent you from conducting a traditional experiment , an observational study may be a good choice. In an observational study, there is no interference or manipulation of the research subjects, as well as no control or treatment groups .

Cite this Scribbr article

If you want to cite this source, you can copy and paste the citation or click the “Cite this Scribbr article” button to automatically add the citation to our free Citation Generator.

George, T. (2023, June 22). What Is Qualitative Observation? | Definition & Examples. Scribbr. Retrieved August 29, 2024, from https://www.scribbr.com/methodology/qualitative-observation/



Observational Studies: Cohort and Case-Control Studies

Jae W. Song

1 Research Fellow, Section of Plastic Surgery, Department of Surgery The University of Michigan Health System; Ann Arbor, MI

Kevin C. Chung

2 Professor of Surgery, Section of Plastic Surgery, Department of Surgery The University of Michigan Health System; Ann Arbor, MI

Observational studies are an important category of study designs. To address some investigative questions in plastic surgery, randomized controlled trials are not always indicated or ethical to conduct. Instead, observational studies may be the next best method to address these types of questions. Well-designed observational studies have been shown to provide results similar to randomized controlled trials, challenging the belief that observational studies are second-rate. Cohort studies and case-control studies are two primary types of observational studies that aid in evaluating associations between diseases and exposures. In this review article, we describe these study designs, methodological issues, and provide examples from the plastic surgery literature.

Because of the innovative nature of the specialty, plastic surgeons are frequently confronted with a spectrum of clinical questions by patients who inquire about “best practices.” It is thus essential that plastic surgeons know how to critically appraise the literature to understand and practice evidence-based medicine (EBM) and also contribute to the effort by carrying out high-quality investigations. 1 Well-designed randomized controlled trials (RCTs) have held the pre-eminent position in the hierarchy of EBM as level I evidence ( Table 1 ). However, RCT methodology, which was first developed for drug trials, can be difficult to conduct for surgical investigations. 3 Instead, well-designed observational studies, recognized as level II or III evidence, can play an important role in deriving evidence for plastic surgery. Results from observational studies are often criticized for being vulnerable to influences by unpredictable confounding factors. However, recent work has challenged this notion, showing comparable results between observational studies and RCTs. 4 , 5 Observational studies can also complement RCTs in hypothesis generation, establishing questions for future RCTs, and defining clinical conditions.

Levels of Evidence Based Medicine

Level I: High-quality, multicenter or single-center, randomized controlled trial with adequate power; or systematic review of these studies
Level II: Lesser-quality randomized controlled trial; prospective cohort study; or systematic review of these studies
Level III: Retrospective comparative study; case-control study; or systematic review of these studies
Level IV: Case-series
Level V: Expert opinion; case report or clinical example; or evidence based on physiology, bench research, or “first principles”

From REF 1 .

Observational studies fall under the category of analytic study designs, which are further sub-classified as observational or experimental ( Figure 1 ). The goal of analytic studies is to identify and evaluate causes or risk factors of diseases or health-related events. The differentiating characteristic between observational and experimental study designs is that in the latter, the presence or absence of undergoing an intervention defines the groups. By contrast, in an observational study, the investigator does not intervene and rather simply “observes” and assesses the strength of the relationship between an exposure and disease variable. 6 Three types of observational studies include cohort studies, case-control studies, and cross-sectional studies ( Figure 1 ). Case-control and cohort studies offer specific advantages by measuring disease occurrence and its association with an exposure within a temporal framework (i.e. prospective or retrospective study designs). Cross-sectional studies, also known as prevalence studies, examine data on disease and exposure at one particular time point ( Figure 2 ). 6 Because the temporal relationship between disease occurrence and exposure cannot be established, cross-sectional studies cannot assess the cause-and-effect relationship. In this review, we will primarily discuss cohort and case-control study designs and related methodologic issues.

Figure 1. Analytic Study Designs. Adapted with permission from Joseph Eisenberg, Ph.D.

Figure 2. Temporal Design of Observational Studies: Cross-sectional studies are known as prevalence studies and do not have an inherent temporal dimension. These studies evaluate subjects at one point in time, the present time. By contrast, cohort studies can be either retrospective (Latin-derived prefix “retro,” meaning “back, behind”) or prospective (Greek-derived prefix “pro,” meaning “before, in front of”). Retrospective studies “look back” in time, contrasting with prospective studies, which “look ahead” to examine causal associations. Case-control study designs are also retrospective and assess the history of the subject for the presence or absence of an exposure.

COHORT STUDY

The term “cohort” is derived from the Latin word cohors . Roman legions were composed of ten cohorts. During battle each cohort, or military unit, consisting of a specific number of warriors and commanding centurions, was traceable. The word “cohort” has been adopted into epidemiology to define a set of people followed over a period of time. W.H. Frost, an epidemiologist from the early 1900s, was the first to use the word “cohort” in his 1935 publication assessing age-specific mortality rates and tuberculosis. 7 The modern epidemiological definition of the word is a “group of people with defined characteristics who are followed up to determine incidence of, or mortality from, some specific disease, all causes of death, or some other outcome.” 7

Study Design

A well-designed cohort study can provide powerful results. In a cohort study, an outcome or disease-free study population is first identified by the exposure or event of interest and followed in time until the disease or outcome of interest occurs ( Figure 3A ). Because exposure is identified before the outcome, cohort studies have a temporal framework to assess causality and thus have the potential to provide the strongest scientific evidence. 8 Advantages and disadvantages of a cohort study are listed in Table 2 . 2 , 9 Cohort studies are particularly advantageous for examining rare exposures because subjects are selected by their exposure status. Additionally, the investigator can examine multiple outcomes simultaneously. Disadvantages include the need for a large sample size and the potentially long follow-up duration of the study design resulting in a costly endeavor.

Figure 3. Cohort and Case-Control Study Designs

Advantages and Disadvantages of the Cohort Study

Advantages:
  • Gather data regarding sequence of events; can assess causality
  • Examine multiple outcomes for a given exposure
  • Good for investigating rare exposures
  • Can calculate rates of disease in exposed and unexposed individuals over time (e.g. incidence, relative risk)

Disadvantages:
  • Large numbers of subjects are required to study rare exposures
  • Susceptible to selection bias
  • May be expensive to conduct
  • May require long durations for follow-up
  • Maintaining follow-up may be difficult
  • Susceptible to loss to follow-up or withdrawals
  • Susceptible to recall bias or information bias
  • Less control over variables

Cohort studies can be prospective or retrospective ( Figure 2 ). Prospective studies are carried out from the present time into the future. Because prospective studies are designed with specific data collection methods, they have the advantage of being tailored to collect specific exposure data and may be more complete. The disadvantage of a prospective cohort study may be the long follow-up period while waiting for events or diseases to occur. Thus, this study design is inefficient for investigating diseases with long latency periods and is vulnerable to a high loss to follow-up rate. Although prospective cohort studies are invaluable, as exemplified by the landmark Framingham Heart Study, started in 1948 and still ongoing, 10 in the plastic surgery literature this study design is generally seen as inefficient and impractical. Instead, retrospective cohort studies are better indicated given the timeliness and inexpensive nature of the study design.

Retrospective cohort studies, also known as historical cohort studies, are carried out at the present time and look to the past to examine medical events or outcomes. In other words, a cohort of subjects selected based on exposure status is chosen at the present time, and outcome data (i.e. disease status, event status), which were measured in the past, are reconstructed for analysis. The primary disadvantage of this study design is the limited control the investigator has over data collection. The existing data may be incomplete, inaccurate, or inconsistently measured between subjects. 2 However, because of the immediate availability of the data, this study design is comparatively less costly and shorter than prospective cohort studies. For example, Spear and colleagues examined the effect of obesity on complication rates after pedicled TRAM flap reconstruction by retrospectively reviewing 224 pedicled TRAM flaps in 200 patients over a 10-year period. 11 In this example, subjects who underwent pedicled TRAM flap reconstruction were selected and categorized into cohorts by their exposure status: normal/underweight, overweight, or obese. The outcomes of interest were various flap and donor site complications. The findings revealed that obese patients had a significantly higher incidence of donor site complications, multiple flap complications, and partial flap necrosis than normal or overweight patients. An advantage of the retrospective study design is the immediate access to the data. A disadvantage is the limited control over the data collection because the data were gathered retrospectively over 10 years; for example, a limitation reported by the authors is that mastectomy flap necrosis was not uniformly recorded for all subjects. 11
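
As a worked sketch of the incidence and relative-risk arithmetic that a cohort design supports, the numbers below are hypothetical counts invented for illustration; they are not taken from Spear and colleagues' data.

```python
# Hypothetical cohort data: complication (yes/no) by exposure group.
import math

exposed_events, exposed_total = 18, 60        # e.g. obese cohort
unexposed_events, unexposed_total = 12, 140   # e.g. normal-weight cohort

risk_exposed = exposed_events / exposed_total        # cumulative incidence, exposed
risk_unexposed = unexposed_events / unexposed_total  # cumulative incidence, unexposed
relative_risk = risk_exposed / risk_unexposed

# Approximate 95% CI for the relative risk on the log scale (Katz method)
se_log_rr = math.sqrt(1 / exposed_events - 1 / exposed_total
                      + 1 / unexposed_events - 1 / unexposed_total)
ci_low = math.exp(math.log(relative_risk) - 1.96 * se_log_rr)
ci_high = math.exp(math.log(relative_risk) + 1.96 * se_log_rr)

print(f"risk exposed {risk_exposed:.2f}, risk unexposed {risk_unexposed:.2f}")
print(f"RR = {relative_risk:.2f} (95% CI {ci_low:.2f}-{ci_high:.2f})")
```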

An important distinction lies between cohort studies and case-series. The distinguishing feature between these two types of studies is the presence of a control, or unexposed, group. Contrasting with epidemiological cohort studies, case-series are descriptive studies following one small group of subjects. In essence, they are extensions of case reports. Usually the cases are obtained from the authors' experiences, generally involve a small number of patients, and more importantly, lack a control group. 12 There is often confusion in designating studies as “cohort studies” when only one group of subjects is examined. Yet, unless a second comparative group serving as a control is present, these studies are defined as case-series. The next step in strengthening an observation from a case-series is selecting appropriate control groups to conduct a cohort or case-control study, the latter of which is discussed in the following section on case-control studies. 9

Methodological Issues

Selection of Subjects in Cohort Studies

The hallmark of a cohort study is defining the selected group of subjects by exposure status at the start of the investigation. A critical characteristic of subject selection is to have both the exposed and unexposed groups be selected from the same source population ( Figure 4 ). 9 Subjects who are not at risk for developing the outcome should be excluded from the study. The source population is determined by practical considerations, such as sampling. Subjects may be sampled from a hospital, from a community, or from a doctor's individual practice. A subset of these subjects will be eligible for the study.

Figure 4. Levels of Subject Selection. Adapted from Ref 9.

Attrition Bias (Loss to follow-up)

Because prospective cohort studies may require long follow-up periods, it is important to minimize loss to follow-up. Loss to follow-up is a situation in which the investigator loses contact with the subject, resulting in missing data. If too many subjects are lost to follow-up, the internal validity of the study is reduced. A general rule of thumb requires that the loss to follow-up rate not exceed 20% of the sample. 6 Any systematic differences related to the outcome or exposure of risk factors between those who drop out and those who stay in the study must be examined, if possible, by comparing individuals who remain in the study with those who were lost to follow-up or dropped out. It is therefore important to select subjects who can be followed for the entire duration of the cohort study. Methods to minimize loss to follow-up are listed in Table 3 .

Methods to Minimize Loss to Follow-Up

 Exclude subjects likely to be lost
  Planning to move
  Non-committal
 Obtain information to allow future tracking
  Collect subject's contact information (e.g. mailing addresses, telephone numbers, and email addresses)
  Collect social security and/or Medicare numbers
 Maintain periodic contact
  By telephone: may require calls during the weekends and/or evenings
  By mail: repeated mailings by e-mail or with stamped, self-addressed return envelopes
  Other: newsletters or token gifts with study logo

Adapted from REF 2 .

CASE-CONTROL STUDIES

Case-control studies were historically borne out of interest in disease etiology. The conceptual basis of the case-control study is similar to taking a history and physical; the diseased patient is questioned and examined, and elements from this history taking are knitted together to reveal characteristics or factors that predisposed the patient to the disease. In fact, the practice of interviewing patients about behaviors and conditions preceding illness dates back to the Hippocratic writings of the 4th century B.C. 7

Reasons of practicality and feasibility inherent in the study design typically dictate whether a cohort study or case-control study is appropriate. This study design was first recognized in Janet Lane-Claypon's 1926 study of breast cancer, which revealed that a low fertility rate raises the risk of breast cancer. 13 , 14 In the ensuing decades, case-control study methodology crystallized with the landmark publication linking smoking and lung cancer in the 1950s. 15 Since that time, retrospective case-control studies have become more prominent in the biomedical literature with more rigorous methodological advances in design, execution, and analysis.

Case-control studies identify subjects by outcome status at the outset of the investigation. Outcomes of interest may be whether the subject has undergone a specific type of surgery, experienced a complication, or is diagnosed with a disease ( Figure 3B ). Once outcome status is identified and subjects are categorized as cases, controls (subjects without the outcome but from the same source population) are selected. Data about exposure to a risk factor or several risk factors are then collected retrospectively, typically by interview, abstraction from records, or survey. Case-control studies are well suited to investigate rare outcomes or outcomes with a long latency period because subjects are selected from the outset by their outcome status. Thus in comparison to cohort studies, case-control studies are quick, relatively inexpensive to implement, require comparatively fewer subjects, and allow for multiple exposures or risk factors to be assessed for one outcome ( Table 4 ). 2 , 9

Advantages and Disadvantages of the Case-Control Study

Advantages:
  • Good for examining rare outcomes or outcomes with long latency
  • Relatively quick to conduct
  • Relatively inexpensive
  • Requires comparatively few subjects
  • Existing records can be used
  • Multiple exposures or risk factors can be examined

Disadvantages:
  • Susceptible to recall bias or information bias
  • Difficult to validate information
  • Control of extraneous variables may be incomplete
  • Selection of an appropriate comparison group may be difficult
  • Rates of disease in exposed and unexposed individuals cannot be determined

An example of a case-control investigation is by Zhang and colleagues, who examined the association of environmental and genetic factors with rare congenital microtia, 16 which has an estimated prevalence of 0.83 to 17.4 in 10,000. 17 They selected 121 congenital microtia cases based on clinical phenotype, and 152 unaffected controls, matched by age and sex in the same hospital and same period. Controls were of Han Chinese origin from Jiangsu, China, the same area from which the cases were selected. This allowed both the controls and cases to have the same genetic background, important to note given the investigated association between genetic factors and congenital microtia. To examine environmental factors, a questionnaire was administered to the mothers of both cases and controls. The authors concluded that adverse maternal health was among the main risk factors for congenital microtia, specifically maternal disease during pregnancy (OR 5.89, 95% CI 2.36-14.72), maternal toxicity exposure during pregnancy (OR 4.76, 95% CI 1.66-13.68), and resident area, such as living near industries associated with air pollution (OR 7.00, 95% CI 2.09-23.47). 16 A case-control study design is most efficient for this investigation, given the rarity of the disease outcome. Because congenital microtia is thought to have multifactorial causes, an additional advantage of the case-control study design in this example is the ability to examine multiple exposures and risk factors.
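
For readers unfamiliar with how odds ratios of this kind are obtained, the sketch below shows the arithmetic from a 2x2 table of cases and controls by exposure status, with a Woolf (logit) confidence interval. The counts are hypothetical and are not a reconstruction of Zhang and colleagues' data.

```python
# Hypothetical case-control 2x2 table: exposure (yes/no) among cases and controls.
import math

exposed_cases, unexposed_cases = 30, 91         # cases with / without the exposure
exposed_controls, unexposed_controls = 10, 142  # controls with / without the exposure

odds_ratio = (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)

# Woolf (logit) 95% confidence interval
se_log_or = math.sqrt(1 / exposed_cases + 1 / unexposed_cases
                      + 1 / exposed_controls + 1 / unexposed_controls)
ci_low = math.exp(math.log(odds_ratio) - 1.96 * se_log_or)
ci_high = math.exp(math.log(odds_ratio) + 1.96 * se_log_or)

print(f"OR = {odds_ratio:.2f} (95% CI {ci_low:.2f}-{ci_high:.2f})")
```

Note that, unlike the cohort example above, the case-control design yields odds ratios rather than risks, because subjects are sampled by outcome status.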

Selection of Cases

Sampling in a case-control study begins with selecting the cases. It is imperative that the investigator explicitly define inclusion and exclusion criteria before cases are selected. For example, if the outcome is having a disease, specific diagnostic criteria, disease subtype, stage of disease, or degree of severity should be defined. Such criteria ensure that all the cases are homogeneous. Second, cases may be selected from a variety of sources, including hospital patients, clinic patients, or community subjects. Many communities maintain registries of patients with certain diseases, and these registries can serve as a valuable source of cases. However, despite the methodologic convenience of such sources, validity issues may arise. For example, if cases are selected from a single hospital, the identified risk factors may be unique to that hospital, weakening the generalizability of the study findings. Similarly, cases drawn from a hospital will most likely represent a more severe form of the disease than cases in the community. 2 Finally, it is important to select cases that are representative of cases in the target population in order to strengthen the study's external validity ( Figure 4 ). Potential reasons why cases from the original target population eventually filter through and become available as cases (study participants) for a case-control study are illustrated in Figure 5 .

[Figure: Levels of Case Selection. Adapted from Ref 2.]

Selection of Controls

Selecting an appropriate group of controls can be one of the most demanding aspects of a case-control study. An important principle is that controls should reflect the exposure distribution of the source population that gave rise to the cases; in other words, both cases and controls should stem from the same source population. The investigator may also think of the control group as an at-risk population, with the potential to develop the outcome. Because the validity of the study depends on the comparability of these two groups, cases and controls should otherwise meet the same inclusion criteria.

A case-control study design that exemplifies this methodological feature is that of Chung and colleagues, who examined maternal cigarette smoking during pregnancy and the risk of newborns developing cleft lip/palate. 18 A salient feature of this study is the use of the 1996 U.S. Natality database, a population database, from which both cases and controls were selected. This database provided a large sample size to assess newborn development of cleft lip/palate (the outcome), which has a reported incidence of 1 in 1000 live births, 19 and also enabled the investigators to choose controls (i.e., healthy newborns) who were representative of the general population, strengthening the study's external validity. A significant relationship between maternal cigarette smoking and cleft lip/palate in the newborn was reported in this study (adjusted OR 1.34, 95% CI 1.36-1.76). 18

Matching is a method used to ensure comparability between cases and controls; it reduces variability and systematic differences due to background variables that are not of interest to the investigator. 8 In individual matching, each case is paired with a control subject with respect to the background variables, and exposure to the risk factor of interest is then compared between the cases and their matched controls. Age, sex, and race are often used to match cases and controls because they are frequently strong confounders of disease. 20 Confounders are variables that are associated with the exposure of interest and are also potential causes of the outcome. 8 Table 5 lists several advantages and disadvantages of a matching design; a minimal sketch of individual matching follows the table.

Table 5. Advantages and Disadvantages for Using a Matching Strategy

Advantages:
 Eliminates the influence of measurable confounders (e.g., age, sex)
 Eliminates the influence of confounders that are difficult to measure
 May be a sampling convenience, making it easier to select the controls in a case-control study
 May improve study efficiency (i.e., smaller sample size)

Disadvantages:
 May be time-consuming and expensive
 The decision to match, and which confounding variables to match on, must be made at the outset of the study
 Matched variables cannot be examined in the study
 Requires a matched analysis
 Vulnerable to overmatching, i.e., when the matching variable has some relationship with the outcome
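
To make individual matching concrete, the following minimal Python sketch pairs each case with one not-yet-used control that shares the same age group and sex. The records, field names, and identifiers are entirely hypothetical, and the matched analysis itself (e.g., conditional logistic regression) is not shown.

# Minimal sketch of 1:1 individual matching on age group and sex.
# All records and identifiers below are hypothetical.
from collections import defaultdict

cases = [
    {"id": "case1", "age_group": "30-39", "sex": "F"},
    {"id": "case2", "age_group": "40-49", "sex": "M"},
]
control_pool = [
    {"id": "ctrl1", "age_group": "30-39", "sex": "F"},
    {"id": "ctrl2", "age_group": "40-49", "sex": "M"},
    {"id": "ctrl3", "age_group": "30-39", "sex": "F"},
]

# Index the available controls by the matching variables.
available = defaultdict(list)
for ctrl in control_pool:
    available[(ctrl["age_group"], ctrl["sex"])].append(ctrl)

matched_pairs = []
for case in cases:
    key = (case["age_group"], case["sex"])
    if available[key]:                  # a matching control exists
        control = available[key].pop()  # use each control at most once
        matched_pairs.append((case["id"], control["id"]))

print(matched_pairs)  # e.g. [('case1', 'ctrl3'), ('case2', 'ctrl2')]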

Multiple Controls

Investigations of rare outcomes may have a limited number of cases to select from, whereas the source population from which controls can be drawn is much larger. In such scenarios, the study may provide more information if multiple controls are selected per case. This increases the statistical power of the investigation by increasing the sample size, and the precision of the findings may improve with up to about three or four controls per case; beyond that, additional controls yield diminishing gains. 21 - 23
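
A commonly cited rule of thumb (an approximation, not a result stated in the articles cited above) puts the statistical efficiency of a design with r controls per case, relative to an unlimited number of controls, at about r/(r+1). The short sketch below tabulates that ratio and shows why gains taper off after roughly three or four controls per case.

# Approximate relative efficiency of r controls per case vs. unlimited controls: r / (r + 1).
# This is a rule-of-thumb approximation, shown here for illustration only.
for r in range(1, 7):
    print(f"{r} control(s) per case: relative efficiency ~ {r / (r + 1):.2f}")
# Output: 1 -> 0.50, 2 -> 0.67, 3 -> 0.75, 4 -> 0.80, 5 -> 0.83, 6 -> 0.86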

Bias in Case-Control Studies

Evaluating exposure status can be the Achilles' heel of case-control studies. Because information about exposure is typically collected by self-report, interview, or abstraction from records, it is susceptible to recall bias and interviewer bias, and it depends on the completeness and accuracy of the recorded information. These biases decrease the internal validity of the investigation and should be carefully addressed and minimized in the study design. Recall bias occurs when cases and controls recall or report exposures differently; in the common scenario, a subject with the disease (a case) unconsciously recalls and reports an exposure with better clarity because of the disease experience. Interviewer bias occurs when the interviewer asks leading questions or uses an inconsistent interview approach between cases and controls. A good study design reduces interviewer bias by using well-trained interviewers and a standardized interview conducted in a nonjudgmental atmosphere. 9

The STROBE Statement: The Strengthening the Reporting of Observational Studies in Epidemiology Statement

In 2004, the first meeting of the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) group took place in Bristol, UK. 24 The aim of the group was to establish guidelines for reporting observational research in order to improve the transparency of methods and thereby facilitate the critical appraisal of a study's findings. A well-designed but poorly reported study is at a disadvantage in contributing to the literature because its results and generalizability may be difficult to assess. A 22-item checklist was therefore generated to enhance the reporting of observational studies across disciplines. 25 , 26 The checklist is also available at the following website: www.strobe-statement.org . The statement applies to cohort studies, case-control studies, and cross-sectional studies: 18 of the checklist items are common to all three types of observational studies, and 4 items are specific to each design. To provide additional guidance, an "explanation and elaboration" article was published to help users better appreciate each item on the checklist. 27 Plastic surgery investigators should review this checklist before designing a study and again when writing up the report for publication. In fact, some journals now require authors to follow the STROBE Statement; a list of participating journals can be found at this website: http://www.strobe-statement.org./index.php?id=strobe-endorsement .

Because of the limitations of carrying out RCTs in surgical investigations, observational studies are becoming more popular for investigating the relationships between exposures, such as risk factors or surgical interventions, and outcomes, such as disease states or complications. It is important for the plastic surgery community to recognize that well-designed observational studies can provide valid results, so that investigators can both critically appraise and appropriately design observational studies to address important clinical research questions. An investigator planning an observational study can use the STROBE Statement as a tool to outline the key features of the study and can return to it at the end to enhance the transparency of methodology reporting.

Acknowledgments

Supported in part by a Midcareer Investigator Award in Patient-Oriented Research (K24 AR053120) from the National Institute of Arthritis and Musculoskeletal and Skin Diseases (to Dr. Kevin C. Chung).

None of the authors has a financial interest in any of the products, devices, or drugs mentioned in this manuscript.

