How to Write a Monitoring and Evaluation Report

Good impact stories require excellent reporting. Having the right mix of well-visualized and well-applied quantitative evidence, positioning your case studies carefully, and providing the relevant theoretical background are all part of telling a good impact story.

Start your report when you are designing the M&E framework. As implementation progresses, the way teams understand their work changes too, and capturing these changes is key, particularly in contexts of complex or developmental evaluation. Good M&E depends on being clear about precisely how the problem statement was defined, and on having a clear, finalized logical framework for how the work initially set out to address this problem, established as close as possible to the start of implementation.

There are a few important aspects to think about as you compile your report:

Present the M&E system clearly at the start of the report. Include a diagram of your theoretical framework, as well as your more specific logical framework. The first should be stated in the language of results and change, and should include concepts that are linked to and justified by the empirical studies which form the basis of your programme rationale. You may draw on international studies or your own past evaluations, but providing this clear background, not just for the change you hope to see but for why the programme was designed in this way, is key for framing the project intention. The second, the logical framework, should be depicted in the language of indicators, and the direction and quantity of the change you are hoping to measure.

Be sure to outline and describe the context: the beneficiary groups and the circumstances of their lives. This is the section of the report where the problem statement and the programme rationale align, and it is also the point where you define the very heart of the change, whether you are providing skills, sharing information to change mindsets, or distributing resources. This is where you define any ongoing measures, such as assessments, and explain why they are relevant from the perspective of the beneficiaries, stated in the language of change.

Include a very clear description of the activities of the project. Although there is not space in an impact report to explore the full workings of a project, stating programme activities as the catalyst for change is extremely important: what activities are being undertaken, and why they are expected to create change.

Once you have clearly stated the intention of the project, as well as the intention and design of the system for measuring the project’s work, it is a good idea to clearly state and describe the methods used in the M&E function, and to justify the chosen method. Whether you are taking a rigorous quantitative approach, for example in public health projects where change can be clearly measured, or working in a complex social space and using a developmental or other realistic evaluation approach, the rest of the report will depend on the method you have chosen. Some methods, such as Social Return on Investment impact analyses, include clear guides on how to report, while more qualitative, research-oriented studies might be far less quantitatively analytical, but more explanatory.

Whichever course your particular work takes, honesty and clarity are key. It is far more useful to reveal a missed milestone or a goal not achieved (and why) than not to write about it. Given the scope and extent of the problems that the communities of the world aim to remedy, there are always more lessons to learn. Including a section on ‘what didn’t work’ shows not only that the M&E has been effective, but also that the organization is reflective and critical. For any project aiming to make serious change, this is the first step to ongoing improvement.

The bulk of the report should focus on the presentation of findings: the achievement of outputs and outcomes, and the project impact. The form this takes will depend largely on the type of work you are doing, and the adopted M&E methodology.

In summary, your report may take the following format:

Introduction

A strong introduction is key. Be sure to state the overall duration of the project, and its reach or total beneficiary numbers.

Background and Context

Include a problem statement as well as the theoretical background, or literature review.

Programme Overview

In this section, describe the programme: the overall vision (or results), the beneficiaries, and the programme activities designed to create the change.

Programme M&E

In this section, bring together the context and the programme design into a visual representation of the overall theoretical framework. Then include the log frame and a section on methodology.

Analysis of Outcomes & Impacts

In this section, explore the outputs and outcomes against the log frame as you have presented it. This may include graphical visualization of quantitative indicators, or case studies for more observation-based or qualitative change. Include a section on the overall impact, returning to and referencing the programme information outlined earlier in the report, but more importantly defining convergence, challenges, and lessons learnt.

Finally, once you have completed your report, take some time over the executive summary. Not only will this ensure that high-level stakeholders can get the salient information in a short amount of time, it will also provide the space to be more exploratory in the main body of the report. If you can summarize your outcomes and impact findings in an infographic, this can also be useful for your communications and fundraising strategies. Be prepared for a good impact report to take months, and to truly capture the complexity of change, M&E teams should take notes throughout programme implementation. Unlike an independent, external evaluation, an Impact Report can really be an opportunity to explore and explain the full length and breadth of change as you conduct your work. Whether entirely quantitative and working within a relatively closed system, or a record of the iterative growth and learning of your organization as it strives to have an impact, an Impact Report is your biography, your testament.

SYSTEMATIC REVIEW article

A survey of data quality measurement and monitoring tools.

Lisa Ehrlinger

  • 1 Institute for Application-Oriented Knowledge Processing (FAW), Johannes Kepler University, Linz, Austria
  • 2 Software Competence Center Hagenberg GmbH, Hagenberg, Austria

High-quality data is key to interpretable and trustworthy data analytics and the basis for meaningful data-driven decisions. In practical scenarios, data quality is typically associated with data preprocessing, profiling, and cleansing for subsequent tasks like data integration or data analytics. However, from a scientific perspective, a lot of research has been published about the measurement (i.e., the detection) of data quality issues and different generally applicable data quality dimensions and metrics have been discussed. In this work, we close the gap between data quality research and practical implementations with a detailed investigation on how data quality measurement and monitoring concepts are implemented in state-of-the-art tools . For the first time and in contrast to all existing data quality tool surveys, we conducted a systematic search, in which we identified 667 software tools dedicated to “data quality.” To evaluate the tools, we compiled a requirements catalog with three functionality areas: (1) data profiling, (2) data quality measurement in terms of metrics, and (3) automated data quality monitoring. Using a set of predefined exclusion criteria, we selected 13 tools (8 commercial and 5 open-source tools) that provide the investigated features and are not limited to a specific domain for detailed investigation. On the one hand, this survey allows a critical discussion of concepts that are widely accepted in research, but hardly implemented in any tool observed, for example, generally applicable data quality metrics. On the other hand, it reveals potential for functional enhancement of data quality tools and supports practitioners in the selection of appropriate tools for a given use case.

1. Introduction

Data quality (DQ) measurement is a fundamental building block for estimating the relevance of data-driven decisions. Such decisions accompany our everyday life, for instance, machine-based decisions in ranking algorithms, industrial robots, and self-driving cars in the emerging field of artificial intelligence. The negative impact of poor data on the error rate of machine learning (ML) models has been shown by Sessions and Valtorta (2006) and Ehrlinger et al. (2019) . Also human-based decisions rely on high-quality data, for example, the decision whether to promote or to suspend the production of a specific product is usually based on sales data. Despite the clear correlation between data and decision quality, 84 % of the CEOs in the US are concerned about their DQ ( KPMG International, 2016 ) and “organizations believe poor data quality to be responsible for an average of $15 million per year in losses” ( Moore, 2018 ). Thus, DQ is “no longer a question of ‘hygiene' [...], but rather has become critical for operational excellence” and is perceived as the greatest challenge in corporate data management ( Otto and Österle, 2016 ).

To increase the trust in data-driven decisions, it is necessary to measure, know, and improve the quality of the employed data with appropriate tools ( Ehrlinger et al., 2018 ; Heinrich et al., 2018 ). DQ improvement (i.e., data cleansing), which is based on DQ measurement, are both part of comprehensive DQ management. Most existing methodologies describe DQ management as cyclic process, which is carried out continuously (cf. Redman, 1997 ; Wang, 1998 ; English, 1999 ; Lee et al., 2009 ; Sebastian-Coleman, 2013 ). Yet, according to a German survey, 66 % of companies use Excel or Access solutions to validate their DQ and 63 % of the companies determine their DQ manually and ad hoc without any long-term DQ management strategy ( Schäffer and Beckmann, 2014 ). Considering such studies and the increasing amount of data to be processed, there is a clear need for intensive research to automate DQ management tasks. Sebastian-Coleman (2013) also states that “without automation , the speed and volume of data will quickly overwhelm even the most dedicated efforts to measure.”

Research about data quality has been conducted since the 1980s and since then, DQ is most often associated with the “fitness for use” principle ( Chrisman, 1983 ; Wang and Strong, 1996 ), which refers to the subjectivity and context-dependency of this topic. Data quality is typically referred to as a multi-dimensional concept, where single aspects are described by DQ dimensions (e.g., accuracy, completeness, timeliness). The fulfillment of a DQ dimension can be quantified using one or several DQ metrics ( Ehrlinger et al., 2018 ). According to the IEEE standard ( IEEE, 1998 ), a metric is a formula that yields a numerical value. In parallel to the scientific background, a wide variety of commercial, open-source, and academic DQ tools with different foci have been developed since then, in order to support the automation of DQ management. The range of functions offered by those tools varies widely, because the term “data quality” is context-dependent and not always used consistently. Despite the large number of publications, tools, and concepts related to data quality, it is not always clear how to map the concepts from the theory (i.e., dimensions and metrics) to a practical implementation (i.e., tools). Therefore, the question of how to measure and monitor DQ automatically is still not sufficiently answered ( Sebastian-Coleman, 2013 ). In this survey, we contribute to this research question by providing a detailed investigation of DQ measurement and monitoring functionalities in state-of-the-art DQ tools.

Specifically, we conducted a systematic search, where we identified 667 software tools dedicated to “data quality.” According to predefined exclusion criteria, we selected 13 DQ tools (8 commercial and 5 open-source tools) for deeper investigation. To systematically evaluate the functional scope of the tools, we introduce a requirements catalog comprising three categories: (1) data profiling, (2) DQ measurement in terms of dimensions and metrics, and (3) continuous DQ monitoring. Since the focus of this survey is on the automation of DQ tasks, we specifically observe the measurement capabilities (i.e., how to detect and report DQ issues) and to which extent the tools support automated DQ monitoring, required to ensure high-quality data over time ( Ehrlinger and Wöß, 2017 ). We deliberately exclude tools that solely offer data cleansing and improvement functions, because an automated modification of the data (i.e., data cleansing) is usually not possible in productive information systems with critical content. Consequently, our main contributions can be summarized as follows:

• To the best of our knowledge, we conducted the first systematic search to identify DQ tools and thus, give a comprehensive overview on the market.

• We compiled a requirements catalog to investigate data profiling, DQ measurement, and automated DQ monitoring functionalities of DQ tools. This catalog summarizes and classifies tasks that are required for automated and continuous DQ measurement in a new way and supports follow-up studies, e.g., on domain-specific DQ tools.

• Based on the detailed investigation of 13 DQ tools, we propose a new research direction for DQ measurement and highlight potential for enhancement in the DQ tools.

The results of this survey are not only relevant for DQ professionals to select the most appropriate tool for a given use case, but also highlight the current capabilities of state-of-the-art DQ tools. Especially since such a wide variety of DQ tools exist, it is often not clear which functional scope can be expected. The main findings of this article can be summarized as follows:

• Despite the presumption that the emerging market of DQ tools is still under development (cf. Selvage et al., 2017 ), we found a vast number (667) of DQ tools through our systematic search, where most of them have never been included in one of the existing surveys.

• Approximately half (50.82 %) of the DQ tools were domain specific, which means they were either dedicated to specific types of data or built to measure the DQ of a proprietary tool.

• 16.67 % of the DQ tools focused on data cleansing without a proper DQ measurement strategy (i.e., measurements are used to modify the data, but no comprehensive reports are provided).

• Most surveyed tools supported data profiling to some extent, but considering the research state, there is potential for functional enhancement in data profiling, especially with respect to multi-column profiling and dependency discovery.

• We did not find a tool that implements a wider range of DQ metrics for the most important DQ dimensions as proposed in research papers (cf. Piro, 2014 ; Batini and Scannapieco, 2016 ; Heinrich et al., 2018 ). Identified metric implementations have several drawbacks: some are only applicable on attribute-level (e.g., no aggregation), some require a gold standard that might not exist, and some have implementation errors.

• In general-purpose DQ tools, DQ monitoring is considered a premium feature, which is liable to costs and only provided in professional versions. Exceptions are dedicated open-source DQ monitoring tools, like Apache Griffin or MobyDQ, which support the automation of rules, but lack predefined functions and data profiling capabilities.

This article is structured as follows: Section 2 summarizes related research concerning DQ management, measurement, and monitoring. Section 3 covers the applied methodology to conduct this research, including related surveys, our research questions, and the tool selection strategy. Based on the existing research from Section 2, we introduce a new requirements catalog to evaluate DQ tools and the accompanying evaluation strategy in Section 4. In Section 5, we describe the tools, which have been selected for investigation, and discuss the evaluation. The results and lessons learned are summarized in Section 6. We conclude in Section 7 with an outlook on future work.

2. Theoretical Background on Data Quality

Despite different existing interpretations, the term “data quality” is most frequently described as “fitness for use” ( Chrisman, 1983 ; Wang and Strong, 1996 ), referring to the high subjectivity and context-dependency of this topic. Information quality is often used as synonym for data quality and even though both terms can be clearly distinguished, because “data” refers to plain facts and “information” describes the extension of those facts with context and semantics, they are often used interchangeably in the DQ literature ( Wang, 1998 ; Zhu et al., 2014 ). We use the term data quality because our focus is on processing objectively, automatically retrievable facts (i.e., intrinsic data characteristics). The term information serves as synonym for data in the systematic search to achieve higher coverage.

2.1. Data Quality Management

The Data Management Association (DAMA) defines “data quality management” as the analysis, improvement and assurance of data quality ( Otto and Österle, 2016 ). Over the years, a number of different DQ methodologies (also declared as “frameworks,” “programs,” or “methods”) have been proposed, for example, TDQM (Total Data Quality Management) by Wang (1998) , AIMQ (A Methodology for Information Quality Assessment) by Lee et al. (2002) , and the DQ assessment methods by Pipino et al. (2002) and Maydanchik (2007) . Batini et al. (2009) conducted a comprehensive comparison of DQ methodologies in 2009, and Cichy and Rass (2019) provide a recent overview on generally applicable DQ methodologies in 2019. Although these methodologies have different characteristics and emphases, it is possible to extract four core activities (cf. English, 1999 ; Maydanchik, 2007 ; Batini et al., 2009 ; Cichy and Rass, 2019 ): (1) state reconstruction, (2) DQ measurement or assessment, (3) data cleansing or improvement, and (4) the establishment of continuous DQ monitoring. Not all methodologies include all of these steps, for example, step (1) is omitted by Maydanchik (2007) and step (4) is omitted in the DQ methodology survey by Batini et al. (2009) . Further, some methodologies include additional activities like monitoring of data integration interfaces (cf. Maydanchik, 2007 ), which we do not consider because of their specialization. In the following paragraphs, we describe the four core steps of a DQ methodology in detail to clarify the difference between DQ measurement, DQ monitoring, and data cleansing activities, where the latter ones are not included in the survey. Step (1), the state reconstruction, describes the collection of contextual information on the observed data, as well as on the organization where a DQ project is carried out ( Batini et al., 2009 ). Since the focus of this article is on DQ tool functionalities, we restrict step (1) in the following to the data part (i.e., data profiling) and do not describe gathering of contextual information on the organization in detail.

2.1.1. Data Profiling

Data profiling is described as the process of analyzing a dataset to collect data about data (i.e., metadata) using a broad range of techniques ( Naumann, 2014 ; Abedjan et al., 2015 , 2019 ). Thus, it is an essential task prior to any DQ measurement or monitoring activity to get insight into a given dataset. Exemplary information that is gathered during data profiling are the number of distinct or missing (i.e., null) values in a column, data types of attributes, or occurring patterns and their frequency (e.g., formatting of telephone numbers) ( Abedjan et al., 2015 ). We refer to Abedjan et al. (2015 , 2019) for a detailed discussion on data profiling techniques and tasks. According to Selvage et al. (2017) and the findings of our survey, most general-purpose DQ tools offer data profiling capabilities to some extent.
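To make the notion of single-column profiling concrete, here is a minimal Python sketch that collects a few of the metadata items mentioned above (null counts, distinct values, data types, and value patterns) with pandas. The column names and the generalization of digits and letters into patterns are illustrative assumptions, not the output format of any particular tool surveyed here.

```python
import re
import pandas as pd

def profile_column(series: pd.Series) -> dict:
    """Collect basic single-column profiling metadata (nulls, distinct values, patterns)."""
    def to_pattern(value) -> str:
        # Generalize each value into a pattern, e.g., "+43 732" -> "+99 999"
        s = str(value)
        s = re.sub(r"[A-Za-z]", "a", s)
        s = re.sub(r"[0-9]", "9", s)
        return s

    non_null = series.dropna()
    return {
        "dtype": str(series.dtype),
        "num_nulls": int(series.isna().sum()),      # number of null values
        "pct_nulls": float(series.isna().mean()),   # percentage of null values
        "num_distinct": int(non_null.nunique()),    # cardinality
        "top_patterns": non_null.map(to_pattern).value_counts().head(3).to_dict(),
    }

# Example usage with a small, made-up customer table
df = pd.DataFrame({
    "phone": ["+43 732 1234", "0732-5678", None, "+43 732 9999"],
    "city": ["Linz", "Hagenberg", "Linz", None],
})
for col in df.columns:
    print(col, profile_column(df[col]))
```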

2.1.2. Data Quality Measurement

According to Sebastian-Coleman (2013) , one of the biggest challenges for DQ practitioners is to answer the question of how data quality should actually be measured. Ge and Helfert (2007) indicate that this is also true for the synonymously used term assessment by stating that one of the major questions in DQ research is “How to assess data quality?” The term measure describes “to ascertain the size, amount, or degree of something by using an instrument or device marked in standard units or by comparing it with an object of known size” ( McKean, 2005 ). Although the term assessment is often used as a synonym for measurement, especially in DQ literature there is a clear distinction between both terms. Assessment is the “evaluation or estimation of the nature, ability, or quality of something” and extends the concept of measurement by evaluating the measurement results and drawing a conclusion about the object of assessment ( McKean, 2005 ; Sebastian-Coleman, 2013 ). DQ assessment is also described as the detection and initial estimation of data quality as well as the impact analysis of occurring DQ problems ( English, 1999 ; Apel et al., 2015 ). In this survey, we use the term measurement since the focus is on measurement capabilities of DQ tools, independently of the interpretation of the results by a user.

In addition to scientific publications, standards should represent the consensus of practitioners and researchers alike. In terms of data quality, there has been considerable work done by the ISO/IEC JTC 1 (“Information technology”) subcommittee 7 on “Software and systems engineering.” SC 7's working group 06 published ( ISO/IEC 25012:2008, 2008 ; ISO/IEC 25040:2011, 2011 ; ISO/IEC 25024:2015(E), 2015 ). In parallel, subcommittee SC 4 “Industrial data” of the technical committee ISO/TC 184 (“Industrial automation systems and integration”) published ( ISO 8000-8:2015(E), 2015 ). While ( ISO 8000-8:2015(E), 2015 ) defines prerequisites for the measurement and reporting of information and data quality on a very general level, ( ISO/IEC 25012:2008, 2008 ) provides more concrete DQ measures as well as an explanation of how to apply them. According to ISO 8000-8:2015(E) (2015) , data can be measured on a very general level according to (1) syntactic quality, which describes the degree to which data conforms to a specified syntax, (2) semantic quality, i.e., the degree to which data corresponds to its real representation, or (3) pragmatic quality, i.e., the degree to which data is suitable for a specific purpose. ISO/IEC 25012:2008 (2008) defines the “measurement” (of data quality) as a “set of operations having the object of determining a value of a measure” and defines a set of normalized quality measures (between 0 and 1).

The partition of data quality into a set of DQ dimensions, which can be measured with metrics, is widely accepted in DQ research (cf. Wang and Strong, 1996 ; Lee et al., 2009 ; Batini and Scannapieco, 2016 ). For example, Lee et al. (2009) state that “DQ assessment requires assessments along a number of dimensions.” The quality measures provided by ISO/IEC 25012:2008 (2008 ) correspond to the most popular metrics in literature (e.g., accuracy, completeness, consistency). Despite the wide agreement on DQ dimensions and metrics (i.e., measures) in general and a lot of research over the last decades, there is still no consensus on a standardized list of dimensions and metrics for DQ measurement ( Sebastian-Coleman, 2013 ; Myers, 2017 ). Thus, we observe existing DQ dimensions and metrics and justify their inclusion in our requirements catalog in Section 2.2.

2.1.3. Data Cleansing

Data cleansing describes the process of correcting erroneous data or data glitches ( Dasu and Johnson, 2003 ). In practice, automatable cleansing tasks include customer data standardization, de-duplication, and matching. Other efforts to improve DQ are usually performed manually. While automated data cleansing methods are very valuable for large amounts of data, they pose the risk of inserting new errors, a risk that is rarely well understood ( Maydanchik, 2007 ). We intentionally did not observe data cleansing functionalities in this survey, since the focus is on the detection of DQ problems. However, data cleansing algorithms are usually based on DQ measurement, since it is initially necessary to detect DQ problems to increase the quality of a given dataset.

2.1.4. Data Quality Monitoring

The term “DQ monitoring” is mainly used implicitly in the literature, without an established definition and common understanding. This leads to different interpretations when the term is mentioned in scientific publications or by companies promoting and describing their DQ tool. There is a difference between “data monitoring,” which describes continuous checking of rules, and “DQ monitoring,” which is ongoing measurement of DQ ( Ehrlinger and Wöß, 2017 ). The aim of this survey is to observe not only the functionalities of current DQ tools in terms of data profiling and measurement, but also in terms of true DQ monitoring. Pushkarev et al. (2010) and a follow-up study ( Pulla et al., 2016 ) point out that none of the tools observed had any monitoring functionality. We nevertheless include this criterion in our requirements catalog, since several DQ tool websites indicate monitoring functionalities that were not observed by Pushkarev et al. (2010) and Pulla et al. (2016) . According to the ISO standard 8000 ( ISO 8000-8:2015(E), 2015 ), pragmatic data quality measurement requires interaction with the respective users who validate the data. Consequently, fully automated DQ monitoring is restricted to syntactic and semantic DQ aspects.
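The distinction between one-off measurement and DQ monitoring can be illustrated with a small sketch: repeatedly compute a metric, store each result with a timestamp, and flag deviations from recent history. This is a minimal illustration of the idea, assuming a simple completeness metric and an arbitrary deviation threshold; it is not modeled on any specific tool discussed in this survey.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import mean
from typing import Callable, List

import pandas as pd

@dataclass
class Measurement:
    timestamp: datetime
    metric: str
    value: float

class DQMonitor:
    """Stores metric values over time and flags drops against the recent history."""

    def __init__(self, metric_name: str, metric_fn: Callable[[pd.DataFrame], float],
                 window: int = 5, tolerance: float = 0.05):
        self.metric_name = metric_name
        self.metric_fn = metric_fn
        self.window = window          # number of past measurements to compare against
        self.tolerance = tolerance    # allowed deviation from the historical mean
        self.history: List[Measurement] = []

    def run(self, df: pd.DataFrame) -> Measurement:
        value = self.metric_fn(df)
        measurement = Measurement(datetime.now(), self.metric_name, value)
        past = [m.value for m in self.history[-self.window:]]
        if past and value < mean(past) - self.tolerance:
            print(f"ALERT: {self.metric_name} dropped to {value:.2f} "
                  f"(recent mean {mean(past):.2f})")
        self.history.append(measurement)
        return measurement

# Example: monitor table-level completeness (share of non-null cells) over time
completeness = lambda df: float(df.notna().to_numpy().mean())
monitor = DQMonitor("completeness", completeness)
monitor.run(pd.DataFrame({"a": [1, 2, 3], "b": ["x", "y", "z"]}))        # 1.00
monitor.run(pd.DataFrame({"a": [1, None, None], "b": ["x", None, "z"]}))  # triggers an alert
```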

2.2. Data Quality Dimensions and Metrics

Data quality is often described as a concept with multiple dimensions, so that every DQ dimension refers to a specific aspect of the quality of data ( Ehrlinger and Wöß, 2019 ). Over the years, a wide variety of dimensions and dimension classifications have been proposed ( Ballou and Pazer, 1985 ; Wand and Wang, 1996 ; Wang and Strong, 1996 ; Pipino et al., 2002 ; Ge and Helfert, 2007 ; Batini and Scannapieco, 2016 ). An overview of possible dimensions and classifications is provided by Laranjeiro et al. (2015) and Scannapieco and Catarci (2002) . Despite intensive research and an ongoing discussion on DQ dimensions, there is still no consensus on which dimensions are essential for DQ measurement ( Sebastian-Coleman, 2013 ). Our evaluation framework covers the four most frequently used dimensions: accuracy, completeness, consistency, and timeliness ( Wand and Wang, 1996 ; Scannapieco and Catarci, 2002 ; Hildebrand et al., 2015 ).

Piro (2014) distinguishes between “hard dimensions” (including accuracy, completeness, and timeliness, amongst others), which can be measured objectively using check routines, and “soft dimensions,” which can only be assessed using subjective evaluation. However, also objective check routines require a preceding subjective and domain-specific definition of the data objects to be measured, in order to consequently follow the “fitness for use” approach ( Piro, 2014 ).

In conjunction with the discussion of DQ dimensions, it is often mentioned that the definition of specific DQ metrics is required to apply those dimensions in practice. A metric is a function that maps a quality dimension to a numerical value, which allows an interpretation of a dimension's fulfillment ( IEEE, 1998 ). Such a DQ metric can be measured on different aggregation levels: on value-level, column or attribute-level, tuple or record-level, table or relation-level, as well as database (DB)-level ( Hildebrand et al., 2015 ). The aggregation could, for example, be performed with the weighted arithmetic mean of the calculated metric results from the previous level (e.g., results of the record-level to calculate the table-level metric) ( Hinrichs, 2002 ). Heinrich et al. (2018) proposed five requirements for DQ metrics to ensure reliable decision-making: “the existence of minimum and maximum metric values (R1), the interval scaling of the metric values (R2), the quality of the configuration parameters and the determination of the metric values (R3), the sound aggregation of the metric values (R4), and the economic efficiency of the metric (R5).” However, other researchers claim “that a more general approach is required” ( Bronselaer et al., 2018 ) to assess the usefulness and validity of a DQ metric. In the following, we describe four prominent DQ dimensions along with common metrics for their calculation. The list of metrics is not exhaustive, but should give an impression about the research conducted in this area since we observe the existence of such or similar metrics in our DQ tool evaluation.

2.2.1. Accuracy

Although accuracy is sometimes described as the most important data quality dimension, a number of different definitions exist ( Wand and Wang, 1996 ; Haegemans et al., 2016 ). In DQ literature, accuracy can be described as the closeness between an information system and the part of the real-world it is supposed to model ( Batini and Scannapieco, 2016 ). From the natural sciences perspective, accuracy is usually defined as the “magnitude of an error” ( Haegemans et al., 2016 ). We refer to Haegemans et al. (2016) for a detailed discussion on the definitions of accuracy and a comprehensive list of metrics related to accuracy. Here, we provide a few exemplary metrics. Redman (2005) defines field- and record-level accuracy as follows:

$$\text{Acc} = \frac{\text{number of data units in error}}{\text{total number of data units}} \qquad (\text{Redman, 2005}).$$

This metric is also reused by the DAMA UK ( Askham et al., 2013 ) by generalizing “fields” and “records” to “objects.” Lee et al. (2009) use the inverse metric $1 - \frac{\text{number of data units in error}}{\text{total number of data units}}$, and Fisher et al. (2009) additionally take into account the randomness of the occurrence of an error (ROE) and the probability distribution of the occurrence of an error (PDOE) ( Fisher et al., 2009 ).

Hinrichs (2002) proposed an accuracy metric that can be aggregated on different levels. On attribute-value-level, the metric Q_Gen for accuracy (Gen is “Genauigkeit” in German, which means “accuracy” in English) is defined for numeric values by the ratio between a value's arity and its optimal arity. For a numeric attribute A, s_opt(A) is the optimal number of digits and decimals for A, w is a value of A, and s(w) is the actual number of digits and decimals for w in attribute A. Since s_opt(A) is not necessarily maximal, the metric needs to be normalized to [0, 1] ( Hinrichs, 2002 ).

For non-numeric attributes, Hinrichs (2002) suggests assigning w to plane i within a classification K with n planes (K_1, ..., K_n), replacing s(w) with i, and selecting s_opt(A) from K with s_opt(A) ≤ n. For a tuple t, accuracy Q_Gen is measured as the weighted arithmetic mean of the attribute-value accuracies:

$$Q_{Gen}(t) = \frac{\sum_{j=1}^{n} Q_{Gen}(t.A_j) \cdot g_j}{\sum_{j=1}^{n} g_j} \qquad (\text{Hinrichs, 2002}),$$

where t.A_1, ..., t.A_n are the attribute values for attributes A_1, ..., A_n that specify the observed tuple t. Factor g_j is the relative importance of A_j with respect to the total tuple and is an expert-defined weight ( Hinrichs, 2002 ). The accuracy on table-level is then calculated as the arithmetic mean of the tuple accuracy measurements, and the accuracy on DB-level is the arithmetic mean of the table-level accuracy measurements. For a more detailed discussion of the metric, we refer to Hinrichs (2002) .
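As an illustration of how such an accuracy metric can be rolled up, the following sketch compares attribute values against a reference (“gold standard”) dataset and aggregates the results with a weighted arithmetic mean per tuple and an arithmetic mean per table, following the aggregation scheme described above. The reference-based comparison at value level is a deliberate simplification standing in for the value-level accuracy functions discussed in this section; the data, weights, and column names are made up.

```python
import pandas as pd

def value_accuracy(value, reference_value) -> float:
    """Value-level accuracy: 1.0 if the value matches the gold standard, else 0.0."""
    return 1.0 if value == reference_value else 0.0

def tuple_accuracy(row: pd.Series, ref_row: pd.Series, weights: dict) -> float:
    """Weighted arithmetic mean of value-level accuracies (weights g_j per attribute)."""
    weighted = sum(value_accuracy(row[a], ref_row[a]) * g for a, g in weights.items())
    return weighted / sum(weights.values())

def table_accuracy(df: pd.DataFrame, ref: pd.DataFrame, weights: dict) -> float:
    """Table-level accuracy as the arithmetic mean of the tuple-level results."""
    scores = [tuple_accuracy(df.loc[i], ref.loc[i], weights) for i in df.index]
    return sum(scores) / len(scores)

# Hypothetical customer data and its gold standard
data = pd.DataFrame({"name": ["Anna", "Bob"], "zip": ["4040", "4232"]})
gold = pd.DataFrame({"name": ["Anna", "Bob"], "zip": ["4040", "4230"]})
weights = {"name": 2.0, "zip": 1.0}   # expert-defined relative importance g_j

print(table_accuracy(data, gold, weights))  # 0.833...: one zip code is wrong
```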

2.2.2. Completeness

Completeness is very generally described as the “breadth, depth, and scope of information contained in the data” ( Wang and Strong, 1996 ; Batini and Scannapieco, 2016 ) and covers the condition for data to exist. Considering related work (cf. Redman, 1997 ; Hinrichs, 2002 ; Lee et al., 2009 ; Ehrlinger et al., 2018 ), the most generic metric for completeness can be defined as:

$$\text{Completeness} = \frac{|e_c|}{|e|},$$

where |e_c| is the number of complete elements and |e| is the total number of elements. Here, the generic term “element” can refer to any data unit, e.g., an attribute, a record, or a table. Lee et al. (2009) use the inverse metric $1 - \frac{\text{number of incomplete elements}}{\text{total number of elements}}$ ( Lee et al., 2009 ) and Batini and Scannapieco (2016) suggest comparing the number of complete elements to the (total) number of elements in a perfect reference dataset. A more detailed specification on how to calculate completeness is provided by Hinrichs (2002) , who assigns 0.0 to a field value that is null or equivalent and 1.0 otherwise. Based on this assumption, completeness can be calculated analogously to the accuracy metric on different aggregation levels with the weighted arithmetic mean. For example, the completeness Q_Voll (Voll is “Vollständigkeit” in German, which means “completeness” in English) on table-level is defined as:

$$Q_{Voll}(T) = \frac{1}{|T|} \sum_{i=1}^{|T|} Q_{Voll}(t_i),$$

where |T| is the number of records in table T and Q_Voll(t_i) is the completeness of record t_i. We want to point out that in addition to the assumption by Hinrichs, who counts true missing values (i.e., null), it is also possible to approach completeness in a more rigorous way by considering default values or textual entries stating “NaN” (i.e., not a number) as incomplete values.

Although Hinrichs does not propose a completeness metric per attribute (i.e., column) and other related work like Askham et al. (2013) describes attribute-level completeness only textually, such a metric can be derived from the description and the generic completeness metric above as follows:

$$Q_{Voll}(A) = \frac{|v_c|}{|v|},$$

where |v| is the total number of values within a column and |v_c| is the number of complete values that are not null.
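A minimal Python sketch of these completeness metrics, assuming pandas data frames: it computes attribute-level completeness |v_c|/|v|, record-level completeness, and table-level completeness as the mean of the record-level values, and optionally treats placeholder strings such as "NaN" as incomplete, as suggested above. The placeholder set and the example data are assumptions chosen for illustration.

```python
import pandas as pd

PLACEHOLDERS = {"", "NaN", "n/a", "unknown"}  # assumed placeholder values, configurable

def is_complete(value, strict: bool = False) -> bool:
    """A value counts as complete if it is not null (and, in strict mode, not a placeholder)."""
    if pd.isna(value):
        return False
    return not (strict and str(value).strip() in PLACEHOLDERS)

def column_completeness(series: pd.Series, strict: bool = False) -> float:
    """Attribute-level completeness: |v_c| / |v|."""
    return sum(is_complete(v, strict) for v in series) / len(series)

def record_completeness(row: pd.Series, strict: bool = False) -> float:
    return sum(is_complete(v, strict) for v in row) / len(row)

def table_completeness(df: pd.DataFrame, strict: bool = False) -> float:
    """Table-level completeness: arithmetic mean of the record-level values."""
    return sum(record_completeness(df.loc[i], strict) for i in df.index) / len(df)

df = pd.DataFrame({"name": ["Anna", None, "Cara"], "phone": ["1234", "NaN", None]})
print(column_completeness(df["phone"]))               # 0.67: only the true null counts
print(column_completeness(df["phone"], strict=True))  # 0.33: "NaN" counted as incomplete
print(table_completeness(df, strict=True))            # 0.5
```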

2.2.3. Consistency

There are also different definitions for the consistency dimension. According to Batini and Scannapieco (2016) , “consistency captures the violation of semantic rules defined over data items, where items can be tuples of relational tables or records in a file.” An example for such rules are integrity constraints from the relational theory. Hinrichs (2002) assumes for his proposed consistency metric that domain knowledge is encoded into rules and excludes contradictions within the rules and fuzzy or probabilistic assumptions. Consequently, the consistency Q_Kon (Kon is “Konsistenz” in German, which means “consistency” in English) of an attribute value w is defined as

$$Q_{Kon}(w) = \frac{1}{\sum_{j=1}^{n} r_j(w) \cdot g_j + 1},$$

where g_j is the degree of severity of r_j(w), and r_j(w) is the violation of consistency rule r_j (within a set of n consistency rules), applied to the attribute value w, and defined as

$$r_j(w) = \begin{cases} 1 & \text{if } w \text{ violates rule } r_j \\ 0 & \text{otherwise} \end{cases} \qquad (\text{Hinrichs, 2002}).$$

Consistency rules can be defined not only on attribute-value-level, but also on tuple-level. In alignment with the accuracy and completeness metrics, the consistency on table- or database-level is calculated as the arithmetic mean of the tuple-level consistency ( Hinrichs, 2002 ).
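The rule-based view of consistency translates directly into code. The sketch below checks a record against a small set of weighted consistency rules and derives a normalized score in (0, 1]; the concrete rules, weights, and the 1/(x+1) normalization mirror the Hinrichs-style metric reconstructed above and are illustrative assumptions rather than any tool's built-in rule language.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class ConsistencyRule:
    name: str
    severity: float                      # weight g_j of the rule
    predicate: Callable[[dict], bool]    # returns True if the record satisfies the rule

RULES = [
    ConsistencyRule("age_non_negative", 2.0, lambda r: r["age"] >= 0),
    ConsistencyRule("zip_has_4_digits", 1.0, lambda r: len(str(r["zip"])) == 4),
    ConsistencyRule("minor_has_guardian", 1.5,
                    lambda r: r["age"] >= 18 or r["guardian"] is not None),
]

def consistency_score(record: dict) -> float:
    """Weighted violation count, normalized to (0, 1]: 1.0 means no rule is violated."""
    violation_weight = sum(rule.severity for rule in RULES if not rule.predicate(record))
    return 1.0 / (violation_weight + 1.0)

print(consistency_score({"age": 30, "zip": 4040, "guardian": None}))   # 1.0
print(consistency_score({"age": 12, "zip": 40400, "guardian": None}))  # ~0.29: two violations
```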

Sebastian-Coleman (2013) suggests measuring consistency over time by comparing the “record count distribution of values (column profile) to past instances of data populating the same field.”
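This distribution-based check can also be sketched briefly: compare the current value distribution of a column (its column profile) to a stored historical profile and report the largest shift in relative frequency. The threshold and the comparison by absolute frequency difference are assumptions chosen for illustration; Sebastian-Coleman (2013) does not prescribe a particular distance measure.

```python
import pandas as pd

def column_profile(series: pd.Series) -> dict:
    """Relative frequency of each value (the 'column profile')."""
    return series.value_counts(normalize=True, dropna=False).to_dict()

def max_frequency_shift(current: dict, past: dict) -> float:
    """Largest absolute change in relative frequency between two profiles."""
    keys = set(current) | set(past)
    return max(abs(current.get(k, 0.0) - past.get(k, 0.0)) for k in keys)

past_profile = column_profile(pd.Series(["A", "A", "B", "C", "A", "B"]))
new_profile = column_profile(pd.Series(["A", "C", "C", "C", "C", "B"]))

shift = max_frequency_shift(new_profile, past_profile)
if shift > 0.2:   # assumed alerting threshold
    print(f"Possible consistency issue: value distribution shifted by {shift:.2f}")
```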

2.2.4. Timeliness

Timeliness describes “how current the data are for the task at hand” ( Batini and Scannapieco, 2016 ) and is closely connected to the notions of currency (update frequency of data) and volatility (how fast data becomes irrelevant). A different definition states that “timeliness can be interpreted as the probability that an attribute value is still up-to-date” ( Heinrich et al., 2007 ). A list of different metrics to calculate timeliness is provided by Heinrich and Klier (2009) , where the authors suggest calculating timeliness based on the definition by Heinrich et al. (2007) according to:

$$\text{Timeliness}(\omega, A) = \exp\big(-\text{decline}(A) \cdot \text{age}(\omega)\big) \qquad (\text{Heinrich et al., 2007}),$$

where ω is the considered attribute value, age(ω) is its age, and decline(A) is the decline rate, which specifies the average number of attribute values that become outdated within the time period t ( Heinrich et al., 2007 ).
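A short sketch of this exponential-decay timeliness metric; the decline rate and ages below are made-up example numbers, not values from the literature.

```python
from math import exp

def timeliness(age_in_years: float, decline_rate: float) -> float:
    """Probability-style timeliness score: exp(-decline(A) * age(w))."""
    return exp(-decline_rate * age_in_years)

# Assume that, on average, 20% of stored addresses become outdated per year
DECLINE_RATE_ADDRESS = 0.2

for age in (0.0, 1.0, 5.0):
    score = timeliness(age, DECLINE_RATE_ADDRESS)
    print(f"address value aged {age} years -> timeliness {score:.2f}")
# prints 1.00, 0.82, and 0.37
```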

This list of metrics for the DQ dimensions accuracy, completeness, timeliness, and consistency, is by no means exhaustive, but a comprehensive discussion would be out of scope for this article. We conclude that literature offers a number of specifically formulated metrics to measure DQ dimensions and this survey observes their implementation in state-of-the-art DQ tools.

2.3. Requirements for Data Quality Tools

In general, there are very few scientific papers that study the functional scope of DQ tools, and even fewer that propose a dedicated requirements catalog for their evaluation. The differentiation of our DQ tool survey from existing ones (and consequently from their requirements) is explained in detail in Section 3.1. In summary, the previously proposed requirements were either not detailed enough or had a different functional focus.

In addition to existing surveys, Goasdoué et al. (2007) explicitly proposed an evaluation framework for DQ tools without publishing the results of their evaluation. The proposed requirements were adapted to the context of the company they performed the DQ tool evaluation for, Électricité de France (EDF), a French electric utility company, and more precisely to its CRM (customer relationship management) environments. Thus, the main differences to our requirements catalog are a more detailed evaluation of address normalization, duplicate detection, and reporting capabilities, but less detail in data profiling and no coverage of DQ monitoring functionality.

In addition to requirements defined by researchers, there are several practitioner- and vendor-focused surveys by Gartner Inc. (cf. Judah et al., 2016 ; Selvage et al., 2017 ; Chien and Jain, 2019 ), which observe DQ tools by means of the following DQ capabilities: connectivity, data profiling, measurement and visualization, monitoring, parsing, standardization and cleaning, matching, linking and merging, multi-domain support, address validation/geocoding, data curation and enrichment, issue resolution and workflow, metadata management, DevOps environment, deployment environment, architecture and integration, and usability. Similarly, Loshin (2010) defines the following eight requirements a DQ tool must offer: “data profiling, parsing, standardization, identity resolution, record linkage and merging, data cleansing, data enhancement, and data inspection and monitoring.” Such lists of requirements were too coarse grained for our aim to specifically observe data profiling functionality, DQ measurement, and DQ monitoring functionality. While general features like connectivity and usability of the tools are not necessary to answer our research questions, we added a short textual description to each tool we observed.

3. Survey Methodology

A systematic survey is usually started by defining a “protocol that specifies the research questions being addressed and the methods that will be used” ( Kitchenham, 2004 ). This section describes the protocol we developed to systematically conduct our survey. The structure of the protocol has been derived from the methodology for systematic reviews in computer science by Kitchenham (2004) . Since the focus in Kitchenham (2004) is on the evaluation of primary research papers and not on specific implementations, we omit steps 5, 6, and 7 of the suggested planning information, including quality assessment, a data extraction strategy, and the synthesis of the extracted data from the original research papers.

3.1. Related Surveys

Although a lot of DQ methods and tools have been published, there are few scientific studies about the functional scope of DQ tools. Gartner Inc. (cf. Judah et al., 2016 ; Selvage et al., 2017 ; Chien and Jain, 2019 ) lists the strengths and cautions of vendors of commercial DQ tools in their “Magic Quadrant for Data Quality Tools” 2016 (17 vendors), 2017 (16 vendors), and 2019 (15 vendors). They include vendors that offer software tools or cloud-based services, which deliver general-purpose DQ functionalities, including at least profiling, parsing, standardization/cleansing, matching, and monitoring ( Selvage et al., 2017 ). The study is vendor-focused and does not provide a detailed comparison of the respective data quality tools in terms of functionality (e.g., measurement and monitoring capabilities). However, the “Magic Quadrant for Data Quality Tools” contains a representative selection of commercial DQ tools, which is a valuable complement to our survey. The closest survey to our work in terms of tool comparison structure has been published by Fraunhofer IAO in German ( Kokemüller and Haupt, 2012 ). While Kokemüller and Haupt (2012) focus on tools popular in the German market, we aim at a scientific approach that observes the availability of DQ tools from a general perspective and also justifies the tool selection.

Woodall et al. (2014) categorize different methods to assess and improve DQ. They understand DQ methods as automatically executable algorithms to detect or correct a DQ problem, e.g., column analysis, data verification, or data standardization. As basis for their classification, they reviewed the list of DQ tools included in the “Magic Quadrant for Data Quality Tools 2012” by Gartner and extracted a list of DQ methods that tackle specific DQ problems. Woodall et al. (2014) do not provide an in-depth comparison of which method is contained in which tool since their focus is on the method classification.

Barateiro and Galhardas (2005) compared 9 academic and 28 commercial DQ tools in a scientific survey. This article does not cover state-of-the-art tools, and the survey was not conducted in a systematic way, which means it is unclear how the list of DQ tools was selected. In addition, the authors state that DQ tools aim at detecting and correcting data problems, which is why they observe functionalities for both DQ measurement and data cleansing, with an emphasis on the latter. In contrast, we focus on the measurement of data quality issues only, with special consideration of long-term monitoring functionality.

Pushkarev et al. (2010) proposed an overview of 7 open-source or freely available DQ tools. They described each tool briefly and compared the functionalities of the tools by means of performance criteria (including 6 usability features like data source connectivity, report creation, or the graphical user interface—GUI) and core functionality. The core functionality consists of 4 groups, which are further subdivided into specific features that are observed: data profiling (e.g., data pattern discovery), data integration (e.g., ETL), data cleansing (e.g., parsing and standardization), and data monitoring. Due to the limited number of pages, Pushkarev et al. do not provide detailed insights into the implementation of specific criteria, and mainly distinguish between the availability of a feature (Y) or its absence (N). For example, the authors list 9 usability criteria for the GUI, but in the evaluation they only distinguish between (g) representing “not user friendly GUI” and a (G) for “user-friendly GUI” with drag and drop functionality. Pulla et al. (2016) published a revised version of the tool overview, which is very similar to the original work in terms of structure and methodology. They used the same criteria structure as Pushkarev et al. (2010) , but omitted the data monitoring group [since it is not provided by any of the tools according to Pulla et al. (2016) ] and 4 other sub-features without further justification. The list of investigated DQ tools was extended from 7 to 10. Our survey differs notably from these two papers since we conducted a systematic search to select DQ tools and also investigated commercial tools, while Pushkarev et al. (2010) and Pulla et al. (2016) presented a predefined selection of free or open-source tools without publishing their selection strategy. Moreover, we focus on data profiling, DQ measurement, and DQ monitoring and evaluate these feature groups with a more detailed and comprehensive criteria catalog than provided by the other published surveys mentioned above.

Another study by Gao et al. (2016) focuses on big data quality assurance. However, the authors did not clarify the methodology, that is, the selection of the investigated tools and evaluation criteria. In contrast to our survey, where the focus is on the actual DQ measurement functionalities, the comparison in Gao et al. (2016) includes mainly technical features like the supported operating system and data sources, as well as a limited list of 4 basic data validation functions.

Table 1 provides an overview of related DQ tool surveys and compares them to our work. It can be seen that there exists no other survey that (1) conducted a systematic search to select the DQ tools for investigation, (2) addresses both practitioners and researchers, and (3) investigates data profiling, DQ measurement, DQ monitoring, as well as the vendors in terms of customer support. In contrast to other surveys that focus mainly on commercial or open-source tools, we provide a good digest of the market by investigating a total of 13 DQ tools, of which five are open-source and eight are commercial.

Table 1 . Comparison of related data quality tool surveys.

3.2. Research Questions

The aim of this survey is to evaluate and compare existing DQ tools with respect to their DQ measurement and monitoring functionalities in order to answer the research question how DQ measurement and monitoring concepts are implemented in state-of-the-art DQ tools . This research question can be refined with three sub-questions, where the theoretical background is discussed in Section 2. In Section 4.1, we present our requirements catalog, in which each sub-question is assigned to specific technical requirements.

1. Which data profiling capabilities are supported by current DQ tools?

2. Which data quality dimensions and metrics can be measured with current DQ tools?

3. Do DQ tools allow automated data quality monitoring over time?

3.3. DQ Tool Search Strategy

To establish a comprehensive list of existing DQ tools, we developed a three-fold strategy. First, we included all observed tools from previous surveys by Barateiro and Galhardas (2005) , Kokemüller and Haupt (2012 ), Gao et al. (2016) , Selvage et al. (2017) , Pulla et al. (2016) , and Pushkarev et al. (2010) as candidate tools. Second, we conducted a systematic search to find research papers that introduce, investigate, or mention DQ tools. The third part of our search strategy consists of a random Google search by using the same search term combinations as for the systematic search. In contrast to the systematic search, we do not aim at a comprehensive observation of all search results, which is unfeasible for Google search results. However, to also identify non-research tools that have not been described in scientific papers, we consider this random search as enrichment to guarantee a best possible coverage of candidate tools. The remainder of this section is dedicated to the systematic search.

We identified the following search terms to conduct the systematic search: data quality, information quality , and tool . Since “information quality” is considered a synonym to “data quality” ( Zhu et al., 2014 ), we applied both search terms to achieve higher coverage. We decided not to add the terms “assessment” and “monitoring” to the search, as it would automatically exclude tools that do not specifically use these keywords. Consequently, the following search expression has been applied:

(“data quality” ∨ “information quality”) ∧ tool

The search expression has then been applied to the list of digital libraries that is provided in Table 2 . We also included the software development platform GitHub, because the purpose of this search is to select concrete tools. The original aim was to search all titles and abstracts from the computer science domain. However, since each digital library offers different search functionalities, we selected the closest search-engine-specific settings to reflect our original search aim. Table 2 documents the deviations for each conducted search along with the ultimately utilized search expression, which is already formatted according to the guidelines of the respective search engine. For the GitHub search, we additionally omitted the search term tool , because most GitHub results are obviously tools (except for empty repositories, code samples, or documentations).

Table 2 . Systematic search.

For each search result, we assessed the title and abstract to determine whether a paper actually promotes a candidate DQ tool or not. In cases where title and abstract were not explicit enough, or where they indicated the presentation of a tool (and the paper therefore could not be directly classified as not relevant), the content of the paper was investigated in more detail to record the name and purpose of the tool in a first step. In the GitHub search, we immediately excluded all tools that did not offer any kind of description and used the others as candidates. Figure 1 illustrates the number of investigated research papers and the resulting tools. The next section describes the subsequent investigation of all candidate tools according to defined exclusion criteria (EC).

Figure 1 . Systematic search.

3.4. DQ Tool Selection

In accordance with our general search strategy, we defined three inclusion criteria. Each tool that was selected as candidate tool had to satisfy at least one of the following three criteria.

1. The tool was included in one of the previous surveys (cf. Barateiro and Galhardas, 2005 ; Pushkarev et al., 2010 ; Kokemüller and Haupt, 2012 ; Gao et al., 2016 ; Pulla et al., 2016 ; Selvage et al., 2017 ).

2. The tool was identified in our systematic search.

3. The tool was identified in our random search.

Figure 1 shows the number of scientific papers (#Papers), which we found in the systematic search per source, as well as the number of tools (#Tools) that were mentioned in these papers. It can be seen that some papers mention several DQ tools (e.g., other DQ tool surveys), while some use the term in their title or abstract, but do not refer to a concrete tool directly. In total, 1,298 papers have been discovered through the systematic search, which refer to 567 DQ tools (this number includes duplicates). In the related surveys we located 110 tools (including duplicates) and added 43 additional tools from the random Google search. In the next step, all 720 tools were merged into one file to remove duplicates. This resulted in a total of 667 identified distinct DQ tools. After establishing the list of candidate tools, we conducted a review to exclude all tools from the survey that met at least one of the following exclusion criteria.

(EC1) The tool is domain-specific (e.g., for web data or a specific implementation only).

(EC2) The tool is dedicated to specific data management tasks without explicitly offering DQ measurement.

(a) The tool is dedicated to data cleansing.

(b) The tool is dedicated to data integration (including on-the-fly DQ checks).

(c) The tool is dedicated to other data management tasks (e.g., data visualization).

(EC3) The tool is not publicly available (e.g., the tool is only described in a research paper).

(EC4) The tool is considered deprecated (i.e., the vendor does not exist any more or the tool was found on GitHub and the last commit was before January 1st, 2016).

(EC5) The tool was found on GitHub without any further information available.

(EC6) The tool requires a fee and no free trial is offered upon request.

The table in Figure 1 shows how many tools were excluded per criterion (multiple selection was possible). Most of the tools were excluded because they are domain specific (EC1) and/or focus on specific data management tasks (EC2). The 267 tools excluded due to EC2 are divided between the three subcriteria as follows: 111 tools were excluded by EC2(a), 46 tools were excluded by EC2(b) and 110 tools by EC2(c).

For the search process and selection, we used Microsoft Excel to collect the identified scientific papers from the search engine results, to assemble a uniform list of identified DQ tools, and to remove duplicate tools. We tracked the exclusion of the tools according to our six criteria in a separate Excel file. In total, 17 DQ tools were selected for deeper investigation, of which 13 could be evaluated: three were based on SAP, for which no installation was available, and one (IBM InfoSphere Information Server) could not be installed successfully during the time of the project, despite great effort but with little support from IBM.

3.5. Limitations of This Study

As pointed out by Pateli and Giaglis (2004) , “the selection phase is critical, since decisions made at this stage undoubtedly have a considerable impact on the validity of the literature review results.” For our survey, we consider the conduct of the selection process, and consequently its inherent limitations (cf. Kitchenham et al., 2009 ), as the main threats to its validity. In this section, we specifically discuss the comprehensiveness of our tool search strategy and the stringency of the exclusion criteria.

The following measures mitigated the risk of missing an important research paper and subsequently a DQ tool: (1) we used the online search engine Google Scholar in addition to the main publisher websites, (2) we specifically observed references from existing DQ tool surveys, and (3) we included a manual Google search in parallel to the systematic search.

Considering the ratio between the number of DQ tools selected for deeper investigation and the total number of identified DQ tools (17/667), the exclusion criteria might seem very stringent. We argue that they have been selected adequately for this survey due to the following reasons. First, we want to point out that there is a huge number of DQ tools (especially among the 296 found on GitHub) that are only simple scripts to clean specific data sets. Two examples are SQL-Utils 1 , which consists of five SQL scripts for cleaning data and performing simple DQ checks, and DescribeCol 2 , which consists of one Python function that implements DQ tests by describing and visualizing a Pandas DataFrame. Although the dedicated investigation of domain-specific DQ tools (EC1) is interesting future work, a further restriction of these tools would be required to compile meaningful results for such a study. Second, we deliberately excluded DQ tools that are restricted to specific data management tasks (e.g., data cleansing), because they do not help answer our general research question of how DQ measurement and monitoring concepts are implemented in state-of-the-art DQ tools. Third, the time invested was about one person-month per tool. This was on the one hand due to the detailed requirements catalog (cf. Table 3 ), and on the other hand, for some tools already the installation or the negotiation with the customer support (e.g., to receive a fully functional trial license) was very time-consuming. Considering this time effort, the investigation of all 667 DQ tools, or even only the 339 domain-specific tools, would be out of scope to answer our research question. Fourth, the number of selected DQ tools seems reasonable compared to related surveys. Investigating a considerably larger number of DQ tools would require the refinement of the entire evaluation process.

Table 3 . DQ tool requirements catalog.

4. Design of the Evaluation Process

As outlined in Section 2.3, existing requirement frameworks for DQ tools did not adequately answer our research question. Thus, we developed a new catalog of requirements for the evaluation of DQ measurement and monitoring tools, which is discussed in the following subsection. The aim is to rate the fulfillment of each requirement with three categories: (✓) for fulfilled, (−) for not fulfilled, and ( p ) for partially fulfilled. In Section 4.2, we discuss the database used for the evaluation of the requirements and in Section 4.3 we list the predefined test cases to compare specific results between the investigated DQ tools.

4.1. Evaluation Requirements Catalog

Our requirements catalog in Table 3 consists of three main categories: data profiling (DP), data quality measurement (DQM), and continuous data quality monitoring (CDQM). The requirements for data profiling are based on the classification of DP tasks by Abedjan et al., which was originally published by Abedjan et al. (2015) and updated by Abedjan et al. (2019). Since we started our survey prior to the classification update, our requirements catalog constitutes a tradeoff between the two versions. Since both versions contain the two sub-categories “single column (SC) profiling” and “dependency detection,” we adhere here to the newer version by Abedjan et al. (2019). In the SC sub-category, we split the null values task (i.e., number or percentage of null values) into two different requirements, (DP-2) number of null values and (DP-3) percentage of null values, to separate the results. The newer version (Abedjan et al., 2019) contains an additional sub-category “metadata for non-relational data,” which is not included in our survey, because the evaluation of some tools with a fixed-period trial version was already completed at the time of the update. However, the original version (Abedjan et al., 2015) included a category “multi-column (MC) profiling,” which was removed by Abedjan et al. (2019). We renamed this category to “advanced MC profiling” and added it, along with two additional requirements (exact and relaxed duplicate tuple detection), to the end of the DP category. One reason for the exclusion of the MC sub-category from the data profiling task taxonomy by Abedjan et al. (2019) might be the strong overlap of these tasks with the field of data mining. Abedjan et al. (2019) point out that there is no clearly defined and widely accepted distinction between the two research fields. Thus, although a separate category for those requirements could be argued for, we decided to include them in the data profiling category, because data mining is not the focus of our survey.

The category for DQ measurement contains requirements to provide metrics for specific DQ dimensions as well as business rule management capabilities. While we explicitly list metrics for the DQ dimensions accuracy, completeness, consistency, and timeliness as described in Section 2.2, we investigate the existence of additional metrics during our evaluation by means of (DQM-34). Since DQ dimensions such as consistency are often measured with a set of rules (cf. Section 2.2.3) and the development of business rules is generally regarded as the basis for DQ measurement in some methodologies (cf. Sebastian-Coleman, 2013), we expanded our catalog to include (DQM-35) to (DQM-37). We distinguish between the (DQM-35) creation of domain-specific business rules and the (DQM-36) availability of general integrity rules, for example, that a birth date cannot be in the future or that a temperature value cannot fall below −273.15 °C. It should also be possible to verify those rules (DQM-37).
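To make this distinction concrete, the following minimal sketch (our own illustration, not taken from any evaluated tool) shows how domain-specific business rules (DQM-35) and general integrity rules (DQM-36) could be defined and verified (DQM-37) over a pandas DataFrame; all column names and thresholds are assumptions.

```python
import pandas as pd

# Hypothetical data with one implausible birth date and one impossible temperature.
customers = pd.DataFrame({
    "BirthDate": pd.to_datetime(["1985-04-12", "2031-01-01", "1990-07-30"]),
    "TemperatureC": [21.5, -280.0, 19.0],
})

# General integrity rules (DQM-36): valid independently of the application domain.
general_rules = {
    "birth_date_not_in_future": lambda df: df["BirthDate"] <= pd.Timestamp.now(),
    "temperature_above_absolute_zero": lambda df: df["TemperatureC"] >= -273.15,
}

# Domain-specific business rule (DQM-35): only meaningful for this particular data set.
domain_rules = {
    "customer_is_adult": lambda df: (pd.Timestamp.now() - df["BirthDate"]).dt.days >= 18 * 365,
}

def verify_rules(df: pd.DataFrame, rules: dict) -> pd.DataFrame:
    """Rule verification (DQM-37): share of records that satisfy each rule."""
    rows = [(name, float(check(df).mean())) for name, check in rules.items()]
    return pd.DataFrame(rows, columns=["rule", "pass_ratio"])

print(verify_rules(customers, {**general_rules, **domain_rules}))
```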

The requirements for CDQM are based on the findings from our previous research published by Ehrlinger and Wöß (2017) and summarize key tasks to ensure automated DQ monitoring over time. The continuous measurement, storage, and usage of the collected metadata should be possible for both data profiling results and DQ measurements.
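The following sketch illustrates the CDQM idea in its simplest form, without assuming anything about a particular tool: a metric is computed repeatedly and each result is stored with a timestamp (here in SQLite) so that its evolution can be inspected later; the table and column names are hypothetical, and a real setup would trigger the run via a scheduler.

```python
import sqlite3
from datetime import datetime, timezone
import pandas as pd

def null_percentage(df: pd.DataFrame, column: str) -> float:
    """DP-3 style metric: percentage of null values in one column."""
    return df[column].isna().mean() * 100

def record_measurement(conn: sqlite3.Connection, table: str, column: str, value: float) -> None:
    """Store one metric result together with a timestamp (the 'storage' part of CDQM)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS dq_history (ts TEXT, tbl TEXT, col TEXT, metric TEXT, value REAL)"
    )
    conn.execute(
        "INSERT INTO dq_history VALUES (?, ?, ?, ?, ?)",
        (datetime.now(timezone.utc).isoformat(), table, column, "null_percentage", value),
    )
    conn.commit()

def monitoring_run(conn: sqlite3.Connection, supplier: pd.DataFrame) -> None:
    """One monitoring iteration; in practice this would be scheduled (e.g., via cron)."""
    record_measurement(conn, "Supplier", "Fax", null_percentage(supplier, "Fax"))

conn = sqlite3.connect("dq_history.db")
monitoring_run(conn, pd.DataFrame({"Fax": ["123", None, None]}))
print(pd.read_sql("SELECT * FROM dq_history", conn))   # metadata usage over time
```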

4.2. Evaluation Database

For the evaluation of the requirements from Table 3, we used a modernized version of the well-known Northwind DB published by dofactory 3. Figure 2 illustrates the schema of the database with its five tables as a UML (Unified Modeling Language) class diagram. Foreign key relationships and their cardinalities are represented in UML notation.

Figure 2. Schema of the Northwind evaluation DB.

4.3. Data Profiling Test Cases

To compare the results of the requirements between the DQ tools, we defined a test case for each requirement from the data profiling category. We did not define such fine-grained test cases for the DQ measurement category, since the DQ metric implementations were too diverse to compare their results directly. The requirements of the DQ monitoring category also do not yield a comparable result (e.g., in the form of numbers), and hence there are no test cases for them. The following list comprises all test cases we performed for the DP category, whereby the enumeration can be linked to the DP requirements from Table 3 (a small pandas sketch of the first few checks is given at the end of this subsection):

1. Number of rows in table Product.

2. Number of null values in column Supplier.Fax.

3. Percentage of null values in column Supplier.Fax.

4. Number of distinct values in column Customer.Country.

5. Number of distinct values divided by the number of rows for Customer.Country.

6. Frequency histograms for Customer.Country.

7. Minimum and maximum values in OrderItem.UnitPrice.

8. Constancy for column Customer.Country.

9. Quartiles in column OrderItem.UnitPrice.

10. Distribution of first digit = 1 in column UnitPrice, table OrderItem.

11. Basic types for ProductName, UnitPrice, and isDiscontinued in table Product.

12. DBMS-specific data types for ProductName, UnitPrice, and isDiscontinued in table Product.

13. Minimum, maximum, average, and median value length of column Product.ProductName.

14. Maximum number of digits in column Product.UnitPrice.

15. Maximum number of decimals in column Product.UnitPrice.

16. Count of pattern “AA” in Customer.Country, derived from the histogram.

17. Semantic data types for ProductName, UnitPrice, and isDiscontinued in table Product.

18. Semantic domains for ProductName, UnitPrice, and isDiscontinued in table Product.

19. All 100 % conforming UCCs in Order.

20. All 98 % conforming UCCs in Order.

21. All 100 % conforming INDs between Order.CustomerId and Customer.Id.

22. All 93 % conforming INDs between Order.CustomerId and Customer.Id.

23. All 100 % conforming FDs in Order.

24. All 93 % conforming FDs in Order.

25. Correlation between OrderItem.UnitPrice and OrderItem.Quantity.

26. All possible association rules within Product.

27. Clustering the values in Product.UnitPrice.

28. All “very high values” in Order.TotalAmount.

29. All exact duplicates in Customer, considering FirstName and LastName only.

30. All relaxed duplicates in Customer, considering FirstName and LastName only.

All test cases were conducted by two researchers (one of whom is the lead author of this article), who verified each other's results.
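As a concrete illustration of the simpler test cases (1–8), the following pandas sketch shows one possible implementation; the CSV exports of the evaluation DB and the DataFrame names are assumptions and are not taken from any evaluated tool.

```python
import pandas as pd

# Hypothetical CSV exports of the Northwind evaluation DB.
product = pd.read_csv("Product.csv")
supplier = pd.read_csv("Supplier.csv")
customer = pd.read_csv("Customer.csv")
order_item = pd.read_csv("OrderItem.csv")

row_count = len(product)                                        # test case 1
null_count = supplier["Fax"].isna().sum()                       # test case 2
null_pct = supplier["Fax"].isna().mean() * 100                  # test case 3
distinct_count = customer["Country"].nunique()                  # test case 4
distinctness = distinct_count / len(customer)                   # test case 5
histogram = customer["Country"].value_counts()                  # test case 6
price_min = order_item["UnitPrice"].min()                       # test case 7
price_max = order_item["UnitPrice"].max()
constancy = customer["Country"].value_counts().iloc[0] / len(customer)   # test case 8

print(f"null percentage in Supplier.Fax: {null_pct:.2f} %")     # tools reported 55, 55.2, or 55.17
```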

5. Data Quality Tool Evaluation

In this section, we first describe the DQ tools, which we selected for the evaluation, and second, we investigate the selected tools with respect to our evaluation framework and discuss the requirements.

5.1. Selected Data Quality Tools

In total, we selected 17 DQ tools for detailed evaluation. Three of them were based on SAP (SAP Information Steward, the DQ solution by ISO Professional Services, and dspCompose by BackOffice Associates GmbH), and since we had no access to an SAP installation, we did not include these tools in our evaluation but describe them textually. To achieve a comparable overview of the investigated DQ tools, we formulated the following seven questions.

• Which exact version did we evaluate? (DQ tool name and version).

• Who is the vendor or creator of the tool?

• Is the tool open-source?

• How did we perceive the user interface? (1–5 rating, 5 is best).

• How did we perceive customer support? (1–5 rating, 5 is best).

• How was the investigated DQ tool provided? (e.g., freely available on GitHub/SourceForge or trial license).

• In which scientific paper or on which online platform was the DQ tool found?

An overview of the answers to these questions is given in Table 4, and a detailed discussion is provided in the following subsections (DQ tools listed in alphabetical order). Since the focus of this survey is on the measurement functionality of DQ tools, technical details such as the deployment model (i.e., on-premise vs. SaaS) were not relevant for answering our research question. We refer to related surveys for more technical details, especially the Gartner Magic Quadrant (cf. Chien and Jain, 2019) and Fraunhofer IAO (cf. Kokemüller and Haupt, 2012).

Table 4. Summary of investigated DQ tools.

5.1.1. Aggregate Profiler

Aggregate Profiler (AP) is a freely available DQ tool that is dedicated to data profiling. The tool was discovered twice in our systematic search: once because it was mentioned by Dai et al. (2016) in the Springer search results, and once in the Google search results, since it is also published on SourceForge as “Open Source Data Quality and Profiling,” 4 developed by arrah and arunwizz. In addition to its data profiling capabilities, such as statistical analysis and pattern matching, Aggregate Profiler can also be used for data preparation and cleansing activities, such as address correction or duplicate removal. Moreover, business rules can be defined and scheduled in user-defined periods. We perceived the user interface (UI) as inferior compared to other tools, since the navigation and the application of DP functions were not intuitive.

5.1.2. Apache Griffin

Apache Griffin 5 (AG) differs significantly from the other tools in this survey, because it does not offer any data profiling functionality and is not a comprehensive DQ solution. However, since part of the evaluation is to observe the extent to which current tools support CDQM, we included Apache Griffin, as it is dedicated to continuously measuring the quality of Big Data, both batch-based and streaming. We installed Apache Griffin 0.2.0, which is still in Apache incubator status, on Ubuntu 18.04. The tool requires the following dependencies, some of which were (at the time of the installation) still in incubating status as well: JDK (1.8+), MySQL DB, npm, Hadoop (2.6.0+), Spark (2.2.1+), Hive (2.2.0), Livy, and ElasticSearch. Due to these dependencies, the installation was very cumbersome compared to the other tools. In our case, two experienced computer scientists needed over a week to complete the full installation. Once installed, the UI is intuitive and supports the domain-specific definition of accuracy metrics as well as the scheduling and monitoring of those metrics. Other DQ metrics, such as completeness, are planned to be integrated in future versions.

5.1.3. Ataccama ONE

The company Ataccama, with its headquarters in Canada, offers several DQ products, which we found through different sources in our search: Data Quality Center and Master Data Center were previously investigated by Kokemüller and Haupt (2012); DQ Analyzer was included in Pushkarev et al. (2010), Pulla et al. (2016), and Abedjan et al. (2015). Gartner additionally mentioned the DQ Issue Tracker and the DQ Dashboard in 2016 (Judah et al., 2016). However, since 2017, Ataccama has consolidated its separate DQ solutions into “Ataccama ONE” (A-ONE). While the license of the full DQ solution is subject to costs, the data profiling module of Ataccama ONE can be accessed freely. Unfortunately, Ataccama customer support did not provide us with a trial license of the complete ONE solution. Thus, we were only able to investigate the free “Ataccama ONE profiler,” 6 whose focus is on data profiling and which does not provide monitoring functionality. We performed the evaluation of the online-available tool during October 2018. According to Gartner (cf. Selvage et al., 2017) and Ataccama customer support, the full solution would provide a much richer scope of functions, including DQ monitoring, but we were not able to investigate it. The data profiling module was very intuitive and easy to use, also for business users. In terms of customer support (from Prague), we experienced very long response times to our contact attempts for a license request. Additionally, we were promised a training session as a prerequisite to test the full Ataccama ONE solution, which never took place due to the workload on the side of Ataccama.

5.1.4. DataCleaner by Human Inference

The DQ products “DataCleaner” (DC) and “DataHub” were originally developed by Human Inference, which was incorporated into Neopost in 2012, later into Quadient, and since 2019 into the EDM Media Group, where it is again promoted under its original name “Human Inference.” DataCleaner offers dedicated and independent DQ measurement functionality, although pure data cleansing functions might be expected due to its name. Our customer contact declared that the professional version of DataCleaner (in contrast to the community edition that is freely available on GitHub) offers the same DQ measurement functionalities as DataHub and differs only with respect to usability, the UI, and the data integration features. Thus, we evaluated a full trial of DataCleaner Enterprise Edition, which is aimed at people with a technical background. In addition, we were able to observe the functionalities of DataHub in an interactive web session. Human Inference places emphasis on customer data, which is reflected in special algorithms for duplicate detection, address matching, and data cleansing. Under the vendor Quadient, DataCleaner was mentioned by Selvage et al. (2017) (Gartner Inc.), but excluded from the follow-up survey by Chien and Jain (2019) due to strategic changes. DataCleaner was previously observed by Pushkarev et al. (2010), Gao et al. (2016), and Pulla et al. (2016), but under different vendors. Although DataCleaner is built for technical users, we perceived the UI as very intuitive. DataHub (with its vision of a single customer view) offers, in addition to the administrator's view, a data steward view, which is specifically dedicated to business users, for example, to resolve ambiguous duplicates. We also want to highlight [in conformance with Selvage et al. (2017)] the very helpful and friendly customer support that provided us with the trial license and more insight into DataHub.

5.1.5. Datamartist by nModal Solutions Inc.

The commercial tool Datamartist 7 (DM) by nModal Solutions Inc. requires the operating system Microsoft Windows and the .NET framework 2.0 to be installed. Datamartist is dedicated to data profiling and data transformation. The investigated 30-day trial offers all Pro edition features. Since the trial could be downloaded from the website directly, we did not consult customer support. We perceived the UI of Datamartist as slightly inferior compared to other commercial tools, since for some tasks (e.g., exporting data profiling results) the command line was required.

5.1.6. Experian Pandora

The company Experian, with its headquarters in Ireland, offers two commercial DQ solutions: Cleanse and Pandora (EP). During the conduct of our survey, Experian introduced the new product Aperture Data Studio, which is going to replace Pandora in the future. While Cleanse is dedicated to one-time data cleansing, we investigated the more comprehensive tool Pandora. In accordance with the findings by Selvage et al. (2017) (Gartner Inc.), we perceived the tool as easy to install and use, and want to highlight the comprehensive data profiling capabilities in general, and the cross-table profiling capabilities in particular. In addition, Pandora provides rich possibilities to extend the existing feature palette with customized functions. In summary, Pandora achieved one of the best overall assessments in our survey. We perceived the UI as good, though more dedicated to technical users, and had very good experience with the technical customer support, which assisted us in a timely and target-oriented fashion.

5.1.7. Informatica Data Quality

Informatica Data Quality (IDQ) is one module of the commercial data management solution by Informatica, which has, according to Gartner (cf. Judah et al., 2016; Selvage et al., 2017; Chien and Jain, 2019), been a leader in the Magic Quadrant for Data Quality Tools for several years. We were provided with two 30-day trial licenses. The trial included Informatica Developer (the desktop installation for developers), Informatica Analyst (the web-based platform for business users), and Informatica Administrator (for task scheduling), where all three user interfaces access the same server-side backend of Informatica DQ version 10.2.0. In our systematic search, we found five different tools offered by the company Informatica, four of which were excluded from the evaluation. For example, the “Master Data Management” solution was excluded due to its focus on master data management. Informatica Data Quality was found through the Springer Link search (cf. Abedjan et al., 2015), and because it was previously investigated by Judah et al. (2016), Selvage et al. (2017), and Gao et al. (2016). Informatica has its origin in the field of data integration, and in addition to the features we evaluated, it offers data cleansing and matching functionalities. In terms of DQ measurement, Informatica most probably offers the implementation closest to the DQ dimension and metric view promoted in the research community. We perceived the UI of Informatica Analyst as easy to use, also for business users, but with less comprehensive functionality than Informatica Developer, which is more powerful and dedicated to trained, technical users. In accordance with the findings by Gartner customers (cf. Selvage et al., 2017), we can confirm the very helpful sales support, which was among the best we experienced. During the evaluation, we had regular web conferences to ask questions and review the results, and short intermediate requests were answered in a timely manner.

5.1.8. IBM InfoSphere Information Server for Data Quality

The product “InfoSphere Information Server for Data Quality” (IBM ISDQ) by IBM was found through the studies by Gartner (cf. Judah et al., 2016; Selvage et al., 2017) and Fraunhofer IAO (cf. Kokemüller and Haupt, 2012). Other products (or product components) from IBM have also been previously mentioned in the following research papers: IBM Informix (previously called “DataBlade”) by Barateiro and Galhardas (2005), IBM InfoSphere Information Analyzer by Abedjan et al. (2015), IBM QuerySurge by Gao et al. (2016), IBM Data Integrator by Chen et al. (2012), IBM InfoSphere MDM Server by Pawluk (2010), and IBM Quality Stage by Prasad et al. (2011). For our survey, the IBM partner solvistas GmbH, located in Austria, provided us with the installation files of IBM InfoSphere Information Server for Data Quality version 11.7 for a three-month trial. Unfortunately, we were not able to evaluate the tool due to an early error in the installation process stating that a required file was not found. Despite intensive study of the documentation 8, it was not possible to resolve the issue within the time frame of the project, since neither support from IBM nor specific installation instructions for the received files were available. We also contacted Fraunhofer IAO, who included IBM ISDQ in their survey (Kokemüller and Haupt, 2012). However, they did not install the tool, but based their statements on contact with IBM support and on the documentation. Solvistas GmbH also stated that, so far, they have never installed the IBM DQ product line. This experience aligns with the statement by Gartner that reference customers rate the technical support and documentation of IBM below average (Chien and Jain, 2019).

5.1.9. InfoZoom by humanIT Software GmbH

InfoZoom is a commercial DQ tool by the German vendor humanIT Software GmbH 9 and is dedicated to data profiling using in-memory analytics. It was previously surveyed by Kokemüller and Haupt (2012). We investigated InfoZoom Desktop Professional with the IZDQ (InfoZoom Data Quality) extension under a 6-month license granted to us by customer support. While InfoZoom Desktop is dedicated to data profiling and data investigation, the IZDQ extension allows a user to define rules and jobs for comprehensive DQ management. Generally, InfoZoom aims at observing and understanding the data but does not support any cleansing activities, which aligns well with the scope of this survey. We perceived the UI of InfoZoom Desktop as easy to use, also for business users, whereas the IZDQ extension requires technical knowledge, such as the ability to write SQL statements, or at least intensive training to be used by non-technical users. The customer support was very friendly and helpful and provided us in a timely manner with a relatively long trial license in comparison to other commercial DQ tools.

5.1.10. MobyDQ

MobyDQ 10 (previously named “Data Quality Framework”) by Alexis Rolland is a free and open-source DQ solution that aims to automate DQ checks during data processing, store DQ measurements and metric results, and trigger alerts in case of anomalies. The tool was inspired by an internal DQ project at Ubisoft Entertainment, which differs from the open-source version with respect to software dependencies and a more mature, but context-dependent, configuration. We found MobyDQ through our GitHub search and evaluated the version downloaded on May 21st, 2019. Similar to the commercial tools we observed, the framework can be used to access different data sources. In contrast to Apache Griffin, MobyDQ could be installed quickly and straightforwardly, based on the detailed documentation provided on GitHub. MobyDQ does not provide any DP functionality, because its focus is on the creation, application, and automation of DQ checks. The creator Alexis Rolland was very helpful in demonstrating the productive installation at Ubisoft Entertainment to us, which clearly demonstrates the potential of the tool when applied in practice.

5.1.11. OpenRefine and MetricDoc

OpenRefine 11 (formerly Google Refine, abbrev. OR) is a free and open-source DQ tool dedicated to data cleansing and data transformation. It was discovered through Kusumasari et al. (2016) in the IEEE search results, through Tsiflidou and Manouselis (2013) in the Springer Link search results, as well as on GitHub 12. While the original functionality of the tool does not primarily align with the focus of our survey, its extension MetricDoc specifically aims at assessing DQ with “customizable, reusable quality metrics in combination with immediate visual feedback” (Bors et al., 2018). Apart from the mentions by Tsiflidou and Manouselis (2013) and Kusumasari et al. (2016), OpenRefine was not evaluated in any of the previous DQ tool surveys, although it is open source. We installed the tool from GitHub and evaluated OpenRefine version 3.0 with the MetricDoc extension (for which no version number was provided), downloaded on February 14th, 2019. We perceived the usability of OpenRefine as average; especially in the MetricDoc extension, the usability of several functions reflected its status as a very recent research project.

5.1.12. Oracle Enterprise Data Quality

The commercial tool Oracle Enterprise Data Quality (EDQ) was previously mentioned by Gartner (cf. Judah et al., 2016; Selvage et al., 2017) and also found in the Springer Link search results (Abedjan et al., 2015). We investigated the freely available pre-built virtual machine offered on the Oracle website 13. In addition to classical data profiling capabilities, EDQ offers data cleansing (parsing, standardization, match and merge, address verification), as well as DQ monitoring to some extent. We perceived the GUI as average, with the major drawback being the inflexible data source connection to DBs and files. In comparison to other DQ tools, where a connection can be directly accessed and reused, Oracle EDQ requires a “snapshot” of the actual data connection to be created prior to any profiling or DQ measurement task. This approach prevents an automatic update of the data source. We did not need to contact customer support, and the installation documentation and user guide were up to date and very intuitive to use.

5.1.13. Talend Open Studio for Data Quality

The company Talend offers two DQ products: Talend Open Studio (TOS) for Data Quality (a free version) and the Talend Data Management Platform (which requires a subscription). Gartner upgraded Talend in its Magic Quadrant for Data Quality Tools from “visionary” in 2016 to “leader” in 2017 (cf. Judah et al., 2016; Selvage et al., 2017). Talend Open Studio for Data Quality is one of the most frequently cited DQ tools that we discovered in our systematic search: it was found through Springer Link and GitHub 14 and was previously investigated by Pushkarev et al. (2010), Gao et al. (2016), and Pulla et al. (2016). Both products (Open Studio and the Enterprise edition) offer good support for Big Data analysis, such as Spark or Hadoop, and a variety of data profiling and cleansing functionalities. We evaluated version 6.5.1 of TOS for Data Quality, which can definitely keep up with several commercial (fee-based) DQ tools in terms of data profiling capabilities, business rule management, and UI experience. However, the free version does not support DQ monitoring capabilities, which are an exclusive feature of the Enterprise edition. It was not possible to receive a free trial of the Talend Data Management Platform because, according to our customer contact, it is unlikely that someone would purchase the Enterprise edition solely because of this feature.

5.1.14. SAS Data Quality

The US company SAS 15 (Statistical Analysis System) offers three commercial DQ products: SAS Data Management, SAS Data Quality, and SAS Data Quality Desktop (Chien and Jain, 2019). Since the traditional focus of SAS is on data analysis, their DQ product is based on the acquired company DataFlux. The product “dfPower” by DataFlux was previously surveyed by Barateiro and Galhardas (2005) and is mentioned by Maletic and Marcus (2009), which was discovered through our systematic search. In our evaluation, we did not find powerful machine learning (ML) capabilities (a core strength of SAS) in DQ measurement, which was also noted by Selvage et al. (2017). According to our customer contact, and as also mentioned by Chien and Jain (2019), SAS' overall strategic focus is on migrating all product lines to the cloud-based SAS Viya platform to increase usability and to better integrate ML and DQ. In the evaluated tool, SAS Data Quality Desktop 2.7, we found the overall usability to be below average compared to other DQ tools. The customer support was friendly, but hardly any question could be answered directly.

5.1.15. Data Quality Solutions Dedicated to SAP

SAP (German abbreviation for “Systeme, Anwendungen und Produkte in der Datenverarbeitung,” i.e., “Systems, Applications and Products in Data Processing”) is a worldwide operating company for enterprise application software with headquarters in Germany. Since SAP is the market leader in the data processing domain, there are several DQ tools that are specifically built to operate on top of an existing SAP installation. During this survey, we had no access to such an installation and thus were not able to include those tools in our evaluation. However, due to the practical relevance of DQ measurement in SAP, we describe the most relevant tools dedicated to SAP, which we found through our systematic search.

5.1.15.1. SAP Information Steward

SAP Information Steward was found through our systematic search and was previously mentioned by Chien and Jain (2019) and Abedjan et al. (2015, 2019). According to the documentation, the tool offers different data profiling functionalities (such as simple statistics, histograms, data types, and dependencies), allows users to define and execute business rules, and supports monitoring DQ with scorecards. Its strength is the wide range of out-of-the-box functions for specific domains such as customers, supply chains, and products; however, customers often state that the costs for the product are too high and that the interface needs some modernization for business users (Chien and Jain, 2019).

5.1.15.2. Data Quality Solution by ISO Professional Services

The German company ISO Professional Services offers a data governance solution that is implemented directly in SAP and reuses user-defined business rules from the SAP environment. A few years ago, ISO acquired the company Scarus Software GmbH with the DQ tool DataGovernanceSuite, which was discovered through our search and was previously evaluated by Kokemüller and Haupt (2012). The Scarus Data Quality (SDQ) Server constitutes the core DQ component by ISO, which has a separate memory but no DB. SDQ interoperates with SAP transparently by offering functions such as data profiling, duplicate detection, and address validation, which are executed directly within SAP. In contrast to its competing product SAP Information Steward, which aims at large enterprises, the tool by ISO is optimized for small to medium-sized companies. Reference customers of this size preferred the tool by ISO Professional Services due to its adjusted functional scope and lower price.

5.1.15.3. dspCompose by BackOffice Associates GmbH

The German company BackOffice Associates GmbH offers a DQ suite prefixed with “dsp” (data stewardship platform), which is dedicated to master data management. Their primary DQ products are dspMonitor (for data profiling, monitoring, and DQ checks), which is a competing product to SAP Information Steward, and dspCompose (for data cleansing and DQ workflow management), which acts as an add-on for dspMonitor or SAP Information Steward. Further DQ-related products are dspMigrate, an end-to-end data migration tool, dspConduct, an SAP MDE tool, and dspArchive, for data archiving in SAP environments. Although BackOffice Associates offers its DQ products to customers without SAP, the company has developed a strong SAP focus in recent years. According to our customer contact, they see the greatest potential in offering dspCompose in combination with SAP.

5.2. Comparison of Data Profiling, DQ Measurement, and Monitoring Capabilities

In this section, we investigate the DQ tools with regard to our catalog of requirements from Table 3. For each requirement, three ratings are possible: (✓) the requirement is fulfilled, (−) the requirement is not fulfilled, or (p) the requirement is partially fulfilled. The coverage of each requirement is described in textual form, with a focus on the justification of partial fulfillments.

5.2.1. Data Profiling Capabilities

Table 5 shows the fulfillment of the data profiling capabilities for each tool. We excluded Apache Griffin and MobyDQ from this table, because both tools do not offer any data profiling functionality. In summary, basic single-column data profiling tasks such as cardinalities (DP 1–5) are covered by most tools, whereas more sophisticated functionalities, such as dependency discovery and multi-column profiling, are offered only by individual tools.

Table 5. Data profiling capabilities.

5.2.1.1. Single Column—Cardinalities

While simple counts of values (i.e., cardinalities), such as the number of rows, null values, or distinct values, are covered by all DQ tools that support data profiling in general, the major distinction is the out-of-the-box availability of percentage values. The percentage of null values (DP-3) or distinct values (DP-5) is not supported by all investigated tools. The test case results also reveal different precision in the calculation of the percentages. For example, the percentage of null values in column Supplier.Fax was 55 % with Datamartist, 55.2 % with Oracle EDQ and SAS DataFlux, and 55.17 % in all other tools. The test case for DP-5 yielded 23 % with Datamartist, 23.07 % with Informatica, and 23.08 % with the other tools.

5.2.1.2. Single Column—Value Distributions

Value distributions can be described as cardinalities of value groups (Abedjan et al., 2015). While histograms to visualize value distributions are available in most tools in the form of equi-width histograms (which “span value ranges of same length”; Abedjan et al., 2015), we did not find any tool that supports equi-depth or equi-height histograms (where each bucket represents “the same number of value occurrences”). Thus, we rated all tools that support histograms with “partially” for DP-6. Ataccama allows frequency analysis but no visualization with histograms, and Aggregate Profiler visualizes the distributions only in the form of a pie chart. The majority of tools also support minimum and maximum values (DP-7), as well as constancy (DP-8), which is defined as “the ratio of the frequency of the most frequent value (possibly a predefined default value) and the overall number of values” (Abedjan et al., 2015). “Benford's law” (DP-10), which is particularly interesting in the area of fraud detection, was only available in Talend OS.

Quantiles are a statistical measure to divide a value distribution into equidistant percentage points (Sheskin, 2003). The most common type of quantile we observed in our study is the “quartile” (DP-9), where the value distribution is divided by three points into four blocks. The division points are multiples of 25 %, denoted as lower quartile or Q1 (25 %), median or Q2 (50 %), and upper quartile or Q3 (75 %), respectively. Other examples of quantiles are “percentiles,” which divide the distribution into 100 blocks (i.e., each block comprises a proportion of 1 %), or “deciles,” which divide the distribution into 10 blocks of 10 % each (Sheskin, 2003). While only three tools explicitly support quartiles, we also discovered the availability of other types of quantiles (rated as p in our survey). Table 6 shows the results for the DP-9 test case, where quartiles or other types of quantiles are calculated for the column OrderItem.UnitPrice.
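For reference, quartiles and deciles for the DP-9 test case can be reproduced with numpy as in the following sketch; the CSV export is an assumption, and differing interpolation conventions between tools are one possible source of small deviations in the reported values.

```python
import numpy as np
import pandas as pd

unit_price = pd.read_csv("OrderItem.csv")["UnitPrice"]          # hypothetical export

q1, q2, q3 = np.quantile(unit_price, [0.25, 0.50, 0.75])        # quartiles (DP-9)
deciles = np.quantile(unit_price, np.arange(0.1, 1.0, 0.1))     # 10 blocks of 10 % each
# Note: np.quantile interpolates between observed values; tools may use other
# conventions, which helps explain the differences summarized in Table 6.
print(f"Q1={q1}, median={q2}, Q3={q3}")
print("deciles:", np.round(deciles, 2))
```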

Table 6. Data profiling—test case quartiles.

Ataccama ONE supports deciles, which are displayed separately in the last row since they cannot be directly mapped to quartiles. Although SAS Data Quality provides 20 blocks, that is, demi-deciles, the functionality is described in the SAS UI as “percentiles,” which would imply a division into 100 blocks. In Table 6, we picked only the values for the quartile blocks out of the 20 blocks in total. In InfoZoom, the inverse function to quantiles is chosen: instead of merging values into blocks, the percentage value of the distribution is displayed for each value [denoted as “Cumulative Distribution Function” (Dasu and Johnson, 2003)], leading to a total of 2,155 blocks, where each block contains exactly one value. Table 6 displays the percentage values of the distribution that refer to Q1, Q2, and Q3, respectively. In summary, the determination of quantiles is interpreted differently across the individual DQ tools, both with respect to the notation (“Q1” vs. “lower quartile” vs. 25 %) and with respect to the type of quantile.

5.2.1.3. Single Column—Patterns, Data Types, and Domains

In this category, the support of the different requirements varies widely, and there is definite potential for improvement with respect to out-of-the-box pattern and domain discovery. Even the discovery of basic types (DP-11) is not always supported. For example, DataCleaner recognizes the difference between string, boolean, and number and uses this information for further internal processing, but does not explicitly display it per attribute. While the test cases for the DBMS-specific data types (DP-12) yielded uniform results (“varchar” for ProductName, “decimal” for UnitPrice, and “bit” for isDiscontinued), the variety in terminology and classification for the basic types is outlined in Table 7. In SAS, we had problems accessing a table containing the “decimal” data type and thus converted Product.UnitPrice to “long.” “Alphanumeric” in Experian Pandora is abbreviated as “Alphanum.”

Table 7. Data profiling—test cases basic types.

For the measurement of the value length (DP-13), the minimum (min.) and maximum (max.) values are usually provided, but not always an average (avg.) value length. The median (med.) value length is only provided by Ataccama ONE. We rated this requirement as fulfilled if at least the minimum, maximum, and average value lengths were provided, considering the median as optional. Table 8 shows the exact results delivered by the individual tools, which justify the fulfillment ratings and indicate differences in the accuracy of the average values. InfoZoom provides only the maximum value length, while SAS and Talend OS restrict this feature to string values.

Table 8. Data profiling—test case value length.

For the number of digits and decimals, the DQ tools usually use the values documented by the DBMS, e.g., 12 digits and 2 decimals for attribute UnitPrice in table Product, compared to a maximum of 5 digits and 2 decimals in the actual data. Value patterns and their visualization as a histogram (DP-16) are supported by most DQ tools. SAS supports pie charts only.

Generic semantic data types (DP-17), such as code, indicator, date/time, quantity, or identifier, are also denoted as “data class” (Abedjan et al., 2015) and are defined by generic patterns. A semantic domain (DP-18), “such as a credit card, first name, city, [or] phenotype” (Abedjan et al., 2015), is more concrete than a generic semantic data type and usually associated with a specific application context. The DQ tools that fulfill these requirements offer a number of patterns, which are associated with the respective generic data type or semantic domain. By applying these patterns to the data values, it can be verified to what extent an attribute contains values of a specific type. Thus, the two requirements DP-17 and DP-18 are usually not distinguished within the DQ tools we evaluated. The number of available patterns varies between approximately 10–50 (Pandora, DataCleaner, SAS), 50–100 (Talend), and 100–300 (Informatica, Oracle). While most tools display the matching patterns per attribute (e.g., Product.UnitPrice conforms to the domain “Geocode_Longitude” with 98.72 % using Informatica DQ), SAS displays the matching attribute per pattern (e.g., the pattern “Country” matches the attribute Customer.Country with 100 %). Talend OS is the only tool that displays the matching rows instead of the percentage of matching rows per attribute. For Ataccama ONE, we rated DP-17 and DP-18 as partially fulfilled, since specific attributes are classified (e.g., Customer.FirstName as “first name”), but those terms are part of the Ataccama business glossary, which we were unable to access during our evaluation and, therefore, had no further information about their origin.
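The pattern-based approach can be illustrated with a short sketch of our own (the patterns, the column, and the conformance definition are assumptions and do not reproduce any specific tool):

```python
import re
import pandas as pd

# Illustrative patterns that stand in for a tool's semantic domain catalog.
domain_patterns = {
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "date_iso": re.compile(r"^\d{4}-\d{2}-\d{2}$"),
    "longitude": re.compile(r"^-?(1[0-7]\d|\d?\d)(\.\d+)?$"),  # rough numeric range check
}

def domain_conformance(values: pd.Series, pattern: re.Pattern) -> float:
    """Percentage of non-null values that fully match the given pattern."""
    non_null = values.dropna().astype(str)
    if non_null.empty:
        return 0.0
    return non_null.str.fullmatch(pattern.pattern).mean() * 100

column = pd.Series(["14.53", "98.7201", "foo", None])   # hypothetical attribute values
for name, pattern in domain_patterns.items():
    print(name, round(domain_conformance(column, pattern), 2), "%")
```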

5.2.1.4. Dependencies

The dependency section has the lowest coverage of the data profiling category and is best supported by Experian Pandora and Informatica DQ (in Developer edition only). Although we introduce each concept briefly, we refer to Abedjan et al. (2019) for details about dependency discovery and their implementation. In the following, R denotes a relational schema (defining a set of attributes) with r being an instance of R (defining a set of records). Sets of attributes are denoted by α and β.

A unique column combination (UCC) is an attribute set α ⊆ R whose projection contains no duplicate entries in r (Abedjan et al., 2019). In other words, a UCC is a (possibly composite) candidate key that functionally determines R. While Experian Pandora allows the detection of single-column keys only (thus rated p), Informatica DQ offers full UCC detection. Both tools allow the user to set a threshold for relaxed UCC detection (DP-20) and to identify violating records via drill-down. With Informatica DQ, we discovered five UCCs in table Order of our test DB using a threshold of 98 %: Id (100 %), OrderNumber (100 %), OrderDate + TotalAmount (100 %), CustomerId + TotalAmount (99.88 %), and CustomerId + OrderDate (99.16 %). With Experian Pandora, only the two single-column keys Id (100 %) and OrderNumber (100 %) were detected. SAS Data Quality indicates 100 % unique attributes as primary key candidates.
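The following pandas sketch shows one possible way to operationalize exact (DP-19) and relaxed (DP-20) UCC checks by exhaustively testing small column combinations; the conformance definition (share of non-duplicated records), the exhaustive search, and the CSV export are our own assumptions, and the exact semantics differ between tools.

```python
from itertools import combinations
import pandas as pd

order = pd.read_csv("Order.csv")   # hypothetical export of the evaluation DB

def ucc_conformance(df: pd.DataFrame, columns) -> float:
    """Percentage of records whose value combination in `columns` is unique."""
    duplicated = df.duplicated(subset=list(columns), keep=False)
    return (~duplicated).mean() * 100

threshold = 98.0   # DP-20 test case; 100.0 corresponds to exact UCCs (DP-19)
for size in (1, 2):
    for cols in combinations(order.columns, size):
        conformance = ucc_conformance(order, cols)
        if conformance >= threshold:
            print(cols, f"{conformance:.2f} %")
```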

An inclusion dependency (IND) over the relational schemata $R_i$ and $R_j$ states that all values in attribute set α of $R_i$ also occur in attribute set β of $R_j$, that is, $R_i[\alpha] \subseteq R_j[\beta]$ (Abedjan et al., 2019). The detection of INDs (DP-21 and DP-22), also referred to as foreign key discovery, is not widely supported. Experian Pandora delivers the best automation for this requirement: initially, the primary keys (UCCs) and foreign key relations are inferred, and based on this information, INDs are displayed graphically as a Venn diagram. In addition, it is possible to drill down to records that violate those INDs in a spreadsheet. Informatica DQ and SAS Data Quality support IND discovery only partially, since the user is required to select the respective primary key (UCC) and assign it to possible foreign key candidates, which are then tested for compliance. DataCleaner can only be used to check whether two tables can possibly be joined, without information on the respective columns or the join quality (i.e., violating rows).

A functional dependency (FD) α → β asserts that all pairs of records with the same attribute values in α must also have the same attribute values in β. Thus, the α-values functionally determine the β-values (Codd, 1970). Again, we used table Order to verify exact (DP-23) and relaxed (DP-24) FD detection. With Experian Pandora and Informatica DQ, we found in total eight exact FDs, {Id} → {OrderDate, OrderNumber, CustomerId, TotalAmount} and {OrderNumber} → {Id, OrderDate, CustomerId, TotalAmount}, and two more FDs when relaxing the threshold to 93 %: {TotalAmount} → {CustomerId, OrderDate}. Talend OS fulfills FD discovery only partially, because it requires user interaction to specify the attribute sets α and β, provided that the number and types of the columns are equal. Although specific FDs can be tested with this functionality (e.g., to which extent TotalAmount determines CustomerId), we do not perceive it as truly automated FD discovery: when performing the test case and specifying all attributes of Order as α and β, respectively, the result is five FDs, where each attribute is discovered to functionally determine itself. This case should ideally be excluded during detection. All three tools printed the identified FDs in table format, one row for each attribute pair along with the match percentage, but with slightly differing terminology. Thus, α (the left side) is denoted as “A column set,” “identity column,” or “determinant column,” and β (the right side) is denoted as “B column set,” “identified columns,” or “dependent column.” For relaxed FDs, Experian displayed the violating rows with a count (50 in this case), Informatica listed the respective rows, and Talend did not provide violating rows at all.
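A relaxed FD check of the kind described above can be approximated with pandas as follows; this is a sketch under our own conformance definition (share of records that agree with the most frequent β-value within their α-group), which is not necessarily the definition used by the evaluated tools.

```python
import pandas as pd

order = pd.read_csv("Order.csv")   # hypothetical export of the evaluation DB

def fd_conformance(df: pd.DataFrame, lhs: list, rhs: str) -> float:
    """Percentage of records consistent with the most frequent rhs value per lhs group."""
    consistent = (
        df.groupby(lhs)[rhs]
        .apply(lambda group: group.value_counts().iloc[0])   # count of the dominant rhs value
        .sum()
    )
    return consistent / len(df) * 100

print(fd_conformance(order, ["Id"], "OrderDate"))            # 100 % -> exact FD (DP-23)
print(fd_conformance(order, ["TotalAmount"], "CustomerId"))  # compare against the 93 % threshold (DP-24)
```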

5.2.1.5. Advanced Multi-Column Profiling

Apart from duplicate detection, which is a widely supported feature, advanced multi-column features are rarely supported satisfactorily. No single tool offers association rule mining (DP-26) as mentioned by Abedjan et al. (2015) . Note that we specifically tested the DQ tools described in Section 5.1, and did not consider related tools that are often installed together. For example, SAS Enterprise Guide, which was shipped with our DQ installation, is dedicated to data analysis and therefore provides a rich function palette that overlaps with the multi-column profiling section, e.g., a selection of correlation coefficients, hierarchical and k-means clustering. Since the aim of this survey is to investigate DQ tools, we did not consider such related tools.

Correlations (DP-25) are a statistical measure between −1.0 and 1.0 that indicates the relationship between two numerical attributes (Sheskin, 2003; Abedjan et al., 2015). The most commonly used coefficients are the Pearson correlation coefficient and the rank-based Spearman's or Kendall's tau correlation coefficients (Sheskin, 2003). In our survey, only Aggregate Profiler is able to compute Pearson correlations. However, our test case for DP-25 (Pearson correlation between OrderItem.UnitPrice and OrderItem.Quantity) yielded −0.045350608 with Aggregate Profiler, which did not conform to our cross-checks using SAS Enterprise Guide (0.00737) and the Python package numpy (0.00736647). Talend distinguishes between “numerical,” “time,” and “nominal” correlation analysis and displays the respective correlations in bubble charts. We rated this as partial fulfillment (p), since no correlation coefficient is calculated and the calculation is restricted to columns with specific data types; thus, it is not possible to calculate the correlation between two interval data types.
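The numpy cross-check mentioned above amounts to the following few lines (the column names and the CSV export are assumptions):

```python
import numpy as np
import pandas as pd

order_item = pd.read_csv("OrderItem.csv")   # hypothetical export of the evaluation DB
pearson = np.corrcoef(order_item["UnitPrice"], order_item["Quantity"])[0, 1]
print(round(pearson, 8))   # approximately 0.00736647 was obtained for the evaluation DB
```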

During our investigation, we found that the concepts of clustering (DP-27), outlier detection (DP-28), and duplicate detection (DP-29 and DP-30) are not always clearly distinguishable in practice. Abedjan et al. (2015) also state that clustering can be used either to detect outliers in a single column or to detect similar or duplicate records within a table. Thus, we briefly describe the three concepts along with the condition we applied to verify (partial) fulfillment of the respective requirement.

Clustering (DP-27) is a type of unsupervised machine learning in which data items (e.g., records or attribute values) are classified into groups (clusters) (Jain et al., 2000). A comprehensive overview of existing clustering algorithms is provided by Jain et al. (2000). In some DQ tools, clustering is only available in the context of duplicate detection. For example, in OpenRefine, clustering is used to detect duplicate string values (cf. Stephens, 2018); in Informatica DQ, the grouped duplicates are referred to as “clusters”; SAS Data Quality requires a “clustering” component to group records based on their match codes (SAS, 2019); and Oracle EDQ uses clustering as a preprocessing step of the matching component to increase runtime efficiency by preventing unnecessary comparisons between records (Oracle, 2018). To completely fulfill requirement DP-27, we presumed the availability of one of the common clustering algorithms (such as k-means or hierarchical clustering) as an independent function. Datamartist supports k-means clustering and allows the user to select the number of clusters k from five predefined values (5, 10, 25, 50, 100) and to restrict the observed value range. Aggregate Profiler supports k-means clustering without any modification possibility (e.g., choosing k), as well as a second type of clustering for numeric values, where the number of clusters can be defined; no further information about this clustering algorithm is provided. No tool offers hierarchical clustering or partitional clustering algorithms other than k-means, such as graph-theoretic approaches or expectation maximization (Jain et al., 2000).
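For comparison, single-column k-means clustering of the kind offered by Datamartist and Aggregate Profiler can be approximated with scikit-learn as follows; the library choice, the CSV export, and the cluster summary are our own and do not reflect the tools' internals.

```python
import pandas as pd
from sklearn.cluster import KMeans

unit_price = pd.read_csv("Product.csv")["UnitPrice"].dropna()   # hypothetical export

k = 5   # Datamartist restricts k to predefined values such as 5, 10, 25, 50, 100
labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(
    unit_price.to_numpy().reshape(-1, 1)
)
# Summarize each cluster by its size and value range.
print(unit_price.reset_index(drop=True).groupby(labels).agg(["count", "min", "max"]))
```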

Outlier detection deals with data points that are considered abnormalities, deviants, or discordant when compared to the remaining data (Aggarwal, 2017). A comprehensive overview of different algorithms to detect outliers is provided by Aggarwal (2017). Our investigation showed that outlier detection is implemented very differently across the tools and that, compared to the current state of research, only simple methods are used. We did not find a tool that supports multivariate outlier detection or one of the more sophisticated approaches such as z-scores, linear regression models, or probabilistic models as mentioned by Aggarwal (2017). In the following, we describe each implementation of outlier detection and the result that our test case (find “very high values” in Order.TotalAmount) yielded:

• Aggregate Profiler, Ataccama ONE, Datamartist, and InfoZoom provide outlier detection for numerical values only visually, either in a quantile plot (Ataccama), in a bar chart (Datamartist), or in the form of a box plot. Aggregate Profiler and Ataccama ONE do not allow drill-down to the actual outlying values, and in InfoZoom, the visualization of the single values in the plot is not readable. In these three tools, it is not possible to modify the plot settings or to obtain details about the settings used. The bar chart in Datamartist is based on k-means clustering with the same modification options as described in the previous paragraph. When using the standard settings (100 bars), one outlier is detected for our test case of finding “very high values”: 17250.0. This extreme value is detected correctly by all tools, although other methods yield more outlying values.

• Experian Pandora offers a number of different types of outlier checks, some of which require one of two parameters that can be specified by the user: “Rarity Threshold” (default: 1000) and “Standard Deviation Tolerance” (default: 3.3) (Experian, 2018). The rarity threshold is used to detect rare values, that is, values which occur less frequently than once in <threshold> records; it is used for the checks “rare values,” “is a key,” and “unusually missing values” (Experian, 2018). The standard deviation tolerance specifies the number of standard deviations by which a value may deviate from the norm before it is considered an outlier; it is used for low/high amounts, short/long values, rare/frequent values, and rare formats (Experian, 2018) (a sketch of this check follows the list). Using the standard settings, we found 18 outlying values for our test case (17250.0, 16321.9, 15810.0, 12281.2, 11493.2, 11490.7, 11380.0, 11283.2, 10835.24, 10741.6, 10588.5, 10495.6, 10191.7, 10164.8, 8902.5, 8891.0, 8623.45, 8267.4).

• Informatica DQ distinguishes between “pattern outliers,” which refer to unusual patterns in the data, and “value frequency outliers” (Informatica, 2018), where values with an unusual occurrence frequency are displayed. With this functionality, it was not possible to perform our test case, because being an outlier here depends on the frequency rather than the actual value.

• SAS provides an “outliers” tab for columns of different data types, where a fixed number of five minimum and five maximum values is shown without any modification options. For our test case, the following maximum values were detected: 17250.0, 16321.0, 15810.0, 12281.2, and 11493.2, which correspond to the five highest results detected with Pandora.
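The standard-deviation-tolerance check described above for Experian Pandora can be approximated as follows; this is our own simplification with the default tolerance of 3.3 and does not reproduce Pandora's exact computation.

```python
import pandas as pd

total_amount = pd.read_csv("Order.csv")["TotalAmount"]   # hypothetical export
tolerance = 3.3                                          # Pandora's default "Standard Deviation Tolerance"

# Flag values that lie more than `tolerance` standard deviations above the mean.
upper_bound = total_amount.mean() + tolerance * total_amount.std()
high_outliers = total_amount[total_amount > upper_bound].sort_values(ascending=False)
print(high_outliers.tolist())   # candidate "very high values" in Order.TotalAmount
```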

Duplicate detection “aims to identify records [...] that refer to the same real-world entity” (Elmagarmid et al., 2006). It is a widely researched field, which is also referred to as record matching, record linkage, data merging, or redundancy detection (Elmagarmid et al., 2006). In contrast to clustering and outlier detection, the understanding and implementation of duplicate detection is very similar across all tools we investigated. In principle, the user (1) selects the columns that should be considered for comparison, (2) optionally applies a transformation to those columns (e.g., pruning a string to the first three characters), and finally (3) selects an appropriate distance function and algorithm (a minimal sketch of this procedure is given at the end of this subsection). The major difference between the implementations is the selection of distance functions for the attribute values. The following distances are supported:

• Aggregate Profiler: exact match, similar-any word (if any word is similar for this column), similar-all words (if all words are similar for this column), begin char match, and end char match ( Arrah Technology, 2019 ). No information about the used similarity function was provided.

• DataCleaner: n-grams, first 5, last 5, sorted acronym, Metaphone, common integer, Fingerprint, near integer (for pre-selection phase); exact, is empty, normalized affine gap, and cosine similarity (for scoring phase). The two phases are explained in the following paragraph.

• Experian Pandora: Edit distance, exact, exact (ignore cases), Jaro distance, Jaro-Winkler distance, regular expression, and Soundex.

• Informatica DQ: Bigram, Edit, Hamming, reverse Hamming, and Jaro distance.

• InfoZoom: Soundex and Cologne phonetics.

• OpenRefine: Fingerprint, n-gram Fingerprint, Metaphone3, or Cologne phonetics (with the key collision method); Levenshtein or PPM (with the nearest neighbor method); cf. Stephens (2018) for details.

• Oracle EDQ: (transformations) absolute value, first/last n characters/words, lower case, Metaphone, normalize whitespace, round, Soundex.

• Talend OS: exact, exact (ignore case), Soundex, Soundex FR, Levenshtein, Metaphone, Double Metaphone, Jaro, Fingerprint key, Jaro-Winkler, q-grams, Hamming, and custom.

SAS Data Quality does not offer string distances, but matches based on match codes (SAS, 2019), which are generated from an input variable, a “definition” (the type of transformation for the input variable), and a “sensitivity” (threshold); records with the same match codes are then grouped together into the same cluster. The list of match definitions depends on the Quality Knowledge Base (QKB) used. Talend OS offers two different algorithms to define the record merge strategy: simple VSR Matcher (default) or T-Swoosh. We refer to the documentation (Talend, 2017) for details.

DataCleaner implements an ML-based approach that distinguishes between two modes: untrained detection (considered experimental) and a training mode plus duplicate detection using the trained ML model (Quadient, 2008). The training mode is divided into three phases: (1) pre-selection, (2) scoring using a random forest classifier and the distance functions mentioned above, and (3) the outcome, which highlights duplicate pairs with a probability between 0 and 1.

Although duplicate detection is typically attributed to data cleansing (or data integration and data matching) and is not considered part of data profiling in the implementations, most DQ tools also allow this functionality to be used purely for detection purposes. We rated Datamartist and Aggregate Profiler as supporting this requirement only partially, since the function is dedicated to direct cleansing (deletion or replacement of records) and because of the very limited configuration options compared to all other tools. Datamartist does not support DP-30 at all.
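The generic three-step procedure (column selection, transformation, similarity function) can be sketched as follows, using difflib from the Python standard library as a stand-in similarity function; no evaluated tool is implied, and real implementations use blocking or pre-selection to avoid the pairwise comparison of all records.

```python
from difflib import SequenceMatcher
from itertools import combinations
import pandas as pd

customer = pd.read_csv("Customer.csv")   # hypothetical export of the evaluation DB

def similarity(a: str, b: str) -> float:
    """Stand-in string similarity in [0, 1]; tools offer Jaro, Levenshtein, Soundex, etc."""
    return SequenceMatcher(None, a, b).ratio()

# Step 1 + 2: select the compared columns and apply a simple normalization.
keys = (customer["FirstName"].str.lower() + " " + customer["LastName"].str.lower()).fillna("")

# Step 3: pairwise comparison with a threshold (1.0 -> exact, < 1.0 -> relaxed duplicates).
threshold = 0.9   # O(n^2) comparison; fine for a sketch, not for large tables
for (i, a), (j, b) in combinations(keys.items(), 2):
    score = similarity(a, b)
    if score >= threshold:
        print(f"rows {i} and {j}: '{a}' ~ '{b}' (similarity {score:.2f})")
```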

5.2.2. Data Quality Measurement Capabilities

Table 9 summarizes the fulfillment of the DQM category, where the first part is dedicated to DQ dimensions, and the second one to business rules.

Table 9. Data quality measurement capabilities.

5.2.2.1. Accuracy

An accuracy metric (on table-level) is only provided by Apache Griffin, where the user needs to select a source and a target table, and accuracy is calculated according to $A_{tab} = \frac{|r_a|}{|r|} \cdot 100\,\%$, where $|r|$ is the total number of records in the source table and $|r_a|$ is the number of (accurate) records in the target table that can be directly matched to a record in the source table (Apache Foundation, 2019). This metric corresponds to the accuracy metric proposed by Redman (2005), which is outlined in Equation (2).
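In pandas terms, this table-level accuracy can be approximated by matching source and target records, e.g., as in the following sketch (a simplification that ignores duplicate records; the file names are assumptions and the matching logic is ours, not Griffin's):

```python
import pandas as pd

source = pd.read_csv("Order_source.csv")   # hypothetical source table
target = pd.read_csv("Order_target.csv")   # hypothetical target table

# |r_a|: target records that can be matched exactly to a source record.
matched = target.drop_duplicates().merge(
    source.drop_duplicates(), how="inner", on=list(source.columns)
)
accuracy_tab = len(matched) / len(source) * 100   # A_tab = |r_a| / |r| * 100 %
print(f"A_tab = {accuracy_tab:.2f} %")
```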

5.2.2.2. Completeness

The metric for completeness on attribute-level ($C_{att} = \frac{|v_c|}{|v|}$) introduced in Equation (8) is closely related to DP-3, the percentage of null values, which yields the missingness $M_{att} = \frac{|v_n|}{|v|} = 1 - C_{att}$, where $|v_n|$ is the number of null values in one column. Ataccama, DataCleaner, Datamartist, Experian, and InfoZoom provide the completeness calculation on attribute-level according to Equation (8), without the possibility of aggregation on higher levels. Informatica DQ allows an aggregation on table-level (as the arithmetic mean of all attribute-level completeness values), but not higher. Note that this aggregated metric differs from the table-level completeness proposed by Hinrichs (2002) and described in Equation (7), which calculates the mean of all completeness values at the record-level. Although MetricDoc offers a metric that is denoted “completeness” in the GUI and is also described in its scientific documentation (Bors et al., 2018), it actually calculates the missingness $M_{att}$ on attribute-level and thus does not fulfill requirement DQM-32. MobyDQ computes the completeness between two data sources (source and target) according to $C_{tab} = \frac{C_t - C_s}{C_s}$, where $C_t$ is the completeness measure of the target source and $C_s$ the measure from the source. If $C_s$ is considered to be the reference dataset, this metric corresponds to the completeness calculation proposed by Batini and Scannapieco (2016), which is discussed in Section 2.2.2.
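A minimal pandas sketch of the attribute-level completeness $C_{att}$ and of the table-level aggregation offered by Informatica DQ (arithmetic mean over all attributes) could look as follows; the DataFrame is an assumed export of the evaluation DB.

```python
import pandas as pd

supplier = pd.read_csv("Supplier.csv")            # hypothetical export

completeness_att = 1 - supplier.isna().mean()     # C_att per column, e.g., Fax -> ~0.4483
completeness_tab = completeness_att.mean()        # arithmetic mean over all attribute-level values
print(completeness_att)
print(f"table-level completeness: {completeness_tab:.4f}")
```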

5.2.2.3. Consistency

The consistency dimension is mentioned in the Informatica DQ methodology (Informatica, 2010), and SAS implements no single metric but a set of rules that are grouped under this dimension, e.g., checks whether an attribute contains numbers, contains non-numbers, is alphabetic, or is all lower case. We did not rate this as fulfilled, because no aggregate metric is provided to calculate “consistency” and because such rules are also supplied by most DQ tools as generally applicable business rules (DQM-37). However, we would like to point out that the understanding of Informatica and SAS corresponds to the consistency metrics proposed in research (cf. Section 2.2.3), since both approaches are rule-based. In contrast to a predefined metric, Informatica and SAS assume that the user creates such metrics manually.

5.2.2.4. Timeliness

We did not find an implementation of the timeliness dimension as discussed in Section 2.2.4. However, with respect to other time-related dimensions, MetricDoc offers two different time interval metrics, where one checks whether the interval between two timestamps "is smaller than, larger than, or equal to a given duration value" (Bors et al., 2018), and the second performs outlier detection on interval lengths. MobyDQ offers metrics for freshness and latency, but refers to those dimensions as DQ "indicators" (Rolland, 2019). Freshness is implemented as $ts_{cur} - ts_t$, where $ts_{cur}$ is the current timestamp and $ts_t$ the last-updated timestamp from the target request, and latency as $ts_s - ts_t$, where $ts_s$ is the last-updated timestamp from the source request (Rolland, 2019). These indicators are not specifically dedicated to DQ dimensions and do not fulfill the requirement by Heinrich et al. (2018) that DQ metrics be normalized to [0, 1].
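
As a minimal sketch of these freshness and latency indicators (our own illustration of the formulas reported for MobyDQ, not its actual code), the timestamps are assumed to be Python datetime objects; note that the resulting time deltas are indeed not normalized to [0, 1].

```python
from datetime import datetime

def freshness(ts_target_last_updated, ts_current):
    """Freshness = ts_cur - ts_t: age of the target data."""
    return ts_current - ts_target_last_updated

def latency(ts_source_last_updated, ts_target_last_updated):
    """Latency = ts_s - ts_t: lag of the target behind the source."""
    return ts_source_last_updated - ts_target_last_updated

ts_s = datetime(2022, 1, 7, 12, 0)   # last update of the source request
ts_t = datetime(2022, 1, 7, 10, 30)  # last update of the target request
print(freshness(ts_t, datetime(2022, 1, 7, 13, 0)))  # 2:30:00
print(latency(ts_s, ts_t))                           # 1:30:00
```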

5.2.2.5. Other DQ Metrics

With respect to other non-time-related DQ metrics (DQM-35), the uniqueness dimension is most often implemented according to $U_{att} = \frac{|v_u|}{|v|}$, where $|v_u|$ refers to the number of unique values within a column. DataCleaner, Datamartist, Experian, SAS Data Quality, and Talend OS implement uniqueness on attribute-level only, which corresponds to requirement DP-5. Informatica DQ allows an aggregation on table-level but not higher. MetricDoc implements a dimension referred to as "uniqueness," but actually calculates the redundancy $R_{tab} = \frac{|r_{black}|}{|r|}$ on table-level, where $|r_{black}|$ is the number of records with at least one duplicate entry in the table. The user needs to select more than one attribute within a table in order to calculate the metric, and it cannot be aggregated on higher levels. In addition, MetricDoc offers metrics for the DQ dimensions validity and plausibility. Validity is calculated on attribute-level as $V_{att} = \frac{|v_i|}{|v|}$, where $|v_i|$ is the number of attribute values that do not comply with the column data type. Plausibility is also calculated on attribute-level as $P_{att} = \frac{|v_j|}{|v|}$, where $|v_j|$ is the number of attribute values that are outliers according to a non-robust or robust statistical measure (mean with standard deviation, or median with interquartile range estimator, respectively) (Bors et al., 2018). MobyDQ also provides a validity indicator, which connects to one single target data source and compares the values with defined thresholds (Rolland, 2019). SAS does not provide predefined metrics, but uses DQ dimensions as an abstraction layer to group business rules.
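
To make the attribute-level formulas concrete, the following sketch computes uniqueness, a type-based (in)validity share, and an IQR-based plausibility score. It is our own illustration of the formulas cited above, not code from MetricDoc or any other tool; the expected type int and the 1.5-IQR fence are assumptions.

```python
import statistics

def uniqueness_att(values):
    """U_att = |v_u| / |v|: share of distinct values in a column."""
    return len(set(values)) / len(values)

def validity_att(values, expected_type=int):
    """V_att = |v_i| / |v|, with |v_i| the number of values that do not
    comply with the expected column type (as described for MetricDoc)."""
    invalid = sum(1 for v in values if not isinstance(v, expected_type))
    return invalid / len(values)

def plausibility_att(values, k=1.5):
    """P_att = |v_j| / |v|, with |v_j| the number of outliers according to a
    robust median/interquartile-range estimator."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    outliers = sum(1 for v in values if v < q1 - k * iqr or v > q3 + k * iqr)
    return outliers / len(values)

ages = [23, 25, 24, 27, 26, 250]  # 250 is an implausible outlier
print(uniqueness_att(ages), validity_att(ages), plausibility_att(ages))
```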

A number of additional DQ dimensions are mentioned in the documentation or on the websites of DQ tool vendors without being implemented as a metric. To provide a structured overview, these additionally mentioned DQ dimensions are summarized in the following list, together with all "other" DQ dimensions and the implementations explained above:

• Conformance: mentioned by Informatica (Loshin, 2006).

• Conformity: mentioned by Informatica (2010).

• Correctness: mentioned in the DataCleaner documentation (Quadient, 2008).

• Currency: mentioned by Loshin (2006).

• Duplicates: mentioned by Informatica (Informatica, 2010).

• Duplication: mentioned in the DataCleaner documentation (Quadient, 2008).

• Freshness: implemented by MobyDQ as $ts_{cur} - ts_t$ (Rolland, 2019).

• Integrity: mentioned in Informatica (2010), Talend (2017), and SAS (2019).

• Latency: implemented by MobyDQ as $ts_s - ts_t$ (Rolland, 2019).

• Plausibility: implemented by MetricDoc for OpenRefine as $P_{att} = \frac{|v_j|}{|v|}$ (Bors et al., 2018).

• Referential Integrity: mentioned by Informatica (Loshin, 2006).

• Structure: mentioned by SAS (2019).

• Uniformedness: mentioned in the DataCleaner documentation (Quadient, 2008).

• Uniqueness: implemented by DataCleaner, Datamartist, Experian, SAS Data Quality, and Talend OS on attribute-level as $U_{att} = \frac{|v_u|}{|v|}$, and by Informatica also on table-level. Implemented by MetricDoc for OpenRefine as $R_{tab} = \frac{|r_{black}|}{|r|}$. Also mentioned in Loshin (2006); Datamartist (2017); SAS (2019); Experian (2020).

• Validity: implemented by MetricDoc for OpenRefine as $V_{att} = \frac{|v_i|}{|v|}$ (Bors et al., 2018). Also mentioned in SAS (2019) and Experian (2020).

5.2.2.6. Business Rules

While the creation and application of business rules (DQM-36 and DQM-38) is supported by most DQ tools, few tools also offer predefined, generally applicable business rules. A widely supported example is rules for address validation (e.g., zip codes, cities, states), which tackle the prevalent problem of failed mail deliveries due to incorrect addresses, as also described by Apel et al. (2015). Despite the good performance of DataCleaner in terms of data profiling and CDQM, it does not support business rules at all. We rated DQM-37 (the availability of generally applicable rules) as only partly fulfilled for InfoZoom, because the provided rules have been created for a given demo DB schema and would need to be modified to apply to other schemas (e.g., with other column names).
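
A generally applicable business rule of this kind boils down to a predicate evaluated per record, with the share of passing records as the measurement result. The sketch below uses a hypothetical five-digit zip pattern purely for illustration; real address validation in the evaluated tools relies on reference data and is considerably more elaborate.

```python
import re

ZIP_PATTERN = re.compile(r"^\d{5}$")  # assumed five-digit zip format

def rule_zip_code_valid(record):
    """Business rule: the 'zip' field must match the assumed pattern."""
    return bool(ZIP_PATTERN.match(record.get("zip", "")))

def apply_rule(records, rule):
    """Return the share of records that satisfy the rule
    (simple rule-based DQ measurement, cf. DQM-36/DQM-38)."""
    return sum(rule(r) for r in records) / len(records)

addresses = [{"zip": "4040"}, {"zip": "40400"}, {"zip": "ABCDE"}]
print(apply_rule(addresses, rule_zip_code_valid))  # 0.333...
```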

5.2.3. Data Quality Monitoring Capabilities

The results of the CDQM evaluation are shown in Table 10. We want to point out that for two DQ tools (Ataccama ONE and Talend OS), more advanced versions are available that, according to their vendors, support CDQM, but we did not investigate them.

Table 10. Data quality monitoring capabilities.

The storage (CDQM-40) of DP or DQM results is possible in all tools. The majority of DQ tools (all except MobyDQ and OpenRefine) also support data export via a GUI. Datamartist allows only very basic data profiles to be exported. The most comprehensive enterprise solutions for CDQM-40 and CDQM-41 are provided by Informatica DQ and SAS Data Quality, which enable the export of full DP procedures. During import, all settings and required data sources are reloaded from the time of the analysis.

Task scheduling (CDQM-39) is also widely supported. Aggregate Profiler fulfills this requirement only partially, since it is only possible to schedule business rules, but no other kind of task, e.g., data profiling tasks. With Datamartist, InfoZoom, and SAS Data Quality, task scheduling is cumbersome for business users, since the command line is required to write batch files. With Datamartist, the tool even needs to be closed to execute the batch file.

To visualize the continuously performed DQ checks (be it DP tasks, user-defined rules, or DQ metrics), Informatica DQ relies on so-called "scorecards," which can be customized to display the respective information. Apache Griffin, Experian Pandora, and SAS Data Quality also allow alerts to be defined when specific errors occur or when a defined rule is violated. MobyDQ does not offer any visualization (which is considered future work), but relies on external libraries in its implementation at Ubisoft. SAS fulfills both requirements CDQM-42 and CDQM-43 only partially, since its "dashboards" contain solely the number or percentage of triggers per date, source, or user, but no specific values (e.g., 80 % completeness) can be plotted. The most comprehensive CDQM solutions among the general-purpose DQ tools are provided by Informatica DQ and DataCleaner by Human Inference. With respect to the open-source tools, comprehensive CDQM support is only provided by Apache Griffin and by the commercial version of MobyDQ, which is deployed at Ubisoft.
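
Reduced to its core, such continuous monitoring periodically computes a metric, stores the result with a timestamp (CDQM-40), and raises an alert when a threshold is violated. The sketch below illustrates this loop with the Python standard library only; the in-memory result store, the threshold, and the degrading example batches are assumptions, and production tools additionally persist and visualize the results in dashboards or scorecards.

```python
import time
from datetime import datetime

results = []  # in-memory stand-in for a persistent DQ result store

def completeness(values):
    return sum(v is not None for v in values) / len(values)

def monitor(metric_fn, next_batch, threshold=0.8, runs=3, interval_s=1):
    """Compute a DQ metric on fresh data at a fixed interval, store the
    result, and print an alert when it falls below the threshold."""
    for _ in range(runs):
        value = metric_fn(next_batch())
        results.append({"ts": datetime.now(), "value": value})
        if value < threshold:
            print(f"ALERT: metric dropped to {value:.2f}")
        time.sleep(interval_s)

# Hypothetical data source whose quality degrades over time.
batches = iter([["a", "b", "c"], ["a", None, "c"], [None, None, "c"]])
monitor(completeness, lambda: next(batches), interval_s=0)
print(results)
```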

6. Survey Discussion and Lessons Learned

The results of our survey on DQ measurement and monitoring tools revealed interesting characteristics of DQ tools and allow conclusions to be drawn about the future direction of automated and continuous data quality measurement. While the following paragraphs provide a general overview of the market for DQ tools, which was a side result of this survey, each sub-question and the overall research question of this survey are discussed separately per subsection.

One of the greatest challenges we faced during the conduct of this survey was the constant change and development of the DQ tools, especially the open-source tools. Nevertheless, it is of great value to reflect on the current state of the market for two reasons: (1) to create a uniform vision for the future of DQ research, and (2) to identify the potential for functional enhancement across the tools.

The fact that we found 667 tools attributed to "data quality" in our systematic search indicates the growing awareness of the topic. However, approximately half (50.82 %) of the DQ tools that we found were domain-specific, which means they were either dedicated to specific types of data or built to measure the DQ of a proprietary tool. This share underlines the "fitness for use" principle of DQ, which states that the quality of data is dictated by the user and the type of usage. 40 % of the DQ tools were dedicated to a specific data management task, for example, data cleansing, data integration, or data visualization, which reflects the complexity of the topic "data quality." Although those tasks are often not clearly distinguished in practice, we required explicit DQ measurement, that is, making statements about the DQ without modifying the observed data.

Our selection of DQ tools provides a good cross-section of the market, since we included eight commercial and closed-source tools as well as five free and open-source tools, of which four (all except Talend) are not mentioned by Gartner (cf. Chien and Jain, 2019). The vendors of four tools (Informatica, SAS, Talend, Oracle) have been named "Leaders" in the Magic Quadrant for Data Quality Tools 2019, and two of them are among the four vendors currently controlling the market [which are SAP, Informatica, Experian, and Syncsort (Chien and Jain, 2019); however, no trial was granted for either SAP or Syncsort].

Overall, and according to our requirements, we found Informatica DQ to be the most mature DQ tool. The best support for data profiling is provided by Experian Pandora, which allows profiling across an entire DB and even across multiple connected data sources. All other tools allow data profiling only for selected columns or within specific tables. Despite being classified as leaders by Gartner, we perceived Oracle EDQ, Talend OS, and SAS Data Quality as having less support for data profiling and/or DQ monitoring. Although Quadient (with DataCleaner) was removed from the Gartner study in 2019 due to its focus on customer data, our evaluation yielded good support for data profiling and strong support for DQ monitoring. However, when comparing the two general-purpose and freely available DQ tools Talend OS and Aggregate Profiler, the former is more convincing in terms of its intuitive user interface and good overall performance. Aggregate Profiler, on the other hand, has richer support for advanced multi-column profiling and data cleansing, but it is not always clear which algorithms are used to perform data modifications, and the documentation is not up to date.

Three open-source tools (Apache Griffin, MobyDQ, and OpenRefine) were installed from GitHub and thus required technical knowledge for the setup. While OpenRefine cannot keep up with comparable tools like Talend OS or Aggregate Profiler in terms of data profiling, MobyDQ and Apache Griffin clearly have a different focus, namely CDQM. IBM ISDQ demonstrated that commercial tools, too, can be very arduous and time-intensive to install due to the increasing complexity of the individual modules and the dependencies between them.

6.1. Data Profiling Capabilities in Current DQ Tools

To answer the first sub-research question "Which data profiling capabilities are supported by current DQ tools?", we compiled 30 requirements that are mainly based on the classification of data profiling by Abedjan et al. (2019).

In summary, 11 of the 13 tools examined (all except Apache Griffin and MobyDQ) supported data profiling at least partially. The details of the data profiling capabilities per DQ tool are discussed in Section 5.2.1. Our evaluation revealed that especially single-column data profiling capabilities such as cardinalities (DP-1 to DP-5) were supported by all 11 tools. However, considering the state of the art in research, there is potential for functional enhancement with respect to multi-column profiling (DP-25 to DP-30) and dependency discovery (DP-19 to DP-24). For example, dependency discovery is supported in a comprehensive way by only two tools. Within the group of multi-column profiling, exact and approximate duplicate detection is a very common feature (supported at least partially by 10 tools in total), whereas correlation analysis is supported completely by only one tool (Aggregate Profiler) and partially by a second tool (Talend Open Studio). Association-rule mining is not supported by any tool at all, and no tool was observed to fully support clustering. According to our customer contacts and reference customers, those functionalities are not considered to be part of data profiling and are usually implemented in analytics tools (e.g., SAS Enterprise Guide). This might be a reason why most DQ tools in our evaluation lack a wide range of features in this category: customers and vendors simply do not consider them part of data profiling and data quality. This observation can be explained by the unclear distinction between the terms "data profiling" and "data mining." Abedjan et al. (2019) distinguish the two topics by the object of analysis (focus on columns in data profiling vs. rows in data mining) and by the goal of the task (gathering technical metadata in data profiling vs. gathering new insights in data mining). While this distinction is still fuzzy, we go one step further and claim that there is also no clear distinction between data mining and data analytics with respect to the techniques used [e.g., regression analysis is discussed in both topics (Dasu and Johnson, 2003)].

In recent years, numerous research initiatives concerning data profiling have been carried out that also use ML-based methods. Current general-purpose DQ tools do not take full advantage of these developments. Although several vendors claim to implement ML-based methods, we found no or only limited documentation of concrete algorithms (cf. Quadient, 2008). Note that in the case of DataCleaner's duplicate detection, we received more detailed documentation upon request. We think that, especially given the hype around artificial intelligence and the enhancement of DQ error detection with ML methods, it is necessary to focus on the desirable core characteristics for DQ and data mining (Dasu and Johnson, 2003): the methods should be widely applicable, easy to use, interpret, store, and deploy, and should have short response times. A counterexample are neural networks, which are increasingly applied in recent research initiatives but need to be handled with care for DQ measurement, because they are black boxes and hard to interpret. For measuring the quality of data (to ensure reliable and trustworthy data analysis), simple and clearly interpretable statistics and algorithms are required to prevent users from deriving wrong conclusions from the results.

Apart from functional enhancements, we want to point out the need for more automation in data profiling. Current DQ tools allow users to select data profiling features or to define rules, which are then applied to single attributes or tables. This does not meet today's requirements for mastering big data problems, where typically multiple information systems need to be monitored at the same time (Stonebraker and Ilyas, 2018). To ease the high initialization effort for large information system infrastructures, more automated, initial, and still meaningful out-of-the-box profiling would be required.
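
A first step in this direction is to compute a small set of single-column profiles for every attribute automatically, without the user having to select columns or define rules. The sketch below is our own minimal illustration using pandas (an assumption; it does not correspond to any evaluated tool) and covers only cardinalities and null counts; dependency discovery and multi-column profiling would still have to be added.

```python
import pandas as pd

def profile_table(df: pd.DataFrame) -> pd.DataFrame:
    """Automatic out-of-the-box profile per column: row count, nulls,
    distinct values, completeness, and uniqueness (cf. DP-1 to DP-5)."""
    rows = []
    for col in df.columns:
        s = df[col]
        rows.append({
            "column": col,
            "num_rows": len(s),
            "num_nulls": int(s.isna().sum()),
            "num_distinct": int(s.nunique(dropna=True)),
            "completeness": 1 - s.isna().mean(),
            "uniqueness": s.nunique(dropna=True) / len(s),
        })
    return pd.DataFrame(rows)

df = pd.DataFrame({"id": [1, 2, 3, 3], "city": ["Linz", None, "Wien", "Wien"]})
print(profile_table(df))
```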

6.2. Data Quality Measurement Capabilities

The second sub-research question "Which data quality dimensions and metrics can be measured with current DQ tools?" was inspired by the number of DQ dimensions and metrics proposed by researchers (cf. Piro, 2014; Batini and Scannapieco, 2016; Heinrich et al., 2018 and the detailed outline in Section 2.2). In our survey, we did not find a tool that implements a wider range of DQ metrics for the most important DQ dimensions as proposed in research papers, and we also did not find another survey that investigates the existence of DQ metrics in tools. The identified DQ metric implementations have several drawbacks: some are only applicable on attribute-level (e.g., no aggregation possibility), some require a gold standard that might not exist, and some have implementation errors.

6.2.1. DQ Metrics for Accuracy and the Problem of Gold Standards

The two open-source tools that implement metrics for the DQ dimensions accuracy (Apache Griffin) and completeness between two tables (MobyDQ) rely on a reference data set (i.e., a gold standard) provided by the user. Apache Griffin bases its metric on the definition by DAMA UK, who state that accuracy is "the degree to which data correctly describes the 'real world' object or event being described" (Askham et al., 2013); the reference data set representing this real-world state needs to be selected for the calculation. MobyDQ specifically aims at automating DQ checks in data pipelines, that is, computing the difference between a source and a target data source, where the gold standard is clearly defined. However, in scenarios where the quality of a single data source should be assessed, such metrics are not suitable, since a reference or gold standard is often not available (Ehrlinger et al., 2018). This fact is also reflected by the restricted prevalence of such gold-standard-dependent DQ metrics in commercial and general-purpose DQ tools.

6.2.2. DQ Metrics for Completeness and Uniqueness

The other investigated tools mainly implement two very basic metrics: completeness (indicating the missing-data problem) and uniqueness (indicating duplicate data values or records). It is noteworthy that while completeness is one of the most widely used DQ dimensions (cf. Batini and Scannapieco, 2016; Myers, 2017; Heinrich et al., 2018), the aspect of uniqueness is often neglected in DQ research (Ehrlinger and Wöß, 2019). For example, Piro (2014) perceives duplicate detection as a symptom of data quality, but not as a DQ dimension. Neither Myers (2017) in his "List of Conformed Dimensions of Data Quality," nor the ISO/IEC 25024:2015 standard on DQ (ISO/IEC 25024:2015(E), 2015) refers to a DQ dimension that describes the aspect of uniqueness or non-redundancy (Ehrlinger and Wöß, 2019). Despite this difference, both DQ dimensions have a common characteristic: they can be calculated without necessarily requiring a gold standard. Nevertheless, the existing implementations lack two aspects: (1) the aggregation of DQ dimensions and (2) schema-level DQ dimensions that are clearly part of the DQ topic (Batini and Scannapieco, 2016). The aggregation of DQ dimensions from value-level to attribute-, record-, table-, DB-, or cross-data-source-level, as presented by Hinrichs (2002) and Piro (2014), was not provided out of the box by any tool. Informatica DQ is the only tool that allows column-level metrics to be aggregated on table-level, but not higher. We did not count the possibility of implementing such aggregations manually in tools with strong rule support as availability of these aggregation functions.
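
To illustrate what such aggregation means, the sketch below lifts attribute-level completeness to table level as an arithmetic mean (the variant offered by Informatica DQ, as described above) and, for contrast, also averages record-level completeness values. It is an illustration of the aggregation idea only; Hinrichs' actual Equation (7) is not reproduced here, and the nested-list table with None for nulls is a made-up example.

```python
def completeness_att(column):
    return sum(v is not None for v in column) / len(column)

def completeness_tab_from_attributes(table):
    """Arithmetic mean of all attribute-level completeness values
    (the aggregation variant offered by Informatica DQ)."""
    columns = list(zip(*table))
    return sum(completeness_att(c) for c in columns) / len(columns)

def completeness_tab_from_records(table):
    """Arithmetic mean of all record-level completeness values."""
    return sum(completeness_att(r) for r in table) / len(table)

table = [  # records as rows, attributes as columns
    ["A-1", "Linz", None],
    ["A-2", None, None],
    ["A-3", "Wien", "Austria"],
]
print(completeness_tab_from_attributes(table))  # 0.666...
print(completeness_tab_from_records(table))     # 0.666...
```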

6.2.3. DQ Measurement Methodologies

Despite the lack of prefabricated DQ metrics, most tools refer to a set of DQ dimensions in their user guide or defined methodology; for example, Informatica and SAS rely on whitepapers influenced by David Loshin (cf. Informatica, 2010; SAS, 2019), and Talend promotes the existence of such metrics on their website 16. In Section 5.2.2, we showed that the list of DQ dimensions and metrics referenced by the DQ vendors is very non-uniform. Further inquiry about the metrics yielded two different responses from our customer contacts: while some explicitly stated that they do not offer generally applicable DQ metrics, others could not answer the question of how specific metrics are implemented.

In the case of Talend, we asked our customer contact and the Talend Community 17 where the metrics promoted on the website can be found. Unfortunately, we received no satisfying answer, only references to the data profiling perspective in TOS and its documentation. This experience underlines the statement by Sebastian-Coleman (2013) that "people can often not say how to measure completeness or accuracy," which also leads to different interpretations and implementations.

Other vendors justified the absence of generally applicable DQ metrics with two reasons: such metrics are not feasible in practice, and customers do not request them. Several DQ strategies also indicate that DQ metrics should be created by the user and adjusted to the data (cf. Informatica, 2010; Apache Foundation, 2019; SAS, 2019). This understanding follows the "fitness for use" principle, which highlights the subjectivity of DQ. Piro (2014) also states that objectively measurable DQ dimensions first require manual configuration by a user. An example of this is Apache Griffin, whose documentation states that "Data scientists/analyst define their DQ requirements such as accuracy, completeness, timeliness, and profiling" (Apache Foundation, 2019). Sebastian-Coleman (2013) points out that it is important to understand the DQ dimensions, but that these do not immediately lend themselves to enabling specific measurements. The main foundations of DQ measurement, including the set of DQ dimensions and metrics, were originally proposed in the course of the Total Data Quality Management (TDQM) program of MIT 18 in the 1980s. Dasu and Johnson (2003) state that the DQ dimensions, as originally proposed by the TDQM program, are not practically implementable and that it is often not clear what they mean. The results of our survey underline this statement with a scientific foundation, because each DQ tool implements the dimensions differently, and partly far removed from the complex metrics proposed in research (e.g., no aggregation, often no gold standard). Apart from completeness and uniqueness on attribute-level, no DQ dimension finds widespread agreement in its implementation and definition in practice. This is especially noteworthy for the frequently mentioned accuracy dimension, which, however, requires a reference data set that is often not available in practice.

6.2.4. The Meaning of DQ Dimensions and Metrics for DQ Measurement

We conclude that there is a strong need to question the current use of DQ dimensions and metrics. Research efforts to measure DQ dimensions directly with a single, generally applicable DQ metric have little practical relevance and can hardly be found in DQ tools. In practice, DQ dimensions are used to group domain-specific DQ rules (sometimes referred to as metrics) on a higher level. Since research and practice have failed to create a common understanding of DQ dimensions and their measurement for decades, a complementary and more practice-oriented approach should be developed. Several DQ tools show that DQ measurement is possible without referring to the dimensions at all. Since our focus is the automation of DQ measurement, a practical approach is required that does not depend on DQ dimensions but focuses on the core aspects (like missing data and duplicate detection) that can actually be measured automatically.

6.3. Data Quality Monitoring

The third sub-research question addresses "whether DQ tools enable automated monitoring of data quality over time." In contrast to Pushkarev et al. (2010), who did not find any tool that supports DQ monitoring, we identified several tools with this feature, as shown in Table 10.

In general-purpose DQ tools (e.g., DataCleaner, Informatica EDQ, InfoZoom & IZDQ), DQ monitoring is considered a premium feature, which is subject to additional costs and only provided in professional versions. This is also the reason why DQ monitoring has not been studied so far in related work that focused on open-source DQ tools (cf. Pushkarev et al., 2010). An exception to this observation is the dedicated open-source DQ monitoring tool Apache Griffin, which supports the automation of DQ metrics but lacks predefined functions and data profiling capabilities. The remaining open question with respect to DQ monitoring is which aspects of the data should actually be measured (discussed in Section 6.2).

7. Conclusion and Outlook

In this survey, we conducted a systematic search in which we identified 667 software tools dedicated to the topic "data quality." With six predefined exclusion criteria, we extracted 17 tools for deeper investigation. We evaluated 13 of the 17 tools against our catalog of 43 requirements divided into the three categories (1) data profiling, (2) DQ measurement, and (3) continuous DQ monitoring. Although the market of DQ tools is continuously changing, this survey gives a comprehensive overview of the state of the art of DQ tools and of how DQ measurement is currently perceived in practice by companies in contrast to DQ research.

So far, there are only a few surveys on DQ tools in general, and in particular no survey that investigated the existence of generic DQ metrics. There is also no survey that identified the existence of DQ monitoring capabilities in tools. We attempted to close this gap with our survey and provide the results regarding the available DQ metrics and DQ monitoring capabilities for the tools analyzed.

While we identified the need for more automation in data profiling and DQ measurement (with respect to initialization as well as continuous DQ monitoring), a clear declaration and explanation of the performed calculations and algorithms is equally essential. In several tools (e.g., Aggregate Profiler, InfoZoom), plots were generated or outliers were detected without a clear declaration of the threshold or distance function used. In alignment with the requirement for interpretability of data profiling results, we highlight the need for a clear declaration of the parameters used.

In our ongoing and future work, we will introduce a practical DQ methodology that focuses on directly measurable aspects of DQ, in contrast to abstract dimensions with no common understanding. We also think that it is worth investigating the potential for automated out-of-the-box data profiling along with a clear declaration of the parameters used, which might be modified after the initial run. Part of our ongoing research is to exploit time-series analytics for further investigation of DQ monitoring results in order to predict trends and sudden changes in the DQ (as suggested in Ehrlinger and Wöß, 2017). Since a deep investigation of single DP requirements was out of scope for this survey, it would also be worthwhile to further investigate specific implementations and their proper functionality, for example, which aspects yield floating-point differences. Last but not least, further investigation of the 339 excluded domain-specific DQ tools with regard to their domains and their scope would be interesting.

The top vendors of DQ tools worldwide have 7,200 (Experian), 5,000 (Informatica), and 2,700 (SAS) customers for their DQ product lines (Chien and Jain, 2019). Compared to the hype around AI and ML, these low numbers indicate a high catch-up demand for DQ tool applications in general.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

Author Contributions

LE designed and conducted the survey. LE and WW wrote this article. Both authors contributed to the article and approved the submitted version.

Funding

The research reported in this paper has been supported by the Austrian Ministry for Transport, Innovation and Technology, the Federal Ministry for Digital and Economic Affairs, and the State of Upper Austria in the frame of the COMET center SCCH.

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher's Note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Acknowledgments

The authors would like to thank all contact persons who provided us with trial licenses and support, in particular, Thomas Bodenmüller-Dodek and Dagmar Hillmeister-Müller from Informatica, David Zydron from Experian, Alexis Rolland from Ubisoft, Marc Kliffen from Human Inference, Ingo Lenzen from InfoZoom, Loredana Locci from SAS, and Rudolf Plank from solvistas GmbH. We specifically thank Elisa Rusz for her support with the systematic search and the DQ tool evaluation, as well as Alexander Gindlhumer for his support with Apache Griffin.

1. ^ https://github.com/CanburakTumer/SQL-Utils (January, 2022).

2. ^ https://github.com/rogelj/DescribeCol (January, 2022).

3. ^ http://www.dofactory.com/sql/sample-database (January, 2022).

4. ^ https://sourceforge.net/projects/dataquality (January, 2022).

5. ^ https://griffin.apache.org (January, 2022).

6. ^ https://one.ataccama.com (January, 2022).

7. ^ http://www.datamartist.com (January, 2022).

8. ^ https://www.ibm.com/support/knowledgecenter/en/SSZJPZ_11.7.0/com.ibm.swg.im.iis.productization.iisinfsv.install.doc/topics/cont_iisinfsrv_install.html (January, 2022).

9. ^ http://www.humanit.de (January, 2022).

10. ^ https://github.com/mobydq/mobydq (January, 2022).

11. ^ http://openrefine.org (January, 2022).

12. ^ https://github.com/OpenRefine/OpenRefine (January, 2022).

13. ^ http://www.oracle.com/technetwork/middleware/oedq/downloads/edq-vm-download-2424092.html (January, 2022).

14. ^ https://github.com/Talend/tdq-studio-se (January, 2022).

15. ^ http://www.sas.com (January, 2022).

16. ^ https://info.talend.com/vrstosdq_150602.html (January, 2022).

17. ^ https://community.talend.com (January, 2022).

18. ^ http://web.mit.edu/tdqm (January, 2022).

Abedjan, Z., Golab, L., and Naumann, F. (2015). Profiling relational data: a survey. VLDB J. 24, 557–581. doi: 10.1007/s00778-015-0389-y

Abedjan, Z., Golab, L., Naumann, F., and Papenbrock, T. (2019). “Data profiling,” in Synthesis Lectures on Data Management , vol. 10 (San Rafael, CA: Morgan & Claypool Publishers), 1–154.

Aggarwal, C. C. (2017). Outlier Analysis . 2nd Edn. New York, NY: Springer International Publishing.

Apache Foundation (2019). Apache Griffin User Guide . Technical report, Apache Foundation. Available online at: https://github.com/apache/griffin/blob/master/griffin-doc/ui/user-guide.md (January 2022).

Apel, D., Behme, W., Eberlein, R., and Merighi, C. (2015). Datenqualität erfolgreich steuern: Praxislösungen für Business-Intelligence-Projekte [Successfully Governing Data Quality: Practical Solutions for Business-Intelligence Projects] . Edition TDWI. Heidelberg: dpunkt.verlag GmbH.

Arrah Technology (2019). Aggregate Profile User Guide Version 6.1.8 . Technical Report.

Askham, N., Cook, D., Doyle, M., Fereday, H., Gibson, M., Landbeck, U., et al. (2013). The six primary dimensions for data quality assessment . Technical Report, DAMA United Kingdom.

Ballou, D. P., and Pazer, H. L. (1985). Modeling data and process quality in multi-input, multi-output information systems. Manag. Sci. 31, 150–162. doi: 10.1287/mnsc.31.2.150

Barateiro, J., and Galhardas, H. (2005). A survey of data quality tools. Datenbank-Spektrum 14, 15–21.

Batini, C., Cappiello, C., Francalanci, C., and Maurino, A. (2009). Methodologies for data quality assessment and improvement. ACM Comput. Surveys (CSUR) 41, 16:1–16:52. doi: 10.1145/1541880.1541883

Batini, C., and Scannapieco, M. (2016). Data and Information Quality: Concepts, Methodologies and Techniques . Cham: Springer International Publishing.

Bors, C., Gschwandtner, T., Kriglstein, S., Miksch, S., and Pohl, M. (2018). Visual interactive creation, customization, and analysis of data quality metrics. J. Data Inf. Qual. 10, 3:1–3:26. doi: 10.1145/3190578

Bronselaer, A., De Mol, R., and De Tré, G. (2018). A measure-theoretic foundation for data quality. IEEE Trans. Fuzzy Syst. 26, 627–639. doi: 10.1109/TFUZZ.2017.2686807

Chen, M., Song, M., Han, J., and Haihong, E. (2012). “Survey on data quality,” in 2012 World Congress on Information and Communication Technologies (WICT) (Trivandrum: IEEE), 1009–1013.

Chien, M., and Jain, A. (2019). Magic Quadrant for Data Quality Tools . Technical Report, Gartner, Inc.

Chrisman, N. R. (1983). The role of quality information in the long-term functioning of a geographic information system. Cartographica Int. J. Geograph. Inf. Geovisual. 21, 79–88.

Cichy, C., and Rass, S. (2019). An overview of data quality frameworks. IEEE Access , 7:24634–24648. doi: 10.1109/ACCESS.2019.2899751

Codd, E. F. (1970). A relational model of data for large shared data banks. Commun. ACM 13, 377–387.

Dai, W., Wardlaw, I., Cui, Y., Mehdi, K., Li, Y., and Long, J. (2016). “Data profiling technology of data governance regarding big data: review and rethinking,” in Information Technology: New Generations (Las Vegas, NV: Springer), 439–450.

Dasu, T., and Johnson, T. (2003). Exploratory Data Mining and Data Cleaning , Vol. 479. Hoboken, NJ: John Wiley & Sons.

Datamartist (2017). Automating Data Profiling (Pro Only) . Technical Report, Datamartist.

Ehrlinger, L., Haunschmid, V., Palazzini, D., and Lettner, C. (2019). “A DaQL to monitor the quality of machine data,” in Proceedings of the International Conference on Database and Expert Systems Applications (DEXA), volume 11706 of Lecture Notes in Computer Science . (Cham: Springer), 227–237.

Ehrlinger, L., and Wöß, W. (2017). “Automated data quality monitoring,” in Proceedings of the 22nd MIT International Conference on Information Quality (ICIQ 2017) , ed J. R. Talburt (Little Rock, AR), 15.1–15.9.

Ehrlinger, L., and Wöß, W. (2019). "A novel data quality metric for minimality," in Data Quality and Trust in Big Data, eds H. Hacid, Q. Z. Sheng, T. Yoshida, A. Sarkheyli, and R. Zhou (Cham: Springer International Publishing), 1–15.

Ehrlinger, L., Werth, B., and Wöß, W. (2018). Automated continuous data quality measurement with quaIIe. Int. J. Adv. Softw. 11, 400–417.

Elmagarmid, A. K., Ipeirotis, P. G., and Verykios, V. S. (2006). Duplicate record detection: a survey. IEEE Trans. Knowl. Data Eng. 19, 1–16. doi: 10.1109/TKDE.2007.250581

English, L. P. (1999). Improving Data Warehouse and Business Information Quality: Methods for Reducing Costs and Increasing Profits . New York, NY: John Wiley & Sons, Inc.

Experian (2018). User Manual Version 5.9. Technical Report, Experian . Available online at: https://www.edq.com/globalassets/documentation/pandora/pandora_manual_590.pdf (January 2022).

Experian (2020). What is a Data Quality Dimension? Available online at: https://www.experian.co.uk/business/glossary/data-quality-dimensions (January 2022).

Fisher, C. W., Lauria, E. J. M., and Matheus, C. C. (2009). An accuracy metric: percentages, randomness, and probabilities. J. Data Inf. Qual. 1, 16:1–16:21. doi: 10.1145/1659225.1659229

Friedman, T. (2012). Magic Quadrant for Data Quality Tools . Technical Report, Gartner, Inc.

Gao, J., Xie, C., and Tao, C. (2016). “Big data validation and quality assurance – issuses, challenges, and needs,” in Proceedings of the 2016 IEEE Symposium on Service-Oriented System Engineering (SOSE) (Oxford: IEEE), 433–441.

Ge, M., and Helfert, M. (2007). “A review of information quality research,” in Proceedings of the 12th International Conference on Information Quality (ICIQ) (Cambridge, MA: MIT), 76–91.

Goasdoué, V., Nugier, S., Duquennoy, D., and Laboisse, B. (2007). “An evaluation framework for data quality tools,” in Proceedings of the 12th International Conference on Information Quality (ICIQ) (Cambridge, MA: MIT), 280–294.

Haegemans, T., Snoeck, M., and Lemahieu, W. (2016). “Towards a precise definition of data accuracy and a justification for its measure,” in Proceedings of the International Conference on Information Quality (ICIQ 2016) (Ciudad Real: Alarcos Research Group (UCLM)), 16.1–16.13.

Heinrich, B., Kaiser, M., and Klier, M. (2007). "How to measure data quality? a metric-based approach," in Proceedings of the 28th International Conference on Information Systems (ICIS), eds S. Rivard and J. Webster (Montreal, QC: Association for Information Systems 2007), 1–15.

Heinrich, B., and Klier, M. (2009). “A novel data quality metric for timeliness considering supplemental data,” in Proceedings of the 17th European Conference on Information Systems (Verona: Università di Verona, Facoltà di Economia, Departimento de Economia Aziendale), 2701–2713.

Heinrich, B., Hristova, D., Klier, M., Schiller, A., and Szubartowicz, M. (2018). Requirements for data quality metrics. J. Data Inf. Qual. 9, 12:1–12:32.

Hildebrand, K., Gebauer, M., Hinrichs, H., and Mielke, M. (2015). Daten- und Informationsqualität [Data and Information Quality] , vol. 3. Wiesbaden: Springer Vieweg.

Hinrichs, H. (2002). Datenqualitätsmanagement in Data Warehouse-Systemen [Data Quality Management in Data Warehouse Systems] . Ph.D. thesis, Universität Oldenburg.

IEEE (1998). Standard for a Software Quality Metrics Methodology . Technical Report 1061-1998, Institute of Electrical and Electronics Engineers.

Informatica (2010). The Informatica Data Quality Methodology . Technical Report, Informatica.

Informatica (2018). Profile Guide – 10.2 HotFix 1 . Technical Report, Informatica. Available online at: https://docs.informatica.com/content/dam/source/GUID-2/GUID-2257EE21-27D6-4053-B1DE-E656DA0A15C8/11/en/IN_101_ProfileGuide_en.pdf (January 2022).

ISO 8000-8:2015(E) (2015). Data Quality – Part 8: Information and Data Quality Concepts and Measuring . Standard, International Organization for Standardization, Geneva.

ISO/IEC 25012:2008 (2008). Systems and Software Engineering – Systems and Software Quality Requirements and Evaluation (SQuaRE) – Measurement of Data Quality. Standard, International Organization for Standardization, Geneva.

ISO/IEC 25024:2015(E) (2015). Systems and Software Engineering – Systems and Software Quality Requirements and Evaluation (SQuaRE) – Measurement of Data Quality. Standard, International Organization for Standardization, Geneva, Switzerland.

ISO/IEC 25040:2011 (2011). Systems and Software Engineering – Systems and Software Quality Requirements and Evaluation (SQuaRE) – Measurement of Data Quality. Standard, International Organization for Standardization, Geneva.

Jain, A. K., Murty, M. N., and Flynn, P. J. (2000). Data Clustering: a review. ACM Comput. Surveys (CSUR) 31, 264–323. doi: 10.1145/331499.331504

Judah, S., Selvage, M. Y., and Jain, A. (2016). Magic Quadrant for Data Quality Tools . Technical Report, Gartner, Inc.

Kitchenham, B. (2004). Procedures for Performing Systematic Reviews . Technical Report, Keele University TR/SE-0401 and NICTA 0400011T.1.

Kitchenham, B., Brereton, O. P., Budgen, D., Turner, M., Bailey, J., and Linkman, S. (2009). Systematic literature reviews in software engineering – a systematic literature review. Inf. Softw. Technol. 51, 7–15. doi: 10.1016/j.infsof.2008.09.009

Kokemüller, J., and Haupt, F. (2012). Datenqualitätswerkzeuge 2012 – Werkzeuge zur Bewertung und Erhöhung von Datenqualität [Data Quality Tools 2012 - Tools for the Assessment and Improvement of Data Quality] . Technical Report, Fraunhofer IAO.

KPMG International (2016). Now or Never: 2016 Global CEO Outlook . Available online at: https://home.kpmg/content/dam/kpmg/pdf/2016/06/2016-global-ceo-outlook.pdf (January 2022).

Kusumasari, T. F. (2016). “Data profiling for data quality improvement with OpenRefine,” in 2016 International Conference on Information Technology Systems and Innovation (ICITSI) (Bal: IEEE), 1–6.

Laranjeiro, N., Soydemir, S. N., and Bernardino, J. (2015). “A survey on data quality: classifying poor data,” in Proceedings of the 21st Pacific Rim International Symposium on Dependable Computing (PRDC) (Zhangjiajie: IEEE), 179–188.

Lee, Y. W., Pipino, L. L., Funk, J. D., and Wang, R. Y. (2009). Journey to Data Quality . Cambridge, MA: The MIT Press.

Lee, Y. W., Strong, D. M., Kahn, B. K., and Wang, R. Y. (2002). AIMQ: a methodology for information quality assessment. Inf. Manag. 40, 133–146. doi: 10.1016/S0378-7206(02)00043-5

Loshin, D. (2006). Monitoring Data Quality Performance Using Data Quality Metrics . Technical Report, Informatica.

Loshin, D. (2010). The Practitioner's Guide to Data Quality Improvement . 1st Edn. San Francisco, CA: Morgan Kaufmann Publishers Inc.

Maletic, J. I., and Marcus, A. (2009). “Data cleansing: a prelude to knowledge discovery,” in Data Mining and Knowledge Discovery Handbook , ed O. Maimon (New York, NY: Springer), 19–32.

Maydanchik, A. (2007). Data Quality Assessment . Bradley Beach, NJ: Technics Publications, LLC.

McKean, E. (2005). The New Oxford American Dictionary , Vol. 2. Oxford: Oxford University Press New York.

Moore, S. (2018). How to Create a Business Case for Data Quality Improvement . Available Online at: https://www.gartner.com/smarterwithgartner/how-to-create-a-business-case-for-data-quality-improvement (January 2022).

Myers, D. (2017). About the Dimensions of Data Quality . Available Online at: http://dimensionsofdataquality.com/about_dims (January 2022).

Naumann, F. (2014). Data profiling revisited. ACM SIGMOD Rec. 42, 40–49. doi: 10.1145/2590989.2590995

Oracle (2018). Enterprise Data Quality Help Version 9.0. Technical Report, Oracle. Available online at: https://www.oracle.com/webfolder/technetwork/data-quality/edqhelp/index.htm (January 2022).

Otto, B., and Österle, H. (2016). Corporate Data Quality: Prerequisite for Successful Business Models . Berlin: Gabler.

Pateli, A. G., and Giaglis, G. M. (2004). A research framework for analysing ebusiness models. Eur. J. Inf. Syst. 13, 302–314. doi: 10.1057/palgrave.ejis.3000513

Pawluk, P. (2010). “Trusted data in IBM's MDM: accuracy dimension,” in Proceedings of the 2010 International Multiconference on Computer Science and Information Technology (IMCSIT) (Wisla: IEEE), 577–584.

Pipino, L. L., Lee, Y. W., and Wang, R. Y. (2002). Data quality assessment. Commun. ACM 45, 211–218. doi: 10.1145/505248.506010

Piro, A., editor (2014). Informationsqualität bewerten – Grundlagen, Methoden, Praxisbeispiele [Assessing Information Quality – Foundations, Methods, and Practical Examples] . 1st Edn. Düsseldorf: Symposion Publishing GmbH.

Prasad, K. H., Faruquie, T. A., Joshi, S., Chaturvedi, S., Subramaniam, L. V., and Mohania, M. (2011). “Data cleansing techniques for large enterprise datasets,” in 2011 Annual SRII Global Conference (San Jose, CA: IEEE), 135–144.

Pulla, V. S. V., Varol, C., and Al, M. (2016). "Open source data quality tools: revisited," in Information Technology: New Generations: 13th International Conference on Information Technology, ed S. Latifi (Cham: Springer International Publishing), 893–902.

Pushkarev, V., Neumann, H., Varol, C., and Talburt, J. R. (2010). “An overview of open source data quality tools,” in Proceedings of the 2010 International Conference on Information & Knowledge Engineering, IKE 2010, July 12-15, 2010 (Las Vegas, NV: CSREA Press), 370–376.

Quadient (2008). DataCleaner Reference Documentation 5.2. Technical Report . Available online at: https://datacleaner.org/docs/5.2/html (January 2022).

Redman, T. C. (1997). Data Quality for the Information Age . 1st Edn. Norwood, MA: Artech House, Inc.

Redman, T. C. (2005). “Measuring data accuracy: a framework and review,” in Information Quality , Ch. 2 (Armonk, NY: M.E. Sharpe), 21–36.

Rolland, A. (2019). mobyDQ. Technical Report, The Data Tourists . Available online at: https://ubisoft.github.io/mobydq (January 2022).

SAS (2019). DataFlux Data Management Studio 2.7: User Guide . Technical Report, SAS. Available online at: http://support.sas.com/documentation/onlinedoc/dfdmstudio/2.7/dmpdmsug/dfUnity.html (January 2022).

Scannapieco, M., and Catarci, T. (2002). Data quality under the computer science perspective. Arch. Comput. 2, 1–15.

Schäffer, T., and Beckmann, H. (2014). Trendstudie Stammdatenqualität 2013: Erhebung der aktuellen Situation zur Stammdatenqualität in Unternehmen und daraus abgeleitete Trends [Trend Study Master Data Quality 2013: Inquiry of the Current Situation of Master Data Quality in Companies and Derived Trends] . Technical Report, Hochschule Heilbronn.

Sebastian-Coleman, L. (2013). Measuring Data Quality for Ongoing Improvement: A Data Quality Assessment Framework . Waltham, MA: Elsevier.

Selvage, M. Y., Judah, S., and Jain, A. (2017). Magic Quadrant for Data Quality Tools . Technical Report, Gartner, Inc.

Sessions, V., and Valtorta, M. (2006). “The effects of data quality on machine learning algorithms,” in Proceedings of the 11th International Conference on Information Quality (ICIQ 2006) , Vol. 6 (Cambridge, MA: MIT), 485–498.

Sheskin, D. J. (2003). Handbook of Parametric and Nonparametric Statistical Procedures . 3rd Edn. Boca Raton, FL: CRC Press.

Stephens, O. (2018). Methods and Theory behind the Clustering Functionality in OpenRefine . Available online at: https://github.com/OpenRefine/OpenRefine/wiki/Clustering-In-Depth (January 2022).

Stonebraker, M., and Ilyas, I. F. (2018). Data integration: the current status and the way forward. Bull. IEEE Comput. Soc. Techn. Committee Data Eng. 41, 3–9.

Talend (2017). Talend Open Studio for Data Quality – User Guide 7.0.1M2 . Technical Report, Talend. Available online at: http://download-mirror1.talend.com/top/user-guide-download/V552/TalendOpenStudio_DQ_UG_5.5.2_EN.pdf (January 2022).

Tsiflidou, E., and Manouselis, N. (2013). “Tools and techniques for assessing metadata quality,” in Research Conference on Metadata and Semantic Research (Cham: Springer), 99–110.

Wand, Y., and Wang, R. Y. (1996). Anchoring data quality dimensions in ontological foundations. Commun. ACM 39, 86–95.

Wang, R. Y. (1998). A product perspective on total data quality management. Commun. ACM 41, 58–65.

Wang, R. Y., and Strong, D. M. (1996). Beyond accuracy: what data quality means to data consumers. J. Manag. Inf. Syst. 12, 5–33.

Woodall, P., Oberhofer, M., and Borek, A. (2014). A classification of data quality assessment and improvement methods. Int. J. Inf. Qual. 3, 298–321. doi: 10.1504/IJIQ.2014.068656

Zhu, H., Madnick, S., Lee, Y., and Wang, R. Y. (2014). “Data and information quality research: its evolution and future,” in Computing Handbook: Information Systems and Information Technology (London: Chapman and Hall/CRC), 16.1–16.20.

Keywords: data quality, data quality tools, data quality measurement, data quality monitoring, data profiling, information quality

Citation: Ehrlinger L and Wöß W (2022) A Survey of Data Quality Measurement and Monitoring Tools. Front. Big Data 5:850611. doi: 10.3389/fdata.2022.850611

Received: 07 January 2022; Accepted: 08 March 2022; Published: 31 March 2022.

Copyright © 2022 Ehrlinger and Wöß. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY) . The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Lisa Ehrlinger, lisa.ehrlinger@scch.at

Monitoring and Evaluation: Tools, Methods and Approaches

The purpose of this M&E Overview is to strengthen awareness and interest in M&E, and to clarify what it entails. You will find an overview of a sample of M&E tools, methods, and approaches outlined here, including their purpose and use; advantages and disadvantages; costs, skills, and time required; and key references. Those illustrated here include several data collection methods, analytical frameworks, and types of evaluation and review. The M&E Overview discusses:

  • Performance indicators
  • The logical framework approach
  • Theory-based evaluation
  • Formal surveys
  • Rapid appraisal methods
  • Participatory methods
  • Public expenditure tracking surveys
  • Cost-benefit and cost-effectiveness analysis
  • Impact evaluation

This list is not comprehensive, nor is it intended to be. Some of these tools and approaches are complementary; some are substitutes. Some have broad applicability, while others are quite narrow in their uses. The choice of which is appropriate for any given context will depend on a range of considerations. These include the uses for which M&E is intended, the main stakeholders who have an interest in the M&E findings, the speed with which the information is needed, and the cost.

The Importance of Monitoring and Evaluation for Decision-Making

Nadini Persaud (ORCID: 0000-0003-1827-2867) and Ruby Dagher (ORCID: 0000-0001-5211-2125)

With a rich discussion of the role of monitoring and evaluation (M&E), the differences between monitoring and evaluation, and the various types of evaluations that can be undertaken, this chapter provides a rich assessment of the tools that evaluators can use to assess advancements in the SDGs, lessons learned, accountability, and the power of the interconnected nature of the SDGs. It also provides a critical assessment of the challenges that the evaluation domain faces.



Measuring best practices for workplace safety, health and wellbeing: The Workplace Integrated Safety and Health Assessment

Glorian Sorensen

1 Dana-Farber Cancer Institute, Boston, MA

2 Harvard T.H. Chan School of Public Health, Boston, MA

Emily Sparer

Jessica A.R. Williams

3 University of Kansas School of Medicine, Kansas City, KS

Daniel Gundersen

Leslie I. Boden

4 Boston University School of Public Health, Boston, MA

Jack T. Dennerlein

5 Northeastern University, Boston, MA

Dean Hashimoto

6 Partners HealthCare, Inc., Boston, MA

7 Boston College Law School, Newton Centre, MA

Jeffrey N. Katz

8 Brigham and Women's Hospital, Boston, MA

Deborah L. McLellan

Cassandra A. Okechukwu

Nicolaas P. Pronk

9 HealthPartners Institute, Minneapolis, MN

Anna Revette

Gregory R. Wagner

Objective

To present a measure of effective workplace organizational policies, programs and practices that focuses on working conditions and organizational facilitators of worker safety, health and wellbeing: the Workplace Integrated Safety and Health (WISH) Assessment.

Methods

Development of this assessment used an iterative process involving a modified Delphi method, extensive literature reviews, and systematic cognitive testing.

Results

The assessment measures six core constructs identified as central to best practices for protecting and promoting worker safety, health and wellbeing: leadership commitment; participation; policies, programs and practices that foster supportive working conditions; comprehensive and collaborative strategies; adherence to federal and state regulations and ethical norms; and data-driven change.

Conclusions

The WISH Assessment holds promise as a tool that may inform organizational priority setting and guide research around causal pathways influencing implementation and outcomes related to these approaches.

INTRODUCTION

Efforts to protect and promote the safety, health, and wellbeing of workers have increasingly focused on integrating the complex and dynamic systems of the work organization and work environment. 1 – 3 The National Institute for Occupational Safety and Health (NIOSH) applies this integrated approach in the Total Worker Health ® (TWH) initiative by attending to “policies, programs, and practices that integrate protection from work-related safety and health hazards with promotion of injury and illness prevention efforts to advance worker well-being.” 4 NIOSH has defined best practices as a set of essential elements for TWH that prioritizes a hazard-free work environment and recognizes the significant role of job-related factors in workers’ health and wellbeing. 5 , 6

Increasingly, others have also emphasized the importance of improvements in working conditions as central to best practice recommendations. 7 – 15 For example, the Robert Wood Johnson Foundation’s efforts have focused on building a culture of health in the workplace. 16 Continuous improvement systems have relied on employee participation as a means of shaping positive working conditions. 17 – 19 In Great Britain, continuous improvement processes are employed through a set of “management standards” that assess and address stressors in the workplace, including demands, control, support, relationships, role, and organizational change. 20 Researchers have reported benefits to this integrated systems approach, including reductions in pain and occupational injury and disability rates; 21 – 26 strengthened health and safety programs; 27 , 28 improvements in health behaviors; 29 – 38 enhanced rates of employee participation in programs; 39 and reduced costs. 28

Assessment of the extent to which a workplace adheres to best practice recommendations related to an integrated systems approach is important for several reasons. Understanding relationships between working conditions and worker safety and health outcomes can inform priority-setting and decision making for researchers, policy-makers, and employers alike, and may motivate employer actions to improve workplace conditions. 40 In turn, identifying the impact of worker health and safety on business-related outcomes, such as worker performance, productivity, and turnover, may help to demonstrate the importance of protecting and promoting worker health for the bottom line. 41 Baseline data with follow up assessments can also provide a means for tracking improvements in working conditions and related health outcomes over time. An important part of the process for evaluating workplace adherence to recommendations and understanding its relationship to worker and business outcomes is the creation of assessment tools that effectively capture implementation of best practice recommendations.

In 2013, the Harvard T.H. Chan School of Public Health Center for Work, Health and Wellbeing published a set of “Indicators of Integration” that were designed to assess the extent to which an organization has implemented an approach integrating occupational safety and health with worksite health promotion. 1 This instrument assessed four domains: organizational leadership and commitment; collaboration between health protection and worksite health promotion; supportive organizational policies and practices (including accountability and training, management and employee engagement, benefits and incentives to support workplace health promotion and protection, integrated evaluation and surveillance); and comprehensive program content. The instrument was validated in two samples, 42 , 43 and has played a useful role in the evolving dialogue around integrated approaches to worker safety, health and wellbeing. 44 , 45 There is a need, however, for building on this work toward assessment of more conceptually grounded and practical constructs that measure the implementation of systems approaches focusing on improving working conditions as a means of protecting and promoting worker safety, health and wellbeing.

This manuscript describes an improved measure that reflects the Center’s conceptual model, which articulates the central role of working conditions in shaping health and safety outcomes as well as enterprise outcomes such as absenteeism and turnover ( Figure 1 ). 2 Working conditions, placed centrally in the model as core determinants of worker health and safety, encompass the physical environment and the organization of work (i.e., psycho-social factors, job design and demands, health and safety climate). The model highlights the potential interactions across systems, guiding exploration of the shared effects of the physical environment and the organization of work. Working conditions serve as a pathway from enterprise and workforce characteristics and effective policies, programs, and practices, to worker safety and health outcomes, as well as to more proximal outcomes such as health-related behaviors. Effective policies, programs, and practices may also contribute to improvements in enterprise outcomes such as turnover and health care costs. Feedback loops underscore the complexity of the system of inter-relationships across multiple dimensions, and highlight the potential synergy of intervention effects.

Figure 1. Systems level conceptual model centered around working conditions.

The purpose of this paper is to present a new measure of effective workplace organizational policies, programs and practices that focuses on working conditions as well as organizational facilitators of worker safety, health and wellbeing: the Workplace Integrated Safety and Health (WISH) Assessment. The objective of the WISH Assessment is to evaluate the extent to which workplaces implement effective comprehensive approaches to protect and promote worker health, safety and wellbeing. The policies, programs and practices encompassed by these approaches include both those designed to prevent work-related injuries and illnesses, and those designed to enhance overall workforce health and wellbeing.

Like the Indicators of Integration, this assessment is designed to be completed at the organizational level by employer representatives, such as directors of human resources, occupational safety, or employee health. These representatives are likely to be knowledgeable of organizational priorities, as well as policies, programs and practices related to workplace safety and health. Moreover, these representatives are in the position to influence the cultural and structural re-alignment necessary for integrated approaches. These organizational assessments are increasingly being used by policymakers, organizational leadership, and health and safety committees to guide goal setting and decision making. Organizational assessments such as this one may additionally complement worker surveys that can effectively capture employee practices as well as their perceptions of working conditions.

The WISH Assessment differs from the Indicators of Integration in two important ways: it embraces an increased focus on the central role of working conditions (as illustrated in Figure 1 ), and it expands assessment of best practice systems approaches to include a broader definition of protecting and promoting worker safety, health and wellbeing. This paper also describes the methods used to develop this instrument.

METHODS

Investigators from the Harvard Chan School Center for Work, Health and Wellbeing, a TWH Center of Excellence representing multiple institutions in the Boston area, developed the WISH Assessment. 46 This instrument measures workplace-level implementation of policies, programs and practices that protect and promote worker safety, health and wellbeing. Accordingly, the intended respondents are organizational representatives who are knowledgeable about existing policies, programs and practices at workplaces, such as executives at small businesses, or directors of human resources or safety departments. Development of the WISH Assessment relied on an iterative process involving a modified Delphi method, extensive literature reviews, and systematic cognitive testing.

Delphi method and literature review

The Center’s conceptual model 2 and related literature provided a guiding framework for the development of the WISH Assessment. As a starting point, we used constructs and related measures included in the Center’s published 1 and validated 42 , 43 Indicators of Integration tool. Based on an extensive review of literature, we expanded and revised the constructs captured by the instrument from four to seven. Next, we reviewed these constructs and their definitions using an iterative modified Delphi process 47 with an expert review panel, including Center investigators. We reviewed the literature to identify extant items for these and similar constructs, placing a priority on the inclusion of validated measures where feasible. To ensure content validity and adequate coverage across attributes for each construct, we reviewed working drafts of the instrument with the Center’s External Advisory Board, as well as members of other TWH Centers of Excellence. 48 Through repeated discussions, iterative review and revisions in six meetings over ten months, the Center members reached consensus related to a set of core domains and related measures. The working draft of the instrument was further reviewed by a survey methodologist, and prepared for cognitive testing. Following three rounds of cognitive testing and revision, as described below, Center investigators again reviewed and finalized this working version of the WISH Assessment.

Cognitive testing methods

The purpose of cognitive testing is to ensure that the items included on a survey effectively measure the intended constructs and are uniformly understood by potential respondents. The process focuses on the performance of each candidate item when used with members of the intended respondent group, and specifically assesses comprehension, information retrieval, judgment/estimation, and selection of response category. 49 , 50 In the development of the WISH Assessment, we tested the instrument through three rounds of interviews.

Data Collection

Participants were asked to fill out the self-administered paper (N=15) or web (N=4) survey and participate in a telephone-administered qualitative interview with a trained interviewer. Participants received the survey 48 hours prior to the interview, and were encouraged to complete the survey as close to the actual interview as possible. They were not explicitly encouraged to consult with others, nor were they told not to.

The semi-structured interviews were conducted using a structured interview guide and retrospective probing techniques. The first round focused on comprehension of key attributes for each item (e.g., concepts of integration and collaborative environments) and of key words or phrases in the context of the question and instrument. Items revised following round 1 were again tested in a second round using the same approach. The focus of round 2 was to assess the adequacy of revisions in addressing the problems identified in round 1. A third round of interviews was conducted to confirm there were no additional problems due to revisions based on round 2. Across all rounds, the interviews concluded with a series of questions about the participant’s overall experience in completing the survey. For example, participants could provide general feedback on the survey content and the time required to complete the survey, and could comment on questions that didn’t fit within the survey or were duplicative. Participants were compensated with a $20 Amazon gift card for their time.

Participants were selected using purposive sampling to ensure diversity across industries and organizational roles, and included directors of human resources, occupational health, safety, and similar positions from hospitals (n=9), risk management (n=2), technology (n=2), transportation (n=2), community health center (n=1), manufacturing (n=1), laboratory research and development (n=1), and emergency response (n=1). Participants were identified through the Center for Work, Health and Wellbeing, and included attendees at a continuing education course and former collaborators on other projects. New participants were included in subsequent rounds of testing in order to ensure that revisions to items from each round adequately addressed the limitations, and that items were appropriately and uniformly interpreted among respondents.

Following each round, feedback was summarized by item. Participant feedback was used to determine if the question wording needed modification. Each item was first considered on its own, and then assessed in terms of its fit in measuring its construct. A survey methodologist made suggested revisions which were then reviewed and discussed by the author team and interviewers to ensure that they retained substantive focus on the construct. We revised items when participants found terminology unclear or when participants’ answers did not map onto the intended construct. An item was removed if it was found to be redundant or did not adequately map onto the intended construct, and its deletion did not compromise content validity. We analyzed data across industry, and also assessed responses specific to the hospital industry, which included the most respondents.

Description of the constructs and measures

We identified six core constructs as central to best practices for protecting and promoting worker safety, health and wellbeing through the Delphi process and literature review. The items included in each construct with their response categories are included in Table 1 . Following the Indicators of Integration, this assessment was designed to be answered by one or more persons within an organization who are likely to be familiar with the organization’s policies, programs and practices related to worker safety, health and wellbeing. Below, we present each construct, including its definition, rationale for inclusion, and sources of the items included.

Workplace Integrated Safety and Health (WISH) Assessment

This brief survey is designed to assess the extent to which organizations effectively implement integrated approaches to worker safety, health and wellbeing. The term “integrated approaches” refers to policies, programs, and practices that aim to prevent work-related injuries and illnesses and enhance overall workforce health and wellbeing.

This survey is meant to be completed by health and safety representatives, either in human resources or in safety, at the middle management level. There are no right or wrong answers. Your responses are meant to reflect your understanding of policies, practices and programs currently implemented within your organization.
1. The following questions refer to leadership commitment. We define the term “leadership commitment” to mean the following: An organization’s leadership makes worker safety, health, and well-being a clear priority for the entire organization. They drive accountability and provide the necessary resources and environment to create positive working conditions.

Response Categories: Please indicate how often you feel your organization or its leaders do each of the following: Not at all, some of the time, most of the time, all of the time.
a. The company’s leadership, such as senior leaders and middle managers, communicate their commitment to a work environment that supports employee safety, health, and wellbeing.
b. The organization allocates enough resources such as enough workers and money to implement policies or programs to protect and promote worker safety and health.
c. Our company’s leadership, such as senior leaders and managers, take responsibility for ensuring a safe and healthy work environment.
d. Worker health and safety are part of the organization’s mission, vision or business objectives.
e. The importance of health and safety is communicated across all levels of the organization, both formally and informally.
f. The importance of health and safety is consistently reflected in actions across all levels of the organization, both formally and informally.
2. The following questions refer to participation. We define the term “participation” to mean the following: Stakeholders at every level of an organization, including organized labor or other worker organizations if present, help plan and carry out efforts to protect and promote worker safety and health.

Response categories: For these collaborative activities or programs, please indicate how often you believe your organization implements each: not at all, some of the time, most of the time, all of the time.
a. Managers and employees work together in planning, implementing, and evaluating comprehensive safety and health programs, policies, and practices for employees.
b. This company has a joint worker-management committee that addresses efforts to protect and promote worker safety and health.
c. In this organization, managers across all levels consistently seek employee involvement and feedback in decision making.
d. Employees are encouraged to voice concerns about working conditions without fear of retaliation.
e. Leadership, such as supervisors and managers, initiate discussions with employees to identify hazards or other concerns in the work environment.
3. The following questions refer to policies, programs, and practices focused on positive working conditions. We define this term to mean the following: The organization enhances worker safety, health, and well-being with policies and practices that improve working conditions.

Response categories: For each of the following policies or practices, please indicate the degree to which they are implemented at your company: not at all, somewhat, mostly, completely.
a. The workplace is routinely evaluated by staff trained to identify potential health and safety hazards.
b. Supervisors are responsible for identifying unsafe working conditions on their units.
c. Supervisors are responsible for correcting unsafe working conditions on their units.
d. This workplace provides a supportive environment for safe and healthy behaviors, such as a tobacco-free policy, healthy food options, or facilities for physical activity.
e. Organizational policies or programs are in place to support employees when they are dealing with personal or family issues.
f. Leadership, such as supervisors and managers, make sure that workers are able to take their entitled breaks during work (e.g. meal breaks).
g. Supervisors and managers make sure workers are able to take their earned time away from work, such as sick time, vacation, and parental leave.
h. This organization ensures that policies to prevent harm to employees from abuse, harassment, discrimination, and violence are followed.
i. This organization has trainings for workers and managers across all levels to prevent harm to employees from abuse, harassment, discrimination, and violence.
j. This workplace provides support to employees who are returning to work after time off due to work-related health conditions.
k. This workplace provides support to employees who are returning to work after time off due to non-work related health conditions.
l. This organization takes proactive measures to make sure that the employee’s workload is reasonable, for example, that employees can usually complete their assigned job tasks within their shift.
m. Employees have the resources, such as equipment and training, to do their jobs safely and well.
n. All employees in this organization receive paid leave, including sick leave.
4. The following questions refer to comprehensive and collaborative strategies. We define this term to mean the following: Employees from across the organization work together to develop comprehensive health and safety initiatives.

Response categories: For the following collaborative or comprehensive policies, programs, or practices, please indicate the degree to which your company implements each: not at all, somewhat, mostly, completely.
a. This company has a comprehensive approach to promote and protect worker safety and health. This includes collaborative efforts across departments as well as education and programs for individuals and policies about the work environment.
b. This company has a comprehensive approach to worker wellbeing. This includes collaboration across departments in efforts to prevent work-related illness and injury and to promote worker health.
c. This company coordinates policies, programs, and practices for worker health, safety, and wellbeing across departments.
d. Managers are held accountable for implementing best practices to protect worker safety, health, and wellbeing, for example through their performance reviews.
e. Managers are given resources, such as equipment and trainings, for implementing best practices to protect and promote worker safety, health, and wellbeing.
f. This company prioritizes protection and promotion of worker safety and health when selecting vendors and subcontractors.
5. The following questions refer to adherence. We define the term “adherence” to mean the following: The organization adheres to federal and state regulations, as well as ethical norms, that advance worker safety, health, and well-being.

Response categories: For each of the following statements, please indicate the degree to which you believe your company adheres to or prioritizes standards and regulations: not at all, somewhat, mostly, completely.
a. This organization complies with standards for legal conduct.
b. In this organization, people show sincere respect for others’ ideas, values, and beliefs.
c. This workplace complies with regulations aimed at eliminating or minimizing potential exposures to recognized hazards.
d. This company ensures that safeguards regarding worker confidentiality, privacy and non-retaliation protections are followed.
e. The wages for the lowest-paid employees in this organization seem to be enough to cover basic living expenses such as housing and food.
6. The following questions refer to data-driven change. We define this term to mean the following: Regular evaluation guides an organization’s priority setting, decision making, and continuous improvement of worker safety, health, and well-being initiatives.

Response categories: Please indicate the degree to which your company does each of the following: not at all, somewhat, mostly, completely.
a. The effects of policies and programs to promote worker safety and health are measured using data from multiple sources, such as injury data, employee feedback, and absence records.
b. Data from multiple sources on health, safety, and wellbeing are integrated and presented to leadership on a regular basis.
c. Evaluations of policies, programs and practices to protect and promote worker health are used to improve future efforts.
d. Integrated data on employee safety and health outcomes are coordinated across all relevant departments.

Leadership Commitment , defined as: “Leadership makes worker safety, health, and well-being a clear priority for the entire organization. They drive accountability and provide the necessary resources and environment to create positive working conditions.” This construct was included in our Indicators of Integration; items in the WISH Assessment were adapted from this prior measure, as well as from other sources. 1 , 41 , 51 , 52 Organizational leadership has been linked to an array of worker safety, health and wellbeing outcomes, 53 , 54 including organizational safety climate, 55 , 56 job-related wellbeing, 57 , 58 workplace injuries, 59 , 60 and health behaviors. 61 , 62 This element recognizes that top management is ultimately responsible for setting priorities that define worker and worksite safety and health as part of the organization’s vision and mission. 14 , 16 Leadership roles include providing the resources needed for implementing best practices related to worker safety, health and wellbeing; establishing accountability for implementation of relevant policies and practices; and effectively communicating these priorities through formal and informal channels. 51 , 52

Participation , defined as: “Stakeholders at every level of an organization, including labor unions or other worker organizations if present, help plan and carry out efforts to protect and promote worker safety and health.” Many organizations have mechanisms in place to engage employees and managers in decision making and planning. These mechanisms may be used in planning and implementing integrated policies and programs, for example through joint worker-management committees that combine efforts to protect and promote worker safety, health and wellbeing. 7 , 63 , 64 Employee participation in decision-making facilitates a broader organizational culture of health, safety and wellbeing. Participation also includes encouraging employees to identify and report threats to safety and health, without fear of retaliation and with the expectation that their concerns will be addressed. Items included were adapted from the Indicators of Integration 1 and a self-assessment checklist from the Center for the Promotion of Health in the New England Workplace. 65

Policies, programs and practices that foster supportive working conditions , defined as: “The organization enhances worker safety, health, and well-being with policies and practices that improve working conditions.” These policies, programs and practices are central to the conceptual model presented in Figure 1 . Items include measures of the physical work environment and the organization of work (i.e., psychosocial factors, job tasks, demands, and resources), and are drawn from multiple sources. 1 , 41 , 66 – 69 The focus on working conditions is based on principles of prevention articulated in a hierarchy of controls framework, which has been applied within TWH. 10 , 70 Eliminating or reducing recognized hazards, whether in the physical work environment or the organizational environment, provides the most effective means of reducing exposure to potential hazards on the job. Policies and processes to protect workers from physical hazards include routine inspections of the work environment, with mechanisms in place for correction of identified hazards, as well as policies that support safe and healthy behaviors, such as tobacco control policies. A supportive work organization includes safeguards against job strain, work overload, and harassment, 7 , 71 – 74 as well as supports for workers as they address work-life balance, return to work after an illness or injury, and take entitled breaks, including meal breaks as well as sick and vacation time. 75 , 76

Comprehensive and collaborative strategies , defined as: “Employees from across the organization work together to develop comprehensive health and safety initiatives.” Measures were adapted from our Indicators of Integration 1 and also relied on recent recommendations from the American College of Occupational and Environmental Medicine. 41 Although efforts to protect and promote worker safety and health have traditionally functioned independently, this construct acknowledges the benefits derived from collaboration across departments within an organization to protect and promote worker safety and health, both through policies about the work environment as well as education for workers. These efforts carry through into the selection of subcontractors and vendors, recognizing their impact on working conditions.

Adherence , defined as: “ The organization adheres to federal and state regulations, as well as ethical norms, that advance worker safety, health, and well-being.” The importance of this construct has been recognized by multiple organizations, whose contributions and metrics were incorporated in the measures included here. 7 , 77 – 79 Employers have a legal obligation to provide a safe and healthy work environment. 7 , 68 There is also significant agreement that any system that includes health and safety metrics must include safeguards for employee confidentiality and privacy. 7 , 41

Data-driven change , defined as: “Regular evaluation guides an organization’s priority setting, decision making, and continuous improvement of worker safety, health, and well-being initiatives.” Building health metrics into corporate reporting underscores the importance of worker health and safety as a business priority. 16 , 80 Feedback to leadership based on evaluation and monitoring of integrated programs, policies and practices can provide a basis for ongoing quality improvement. An integrated system that reports outcomes related both to occupational health as well as health behaviors and other health and wellbeing indicators can point to shared root causes within the conditions of work. 1 , 14

Cognitive testing results

We tested the WISH Assessment in three rounds of cognitive testing with a total of 19 participants. (See Appendix 1 for changes made to the items across the three rounds of testing.) On average, participants completed the self-administered survey in about 10–15 minutes, and the cognitive interviews took an average of 45 minutes. In the first round of cognitive testing, three participants completed a web version of the survey, and five, a paper-and-pencil version. Because we found no differences in concerns raised, the second round used only a paper survey, whereas the third included web respondents to confirm no differences in the final instrument. Changes made to the survey items were based on input from multiple respondents over the three rounds of interviews, and did not rely specifically on input from any one individual.

For items with uniform interpretation, revisions were made if respondents suggested a word or phrase that would clarify the question and that investigators felt retained its substantive focus. In addition, some items were dropped because they revealed multiple sources of problems, were too difficult to answer, or were identified as redundant. The first round of testing led to the removal of seven items and the modification of 24 items. The second round of cognitive testing revealed that problems with the question wording remained with 15 questions in the context of the full survey. No additional questions were removed. These specific questions were updated and re-tested among three participants.

Throughout the cognitive testing process, we found several items to have either ambiguous terminology resulting in non-uniform or restrictive interpretation, inadequate framing of key terms and constructs, or lack of knowledge or perceived ability to provide an answer, resulting in poor information retrieval or mapping to the construct. Items measuring integration or collaboration within an organization were more often identified as problematic; to address this concern, we included a description of these constructs in the survey’s introduction to frame the survey for respondents. Although there was uniform interpretation of items asking about employee’s living wage, some respondents reported they did not have knowledge to provide an accurate answer. Most other items revealed uniform interpretation and no concerns regarding information retrieval or selecting response categories.

Looking at the items by construct, we identified particular concerns with items measuring two domains: “ policies, programs, and practices that foster supportive working conditions” and “adherence ” to norms and regulations. To address these concerns, we revised these items by improving the description of constructs or the terms in the respective sections’ introductions, using less ambiguous wording and integrating appropriate examples as necessary. We found commonalities across the responses in the remaining four domains, and describe our specific remediation process for each of these domains:

Leadership commitment

In round 1, we found a lack of clarity for the concept of “leadership.” For example, one respondent from the health care industry reported that: “[senior leaders and middle managers] should be distinguished and not conflated because there are several layers of management.” As a result, several respondents expressed difficulty retrieving accurate information due to level-specific answers. One respondent from a laboratory research and development company noted “[…] leadership communicate their commitment to safety and health through written policies. If you were to add supervisors – people closer to the front line – it would be different.” We addressed this concern by rewriting the introduction for this domain to clearly define “leadership commitment” and remove mention of leadership levels. However, we retained the wording “[…] such as senior leaders and middle managers, […]” in two items to reflect that organizations have channels through which commitment is communicated or enforced.

Participation

The questions for collaborative participation were largely identified as clear and uniformly interpreted. However, there was a lack of clarity regarding who the key stakeholders were, particularly in the introduction. In addition, respondents reported that the introduction was too wordy and had a high literacy bar. Given these concerns, and because respondents suggested that the term “culture” was too academic, we omitted this term. Some respondents expressed difficulty retrieving an appropriate answer due to this lack of clear framing of items in the introduction. We addressed this concern by rewriting the introduction to emphasize the definition of “participation” in the context of an organization’s activities that ensure worker safety and health. Feedback from round 2 found that this helped frame the item set more clearly. However, the word “encourage” in “In this organizational culture, managers encourage employees to get involved in making decisions” was identified as ambiguous. This was changed to “[…] seek employee involvement and feedback […]”.

Comprehensive and collaborative strategies

The most common feedback for items in this domain, expressed by several participants across multiple industries, was difficulty with the concept of “comprehensive,” i.e., that programming should address both prevention of illness and injury and promotion of worker health and safety. To a lesser extent, respondents also found difficulty with the “collaborative” concept. For example, some respondents, including those from the hospital industry and risk management, expressed difficulty responding to an item that included both “prevent” and “enhance,” which were perceived as “two different questions within this question.” We addressed these concerns by more clearly defining the two core constructs in a revised introduction. Moreover, we revised items to retain both “prevent and promote” while more clearly framing the question in the context of collaboration. For example, the item “This company has a comprehensive approach to worker wellbeing that includes efforts to prevent work-related illness and injury as well as to enhance worker health” was revised to “This company has a comprehensive approach to worker wellbeing. This includes collaboration across departments in efforts to prevent work-related illness and injury and to promote worker health.”

Data-driven change

For this domain, we found evidence of poor cueing for the concepts of integration and coordination in the context of using data to produce organizational change. For example, several respondents from the hospital industry expressed that they did not understand what was meant by “integrated” in the context of “Summary reports on integrated policies and programs are presented to leadership on a regular basis, while also protecting employee confidentiality,” or “coordinated system” in context of “Data related to employee safety and health outcomes are integrated within a coordinated system.” Remediation focused on clarifying the context and definitions for integration and coordination. First, the introduction was revised to explicitly define data-driven change. Secondly, items were reworded to clarify integration and coordination. For example, “Summary reports on integrated policies and programs are presented to leadership on a regular basis, while also protecting employee confidentiality” was revised to “Data from multiple sources on health, safety, and wellbeing are integrated and presented to leadership on a regular basis.”

Our analyses also underscored commonalities across industries even when these issues seemed industry-specific. For example, comments from several participants suggested that a product-based mission may often dominate concerns about worker safety and health. In healthcare, this may be manifested by prioritizing patient care and safety over worker health and safety, or in other industries by a focus on production or timeline goals. Across industries, there was widespread agreement that Employee Assistance Programs were the primary resource for supporting employees dealing with personal or family issues.

DISCUSSION

Effective policies, programs and practices contribute to improvements in worker health, safety and wellbeing, as well as to enterprise outcomes such as improved employee morale, reduced absence and turnover, potentially reduced healthcare costs, and improved quality of services. 2 , 40 , 81 – 84 This manuscript presents the Workplace Integrated Safety and Health (WISH) Assessment, designed to evaluate the extent to which organizations implement best practice recommendations for an integrated, systems approach to protecting and promoting worker safety, health and wellbeing. This instrument builds on the Indicators of Integration, previously published and validated by the Center for Work, Health and Wellbeing. 1 , 42 , 43 We have expanded this tool based on the conceptual model presented in Figure 1 , 2 which prioritizes working conditions as determinants of worker safety, health and wellbeing. In addition, the WISH Assessment is designed to measure the extent to which an organization implements best practice recommendations. These constructs have also been used to inform the Center’s guidelines for implementing best practice integrated approaches. 8

A growing range of metrics are available to assess organizational approaches to worker safety and health. The Integrated Health & Safety Index (IHS Index), created by the American College of Occupational and Environmental Medicine in collaboration with the Underwriters Laboratories, focuses on translating health and safety into value for businesses using three dimensions: economic, environmental and social standards. 41 By focusing on value, this measure has the potential to bolster the business case for health and safety. 41 , 85 The HERO Health and Well-being Best Practices Scorecard in Collaboration with Mercer is an online tool that allows employers to receive emailed feedback on their health and well-being practices. 86 Similarly, the American Heart Association’s Workplace Health Achievement Index provides an on-line self-assessment scorecard that includes comparisons with other companies. 87 The health metrics designed by the Vitality Institute include both a long and short form questionnaire, both with automatic scoring. 88 The Center for the Promotion of Health in the New England Workplace (CPH-NEW) has developed a tool to assess organizational readiness for implementing an integrated approach 11 and is developing a tool that focuses on participatory engagement of workers, with the goal of involving workers in the process of prioritizing health and safety issues and then developing and evaluating the proposed solutions. 89 Other measures of the work environment, such as the Health and Safety Executive Management Standards Indicator Tool used in the United Kingdom, are designed to be taken by workers and so can provide detailed information on conditions as they are experienced by workers, but do not capture company-level policies and programs. 90 The WISH Assessment, designed to assess a company’s use of best practices for health and safety, is substantially shorter than the IHS Index and the HERO Scorecard, does not require the compilation of metrics, and does not use individual employee data. In addition, in comparison to these other measures, the WISH Assessment can be used to guide organizations towards best practices and can be easily completed by organizations that might not have the resources to use the more extensive assessments.

Next steps in the development of the WISH Assessment include validation of the instrument across multiple samples, and design and testing of a scoring system. We validated the Indicators of Integration in two samples and found it to have convergent validity and high internal consistency, and to express one unified factor even when slight changes were made to adapt the measure. 42 , 43 We expect to follow a similar approach in validating this tool and assessing its dimensionality in large samples using factor analysis. Our goal is to design a scoring system that would be appropriate for both applied and research applications. As such, we expect the scoring algorithm to be simple enough for auto-calculation.
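As a rough illustration of the kinds of checks described here, the sketch below computes Cronbach's alpha for internal consistency and fits a one-factor model to gauge dimensionality on simulated item responses. The simulated data, the six-item setup, and the use of numpy and scikit-learn are assumptions made for the example, not the authors' analysis plan.

```python
# Illustrative psychometric checks of the kind described for validation:
# internal consistency (Cronbach's alpha) and a one-factor model for dimensionality.
# The simulated data and specific tooling are assumptions, not the authors' method.

import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)

# Simulate 500 organizations answering 6 items on a 1-4 scale, driven by a single
# latent "integration" factor plus noise (purely illustrative data).
n_orgs, n_items = 500, 6
latent = rng.normal(size=(n_orgs, 1))
loadings = rng.uniform(0.5, 0.9, size=(1, n_items))
raw = latent @ loadings + rng.normal(scale=0.5, size=(n_orgs, n_items))
items = np.clip(np.round(2.5 + raw), 1, 4)  # discretize to the 1-4 response scale

def cronbach_alpha(x):
    """Cronbach's alpha for an (observations x items) matrix."""
    k = x.shape[1]
    item_variances = x.var(axis=0, ddof=1).sum()
    total_variance = x.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_variances / total_variance)

print(f"Cronbach's alpha: {cronbach_alpha(items):.2f}")

# Fit a one-factor model; high, same-sign loadings across items would be
# consistent with a single underlying dimension.
fa = FactorAnalysis(n_components=1, random_state=0).fit(items)
print("One-factor loadings:", np.round(fa.components_.ravel(), 2))
```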

This tool may ultimately serve multiple purposes. As a research tool, it may provide a measure of workplace best practices that can be examined as determinants of worker safety and health outcomes. After being validated, the WISH Assessment may be used to explore organizational characteristics that may be associated with implementation of best practices. This instrument also responds to calls for practical tools for organizations implementing an integrated approach and focusing on working conditions. 41 The Center used the Indicators of Integration as part of a larger assessment process in three small-to-medium manufacturing businesses to inform organizations’ priority setting and decision making around the integration of occupational safety and health and health promotion. 91 In-person group discussions with key staff and executive leaders were used to rate each question on the scorecard, resulting in actionable steps based on identified gaps. Similarly, a validated WISH Assessment could be translated into a scorecard to be used to inform priority setting, decision making and to monitor changes over time in conditions of work and related health and safety outcomes. The Center has also applied the constructs defined in the WISH Assessment in its new best practice guidelines, 8 which include suggestions for formal and informal policies and practices ( Table 2 ).

Table 2. Example policies and practices for each WISH construct. For each construct, the table lists formal policies and informal practices, each spanning the physical environment, work organization, and psychosocial environment.

McLellan D, Moore W, Nagler E, Sorensen G. 2017. Implementing an integrated approach: Weaving worker health, safety, and well-being into the fabric of your organization. Dana-Farber Cancer Institute: Boston, MA. http://centerforworkhealth.sph.harvard.edu/

These indicators describe policies, programs and practices within the control of a specific organization or enterprise, and are most likely to apply to organizations that employ approximately 100 or more employees. The cognitive testing conducted to refine the items included in the WISH Assessment included representatives from organizations in selected settings; the generalizability of these results may therefore be restricted to similar types of organizations. There remains a need for exploring how this measure may function in different industries and across organizations of varying size. Although the purpose of the WISH instrument is to provide a measure that might be broadly useful across industries, we also recognize that each industry faces particular challenges due to the nature of its work; supplementary questions may be needed to address these industry-specific concerns. Although this measure has not yet been validated, we believe it is important to share it and to explore opportunities for collaboration with other researchers interested in testing its psychometric properties across populations and settings, in order to further develop this tool. It will ultimately be important as well to develop mechanisms for scoring this instrument, taking into account potential weighting across the domains included.

Growing evidence clearly documents the benefits to be derived from integrated systems approaches for protecting and promoting worker safety, health and wellbeing. Practical, validated measures of best practices that are supported by existing evidence and do not place an undue burden on respondents are needed to support systematic study and organizational change. In cognitive testing, we demonstrated that the items included in this instrument effectively assess the defined constructs. Our goal was to create a measure that will be broadly useful and valid across industry, and might contribute to understanding differences and similarities by industry. Thus, the general applicability of this instrument is a strength in that it would allow for comparisons across industries, if so desired by substantive research. We also recognize the potential benefits of industry-specific versions of this instrument which may use this broader instrument as a base set of measures while also expanding on areas that are unique to a given industry. This may help increase understanding of industry-specific health and safety challenges. The WISH Assessment holds promise as a tool that may inform organizational priority setting and guide research around causal pathways influencing implementation and outcomes related to these approaches.

Supplementary Material

Acknowledgments

This work was supported by a grant from the National Institute for Occupational Safety and Health (U19 OH008861) for the Harvard T.H. Chan School of Public Health Center for Work, Health and Well-being.

Conflict of Interest noted: None


Understanding Evaluation Methodologies: M&E Methods and Techniques for Assessing Performance and Impact

This article provides an overview and comparison of the different types of evaluation methodologies used to assess the performance, effectiveness, quality, or impact of services, programs, and policies. There are several methodologies, both qualitative and quantitative, including surveys, interviews, observations, case studies, and focus groups. In this article, we discuss the most commonly used qualitative and quantitative evaluation methodologies in the M&E field.

Table of Contents

  • Introduction to Evaluation Methodologies: Definition and Importance
  • Types of Evaluation Methodologies: Overview and Comparison
  • Program Evaluation methodologies
  • Qualitative Methodologies in Monitoring and Evaluation (M&E)
  • Quantitative Methodologies in Monitoring and Evaluation (M&E)
  • What are the M&E Methods?
  • Difference Between Evaluation Methodologies and M&E Methods
  • Choosing the Right Evaluation Methodology: Factors and Criteria
  • Our Conclusion on Evaluation Methodologies

1. Introduction to Evaluation Methodologies: Definition and Importance

Evaluation methodologies are the methods and techniques used to measure the performance, effectiveness, quality, or impact of various interventions, services, programs, and policies. Evaluation is essential for decision-making, improvement, and innovation, as it helps stakeholders identify strengths, weaknesses, opportunities, and threats and make informed decisions to improve the effectiveness and efficiency of their operations.

Evaluation methodologies can be used in various fields and industries, such as healthcare, education, business, social services, and public policy. The choice of evaluation methodology depends on the specific goals of the evaluation, the type and level of data required, and the resources available for conducting the evaluation.

The importance of evaluation methodologies lies in their ability to provide evidence-based insights into the performance and impact of the subject being evaluated. This information can be used to guide decision-making, policy development, program improvement, and innovation. By using evaluation methodologies, stakeholders can assess the effectiveness of their operations and make data-driven decisions to improve their outcomes.

Overall, understanding evaluation methodologies is crucial for individuals and organizations seeking to enhance their performance, effectiveness, and impact. By selecting the appropriate evaluation methodology and conducting a thorough evaluation, stakeholders can gain valuable insights and make informed decisions to improve their operations and achieve their goals.

2. Types of Evaluation Methodologies: Overview and Comparison

Evaluation methodologies can be categorized into two main types based on the type of data they collect: qualitative and quantitative. Qualitative methodologies collect non-numerical data, such as words, images, or observations, while quantitative methodologies collect numerical data that can be analyzed statistically. Here is an overview and comparison of the main differences between qualitative and quantitative evaluation methodologies:

Qualitative Evaluation Methodologies:

  • Collect non-numerical data, such as words, images, or observations.
  • Focus on exploring complex phenomena, such as attitudes, perceptions, and behaviors, and understanding the meaning and context behind them.
  • Use techniques such as interviews, observations, case studies, and focus groups to collect data.
  • Emphasize the subjective nature of the data and the importance of the researcher’s interpretation and analysis.
  • Provide rich and detailed insights into people’s experiences and perspectives.
  • Limitations include potential bias from the researcher, limited generalizability of findings, and challenges in analyzing and synthesizing the data.

Quantitative Evaluation Methodologies:

  • Collect numerical data that can be analyzed statistically.
  • Focus on measuring specific variables and relationships between them, such as the effectiveness of an intervention or the correlation between two factors.
  • Use techniques such as surveys and experimental designs to collect data.
  • Emphasize the objectivity of the data and the importance of minimizing bias and variability.
  • Provide precise and measurable data that can be compared and analyzed statistically.
  • Limitations include potential oversimplification of complex phenomena, limited contextual information, and challenges in collecting and analyzing data.

Choosing between qualitative and quantitative evaluation methodologies depends on the specific goals of the evaluation, the type and level of data required, and the resources available for conducting the evaluation. Some evaluations may use a mixed-methods approach that combines both qualitative and quantitative data collection and analysis techniques to provide a more comprehensive understanding of the subject being evaluated.

3. Program Evaluation Methodologies

Program evaluation methodologies encompass a diverse set of approaches and techniques used to assess the effectiveness, efficiency, and impact of programs and interventions. These methodologies provide systematic frameworks for collecting, analyzing, and interpreting data to determine the extent to which program objectives are being met and to identify areas for improvement. Common program evaluation methodologies include quantitative methods such as experimental designs, quasi-experimental designs, and surveys, as well as qualitative approaches like interviews, focus groups, and case studies.

Each methodology offers unique advantages and limitations depending on the nature of the program being evaluated, the available resources, and the research questions at hand. By employing rigorous program evaluation methodologies, organizations can make informed decisions, enhance program effectiveness, and maximize the use of resources to achieve desired outcomes.


4. Qualitative Methodologies in Monitoring and Evaluation (M&E)

Qualitative methodologies are increasingly being used in monitoring and evaluation (M&E) to provide a more comprehensive understanding of the impact and effectiveness of programs and interventions. Qualitative methodologies can help to explore the underlying reasons and contexts that contribute to program outcomes and identify areas for improvement. Here are some common qualitative methodologies used in M&E:

Interviews

Interviews involve one-on-one or group discussions with stakeholders to collect data on their experiences, perspectives, and perceptions. Interviews can provide rich and detailed data on the effectiveness of a program, the factors that contribute to its success or failure, and the ways in which it can be improved.

Observations

Observations involve the systematic and objective recording of behaviors and interactions of stakeholders in a natural setting. Observations can help to identify patterns of behavior, the effectiveness of program interventions, and the ways in which they can be improved.

Document review

Document review involves the analysis of program documents, such as reports, policies, and procedures, to understand the program context, design, and implementation. Document review can help to identify gaps in program design or implementation and suggest ways in which they can be improved.

Participatory Rural Appraisal (PRA)

PRA is a participatory approach that involves working with communities to identify and analyze their own problems and challenges. It involves using participatory techniques such as mapping, focus group discussions, and transect walks to collect data on community perspectives, experiences, and priorities. PRA can help ensure that the evaluation is community-driven and culturally appropriate, and can provide valuable insights into the social and cultural factors that influence program outcomes.

Key Informant Interviews

Key informant interviews are in-depth, open-ended interviews with individuals who have expert knowledge or experience related to the program or issue being evaluated. Key informants can include program staff, community leaders, or other stakeholders. These interviews can provide valuable insights into program implementation and effectiveness, and can help identify areas for improvement.

Ethnography

Ethnography is a qualitative method that involves observing and immersing oneself in a community or culture to understand their perspectives, values, and behaviors. Ethnographic methods can include participant observation, interviews, and document analysis, among others. Ethnography can provide a more holistic understanding of program outcomes and impacts, as well as the broader social context in which the program operates.

Focus Group Discussions

Focus group discussions involve bringing together a small group of individuals to discuss a specific topic or issue related to the program. Focus group discussions can be used to gather qualitative data on program implementation, participant experiences, and program outcomes. They can also provide insights into the diversity of perspectives within a community or stakeholder group.

Photovoice

Photovoice is a qualitative method that involves using photography as a tool for community empowerment and self-expression. Participants are given cameras and asked to take photos that represent their experiences or perspectives on a program or issue. These photos can then be used to facilitate group discussions and generate qualitative data on program outcomes and impacts.

Case Studies

Case studies involve gathering detailed qualitative data through interviews, document analysis, and observation, and can provide a more in-depth understanding of a specific program component. They can be used to explore the experiences and perspectives of program participants or stakeholders and can provide insights into program outcomes and impacts.

Qualitative methodologies in M&E are useful for identifying complex and context-dependent factors that contribute to program outcomes, and for exploring stakeholder perspectives and experiences. Qualitative methodologies can provide valuable insights into the ways in which programs can be improved and can complement quantitative methodologies in providing a comprehensive understanding of program impact and effectiveness.

5. Quantitative Methodologies in Monitoring and Evaluation (M&E)

Quantitative methodologies are commonly used in monitoring and evaluation (M&E) to measure program outcomes and impact in a systematic and objective manner. Quantitative methodologies involve collecting numerical data that can be analyzed statistically to provide insights into program effectiveness, efficiency, and impact. Here are some common quantitative methodologies used in M&E:

Surveys

Surveys involve collecting data from a large number of individuals using standardized questionnaires. Surveys can provide quantitative data on people’s attitudes, opinions, behaviors, and experiences, and can help to measure program outcomes and impact.

Baseline and Endline Surveys

Baseline and endline surveys are quantitative surveys conducted at the beginning and end of a program to measure changes in knowledge, attitudes, behaviors, or other outcomes. These surveys can provide a snapshot of program impact and allow for comparisons between pre- and post-program data.

Randomized Controlled Trials (RCTs)

RCTs are a rigorous quantitative evaluation method that involves randomly assigning participants to a treatment group (which receives the program) and a control group (which does not), and comparing outcomes between the two groups. RCTs are often used to assess the impact of a program.
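To make the comparison concrete, the short sketch below contrasts a hypothetical treatment group with a control group on a single numeric outcome and applies a two-sample t-test. The scores and group sizes are invented purely for illustration.

```python
# Minimal sketch: comparing outcomes between a treatment and a control group,
# as an RCT design allows. All numbers here are invented for illustration.
from statistics import mean
from scipy import stats

treatment = [72, 68, 75, 80, 71, 77, 74, 69]   # e.g. post-test scores, program group
control   = [65, 70, 62, 68, 66, 64, 69, 63]   # e.g. post-test scores, comparison group

# Average difference attributable to the program (under randomization)
effect = mean(treatment) - mean(control)

# Two-sample t-test (Welch's version, no equal-variance assumption)
t_stat, p_value = stats.ttest_ind(treatment, control, equal_var=False)

print(f"Estimated effect: {effect:.1f} points (t = {t_stat:.2f}, p = {p_value:.3f})")
```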

Cost-Benefit Analysis

Cost-benefit analysis is a quantitative method used to assess the economic efficiency of a program or intervention. It involves comparing the costs of the program with the benefits or outcomes generated, and can help determine whether a program is cost-effective or not.
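As a rough illustration, the sketch below discounts a set of invented annual costs and benefits and computes a benefit-cost ratio and net present benefit. The cash flows and the 5% discount rate are assumptions for the example only.

```python
# Minimal sketch of a cost-benefit calculation with discounting.
# Cash flows and the discount rate are invented for illustration.

discount_rate = 0.05                      # assumed annual discount rate
costs    = [100_000, 20_000, 20_000]      # program costs in years 0, 1, 2
benefits = [0, 80_000, 90_000]            # monetized benefits in years 0, 1, 2

def present_value(flows, rate):
    """Discount a list of annual flows back to year 0."""
    return sum(f / (1 + rate) ** year for year, f in enumerate(flows))

pv_costs = present_value(costs, discount_rate)
pv_benefits = present_value(benefits, discount_rate)

bcr = pv_benefits / pv_costs              # benefit-cost ratio (> 1 suggests cost-effectiveness)
net_benefit = pv_benefits - pv_costs

print(f"Benefit-cost ratio: {bcr:.2f}, net present benefit: {net_benefit:,.0f}")
```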

Performance Indicators

Performance indicators are quantitative measures used to track progress toward program goals and objectives. These indicators can be used to assess program effectiveness, efficiency, and impact, and can provide regular feedback on program performance.
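A minimal sketch of indicator tracking is shown below; the indicator names, targets, and actual values are invented for illustration.

```python
# Minimal sketch of tracking performance indicators against targets.
# Indicator names, targets, and actuals are invented for illustration.

indicators = {
    "farmers trained":              {"target": 500,  "actual": 430},
    "hectares under new practice":  {"target": 1200, "actual": 1310},
    "extension visits completed":   {"target": 60,   "actual": 41},
}

for name, values in indicators.items():
    progress = values["actual"] / values["target"] * 100
    status = "on track" if progress >= 90 else "needs attention"
    print(f"{name}: {values['actual']}/{values['target']} "
          f"({progress:.0f}% of target, {status})")
```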

Statistical Analysis

Statistical analysis involves using quantitative data and statistical methods to analyze data gathered from various evaluation methods, such as surveys or observations. Statistical analysis can provide a more rigorous assessment of program outcomes and impacts and help identify patterns or relationships between variables.

Experimental designs

Experimental designs involve manipulating one or more variables and measuring the effects of the manipulation on the outcome of interest. Experimental designs are useful for establishing cause-and-effect relationships between variables, and can help to measure the effectiveness of program interventions.

Quantitative methodologies in M&E are useful for providing objective and measurable data on program outcomes and impact, and for identifying patterns and trends in program performance. Quantitative methodologies can provide valuable insights into the effectiveness, efficiency, and impact of programs, and can complement qualitative methodologies in providing a comprehensive understanding of program performance.

6. What are the M&E Methods?

Monitoring and Evaluation (M&E) methods encompass the tools, techniques, and processes used to assess the performance of projects, programs, or policies.

These methods are essential in determining whether the objectives are being met, understanding the impact of interventions, and guiding decision-making for future improvements. M&E methods fall into two broad categories: qualitative and quantitative, often used in combination for a comprehensive evaluation.

7. Choosing the Right Evaluation Methodology: Factors and Criteria

Choosing the right evaluation methodology is essential for conducting an effective and meaningful evaluation. Here are some factors and criteria to consider when selecting an appropriate evaluation methodology:

  • Evaluation goals and objectives: The evaluation goals and objectives should guide the selection of an appropriate methodology. For example, if the goal is to explore stakeholders’ perspectives and experiences, qualitative methodologies such as interviews or focus groups may be more appropriate. If the goal is to measure program outcomes and impact, quantitative methodologies such as surveys or experimental designs may be more appropriate.
  • Type of data required: The type of data required for the evaluation should also guide the selection of the methodology. Qualitative methodologies collect non-numerical data, such as words, images, or observations, while quantitative methodologies collect numerical data that can be analyzed statistically. The type of data required will depend on the evaluation goals and objectives.
  • Resources available: The resources available, such as time, budget, and expertise, can also influence the selection of an appropriate methodology. Some methodologies may require more resources, such as specialized expertise or equipment, while others may be more cost-effective and easier to implement.
  • Accessibility of the subject being evaluated: The accessibility of the subject being evaluated, such as the availability of stakeholders or data, can also influence the selection of an appropriate methodology. For example, if stakeholders are geographically dispersed, remote data collection methods such as online surveys or video conferencing may be more appropriate.
  • Ethical considerations: Ethical considerations, such as ensuring the privacy and confidentiality of stakeholders, should also be taken into account when selecting an appropriate methodology. Some methodologies, such as interviews or focus groups, may require more attention to ethical considerations than others.

Overall, choosing the right evaluation methodology depends on a variety of factors and criteria, including the evaluation goals and objectives, the type of data required, the resources available, the accessibility of the subject being evaluated, and ethical considerations. Selecting an appropriate methodology can ensure that the evaluation is effective, meaningful, and provides valuable insights into program performance and impact.

8. Our Conclusion on Evaluation Methodologies

It’s worth noting that many evaluation methodologies use a combination of quantitative and qualitative methods to provide a more comprehensive understanding of program outcomes and impacts. Both qualitative and quantitative methodologies are essential in providing insights into program performance and effectiveness.

Qualitative methodologies focus on gathering data on the experiences, perspectives, and attitudes of individuals or communities involved in a program, providing a deeper understanding of the social and cultural factors that influence program outcomes. In contrast, quantitative methodologies focus on collecting numerical data on program performance and impact, providing more rigorous evidence of program effectiveness and efficiency.

Each methodology has its strengths and limitations, and a combination of both qualitative and quantitative approaches is often the most effective in providing a comprehensive understanding of program outcomes and impact. When designing an M&E plan, it is crucial to consider the program’s objectives, context, and stakeholders to select the most appropriate methodologies.

Overall, effective M&E practices require a systematic and continuous approach to data collection, analysis, and reporting. With the right combination of qualitative and quantitative methodologies, M&E can provide valuable insights into program performance, progress, and impact, enabling informed decision-making and resource allocation, ultimately leading to more successful and impactful programs.

Qualitative Methods in Monitoring and Evaluation: Thoughts Considering the Project Cycle


As we monitor and evaluate projects, we use many different kinds of qualitative methods, and each of these methods gives us different kinds of data.  Depending on our evaluation statement of work or performance monitoring plan, we use different methods on particular occasions to elicit certain kinds of data.

As we craft our qualitative or mixed method evaluation designs, we should consider what qualitative methods we would use, and what kind of data those methods would give us.  Evaluators have a large toolkit of qualitative methods, and we use each of these methods under different circumstances to gather different kinds of data.  As Nightengale and Rossman (2010) explain, we need to decide what our unit of analysis will be; the number of sites that we will use; how we will choose those sites; what data we need; and what method will give us that data.  We also need to consider Bamberger, Rugh, and Mabry's (2012) constraints of time, budget, data, and politics as we plan our qualitative research and evaluations. We should pay special attention to ethical considerations, as qualitative researchers tend to spend a lot of time with informants, gathering sensitive data in the process.

Let’s consider the use of several qualitative methods through the project cycle, from planning, to implementation, to project conclusion.

Planning

As we are planning our project, if we are lucky, a donor will give us money to carry out a needs assessment. A quantitative needs assessment, perhaps even using already existing data, might tell us literacy rates or hospitalization rates, for example. This kind of data can be important for our project, depending on its scope, objectives, and activities.

A qualitative needs assessment might give us more of a disaggregated perspective of literacy or health issues that takes into account emic perspectives.  Observation might give us a picture of what is happening in the project setting.  Participant observation might give us more of an emic understanding of what is happening, especially if we are allowed into the backstage where the observer effect is no longer as evident.  At this stage, key informant interviews might give us some possible project parameters, and this might be of particular importance if there are gatekeepers in the community who could help or hinder a project and its activities.  Participatory tools like seasonal calendars might help us to understand the emic needs of the community, and the different local events or micropolitics issues that might impact project implementation and beneficiary access. 

Understanding the needs of the community is an important process, and with emic data we can construct projects and activities and set indicators that are culturally appropriate. 

Another aspect here is baseline data collection. We sometimes collect this as we are planning our project, and we sometimes collect it just before we start our activities. Collecting baseline data may be important if we want to be able to show outcomes or conduct an impact assessment after the conclusion of our project. If we want to show the impact of our project, or the changes in people’s attitudes, behaviors, or competencies, then we may need a baseline to compare to. Depending on our project, we might use a census table or a structured interview schedule to collect baseline data during the planning phase of a project.

Implementation

We incorporate qualitative data into our monitoring efforts and formative evaluations so that we can improve project activities. We adapt and learn from our project’s implementation when we carry out formative evaluations.

Qualitative methods that monitor progress are particularly important during the implementation phase of a project.  Using qualitative data to monitor projects gives us insight into a project’s activities as they are being implemented.  This can be more helpful to us than quantitative data, such as “number of people trained.”  Indeed, one of the most common uses of qualitative data is to help explain or add perspective to quantitative data.  We can use qualitative data to tweak or change direction of our programming, especially if we are not hitting our intended objectives or making progress towards our indicators.

We use observation to see what is happening in our project, who is participating, and who is not participating.  We use participant observation and key informant interviews to understand what is happening in our project as it is being implemented.   Focus groups and participatory tools are also important for us so that we can get a wider perspective of project activities and outputs.  

Outcomes and Impact

Showing causation between the baseline and outcome data is something to consider in the design of an impact evaluation. Without that baseline data, we might not be in a position to show our project’s impact, so we need to think about collecting baseline data during the planning or at the start of the implementation phase if we want to show this later on.

As above, observation and participant observation allow us to observe and understand change that has or has not taken place in society as a result of our program.  Key informant interviews and focus groups give us insight into the change, or lack thereof.

Concluding Thoughts

While our evaluation designs need to be solid, we also need the knowledge to implement those designs within particular historical, cultural, and linguistic settings. Our designs will only take us so far; we as evaluators need training and expertise to use qualitative methods in culturally appropriate ways.

References: Michael Bamberger, Jim Rugh, and Linda Mabry, Real World Evaluation: Working Under Budget, Time, Data, and Political Constraints, Thousand Oaks: SAGE, 2012. Demetra Smith Nightengale and Shelli Rossman, “Collecting Data in the Field,” in Joseph Wholey, Harry Hatry, and Kathryn Newcomer, eds., Handbook of Practical Program Evaluation, San Francisco: Wiley, 2010.

About the Author: Dr. Beverly Peters has more than twenty years of experience teaching, conducting qualitative research, and managing community development, microcredit, infrastructure, and democratization projects in several countries in Africa. As a consultant, Dr. Peters worked on EU and USAID funded infrastructure, education, and microcredit projects in South Africa and Mozambique. She also conceptualized and developed the proposal for Darfur Peace and Development Organization’s women’s crisis center, a center that provides physical and economic assistance to women survivors of violence in the IDP camps in Darfur. Dr. Peters has a Ph.D. from the University of Pittsburgh. Learn more about Dr. Peters.


Health and safety monitoring and measuring


The main purpose of monitoring health and safety performance is to provide information on the progress and current status of the strategies, processes and activities employed to control health and safety risks. Effective measurement not only provides information on what the levels are but also why they are at this level, so that corrective action can be taken. Managers should check by asking key questions to ensure that arrangements for health and safety risk control are in place, comply with the law as a minimum, and operate effectively. Lagging indicators are the traditional safety metrics used to indicate progress toward compliance with safety rules. A leading indicator is a measure preceding or indicating a future event used to drive and measure activities carried out to prevent and control injury. Leading indicators are focused on future safety performance and continuous improvement. Performance should be measured at each management level from directors downwards.
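As a rough illustration of the distinction, the sketch below computes one lagging measure (an injury rate) and one leading measure (inspection completion). The 200,000-hour basis is a commonly used reporting convention, and all figures are invented for illustration.

```python
# Minimal sketch distinguishing a lagging from a leading indicator.
# The 200,000-hour basis is a commonly used reporting convention;
# the figures themselves are invented for illustration.

hours_worked = 450_000
recordable_injuries = 3          # lagging: counts harm that has already occurred
planned_inspections = 24
completed_inspections = 21       # leading: measures preventive activity carried out

injury_rate = recordable_injuries * 200_000 / hours_worked   # injuries per 200,000 hours
inspection_completion = completed_inspections / planned_inspections * 100

print(f"Lagging - injury rate: {injury_rate:.2f} per 200,000 hours worked")
print(f"Leading - inspections completed: {inspection_completion:.0f}% of plan")
```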


Open access | Published: 05 September 2024

Comparative research on monitoring methods for nitrate nitrogen leaching in tea plantation soils

Shenghong Zheng, Kang Ni, Hongling Chai, Qiuyan Ning, Chen Cheng, Huajing Kang & Jianyun Ruan

Scientific Reports, volume 14, Article number: 20747 (2024)


  • Plant ecology

Great concern has long been raised about nitrate leaching from cropland because of its potential to contaminate groundwater. Here we employed two common techniques to measure nitrate leaching in tea plantation soils in subtropical China. Using a drainage lysimeter as the reference method, we investigated the suitability of estimating drainage and nitrate leaching by combining the water balance equation with the suction cup technique. Results showed that the final cumulative leachate volumes for the calculated and measured methods were 721.43 mm and 729.92 mm, respectively, during the study period. However, nitrate concentration exerted a strong influence on the estimate of nitrate leaching from the suction cup-based method. The cumulative nitrate leaching loss from the lysimeter and the suction cup-based method was 47.45 kg ha−1 and 43.58 kg ha−1, respectively, when lysimeter nitrate concentrations ranged from 7 mg L−1 to 13 mg L−1, and 156.28 kg ha−1 and 79.95 kg ha−1 when lysimeter nitrate concentrations exceeded 13 mg L−1. Therefore, the suction cup-based method could be an alternative way of monitoring nitrate leaching loss when nitrate concentrations in leachate fall within the range of 7–13 mg L−1. In addition, the suction cup samplers gave lower results during strong leaching events because they failed to capture representative samples of water that leached mainly via preferential flow; it is therefore advisable to increase the sampling frequency under such conditions. The results of this experiment can serve as a reference and guide for the application of ceramic cups in monitoring the leaching of nitrogen and other nutrient ions in tea plantation soils.


Introduction

The intensive and extensive land-use activities associated with crops and animal production cause the most substantial anthropogenic source of nitrate, among which over-use of nitrogen fertilizer is one of the most contributing factors for nitrate pollution 1 , 2 . Compared with other crops, the tea plant (Camellia sinensis) requires an elevated nitrogen supply for the growth of tea shoots to enhance tea yield and quality 3 , 4 . The mean annual N application rate ranges from 281 to 745 kg ha −1 in the main tea production provinces in China. This means about 30% of the surveyed tea gardens applied excessive chemical fertilizers according to the current recommendation 5 . Meanwhile, higher N input levels increased concentrations of NO 3 − and NH 4 + in the 90–200 cm soil of the tea gardens, posing a high risk of N leaching loss in the tea gardens 6 . Thus, nitrate leaching from tea gardens should be of great concern for both scientists and producers.

In terms of monitoring methods for nitrate nitrogen leaching in agricultural soils, ceramic suction cup samplers and buried drainage lysimeters are the two most commonly employed techniques 7 . Ceramic suction cups are favored for their ease of installation and the potential for repeated sampling at the same location 8 , and they are deemed suitable for monitoring nitrate nitrogen leaching in non-structured soils 9 , 10 . However, ceramic suction cups can only assess nitrate nitrogen concentrations at specific soil depths and at particular sampling times 11 . This limitation makes it challenging to establish a comprehensive mass balance unless soil water flux is quantified simultaneously. Additionally, in coarse sandy soils characterized by low water retention and vulnerability to drought, obtaining adequate sample volumes and capturing representative pore-water samples can be problematic 10 , 12 .

On the contrary, drainage lysimeters yield both the leachate volume and the nitrate nitrogen concentration, facilitating the calculation of the nitrogen load passing below the defined soil layers. Other advantages include larger sample volumes, which enable a representative sample of the soil pore network. Nonetheless, the installation and burial of drainage lysimeters traditionally introduce considerable soil disturbance, resulting in significant deviations from the original soil’s hydraulic properties and natural attributes, including the pathways for water and solute flow 13 . Strictly speaking, this approach constitutes a comprehensive method that integrates both temporal and spatial dimensions 14 . It thereby offers a more systematic and precise assessment of nitrogen leaching losses compared to other methodologies, which often capture relatively small-scale nitrogen leaching events and provide only a momentary glimpse into nitrogen leaching patterns 15 .

Due to the advantages and limitations inherent in both ceramic suction cup extraction and drainage lysimeter methodologies, these techniques are widely applied in empirical research. Several studies have also undertaken comparative analyses of their respective monitoring performance 10 , 12 , 13 . Nevertheless, extant research often relies on the ceramic suction cup approach to estimate nitrate nitrogen leaching quantities through multiplying the nitrate nitrogen concentration within the extracted solution by the measured volume obtained from the drainage lysimeter. This practice poses constraints on the application of the ceramic suction cup method, as the calculation of soil water flux becomes the key limiting factor when drainage lysimeter equipment is unavailable. Thus, it is imperative to explore alternative methods for calculating water flux and, on this basis, to conduct a comparative analysis of the two techniques. This approach is essential for promoting the practical utility and quantitative operability of the ceramic suction cup method.

Currently, there is limited research on localized nitrate nitrogen leaching in tea plantation soils, and a lack of comparative assessments of monitoring methods. In this study, we employed two methodologies, ceramic suction cup sampler and drainage lysimeters, to concurrently monitor nitrate nitrogen leaching in tea plantation soils. We put particular emphasis on the ceramic suction cup method, combined with a water balance equation, to evaluate the accuracy and efficacy of nitrate nitrogen leaching monitoring. Our objective is to provide insights and reference points for research efforts related to nitrogen leaching in tea plantation soils.

Materials and methods

Site description

The field experiment was conducted at the Tea Research Institute of the Chinese Academy of Agricultural Sciences (TRI-CAAS) Experimental Station in Zhejiang province of China (29.74°N, 120.82°E). The experimental site has a typical subtropical monsoon climate, with a mean annual temperature of 12.6 °C and an annual total precipitation of 1200 mm yr−1. Before the experiment, tea plants (clone varieties Baiye1 and Longjing43, hereafter referred to as BY1 and LJ43) were planted in rows (1.5 m between rows and 0.33 m between plants) at a density of approximately 6000 plants ha−1 and allowed to grow for 4 years at the research site. The soil at the site was an acidic red soil developed from granite parent material, with a clay texture. Before the experiment, the surface (0–20 cm) soil properties were pH 4.47, SOC 5.71 g kg−1, TN 0.47 g kg−1, available potassium (AK) 20.42 g kg−1, and a low available phosphorus (AP) content of 1.48 g kg−1.

Experimental design

The experiment included different nitrogen (N) treatment levels, ranging from 150 kg N ha −1 to 450 kg N ha -1 , with three replicates arranged in a randomized complete block design. Urea was used as the nitrogen fertilizer, and nitrogen fertilization was divided into spring (30% of the total), summer (20% of the total), and fall (50% of the total) applications. In addition to nitrogen, each plot received a one-time application of 90 kg ha −1 phosphorus (as P 2 O 5 ), 120 kg ha −1 potassium (as K 2 O), and 1200 kg ha −1 of organic fertilizer as a basal application. The phosphorus fertilizer used was calcium superphosphate (13% P 2 O 5 ), the potassium fertilizer was potassium sulfate (50% K 2 O), and the organic fertilizer was rapeseed cake (5% N). Fertilization was conducted during the fall season using manual trenching (10–15 cm depth). The required amount of fertilizer for each plot was evenly spread in the trench, followed by soil backfilling.

Sample collection method

Lysimeter installation and water sample collection

Drainage lysimeters were installed in July 2015 so that they captured drainage from a representative transect of the production bed. This involved digging pits of 1.5 m length × 1 m width × 1 m depth in the middle of the tea plant rows. To prevent side-seepage of soil solution, each lysimeter pit was lined with a sheet of plastic before soil backfilling. Each lysimeter was paired with two 1.5-m pipes, one for air passage and another fitted with a 1.0-cm butyl rubber suction tube to allow extraction of the leachate collected at the bottom of the lysimeter by a vacuum pump. Leachate was removed bi-weekly by applying a partial vacuum (25–30 kPa) using a 10-L vacuum bottle placed in the vacuum line for each lysimeter. Leachate volume was determined gravimetrically, and subsamples were collected from each bottle for drainage and nitrate analysis. Please refer to our previous study reported by Zheng et al. 16 for detailed information on the installation of the lysimeters and the collection of water samples.

Soil solution extraction using ceramic suction cups

The soil solution extraction using the negative-pressure ceramic suction cup method involved burying ceramic suction cups at a specific soil depth and connecting them to PVC pipes. Before sampling, a vacuum pump was used to create a vacuum inside the ceramic suction cup through the PVC pipe. This vacuum allowed soil solution to be drawn into the ceramic suction cup, from which soil solution samples could then be extracted. In this experiment, ceramic suction cups were installed at a depth of 100 cm in the middle of the tea rows. Four ceramic suction cups were placed horizontally at distances of 0 cm, 25 cm, 50 cm, and 75 cm from the tea tree roots. Before rainfall events, the ceramic suction cups were subjected to a vacuum of approximately −80 kPa to collect the soil solution generated during rainfall. This sampling was conducted simultaneously with the lysimeter method throughout the experiment.

Meteorological data were automatically collected by a weather station located about 100 m from the research site, and soil moisture was monitored using soil moisture sensors as described in our previous study reported by Zheng et al. 16 . The average temperature and rainfall during the experiment are shown in Fig.  1 . It can be observed from the figure that the total rainfall for March to December 2019 and January to June 2020 was 1374.60 mm and 1095 mm, respectively. The average daily temperature fluctuated within the range of 4.97 °C to 29.18 °C, with the highest daily average temperatures occurring in July and August and the lowest temperatures often emerging in December or January. Rainfall was most abundant from June to September, while November and December experienced lower levels of rainfall.

Figure 1. Total monthly precipitation and mean daily temperature by month from March 2019 to June 2020 at the research site.

Sample analysis and data processing

After filtering the collected soil solution and leachate samples, the nitrate nitrogen (NO3−–N) concentration was determined by UV dual-wavelength spectrophotometry at wavelengths of 220 nm and 275 nm 17 , 18 .
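In this dual-wavelength approach, the absorbance at 220 nm is typically corrected for interference from dissolved organic matter measured at 275 nm; a commonly cited form of the correction (stated here as a general convention rather than the authors' exact procedure) is:

$$A_{\mathrm{NO_3^--N}} = A_{220} - 2 \times A_{275}$$

with the corrected absorbance converted to concentration against a calibration curve of nitrate standards.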

The nitrate nitrogen leaching amount (CL) from the lysimeter is calculated by multiplying the volume of each collected water sample by its nitrate nitrogen concentration, as shown in Eq. (1).
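One plausible way to write this, using the quantities defined below (the summation over sampling events and the placement of the constants are assumptions), is:

$$CL = \sum_{i=1}^{n} \frac{C_i \times V_i}{1.5 \times 1.0} \times 0.01 \quad (1)$$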

where Ci is the measured NO3−–N concentration in the water sample (kg N L−1), Vi is the volume of leachate collected per extraction, the numbers 1.5 and 1.0 represent the length and width of the lysimeter (m), and 0.01 is the conversion factor.

For the ceramic cup method, a water balance equation must first be applied to calculate the water flux over a specific time period. This flux is then multiplied by the concentration of nitrate nitrogen in the extracted solution to obtain the nitrate nitrogen leaching amount. The specific calculation process is as follows in Eq. (2).

The cumulative nitrate nitrogen leaching amount (CLs) for the ceramic cup method can be calculated as follows:
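A plausible form of this accumulation, averaging the concentrations of consecutive samplings over each interval's water flux (the exact arrangement of the constants is an assumption), is:

$$CL_s = \sum_{i=1}^{n-1} \frac{C_i + C_{i+1}}{2} \times D_i \times 0.01 \quad (2)$$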

where Ci and Ci+1 (kg N L−1) represent the average concentrations of nitrate nitrogen in the extracted soil solution at two consecutive sampling times, and n represents the total number of sampling events.

D represents the water flux over the time interval between the two sampling events, which can be calculated using the water balance equation as shown in Eq. ( 3 ).
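Written out with the terms defined below, the balance presumably takes the form:

$$D = P + I - ET_c - VR \quad (3)$$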

where P is the precipitation (mm); I is the irrigation water quantity (mm), which is not relevant in this study and is not considered in the calculations; VR is the change in soil water storage (mm); D is the leachate flux (mm); and ETc is the crop evapotranspiration (mm), calculated as ETc = Kc × ET0, where ET0 is the reference evapotranspiration calculated from meteorological data according to the FAO-56 Penman–Monteith equation 19 . The calculation of ET0 can be simplified as follows in Eq. (4).
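The FAO-56 Penman–Monteith equation referred to here is conventionally written as:

$$ET_0 = \frac{0.408\,\Delta\,(R_n - G) + \gamma\,\dfrac{900}{T + 273}\,u_2\,(e_s - e_a)}{\Delta + \gamma\,(1 + 0.34\,u_2)} \quad (4)$$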

where ET0 is the reference evapotranspiration (mm day−1), Rn is the net radiation at the crop surface (MJ m−2 day−1), G is the soil heat flux density (MJ m−2 day−1), T is the mean daily air temperature at 2 m height (°C), u2 is the wind speed at 2 m height (m s−1), es is the saturation vapor pressure (kPa), ea is the actual vapor pressure (kPa), (es − ea) is the saturation vapor pressure deficit (kPa), Δ is the slope of the vapor pressure curve (kPa °C−1), γ is the psychrometric constant (kPa °C−1), and 900 is a conversion factor.

Statistical data analysis was conducted using SPSS 22 software (SPSS Inc., New York, USA). One-way analysis of variance (ANOVA) was performed, followed by Duncan's post hoc test (p < 0.05 indicates significant differences, while p < 0.01 indicates highly significant differences). All graphs were generated using Sigmaplot 12.5 software (Systat Software Inc., Milpitas, USA).

Results and discussion

Comparison of drainage flux and leachate volume calculation

During the experimental period from March 2019 to June 2020, 22 samples were taken for both BY1 and LJ43. The drainage flux for each sampling interval was calculated using the water balance equation. Based on the results from our previous study 16 , Kc was set to 0.71 for BY1 to calculate evapotranspiration, and when the rainfall exceeded 78.02 mm, the drainage flux was fixed at the maximum value of 20.63 mm. For LJ43, Kc was set to 0.84, and when the rainfall reached or exceeded 90.98 mm, the drainage flux was fixed at the maximum value of 21.45 mm. For other rainfall levels, the drainage flux was calculated from the actual rainfall using the water balance equation. On this basis, the calculated drainage flux was compared against the equivalent water depth obtained by converting the leachate volume extracted from the lysimeter (lysimeter leachate). The equivalent water depth (mm) is calculated as the extracted water volume (L) divided by the lysimeter's area (1.5 m2 in this study). The results are shown in Fig. 2.
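A minimal sketch of this interval bookkeeping is given below, using the crop coefficients and caps quoted above. The interval rainfall, ET0, and storage-change values are invented, and the simple form drainage = rainfall − Kc·ET0 − change in storage (capped as described) is an assumption about how the calculation was organized.

```python
# Minimal sketch of the interval drainage calculation described above.
# Kc values and caps follow the text; interval data are invented for illustration.

def interval_drainage(rain_mm, et0_mm, delta_storage_mm, kc, rain_cap_mm, flux_cap_mm):
    """Water-balance drainage for one sampling interval (mm), capped for heavy rain."""
    if rain_mm > rain_cap_mm:            # heavy rainfall: drainage fixed at its maximum
        return flux_cap_mm
    etc = kc * et0_mm                    # crop evapotranspiration ETc = Kc * ET0
    return max(rain_mm - etc - delta_storage_mm, 0.0)

# BY1 parameters from the text: Kc = 0.71, cap 20.63 mm when rainfall exceeds 78.02 mm
d_by1 = interval_drainage(rain_mm=55.0, et0_mm=40.0, delta_storage_mm=5.0,
                          kc=0.71, rain_cap_mm=78.02, flux_cap_mm=20.63)

# Converting a lysimeter leachate volume (L) to equivalent water depth (mm)
leachate_litres = 28.5                   # invented example
lysimeter_area_m2 = 1.5                  # 1.5 m x 1.0 m
equivalent_depth_mm = leachate_litres / lysimeter_area_m2   # 1 L per m^2 = 1 mm

print(f"Calculated drainage: {d_by1:.2f} mm, "
      f"measured equivalent depth: {equivalent_depth_mm:.2f} mm")
```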

Figure 2. Correlation analysis of lysimeter leachate and calculated drainage (a) and comparison of cumulative leachate and cumulative calculated drainage (b) for BY1 and LJ43 during the study period.

From Fig.  2 a, it can be observed that the volume data points for both methods are distributed close to the 1:1 line, indicating that the calculated drainage flux and the lysimeter leachate volume measurements are generally in good agreement. Furthermore, the total volume sums for both methods were calculated separately (Fig.  2 b). The results indicate that the cumulative calculated drainage flux for BY1 during the experimental period was 389.21 mm, slightly higher than the total lysimeter leachate volume measured at 367.77 mm. For LJ43, the total calculated drainage flux was 332.22 mm, slightly lower than the total lysimeter leachate volume of 362.15 mm. Finally, when combining all results for BY1 and LJ43, the total calculated drainage flux and the total lysimeter leachate volume were 721.43 mm and 729.92 mm, respectively, with the former only 1.16% lower than the latter. Therefore, the application of the water balance equation for soil drainage flux calculation demonstrated high accuracy and feasibility.

Comparison of soil solution and leachate nitrate nitrogen concentrations

The nitrate nitrogen concentration of the lysimeter leachate during the experimental period was plotted on the x-axis against the nitrate nitrogen concentration of the soil solution extracted using the ceramic cup method on the y-axis. Additionally, a logarithmic transformation was applied to further analyze how the two extraction methods affected the measured nitrate nitrogen concentration. The results are shown in Fig. 3. It can be observed in Fig. 3a that when the nitrate nitrogen concentration in the lysimeter leachate is less than 7 mg L−1, all nitrate nitrogen concentrations in the soil solution extracted by the ceramic cup method are higher than those in the lysimeter leachate. As the nitrate nitrogen concentration in the lysimeter leachate increases from 7 mg L−1 to 13 mg L−1, approximately half of the soil solution samples extracted by the ceramic cup method have a higher nitrate nitrogen concentration than the lysimeter leachate, while the other half have a lower concentration. When the nitrate nitrogen concentration in the lysimeter leachate exceeds 13 mg L−1, all soil solution extracted using the ceramic cup method has a lower nitrate nitrogen concentration than the lysimeter leachate.

Figure 3. Correlation between (a) nitrate concentration from the lysimeter and the suction cup, and (b) nitrate concentration from the lysimeter and the logarithmic transformation of the ratio of lysimeter nitrate concentration to suction cup nitrate concentration.

Further analysis was conducted by taking the base-2 logarithm of the ratio of the nitrate nitrogen concentration in the lysimeter leachate to that in the soil solution extracted using the ceramic cup method. The trend of this transformed value with respect to the nitrate nitrogen concentration in the lysimeter leachate is shown in Fig. 3b. As the nitrate nitrogen concentration in the lysimeter leachate increases, the logarithmic transformation value increases from its minimum of −3.51 to 1.93. The transformed value exhibits distinct trends depending on the grouping of nitrate nitrogen concentrations in the lysimeter leachate. When the lysimeter leachate concentration is less than 7 mg L−1, the transformed value is consistently less than 0. When the lysimeter leachate concentration exceeds 13 mg L−1, the transformed value is consistently greater than 0. When the lysimeter leachate concentration falls between 7 mg L−1 and 13 mg L−1, both positive and negative values coexist.
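The sketch below reproduces this transformation on a few invented concentration pairs to show how the sign of the base-2 log ratio separates the three concentration bands discussed here.

```python
# Minimal sketch of the base-2 log ratio used to compare the two methods.
# Concentrations (mg N per litre) are invented for illustration.
import math

paired_samples = [
    # (lysimeter leachate, ceramic-cup soil solution)
    (4.2, 9.1),    # dilute leachate: cup tends to read higher  -> log2 ratio < 0
    (10.5, 9.8),   # mid range (7-13 mg/L): the two methods agree more closely
    (18.7, 7.6),   # strong leaching: cup misses preferential flow -> log2 ratio > 0
]

for lysimeter, cup in paired_samples:
    log2_ratio = math.log2(lysimeter / cup)
    band = "<7" if lysimeter < 7 else "7-13" if lysimeter <= 13 else ">13"
    print(f"lysimeter {lysimeter:5.1f} mg/L (band {band}): "
          f"log2(lysimeter/cup) = {log2_ratio:+.2f}")
```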

Comparison of nitrate nitrogen leaching between two methods

Similarly, the nitrate nitrogen leaching obtained from the lysimeter leachate (lysimeter method) was plotted on the x-axis against the nitrate nitrogen leaching obtained using the ceramic cup method combined with the water balance equation (ceramic cup method) on the y-axis. Additionally, a logarithmic transformation was applied to further analyze the impact of the two methods on the estimated nitrate nitrogen leaching. The results are shown in Fig. 4. From Fig. 4a, it can be observed that when the nitrate nitrogen concentration in the lysimeter leachate is less than 7 mg L−1, almost all nitrate nitrogen leaching calculated using the ceramic cup method is higher than that obtained by the lysimeter method. When the lysimeter leachate concentration falls between 7 mg L−1 and 13 mg L−1, more than half of the nitrate nitrogen leaching values calculated using the ceramic cup method are lower than those from the lysimeter method, while the rest are higher. When the lysimeter leachate concentration exceeds 13 mg L−1, all nitrate nitrogen leaching values calculated using the ceramic cup method are lower than those from the lysimeter method.

Figure 4. Correlation between (a) nitrate leaching from the lysimeter and the suction cup, and (b) nitrate leaching from the lysimeter and the logarithmic transformation of the ratio of lysimeter nitrate leaching to suction cup nitrate leaching.

Further analysis was conducted by taking the base-2 logarithm of the ratio of the nitrate nitrogen leaching obtained by the lysimeter method to that calculated using the ceramic cup method. The trend of this transformed value with respect to the nitrate nitrogen concentration in the lysimeter leachate is shown in Fig. 4b. The transformed value follows a trend highly similar to that of the concentration transformation described above. As the nitrate nitrogen concentration in the lysimeter leachate increases, the logarithmic transformation value increases from its minimum of −3.51 to 1. The transformed value exhibits distinct trends depending on the grouping of nitrate nitrogen concentrations in the lysimeter leachate. When the lysimeter leachate concentration is less than 7 mg L−1, the transformed value is consistently less than 0. When the lysimeter leachate concentration exceeds 13 mg L−1, the transformed value is consistently greater than 0. When the lysimeter leachate concentration falls between 7 mg L−1 and 13 mg L−1, both positive and negative values coexist.

In addition, statistical analysis was performed on the total nitrate nitrogen leaching for each concentration group. The results indicate that when the lysimeter leachate concentration was less than 7 mg L−1, the total nitrate nitrogen leaching obtained by the lysimeter method and the ceramic cup method was 22.24 kg ha−1 and 44.05 kg ha−1, respectively. When the lysimeter leachate concentration fell between 7 mg L−1 and 13 mg L−1, the totals were 47.45 kg ha−1 and 43.58 kg ha−1, respectively. When the lysimeter leachate concentration exceeded 13 mg L−1, the totals were 156.28 kg ha−1 and 79.95 kg ha−1, respectively. In summary, there were differences in the quantified nitrate nitrogen leaching losses between the two methods. Taking the lysimeter method as the standard, the ceramic cup method exhibited higher monitoring accuracy when the nitrate nitrogen concentration in the lysimeter leachate fell within the range of 7–13 mg L−1.

Effect of rainfall on the application of the water balance model

The use of ceramic cup methods to monitor nitrate nitrogen leaching in farmland requires estimation of soil water flux through modeling. This inevitably introduces uncertainties in accurately quantifying nitrate nitrogen 20 . In this study, the application of a water balance model for quantitatively calculating soil drainage volume seemed to yield slightly lower water flux results compared to the corresponding measurements obtained through the lysimeter method, especially when rainfall was low (Fig.  2 a). One possible reason for this discrepancy could be that the water balance equation typically accounts for only the saturated flow above field capacity, neglecting unsaturated flow. However, it is reported that unsaturated flow, which occurs at lower soil moisture levels, is more common in practice, especially when rainfall is low and soil moisture levels remain relatively low 21 . Therefore, it is speculated that unsaturated flow is the primary reason for the water balance model calculating lower water flux than the lysimeter measurements under these conditions.

On the other hand, under conditions of higher rainfall intensity, the water balance equation should strictly include runoff as part of the water output, with the most accurate approach being the construction of runoff tanks for direct measurement. However, this study lacked the means to measure runoff, which could have led to significant deviations in the final water flux calculations. Nevertheless, previous studies have reported that runoff typically occurs during heavy rainfall events and increases with higher rainfall amounts 22 , 23 , 24 , and that once a certain critical rainfall intensity is reached, water is lost as runoff because the soil cannot absorb and retain it, so that an eventual maximum leachate flux occurs 25 . Based on our previous study, critical rainfall amounts and maximum water leachate fluxes were determined for the tea varieties Longjing 43 and BaiYe 1, thus mitigating the significant calculation bias arising from the absence of runoff monitoring.

Effect of soil texture on the accuracy of the suction cup-based method

The lysimeter method, considered a relatively accurate technique for monitoring and quantifying soil nitrate nitrogen leaching, is often regarded as a true reflection of nitrate nitrogen leaching in soil 26 . This study indicated that when the nitrate nitrogen concentration in lysimeter leachate fell below 13 mg L−1 (especially within the range of 7–13 mg L−1), the ceramic cup method produced relatively accurate monitoring results. However, when the leachate nitrate concentration exceeded 13 mg L−1, a much lower result was obtained from the ceramic cup method than from the lysimeter method. The reason for this may lie in the soil structure. From the perspective of soil texture, this experiment was conducted in a tea plantation with relatively heavy clay, where the clay content within the top meter of soil ranged from 62.53 to 69.99% 16 . Under such soil conditions, nitrate nitrogen is likely to be transported downward through preferential flow. Preferential flow is characterized by the rapid movement of most soil water and solutes through the large and intermediate pores of the soil, bypassing the surface soil and moving downward 27 . Previous studies have found that the occurrence of preferential flow is much higher in clay soils than in sandy or loamy soils 28 , 29 , 30 , which often results in higher concentrations of nitrate nitrogen in leachate water 31 .

Ceramic cups, on the other hand, have been reported to be unsuitable for use in clayey soils because the presence of preferential flow makes it difficult for them to effectively collect water flowing through large pores, especially during heavy rainfall events 32 . Additionally, Barbee and Brown (1986) compared the performance of ceramic cups and lysimeters in monitoring chloride ions in soils with three different textures. The results showed that lysimeters generally provided higher and more stable monitoring results in loam and sandy loam soils, while ceramic cups were almost ineffective in clayey soils due to the rapid leaching and movement of water through large pores. Therefore, to some extent, ceramic cups can be considered a flawed soil solution extraction technique for clayey soils. These factors need to be considered in soil nitrate nitrogen leaching studies; especially in soil types like clay, choosing an appropriate solution extraction method is crucial for obtaining accurate data.

Conclusions

In comparison to direct measurements using lysimeters as the reference, the feasibility of the negative-pressure ceramic cup estimation method was analyzed. The results demonstrated that the total calculated drainage flux and the total measured lysimeter leachate volume were 721.43 mm and 729.92 mm, respectively, indicating that applying the water balance equation to estimate soil drainage flux is accurate and feasible. Furthermore, a comparative analysis of nitrate nitrogen concentrations in water samples collected by lysimeters and ceramic cups showed that the ceramic cup method estimated nitrogen leaching with reasonable accuracy, especially when the nitrate nitrogen concentration in lysimeter leachate fell within the range of 7–13 mg L−1. However, under conditions of intense leaching (nitrate nitrogen concentration in lysimeter leachate exceeding 13 mg L−1), there was a risk of underestimation due to the potential lack of representative samples. Therefore, it is advisable to increase the sampling frequency under such special circumstances.

Data availability

The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.

Abbreviations

BY1: Variety Baiye1

LJ43: Variety Longjing43

References

Hallberg, G. R. Nitrate in groundwater in the United States. In Nitrogen Management and Groundwater Protection (ed. Follett, R. F.) 35–74 (Elsevier, 1989).


Keeney, D. R. Sources of nitrate to ground water. CRC Crit. Rev. Environ. Contr. 16 , 257–304 (1986).

Article   CAS   Google Scholar  

Han, W. Y., Ma, L. F., Shi, Y. Z., Ruan, J. Y. & Kemmitt, S. J. Nitrogen release dynamics and transformation of slow release fertiliser products and their effects on tea yield and quality. J. Sci. Food Agric. 88 , 839–846 (2008).

Han, W., Xu, J., Wei, K., Shi, Y. & Ma, L. Estimation of N 2 O emission from tea garden soils, their adjacent vegetable garden and forest soils in eastern China. Environ. Earth Sci. 70 , 2495–2500 (2013).

Article   ADS   CAS   Google Scholar  

Ni, K. et al. Fertilization status and reduction potential in tea gardens of China. J. Plant Nutr. Fertil. 25 , 421–432 (2019).

Google Scholar  

Yan, P. et al. Tea planting affects soil acidification and nitrogen and phosphorus distribution in soil. Agric. Ecosyst. Environ. 254 , 20–25 (2018).

Wey, H., Hunkeler, D., Bischoff, W. A. & Bünemann, E. K. Field-scale monitoring of nitrate leaching in agriculture: Assessment of three methods. Environ. Monit. Assess. 194 , 1–20 (2022).

Article   Google Scholar  

Creasey, C. L. & Dreiss, S. J. Porous cup samplers: Cleaning procedures and potential sample bias from trace element contamination. Soil Sci. 145 , 93–101 (1988).

Webster, C. P., Shepherd, M. A., Goulding, K. W. T. & Lord, E. Comparisons of methods for measuring the leaching of mineral nitrogen from arable land. J. Soil Sci. 44 , 49–62 (1993).

Barbee, G. C. & Brown, K. W. Comparison between suction and free-drainage soil solution samplers. Soil Sci. 141 , 149–154 (1986).

Wolf, K. A., Pullens, J. W. & Børgesen, C. D. Optimized number of suction cups required to predict annual nitrate leaching under varying conditions in Denmark. J. Environ. Manag. 328 , 116964 (2023).

Lord, E. I. & Shepherd, M. A. Developments in the use of porous ceramic cups for measuring nitrate leaching. J. Soil Sci. 44 , 435–449 (1993).

Wang, Q. et al. Comparison of lysimeters and porous ceramic cups for measuring nitrate leaching in different soil types. New Zealand J. Agric. Res. 55 , 333–345 (2012).

Brown, S. et al. Assessing variability of soil water balance components measured at a new lysimeter facility dedicated to the study of soil ecosystem services. J. Hydrol. 603 , 127037 (2021).

Zotarelli, L., Scholberg, J. M., Dukes, M. D. & Muñoz-Carpena, R. Monitoring of nitrate leaching in sandy soils: Comparison of three methods. J. Environ. Qual. 36 , 953–962 (2007).

Article   CAS   PubMed   Google Scholar  

Zheng, S. et al. Estimation of evapotranspiration and crop coefficient of rain-fed tea plants under a subtropical climate. Agronomy 11 , 2332 (2021).

Norman, R. J., Edberg, J. C. & Stucki, J. W. Determination of nitrate in soil extracts by dual-wavelength ultraviolet spectrophotometry. Soil Sci. Soc. Am. J. 49 , 1182–1185 (1985).

Goldman, E. & Jacobs, R. Determination of Nitrates by Ultraviolet Absorption. Am. Water Works Assoc. 53 , 187–191 (1961).

Allen, R.G., Pereira, L.S., Raes, D., Smith, M. Crop Evapotranspiration-Guidelines for Computing Crop Water Requirements. FAO Irrigation and Drainage Paper 56, FAO: Rome, Italy, 1998; pp. 2–15 (1998).

Weihermüller, L. et al. In situ soil water extraction: A review. J. Environ. Qual. 36 , 1735–1748 (2007).

Article   PubMed   Google Scholar  

Hu, K., Li, B., Chen, D. & White, R. E. Estimation of water percolation and nitrogen leaching in farmland: Comparison of two models. Adv. Water Sci. 15 , 87–93 (2004).

CAS   Google Scholar  

Alizadehtazi, B., Gurian, P. L. & Montalto, F. A. Impact of successive rainfall events on the dynamic relationship between vegetation canopies, infiltration, and recharge in engineered urban green infrastructure systems. Ecohydrology 13 , e2185 (2020).

Liu, G. et al. Interactive effects of raindrop impact and groundwater seepage on soil erosion. J. Hydrol. 578 , 124066 (2019).

Wang, H. et al. Effects of rainfall intensity on groundwater recharge based on simulated rainfall experiments and a groundwater flow model. CATENA 127 , 80–91 (2015).

Wang, J., Chen, L. & Yu, Z. Modeling rainfall infiltration on hillslopes using Flux-concentration relation and time compression approximation. J. Hydrol. 557 , 243–253 (2018).

Article   ADS   Google Scholar  

Sołtysiak, M. & Rakoczy, M. An overview of the experimental research use of lysimeters. Environ. Socio-Econ. Stud. 7 , 49–56 (2019).

Beven, K. & Germann, P. Macropores and water flow in soils. Water Resour. Res. 18 , 1311–1325 (1982).

Butters, G. L., Jury, W. A. & Ernst, F. F. Field scale transport of bromide in an unsaturated soil: 1. Experimental methodology and results. Water Resour. Res. 25 , 1575–1581 (1989).

Roth, K., Jury, W. A., Flühler, H. & Attinger, W. Transport of chloride through an unsaturated field soil. Water Resour. Res. 27 , 2533–2541 (1991).

Ellsworth, T. R., Jury, W. A., Ernst, F. F. & Shouse, P. J. A three-dimensional field study of solute transport through unsaturated, layered, porous media: 1. Methodology, mass recovery, and mean transport. Water Resour. Res. 27 , 951–965 (1991).

Bronswijk, J. J. B., Hamminga, W. & Oostindie, K. Rapid nutrient leaching to groundwater and surface water in clay soil areas. Eur. J. Agron. 4 , 431–439 (1995).

Grossmann, J., Bredemeier, M. & Udluft, P. Sorption of trace elements by suction cups of aluminum-oxide, ceramic, and plastics. Z. Pflanzenernähr. Bodenkd. 153 , 359–364 (1990).

Download references

Acknowledgements

This work was financially supported by the National Key Research and Development Program of China (2022YFF0606802) and the Earmarked Fund for China Agriculture Research System (CARS-19).

Author information

Authors and Affiliations

Key Laboratory of Crop Breeding in South Zhejiang, Wenzhou Academy of Agricultural Sciences, Wenzhou, 325006, China

Shenghong Zheng, Hongling Chai & Huajing Kang

Key Laboratory of Tea Biology and Resource Utilization of Tea (Ministry of Agriculture), Tea Research Institute, Chinese Academy of Agricultural Sciences, Hangzhou, 310008, China

Shenghong Zheng, Kang Ni & Jianyun Ruan

Lishui Academy of Agricultural and Forestry Sciences, Lishui, 323000, China

Qiuyan Ning

College of Ecology, Lishui University, Lishui, 323000, China

Xihu National Agricultural Experimental Station for Soil Quality, Hangzhou, 310008, China

Jianyun Ruan

Contributions

Conceptualization: SZ and JR; writing, original draft: SZ; writing, review and editing: KN, HC and JR; formal analysis: QN and CC; resources: HK; funding acquisition: JR. All authors have read and agreed to the published version of the manuscript.

Corresponding authors

Correspondence to Huajing Kang or Jianyun Ruan .

Ethics declarations

Competing interests.

The authors declare no competing interests.

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/ .

About this article

Cite this article.

Zheng, S., Ni, K., Chai, H. et al. Comparative research on monitoring methods for nitrate nitrogen leaching in tea plantation soils. Sci Rep 14 , 20747 (2024). https://doi.org/10.1038/s41598-024-71081-3

Received: 10 July 2024

Accepted: 23 August 2024

Published: 05 September 2024

DOI: https://doi.org/10.1038/s41598-024-71081-3

Keywords

  • Drainage flux
  • Nitrate leaching
  • Water balance model
  • Suction cup sampler
  • Buried lysimeter

Identification of health and safety performance improvement measuring indicators: A literature review

  • January 2011
  • Conference: West Africa Built Environment Research (WABER) Conference
  • At: Accra, Ghana

Justus Ngala Agumba (Tshwane University of Technology), Wellington Didibhuku Thwala (Walter Sisulu University) and Theo C. Haupt (Nelson Mandela University)

Implementing Autonomous Control in the Digital-Twins-Based Internet of Robotic Things for Remote Patient Monitoring

1. Introduction

1.1. Motivation

1.2. Contribution

  • Our proposed system enables the operator to remotely monitor the physical twin's (PT's) autonomous navigation in the virtual environment (VE) and to switch to manual control when needed.
  • We developed and implemented a decision-making algorithm for autonomous navigation, together with an obstacle avoidance mechanism that uses various geometrical patterns to evade obstructions and recalculate the path.
  • We analyzed the performance of the PT's navigation and of the patient monitoring setup, and carried out a comparative analysis against existing RPM systems and virtual reality (VR) and digital twin (DT) frameworks.

2. Related Work

3. The Proposed System
3.1. Locomotion
3.2. Perception
3.3. Cognition
3.3.1. Obstacle Avoidance
3.3.2. Path Calculation
3.4. Navigation
3.5. Implementation
4. Experimental Setup and Performance Evaluation
4.1. Navigation Accuracy
4.2. Monitoring Data Quality
4.3. Comparative Analysis
5. Discussion
6. Conclusions and Future Work
Author Contributions
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest

S. No | Function/Keyword | Description
1 | Forward() | Moves forward
2 | Backward() | Moves backward
3 | Right() | Turns right
4 | Left() | Turns left
5 | nLeft() | Turns left 90°
6 | fLeft() | Turns left 45°
7 | fRight() | Turns right 45°
8 | nRight() | Turns right 90°
9 | eTurn() | Turns right 180°
10 | mForward | Moves forward 0.5 m
11 | sForward() | Moves forward 0.7 m
12 | oForward() | Moves forward 1 m
13 | eForward() | Moves forward 0.8 m
14 | mReverse() | Moves backward 0.5 m
15 | Stop() | Comes to a halt
16 | Wait() | Waits for 10 s
17 | TP1 | Target position 1 (19 m)
18 | TP2 | Target position 2 (15.62 m)
19 | TP3 | Target position 3 (12.24 m)
20 | TP4 | Target position 4 (8.86 m)
21 | Dist | Distance
22 | oLeft | Left obstacle distance less than 0.4 cm
23 | oRight | Right obstacle distance less than 0.4 cm
24 | oFront | Front obstacle distance less than 0.4 cm
25 | U | Up key
26 | D | Down key
27 | R | Right key
28 | L | Left key
29 | aMode | Autonomous mode
30 | mMode | Manual mode
31 | UC | User command
32 | oAvoid | Obstacle avoidance
33 | SP | Starting point
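
The command set above suggests a simple rule-based decision layer for obstacle avoidance. The sketch below is an illustrative reconstruction only: it assumes boolean obstacle flags (oFront, oLeft, oRight) and reuses the motion command names from the table, but the decision rules themselves are assumptions, not the authors' published algorithm, whose geometrical avoidance patterns are not reproduced here.

```python
# Illustrative sketch of one rule-based obstacle avoidance step, using the
# command names listed in the table above. The decision rules are assumptions
# for illustration, not the authors' algorithm.

def avoidance_step(o_front: bool, o_left: bool, o_right: bool) -> str:
    """Return the next motion command given which sides report an obstacle."""
    if not o_front:
        return "Forward()"   # path ahead is clear
    if not o_right:
        return "fRight()"    # sidestep 45 degrees to the right
    if not o_left:
        return "fLeft()"     # sidestep 45 degrees to the left
    return "eTurn()"         # boxed in: turn 180 degrees and replan

# Example: front and right blocked, left clear -> fLeft()
print(avoidance_step(o_front=True, o_left=False, o_right=True))
```
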
Tasks | Target Position | Obstacles | D (m) | MD (m) | Error | Accuracy
Task 1 | TP1 | — | 38 | 37.21 | 2.08% | 97.92%
Task 2 | TP2 | — | 31.24 | 30.69 | 1.76% | 98.24%
Task 3 | TP3 | — | 24.48 | 24.08 | 1.63% | 98.37%
Task 4 | TP4 | — | 17.72 | 17.45 | 1.52% | 98.48%

Tasks | Target Position | Obstacles | Obstacle Status | Obstacle Positions | D (m) | MD (m) | Error | Accuracy
Task 5 | TP1 | 1 | Static | TP3 (12.24 m), front | 38 | 37.05 | 2.50% | 97.50%
Task 6 | TP1 | 2 | Static | TP, front and left | 38 | 37.05 | 2.50% | 97.50%
Task 7 | TP1 | 2 | Static | TP, front and right | 38 | 37.04 | 2.53% | 97.47%
Task 8 | TP1 | 3 | Static | TP, front, right and left | 38 | 37 | 2.64% | 97.37%
Task 9 | TP1 | 1 | Moving | TP, front | 38 | 37.04 | 2.53% | 97.47%
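
The error and accuracy figures in these tables are consistent with a simple relative error between the planned travel distance D and the measured distance MD. The check below reproduces Task 1; it is a sketch of the apparent calculation, not code from the paper.

```python
# Sketch: reproduce the Task 1 error/accuracy figures from the planned (D)
# and measured (MD) travel distances. The formula is inferred from the table.
planned_d = 38.0    # m, round trip to TP1
measured_d = 37.21  # m

error = (planned_d - measured_d) / planned_d
accuracy = 1 - error
print(f"error = {error:.2%}, accuracy = {accuracy:.2%}")  # ~2.08%, ~97.92%
```
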
Tasks | ME (cm) | SD
Task 1 | 79 | 13.93
Task 2 | 55 | 7.31
Task 3 | 40 | 7.71
Task 4 | 27 | 8.18

Tasks | ME (cm) | SD
Task 5 | 95 | 14.18
Task 6 | 95.5 | 14.2
Task 7 | 95.9 | 14.19
Task 8 | 100.3 | 16.77
Task 9 | 95.9 | 14.14
 | F | df | p-Value
Errors | 53.19 | 3 | 0.012

 | F | df | p-Value
Errors | 0.22 | 4 | 0.985
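
The F, df and p-values above are consistent with a one-way analysis of variance across the per-task error samples (df = 3 corresponds to four task groups, df = 4 to five). A generic way to run such a test is sketched below; the sample arrays are placeholders, not the study's raw data.

```python
# Sketch: one-way ANOVA across task error samples, of the kind summarized in
# the F/df/p tables above. The error samples here are hypothetical placeholders.
from scipy import stats

task1_errors = [76, 81, 79, 83, 74]   # cm, hypothetical repeated runs
task2_errors = [54, 57, 52, 58, 54]
task3_errors = [39, 42, 38, 43, 38]
task4_errors = [26, 29, 25, 30, 25]

f_stat, p_value = stats.f_oneway(task1_errors, task2_errors,
                                 task3_errors, task4_errors)
print(f"F = {f_stat:.2f}, p = {p_value:.3f}")
```
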
Tasks | Sensors | Accuracy | Completeness | Timeliness
Task 1 | Heartbeat | 0.977 | 0.984 | 0.967
Task 1 | Oxygen | 0.986 | 0.981 | 0.967
Task 1 | Temperature | 0.981 | 0.981 | 0.967
Task 4 | Heartbeat | 0.978 | 0.985 | 0.912
Task 4 | Oxygen | 0.987 | 0.986 | 0.912
Task 4 | Temperature | 0.984 | 0.982 | 0.912
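
Accuracy, completeness and timeliness are generic data-quality dimensions, and one common way to compute them from a sensor log is sketched below. The definitions used here (share of readings within tolerance, share non-missing, share arriving before a deadline) and all example values are illustrative assumptions, since the paper's exact formulas are not reproduced in this excerpt.

```python
# Sketch: simple data-quality ratios for a stream of sensor readings.
# The formulas and example data are illustrative assumptions, not the
# paper's exact definitions.

def quality_metrics(readings, references, arrival_delays_s,
                    tolerance=2.0, deadline_s=1.0):
    total = len(readings)
    non_missing = [(r, ref) for r, ref in zip(readings, references) if r is not None]
    completeness = len(non_missing) / total
    accurate = sum(1 for r, ref in non_missing if abs(r - ref) <= tolerance)
    accuracy = accurate / len(non_missing) if non_missing else 0.0
    timeliness = sum(1 for d in arrival_delays_s if d <= deadline_s) / total
    return accuracy, completeness, timeliness

# Hypothetical heartbeat readings (bpm) vs. a reference device, with delays.
acc, comp, time_ok = quality_metrics(
    readings=[72, 75, None, 70, 69],
    references=[73, 74, 71, 70, 72],
    arrival_delays_s=[0.4, 0.6, 0.5, 1.3, 0.7],
)
print(f"accuracy={acc:.3f}, completeness={comp:.3f}, timeliness={time_ok:.3f}")
```
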
Systems | Technologies | Connectivity | Interface | Mode | Services | Evaluation Protocol | Navigation Accuracy | Comparative Analysis
[ ] | Robotics | Wi-Fi, Bluetooth | Desktop-based GUI | Manual | RPM | | |
[ ] | Robotics | Bluetooth, Wi-Fi | Web-based | Autonomous | RPM, detecting patients lying on floor | | |
[ ] | Robotics | Wi-Fi | Web-based | Autonomous | Detecting infected patients, surface disinfection | | 85.5% |
[ ] | IoRT | Wi-Fi, IEEE 802.22, Ethernet | Telegram bot interface | Autonomous | Elderly people monitoring | | |
[ ] | Robotics, IoT | Wi-Fi | Mobile application | Autonomous | RPM | | |
[ ] | Robotics | ZigBee | GUI | Autonomous | RPM, gait cycle assistance | | |
[ ] | Robotics | Bluetooth, Internet | Android application | Autonomous, Manual | RPM, medicine delivery, waste collection | | |
Proposed System | DTs-based IoRT | Bluetooth, NRF24L01+ | Desktop-based VR | Autonomous, Manual | RPM | | 97.81% |
Systems | Accuracy | Completeness | Timeliness | Comparative Analysis
[ ] | | | |
[ ] | | | |
[ ] | | | |
[ ] | | | |
[ ] | | | |
[ ] | 0.970 | | |
[ ] | | | |
Proposed System | 0.982 | 0.983 | 0.940 |
Papers | Systems | Technologies | Description | Applications | Autonomous Operation | External Sensors Connectivity | Comparative Analysis
[ ] | VR system | VR | Controlling mobile robot | Task inspection | | |
[ ] | Robotic hand exoskeleton | VR, DT | Observing virtual objects | Rehabilitation | | |
[ ] | WareVR | VR, DT | Monitoring autonomous robot | Transporting stock in a warehouse | | |
[ ] | Robotic arm | VR, DT | Teleoperation | Conducting laboratory tests | | |
[ ] | DTs-based robotic system | VR, DT | Controlling a FANUC robot | Industrial processes | | |
[ ] | DTs-based robotic system | VR, DT | Controlling industrial manufacturing robots | Industrial processes | | |
[ ] | Robotic arm | VR, DT | To perform remote surgery | Medical purpose | | |
Proposed System | DTs-based IoRT | DTs, IoRT, VR | Control and monitor autonomous robot | RPM | | |
The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

Khan, S.; Ullah, S.; Ullah, K.; Almutairi, S.; Aftan, S. Implementing Autonomous Control in the Digital-Twins-Based Internet of Robotic Things for Remote Patient Monitoring. Sensors 2024 , 24 , 5840. https://doi.org/10.3390/s24175840

COMMENTS

  1. How to Write a Monitoring and Evaluation Report

  2. (PDF) Principles and Practice of Monitoring and Evaluation: A

  3. Perspectives on Monitoring and Evaluation

    American Journal of Evaluation. ... Monitoring and Evaluation Training: A Systematic Approach. Thousand Oaks, CA: Sage. 464 pp. $69 (paperback), ISBN 9781452288918.

  4. Performance Measurement and Performance Indicators: A Literature Review

    Darlene Russ-Eft, PhD, is professor and Discipline Liaison of Adult Education and Higher Education Leadership in the College of Education at Oregon State University.Her research, books, and articles focus on program evaluation, workplace learning and development, and competency research. She was president of the Academy of Human Resource Development; a former director of the International ...

  5. PDF Basic Principles of Monitoring and Evaluation

    Monitoring and evaluation for organizational learning, decision-making and accountability. Setting up a performance monitoring system for youth employment programmes therefore requires clarifying programme objectives, identifying performance indicators, setting the baseline and targets, monitoring results, and reporting.

  6. A Survey of Data Quality Measurement and Monitoring Tools

    The aim of this survey is to observe not only the functionalities of current DQ tools in terms of data profiling and measurement, but also in terms of true DQ monitoring. Pushkarev et al. (2010) and a follow-up study (Pulla et al., 2016) point out that none of the tools observed had any monitoring functionality.

  7. Introduction to Monitoring and Evaluation: The Basics

    Designing a Monitoring and Evaluation Plan: Steps and Strategies. Designing a monitoring and evaluation (M&E) plan involves several steps and strategies to ensure that the plan is effective in measuring program performance, identifying areas for improvement, and making evidence-based decisions. Here are some of the key steps and strategies: ...

  8. Monitoring and Evaluation: Tools, Methods and Approaches

  9. PDF Measuring and Monitoring progress towards the Sustainable ...

    MEASURING AND MONITORING SDGs

  10. The Importance of Monitoring and Evaluation for Decision-Making

    As Fig. 4.3 shows, evaluators often use monitoring reports to inform their analysis. These reports are prepared by two sources: external consultants or the program/project actors. If an independent external monitor is hired, then the monitor will also inform their work with the ongoing monitoring reports and any information related to other monitoring-related activities completed by the ...

  11. Basic Measurement and Monitoring Techniques

    Basic Measurement and Monitoring Techniques. March 2021. DOI: 10.1007/978-981-33-4665-9_14. In book: Fiber Optic Communications (pp. 577–622). Author: Gerd Keiser.

  12. Measurement and monitoring of safety: impact and challenges of putting

    Background. The measurement and monitoring of safety continues to be a priority for all healthcare systems. While the extent of serious harm from healthcare, and in particular the mortality from unsafe care, is much debated, there is little doubt that care is often unreliable and sometimes harmful.1-3 To make healthcare safer, organisations need to continually measure harm and reliability to ...

  13. Measuring best practices for workplace safety, health and wellbeing

  14. PDF Using Mixed Methods in Monitoring and Evaluation

    ... evaluation methods; it also explores the importance of examining "process" in addition to "impact". This paper, a product of the Poverty and Inequality Team, Development Research Group, is part of a larger effort in the department to integrate qualitative and quantitative methods for monitoring and evaluation. Policy Research Working ...

  15. Understanding Evaluation Methodologies: M&E Methods and Techniques for

  16. Qualitative Methods in Monitoring and Evaluation

  17. (PDF) Framework for Monitoring and Measuring Construction Safety

    A mixed-methods research approach was used to (1) clearly identify and define elements of the safety management process that can be measured and monitored during the construction phase, (2) describe ...

  18. PDF Methodologies for data collection and analysis for monitoring and

  19. The path toward successful safety performance measurement

    4.2. Measurement of research variables. Measurement of the research variables in this study utilized a recently developed safety performance measurement maturity model (Jääskeläinen, Tappura, & Pirhonen, 2019). The companies participating in this study were already involved in the testing of the developed measurement instrument and confirmed its applicability in their firm context.

  20. PDF Element 4

    Element 4: Health and safety monitoring and measuring. Learning outcomes and assessment criteria: the learner should be able to take part in incident investigations; explain why and how incidents should be investigated, recorded and reported; and help their employer to check their management system effectiveness through monitoring, audits and reviews.

  21. Health and safety monitoring and measuring

    ABSTRACT. The main purpose of monitoring health and safety performance is to provide information on the progress and current status of the strategies, processes and activities employed to control health and safety risks. Effective measurement not only provides information on what the levels are but also why they are at this level, so that ...

  22. Application of PS-InSAR and Diagnostic Train Measurement Techniques for

    As an auxiliary monitoring approach for transportation infrastructures, Interferometric Synthetic Aperture Radar (InSAR) methods, which are capable of measuring ground displacement at the millimeter level and detecting track settlement or subsidence across whole railway networks [3,4], have gained popularity in recent times owing to ...

  23. Comparative research on monitoring methods for nitrate nitrogen

    Here we employed two common techniques to measure nitrate leaching in tea plantation soils in subtropical China. ... Comparative research on monitoring methods for nitrate nitrogen leaching in tea ...

  24. Identification of health and safety performance improvement measuring

    The research concludes that if significant improvements in the delivery of construction projects are to be attained, project management techniques need to be used adequately regardless of the type ...

  25. Sensors

    The noninvasive measurement and sensing of vital bio signs, such as respiration and cardiopulmonary parameters, has become an essential part of the evaluation of a patient's physiological condition. The demand for new technologies that facilitate remote and noninvasive techniques for such measurements continues to grow. While previous research has made strides in the continuous monitoring of ...

  26. Sensors

    In conventional patient monitoring methods, medical personnel keep manual records and continuously monitor patients' health. Hospitals have limited resources; thus, manually taking patients' vital signs depends on many factors, including clinical workload, staff working hours, and patient diagnosis []. Furthermore, invasive devices are used for patient monitoring, which require skin-to-skin ...

  27. Solved Preparing a brief research report on monitoring and

    Preparing a brief research report on monitoring and measuring techniques. Prepare a brief research report that:
    • critically reviews techniques for monitoring and measuring health and safety performance
    • evaluates the effectiveness of your chosen organisation's health and safety monitoring and measuring techniques
    • makes TWO recommendations for improving the monitoring and measuring ...

  28. Measuring food preference and reward: Application and cross-cultural

    Decisions about what we eat play a central role in human appetite and energy balance. Measuring food reward and its underlying components of implicit motivation (wanting) and explicit sensory pleasure (liking) is therefore important in understanding which foods are preferred in a given context and at a given moment in time. Among the different methods used to measure food reward, the Leeds ...