National Academies Press: OpenBook

Effective Experiment Design and Data Analysis in Transportation Research (2012)

Chapter 3: Examples of Effective Experiment Design and Data Analysis in Transportation Research


Examples of Effective Experiment Design and Data Analysis in Transportation Research

About this Chapter

This chapter provides a wide variety of examples of research questions. The examples demonstrate varying levels of detail with regard to experiment designs and the statistical analyses required. The number and types of examples were selected after consulting with many practitioners. The attempt was made to provide a couple of detailed examples in each of several areas of transportation practice. For each type of problem or analysis, some comments also appear about research topics in other areas that might be addressed using the same approach. Questions that were briefly introduced in Chapter 2 are addressed in considerably more depth in the context of these examples.

All the examples are organized and presented using the outline below. Where applicable, references to the two-volume primer produced under NCHRP Project 20-45 have been provided to encourage the reader to obtain more detail about calculation techniques and more technical discussion of issues.

Basic Outline for Examples

The numbered outline below is the model for the structure of all of the examples that follow.

1. Research Question/Problem Statement: A simple statement of the research question is given. For example, in the maintenance category, does crack sealant A perform better than crack sealant B?

2. Identification and Description of Variables: The dependent and independent variables are identified and described. The latter includes an indication of whether, for example, the variables are discrete or continuous.

3. Data Collection: A hypothetical scenario is presented to describe how, where, and when data should be collected.
As appropriate, reference is made to conventions or requirements for some types of data (e.g., if delay times at an intersection are being calculated before and after some treatment, the data collected need to be consistent with the requirements in the Highway Capacity Manual). Typical problems are addressed, such as sample size, the need for control groups, and so forth.

4. Specification of Analysis Technique and Data Analysis: The links between successfully framing the research question, fully describing the variables that need to be considered, and the specification of the appropriate analysis technique are highlighted in each example. References to NCHRP Project 20-45 are provided for additional detail. The appropriate types of statistical test(s) are described for the specific example.

5. Interpreting the Results: In each example, results that can be expected from the analysis are discussed in terms of what they mean from a statistical perspective (e.g., the t-test result from

a comparison of means indicates whether the mean values of two distributions can be considered to be equal with a specified degree of confidence) as well as an operational perspective (e.g., judging whether the difference is large enough to make an operational difference). In each example, the typical results and their limitations are discussed.

6. Conclusion and Discussion: This section recaps how the early steps in the process lead directly to the later ones. Comments are made regarding how changes in the early steps can affect not only the results of the analysis but also the appropriateness of the approach.

7. Applications in Other Areas of Transportation Research: Each example includes a short list of typical applications in other areas of transportation research for which the approach or analysis technique would be appropriate.

Techniques Covered in the Examples

The determination of what kinds of statistical techniques to include in the examples was made after consulting with a variety of professionals and examining responses to a survey of research-oriented practitioners. The examples are not exhaustive insofar as not every type of statistical analysis is covered. However, the attempt has been made to cover a representative sample of techniques that the practitioner is most likely to encounter in undertaking or supervising research-oriented projects.
The following techniques are introduced in one or more examples:
• Descriptive statistics
• Fitting distributions/goodness of fit (used in one example)
• Simple one- and two-sample comparison of means
• Simple comparisons of multiple means using analysis of variance (ANOVA)
• Factorial designs (also ANOVA)
• Simple comparisons of means before and after some treatment
• Complex before-and-after comparisons involving control groups
• Trend analysis
• Regression
• Logit analysis (used in one example)
• Survey design and analysis
• Simulation
• Non-parametric methods (used in one example)

Although the attempt has been made to make the examples as readable as possible, some technical terms may be unfamiliar to some readers. Detailed definitions for most applicable statistical terms are available in the glossary in NCHRP Project 20-45, Volume 2, Appendix A. Most definitions used here are consistent with those contained in NCHRP Project 20-45, which contains useful information for everyone from the beginning researcher to the most accomplished statistician.

Some variations appear in the notations used in the examples. For example, in statistical analysis an alternate hypothesis may be represented by Ha or by H1, and readers will find both notations used in this report. The examples were developed by several authors with differing backgrounds, and latitude was deliberately given to the authors to use the notations with which they are most familiar. The variations have been included purposefully to acquaint readers with the fact that the same concepts (e.g., something as simple as a mean value) may be noted in various ways by different authors or analysts. Finally, the more widely used techniques, such as analysis of variance (ANOVA), are applied in more than one example.
Readers interested in ANOVA are encouraged to read all the ANOVA examples as each example presents different aspects of or perspectives on the approach, and computational techniques presented in one example may not be repeated in later examples (although a citation typically is provided).
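Several of the techniques listed above reduce to a comparison of two sample means. As a minimal illustration of the mechanics (the sealant scores below are invented for this sketch and do not come from any example in this report), a two-sample t statistic for the crack sealant question posed in the outline could be computed as follows:

```python
import math
from statistics import mean, variance

# Hypothetical performance scores for two crack sealants (invented data,
# used only to illustrate the mechanics of a two-sample comparison).
sealant_a = [72, 75, 78, 71, 74, 77, 73, 76]
sealant_b = [68, 70, 73, 66, 69, 72, 67, 71]

# Welch's t statistic: the difference in sample means scaled by its
# estimated standard error (sample variances are used, not pooled).
na, nb = len(sealant_a), len(sealant_b)
var_a, var_b = variance(sealant_a), variance(sealant_b)
t = (mean(sealant_a) - mean(sealant_b)) / math.sqrt(var_a / na + var_b / nb)
print(round(t, 2))
```

In practice, the computed t value would then be compared against a tabulated critical value (or converted to a p-value) at a chosen confidence level; those details are covered in the examples that follow and in NCHRP Project 20-45, Volume 2.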

Areas Covered in the Examples

Transportation research is very broad, encompassing many fields. Based on consultation with many research-oriented professionals and a survey of practitioners, key areas of research were identified. Although these areas overlap considerably, explicit examples in the following areas are included:
• Construction
• Environment
• Lab testing and instrumentation
• Maintenance
• Materials
• Pavements
• Public transportation
• Structures/bridges
• Traffic operations
• Traffic safety
• Transportation planning
• Work zones

The 21 examples provided on the following pages begin with the most straightforward analytical approaches (i.e., descriptive statistics) and progress to more sophisticated approaches. Table 1 lists the examples along with the area of research and method of analysis for each example.

Example 1: Structures/Bridges; Descriptive Statistics

Area: Structures/bridges
Method of Analysis: Descriptive statistics (exploring and presenting data to describe existing conditions and develop a basis for further analysis)

1. Research Question/Problem Statement: An engineer for a state agency wants to determine the functional and structural condition of a select number of highway bridges located across the state. Data are obtained for 100 bridges scheduled for routine inspection. The data will be used to develop bridge rehabilitation and/or replacement programs. The objective of this analysis is to provide an overview of the bridge conditions, and to present various methods to display the data in a concise and meaningful manner.

Question/Issue: Use collected data to describe existing conditions and prepare for future analysis. In this case, bridge inspection data from the state are to be studied and summarized.

2.
Identification and Description of Variables: Bridge inspection generally entails collection of numerous variables that include location information, traffic data, structural elements’ type and condition, and functional characteristics. In this example, the variables are: bridge condition ratings of the deck, superstructure, and substructure; and overall condition of the bridge. Based on the severity of deterioration and the extent of spread through a bridge component, a condition rating is assigned on a discrete scale from 0 (failed) to 9 (excellent). These ratings (in addition to several other factors) are used in categorization of a bridge in one of three overall conditions: not deficient; structurally deficient; or functionally obsolete.

Example 1: Structures/bridges; Descriptive statistics (exploring and presenting data to describe existing conditions)
Example 2: Public transport; Descriptive statistics (organizing and presenting data to describe a system or component)
Example 3: Environment; Descriptive statistics (organizing and presenting data to explain current conditions)
Example 4: Traffic operations; Goodness of fit (chi-square test; determining if observed/collected data fit a certain distribution)
Example 5: Construction; Simple comparisons to specified values (t-test to compare the mean value of a small sample to a standard or other requirement)
Example 6: Maintenance; Simple two-sample comparison (t-test for paired comparisons; comparing the mean values of two sets of matched data)
Example 7: Materials; Simple two-sample comparisons (t-test for paired comparisons and the F-test for comparing variances)
Example 8: Laboratory testing and/or instrumentation; Simple ANOVA (comparing the mean values of more than two samples using the F-test)
Example 9: Materials; Simple ANOVA (comparing more than two mean values and the F-test for equality of means)
Example 10: Pavements; Simple ANOVA (comparing the mean values of more than two samples using the F-test)
Example 11: Pavements; Factorial design (an ANOVA approach exploring the effects of varying more than one independent variable)
Example 12: Work zones; Simple before-and-after comparisons (exploring the effect of some treatment before it is applied versus after it is applied)
Example 13: Traffic safety; Complex before-and-after comparisons using control groups (examining the effect of some treatment or application with consideration of other factors)
Example 14: Work zones; Trend analysis (examining, describing, and modeling how something changes over time)
Example 15: Structures/bridges; Trend analysis (examining a trend over time)
Example 16: Transportation planning; Multiple regression analysis (developing and testing proposed linear models with more than one independent variable)
Example 17: Traffic operations; Regression analysis (developing a model to predict the values that a dependent variable can take as a function of one or more independent variables)
Example 18: Transportation planning; Logit and related analysis (developing predictive models when the dependent variable is dichotomous)
Example 19: Public transit; Survey design and analysis (organizing survey data for statistical analysis)
Example 20: Traffic operations; Simulation (using field data to simulate or model operations or outcomes)
Example 21: Traffic safety; Non-parametric methods (methods to be used when data do not follow assumed or conventional distributions)
Table 1. Examples provided in this report.

3. Data Collection: Data are collected at 100 scheduled locations by bridge inspectors. It is important to note that the bridge condition rating scale is based on subjective categories, and there may be inherent variability among inspectors in their assignment of ratings to bridge components. A sample of data is compiled to document the bridge condition rating of the three primary structural components and the overall condition by location and ownership (Table 2). Notice that the overall condition of a bridge is not necessarily based only on the condition rating of its components (e.g., they cannot just be added).

4. Specification of Analysis Technique and Data Analysis: The two primary variables of interest are bridge condition rating and overall condition. The overall condition of the bridge is a categorical variable with three possible values: not deficient; structurally deficient; and functionally obsolete. The frequencies of these values in the given data set are calculated and displayed in a pie chart (Figure 1). A pie chart provides a visualization of the relative proportions of bridges falling into each category that is often easier to communicate to the reader than a table showing the same information.

Another way to look at the overall bridge condition variable is by cross-tabulation of the three condition categories with the two location categories (urban and rural), as shown in Table 3. A cross-tabulation provides the joint distribution of two (or more) variables such that each cell represents the frequency of occurrence of a specific combination of possible values. For example, as seen in Table 3, there are 10 structurally deficient bridges in rural areas, which represent 11.4% of all rural area bridges inspected. The numbers in the parentheses are column percentages and add up to 100%.
Bridge No.  Owner         Location  Deck  Superstructure  Substructure  Overall Condition
1           State         Rural     8     8               8             ND*
7           Local agency  Rural     6     6               6             FO*
39          State         Urban     6     6               2             SD*
69          State park    Rural     7     5               5             SD
92          City          Urban     5     6               6             ND
*ND = not deficient; FO = functionally obsolete; SD = structurally deficient.
Table 2. Sample bridge inspection data.

Figure 1. Highway bridge conditions: structurally deficient (SD), 13%; functionally obsolete (FO), 10%; neither SD/FO, 77%.

Table 3 also shows that 88 of the bridges inspected were located in rural areas, whereas 12 were located in urban areas. The mean values of the bridge condition rating variable for deck, superstructure, and substructure are shown in Table 4. These have been calculated by taking the sum of all the values and then dividing by the total number of cases (100 in this example). Generally, a condition rating

of 4 or below indicates deficiency in a structural component. For the purpose of comparison, the mean bridge condition rating of the 13 structurally deficient bridges also is provided. Notice that while the rating scale for the bridge conditions is discrete, with values ranging from 0 (failure) to 9 (excellent), the average bridge condition variable is continuous. Therefore, an average score of 6.47 would indicate the overall condition of all bridges to be between 6 (satisfactory) and 7 (good). The combined bridge condition rating of deck, superstructure, and substructure is not defined; therefore, calculating the mean of the three components' average ratings would make no sense. Also, the average bridge condition rating of functionally obsolete bridges is not calculated because other functional characteristics also accounted for this designation. The distributions of the bridge condition ratings for deck, superstructure, and substructure are shown in Figure 2. Based on the cut-off point of 4, approximately 7% of all bridge decks, 2% of all superstructures, and 5% of all substructures are deficient.

5. Interpreting the Results: The results indicate that a majority of bridges (77%) are not structurally or functionally deficient. The inspections were carried out on bridges primarily located in rural areas (88 out of 100). The bridge condition variable may also be cross-tabulated with the ownership variable to determine the distribution by jurisdiction. The average condition ratings for the three bridge components for all bridges lie between 6 (satisfactory, some minor problems) and 7 (good, no problems noted).

6. Conclusion and Discussion: This example illustrates how to summarize and present quantitative and qualitative data on bridge conditions. It is important to understand the measurement scale of variables in order to interpret the results correctly.
Bridge inspection data collected over time may also be analyzed to determine trends in the condition of bridges in a given area. Trend analysis is addressed in Example 15 (structures).

7. Applications in Other Areas of Transportation Research: Descriptive statistics could be used to present data in other areas of transportation research, such as:
• Transportation Planning—to assess the distribution of travel times between origin-destination pairs in an urban area. Overall averages could also be calculated.
• Traffic Operations—to analyze the average delay per vehicle at a railroad crossing.

Rating Category                                                                     Mean Value
Overall average bridge condition rating (deck)                                      6.20
Overall average bridge condition rating (superstructure)                            6.47
Overall average bridge condition rating (substructure)                              6.08
Average bridge condition rating of structurally deficient bridges (deck)            4.92
Average bridge condition rating of structurally deficient bridges (superstructure)  5.30
Average bridge condition rating of structurally deficient bridges (substructure)    4.54
Table 4. Bridge condition ratings.

                        Rural        Urban       Total
Structurally deficient  10 (11.4%)   3 (25.0%)   13
Functionally obsolete    6 (6.8%)    4 (33.3%)   10
Not deficient           72 (81.8%)   5 (41.7%)   77
Total                   88 (100%)   12 (100%)   100
Table 3. Cross-tabulation of bridge condition by location.
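A cross-tabulation like Table 3 can be built with a few lines of code. The sketch below (plain Python, no statistics package assumed) expands the cell counts transcribed from Table 3 into one record per bridge and recomputes the cell frequencies and column percentages:

```python
from collections import Counter

# Cell counts transcribed from Table 3 (location by overall condition).
cell_counts = {
    ("Rural", "SD"): 10, ("Rural", "FO"): 6, ("Rural", "ND"): 72,
    ("Urban", "SD"): 3,  ("Urban", "FO"): 4, ("Urban", "ND"): 5,
}

# Expand to one record per inspected bridge, then cross-tabulate.
records = [cell for cell, n in cell_counts.items() for _ in range(n)]
crosstab = Counter(records)
column_totals = Counter(location for location, _ in records)

# Column percentages: each cell as a share of its location's total.
for (location, condition), n in sorted(crosstab.items()):
    share = 100 * n / column_totals[location]
    print(f"{location:5s} {condition}: {n:2d} ({share:.1f}%)")
```

The printed shares match Table 3; for example, the 10 structurally deficient rural bridges are 11.4% of the 88 rural bridges inspected.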

• Traffic Operations/Safety—to examine the frequency of turning violations at driveways with various turning restrictions.
• Work Zones, Environment—to assess the average energy consumption during various stages of construction.

Figure 2. Bridge condition ratings: distribution of condition ratings (9 down to 0), as a percentage of structures, for deck, superstructure, and substructure.

Example 2: Public Transport; Descriptive Statistics

Area: Public transport
Method of Analysis: Descriptive statistics (organizing and presenting data to describe a system or component)

1. Research Question/Problem Statement: The manager of a transit agency would like to present information to the board of commissioners on changes in revenue that resulted from a change in the fare. The transit system provides three basic types of service: local bus routes, express bus routes, and demand-responsive bus service. There are 15 local bus routes, 10 express routes, and 1 demand-responsive system.

Question/Issue: Use data to describe some change over time. In this instance, data from 2008 and 2009 are used to describe the change in revenue on each route/part of a transit system when the fare structure was changed from variable (per mile) to fixed fares.

2. Identification and Description of Variables: Revenue data are available for each route on the local and express bus system and the demand-responsive system as a whole for the years 2008 and 2009.

3. Data Collection: Revenue data were collected on each route for both 2008 and 2009. The annual revenue for the demand-responsive system was also collected. These data are shown in Table 5.

4. Specification of Analysis Technique and Data Analysis: The objective of this analysis is to present the impact of changing the fare system in a series of graphs.
The presentation is intended to show the impact on each component of the transit system as well as the impact on overall system revenue. The impact of the fare change on the overall revenue is best shown with a bar graph (Figure 3). The variation in the impact across system components can be illustrated in a similar graph (Figure 4). A pie chart also can be used to illustrate the relative impact on each system component (Figure 5).

Bus Route                  2008 Revenue   2009 Revenue
Local Route 1              $350,500       $365,700
Local Route 2              $263,000       $271,500
Local Route 3              $450,800       $460,700
Local Route 4              $294,300       $306,400
Local Route 5              $173,900       $184,600
Local Route 6              $367,800       $375,100
Local Route 7              $415,800       $430,300
Local Route 8              $145,600       $149,100
Local Route 9              $248,200       $260,800
Local Route 10             $310,400       $318,300
Local Route 11             $444,300       $459,200
Local Route 12             $208,400       $205,600
Local Route 13             $407,600       $412,400
Local Route 14             $161,500       $169,300
Local Route 15             $325,100       $340,200
Express Route 1            $85,400        $83,600
Express Route 2            $110,300       $109,200
Express Route 3            $65,800        $66,200
Express Route 4            $125,300       $127,600
Express Route 5            $90,800        $90,400
Express Route 6            $125,800       $123,400
Express Route 7            $87,200        $86,900
Express Route 8            $68,300        $67,200
Express Route 9            $110,100       $112,300
Express Route 10           $73,200        $72,100
Demand-Responsive System   $510,100       $521,300
Table 5. Revenue by route or type of service and year.

Figure 3. Impact of fare change on overall revenue: total system revenue (million $) of 6.02 in 2008 and 6.17 in 2009.
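The summary figures shown in the graphs can be recomputed directly from Table 5. The sketch below (revenue values transcribed from the table) totals revenue by system component and reports the year-over-year percent change:

```python
# 2008 and 2009 revenues per route, transcribed from Table 5.
local_2008 = [350500, 263000, 450800, 294300, 173900, 367800, 415800,
              145600, 248200, 310400, 444300, 208400, 407600, 161500, 325100]
local_2009 = [365700, 271500, 460700, 306400, 184600, 375100, 430300,
              149100, 260800, 318300, 459200, 205600, 412400, 169300, 340200]
express_2008 = [85400, 110300, 65800, 125300, 90800, 125800, 87200,
                68300, 110100, 73200]
express_2009 = [83600, 109200, 66200, 127600, 90400, 123400, 86900,
                67200, 112300, 72100]
demand_2008, demand_2009 = 510100, 521300

def pct_change(before, after):
    """Percent change in total revenue from one year to the next."""
    return 100 * (after - before) / before

print(f"local buses:   {pct_change(sum(local_2008), sum(local_2009)):+.1f}%")
print(f"express buses: {pct_change(sum(express_2008), sum(express_2009)):+.1f}%")
print(f"demand resp.:  {pct_change(demand_2008, demand_2009):+.1f}%")

total_2008 = sum(local_2008) + sum(express_2008) + demand_2008
total_2009 = sum(local_2009) + sum(express_2009) + demand_2009
print(f"system total: ${total_2008 / 1e6:.2f}M -> ${total_2009 / 1e6:.2f}M")
```

The computed totals reproduce the $6.02 million and $6.17 million system revenues, as well as the 3.1% local increase and 0.4% express decrease discussed below.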

Figure 5. Pie charts illustrating percent of revenue from each component of a transit system: local buses 75.8% (2008) and 76.3% (2009); express buses 15.7% (2008) and 15.2% (2009); demand responsive 8.5% in both years.

Figure 4. Variation in impact of fare change across system components: revenue (million $) in 2008 and 2009 of 4.57 and 4.71 for local buses, 0.94 and 0.94 for express buses, and 0.51 and 0.52 for demand responsive.

If it is important to display the variability in the impact within the various bus routes in the local bus or express bus operations, this also can be illustrated (Figure 6). This type of diagram shows the maximum value, minimum value, and mean value of the percent increase in revenue across the 15 local bus routes and the 10 express bus routes.

5. Interpreting the Results: These results indicate that changing from a variable fare based on trip length (2008) to a fixed fare (2009) on both the local bus routes and the express bus routes had little effect on revenue. On the local bus routes, there was an average increase in revenue of 3.1%. On the express bus routes, there was an average decrease in revenue of 0.4%. These changes altered the percentage of the total system revenue attributed to the local bus routes and the express bus routes. The local bus routes generated 76.3% of the revenue in 2009, compared to 75.8% in 2008. The percentage of revenue generated by the express bus routes dropped from 15.7% to 15.2%, and the demand-responsive system generated 8.5% in both 2008 and 2009.

6. Conclusion and Discussion: The total revenue increased from $6.02 million to $6.17 million. The cost of operating a variable fare system is greater than that of operating a fixed fare system—hence, net income probably increased even more (more revenue, lower cost for fare collection), and the decision to modify the fare system seems reasonable. Notice that the entire discussion

also is based on the assumption that no other factors changed between 2008 and 2009 that might have affected total revenues. One of the implicit assumptions is that the number of riders remained relatively constant from one year to the next. If the ridership had changed, the statistics reported would have to be changed. Using the measure revenue/rider, for example, would help control (or normalize) for the variation in ridership.

7. Applications in Other Areas in Transportation Research: Descriptive statistics are widely used and can convey a great deal of information to a reader. They also can be used to present data in many areas of transportation research, including:
• Transportation Planning—to display public response frequency or percentage to various alternative designs.
• Traffic Operations—to display the frequency or percentage of crashes by route type or by the type of traffic control devices present at an intersection.
• Airport Engineering—to display the arrival pattern of passengers or flights by hour or other time period.
• Public Transit—to display the average load factor on buses by time of day.

Figure 6. Variation in revenue increase by type of bus route: percent increase in revenue with minimum, mean, and maximum values of -1.3%, 3.1%, and 6.2% for the local bus routes and -2.1%, -0.4%, and 2.0% for the express bus routes.

Example 3: Environment; Descriptive Statistics

Area: Environment
Method of Analysis: Descriptive statistics (organizing and presenting data to explain current conditions)

1. Research Question/Problem Statement: The planning and programming director in Environmental City wants to determine the current ozone concentration in the city. These data will be compared to data collected after the projects included in the Transportation Improvement Program (TIP) have been completed to determine the effects of these projects on the environment. Because the terrain, the presence of hills or tall buildings, the prevailing wind direction, and the sample station location relative to high-volume roads or industrial sites all affect the ozone level, multiple samples are required to determine the ozone concentration level in a city. For this example, air samples are obtained each weekday in the month of July (21 days) at 14 air-sampling stations in the city: 7 in the central city and 7 in the outlying areas of the city. The objective of the analysis is to determine the ozone concentration in the central city, the outlying areas of the city, and the city as a whole.

Question/Issue: Use collected data to describe existing conditions and prepare for future analysis. In this example, air pollution levels in the central city, the outlying areas, and the overall city are to be described.

2. Identification and Description of Variables: The variable to be analyzed is the 8-hour average ozone concentration in parts per million (ppm) at each of the 14 air-sampling stations. The 8-hour average concentration is the basis for the EPA standard, and July is selected because ozone levels are temperature sensitive and increase with a rise in the temperature.

3. Data Collection: Ozone concentrations in ppm are recorded for each hour of the day at each of the 14 air-sampling stations. The highest average concentration for any 8-hour period during the day is recorded and tabulated. This results in 294 concentration observations (14 stations for 21 days). Table 6 and Table 7 show the data for the seven central city locations and the seven outlying area locations.

4. Specification of Analysis Technique and Data Analysis: Much of the data used in analyzing transportation issues has year-to-year, month-to-month, day-to-day, and even hour-to-hour variations. For this reason, making only one observation, or even a few observations, may not accurately describe the phenomenon being observed. Thus, standard practice is to obtain several observations and report the mean value of all observations. In this example, the phenomenon being observed is the daily ozone concentration at a series of air-sampling locations. The statistic to be estimated is the mean value of this variable over
        Station
Day     1      2      3      4      5      6      7      ∑
1      0.079  0.084  0.081  0.083  0.088  0.086  0.089  0.590
2      0.082  0.087  0.088  0.086  0.086  0.087  0.081  0.597
3      0.080  0.081  0.077  0.072  0.084  0.083  0.081  0.558
4      0.083  0.086  0.082  0.079  0.086  0.087  0.089  0.592
5      0.082  0.087  0.080  0.075  0.090  0.089  0.085  0.588
6      0.075  0.084  0.079  0.076  0.080  0.083  0.081  0.558
7      0.078  0.079  0.080  0.074  0.078  0.080  0.075  0.544
8      0.081  0.077  0.082  0.081  0.076  0.079  0.074  0.540
9      0.088  0.084  0.083  0.085  0.083  0.083  0.088  0.594
10     0.085  0.087  0.086  0.089  0.088  0.087  0.090  0.612
11     0.079  0.082  0.082  0.089  0.091  0.089  0.090  0.602
12     0.078  0.080  0.081  0.086  0.088  0.089  0.089  0.591
13     0.081  0.079  0.077  0.083  0.084  0.085  0.087  0.576
14     0.083  0.080  0.079  0.081  0.080  0.082  0.083  0.568
15     0.084  0.083  0.080  0.085  0.082  0.086  0.085  0.585
16     0.086  0.087  0.085  0.087  0.089  0.090  0.089  0.613
17     0.082  0.085  0.083  0.090  0.087  0.088  0.089  0.604
18     0.080  0.081  0.080  0.087  0.085  0.086  0.088  0.587
19     0.080  0.083  0.077  0.083  0.085  0.084  0.087  0.579
20     0.081  0.084  0.079  0.082  0.081  0.083  0.088  0.578
21     0.082  0.084  0.080  0.081  0.082  0.083  0.085  0.577
∑      1.709  1.744  1.701  1.734  1.773  1.789  1.793  12.243
Table 6. Central city 8-hour ozone concentration samples (ppm).

the test period selected. The mean value of any data set (x̄) equals the sum of all observations in the set divided by the total number of observations in the set (n):

x̄ = (x1 + x2 + … + xn)/n = (Σ xi)/n

The variables of interest stated in the research question are the average ozone concentration for the central city, the outlying areas, and the total city. Thus, there are three data sets: the first table, the second table, and the sum of the two tables. The first data set has a sample size of 147; the second data set also has a sample size of 147, and the third data set contains 294 observations.

Using the formula just shown, the mean value of the ozone concentration in the central city is calculated as follows:

x̄ = 12.243/147 = 0.083 ppm

The mean value of the ozone concentration in the outlying areas of the city is:

x̄ = 10.553/147 = 0.072 ppm

The mean value of the ozone concentration for the entire city is:

x̄ = 22.796/294 = 0.078 ppm
        Station
Day     8      9      10     11     12     13     14     ∑
1      0.072  0.074  0.073  0.071  0.079  0.070  0.074  0.513
2      0.074  0.075  0.077  0.075  0.081  0.075  0.077  0.534
3      0.070  0.072  0.074  0.074  0.083  0.078  0.080  0.531
4      0.067  0.070  0.071  0.077  0.080  0.077  0.081  0.523
5      0.064  0.067  0.068  0.072  0.079  0.078  0.079  0.507
6      0.069  0.068  0.066  0.070  0.075  0.079  0.082  0.509
7      0.071  0.069  0.070  0.071  0.074  0.071  0.077  0.503
8      0.073  0.072  0.074  0.072  0.076  0.073  0.078  0.518
9      0.072  0.075  0.077  0.074  0.078  0.074  0.080  0.530
10     0.074  0.077  0.079  0.077  0.080  0.076  0.079  0.542
11     0.070  0.072  0.075  0.074  0.079  0.074  0.078  0.522
12     0.068  0.067  0.068  0.070  0.074  0.070  0.075  0.492
13     0.065  0.063  0.067  0.068  0.072  0.067  0.071  0.473
14     0.063  0.062  0.067  0.069  0.073  0.068  0.073  0.475
15     0.064  0.064  0.066  0.067  0.070  0.066  0.070  0.467
16     0.061  0.059  0.062  0.062  0.067  0.064  0.069  0.434
17     0.065  0.061  0.060  0.064  0.069  0.066  0.073  0.458
18     0.067  0.063  0.065  0.068  0.073  0.069  0.076  0.499
19     0.069  0.067  0.068  0.072  0.077  0.071  0.078  0.502
20     0.071  0.069  0.070  0.074  0.080  0.074  0.077  0.515
21     0.070  0.065  0.072  0.076  0.079  0.073  0.079  0.514
∑      1.439  1.431  1.409  1.497  1.598  1.513  1.606  10.553
Table 7. Outlying area 8-hour ozone concentration samples (ppm).
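The mean calculations above can be verified with a short script. The grand totals and the Station 1 column sum below are transcribed from Tables 6 and 7; each group has 7 stations with 21 daily observations, so n = 7 × 21 = 147 per group:

```python
# Grand totals transcribed from Tables 6 and 7.
central_total = 12.243   # sum of all 147 central city observations
outlying_total = 10.553  # sum of all 147 outlying area observations
n_group = 7 * 21         # 7 stations x 21 weekdays = 147 observations

# x-bar = (sum of all observations in the set) / (number of observations).
central_mean = central_total / n_group
outlying_mean = outlying_total / n_group
city_mean = (central_total + outlying_total) / (2 * n_group)

# The same formula applies to any slice, e.g., Station 1's column sum
# divided by its 21 daily observations.
station1_mean = 1.709 / 21

print(f"central city: {central_mean:.3f} ppm")
print(f"outlying areas: {outlying_mean:.3f} ppm")
print(f"entire city: {city_mean:.3f} ppm")
print(f"station 1: {station1_mean:.3f} ppm")
```

The printed values reproduce the 0.083, 0.072, and 0.078 ppm means derived above, as well as the 0.081 ppm Station 1 mean discussed below.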

Using the same equation, the mean value for each air-sampling location can be found by summing the values of the ozone concentration in the column representing that location and dividing by the 21 observations at that location. For example, considering Sample Station 1, the mean value of the ozone concentration is 1.709/21 = 0.081 ppm.

Similarly, the mean value of the ozone concentrations for any specific day can be found by summing the ozone concentration values in the row representing that day and dividing by the number of stations. For example, for Day 1, the mean value of the ozone concentration in the central city is 0.590/7 = 0.084 ppm. In the outlying areas of the city, it is 0.513/7 = 0.073 ppm, and for the entire city it is 1.103/14 = 0.079 ppm.

The highest and lowest values of the ozone concentration can be obtained by searching the two tables. The highest ozone concentration (0.091 ppm) is logged as having occurred at Station 5 on Day 11. The lowest ozone concentration (0.059 ppm) occurred at Station 9 on Day 16.

The variation by sample location can be illustrated in the form of a frequency diagram. A graph can be used to show the variation in the average ozone concentration for the seven sample stations in the central city (Figure 7). Notice that all of these calculations (and more) can be done very easily if all the data are put in a spreadsheet and various statistical functions used. Graphs and other displays also can be made within the spreadsheet.

5. Interpreting the Results: In this example, the data are not tested to determine whether they fit a known distribution or whether one average value is significantly higher or lower than another. It can only be reported that, as recorded in July, the mean ozone concentration in the central city was greater than the concentration in the outlying areas of the city.
(For testing whether the data fit a known distribution, see Example 4 on fitting distributions and goodness of fit. For comparing mean values, see Examples 5 through 7.)

Figure 7. Average ozone concentration for seven central city sampling stations (ppm).

It is known that ozone concentration varies by day and by location of the air-sampling equipment. If there is some threshold value of importance, such as the ozone concentration level considered acceptable by the EPA, these data could be used to determine the number of days that this level was exceeded, or the number of stations that recorded an ozone concentration above this threshold. This is done by comparing each day or each station with the threshold

value. It must be noted that, as presented, this example is not a statistical comparison per se (i.e., there has been no significance testing or formal statistical comparison).

6. Conclusion and Discussion: This example illustrates how to determine and present quantitative information about a data set containing values of a varying parameter. If a similar set of data were captured each month, the variation in ozone concentration could be analyzed to describe the variation over the year. Similarly, if data were captured at these same locations in July of every year, the trend in ozone concentration over time could be determined.

7. Applications in Other Areas in Transportation: These descriptive statistics techniques can be used to present data in other areas of transportation research, such as:
• Traffic Operations/Safety and Transportation Planning
– to analyze the average speed of vehicles on streets with a speed limit of 45 miles per hour (mph) in residential, commercial, and industrial areas by sampling a number of streets in each of these area types.
– to examine the average emergency vehicle response time to various areas of the city or county, by analyzing dispatch and arrival times for emergency calls to each area of interest.
• Pavement Engineering—to analyze the average number of potholes per mile on pavement as a function of the age of pavement, by sampling a number of streets where the pavement age falls in discrete categories (0 to 5 years, 5 to 10 years, 10 to 15 years, and greater than 15 years).
• Traffic Safety—to evaluate the average number of crashes per month at intersections with two-way STOP control versus four-way STOP control by sampling a number of intersections in each category over time.
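The descriptive statistics described in this example are also easy to script. The sketch below is a minimal Python illustration, assuming the outlying-area values transcribed from Table 7 (the central-city table is not reproduced here, so only the outlying-area means, maximum, and minimum are computed):

```python
# Outlying-area 8-hour ozone concentrations (ppm), Days 1-21 by Stations 8-14,
# transcribed from Table 7.
data = [
    [0.072, 0.074, 0.073, 0.071, 0.079, 0.070, 0.074],
    [0.074, 0.075, 0.077, 0.075, 0.081, 0.075, 0.077],
    [0.070, 0.072, 0.074, 0.074, 0.083, 0.078, 0.080],
    [0.067, 0.070, 0.071, 0.077, 0.080, 0.077, 0.081],
    [0.064, 0.067, 0.068, 0.072, 0.079, 0.078, 0.079],
    [0.069, 0.068, 0.066, 0.070, 0.075, 0.079, 0.082],
    [0.071, 0.069, 0.070, 0.071, 0.074, 0.071, 0.077],
    [0.073, 0.072, 0.074, 0.072, 0.076, 0.073, 0.078],
    [0.072, 0.075, 0.077, 0.074, 0.078, 0.074, 0.080],
    [0.074, 0.077, 0.079, 0.077, 0.080, 0.076, 0.079],
    [0.070, 0.072, 0.075, 0.074, 0.079, 0.074, 0.078],
    [0.068, 0.067, 0.068, 0.070, 0.074, 0.070, 0.075],
    [0.065, 0.063, 0.067, 0.068, 0.072, 0.067, 0.071],
    [0.063, 0.062, 0.067, 0.069, 0.073, 0.068, 0.073],
    [0.064, 0.064, 0.066, 0.067, 0.070, 0.066, 0.070],
    [0.061, 0.059, 0.062, 0.062, 0.067, 0.064, 0.069],
    [0.065, 0.061, 0.060, 0.064, 0.069, 0.066, 0.073],
    [0.067, 0.063, 0.065, 0.068, 0.073, 0.069, 0.076],
    [0.069, 0.067, 0.068, 0.072, 0.077, 0.071, 0.078],
    [0.071, 0.069, 0.070, 0.074, 0.080, 0.074, 0.077],
    [0.070, 0.065, 0.072, 0.076, 0.079, 0.073, 0.079],
]

# Row means (per day) and column means (per station), as in the text.
day_means = [sum(row) / len(row) for row in data]
station_means = [sum(col) / len(col) for col in zip(*data)]
lowest = min(min(row) for row in data)
highest = max(max(row) for row in data)

print(round(day_means[0], 3))  # Day 1 mean for the outlying areas -> 0.073
print(round(lowest, 3))        # lowest outlying-area reading -> 0.059
print([round(m, 3) for m in station_means])
```

The Day 1 mean (0.513/7 = 0.073 ppm) and the minimum (0.059 ppm, Station 9 on Day 16) match the values reported in the text; the same two lines of list arithmetic replace the spreadsheet formulas mentioned above.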
Example 4: Traffic Operations; Goodness of Fit

Area: Traffic operations

Method of Analysis: Goodness of fit (chi-square test; determining if observed distributions of data fit hypothesized standard distributions)

1. Research Question/Problem Statement: A research team is developing a model to estimate travel times of various types of personal travel (modes) on a path shared by bicyclists, in-line skaters, and others. One version of the model relies on the assertion that the distribution of speeds for each mode conforms to the normal distribution. (For a helpful definition of this and other statistical terms, see the glossary in NCHRP Project 20-45, Volume 2, Appendix A.) Based on a literature review, the researchers are sure that bicycle speeds are normally distributed. However, the shapes of the speed distributions for other users are unknown. Thus, the objective is to determine if skater speeds are normally distributed in this instance.

Question/Issue: Do collected data fit a specific type of probability distribution? In this example, do the speeds of in-line skaters on a shared-use path follow a normal distribution (are they normally distributed)?

2. Identification and Description of Variables: The only variable collected is the speed of in-line skaters passing through short sections of the shared-use path.

3. Data Collection: The team collects speeds using a video camera placed where most path users would not notice it. The speed of each free-flowing skater (i.e., each skater who is not closely following another path user) is calculated from the times that the skater passes two benchmarks on the path visible in the camera frame. Several days of data collection allow a large sample of 219 skaters to be measured. (An implicit assumption is made that there is no

variation in the data by day.) The data have a familiar bell shape; that is, when graphed, they look like they are normally distributed (Figure 8). Each bar in the figure shows the number of observations per 1.00-mph-wide speed bin. There are 10 observations between 6.00 mph and 6.99 mph.

4. Specification of Analysis Technique and Data Analysis: This analysis involves several preliminary steps followed by two major steps. In the preliminaries, the team calculates the mean and standard deviation from the data sample as 10.17 mph and 2.79 mph, respectively, using standard formulas described in NCHRP Project 20-45, Volume 2, Chapter 6, Section C under the heading "Frequency Distributions, Variance, Standard Deviation, Histograms, and Boxplots." Then the team forms bins of observations of sufficient size to conduct the analysis. For this analysis, the team forms bins containing at least four observations each, which means forming a bin for speeds of 5 mph and lower and a bin for speeds of 17 mph or higher. There is some argument regarding the minimum allowable cell size. Some analysts argue that the minimum is five; others argue that the cell size can be smaller. Smaller numbers of observations in a bin may distort the results. When in doubt, the analysis can be done with different assumptions regarding the cell size. The left two columns in Table 8 show the data ready for analysis.

The first major step of the analysis is to generate the theoretical normal distribution to compare to the field data. To do this, the team calculates a value of Z, the standard normal variable, for each bin i using the following equation:

Zi = (xi − µ)/σ

where xi is the speed in miles per hour (mph) corresponding to the bin, µ is the mean speed, and σ is the standard deviation of all of the observations in the speed sample in mph.
Figure 8. Distribution of observed in-line skater speeds.

For example (and with reference to the data in Table 8), for a speed of 5 mph the value of Z will be (5 − 10.17)/2.79 = −1.85, and for a speed of 6 mph the value of Z will be (6 − 10.17)/2.79 = −1.50. The team then consults a table of standard normal values (i.e., NCHRP Project 20-45, Volume 2, Appendix C, Table C-1) to convert these Z values into A values representing the area under the standard normal distribution curve. The A value for a Z of −1.85 is 0.468, while the A value for a Z of −1.50 is 0.432. The difference between these two A values, representing the area under the standard normal probability curve corresponding to the speed of 6 mph, is 0.036 (calculated 0.468 − 0.432 = 0.036). The team multiplies 0.036 by the total sample size (219) to estimate that there should be 7.78 skaters with a speed of 6 mph if the speeds follow the standard normal distribution. The team follows

a similar procedure for all speeds. Notice that the areas under the curve also can be calculated in a simple Excel spreadsheet using the "NORMDIST" function for a given x value and the average speed of 10.17 and standard deviation of 2.79. The values shown in Table 8 have been estimated using the Excel function.

The second major step of the analysis is to use the chi-square test (as described in NCHRP Project 20-45, Volume 2, Chapter 6, Section F) to determine if the theoretical normal distribution is significantly different from the actual data distribution. The team computes a chi-square value for each bin i using the formula:

χi² = (Oi − Ei)²/Ei

where Oi is the number of actual observations in bin i and Ei is the expected number of observations in bin i estimated by using the theoretical distribution. For the bin of 6 mph speeds, Oi = 10 (from the table), Ei = 7.78 (calculated), and the χi² contribution for that cell is 0.637. The sum of the χi² values for all bins is 19.519. The degrees of freedom (df) used for this application of the chi-square test are the number of bins minus 1 minus the number of parameters estimated for the distribution of interest. Given that the normal distribution has two parameters (see May, Traffic Flow Fundamentals, 1990, p. 40) and that Table 8 contains 13 bins, in this example the degrees of freedom equal 10 (calculated 13 − 1 − 2 = 10). From a standard table of chi-square values (NCHRP Project 20-45, Volume 2, Appendix C, Table C-2), the team finds that the critical value at the 95% confidence level for this case (with df = 10) is 18.3. The calculated value of the statistic is ~19.5, more than the tabular value. The results of all of these observations and calculations are shown in Table 8.

5. Interpreting the Results: The calculated chi-square value of ~19.5 is greater than the critical chi-square value of 18.3.
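Both major steps can be reproduced in a short script. The sketch below is a Python approximation, not the report's procedure verbatim: it assumes (from the worked 6-mph example above) that each labeled bin spans the interval between successive integer speeds, and it evaluates the normal CDF with math.erf rather than a printed table, so the statistic lands near, rather than exactly on, the 19.519 in Table 8:

```python
import math

def norm_cdf(x, mu, sigma):
    """Normal cumulative distribution function evaluated at x."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

mu, sigma, n = 10.17, 2.79, 219
observed = [6, 10, 18, 24, 37, 38, 24, 21, 15, 13, 4, 4, 5]  # Table 8 counts
edges = list(range(5, 17))  # interior bin boundaries at 5, 6, ..., 16 mph

# Bin probabilities: lower open tail, interior intervals, upper open tail.
cdf = [norm_cdf(e, mu, sigma) for e in edges]
probs = [cdf[0]] + [hi - lo for lo, hi in zip(cdf, cdf[1:])] + [1.0 - cdf[-1]]
expected = [n * p for p in probs]

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1 - 2  # bins minus 1 minus the two estimated parameters
print(round(chi2, 2), df)
```

Because the expected counts here come from exact CDF values rather than rounded table lookups, the printed statistic differs from 19.519 in the second decimal, but the comparison against the critical chi-square value is unchanged.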
Table 8. Observations, theoretical predictions, and chi-square values for each bin.

Speed (mph)       Number of      Number Predicted by    Chi-Square
                  Observations   Normal Distribution    Value
Under 5.99        6              6.98                   0.137
6.00 to 6.99      10             7.78                   0.637
7.00 to 7.99      18             13.21                  1.734
8.00 to 8.99      24             19.78                  0.902
9.00 to 9.99      37             26.07                  4.585
10.00 to 10.99    38             30.26                  1.980
11.00 to 11.99    24             30.93                  1.554
12.00 to 12.99    21             27.85                  1.685
13.00 to 13.99    15             22.08                  2.271
14.00 to 14.99    13             15.42                  0.379
15.00 to 15.99    4              9.48                   3.169
16.00 to 16.99    4              5.13                   0.251
17.00 and over    5              4.03                   0.234
Total             219            219                    19.519

The team concludes, therefore, that the normal distribution is significantly different from the distribution of the speed sample at the 95% level (i.e., that the in-line skater speed data do not appear to be normally distributed). Larger variations between the observed and expected distributions lead to higher values of the statistic and would be interpreted as it being less likely that the data are distributed according to the

hypothesized distribution. Conversely, smaller variations between observed and expected distributions result in lower values of the statistic, which would suggest that it is more likely that the data are normally distributed because the observed values would fit better with the expected values.

6. Conclusion and Discussion: In this case, the results suggest that the normal distribution is not a good fit to free-flow speeds of in-line skaters on shared-use paths. Interestingly, if the 23 mph observation is considered to be an outlier and discarded, the results of the analysis yield a different conclusion (that the data are normally distributed). Some researchers use a simple rule that an outlier exists if the observation is more than three standard deviations from the mean value. (In this example, the 23 mph observation is, indeed, more than three standard deviations from the mean.) If there is concern with discarding the observation as an outlier, it would be easy enough in this example to repeat the data collection exercise.

Looking at the data plotted above, it is reasonably apparent that the well-known normal distribution should be a good fit (at least without the value of 23). However, the results from the statistical test could not confirm the suspicion. In other cases, the type of distribution may not be so obvious, the distributions in question may be obscure, or some distribution parameters may need to be calibrated for a good fit. In these cases, the statistical test is much more valuable. The chi-square test also can be used simply to compare two observed distributions to see if they are the same, independent of any underlying probability distribution.
For example, if it is desired to know if the distribution of traffic volume by vehicle type (e.g., automobiles, light trucks, and so on) is the same at two different freeway locations, the two distributions can be compared to see if they are similar. The consequences of an error in the procedure outlined here can be severe. This is because the distributions chosen as a result of the procedure often become the heart of predictive models used by many other engineers and planners. A poorly chosen distribution will often provide erroneous predictions for many years to come.

7. Applications in Other Areas of Transportation Research: Fitting distributions to data samples is important in several areas of transportation research, such as:
• Traffic Operations—to analyze shapes of vehicle headway distributions, which are of great interest, especially as a precursor to calibrating and using simulation models.
• Traffic Safety—to analyze collision frequency data. Analysts often assume that the Poisson distribution is a good fit for collision frequency data and must use the method described here to validate the claim.
• Pavement Engineering—to form models of pavement wear or otherwise compare results obtained using different designs, as it is often required to check the distributions of the parameters used (e.g., roughness).

Example 5: Construction; Simple Comparisons to Specified Values

Area: Construction

Method of Analysis: Simple comparisons to specified values—using Student's t-test to compare the mean value of a small sample to a standard or other requirement (i.e., to a population with a known mean and unknown standard deviation or variance)

1. Research Question/Problem Statement: A contractor wants to determine if a specified soil compaction can be achieved on a segment of the road under construction by using an on-site roller or if a new roller must be brought in.

The cost of obtaining samples for many construction materials and practices is quite high. As a result, decisions often must be made based on a small number of samples. The appropriate statistical technique for comparing the mean value of a small sample with a standard or requirement is Student's t-test. Formally, the working, or null, hypothesis (Ho) and the alternative hypothesis (Ha) can be stated as follows:

Ho: The soil compaction achieved using the on-site roller (CA) is less than or equal to the specified value (CS); that is, CA ≤ CS.

Ha: The soil compaction achieved using the on-site roller (CA) is greater than the specified value (CS); that is, CA > CS.

Question/Issue: Determine whether a sample mean exceeds a specified value. Alternatively, determine the probability of obtaining a sample mean (x̄) from a sample of size n if the universe being sampled has a true mean less than or equal to a population mean with an unknown variance. In this example, is the observed mean of the soil compaction samples equal to or greater than a specified value?

2. Identification and Description of Variables: The variable to be used is the soil density results of nuclear densometer tests. These values will be used to determine whether the use of the on-site roller is adequate to meet the contract-specified soil density obtained in the laboratory (Proctor density) of 95%.

3. Data Collection: A 125-foot section of road is constructed and compacted with the on-site roller, and four samples of the soil density are obtained (25 feet, 50 feet, 75 feet, and 100 feet from the beginning of the test section).

4. Specification of Analysis Technique and Data Analysis: For small samples (n < 30) where the population mean is known but the population standard deviation is unknown, it is not appropriate to describe the distribution of the sample mean with a normal distribution.
The appropriate distribution is called Student's distribution (t-distribution or t-statistic). The equation for Student's t-statistic is:

t = (x̄ − x̄′)/(S/√n)

where x̄ is the sample mean, x̄′ is the population mean (or specified standard), S is the sample standard deviation, and n is the sample size. The four nuclear densometer readings were 98%, 97%, 93%, and 99%. Then, showing some simple sample calculations,

x̄ = Σxi/n = (98 + 97 + 93 + 99)/4 = 387/4 = 96.75%

S = √[Σ(xi − x̄)²/(n − 1)] = √(20.75/3) = 2.63%

and using the equation for t above,

t = (96.75 − 95.00)/(2.63/√4) = 1.75/1.32 = 1.33

The calculated value of the t-statistic (1.33) is most typically compared to the tabularized values of the t-statistic (e.g., NCHRP Project 20-45, Volume 2, Appendix C, Table C-4) for a given significance level (typically called t critical or tcrit). For a sample size of n = 4 having 3 (n − 1) degrees of freedom (df), the values for tcrit are 1.638 for α = 0.10 and 2.353 for α = 0.05 (two common values of α for testing, the latter being most common).

Important: The specification of the significance level (α level) for testing should be done before actual testing and interpretation of results are done. In many instances, the appropriate level is defined by the agency doing the testing, a specified testing standard, or simply common practice. Generally speaking, selection of a smaller value for α (e.g., α = 0.05 versus α = 0.10) sets a more stringent standard.

In this example, because the calculated value of t (1.33) is less than the critical value (2.353, given α = 0.05), the null hypothesis is accepted. That is, the engineer cannot be confident that the mean value from the densometer tests (96.75%) is greater than the required specification (95%). If a lower confidence level is chosen (e.g., α = 0.15), the value for tcrit would change to 1.250, which means the null hypothesis would be rejected. A lower confidence level can have serious implications. For example, there is an approximately 15% chance that the standard will not be met. That level of risk may or may not be acceptable to the contractor or the agency. Notice that in many standards the required significance level is stated (typically α = 0.05). It should be emphasized that the confidence level should be chosen before calculations and testing are done.
It is not generally permissible to change the confidence level after calculations have been performed. Doing this would be akin to arguing that standards can be relaxed if a test gives an answer that the analyst doesn't like.

The results of small sample tests often are sensitive to the number of samples that can be obtained at a reasonable cost. (The mean value may change considerably as more data are added.) In this example, if it were possible to obtain nine independent samples (as opposed to four) and the mean value and sample standard deviation were the same as with the four samples, the calculation of the t-statistic would be:

t = (96.75 − 95.00)/(2.63/√9) = 1.99

Comparing the value of t (with a larger sample size) to the appropriate tcrit (for n − 1 = 8 df and α = 0.05) of 1.860 changes the outcome. That is, the calculated value of the t-statistic is now larger than the tabularized value of tcrit, and the null hypothesis is rejected. Thus, it is accepted that the mean of the densometer readings meets or exceeds the standard. It should be noted, however, that the inclusion of additional tests may yield a different mean value and standard deviation, in which case the results could be different.

5. Interpreting the Results: By themselves, the results of the statistical analysis are insufficient to answer the question as to whether a new roller should be brought to the project site. These results only provide information the contractor can use to make this decision. The ultimate decision should be based on these probabilities and knowledge of the cost of each option. What is the cost of bringing in a new roller now? What is the cost of starting the project and then determining the current roller is not adequate and then bringing in a new roller? Will this decision result in a delay in project completion—and does the contract include an incentive for early completion and/or a penalty for missing the completion date?
If it is possible to conduct additional independent densometer tests, what is the cost of conducting them?
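The calculations in Step 4 are also easy to script, which makes them simple to re-run as additional samples arrive. A minimal sketch in Python, using the four readings from this example (the nine-sample scenario above just changes n):

```python
import math

def t_statistic(readings, spec):
    """One-sample t-statistic for testing a sample mean against a specified value."""
    n = len(readings)
    mean = sum(readings) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in readings) / (n - 1))
    return mean, s, (mean - spec) / (s / math.sqrt(n))

densities = [98.0, 97.0, 93.0, 99.0]  # nuclear densometer readings, %
mean, s, t = t_statistic(densities, 95.0)
print(round(mean, 2), round(s, 2), round(t, 2))  # 96.75 2.63 1.33

# Same mean and standard deviation, but assuming n = 9 as in the discussion:
t9 = (mean - 95.0) / (s / math.sqrt(9))
print(t9)  # close to the 1.99 obtained above with rounded intermediate values
```

The calculated t values are then compared against tcrit from a t-table (or a statistical library), exactly as in the text.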

If there is a severe penalty for missing the deadline (or a significant reward for finishing early), the contractor may be willing to incur the cost of bringing in a new roller rather than accepting a 15% probability of being delayed.

6. Conclusion and Discussion: In some cases the decision about which alternative is preferable can be expressed in the form of a probability (or level of confidence) required to make a decision. The decision criterion is then expressed in a hypothesis and the probability of rejecting that hypothesis. In this example, if the hypothesis to be tested is "Using the on-site roller will provide an average soil density of 95% or higher" and the level of confidence is set at 95%, given a sample of four tests the decision will be to bring in a new roller. However, if nine independent tests could be conducted, the results in this example would lead to a decision to use the on-site roller.

7. Applications in Other Areas in Transportation Research: Simple comparisons to specified values can be used in a variety of areas of transportation research. Some examples include:
• Traffic Operations—to compare the average annual number of crashes at intersections with roundabouts with the average annual number of crashes at signalized intersections.
• Pavement Engineering—to test the compressive strength of concrete slabs.
• Maintenance—to test the results of a proposed new deicer compound.

Example 6: Maintenance; Simple Two-Sample Comparisons

Area: Maintenance

Method of Analysis: Simple two-sample comparisons (t-test for paired comparisons; comparing the mean values of two sets of matched data)

1.
Research Question/Problem Statement: As a part of a quality control and quality assurance (QC/QA) program for highway maintenance and construction, an agency engineer wants to compare and identify discrepancies in the contractor's testing procedures or equipment in making measurements on materials being used. Specifically, compacted air voids in asphalt mixtures are being measured. In this instance, the agency's test results need to be compared, one-to-one, with the contractor's test results. Samples are drawn or made and then literally split and tested—one by the contractor, one by the agency. Then the pairs of measurements are analyzed. A paired t-test will be used to make the comparison. (For another type of two-sample comparison, see Example 7.)

Question/Issue: Use collected data to test if two sets of results are similar. Specifically, do two testing procedures to determine air voids produce the same results? Stated in formal terms, the null and alternative hypotheses are:

Ho: There is no mean difference in air voids between agency and contractor test results; that is, X̄d = 0.

Ha: There is a mean difference in air voids between agency and contractor test results; that is, X̄d ≠ 0.

(For definitions and more discussion about the formulation of formal hypotheses for testing, see NCHRP Project 20-45, Volume 2, Appendix A and Volume 1, Chapter 2, "Hypothesis.")

2. Identification and Description of Variables: The testing procedure for laboratory-compacted air voids in the asphalt mixture needs to be verified. The split-sample test results for laboratory-

compacted air voids are shown in Table 9. Twenty samples are prepared using the same asphalt mixture. Half of the samples are prepared in the agency's laboratory and the other half in the contractor's laboratory. Given this arrangement, there are basically two variables of concern: who did the testing and the air void determination.

3. Data Collection: A sufficient quantity of asphalt mix to make 10 lots is produced in an asphalt plant located on a highway project. Each of the 10 lots is collected, split into two samples, and labeled. A sample from each lot, 4 inches in diameter and 2 inches in height, is prepared in the contractor's laboratory to determine the air voids in the compacted samples. A matched set of samples is prepared in the agency's laboratory and a similar volumetric procedure is used to determine the agency's lab-compacted air voids. The lab-compacted air void contents in the asphalt mixture for both the contractor and agency are shown in Table 9.

4. Specification of Analysis Technique and Data Analysis: A paired (two-sided) t-test will be used to determine whether a difference exists between the contractor and agency results. As noted above, in a paired t-test the null hypothesis is that the mean of the differences between each pair of two tests is 0 (there is no difference between the means). The null hypothesis can be expressed as follows:

Ho: X̄d = 0

The alternate hypothesis, that the two means are not equal, can be expressed as follows:

Ha: X̄d ≠ 0

The t-statistic for the paired measurements (i.e., the difference between the split-sample test results) is calculated using the following equation:

t = (X̄d − 0)/(sd/√n)

Using the actual data (and the magnitude of the mean difference, X̄d = −0.88), the value of the t-statistic is calculated as follows:

t = (0.88 − 0)/(0.70/√10) ≈ 4
Table 9. Laboratory-compacted air voids in split samples.

                 Air Voids (%)
Sample       Contractor    Agency    Difference
1            4.37          4.15      0.21
2            3.76          5.39      -1.63
3            4.10          4.47      -0.37
4            4.39          4.52      -0.13
5            4.06          5.36      -1.29
6            4.14          5.01      -0.87
7            3.92          5.23      -1.30
8            3.38          4.97      -1.60
9            4.12          4.37      -0.25
10           3.68          5.29      -1.61
X̄            3.99          4.88      X̄d = -0.88
S            0.31          0.46      sd = 0.70
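The paired computation takes only a few lines of code. A minimal sketch in Python, with values transcribed from Table 9 (the differences are recomputed from the two columns, so the last digit can differ slightly from the rounded differences printed in the table):

```python
import math

# Split-sample lab-compacted air voids (%), Table 9.
contractor = [4.37, 3.76, 4.10, 4.39, 4.06, 4.14, 3.92, 3.38, 4.12, 3.68]
agency     = [4.15, 5.39, 4.47, 4.52, 5.36, 5.01, 5.23, 4.97, 4.37, 5.29]

diffs = [c - a for c, a in zip(contractor, agency)]
n = len(diffs)
d_bar = sum(diffs) / n                                             # mean difference
s_d = math.sqrt(sum((d - d_bar) ** 2 for d in diffs) / (n - 1))    # std. dev. of differences
t = d_bar / (s_d / math.sqrt(n))                                   # paired t-statistic
print(round(d_bar, 2), round(s_d, 2), round(t, 2))  # -0.88 0.7 -4.0
```

The sign of t simply reflects that the agency values run higher; its magnitude (about 4) is what is compared against tcrit below.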

For n − 1 (10 − 1 = 9) degrees of freedom and α = 0.05, the tcrit value can be looked up using a t-table (e.g., NCHRP Project 20-45, Volume 2, Appendix C, Table C-4):

t0.025,9 = 2.262

For a more detailed description of the t-statistic, see the glossary in NCHRP Project 20-45, Volume 2, Appendix A.

5. Interpreting the Results: Given that t = 4 > t0.025,9 = 2.262, the engineer would reject the null hypothesis and conclude that the results of the paired tests are different. This means that the contractor and agency test results from paired measurements indicate that the test method, technicians, and/or test equipment are not providing similar results. Notice that the engineer cannot conclude anything about the material or production variation or what has caused the differences to occur.

6. Conclusion and Discussion: The results of the test indicate that a statistically significant difference exists between the test results from the two groups. When making such comparisons, it is important that random sampling be used when obtaining the samples. Also, because sources of variability influence the population parameters, the two sets of test results must have been sampled over the same time period, and the same sampling and testing procedures must have been used. It is best if one sample is drawn and then literally split in two, then another sample drawn, and so on.

The identification of a difference is just that: notice that a difference exists. The reason for the difference must still be determined. A common misinterpretation is that the result of the t-test provides the probability of the null hypothesis being true. Another way to look at the t-test result in this example is to conclude that some alternative hypothesis provides a better description of the data. The result does not, however, indicate that the alternative hypothesis is true.
To ensure practical significance, it is necessary to assess the magnitude of the difference being tested. This can be done by computing confidence intervals, which are used to quantify the range of effect size and are often more useful than simple hypothesis testing. Failure to reject a hypothesis also provides important information. Possible explanations include: occurrence of a type-II error (erroneous acceptance of the null hypothesis); small sample size; difference too small to detect; expected difference did not occur in data; there is no difference/effect. Proper experiment design and data collection can minimize the impact of some of these issues. (For a more comprehensive discussion of this topic, see NCHRP Project 20-45, Volume 2, Chapter 1.)

7. Applications in Other Areas of Transportation Research: The application of the t-test to compare two mean values in other areas of transportation research may include:
• Traffic Operations—to evaluate average delay in bus arrivals at various bus stops.
• Traffic Operations/Safety—to determine the effect of two enforcement methods on reduction in a particular traffic violation.
• Pavement Engineering—to investigate average performance of two pavement sections.
• Environment—to compare average vehicular emissions at two locations in a city.

Example 7: Materials; Simple Two-Sample Comparisons

Area: Materials

Method of Analysis: Simple two-sample comparisons (using the t-test to compare the mean values of two samples and the F-test for comparing variances)

1. Research Question/Problem Statement: As a part of dispute resolution during quality control and quality assurance, a highway agency engineer wants to validate a contractor's test results concerning asphalt content. In this example, the engineer wants to compare the results

of two sets of tests: one from the contractor and one from the agency. Formally, the (null) hypothesis to be tested, Ho, is that the contractor's tests and the agency's tests are from the same population. In other words, the null hypothesis is that the means of the two data sets will be equal, as will the standard deviations. Notice that in the latter instance the variances are actually being compared.

Test results were also compared in Example 6. In that example, the comparison was based on split samples: the same test specimens were tested by two different analysts using different equipment to see if the same results could be obtained by both. The major difference between Example 6 and Example 7 is that, in this example, the two samples are randomly selected from the same pavement section.

Question/Issue: Use collected data to test if two measured mean values are the same. In this instance, are two mean values of asphalt content the same? Stated in formal terms, the null and alternative hypotheses can be expressed as follows:

Ho: There is no difference in asphalt content between agency and contractor test results; that is, (µc − µa) = 0.

Ha: There is a difference in asphalt content between agency and contractor test results; that is, (µc − µa) ≠ 0.

2. Identification and Description of Variables: The contractor runs 12 asphalt content tests and the agency engineer runs 6 asphalt content tests over the same period of time, using the same random sampling and testing procedures. The question is whether it is likely that the tests have come from the same population based on their variability.

3. Data Collection: If the agency's objective is simply to identify discrepancies in the testing procedures or equipment, then verification testing should be done on split samples (as in Example 6). Using split samples, the difference in the measured variable can more easily be attributed to testing procedures.
A paired t-test should be used. (For more information, see NCHRP Project 20-45, Volume 2, Chapter 4, Section A, "Analysis of Variance Methodology.") A split sample occurs when a physical sample (of whatever is being tested) is drawn and then literally split into two testable samples. On the other hand, if the agency's objective is to identify discrepancies in the overall material, process, sampling, and testing processes, then validation testing should be done on independent samples. Notice the use of these terms. It is important to distinguish between testing to verify only the testing process (verification) and testing to compare the overall production, sampling, and testing processes (validation). If independent samples are used, the agency test results still can be compared with contractor test results (using a simple t-test for comparing two means). If the test results are consistent, then the agency and contractor tests can be combined for contract compliance determination.

4. Specification of Analysis Technique and Data Analysis: When comparing the two data sets, it is important to compare both the means and the variances, because the t-test assumes equal variances for the two groups. A different test is used in each instance. The F-test provides a method for comparing the variances (the standard deviation squared) of two sets of data. Differences in means are assessed by the t-test. Generally, construction processes and material properties are assumed to follow a normal distribution.

In this example, a normal distribution is assumed. (The assumption of normality also can be tested, as in Example 4.) The ratios of variances follow an F-distribution, while the means of relatively small samples follow a t-distribution. Using these distributions, hypothesis tests can be conducted using the same concepts that have been discussed in prior examples. (For more information about the F-test, see NCHRP Project 20-45, Volume 2, Chapter 4, Section A, "Compute the F-ratio Test Statistic." For more information about the t-distribution, see NCHRP Project 20-45, Volume 2, Chapter 4, Section A.)

For samples from the same normal population, the statistic F (the ratio of the two sample variances) has a sampling distribution called the F-distribution. For validation and verification testing, the F-test is based on the ratio of the sample variance of the contractor's test results (sc²) and the sample variance of the agency's test results (sa²). Similarly, the t-test can be used to test whether the sample mean of the contractor's tests (X̄c) and the agency's tests (X̄a) came from populations with the same mean.

Consider the asphalt content test results from the contractor samples and agency samples (Table 10). In this instance, the F-test is used to determine whether the variance observed for the contractor's tests differs from the variance observed for the agency's tests.

Using the F-test

Step 1. Compute the sample variance (s²) for each set of tests: sc² = 0.064 and sa² = 0.092. As an example, sc² can be calculated as:

sc² = Σ(xi − X̄c)² / (nc − 1) = [(6.4 − 6.1)² + (6.2 − 6.1)² + . . . + (6.0 − 6.1)² + (5.7 − 6.1)²] / (12 − 1) = 0.064

Step 2. Compute the F-statistic as the ratio of the sample variances:

Fcalc = sa² / sc² = 0.092 / 0.064 = 1.43
Contractor Samples        Agency Samples
 1    6.4                  1    5.4
 2    6.2                  2    5.8
 3    6.0                  3    6.2
 4    6.6                  4    5.4
 5    6.1                  5    5.6
 6    6.0                  6    5.8
 7    6.3
 8    6.1
 9    5.9
10    5.8
11    6.0
12    5.7

Descriptive statistics:
X̄c = 6.1       X̄a = 5.7
sc² = 0.064    sa² = 0.092
sc = 0.25      sa = 0.30
nc = 12        na = 6

Table 10. Asphalt content test results from independent samples.

Step 3. Determine Fcrit from the F-distribution table, making sure to use the correct degrees of freedom (df) for the numerator (the number of observations minus 1, or na − 1 = 6 − 1 = 5) and the denominator (nc − 1 = 12 − 1 = 11). For α = 0.01, Fcrit = 5.32. The critical F-value can be found from tables (see NCHRP Project 20-45, Volume 2, Appendix C, Table C-5). Read the F-value for 1 − α = 0.99 with numerator and denominator degrees of freedom of 5 and 11, respectively. Interpolation can be used if the exact degrees of freedom are not available in the table. Alternatively, a statistical function in Microsoft Excel™ can be used to determine the F-value.

Step 4. Compare the two values to determine whether Fcalc < Fcrit. If Fcalc < Fcrit, the variances are judged to be equal; if not, they are unequal. In this example, Fcalc (1.43) is, in fact, less than Fcrit (5.32) and, thus, there is no evidence of unequal variances. Given this result, the t-test for the case of equal variances is used to determine whether to declare that the mean of the contractor's tests differs from the mean of the agency's tests.

Using the t-test

Step 1. Compute the sample means (X̄) for each set of tests: X̄c = 6.1 and X̄a = 5.7.

Step 2. Compute the pooled variance sp² from the individual sample variances:

sp² = [(nc − 1)sc² + (na − 1)sa²] / (nc + na − 2) = [(12 − 1)(0.064) + (6 − 1)(0.092)] / (12 + 6 − 2) = 0.0731

Step 3. Compute the t-statistic using the equation for equal variances:

t = (X̄c − X̄a) / √(sp²/nc + sp²/na) = (6.1 − 5.7) / √(0.0731/12 + 0.0731/6) = 2.9

The critical value is t0.005,16 = 2.921. (For more information, see NCHRP Project 20-45, Volume 2, Appendix C, Table C-4, for 1 − α/2 = 0.995 and ν = nc + na − 2 = 16.)

5. Interpreting the Results: Given that Fcalc < Fcrit (i.e., 1.43 < 5.32), there is no reason to believe that the two sets of data have different variances. That is, they could have come from the same population.
Therefore, the t-test can be used to compare the means under the equal-variance assumption. Because t < tcrit (i.e., 2.9 < 2.921), the engineer does not reject the null hypothesis and, thus, assumes that the sample means are equal. The final conclusion is that it is likely that the contractor and agency test results represent the same process. In other words, at a 99% confidence level, it can be said that the agency's test results are not different from the contractor's and therefore validate the contractor's tests.

6. Conclusion and Discussion: The simple t-test can be used to validate the contractor's test results by conducting independent sampling from the same pavement at the same time. Before conducting a formal t-test to compare the sample means, the assumption of equal variances needs to be evaluated; this can be accomplished by comparing the sample variances using the F-test. The interpretation of the results will be misleading if the equal-variance assumption is not validated: if the variances of the two populations being compared for their means are different, the mean comparison will reflect the difference between two separate populations. Finally, based on the comparison of means, one can conclude that the construction materials have consistent properties, as validated by two independent sources (the contractor and the agency). This sort of comparison is developed further in Example 8, which illustrates tests for the equality of more than two mean values.
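The two-stage check described above (an F-test on the variances, then a pooled-variance t-test on the means) can be reproduced from the Table 10 data. The following is a minimal sketch using only the Python standard library; the critical values are hard-coded from the tables cited in the text (Table C-5 and Table C-4), and the variable names are illustrative, not from the report:

```python
from math import sqrt
from statistics import mean, variance  # variance() returns the sample variance s^2

# Asphalt content test results from Table 10
contractor = [6.4, 6.2, 6.0, 6.6, 6.1, 6.0, 6.3, 6.1, 5.9, 5.8, 6.0, 5.7]
agency = [5.4, 5.8, 6.2, 5.4, 5.6, 5.8]

s2_c, s2_a = variance(contractor), variance(agency)
n_c, n_a = len(contractor), len(agency)

# F-test for equality of variances (Steps 1-4)
F = s2_a / s2_c
F_crit = 5.32  # F-table value for alpha = 0.01 with df = (5, 11), as in Step 3

# Pooled-variance t-test for equality of means (Steps 1-3)
s2_p = ((n_c - 1) * s2_c + (n_a - 1) * s2_a) / (n_c + n_a - 2)
t = (mean(contractor) - mean(agency)) / sqrt(s2_p / n_c + s2_p / n_a)
t_crit = 2.921  # two-sided t-table value for alpha = 0.01 with df = 16

print(round(F, 2), round(t, 1))     # -> 1.43 2.9
print(F < F_crit, abs(t) < t_crit)  # -> True True (fail to reject both nulls)
```

Because both test statistics fall short of their critical values, the script reaches the same conclusion as the worked example: no evidence of unequal variances, and no evidence of unequal means.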

7. Applications in Other Areas of Transportation Research: The simple t-test can be used to compare the means of two independent samples. Applications of this method in other areas of transportation research include:
• Traffic Operations
  – to compare average speeds at two locations along a route.
  – to evaluate average delay times at two intersections in an urban area.
• Pavement Engineering—to investigate the difference in average performance of two pavement sections.
• Maintenance—to determine the effects of two maintenance treatments on average life extension of two pavement sections.

Example 8: Laboratory Testing/Instrumentation; Simple Analysis of Variance (ANOVA)

Area: Laboratory testing and/or instrumentation
Method of Analysis: Simple analysis of variance (ANOVA) comparing the mean values of more than two samples and using the F-test

1. Research Question/Problem Statement: An engineer wants to test and compare the compressive strength of five different concrete mix designs that vary in coarse aggregate type, gradation, and water/cement ratio. An experiment is conducted in a laboratory where five different concrete mixes are produced based on given specifications and tested for compressive strength using ASTM International standard procedures. In this example, the comparison involves inference on parameters from more than two populations. The purpose of the analysis, in other words, is to test whether all mix designs are similar to each other in mean compressive strength or whether some differences actually exist. ANOVA is the statistical procedure used to test the basic hypothesis illustrated in this example.

Question/Issue
Compare the means of more than two samples. In this instance, compare the compressive strengths of five concrete mix designs with different combinations of aggregates, gradation, and water/cement ratio.
More formally, test the following hypotheses:

Ho: There is no difference in mean compressive strength for the various (five) concrete mix types.
Ha: At least one of the concrete mix types has a different compressive strength.

2. Identification and Description of Variables: In this experiment, the factor of interest (independent variable) is the concrete mix design, which has five levels based on different coarse aggregate types, gradations, and water/cement ratios (denoted by t and labeled A through E in Table 11). Compressive strength is a continuous response (dependent) variable, measured in pounds per square inch (psi) for each specimen. Because only one factor is of interest in this experiment, the statistical method illustrated is often called a one-way ANOVA or simple ANOVA.

3. Data Collection: For each of the five mix designs, three replicate cylinders, 4 inches in diameter and 8 inches in height, are made and cured for 28 days. After 28 days, all 15 specimens are tested for compressive strength using the standard ASTM International test. The compressive strength data and summary statistics are provided for each mix design in Table 11. In this example, resource constraints have limited the number of replicates for each mix design to

three. (For a discussion on sample size determination based on statistical power requirements, see NCHRP Project 20-45, Volume 2, Chapter 1, "Sample Size Determination.")

4. Specification of Analysis Technique and Data Analysis: To perform a one-way ANOVA, preliminary calculations are carried out to compute the overall mean (ȳ..), the sample means (ȳi.), and the sample variances (si²) given the total sample size (nT = 15), as shown in Table 11. The basic strategy for ANOVA is to compare the variance between levels or groups—specifically, the variation between sample means—to the variance within levels. This comparison is used to determine whether the levels explain a significant portion of the variance. (Details for performing a one-way ANOVA are given in NCHRP Project 20-45, Volume 2, Chapter 4, Section A, "Analysis of Variance Methodology.")

ANOVA is based on partitioning the total sum of squares (TSS, a measure of overall variability) into within-level and between-levels components. The TSS is defined as the sum of the squares of the differences of each observation (yij) from the overall mean (ȳ..). The TSS, between-levels sum of squares (SSB), and within-level sum of squares (SSE) are computed as follows:

TSS = Σi Σj (yij − ȳ..)² = 4839620.90
SSB = Σi ni (ȳi. − ȳ..)² = 4331513.60
SSE = Σi Σj (yij − ȳi.)² = 508107.30

The next step is to compute the between-levels mean square (MSB) and within-levels mean square (MSE) based on the respective degrees of freedom (df). The total (dfT), between-levels (dfB), and within-levels (dfE) degrees of freedom for one-way ANOVA are computed as follows:

dfT = nT − 1 = 15 − 1 = 14
dfB = t − 1 = 5 − 1 = 4
dfE = nT − t = 15 − 5 = 10

where nT = the total sample size and t = the total number of levels or groups. The next step of the ANOVA procedure is to compute the F-statistic.
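The partitioning just described can be verified numerically from the replicate data in Table 11. The following is a minimal sketch in standard-library Python (variable names are illustrative); the results agree with the values quoted in the text up to rounding:

```python
# Compressive-strength replicates from Table 11 (psi), one list per mix design
groups = {
    "A": [5416, 5125, 4847],
    "B": [5292, 4779, 4824],
    "C": [4097, 3695, 4109],
    "D": [5056, 5216, 5235],
    "E": [4165, 3849, 4089],
}

all_obs = [y for g in groups.values() for y in g]
n_T, t = len(all_obs), len(groups)  # 15 observations, 5 levels
grand_mean = sum(all_obs) / n_T     # the overall mean, ~4653 psi

# Partition the total sum of squares: TSS = SSB + SSE
TSS = sum((y - grand_mean) ** 2 for y in all_obs)
SSB = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values())
SSE = TSS - SSB

MSB = SSB / (t - 1)    # between-levels mean square, df = 4
MSE = SSE / (n_T - t)  # within-levels mean square,  df = 10
F = MSB / MSE

print(round(SSB, 1), round(SSE, 1), round(F, 2))  # -> 4331513.6 508107.3 21.31
```

The computed F of about 21.31 can then be compared with the tabulated Fcrit, exactly as in the worked example.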
Replicate     Mix Design
              A            B            C            D            E
1             5416         5292         4097         5056         4165
2             5125         4779         3695         5216         3849
3             4847         4824         4109         5235         4089
Mean          ȳ1. = 5129   ȳ2. = 4965   ȳ3. = 3967   ȳ4. = 5169   ȳ5. = 4034
Std. dev.     s1 = 284.52  s2 = 284.08  s3 = 235.64  s4 = 98.32   s5 = 164.94
Overall mean  ȳ.. = 4653

Table 11. Concrete compressive strength (psi) after 28 days.

The F-statistic is the ratio of two variance estimates: the variation between the levels and the variation within the levels. Under the null hypothesis, the between-levels mean square (MSB) and the within-levels mean square (MSE) provide two independent estimates of the variance. If the means for the different levels of mix design are truly different from each other, the MSB will tend

to be larger than the MSE, such that it will be more likely to reject the null hypothesis. For this example, the calculations for MSB, MSE, and F are as follows:

MSB = SSB/dfB = 4331513.60/4 = 1082878.40
MSE = SSE/dfE = 508107.30/10 = 50810.70
F = MSB/MSE = 21.31

If there are no effects due to level, the F-statistic will tend to be smaller; if there are effects due to level, the F-statistic will tend to be larger, as is the case in this example. ANOVA computations usually are summarized in the form of a table; Table 12 summarizes the computations for this example.

The final step is to determine Fcrit from the F-distribution table (e.g., NCHRP Project 20-45, Volume 2, Appendix C, Table C-5) with t − 1 (5 − 1 = 4) degrees of freedom for the numerator and nT − t (15 − 5 = 10) degrees of freedom for the denominator. For a significance level of α = 0.01, Fcrit is found (in Table C-5) to be 5.99. Given that F > Fcrit (21.31 > 5.99), the null hypothesis that all mix designs have equal compressive strength is rejected, supporting the conclusion that at least two mix designs differ from each other in their mean effect. Table 12 also shows the p-value calculated using a computer program. The p-value is the probability that a sample would produce the observed statistic value if the null hypothesis were true. The p-value of 0.0000698408 is well below the chosen significance level of 0.01.

5. Interpreting the Results: The ANOVA results in rejection of the null hypothesis at α = 0.01. That is, the mean values are judged to be statistically different. However, the ANOVA result does not indicate where the difference lies. For example, does the compressive strength of mix design A differ from that of mix design C or D?
To carry out such multiple mean comparisons, the analyst must control the experiment-wise error rate (EER) by employing more conservative methods such as Tukey's test, Bonferroni's test, or Scheffé's test, as appropriate. (Details for ANOVA are given in NCHRP Project 20-45, Volume 2, Chapter 4, Section A, "Analysis of Variance Methodology.")

The coefficient of determination (R²) provides a rough indication of how well the statistical model fits the data. For this example, R² is calculated as follows:

R² = SSB/TSS = 4331513.60/4839620.90 = 0.90

For this example, R² indicates that the one-way ANOVA classification model accounts for 90% of the total variation in the data. In the controlled laboratory experiment demonstrated in this example, R² = 0.90 indicates a fairly acceptable fit of the statistical model to the data.

Source    Sum of Squares (SS)  Degrees of Freedom (df)  Mean Square (MS)  F      Probability > F (Significance)
Between   4331513.60           4                        1082878.40        21.31  0.0000698408
Within    508107.30            10                       50810.70
Total     4839620.90           14

Table 12. ANOVA results.

6. Conclusion and Discussion: This example illustrates a simple one-way ANOVA where inference regarding parameters (mean values) from more than two populations or treatments was

desired. The focus of the computations was the construction of the ANOVA table. Before proceeding with ANOVA, however, an analyst must verify that the assumptions of common variance and data normality are satisfied within each group/level. The results do not establish the cause of the difference in compressive strength between mix designs in any way. The experimental setup and analytical procedure shown in this example may be used to test other properties of mix designs, such as flexural strength. If another factor (for example, water/cement ratio, with levels low and high) is added to the analysis, the classification becomes a two-way ANOVA. (In this report, two-way ANOVA is demonstrated in Example 11.) Notice that the equations shown in Example 8 may only be used for one-way ANOVA with balanced designs, meaning that there are equal numbers of replicates for each level within a factor. (For a discussion of computations for unbalanced and multifactor designs, see NCHRP Project 20-45.)

7. Applications in Other Areas of Transportation Research: Examples of applications of one-way ANOVA in other areas of transportation research include:
• Traffic Operations—to determine the effect of various traffic calming devices on average speeds in residential areas.
• Traffic Operations/Safety—to study the effect of weather conditions on accidents in a given time period.
• Work Zones—to compare the effect of different placements of work zone signs on reduction in highway speeds at some downstream point.
• Materials—to investigate the effect of recycled aggregates on compressive and flexural strength of concrete.

Example 9: Materials; Simple Analysis of Variance (ANOVA)

Area: Materials
Method of Analysis: Simple analysis of variance (ANOVA) comparing more than two mean values and using the F-test for equality of means

1.
Research Question/Problem Statement: To illustrate how increasingly detailed analysis may be appropriate, Example 9 is an extension of the two-sample comparison presented in Example 7. As part of dispute resolution during quality control and quality assurance, suppose the highway agency engineer from Example 7 decides to reconfirm the contractor's test results for asphalt content. The agency hires an independent consultant to verify both the contractor- and agency-measured asphalt contents. It now becomes necessary to compare more than two mean values. A simple one-way analysis of variance (ANOVA) can be used to analyze the asphalt contents measured by the three different parties.

Question/Issue
Extend a comparison of two mean values to compare three (or more) mean values. Specifically, use data collected by several (>2) different parties to see whether the results (mean values) are the same. Formally, test the following null (Ho) and alternative (Ha) hypotheses:

Ho: There is no difference in asphalt content among the three parties (μcontractor = μagency = μconsultant).
Ha: At least one of the parties has a different measured asphalt content.

examples of effective experiment Design and Data analysis in transportation research 39 2. Identification and Description of Variables: The independent consultant runs 12 additional asphalt content tests by taking independent samples from the same pavement section as the agency and contractor. The question is whether it is likely that the tests came from the same population, based on their variability. 3. Data Collection: The descriptive statistics (mean, standard deviation, and sample size) for the asphalt content data collected by the three parties are shown in Table 13. Notice that 12 measurements each have been taken by the contractor and the independent consultant, while the agency has only taken six measurements. The data for the contractor and the agency are the same as presented in Example 7. For brevity, the consultant’s raw observations are not repeated here. The mean value and standard deviation for the consultant’s data are calculated using the same formulas and equations that were used in Example 7. 4. Specification of Analysis Technique and Data Analysis: The agency engineer can use one-way ANOVA to resolve this question. (Details for one-way ANOVA are available in NCHRP Project 20-45, Volume 2, Chapter 4, Section A, “Analysis of Variance Methodology.”) The objective of the ANOVA is to determine whether the variance observed in the depen- dent variable (in this case, asphalt content) is due to the differences among the samples (different from one party to another) or due to the differences within the samples. ANOVA is basically an extension of two-sample comparisons to cases when three or more samples are being compared. More formally, the technician is testing to see whether the between- sample variability is large relative to the within-sample variability, as stated in the formal hypothesis. This type of comparison also may be referred to as between-groups versus within-groups variance. 
Rejection of the null hypothesis (that the mean values are the same) gives the engineer some information concerning differences among the population means; however, it does not indicate which means actually differ from each other. Rejection of the null hypothesis tells the engineer that differences exist, but it does not specify that X̄1 differs from X̄2 or from X̄3. To control the experiment-wise error rate (EER) for multiple mean comparisons, a conservative test—Tukey's procedure for unplanned comparisons—can be used. (Information about Tukey's procedure can be found in almost any good statistics textbook, such as those by Freund and Wilson [2003] and Kutner et al. [2005].)

Party       Asphalt Content (%)
Contractor  X̄1 = 6.1, s1 = 0.254, n1 = 12
Agency      X̄2 = 5.7, s2 = 0.303, n2 = 6
Consultant  X̄3 = 5.12, s3 = 0.186, n3 = 12

Table 13. Asphalt content data summary.

The F-statistic calculated for determining the effect of who (agency, contractor, or consultant) measured

the asphalt content is given in Table 14. (See Example 8 for a more detailed discussion of the calculations necessary to create Table 14.) Although the ANOVA results reveal whether there are overall differences, it is always good practice to examine the data visually. For example, Figure 9 shows the means and associated 95% confidence intervals (CI) of the mean asphalt content measured by each of the three parties involved in the testing.

5. Interpreting the Results: A simple one-way ANOVA is conducted to determine whether there is a difference in mean asphalt content as measured by the three different parties. The analysis shows that the F-statistic is significant (p-value < 0.05), meaning that at least two of the means are significantly different from each other. The engineer can use Tukey's procedure for comparisons of multiple means, or he or she can observe the plotted 95% confidence intervals to determine which means are actually (and significantly) different from each other (see Figure 9). Because their confidence intervals overlap, the asphalt contents measured by the contractor and the agency are not significantly different from each other. (These same conclusions were obtained in Example 7.) However, the mean asphalt content obtained by the consultant is significantly different from (and lower than) that obtained by both of the other parties. This is evident because the confidence interval for the consultant does not overlap with the confidence interval of either of the other two parties.

Source          Sum of Squares (SS)  Degrees of Freedom (df)  Mean Square (MS)  F     Significance
Between groups  5.6                  2                        2.8               49.1  0.000
Within groups   1.5                  27                       0.06
Total           7.2                  29

Table 14. ANOVA results.

Figure 9. Mean and confidence intervals for asphalt content data.
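Because Table 13 reports only summary statistics, it is worth noting that a one-way ANOVA can be computed from the group means, standard deviations, and sample sizes alone. The sketch below (standard-library Python; variable names are illustrative) does exactly this. Because its inputs are rounded summary values, the sums of squares differ slightly from Table 14, which is based on the raw measurements, but the conclusion is the same:

```python
# Summary statistics from Table 13: (mean, sample std dev, n) for each party
parties = {
    "contractor": (6.10, 0.254, 12),
    "agency":     (5.70, 0.303, 6),
    "consultant": (5.12, 0.186, 12),
}

n_T = sum(n for _, _, n in parties.values())  # total sample size, 30
grand_mean = sum(m * n for m, _, n in parties.values()) / n_T

# One-way ANOVA needs only group means, variances, and sizes:
SSB = sum(n * (m - grand_mean) ** 2 for m, _, n in parties.values())   # between
SSE = sum((n - 1) * s ** 2 for _, s, n in parties.values())            # within
F = (SSB / (len(parties) - 1)) / (SSE / (n_T - len(parties)))

# F comes out near 50, well above any tabulated 1% critical value
# (roughly 5.5 for df = 2, 27), matching the significant result in Table 14.
print(f"F = {F:.1f}")
```

This summary-statistics route is convenient when, as here, one party's raw observations are not published.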

6. Conclusion and Discussion: This example uses a simple one-way ANOVA to compare the mean values of three sets of results using data drawn from the same test section. The error bar plots for data from the three different parties visually illustrate the statistical differences among the multiple means. However, the F-test for multiple means should be used to formally test the hypothesis of the equality of means. The interpretation of the results will be misleading if the variances of the populations being compared for their mean difference are not equal. Based on the comparison of the three means, it can be concluded that the construction material in this example may not have consistent properties, as indicated by the results from the independent consultant.

7. Applications in Other Areas of Transportation Research: Simple one-way ANOVA is often used when more than two means must be compared. Examples of applications in other areas of transportation research include:
• Traffic Safety/Operations—to evaluate the effect of intersection type on the average number of accidents per month. Three or more types of intersections (e.g., signalized, non-signalized, and rotary) could be selected for study in an urban area having similar traffic volumes and vehicle mix.
• Pavement Engineering
  – to investigate the effect of hot-mix asphalt (HMA) layer thickness on fatigue cracking after 20 years of service life. Three HMA layer thicknesses (5 inches, 6 inches, and 7 inches) would be involved in this study, and other factors (i.e., traffic, climate, subbase/base thicknesses, and subgrade types) would need to be similar.
  – to determine the effect of climatic conditions on the rutting performance of flexible pavements.
Three or more climatic conditions (e.g., wet-freeze, wet-no-freeze, dry-freeze, and dry-no-freeze) need to be considered while other factors (i.e., traffic, HMA, subbase/base thicknesses, and subgrade types) need to be similar.

Example 10: Pavements; Simple Analysis of Variance (ANOVA)

Area: Pavements
Method of Analysis: Simple analysis of variance (ANOVA) comparing the mean values of more than two samples and using the F-test

1. Research Question/Problem Statement: The aggregate coefficient of thermal expansion (CTE) in Portland cement concrete (PCC) is a critical factor affecting the thermal behavior of PCC slabs in concrete pavements. In addition, the interaction between slab curling (caused by the thermal gradient) and axle loads is assumed to be a critical factor for concrete pavement performance in terms of cracking. To verify the effect of aggregate CTE on slab cracking, a pavement engineer wants to conduct a simple observational study by collecting field pavement performance data on three different types of pavement. For this example, three types of aggregate (limestone, dolomite, and gravel) are being used in concrete pavement construction and yield the following CTEs:
• 4 in./in. per °F
• 5 in./in. per °F
• 6.5 in./in. per °F

It is necessary to compare more than two mean values. A simple one-way ANOVA is used to analyze the observed slab cracking performance of the three different concrete mixes with different aggregate types based on geology (limestone, dolomite, and gravel). All other factors that might cause variation in cracking are assumed to be held constant.

2. Identification and Description of Variables: The engineer identifies 1-mile sections of uniform pavement within the state highway network with similar attributes (aggregate type, slab thickness, joint spacing, traffic, and climate). Field performance, in terms of the observed percentage of slab cracked ("% slab cracked," i.e., how cracked each slab is) for each pavement section after about 20 years of service, is considered in the analysis. The available pavement data are grouped (stratified) based on the aggregate type (CTE value). The % slab cracked after 20 years is the dependent variable, while the CTE of the aggregates is the independent variable. The question is whether pavement sections having different types of aggregate (CTE values) exhibit similar performance, based on their variability.

3. Data Collection: From the data stratified by CTE, the engineer randomly selects nine pavement sections within each CTE category (i.e., 4, 5, and 6.5 in./in. per °F). The sample size is based on statistical power (1 − β) requirements. (For a discussion on sample size determination based on statistical power requirements, see NCHRP Project 20-45, Volume 2, Chapter 1, "Sample Size Determination.") The descriptive statistics for the data, organized by the three CTE categories, are shown in Table 15. The engineer considers pavement performance data for nine pavement sections in each CTE category.

4. Specification of Analysis Technique and Data Analysis: Because the engineer is concerned with the comparison of more than two mean values, the easiest way to make the statistical comparison is to perform a one-way ANOVA (see NCHRP Project 20-45, Volume 2, Chapter 4). The comparison will help determine whether the between-section variability is large relative to the within-section variability. More formally, the following hypotheses are tested:

HO: All mean values are equal (i.e., μ1 = μ2 = μ3).
HA: At least one of the means is different from the rest.

Although rejection of the null hypothesis gives the engineer some information concerning differences among the population means, it does not tell the engineer how the means differ from each other. For example, does μ1 differ from μ2 or μ3? To control the experiment-wise error rate (EER) for multiple mean comparisons, a conservative test—Tukey's procedure for unplanned comparisons—can be used. (Information about Tukey's procedure can be found in almost any good statistics textbook, such as those by Freund and Wilson [2003] and Kutner et al. [2005].) The F-statistic calculated for determining the effect of CTE on % slab cracked after 20 years is shown in Table 16.

Question/Issue
Compare the means of more than two samples. Specifically, is the cracking performance of concrete pavements designed using more than two different types of aggregates the same? Stated a bit differently, is the performance of the three different types of concrete pavement statistically different (are the mean performance measures different)?

CTE (in./in. per °F)  % Slab Cracked After 20 Years
4                     X̄1 = 37, s1 = 4.8, n1 = 9
5                     X̄2 = 53.7, s2 = 6.1, n2 = 9
6.5                   X̄3 = 72.5, s3 = 6.3, n3 = 9

Table 15. Pavement performance data.
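The error bars plotted in Figure 10 are simply per-group 95% confidence intervals, X̄ ± t(0.975, n − 1) · s/√n, and can be reproduced from the Table 15 summary statistics. The following is a brief sketch in standard-library Python; the t critical value of 2.306 is the standard two-sided 95% table value for df = 8:

```python
from math import sqrt

# Summary statistics from Table 15: (mean % slab cracked, std dev, n) per CTE group
groups = {4.0: (37.0, 4.8, 9), 5.0: (53.7, 6.1, 9), 6.5: (72.5, 6.3, 9)}

t_crit = 2.306  # two-sided 95% t critical value for df = n - 1 = 8

intervals = {}
for cte, (m, s, n) in groups.items():
    half = t_crit * s / sqrt(n)  # half-width of the 95% confidence interval
    intervals[cte] = (m - half, m + half)
    print(f"CTE {cte}: {m:.1f} +/- {half:.1f}")

# Non-overlapping intervals are the visual cue (as in Figure 10) that all
# three means differ; the formal test is still the ANOVA F-test.
non_overlapping = intervals[4.0][1] < intervals[5.0][0] < intervals[6.5][0]
print(non_overlapping)  # -> True
```

Here the three intervals are clearly separated, which previews the significant F-test result discussed in the interpretation step.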

The data in Table 16 have been produced by considering the original data and following the procedures presented in earlier examples. The emphasis in this example is on understanding what the table of results provides the researcher. Also in this example, the test for homogeneity of variances (the Levene test) shows no significant difference among the standard deviations of % slab cracked for the different CTE values. Figure 10 presents the means and associated 95% confidence intervals of the average % slab cracked (also called the mean and error bars) measured for the three CTE categories considered.

5. Interpreting the Results: A simple one-way ANOVA is conducted to determine whether there is a difference among the mean values of % slab cracked for different CTE values. The analysis shows that the F-statistic is significant (p-value < 0.05), meaning that at least two of the means are statistically significantly different from each other. To gain more insight, the engineer can use Tukey's procedure to compare the mean values specifically, or the engineer may simply observe the plotted 95% confidence intervals to ascertain which means are significantly different from each other (see Figure 10). The plotted results show that the mean % slab cracked varies significantly for different CTE values—there is no overlap between the different mean/error bars. Figure 10 also shows that the mean % slab cracked is significantly higher for pavement sections having a higher CTE value. (For more information about Tukey's procedure, see NCHRP Project 20-45, Volume 2, Chapter 4.)

6. Conclusion and Discussion: In this example, simple one-way ANOVA is used to assess the effect of CTE on the cracking performance of rigid pavements. The F-test for multiple means is used to formally test the (null) hypothesis of mean equality.
Source          Sum of Squares (SS)  Degrees of Freedom (df)  Mean Square (MS)  F     Significance
Between groups  5652.7               2                        2826.3            84.1  0.000
Within groups   806.9                24                       33.6
Total           6459.6               26

Table 16. ANOVA results.

Figure 10. Error bars for % slab cracked with different CTE.

The confidence interval plots for data from pavements having three different CTE values visually illustrate the statistical differences among the three means. The interpretation of the results will be misleading if the variances of

populations being compared for their mean difference are not equal or if a proper multiple mean comparisons procedure is not adopted. Based on the comparison of the three means in this example, the engineer can conclude that pavement slabs having aggregates with a higher CTE value will exhibit more cracking than those with lower CTE values, given that all other variables (e.g., climate effects) remain constant.

7. Applications in Other Areas of Transportation Research: Simple one-way ANOVA is widely used and can be employed whenever multiple means within a factor are to be compared with one another. Potential applications in other areas of transportation research include:
• Traffic Operations—to evaluate the effect of commuting time on level of service (LOS) of an urban highway. Mean travel times for three periods (e.g., morning, afternoon, and evening) could be selected for specified highway sections to collect the traffic volume and headway data in all lanes.
• Traffic Safety—to determine the effect of shoulder width on accident rates on rural highways. More than two shoulder widths (e.g., 0 feet, 6 feet, 9 feet, and 12 feet) should be selected in this study.
• Pavement Engineering—to investigate the impact of air void content on flexible pavement fatigue performance. Pavement sections having three or more air void contents (e.g., 3%, 5%, and 7%) in the surface HMA layer could be selected to compare their average fatigue cracking performance after the same period of service (e.g., 15 years).
• Materials—to study the effect of aggregate gradation on the rutting performance of flexible pavements. Three types of aggregate gradations (fine, intermediate, and coarse) could be adopted in the laboratory to make different HMA mix samples. Performance testing could be conducted in the laboratory to measure rut depths for a given number of load cycles.
Example 11: Pavements; Factorial Design (ANOVA Approach)

Area: Pavements

Method of Analysis: Factorial design (an ANOVA approach used to explore the effects of varying more than one independent variable)

1. Research Question/Problem Statement: Extending the information from Example 10 (a simple ANOVA example for pavements), the pavement engineer has verified that the coefficient of thermal expansion (CTE) in Portland cement concrete (PCC) is a critical factor affecting thermal behavior of PCC slabs in concrete pavements and significantly affects concrete pavement performance in terms of cracking. The engineer now wants to investigate the effects of another factor, joint spacing (JS), in addition to CTE. To study the combined effects of PCC CTE and JS on slab cracking, the engineer needs to conduct a factorial design study by collecting field pavement performance data. As before, three CTEs will be considered:
• 4 in./in. per °F,
• 5 in./in. per °F, and
• 6.5 in./in. per °F.

Now, three different joint spacings (12 ft, 16 ft, and 20 ft) also will be considered. For this example, it is necessary to compare multiple means within each factor (main effects) and the interaction between the two factors (interactive effects). The statistical technique involved is called a multifactorial two-way ANOVA.

2. Identification and Description of Variables: The engineer identifies uniform 1-mile pavement sections within the state highway network with similar attributes (e.g., slab thickness, traffic, and climate). The field performance, in terms of observed percentage of each slab cracked (% slab cracked) after about 20 years of service for each pavement section, is considered the

dependent (or response) variable in the analysis. The available pavement data are stratified based on CTE and JS. CTE and JS are considered the independent variables. The question is whether pavement sections having different CTE and JS exhibit similar performance based on their variability.

Question/Issue
Use collected data to determine the effects of varying more than one independent variable on some measured outcome. In this example, compare the cracking performance of concrete pavements considering two independent variables: (1) coefficients of thermal expansion (CTE) as measured using more than two types of aggregate and (2) differing joint spacing (JS).

More formally, the hypotheses can be stated as follows:
Ho: αi = 0, No difference in % slabs cracked for different CTE values.
Ho: γj = 0, No difference in % slabs cracked for different JS values.
Ho: (αγ)ij = 0, for all i and j, No difference in % slabs cracked for different CTE and JS combinations.

3. Data Collection: The descriptive statistics for % slab cracked data by three CTE and three JS categories are shown in Table 17. From the data stratified by CTE and JS, the engineer has randomly selected three pavement sections within each of the nine combinations of CTE and JS values. (In other words, for each of the nine pavement sections from Example 10, the engineer has selected three JS.)

4.
Specification of Analysis Technique and Data Analysis: The engineer can use two-way ANOVA test statistics to determine whether the between-section variability is large relative to the within-section variability for each factor to test the following null hypotheses:
• Ho: αi = 0
• Ho: γj = 0
• Ho: (αγ)ij = 0

Table 17. Summary of cracking data.

                     CTE (in./in. per °F)
Joint spacing (ft)   4                5                6.5              Marginal µ & σ
12                   x̄ = 32.4        x̄ = 46.8        x̄ = 65.3        x̄ = 48.2
                     s = 0.1          s = 1.8          s = 3.2          s = 14.4
16                   x̄ = 36.0        x̄ = 54.0        x̄ = 73.0        x̄ = 54.3
                     s = 2.4          s = 2.9          s = 1.1          s = 16.1
20                   x̄ = 42.7        x̄ = 60.3        x̄ = 79.1        x̄ = 60.7
                     s = 2.4          s = 0.5          s = 2.0          s = 15.9
Marginal µ & σ       x̄ = 37.0        x̄ = 53.7        x̄ = 72.5        x̄ = 54.4
                     s = 4.8          s = 6.1          s = 6.3          s = 15.8

Note: n = 3 in each cell; values are cell means and standard deviations.

As mentioned before, although rejection of the null hypothesis does give the engineer some information concerning differences among the population means (i.e., there are differences among them), it does not clarify which means differ from each other. For example, does µ1 differ from µ2 or µ3? To control the experiment-wise error rate (EER) for the comparison of multiple means, a conservative test—Tukey's procedure for an unplanned comparison—can be used. (Information about two-way ANOVA is available in NCHRP Project 20-45, Volume 2,
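Because the design is balanced (n = 3 per cell), the two-way ANOVA sums of squares can be reconstructed directly from the cell means and standard deviations in Table 17. The sketch below does this with NumPy; since the table values are rounded, the results only approximate the published ANOVA table:

```python
import numpy as np
from scipy import stats

n = 3  # observations per cell
# Cell means from Table 17 (rows = JS 12, 16, 20 ft; columns = CTE 4, 5, 6.5)
means = np.array([[32.4, 46.8, 65.3],
                  [36.0, 54.0, 73.0],
                  [42.7, 60.3, 79.1]])
# Cell standard deviations from Table 17
sds = np.array([[0.1, 1.8, 3.2],
                [2.4, 2.9, 1.1],
                [2.4, 0.5, 2.0]])

r, c = means.shape
grand = means.mean()

# Sums of squares for a balanced two-factor design
ss_js = n * c * ((means.mean(axis=1) - grand) ** 2).sum()    # row main effect
ss_cte = n * r * ((means.mean(axis=0) - grand) ** 2).sum()   # column main effect
ss_cells = n * ((means - grand) ** 2).sum()
ss_int = ss_cells - ss_cte - ss_js                           # interaction
ss_err = ((n - 1) * sds ** 2).sum()                          # within-cell variation

df_cte, df_js = c - 1, r - 1
df_int, df_err = (r - 1) * (c - 1), r * c * (n - 1)
ms_err = ss_err / df_err
for name, ss, df in [("CTE", ss_cte, df_cte), ("JS", ss_js, df_js),
                     ("CTE x JS", ss_int, df_int)]:
    f = (ss / df) / ms_err
    p = stats.f.sf(f, df, df_err)
    print(f"{name:8s} SS={ss:8.2f} df={df} F={f:8.2f} p={p:.3f}")
```

With the rounded table values this reproduces SS for error (77.76) exactly and gives large, significant F-statistics for both main effects; the interaction remains insignificant, although rounding of the cell means inflates its sum of squares somewhat relative to the published value.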

Chapter 4. Information about Tukey's procedure can be found in almost any good statistics textbook, such as those by Freund and Wilson [2003] and Kutner et al. [2005].) The results of the two-way ANOVA are shown in Table 18. From the first two lines it can be seen that both of the main effects, CTE and JS, are significant in explaining cracking behavior (i.e., both p-values < 0.05). However, the interaction (CTE × JS) is not significant (i.e., the p-value is 0.999, much greater than 0.05). Also, the test for homogeneity of variances (Levene statistic) shows that there is no significant difference among the standard deviations of % slab cracked for different CTE and JS values. Figure 11 illustrates the main and interactive effects of CTE and JS on % slabs cracked.

5. Interpreting the Results: A two-way (multifactorial) ANOVA is conducted to determine whether differences exist among the mean values of % slab cracked for different CTE and JS values. The analysis shows that the main effects of both CTE and JS are significant, while the interaction effect is insignificant (p-value > 0.05). These results show that CTE and JS each significantly affect slab cracking, but their effects do not depend on one another. Given these results, the conclusions will be based on the main effects alone, without considering interaction effects. In fact, had the interaction effect been significant, the conclusions would have been based on it. To gain more insight, the engineer can use Tukey's procedure to compare specific multiple means within each factor, or the engineer can simply observe the plotted means in Figure 11 to ascertain which means are significantly different from each other. The plotted results show that the mean % slab cracked varies significantly for different CTE and JS values; the CTE appears to be more influential than JS.
Table 18. ANOVA results.

Source           Sum of Squares (SS)   Degrees of Freedom (df)   Mean Square (MS)   F        Significance
CTE              5677.74                2                        2838.87            657.16   0.000
JS                703.26                2                         351.63             81.40   0.000
CTE × JS            0.12                4                           0.03              0.007  0.999
Residual/error     77.76               18                           4.32
Total            6458.88               26

Figure 11. Main and interaction effects of CTE and JS on slab cracking. (Two panels: "Main Effects Plot (data means) for Cracking" and "Interaction Plot (data means) for Cracking," plotting mean % slabs cracked against CTE and joint spacing.)

All lines are almost parallel to

each other when plotted for both factors together, showing no interactive effects between the levels of the two factors.

6. Conclusion and Discussion: The two-way ANOVA can be used to verify the combined effects of CTE and JS on the cracking performance of rigid pavements. The marginal mean plot for cracking at the three CTE and three JS levels visually illustrates the differences in the multiple means. The plot of cell means for cracking within the levels of each factor can indicate the presence of an interactive effect between the two factors (in this example, CTE and JS). However, the F-test for multiple means should be used to formally test the hypothesis of mean equality. Finally, based on the comparison of the three means within each factor (CTE and JS), the engineer can conclude that pavement slabs with higher-CTE aggregates and longer joint spacings will exhibit more cracking than those with lower CTE and JS values. In this example, the effect of CTE on concrete pavement cracking appears to be more critical than that of JS.

7. Applications in Other Areas of Transportation Research: Multifactorial designs can be used when more than one factor is considered in a study. Possible applications of these methods extend to all transportation-related areas, including:
• Pavement Engineering
– to determine the effects of base type and base thickness on pavement performance of flexible pavements. Two or more levels can be considered within each factor; for example, two base types (aggregate and asphalt-treated bases) and three base thicknesses (8 inches, 12 inches, and 18 inches).
– to investigate the impact of pavement surface conditions and vehicle type on fuel consumption. The researcher can select pavement sections with three levels of ride quality (smooth, rough, and very rough) and three types of vehicles (cars, vans, and trucks).
Fuel consumption can be measured for each vehicle type on all surface conditions to determine their impact.
• Materials
– to study the effects of aggregate gradation and aggregate surface on the tensile strength of hot-mix asphalt (HMA). The engineer can evaluate two levels of gradation (fine and coarse) and two types of aggregate surfaces (smooth and rough). The samples can be prepared for all the combinations of aggregate gradations and surfaces for determination of tensile strength in the laboratory.
– to compare the impact of curing and cement types on the compressive strength of a concrete mixture. The engineer can design concrete mixes in the laboratory utilizing two cement types (Type I and Type III). The concrete samples can be cured in three different ways (normal curing, water bath, and room temperature) and tested at 24 hours and 7 days.

Example 12: Work Zones; Simple Before-and-After Comparisons

Area: Work zones

Method of Analysis: Simple before-and-after comparisons (exploring the effect of some treatment before it is applied versus after it is applied)

1. Research Question/Problem Statement: The crash rate in work zones has been found to be higher than the crash rate on the same roads when a work zone is not present. For this reason, the speed limit in construction zones often is set lower than the prevailing non-work-zone speed limit. The state DOT decides to implement photo-radar speed enforcement in a work zone to determine if this speed-enforcement technique reduces the average speed of free-flowing vehicles in the traffic stream. They measure the speeds of a sample of free-flowing vehicles prior to installing the photo-radar speed-enforcement equipment in a work zone and

then measure the speeds of free-flowing vehicles at the same location after implementing the photo-radar system.

Question/Issue
Use collected data to determine whether a difference exists between results before and after some treatment is applied. For this example, does a photo-radar speed-enforcement system reduce the speed of free-flowing vehicles in a work zone, and, if so, is the reduction statistically significant?

2. Identification and Description of Variables: The variable to be analyzed is the mean speed of vehicles before and after the implementation of a photo-radar speed-enforcement system in a work zone.

3. Data Collection: The speeds of individual free-flowing vehicles are recorded for 30 minutes on a Tuesday between 10:00 a.m. and 10:30 a.m. before installing the photo-radar system. After the system is installed, the speeds of individual free-flowing vehicles are recorded for 30 minutes on a Tuesday between 10:00 a.m. and 10:30 a.m. The before sample contains 120 observations and the after sample contains 100 observations.

4. Specification of Analysis Technique and Data Analysis: A test of the significance of the difference between two means requires a statement of the hypothesis to be tested (Ho) and a statement of the alternate hypothesis (H1). In this example, these hypotheses can be stated as follows:

Ho: There is no difference in the mean speed of free-flowing vehicles before and after the photo-radar speed-enforcement system is deployed.
H1: There is a difference in the mean speed of free-flowing vehicles before and after the photo-radar speed-enforcement system is deployed.

Because these two samples are independent, a simple t-test is appropriate to test the stated hypotheses. This test requires the following procedure:

Step 1.
Compute the mean speed (x̄) for the before sample (x̄b) and the after sample (x̄a) using the following equation:

x̄ = (Σ xi) / n, where nb = 120 and na = 100

Results: x̄b = 53.1 mph and x̄a = 50.5 mph.

Step 2. Compute the variance (S²) for each sample using the following equation:

S² = Σ (xi − x̄)² / (n − 1)

where na = 100; x̄a = 50.5 mph; nb = 120; and x̄b = 53.1 mph

Results: S²b = Σ (xb − x̄b)² / (nb − 1) = 12.06 and S²a = Σ (xa − x̄a)² / (na − 1) = 12.97.

Step 3. Compute the pooled variance of the two samples using the following equation:

S²p = [Σ (xa − x̄a)² + Σ (xb − x̄b)²] / (na + nb − 2)

Results: S²p = 12.472 and Sp = 3.532.

Step 4. Compute the t-statistic using the following equation:

t = (x̄b − x̄a) / [Sp √((na + nb) / (na nb))]

Result: t = (53.1 − 50.5) / [3.532 √((100 + 120) / (100 × 120))] = 5.43.

5. Interpreting the Results: The results of the t-test are obtained by comparing the value of the calculated t-statistic (5.43 in this example) with the critical value of the t-statistic for the level of confidence desired. For a level of confidence of 95%, the calculated t-statistic must be greater than 1.96 to reject the null hypothesis (Ho) that the use of a photo-radar speed-enforcement system does not change the speed of free-flowing vehicles. (For more information, see NCHRP Project 20-45, Volume 2, Appendix C, Table C-4.)

6. Conclusion and Discussion: The sample problem illustrates the use of a statistical test to determine whether the difference in the value of the variable of interest between the before condition and the after condition is statistically significant. The before condition is without photo-radar speed enforcement; the after condition is with photo-radar speed enforcement. In this sample problem, the computed t-statistic (5.43) is greater than the critical t-statistic (1.96), so the null hypothesis is rejected. This means the change in the speed of free-flowing vehicles when the photo-radar speed-enforcement system is used is statistically significant. The assumption is made that all other factors that would affect the speed of free-flowing vehicles (e.g., traffic mix, weather, or construction activity) are the same in the before and after conditions. This test is robust if the normality assumption does not hold completely; however, normality should be checked using box plots. For significant departures from the normality and variance equality assumptions, non-parametric tests must be conducted. (For more information, see NCHRP Project 20-45, Volume 2, Chapter 6, Section C; see also Example 21.)
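The four steps of this procedure can be reproduced from the summary statistics alone. A minimal sketch, using the reported means, variances, and sample sizes (the raw speed observations are not given in the example):

```python
import math

# Summary statistics from Steps 1 and 2
n_b, xbar_b, var_b = 120, 53.1, 12.06   # before sample
n_a, xbar_a, var_a = 100, 50.5, 12.97   # after sample

# Step 3: pooled variance (each sample weighted by its degrees of freedom)
var_p = ((n_b - 1) * var_b + (n_a - 1) * var_a) / (n_a + n_b - 2)
s_p = math.sqrt(var_p)

# Step 4: t-statistic for the difference between two independent means
t = (xbar_b - xbar_a) / (s_p * math.sqrt((n_a + n_b) / (n_a * n_b)))
print(f"pooled SD = {s_p:.3f}, t = {t:.2f}")
# t is close to the 5.43 computed above (small differences are rounding),
# exceeding the 95% critical value of about 1.96, so Ho is rejected.
```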
The reliability of the results in this example could be improved by using a control group. As the example has been constructed, there is an assumption that the only thing that changed at this site was the use of photo-radar speed enforcement; that is, it is assumed that all observed differences are attributable to the use of the photo-radar. If other factors—even something as simple as a general decrease in vehicle speeds in the area—might have impacted speed changes, the effect of the photo-radar speed enforcement would have to be adjusted for those other factors. Measurements taken at a control site (ideally identical to the experiment site) during the same time periods could be used to detect background changes and then to adjust the photo-radar effects. Such a situation is explored in Example 13.

7. Applications in Other Areas of Transportation Research: The before-and-after comparison can be used whenever two independent samples of data are (or can be assumed to be) normally distributed with equal variance. Applications of before-and-after comparison in other areas of transportation research may include:
• Traffic Operations
– to compare the average delay to vehicles approaching a signalized intersection when a fixed-time signal is changed to an actuated signal or a traffic-adaptive signal.
– to compare the average number of vehicles entering and leaving a driveway when access is changed from full access to right-in, right-out only.
• Traffic Safety
– to compare the average number of crashes on a section of road before and after the road is resurfaced.
– to compare the average number of speeding citations issued per day when a stationary operation is changed to a mobile operation.
• Maintenance—to compare the average number of citizen complaints per day when a change is made in the snow plowing policy.

Example 13: Traffic Safety; Complex Before-and-After Comparisons and Controls

Area: Traffic safety

Method of Analysis: Complex before-and-after comparisons using control groups (examining the effect of some treatment or application with consideration of other factors that may also have an effect)

1. Research Question/Problem Statement: A state safety engineer wants to estimate the effectiveness of fluorescent orange warning signs as compared to standard orange signs in work zones on freeways and other multilane highways. Drivers can see fluorescent signs from a longer distance than standard signs, especially in low-visibility conditions, and the extra cost of the fluorescent material is not too high. Work-zone safety is a perennial concern, especially on freeways and multilane highways where speeds and traffic volumes are high.

Question/Issue
How can background effects be separated from the effects of a treatment or application? Compared to standard orange signs, do fluorescent orange warning signs increase safety in work zones on freeways and multilane highways?

2. Identification and Description of Variables: The engineer quickly concludes that there is a need to collect and analyze safety surrogate measures (e.g., traffic conflicts and late lane changes) rather than collision data. It would take a long time and require experimentation at many work zones before a large sample of collision data could be ready for analysis on this question. Surrogate measures relate to collisions, but they are much more numerous, and it is easier to collect a large sample of them in a short time. For a study of traffic safety, surrogate measures might include near-collisions (traffic conflicts), vehicle speeds, or locations of lane changes. In this example, the engineer chooses to use the location of the lane-change maneuver made by drivers in a lane to be closed entering a work zone.
This particular surrogate safety measure is a measure of effectiveness (MOE). The hypothesis is that the farther downstream at which a driver makes a lane change out of a lane to be closed—when the highway is still below capacity—the safer the work zone. 3. Data Collection: The engineer establishes site selection criteria and begins examining all active work zones on freeways and multilane highways in the state for possible inclusion in the study. The site selection criteria include items such as an active work zone, a cooperative contractor, no interchanges within the approach area, and the desired lane geometry. Seven work zones meet the criteria and are included in the study. The engineer decides to use a before-and-after (sometimes designated B/A or b/a) experiment design with randomly selected control sites. The latter are sites in the same population as the treatment sites; that is, they meet the same selection criteria but are untreated (i.e., standard warning signs are employed, not the fluorescent orange signs). This is a strong experiment design because it minimizes three common types of bias in experiments: history, maturation, and regression to the mean. History bias exists when changes (e.g., new laws or large weather events) happen at about the same time as the treatment in an experiment, so that the engineer or analyst cannot separate the effect of the treatment from the effects of the other events. Maturation bias exists when gradual changes occur throughout an extended experiment period and cannot be separated from the effects of the treatment. Examples of maturation bias might involve changes like the aging of driver populations or new vehicles with more air bags. History and maturation biases are referred to as specification errors and are described in more detail in NCHRP Project 20-45, Volume 2,

Chapter 1, in the section "Quasi-Experiments." Regression-to-the-mean bias exists when sites with the highest MOE levels in the before time period are treated. If the MOE level falls in the after period, the analyst can never be sure how much of the fall was due to the treatment and how much was due to natural fluctuation of the MOE back toward its usual mean value. A before-and-after study with randomly selected control sites minimizes these biases because their effects are expected to apply just as much to the treatment sites as to the control sites.

In this example, the engineer randomly selects four of the seven work zones to receive fluorescent orange signs. The other three randomly selected work zones receive standard orange signs and serve as the control sites. After the signs have been in place for a few weeks (a common tactic in before-and-after studies to allow regular drivers to get used to the change), the engineer collects data at all seven sites. The location of each vehicle's lane-change maneuver out of the lane to be closed is measured from video recorded for several hours at each site. Table 19 shows the lane-change data at the midpoint between the first warning sign and the beginning of the taper. Notice that essentially the same total number of vehicles is observed in the before and after periods.

4. Specification of Analysis Technique and Data Analysis: Depending on their format, data from a before-and-after experiment with control sites may be analyzed in several ways. The data in the table lend themselves to analysis with a chi-square test to see whether the distributions between the before and after conditions are the same at both the treatment and control sites.
(For more information about chi-square testing, see NCHRP Project 20-45, Volume 2, Chapter 6, Section E, "Chi-Square Test for Independence.") To perform the chi-square test on the data for Example 13, the engineer first computes the expected value in each cell. For the cell corresponding to the before time period for control sites, this value is computed as the row total (3361) times the column total (2738) divided by the grand total (6714):

(3361 × 2738) / 6714 = 1371 vehicles

The engineer next computes the chi-square value for each cell using the following equation:

χ²i = (Oi − Ei)² / Ei

where Oi is the number of actual observations in cell i and Ei is the expected number of observations in cell i. For example, the chi-square value in the cell corresponding to the before time period for control sites is (1262 − 1371)² / 1371 = 8.6. The engineer then sums the chi-square values from all four cells to get 29.1. That sum is then compared to the critical chi-square value for the significance level of 0.025 with 1 degree of freedom [degrees of freedom = (number of rows − 1) × (number of columns − 1)], which is shown on a standard chi-square distribution table to be 5.02 (see NCHRP Project 20-45, Volume 2, Appendix C, Table C-2). A significance level of 0.025 is not uncommon in such experiments (although 0.05 is a general default value), but it is a standard that is difficult but not impossible to meet.

Table 19. Lane-change data for before-and-after comparison using controls.

Time Period   Number of Vehicles Observed in Lane to be Closed at Midpoint
              Control     Treatment     Total
Before        1262        2099          3361
After         1476        1877          3353
Total         2738        3976          6714
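The cell-by-cell hand calculation above can be checked with SciPy. A sketch using the counts from Table 19 (passing correction=False matches the hand calculation, which does not apply Yates' continuity correction):

```python
import numpy as np
from scipy import stats

# Observed counts from Table 19 (rows: before/after; columns: control/treatment)
observed = np.array([[1262, 2099],
                     [1476, 1877]])

chi2, p, dof, expected = stats.chi2_contingency(observed, correction=False)
print(f"chi-square = {chi2:.1f}, df = {dof}, p = {p:.6f}")
print(f"expected count, before/control = {expected[0, 0]:.0f}")

# The chi-square statistic (about 29.1) exceeds the critical value of
# 5.02 at the 0.025 significance level with 1 degree of freedom, so the
# before/after distributions differ between treatment and control sites.
```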

5. Interpreting the Results: Because the calculated chi-square value is greater than the critical chi-square value, the engineer concludes that there is a statistically significant difference in the number of vehicles in the lane to be closed at the midpoint between the before and after time periods for the treatment sites relative to what would be expected based on the control sites. In other words, there is a difference that is due to the treatment.

6. Conclusion and Discussion: The experiment results show that fluorescent orange signs in work zone approaches like those tested would likely have a safety benefit. Although the engineer cannot reasonably estimate the number of collisions that would be avoided by using this treatment, the before-and-after study with control using a safety surrogate measure makes it clear that some collisions will be avoided. The strength of the experiment design with randomly selected control sites means that agencies can have confidence in the results.

The consequences of an error in an analysis like this that results in the wrong conclusion can be devastating. If the error leads an agency to use a safety measure more than it should, precious safety funds will be wasted that could be put to better use. If the error leads an agency to use the safety measure less often than it should, money will be spent on measures that do not prevent as many collisions. With safety funds in such short supply, solid analyses that lead to effective decisions on countermeasure deployment are of great importance.

A before-and-after experiment with control is difficult to arrange in practice. Such an experiment is practically impossible using collision data, because that would mean leaving some higher-collision sites untreated during the experiment. Such experiments are more plausible using surrogate measures like the one described in this example.

7.
Applications in Other Areas of Transportation Research: Before-and-after experiments with randomly selected control sites are difficult to arrange in transportation safety and other areas of transportation research. The instinct to apply treatments to the worst sites, rather than randomly—as this method requires—is difficult to overcome. Despite the difficulties, such experiments are sometimes performed in:
• Traffic Operations—to test traffic control strategies at a number of different intersections.
• Pavement Engineering—to compare new pavement designs and maintenance processes to current designs and practice.
• Materials—to compare new materials, mixes, or processes to standard mixtures or processes.

Example 14: Work Zones; Trend Analysis

Area: Work zones

Method of Analysis: Trend analysis (examining, describing, and modeling how something changes over time)

1. Research Question/Problem Statement: Measurements conducted over time often reveal patterns of change called trends. A model may be used to predict some future measurement, or the relative success of a different treatment or policy may be assessed. For example, work/construction zone safety has been a concern for highway officials, engineers, and planners for many years. Is there a pattern of change?

Question/Issue
Can a linear model represent change over time? In this particular example, is there a trend over time for motor vehicle crashes in work zones? The problem is to predict values of crash frequency at specific points in time. Although the question is simple, the statistical modeling becomes sophisticated very quickly.

2. Identification and Description of Variables: Highway safety, or rather the lack of it, is revealed by the total number of fatalities due to motor vehicle crashes. The percentage of those deaths occurring in work zones reveals a pattern over time (Figure 12). The data points for the graph are calculated using the following equation:

WZP = a + b(YEAR) + u

where
WZP = work zone percentage of total fatalities,
YEAR = calendar year, and
u = an error term, as used here.

3. Data Collection: The base data are obtained from the Fatality Analysis Reporting System maintained by the National Highway Traffic Safety Administration (NHTSA), as reported at www.workzonesafety.org. The data are state specific as well as for the country as a whole, and cover a period of 26 years from 1982 through 2007. The numbers of fatalities from motor vehicle crashes in and not in construction/maintenance zones (work zones) are used to compute the percentage of fatalities in work zones for each of the 26 years.

4. Specification of Analysis Techniques and Data Analysis: Ordinary least squares (OLS) regression is used to develop the general model specified above. The discussion in this example focuses on the resulting model and the related statistics. (See also Examples 15, 16, and 17 for details on calculations. For more information about OLS regression, see NCHRP Project 20-45, Volume 2, Chapter 4, Section B, "Linear Regression.") Looking at the data in Figure 12 another way, the estimated model is:

WZP = −91.523 + 0.047(YEAR)
      (−8.34)    (8.51)      t-values
      (0.000)    (0.000)     p-values
R = 0.867, R² = 0.751

The trend is significant: the line (trend) shows an increase of 0.047% each year. Generally, this trend shows that work-zone fatalities are increasing as a percentage of total fatalities.

5.
Interpreting the Results: The model fits the data well and generally shows that work-zone fatalities were an increasing problem over the period 1982 through 2007. This is a trend that highway officials, engineers, and planners would like to change. The analyst is therefore interested in anticipating the trajectory of the trend. Here the trend suggests that things are getting worse.

Figure 12. Percentage of all motor vehicle fatalities occurring in work zones.
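The OLS fit itself can be sketched with NumPy. The yearly percentages below are synthetic stand-ins (the FARS-derived series is not reproduced here), generated to follow a linear trend so that the mechanics of estimating and interpreting a and b are clear:

```python
import numpy as np

# Hypothetical WZP series: a linear trend plus noise, standing in for
# the 1982-2007 FARS-based percentages (not the actual data).
years = np.arange(1982, 2008)
rng = np.random.default_rng(42)
wzp = -91.523 + 0.047 * years + rng.normal(0.0, 0.05, years.size)

# OLS fit of WZP = a + b*YEAR (polyfit returns slope first for degree 1)
b, a = np.polyfit(years, wzp, deg=1)
print(f"intercept a = {a:.3f}, slope b = {b:.4f}")

# The slope b is the trend: the predicted change in WZP per year.
# Extrapolating to YEAR = 0 (the intercept) is meaningless, as the
# discussion of the -91.5% intercept in the text illustrates.
pred_2007 = a + b * 2007
print(f"predicted WZP in 2007: {pred_2007:.2f}%")
```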

How far might authorities let things go—5%? 10%? 25%? Caution must be exercised when interpreting a trend beyond the limits of the available data. Technically the slope, or b-coefficient, is the trend of the relationship. The a-term from the regression, also called the intercept, is the value of WZP when the independent variable equals zero. The intercept for the trend in this example would technically indicate that the percentage of motor vehicle fatalities in work zones in the year zero would be −91.5%. This is absurd on many levels: there could be no motor vehicles in year zero, and what is a negative percentage of the total? The absurdity of the intercept in this example reveals that trends are limited concepts, limited to a relevant time frame.

Figure 12 also suggests that the trend, while valid for the 26 years in aggregate, does not work very well for the last 5 years, during which the percentages are consistently falling, not rising. Something seems to have changed around 2002; perhaps the highway officials, engineers, and planners took action to change the trend, in which case the trend reversal would be considered a policy success.

Finally, some underlying assumptions must be considered. For example, there is an implicit assumption that the types of roads with construction zones are similar from year to year. If this assumption is not correct (e.g., if a greater number of high-speed roads, where fatalities may be more likely, are worked on in some years than in others), then interpreting the trend may not make much sense.

6. Conclusion and Discussion: The computation of this dependent variable (the percentage of motor vehicle fatalities occurring in work zones, or WZP) is influenced by changes in the number of work-zone fatalities and the number of non-work-zone fatalities. To some extent, both of these are random variables.
Accordingly, it is difficult to distinguish a trend or trend reversal from a short series of possibly random movements in the same direction. Statistically, more observations permit greater confidence in non-randomness. It is also possible that a data series might be recorded that contains regular, non-random movements that are unrelated to a trend. Consider the dependent variable above (WZP), but measured using monthly data instead of annual data. Further, imagine looking at such data for a state in the upper Midwest instead of for the nation as a whole. In this new situation, the WZP might fall off or halt altogether each winter (when construction and maintenance work are minimized), only to rise again in the spring (reflecting renewed work-zone activity). This change is not a trend per se, nor is it random. Rather, it is cyclical.

7. Applications in Other Areas of Transportation Research: Applications of trend analysis models in other areas of transportation research include:
• Transportation Safety—to identify trends in traffic crashes (e.g., motor vehicle/deer) over time on some part of the roadway system (e.g., freeways).
• Public Transportation—to determine the trend in rail passenger trips over time (e.g., in response to increasing gas prices).
• Pavement Engineering—to monitor the number of miles of pavement that is below some service-life threshold over time.
• Environment—to monitor the hours of truck idling time in rest areas over time.

Example 15: Structures/Bridges; Trend Analysis
Area: Structures/bridges
Method of Analysis: Trend analysis (examining a trend over time)

1. Research Question/Problem Statement: A state agency wants to monitor trends in the condition of bridge superstructures in order to perform long-term needs assessment for bridge rehabilitation or replacement. Bridge condition rating data will be analyzed for bridge

superstructures that have been inspected over a period of 15 years. The objective of this study is to examine the overall pattern of change in the indicator variable over time.

Question/Issue
Use collected data to determine if the values that some variables have taken show an increasing trend or a decreasing trend over time. In this example, determine if levels of structural deficiency in bridge superstructures have been increasing or decreasing over time, and determine how rapidly the increase or decrease has occurred.

2. Identification and Description of Variables: Bridge inspection generally entails collection of numerous variables including location information, traffic data, structural elements (type and condition), and functional characteristics. Based on the severity of deterioration and the extent of spread through a bridge component, a condition rating is assigned on a discrete scale from 0 (failed) to 9 (excellent). Generally a condition rating of 4 or below indicates deficiency in a structural component. The state agency inspects approximately 300 bridges every year (denominator). The number of superstructures that receive a rating of 4 or below each year (number of events, numerator) also is recorded. The agency is concerned with the change in overall rate (calculated per 100) of structurally deficient bridge superstructures. This rate, which is simply the ratio of the numerator to the denominator, is the indicator (dependent variable) to be examined for trend over a time period of 15 years. Notice that the unit of analysis is the time period and not the individual bridge superstructures.

3. Data Collection: Data are collected for bridges scheduled for inspection each year. It is important to note that the bridge condition rating scale is based on subjective categories, and therefore there may be inherent variability among inspectors in their assignments of rates to bridge superstructures. Also, it is assumed that during the time period for which the trend analysis is conducted, no major changes are introduced in the bridge inspection methods. Sample data provided in Table 20 show the rate (per 100), number of bridges per year that received a score of four or below, and total number of bridges inspected per year.

No.   Year   Rate (per 100)   Number of Events (Numerator)   Number of Bridges Inspected (Denominator)
1     1990    8.33            25                             300
2     1991    8.70            26                             299
5     1994   10.54            31                             294
11    2000   13.55            42                             310
15    2004   14.61            45                             308

Table 20. Sample bridge inspection data.

4. Specification of Analysis Technique and Data Analysis: The data set consists of 15 observations, one for each year. Figure 13 shows a scatter plot of the rate (dependent variable) versus time in years. The scatter plot does not indicate the presence of any outliers. The scatter plot shows a seemingly increasing linear trend in the rate of deficient superstructures over time. No need for data transformation or smoothing is apparent from the examination of the scatter plot in Figure 13. To determine whether the apparent linear trend is statistically significant in this data, ordinary least squares (OLS) regression can be employed.
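The rate column in Table 20 is just the numerator divided by the denominator, scaled per 100; a minimal sketch using three of the sampled years:

```python
def deficiency_rate(events, inspected):
    """Structurally deficient superstructures per 100 bridges inspected."""
    return 100.0 * events / inspected

# Rows from Table 20: (year, number of events, number inspected)
rows = [(1990, 25, 300), (1991, 26, 299), (1994, 31, 294)]
rates = [round(deficiency_rate(e, n), 2) for _, e, n in rows]
```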

The linear regression model takes the following form:

y_i = β0 + β1·x_i + e_i

where
i = 1, 2, . . . , n (n = 15 in this example),
y = dependent variable (rate of structurally deficient bridge superstructures),
x = independent variable (time),
β0 = y-intercept (only provides a reference point),
β1 = slope (change in y for a unit change in x), and
e_i = residual error.

The first step is to estimate β0 and β1 in the regression function. The residual errors (e_i) are assumed to be independently and identically distributed (i.e., they are mutually independent and have the same probability distribution). The estimates β̂1 and β̂0 can be computed using the following equations:

β̂1 = Σ(x_i − x̄)(y_i − ȳ) / Σ(x_i − x̄)² = 0.454

β̂0 = ȳ − β̂1·x̄ = 8.396

where ȳ is the overall mean of the dependent variable, x̄ is the overall mean of the independent variable, and the sums run over i = 1, . . . , n. The prediction equation for the rate of structurally deficient bridge superstructures over time can be written as:

ŷ = β̂0 + β̂1·x = 8.396 + 0.454x

That is, as time increases by a year, the rate of structurally deficient bridge superstructures increases by 0.454 per 100 bridges. The plot of the regression line is shown in Figure 14. Figure 14 indicates some small variability about the regression line. To conduct hypothesis testing for the regression relationship (H0: β1 = 0), assessment of this variability and the assumption of normality would be required. (For a discussion on assumptions for residual errors, see NCHRP Project 20-45, Volume 2, Chapter 4.)

Figure 13. Scatter plot of time versus rate.

Like analysis of variance (ANOVA, described in examples 8, 9, and 10), statistical inference is initiated by partitioning the total sum of squares (TSS) into the error sum of squares (SSE)

and the model sum of squares (SSR). That is, TSS = SSE + SSR. The TSS is defined as the sum of the squares of the difference of each observation from the overall mean. In other words, deviation of observation from overall mean (TSS) = deviation of observation from prediction (SSE) + deviation of prediction from overall mean (SSR). For this example,

TSS = Σ(y_i − ȳ)² = 60.892

SSR = β̂1²·Σ(x_i − x̄)² = 57.790

SSE = TSS − SSR = 3.102

Regression analysis computations are usually summarized in a table (see Table 21). The mean squared errors (MSR, MSE) are computed by dividing the sums of squares by the corresponding model and error degrees of freedom. If the null hypothesis (H0: β1 = 0) is true, the expected value of MSR is equal to the expected value of MSE, such that F = MSR/MSE should be a random draw from an F-distribution with 1, n − 2 degrees of freedom. From the regression shown in Table 21, F is computed to be 242.143, and the probability of getting a value larger than the computed F is extremely small. Therefore, the null hypothesis is rejected; that is, the slope is significantly different from zero, and the linearly increasing trend is found to be statistically significant. Notice that a slope of zero implies that knowing a value of the independent variable provides no insight on the value of the dependent variable.

5. Interpreting the Results: The linear regression model does not imply any cause-and-effect relationship between the independent and dependent variables. The y-intercept only provides a reference point, and the relationship need not be linear outside the data range.
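The slope and intercept formulas, the sum-of-squares partition, and the interval estimate reported in this example can be sketched as follows. This is an illustration, not the original analysis: the small data set for ols_fit is synthetic, Sxx = 280 assumes the 15 years are coded 1 through 15, and the t critical value is taken from a standard t-table.

```python
import math

def ols_fit(x, y):
    """Least-squares estimates: b1 = Sxy/Sxx, b0 = ybar - b1*xbar."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return ybar - b1 * xbar, b1

# Tiny check on synthetic data that lie exactly on y = 8 + 0.5x
b0, b1 = ols_fit([1, 2, 3, 4, 5], [8.5, 9.0, 9.5, 10.0, 10.5])

# ANOVA partition and F ratio, using the sums of squares reported in the example
n = 15
TSS = 60.892               # total sum of squares
SSR = 57.790               # model (regression) sum of squares
SSE = TSS - SSR            # error sum of squares: 3.102
MSR = SSR / 1              # regression df = 1
MSE = SSE / (n - 2)        # error df = 13
F = MSR / MSE              # compare against F(1, 13)
R2 = SSR / TSS             # coefficient of determination

# 95% confidence interval for the slope, assuming the years are coded 1..15
slope = 0.454
Sxx = 280.0                # sum((x - xbar)^2) for x = 1, ..., 15
t_crit = 2.160             # t(0.975, 13 df), from a t-table
half_width = t_crit * math.sqrt(MSE / Sxx)
ci = (round(slope - half_width, 3), round(slope + half_width, 3))
```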
Figure 14. Plot of regression line (y = 8.396 + 0.454x, R² = 0.949).

Source       Sum of Squares (SS)   Degrees of Freedom (df)   Mean Square    F         Significance
Regression   57.790                 1                        57.790 (MSR)   242.143   8.769e-10
Error         3.102                13                         0.239 (MSE)
Total        60.892                14

Table 21. Analysis of regression table.

The 95% confidence interval for β1 is computed as [0.391, 0.517]; that is, the analyst is 95% confident that the true mean increase in the rate of structurally deficient bridge superstructures is between

0.391% and 0.517% per year. (For a discussion on computing confidence intervals, see NCHRP Project 20-45, Volume 2, Chapter 4.)

The coefficient of determination (R²) provides an indication of the model fit. For this example, R² is calculated using the following equation:

R² = 1 − SSE/TSS = 0.949

The R² indicates that the regression model accounts for 94.9% of the total variation in the (hypothetical) data. It should be noted that such a high value of R² is almost impossible to attain from analysis of real observational data collected over a long time. Also, distributional assumptions must be checked before proceeding with linear regression, as serious violations may indicate the need for data transformation, use of non-linear regression or non-parametric methods, and so on.

6. Conclusion and Discussion: In this example, simple linear regression has been used to determine the trend in the rate of structurally deficient bridge superstructures in a geographic area. In addition to assessing the overall patterns of change, trend analysis may be performed to:
• study the levels of indicators of change (or dependent variables) in different time periods to evaluate the impact of technical advances or policy changes;
• compare different geographic areas or different populations with perhaps varying degrees of exposure in absolute and relative terms; and
• make projections to monitor progress toward an objective.

However, given the dynamic nature of trend data, many of these applications require more sophisticated techniques than simple linear regression. An important aspect of examining trends over time is the accuracy of numerator and denominator data. For example, bridge structures may be examined more than once during the analysis time period, and retrofit measures may be taken at some deficient bridges. Also, the age of structures is not accounted for in this analysis.
For the purpose of this example, it is assumed that these (and other similar) effects are negligible and do not confound the data. In real-life applications, however, if the analysis time period is very long, it becomes extremely important to account for changes in factors that may have affected the dependent variable(s) and their measurement. Examples of such factors could be changes in the volume of heavy trucks using the bridge, changes in maintenance policies, or changes in plowing and salting regimes.

7. Applications in Other Areas of Transportation Research: Trend analysis is carried out in many areas of transportation research, such as:
• Transportation Planning/Traffic Operations—to determine the need for capital improvements by examining traffic growth over time.
• Traffic Safety—to study the trends in overall, fatal, and/or injury crash rates over time in a geographic area.
• Pavement Engineering—to assess the long-term performance of pavements under varying loads.
• Environment—to monitor the emission levels from commercial traffic over time with growth of industrial areas.

Example 16: Transportation Planning; Multiple Regression Analysis
Area: Transportation planning
Method of Analysis: Multiple regression analysis (testing proposed linear models with more than one independent variable when all variables are continuous)

1. Research Question/Problem Statement: Transportation planners and engineers often work on variations of the classic four-step transportation planning process for estimating travel demand. The first step, trip generation, generally involves developing a model that can be used to predict the number of trips originating or ending in a zone, which is a geographical subdivision of a corridor, city, or region (also referred to as a traffic analysis zone or TAZ). The objective is to develop a statistical relationship (a model) that can be used to explain the variation in a dependent variable based on the variation of one or more independent variables. In this example, ordinary least squares (OLS) regression is used to develop a model between trips generated (the dependent variable) and demographic, socio-economic, and employment variables (independent variables) at the household level.

Question/Issue
Can a linear relationship (model) be developed between a dependent variable and one or more independent variables? In this application, the dependent variable is the number of trips produced by households. Independent variables include persons, workers, and vehicles in a household; household income; and average age of persons in the household. The basic question is whether the relationship between the dependent (Y) and independent (X) variables can be represented by a linear model using two coefficients (a and b), expressed as follows:

Y = a + bX

where a = the intercept and b = the slope of the line.

If the relationship being examined involves more than one independent variable, the equation will simply have more terms. In addition, in a more formal presentation, the equation will also include an error term, e, added at the end.

2.
Identification and Description of Variables: Data for four-step modeling of travel demand or for calibration of any specific model (e.g., trip generation or trip origins) come from a variety of sources, ranging from the U.S. Census to mail or telephone surveys. The data that are collected will depend, in part, on the specific purpose of the modeling effort. Data appropriate for a trip-generation model typically are collected from some sort of household survey.

For the dependent variable in a trip-generation model, data must be collected on trip-making characteristics. These characteristics could include something as simple as the total trips made by a household in a day or involve more complicated breakdowns by trip purpose (e.g., work-related trips versus shopping trips) and time of day (e.g., trips made during peak and non-peak hours). The basic issue that must be addressed is to determine the purpose of the proposed model: What is to be estimated or predicted? Weekdays and work trips normally are associated with peak congestion and are often the focus of these models.

For the independent variable(s), the analyst must first give some thought to what would be the likely causes for household trips to vary. For example, it makes sense intuitively that household size might be pertinent (i.e., it seems reasonable that more persons in the household would lead to a higher number of household trips). Household members could be divided into workers and non-workers, two variables instead of one. Likewise, other socio-economic characteristics, such as income-related variables, might also make sense as candidate variables for the model. Data are collected on a range of candidate variables, and

the analysis process is used to sort through these variables to determine which combination leads to the best model. To be used in ordinary regression modeling, variables need to be continuous; that is, measured on a ratio or interval scale. Nominal data may be incorporated through the use of indicator (dummy) variables. (For more information on continuous variables, see NCHRP Project 20-45, Volume 2, Chapter 1; for more information on dummy variables, see NCHRP Project 20-45, Volume 2, Chapter 4.)

3. Data Collection: As noted, data for modeling travel demand often come from surveys designed especially for the modeling effort. Data also may be available from centralized sources such as a state DOT or local metropolitan planning organization (MPO).

4. Specification of Analysis Techniques and Data Analysis: In this example, data for 178 households in a small city in the Midwest have been provided by the state DOT. The data are obtained from surveys of about 15,000 households all across the state. This example uses only a tiny portion of the data set (see Table 22). Based on the data, a fairly obvious relationship is initially hypothesized: more persons in a household (PERS) should produce more person-trips (TRIPS). In its simplest form, the regression model has one dependent variable and one independent variable. The underlying assumption is that variation in the independent variable causes the variation in the dependent variable. For example, the dependent variable might be TRIPS (the count of total trips made on a typical weekday), and the independent variable might be PERS (the total number of persons, or occupants, in the household).
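The indicator (dummy) variable idea mentioned above can be sketched as follows; the nominal variable and its category names here are hypothetical:

```python
def dummy_encode(values, categories):
    """One 0/1 indicator column per category, omitting the first as the reference."""
    reference, *rest = categories  # the reference level is coded as all zeros
    return [[1 if v == c else 0 for c in rest] for v in values]

# Hypothetical nominal variable: dwelling type, with 'house' as the reference level
coded = dummy_encode(["house", "apartment", "condo"],
                     ["house", "apartment", "condo"])
```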
Expressing the relationship between TRIPS and PERS for the ith household in a sample of households results in the following hypothesized model:

TRIPS_i = a + b·PERS_i + ε_i

where a and b are coefficients to be determined by ordinary least squares (OLS) regression analysis and ε_i is the error term. The difference between the value of TRIPS for any household predicted using the developed equation and the actual observed value of TRIPS for that same household is called the residual. The resulting model is an equation for the best fit straight line (for the given data), where a is the intercept and b is the slope of the line. (For more information about fitted regression and measures of fit, see NCHRP Project 20-45, Volume 2, Chapter 4.)

In Table 22, R is the multiple R, the correlation coefficient in the case of the simplest linear regression involving one variable (also called univariate regression). The R² (coefficient of determination) may be interpreted as the proportion of the variance of the dependent variable explained by the fitted regression model. The adjusted R² corrects for the number of independent variables in the equation. A "perfect" R² of 1.0 could be obtained if one included enough independent variables (e.g., one for each observation), but doing so would hardly be useful.

Coefficients   t-values (statistics)   p-values   Measures of Fit
a = 3.347      4.626                   0.000      R = 0.510
b = 2.001      7.515                   0.000      R² = 0.260
                                                  Adjusted R² = 0.255

Table 22. Regression model statistics.
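The calibrated line in Table 22 can be applied directly; the observed trip count below is hypothetical:

```python
a, b = 3.347, 2.001   # intercept and slope from Table 22

def predict_trips(pers):
    """Predicted weekday person-trips for a household with `pers` occupants."""
    return a + b * pers

pred = predict_trips(4)      # a 4-person household
residual = 14 - pred         # hypothetical observed value of 14 trips
```

The residual is exactly the quantity defined above: observed TRIPS minus the value the fitted line predicts.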

Restating the now-calibrated model:

TRIPS = 3.347 + 2.001·PERS

The statistical significance of each coefficient estimate is evaluated with the p-values of calculated t-statistics, provided the errors are normally distributed. The p-values (also known as probability values) generally indicate whether the coefficients are significantly different from zero (which they need to be in order for the model to be useful). More formally stated, a p-value is the probability of a Type I error. In this example, the t- and p-values shown in Table 22 indicate that both a and b are significantly different from zero at a level of significance greater than the 99.9% confidence level. P-values are generally offered as two-tail (two-sided hypothesis testing) test values in results from most computer packages; one-tail (one-sided) values may sometimes be obtained by dividing the printed p-values by two. (For more information about one-sided versus two-sided hypothesis testing, see NCHRP Project 20-45, Volume 2, Chapter 4.)

The R² may be tested with an F-statistic; in this example, the F was calculated as 56.469 (degrees of freedom = 1, 176). (See NCHRP Project 20-45, Volume 2, Chapter 4.) This means that the model explains a significant amount of the variation in the dependent variable. A plot of the estimated model (line) and the actual data is shown in Figure 15.

A strict interpretation of this model suggests that a household with zero occupants (PERS = 0) will produce 3.347 trips per day. Clearly, this is not feasible because there can't be a household of zero persons, which illustrates the kind of problem encountered when a model is extrapolated beyond the range of the data used for the calibration. In other words, a formal test of the intercept (the a) is not always meaningful or appropriate.
Extension of the Model to Multivariate Regression: When the list of potential independent variables is considered, the researcher or analyst might determine that more than one cause for variation in the dependent variable may exist. In the current example, the question of whether there is more than one cause for variation in the number of trips can be considered.

Figure 15. Plot of the line for the estimated model.

The model just discussed for evaluating the effect of one independent variable is called a univariate model. Should the final model for this example be multivariate? Before determining the final model, the analyst may want to consider whether a variable or variables exist that further clarify what has already been modeled (e.g., more persons cause more trips). The variable PERS is a crude measure, made up of workers and non-workers. Most households have one or two workers. It can be shown that a measure of the non-workers in the household is more effective in explaining trips than is total persons; so a new variable, persons minus workers (DEP), is calculated.

Next, variables may exist that address entirely different causal relationships. It might be hypothesized that as the number of registered motor vehicles available in the household (VEH) increases, the number of trips will increase. It may also be argued that as household income (INC, measured in thousands of dollars) increases, the number of trips will increase. Finally, it may be argued that as the average age of household occupants (AVEAGE) increases, the number of trips will decrease because retired people generally make fewer trips. Each of these statements is based upon a logical argument (hypothesis). Given these arguments, the hypothesized multivariate model takes the following form:

TRIPS_i = a + b·DEP_i + c·VEH_i + d·INC_i + e·AVEAGE_i + ε_i

The results from fitting the multivariate model are given in Table 23. Results of the analysis of variance (ANOVA) for the overall model are shown in Table 24.

5. Interpreting the Results: It is common for regression packages to provide some values in scientific notation, as shown for the p-values in Table 23. The coefficient d, showing the relationship of TRIPS with INC, is read 1.907E-05, which in turn is read as 1.907 × 10⁻⁵ or 0.00001907.
Coefficients       t-values (statistics)   p-values     Measures of Fit
a = 8.564          6.274                   3.57E-09*    R = 0.589
b = 0.899          2.832                   0.005        R² = 0.347
c = 1.067          3.360                   0.001        Adjusted R² = 0.330
d = 1.907E-05*     1.927                   0.056
e = -0.098         -4.808                  3.68E-06

*See note about scientific notation in Section 5, Interpreting the Results.

Table 23. Results from fitting the multivariate model.

ANOVA        Sum of Squares (SS)   Degrees of Freedom (df)   F-ratio   p-value
Regression   1487.5                  4                       19.952    3.4E-13
Residual     2795.7                150

Table 24. ANOVA results for the overall model.

All coefficients are of the expected sign and significantly different from 0 (at the 0.05 level) except for d. However, testing the intercept makes little sense. (The intercept value would be the number of trips for a household with 0 vehicles, 0 income, 0 average age, and 0 dependents, a most unlikely household.) The overall model is significant as shown by the F-ratio and its p-value, meaning that the model explains a significant amount of the variation in

the dependent variable. This model reliably explains about 33% of the variance of household trip generation. Caution should be exercised when interpreting the significance of the R² and the overall model because it is not uncommon to have a significant F-statistic when some of the coefficients in the equation are not significant. The analyst may want to consider recalibrating the model without the income variable because the coefficient d was insignificant.

6. Conclusion and Discussion: Regression, particularly OLS regression, relies on several assumptions about the data, the nature of the relationships, and the results. Data are assumed to be interval or ratio scale. Independent variables generally are assumed to be measured without error, so all error is attributed to the model fit. Furthermore, independent variables should be independent of one another. This is a serious concern because the presence in the model of related independent variables, called multicollinearity, compromises the t-tests and confuses the interpretation of coefficients. Tests of this problem are available in most statistical software packages that include regression. Look for Variance-Inflation Factor (VIF) and/or Tolerance tests; most packages will have one or the other, and some will have both. In the example above where PERS is divided into DEP and workers, knowing any two variables allows the calculation of the third. Including all three variables in the model would be a case of extreme multicollinearity and, logically, would make no sense. In this instance, because one variable is a linear combination of the other two, the calculations required (within the analysis program) to calibrate the model would actually fail. If the independent variables are simply highly correlated, the regression coefficients (at a minimum) may not have intuitive meaning.
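Two points from this discussion can be illustrated in a short sketch: applying the Table 23 coefficients to a household, and showing that an exact linear combination (PERS = DEP + workers) makes the normal-equations matrix X'X singular, so the calibration arithmetic fails. All household data values below are hypothetical:

```python
# Applying the multivariate model with the Table 23 coefficients
a, b, c, d, e = 8.564, 0.899, 1.067, 1.907e-05, -0.098

def predict_trips(dep, veh, inc, aveage):
    """TRIPS = a + b*DEP + c*VEH + d*INC + e*AVEAGE (INC in thousands of dollars)."""
    return a + b * dep + c * veh + d * inc + e * aveage

pred = predict_trips(dep=2, veh=2, inc=60, aveage=35)  # hypothetical household

# Why extreme multicollinearity fails: PERS is an exact linear combination
# of DEP and WORKERS, so the 3x3 matrix X'X has determinant 0 (singular).
def xtx_det3(cols):
    """Determinant of the 3x3 normal-equations matrix X'X for three columns."""
    m = [[sum(p * q for p, q in zip(ci, cj)) for cj in cols] for ci in cols]
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

workers = [1, 2, 1, 0, 2]
dep = [2, 1, 3, 2, 0]
pers = [w + dp for w, dp in zip(workers, dep)]  # exact linear combination

det = xtx_det3([workers, dep, pers])            # 0: the model cannot be fit
```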
In general, equations or models with highly correlated independent variables are to be avoided; alternative models that examine one variable or the other, but not both, should be analyzed. It is also important to analyze the error distributions. Several assumptions relate to the errors and their distributions (normality, constant variance, uncorrelated, etc.). In transportation planning, spatial variables and associations might become important; they require more elaborate constructs and often different estimation processes (e.g., Bayesian, Maximum Likelihood). (For more information about errors and error distributions, see NCHRP Project 20-45, Volume 2, Chapter 4.)

Other logical considerations also exist. For example, for the measurement units of the different variables, does the magnitude of the result of multiplying the coefficient and the measured variable make sense and/or have a reasonable effect on the predicted magnitude of the dependent variable? Perhaps more importantly, do the independent variables make sense? In this example, does it make sense that changes in the number of vehicles in the household would cause an increase or decrease in the number of trips? These are measures of operational significance that go beyond consideration of statistical significance, but are no less important.

7. Applications in Other Areas of Transportation Research: Regression is a very important technique across many areas of transportation research, including:
• Transportation Planning
  – to include the other half of trip generation, e.g., predicting trip destinations as a function of employment levels by various types (factory, commercial), square footage of shopping center space, and so forth.
  – to investigate the trip distribution stage of the 4-step model (log transformation of the gravity model).
• Public Transportation—to predict loss/liability on subsidized freight rail lines (function of segment ton-miles, maintenance budgets and/or standards, operating speeds, etc.) for self-insurance computations. • Pavement Engineering—to model pavement deterioration (or performance) as a function of easily monitored predictor variables.

Example 17: Traffic Operations; Regression Analysis
Area: Traffic operations
Method of Analysis: Regression analysis (developing a model to predict the values that some variable can take as a function of one or more other variables, when not all variables are assumed to be continuous)

1. Research Question/Problem Statement: An engineer is concerned about false capacity at intersections being designed in a specified district. False capacity occurs where a lane is dropped just beyond a signalized intersection. Drivers approaching the intersection and knowing that the lane is going to be dropped shortly afterward avoid the lane. However, engineers estimating the capacity and level of service of the intersection during design have no reliable way to estimate the percentage of traffic that will avoid the lane (the lane distribution).

Question/Issue
Develop a model that can be used to predict the values that a dependent variable can take as a function of changes in the values of the independent variables. In this particular instance, how can engineers make a good estimate of the lane distribution of traffic volume in the case of a lane drop just beyond an intersection? Can a linear model be developed that can be used to predict this distribution based on other variables?

The basic question is whether a linear relationship exists between the dependent variable (Y; in this case, the lane distribution percentage) and some independent variable(s) (X). The relationship can be expressed using the following equation:

Y = a + bX

where a is the intercept and b is the slope of the line (see NCHRP Project 20-45, Volume 2, Chapter 4, Section B).

2. Identification and Description of Variables: The dependent variable of interest in this example is the volume of traffic in each lane on the approach to a signalized intersection with a lane drop just beyond.
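The Highway Capacity Manual lane utilization factor used in this example, f_LU = v_g/(v_g1 × N), reduces to simple arithmetic; a sketch with hypothetical lane volumes:

```python
def lane_utilization_factor(lane_volumes):
    """f_LU = v_g / (v_g1 * N) for a lane group."""
    v_g = sum(lane_volumes)       # lane-group flow rate (veh/h)
    v_g1 = max(lane_volumes)      # highest single-lane flow rate (veh/h)
    n_lanes = len(lane_volumes)   # number of lanes in the group
    return v_g / (v_g1 * n_lanes)

# Hypothetical two-lane group where drivers avoid the soon-to-be-dropped lane
f_lu = lane_utilization_factor([600, 400])
```

Perfectly balanced lanes give f_LU = 1.0; the more traffic avoids a lane, the lower the factor falls.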
The traffic volumes by lane are converted into lane utilization factors (fLU), to be consistent with standard highway capacity techniques. The Highway Capacity Manual defines fLU using the following equation:

f_LU = v_g / (v_g1 × N)

where v_g is the flow rate in a lane group in vehicles per hour, v_g1 is the flow rate in the lane with the highest flow rate of any in the group in vehicles per hour, and N is the number of lanes in the lane group.

The engineer thinks that lane utilization might be explained by one or more of 15 different factors, including the type of lane drop, the distance from the intersection to the lane drop, the taper length, and the heavy vehicle percentage. All of the variables are continuous except the type of lane drop. The type of lane drop is used to categorize the sites.

3. Data Collection: The engineer locates 46 lane-drop sites in the area and collects data at these sites by means of video recording. The engineer tapes for up to 3 hours at each site. The data are summarized in 15-minute periods, again to be consistent with standard highway capacity practice. For one type of lane-drop geometry, with two through lanes and an exclusive right-turn lane on the approach to the signalized intersection, the engineer ends up with 88 valid

data points (some sites have provided more than one data point), covering 15 minutes each, to use in equation (model) development.

4. Specification of Analysis Technique and Data Analysis: Multiple (or multivariate) regression is a standard statistical technique for developing predictive equations. (More information on this topic is given in NCHRP Project 20-45, Volume 2, Chapter 4, Section B.) The engineer performs five steps to develop the predictive equation.

Step 1. The engineer examines plots of each of the 15 candidate variables versus fLU to see if there is a relationship and to see what forms the relationships might take.

Step 2. The engineer screens all 15 candidate variables for multicollinearity. (Multicollinearity occurs when two variables are related to each other and essentially contribute the same information to the prediction.) Multicollinearity can lead to models with poor predicting power and other problems. The engineer examines the variables for multicollinearity by
• looking at plots of each of the 15 candidate variables against every other candidate variable;
• calculating the correlation coefficient for each of the 15 candidate independent variables against every other candidate variable; and
• using more sophisticated tests (such as the variance inflation factor) that are available in statistical software.

Step 3. The engineer reduces the set of candidate variables to eight. Next, the engineer uses statistical software to select variables and estimate the coefficients for each selected variable, assuming that the regression equation has a linear form. To select variables, the engineer employs forward selection (adding variables one at a time until the equation fit ceases to improve significantly) and backward elimination (starting with all candidate variables in the equation and removing them one by one until the equation fit starts to deteriorate).
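The forward-selection loop described in Step 3 can be sketched in code. The sketch below is purely illustrative: the data are synthetic, the variable names x1 through x3 are hypothetical, and the stopping rule is an assumed R-squared improvement threshold of 0.01 rather than the formal significance test statistical software would apply.

```python
import random

def ols_r2(x_cols, y):
    """Fit y = b0 + b.x by solving the normal equations; return R^2."""
    n = len(y)
    X = [[1.0] + [col[i] for col in x_cols] for i in range(n)]  # intercept column
    k = len(X[0])
    XtX = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(k)] for a in range(k)]
    Xty = [sum(X[i][a] * y[i] for i in range(n)) for a in range(k)]
    # Gauss-Jordan elimination with partial pivoting on (X'X) b = X'y.
    for col in range(k):
        piv = max(range(col, k), key=lambda r: abs(XtX[r][col]))
        XtX[col], XtX[piv] = XtX[piv], XtX[col]
        Xty[col], Xty[piv] = Xty[piv], Xty[col]
        for r in range(k):
            if r != col and XtX[col][col] != 0:
                f = XtX[r][col] / XtX[col][col]
                XtX[r] = [XtX[r][c] - f * XtX[col][c] for c in range(k)]
                Xty[r] -= f * Xty[col]
    b = [Xty[i] / XtX[i][i] for i in range(k)]
    yhat = [sum(b[j] * X[i][j] for j in range(k)) for i in range(n)]
    ybar = sum(y) / n
    ss_tot = sum((v - ybar) ** 2 for v in y)
    ss_res = sum((y[i] - yhat[i]) ** 2 for i in range(n))
    return 1 - ss_res / ss_tot

# Synthetic candidates: y depends on x1 and x2; x3 is pure noise.
random.seed(1)
n = 60
data = {name: [random.gauss(0, 1) for _ in range(n)] for name in ("x1", "x2", "x3")}
y = [3 * data["x1"][i] + 2 * data["x2"][i] + random.gauss(0, 0.2) for i in range(n)]

selected, remaining, best_r2 = [], ["x1", "x2", "x3"], 0.0
while remaining:
    r2s = {v: ols_r2([data[s] for s in selected + [v]], y) for v in remaining}
    best = max(r2s, key=r2s.get)
    if r2s[best] - best_r2 < 0.01:   # stop when the fit ceases to improve
        break
    selected.append(best)
    remaining.remove(best)
    best_r2 = r2s[best]
print(selected)
```

With these synthetic data the loop adds x1, then x2, and stops before adding the noise variable x3, mirroring how forward selection discards candidates that do not improve the fit.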
The equation fit is measured by R2 (for more information, see NCHRP Project 20-45, Volume 2, Chapter 4, Section B, under the heading "Descriptive Measures of Association Between X and Y"), which shows how well the equation fits the data on a scale from 0 to 1, and by other factors provided by statistical software. In this case, forward selection and backward elimination result in an equation with five variables:
• Drop: Lane drop type, a 0 or 1 depending on the type;
• Left: Left turn status, a 0 or 1 depending on the types of left turns allowed;
• Length: The distance from the intersection to the lane drop, in feet ÷ 1000;
• Volume: The average lane volume, in vehicles per hour per lane ÷ 1000; and
• Sign: The number of signs warning of the lane drop.

Notice that the first two variables are discrete variables and had to assume a zero-or-one format to work within the regression model. Each of the five variables has a coefficient that is significantly different from zero at the 95% confidence level, as measured by a t-test. (For more information, see NCHRP Project 20-45, Volume 2, Chapter 4, Section B, "How Are t-statistics Interpreted?")

Step 4. Once an initial model has been developed, the engineer plots the residuals for the tentative equation to see whether the assumed linear form is correct. A residual is the difference, for each observation, between the prediction the equation makes for fLU and the actual value of fLU. In this example, a plot of the predicted value versus the residual for each of the 88 data points shows a fan-like shape, which indicates that the linear form is not appropriate. (NCHRP Project 20-45, Volume 2, Chapter 4, Section B, Figure 6 provides examples of residual plots that are and are not desirable.)
The engineer experiments with several other model forms, including non-linear equations that involve transformations of variables, before settling on a lognormal form that provides a good R2 value of 0.73 and a desirable shape for the residual plot.
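The fan-shaped residual pattern described in Step 4 can also be checked numerically. The sketch below is illustrative only: it builds synthetic data (all values hypothetical) whose scatter grows with the predictor, fits a straight line, and tests whether the residual spread grows with the fitted value, which is what a fan-shaped plot shows visually.

```python
import random

random.seed(42)
# Synthetic data whose scatter grows with x, mimicking a fan-shaped residual plot.
n = 200
x = [random.uniform(1, 10) for _ in range(n)]
y = [2.0 * xi + random.gauss(0, 0.5 * xi) for xi in x]

# Fit a straight line by least squares and compute residuals.
xbar, ybar = sum(x) / n, sum(y) / n
b = (sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
     / sum((xi - xbar) ** 2 for xi in x))
a = ybar - b * xbar
fitted = [a + b * xi for xi in x]
resid = [yi - f for yi, f in zip(y, fitted)]

# A fan shape means the residual spread grows with the fitted value:
# check the correlation between |residual| and the fitted value.
fbar = sum(fitted) / n
abar = sum(abs(r) for r in resid) / n
num = sum((f - fbar) * (abs(r) - abar) for f, r in zip(fitted, resid))
den = (sum((f - fbar) ** 2 for f in fitted)
       * sum((abs(r) - abar) ** 2 for r in resid)) ** 0.5
corr = num / den
print(round(corr, 2))  # clearly positive, indicating heteroscedasticity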

Step 5. Finally, the engineer examines the candidate equation for logic and practicality, asking whether the variables make sense, whether the signs of the variables make sense, and whether the variables can be collected easily by design engineers. Satisfied that the answers to these questions are "yes," the engineer expresses the final equation (model) as follows:

fLU = exp(−0.539 − 0.218·Drop + 0.148·Left + 0.178·Length + 0.627·Volume − 0.105·Sign)

5. Interpreting the Results: The process described in this example results in a useful equation for estimating the lane utilization in a lane to be dropped, thereby avoiding the estimation of false capacity. The equation has five terms and is non-linear, which will make its use a bit challenging. However, the database is large, the equation fits the data well, and the equation is logical, all of which should boost the confidence of potential users. If potential users apply the equation within the ranges of the data used for the calibration, the equation should provide good predictions. Applying any model outside the range of the data on which it was calibrated increases the likelihood of an inaccurate prediction.

6. Conclusion and Discussion: Regression is a powerful statistical technique that provides models engineers can use to make predictions in the absence of direct observation. Engineers tempted to use regression techniques should notice from this and other examples that the effort is substantial. Engineers using regression techniques should not skip any of the steps described above, as doing so may result in equations that provide poor predictions to users. Analysts considering developing a regression model to help make needed predictions should not be intimidated by the process.
Although there are many pitfalls in developing a regression model, analysts considering making the effort should also consider the alternative: how the prediction will be made in the absence of a model. In the absence of a model, predictions of important factors like lane utilization would be made using tradition, opinion, or simple heuristics. With guidance from NCHRP Project 20-45 and other texts, and with good software available to make the calculations, credible regression models often can be developed that perform better than the traditional prediction methods.

Because regression models developed by transportation engineers are often reused in later studies by others, the stakes are high. The consequences of a model that makes poor predictions can be severe in terms of suboptimal decisions. Lane utilization models often are employed in traffic studies conducted to analyze new development proposals. A model that under-predicts utilization in a lane to be dropped may mean that the development is turned down due to the anticipated traffic impacts, or that the developer has to pay for additional and unnecessary traffic mitigation measures. On the other hand, a model that over-predicts utilization in a lane to be dropped may mean that the development is approved with insufficient traffic mitigation measures in place, resulting in traffic delays, collisions, and the need for later intervention by a public agency.

7. Applications in Other Areas of Transportation Research: Regression is used in almost all areas of transportation research, including:
• Transportation Planning—to create equations to predict trip generation and mode split.
• Traffic Safety—to create equations to predict the number of collisions expected on a particular section of road.
• Pavement Engineering/Materials—to predict long-term wear and condition of pavements.
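As a closing illustration for this example, the final model from Step 5 can be applied directly in code. The coefficient signs below are as read from the printed equation, and the site values are hypothetical, chosen only to show the unit conventions (Length and Volume are divided by 1,000, as defined in the variable list).

```python
import math

def lane_utilization(drop, left, length_ft, volume_vph, signs):
    """Predicted fLU from the final model in Step 5.
    Length and Volume enter the equation divided by 1,000."""
    z = (-0.539 - 0.218 * drop + 0.148 * left
         + 0.178 * (length_ft / 1000.0)
         + 0.627 * (volume_vph / 1000.0)
         - 0.105 * signs)
    return math.exp(z)

# Hypothetical site: drop type 1, no left-turn indicator, lane drop 500 ft
# downstream, 400 veh/h/lane, and two warning signs.
print(round(lane_utilization(1, 0, 500, 400, 2), 3))
```

Consistent with the model's logic, adding warning signs or shortening the distance to the drop lowers the predicted utilization of the lane to be dropped.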
Example 18: Transportation Planning; Logit and Related Analysis

Area: Transportation planning

Method of Analysis: Logit and related analysis (developing predictive models when the dependent variable is dichotomous—e.g., 0 or 1)

1. Research Question/Problem Statement: Transportation planners often utilize variations of the classic four-step transportation planning process for predicting travel demand. Trip generation, trip distribution, mode split, and trip assignment are used to predict traffic flows under a variety of forecasted changes in networks, population, land use, and controls. Mode split, deciding which mode of transportation a traveler will take, requires predicting mutually exclusive outcomes. For example, will a traveler utilize public transit or drive his or her own car?

Question/Issue

Can a linear model be developed that can be used to predict the probability that one of two choices will be made? In this example, the question is whether a household will use public transit (or not). Rather than being continuous (as in linear regression), the dependent variable is reduced to two categories, a dichotomous variable (e.g., yes or no, 0 or 1). Although the question is simple, the statistical modeling becomes sophisticated very quickly.

2. Identification and Description of Variables: Considering a typical, traditional urban area in the United States, it is reasonable to argue that the likelihood of taking public transit to work (Y) will be a function of income (X). Generally, more income means less likelihood of taking public transit. This can be modeled using the following equation:

Yi = β1 + β2·Xi + ui

where Xi = family income, Yi = 1 if the family uses public transit, and Yi = 0 if the family does not use public transit.

3. Data Collection: These data normally are obtained from travel surveys conducted at the local level (e.g., by a metropolitan area or specific city), although the agency that collects the data often is a state DOT.

4. Specification of Analysis Techniques and Data Analysis: In this example the dependent variable is dichotomous and is a linear function of an explanatory variable. Consider the equation E(Yi|Xi) = β1 + β2·Xi. Notice that if Pi = probability that Yi = 1 (the household utilizes transit), then (1 − Pi) = probability that Yi = 0 (the household does not utilize transit). This has been called a linear probability model. Note that within this expression, "i" refers to a household. Thus, Y has the distribution shown in Table 25.

Table 25. Distribution of Y.

Values that Y Takes   Probability   Meaning/Interpretation
1                     Pi            Household uses transit
0                     1 − Pi        Household does not use transit
Total                 1.0

Any attempt to estimate this relationship with standard (OLS) regression is saddled with many problems (e.g., non-normality of errors, heteroscedasticity, and the possibility that the predicted Y will be outside the range 0 to 1, to say nothing of pretty terrible R2 values).

An alternative formulation for estimating Pi, the cumulative logistic distribution, is expressed by the following equation:

Pi = 1 / (1 + e^−(β1 + β2·Xi))

This function can be plotted as a lazy Z-curve: on the left, at low values of X (low household income), the probability starts near 1, and it approaches 0 at high values of X (Figure 16). Notice that, even at 0 income, not all households use transit. The curve is said to be asymptotic to 1 and 0. The value of Pi varies between 1 and 0 in relation to income, X.

Figure 16. Plot of cumulative logistic distribution showing a lazy Z-curve.

Manipulating the definition of the cumulative logistic distribution from above:

(1 + e^−(β1 + β2·Xi))·Pi = 1

Pi·e^−(β1 + β2·Xi) = 1 − Pi

e^−(β1 + β2·Xi) = (1 − Pi) / Pi

and

e^(β1 + β2·Xi) = Pi / (1 − Pi)

The final expression is the ratio of the probability of utilizing public transit divided by the probability of not utilizing public transit. It is called the odds ratio. Next, taking the natural log of both sides (and reversing) results in the following equation:

Li = ln[Pi / (1 − Pi)] = β1 + β2·Xi

L is called the logit, and this is called a logit model. The left side is the natural log of the odds ratio. Unfortunately, this odds ratio is meaningless for individual households, where the probability is either 0 or 1 (utilize or not utilize). If the analyst uses standard OLS regression on this

equation, with data for individual households, there is a problem: because Pi happens to equal either 0 or 1 (which is all the time!), the odds ratio will, as a result, equal either 0 or infinity (and the logarithm will be undefined) for all observations. However, by using groups of households the problem can be mitigated.

Table 26 presents data based on a survey of 701 households, more than half of which (380) use transit. The income data are recorded for intervals; here, interval mid-points (Xj) are shown. The number of households in each income category is tallied (Nj), as is the number of households in each income category that utilizes public transit (nj). It is important to note that while there are more than 700 households (i), the number of observations (categories, j) is only 13. Using these data, for each income bracket, the probability of taking transit can be estimated as follows:

Pj = nj / Nj

This equation is an expression of relative frequency (i.e., it expresses the proportion in income bracket "j" using transit). An examination of Table 26 shows clearly that there is a progression of these relative frequencies, with higher income brackets showing lower relative frequencies, just as was hypothesized.

We can calculate the odds ratio for each income bracket listed in Table 26 and estimate the following logit function with OLS regression:

Lj = ln[(nj/Nj) / (1 − nj/Nj)] = β1 + β2·Xj

The results of this regression are shown in Table 27. The results also can be expressed as an equation:

LogOddsRatio = 1.037 − 0.00003863·X

5. Interpreting the Results: This model provides a very good fit. The estimates of the coefficients can be inserted in the original cumulative logistic function to directly estimate the probability of using transit for any given X (income level). Indeed, the logistic graph in Figure 16 is produced with the estimated function.
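Inserting the estimated coefficients into the cumulative logistic function gives a direct probability estimate, as noted above. A minimal sketch using the reported values of β1 and β2:

```python
import math

# Reported OLS estimates: beta1 = 1.037, beta2 = -0.00003863.
b1, b2 = 1.037, -0.00003863

def p_transit(income):
    """Estimated probability of using transit at a given annual income ($),
    from the cumulative logistic function with the fitted coefficients."""
    return 1.0 / (1.0 + math.exp(-(b1 + b2 * income)))

print(round(p_transit(20000), 3))
```

For the $20,000 income bracket this returns roughly 0.566, close to the observed relative frequency of 0.543 in Table 26, and the estimate declines with income as hypothesized.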
Table 26. Data examined by groups of households.

Xj ($)     Nj (Households)   nj (Utilizing Transit)   Pj (Defined Above)
$6,000     40                30                       0.750
$8,000     55                39                       0.709
$10,000    65                43                       0.662
$13,000    88                58                       0.659
$15,000    118               69                       0.585
$20,000    81                44                       0.543
$25,000    70                33                       0.471
$30,000    62                25                       0.403
$35,000    40                16                       0.400
$40,000    30                11                       0.367
$50,000    22                6                        0.273
$60,000    18                4                        0.222
$75,000    12                2                        0.167
Total:     701               380
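The grouped-data regression can be reproduced from Table 26 with a short script. This sketch uses simple unweighted least squares on the bracket log odds and recovers coefficients essentially identical to those reported in Table 27:

```python
import math

# Grouped data from Table 26: income bracket midpoint, households, transit users.
groups = [
    (6000, 40, 30), (8000, 55, 39), (10000, 65, 43), (13000, 88, 58),
    (15000, 118, 69), (20000, 81, 44), (25000, 70, 33), (30000, 62, 25),
    (35000, 40, 16), (40000, 30, 11), (50000, 22, 6), (60000, 18, 4),
    (75000, 12, 2),
]

# Log odds ratio for each income bracket: Lj = ln(Pj / (1 - Pj)).
x = [g[0] for g in groups]
L = [math.log((n / N) / (1 - n / N)) for _, N, n in groups]

# Unweighted OLS fit of Lj = b1 + b2 * Xj.
n_obs = len(x)
xbar, Lbar = sum(x) / n_obs, sum(L) / n_obs
num = sum((xi - xbar) * (Li - Lbar) for xi, Li in zip(x, L))
den = sum((xi - xbar) ** 2 for xi in x)
b2 = num / den
b1 = Lbar - b2 * xbar
print(round(b1, 3), round(b2, 8))
```

The fitted intercept is about 1.037 and the slope about −0.0000386 per dollar of income, matching the published LogOddsRatio equation to within rounding.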

6. Conclusion and Discussion: This approach to estimation is not without further problems. For example, the N within each income bracket needs to be sufficiently large that the relative frequency (and therefore the resulting odds ratio) is accurately estimated. Many statisticians would say that a minimum of 25 is reasonable.

This approach also is limited by the fact that only one independent variable (income) is used. Common sense suggests that the right-hand side of the function could logically be expanded to include more than one predictor variable (more Xs). For example, it could be argued that educational level might act, along with income, to account for the probability of using transit. However, combining predictor variables severely impinges on the categories (the j) used in this OLS regression formulation. To illustrate, assume that five educational categories are used in addition to the 13 income brackets (e.g., Grade 8 or less, Grade 9 through high school graduation, some college, BA or BS degree, and graduate degree). For such an OLS regression analysis to work, data would be needed for 5 × 13, or 65, categories.

Ideally, other travel modes should also be considered. In the example developed here, only transit and not-transit are considered. In some locations it is entirely reasonable to examine private auto versus bus versus bicycle versus subway versus light rail (involving five modes, not just two). This notion of a polychotomous logistic regression is possible. However, five modes cannot be estimated with the OLS regression technique employed above. The logit above is a variant of the binomial distribution, and the polychotomous logistic model is a variant of the multinomial distribution (see NCHRP Project 20-45, Volume 2, Chapter 5). Estimation of these more advanced models requires maximum likelihood methods (as described in NCHRP Project 20-45, Volume 2, Chapter 5).
Other model variants are based upon other cumulative probability distributions. For example, there is the probit model, in which the normal cumulative density function is used. The probit model is very similar to the logit model, but it is more difficult to estimate.

Table 27. Results of OLS regression.

Coefficients       t-values (statistics)   p-values   Measures of "Fit"
β1 = 1.037         12.156                  0.000      R = 0.980
β2 = −0.00003863   −16.407                 0.000      R2 = 0.961; adjusted R2 = 0.957

7. Applications in Other Areas of Transportation Research: Applications of logit and related models abound within transportation studies. In any situation in which human behavior is relegated to discrete choices, this category of models may be applied. Examples in other areas of transportation research include:
• Transportation Planning—to model any "choice" issue, such as shopping destination choices.
• Traffic Safety—to model dichotomous responses (e.g., did a motorist slow down or not) in response to traffic control devices.
• Highway Design—to model public reactions to proposed design solutions (e.g., support or not support proposed road diets, installation of roundabouts, or use of traffic calming techniques).

Example 19: Public Transit; Survey Design and Analysis

Area: Public transit

Method of Analysis: Survey design and analysis (organizing survey data for statistical analysis)

1. Research Question/Problem Statement: The transit director is considering changes to the fare structure and the service characteristics of the transit system. To assist in determining which changes would be most effective or efficient, a survey of the current transit riders is developed.

Question/Issue

Use and analysis of data collected in a survey. Results from a survey of transit users are used to estimate the change in ridership that would result from a change in the service or fare.

2. Identification and Description of Variables: Two types of variables are needed for this analysis. The first is data on the characteristics of the riders, such as gender, age, and access to an automobile. These data are discrete variables. The second is data on the riders' stated responses to proposed changes in the fare or service characteristics. These data also are treated as discrete variables. Although some, like the fare, could theoretically be continuous, they are normally expressed in discrete increments (e.g., $1.00, $1.25, $1.50).

3. Data Collection: These data are normally collected by agencies conducting a survey of the transit users. The initial step in the experiment design is to choose the variables to be collected for each of these two data sets. The second step is to determine how to categorize the data. Both steps are generally based on past experience and common sense.

Some of the variables used to describe the characteristics of the transit user are dichotomous, such as gender (male or female) and access to an automobile (yes or no). Other variables, such as age, are grouped into discrete categories within which the transit riding characteristics are similar. For example, one would not expect there to be a difference between the transit trip needs of a 14-year-old student and a 15-year-old student. Thus, the survey responses of these two age groups would be assigned to the same age category. However, experience (and common sense) leads one to differentiate a 19-year-old transit user from a 65-year-old transit user, because their purposes for taking trips and their perspectives on the relative value of the fare and the service components are both likely to be different.

Obtaining user responses to changes in the fare or service is generally done in one of two ways. The first is to make a statement and ask the responder to mark one of several choices: strongly agree, agree, neither agree nor disagree, disagree, and strongly disagree. The number of statements used in the survey depends on how many parameter changes are being contemplated. Typical statements include:
1. I would increase the number of trips I make each month if the fare were reduced by $0.xx.
2. I would increase the number of trips I make each month if I could purchase a monthly pass.
3. I would increase the number of trips I make each month if the waiting time at the stop were reduced by 10 minutes.
4. I would increase the number of trips I make each month if express services were available from my origin to my destination.

The second format is to propose a change and provide multiple choices for the responder. Typical questions for this format are:
1. If the fare were increased by $0.xx per trip I would:
a) not change the number of trips per month
b) reduce the non-commute trips
c) reduce both the commute and non-commute trips
d) switch modes
2. If express service were offered for an additional $0.xx per trip I would:
a) not change the number of trips per month on this local service
b) make additional trips each month
c) shift from the local service to the express service

These surveys generally are administered by handing a survey form to people as they enter the transit vehicle and collecting the forms as people depart the transit vehicle. The surveys also can be administered by mail, telephone, or in a face-to-face interview. In constructing the questions, care should be taken to use terms with which the respondents will be familiar. For example, if the system does not currently offer "express" service, this term will need to be defined in the survey. Other technical terms should be avoided. Similarly, the word "mode" is often used by transportation professionals but is not commonly used by the public at large. The length of a survey is almost always an issue as well. To avoid asking too many questions, each question needs to be reviewed to see if it is really necessary and will produce useful data (as opposed to just being something that would be nice to know).

4. Specification of Analysis Technique and Data Analysis: The results of these surveys often are displayed in tables or in frequency distribution diagrams (see also Example 1 and Example 2). Table 28 lists responses to a sample question posed in the form of a statement. Figure 17 shows the frequency diagram for these data.

Table 28. Table of responses to sample statement, "I would increase the number of trips I make each month if the fare were reduced by $0.xx."

                  Strongly Agree   Agree   Neither Agree nor Disagree   Disagree   Strongly Disagree
Total responses   450              600     300                          400        100

Figure 17. Frequency diagram for total responses to sample statement.

Similar presentations can be made for any of the groupings included in the first type of variables discussed above. For example, if gender is included as a Type 1 question, the results might appear as shown in Table 29; Figure 18 shows the frequency diagram for these data. Presentations of the data can be made for any combination of the discrete variable groups included in the survey. For example, to display responses of female users over 65 years old,

all of the survey forms on which these two characteristics (female and over 65 years old) are checked could be extracted and recorded in a table and shown in a frequency diagram.

5. Interpreting the Results: Survey data can be used to compare the responses to fare or service changes of different groups of transit users. This flexibility can be important in determining which changes would impact various segments of transit users. The information can be used to evaluate various fare and service options being considered and allows the transit agency to design promotions to obtain the greatest increase in ridership. For example, by creating frequency diagrams to display the responses to statements 2, 3, and 4 listed in Section 3, the engineer can compare the impact of changing the fare versus changing the headway or providing express services in the corridor.

Organizing response data according to different characteristics of the user produces contingency tables like the one illustrated for males and females. This table format can be used to conduct chi-square analysis to determine if there is any statistically significant difference among the various groups. (Chi-square analysis is described in more detail in Example 4.)

6. Conclusions and Discussion: This example illustrates how to obtain and present quantitative information using surveys. Although survey results provide reasonably good estimates of the relative importance users place on different transit attributes (fare, waiting time, hours of service, etc.) when determining how often they would use the system, the magnitude of users' responses often is overstated. Experience shows that what users say they would do (their stated preference) generally is different from what they actually do (their revealed preference).
Table 29. Contingency table showing responses by gender to sample statement, "I would increase the number of trips I make each month if the fare were reduced by $0.xx."

                  Strongly Agree   Agree   Neither Agree nor Disagree   Disagree   Strongly Disagree
Male              200              275     200                          200        70
Female            250              325     100                          200        30
Total responses   450              600     300                          400        100

Figure 18. Frequency diagram showing responses by gender to sample statement.
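The chi-square computation mentioned in Section 5 can be carried out directly on the Table 29 contingency table. A minimal sketch in plain Python (no statistics library), using expected counts from the row and column totals:

```python
# Observed counts from Table 29 (rows: gender; columns: response category).
observed = {
    "Male":   [200, 275, 200, 200, 70],
    "Female": [250, 325, 100, 200, 30],
}

rows = list(observed.values())
n_cols = len(rows[0])
col_totals = [sum(row[j] for row in rows) for j in range(n_cols)]
row_totals = [sum(row) for row in rows]
grand_total = sum(row_totals)

# Chi-square statistic: sum over cells of (observed - expected)^2 / expected,
# with expected = row total * column total / grand total.
chi2 = 0.0
for i, row in enumerate(rows):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / grand_total
        chi2 += (obs - expected) ** 2 / expected

df = (len(rows) - 1) * (n_cols - 1)
print(round(chi2, 1), df)
```

The statistic (about 58.2 with 4 degrees of freedom) far exceeds the 95% critical value of 9.49, so the male and female response distributions differ significantly.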

In this example, 1,050 of the 1,850 respondents (57%) have responded that they would use the bus service more frequently if the fare were decreased by $0.xx. Five hundred respondents (27%) have indicated that they would not use the bus service more frequently, and 300 respondents (16%) have indicated that they are not sure whether they would change their bus use frequency. These percentages show the stated preferences of the users. The engineer does not yet know the revealed preferences of the users, but experience suggests that it is unlikely that 57% of the riders would actually increase the number of trips they make.

7. Applications in Other Areas of Transportation Research: Survey design and analysis techniques can be used to collect and present data in many areas of transportation research, including:
• Transportation Planning—to assess public response to a proposal to enact a local motor fuel tax to improve road maintenance in a city or county.
• Traffic Operations—to assess public response to implementing road diets (e.g., 4-lane to 3-lane conversions) on different corridors in a city.
• Highway Design—to assess public response to proposed alternative cross-section designs, such as a boulevard design versus an undivided multilane design in a corridor.

Example 20: Traffic Operations; Simulation

Area: Traffic operations

Method of Analysis: Simulation (using field data to simulate, or model, operations or outcomes)

1. Research Question/Problem Statement: A team of engineers wants to determine whether one or more unconventional intersection designs will produce lower travel times than a conventional design at typical intersections for a given number of lanes. There is no way to collect field data to compare alternative intersection designs at a particular site.
Macroscopic traffic operations models like those in the Highway Capacity Manual do a good job of estimating delay at specific points but are unable to provide travel time estimates for unconventional designs that consist of several smaller intersections and road segments. Microscopic simulation models measure the behaviors of individual vehicles as they traverse the highway network. Such simulation models are therefore very flexible in the types of networks and measures that can be examined. The team in this example turns to a simulation model to determine how other intersection designs might work.

Question/Issue

Developing and using a computer simulation model to examine operations in a computer environment. In this example, a traffic operations simulation model is used to show whether one or more unconventional intersection designs will produce lower travel times than a conventional design at typical intersections for a given number of lanes.

2. Identification and Description of Variables: The engineering team simulates seven different intersections to provide the needed scope for their findings. At each intersection, the team examines three different sets of traffic volumes: volumes from the evening (p.m.) peak hour, a typical midday off-peak hour, and a volume that is 15% greater than the p.m. peak hour to represent future conditions. At each intersection, the team models the current conventional intersection geometry and seven unconventional designs: the quadrant roadway, median U-turn, superstreet, bowtie, jughandle, split intersection, and continuous flow intersection.

Traffic simulation models break the roadway network into nodes (intersections) and links (segments between intersections). Therefore, the engineering team has to design each of the

alternatives at each test site in terms of numbers of lanes, lane lengths, and such, and then faithfully translate that geometry into links and nodes that the simulation model can use. For each combination of traffic volume and intersection design, the team uses software to find the optimum signal timing and uses that timing during the simulation. To avoid bias, the team keeps all other factors (e.g., network size, numbers of lanes, turn lane lengths, truck percentages, average vehicle speeds) constant in all simulation runs.

3. Data Collection: The field data collection necessary in this effort consists of noting the current intersection geometries at the seven test intersections and counting the turning movements in the time periods described above. In many simulation efforts, it is also necessary to collect field data to calibrate and validate the simulation model. Calibration is the process by which simulation output is compared to actual measurements for some key measure(s) such as travel time. If a difference is found between the simulation output and the actual measurement, the simulation inputs are changed until the difference disappears. Validation is a test of the calibrated simulation model, comparing simulation output to a previously unused sample of actual field measurements. In this example, however, the team determines that it is unnecessary to collect calibration and validation data because a recent project has successfully calibrated and validated very similar models of most of these same unconventional designs.

The engineering team uses the CORSIM traffic operations simulation model. Well known and widely used, CORSIM models the movement of each vehicle through a specified network in small time increments.
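Because of the randomness built into such models, repeated runs with different random number seeds give different results, which is why replications are needed. The toy sketch below is purely illustrative and far simpler than CORSIM: each "replication" draws random vehicle speeds and returns an average travel time over a one-mile link.

```python
import random

def toy_replication(seed, n_vehicles=100):
    """One 'replication': average travel time (seconds) over a 1-mile link
    with randomly drawn vehicle speeds. A toy stand-in for a stochastic
    microscopic model, not CORSIM."""
    rng = random.Random(seed)
    times = []
    for _ in range(n_vehicles):
        speed_mph = max(5.0, rng.gauss(30, 4))  # guard against extreme draws
        times.append(3600.0 / speed_mph)        # seconds to cover 1 mile
    return sum(times) / len(times)

runs = [toy_replication(seed) for seed in (1, 2)]
print([round(t, 1) for t in runs], round(sum(runs) / 2, 1))
```

Two seeds give two different average travel times; averaging across replications, as the team does with its two CORSIM runs per combination, smooths out this run-to-run variation.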
CORSIM is a good choice for this example because it was originally designed for problems of this type, has produced appropriate results, has excellent animation and other debugging features, runs quickly in these kinds of cases, and is well supported by the software developers.

The team makes two CORSIM runs with different random number seeds for each combination of volume and design at each intersection, or 48 runs for each intersection altogether. It is necessary to make more than one run (or replication) of each simulation combination with different random number seeds because of the randomness built into simulation models. The experiment design in this case allows the team to reduce the number of replications to two; typical practice in simulations when one is making simple comparisons between two variables is to make at least 5 to 10 replications. Each run lasts 30 simulated minutes.

Table 30 shows the simulation data for one of the seven intersections. The lowest travel time produced in each case is marked with an asterisk. Notice that Table 30 does not show data for the bowtie design. That design became congested (gridlocked) and produced essentially infinite travel times for this intersection. Handling overly congested networks is a difficult problem in many efforts and with several different simulation software packages. The best current advice is for analysts to not push their networks too hard and to scan often for gridlock.

Table 30. Simulation results for different designs and time of day.

Total Travel Time, Vehicle-hours, Average of Two Simulation Runs
Time of Day   Conventional   Quadrant   Median U   Superstreet   Jughandle   Split   Continuous
Midday        67             64         61         74            63          59*     75
P.M. peak     121            95*        119        179           139         114     106
Peak + 15%    170            135*       145        245           164         180     142

*Lowest total travel time.

4. Specification of Analysis Technique and Data Analysis: The experiment assembled in this example uses a factorial design. (Factorial design also is discussed in Example 11.) The team analyzes the data from this factorial experiment using analysis of variance (ANOVA). Because

the experimenter has complete control in a simulation, it is common to use efficient designs like factorials and efficient analysis methods like ANOVA to squeeze all possible information out of the effort. Statistical tests comparing the individual mean values of key results by factor are common ways to follow up on ANOVA results. Although ANOVA will reveal which factors make a significant contribution to the overall variance in the dependent variable, means tests will show which levels of a significant factor differ from the other levels. In this example, the team uses Tukey's means test, which is available as part of the battery of standard tests accompanying ANOVA in statistical software. (For more information about ANOVA, see NCHRP Project 20-45, Volume 2, Chapter 4, Section A.)

5. Interpreting the Results: For the data shown in Table 30, the ANOVA reveals that the volume and design factors are statistically significant at the 99.99% confidence level. Furthermore, the interaction between the volume and design factors also is statistically significant at the 99.99% level. The means tests on the design factors show that the quadrant roadway is significantly different from (has a lower overall travel time than) the other designs at the 95% level. The next-best designs overall are the median U-turn and the continuous flow intersection; these are not statistically different from each other at the 95% level. The third tier of designs consists of the conventional and the split, which are statistically different from all others at the 95% level but not from each other. Finally, the jughandle and the superstreet designs are statistically different from each other and from all other designs at the 95% level according to the means test. Through the simulation, the team learns that several designs appear to be more efficient than the conventional design, especially at higher volume levels.
From the results at all seven intersections, the team sees that the quadrant roadway and median U-turn designs generally lead to the lowest travel times, especially at the higher volume levels.

6. Conclusion and Discussion: Simulation is an effective tool to analyze traffic operations, as at the seven intersections of interest in this example. No other tool would allow such a robust comparison of many different designs and provide the results for travel times in a larger network rather than delays at a single spot. The simulation conducted in this example also allows the team to conduct an efficient factorial design, which maximizes the information provided from the effort. Simulation is a useful tool in research for traffic operations because it
• affords the ability to conduct randomized experiments,
• allows the examination of details that other methods cannot provide, and
• allows the analysis of large and complex networks.
In practice, simulation also is popular because of the vivid and realistic animation output provided by common software packages. The superb animations allow analysts to spot and treat flaws in the design or model and provide agencies an effective tool by which to share designs with politicians and the public. Although simulation results can sometimes be surprising, more often they confirm what the analysts already suspect based on simpler analyses. In the example described here, the analysts suspected that the quadrant roadway and median U-turn designs would perform well because these designs had performed well in prior Highway Capacity Manual calculations. In many studies, simulations provide rich detail and vivid animation but no big surprises.

7. Applications in Other Areas of Transportation Research: Simulations are critical analysis methods in several areas of transportation research. Besides traffic operations, simulations are used in research related to:
• Maintenance—to model the lifetime performance of traffic signs.
• Traffic Safety
  – to examine vehicle performance and driver behaviors or performance.
  – to predict the number of collisions from a new roadway design (potentially, given the recent development of the FHWA SSAM program).
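The ANOVA computation described in step 4 can be illustrated with the cell averages in Table 30. Because that table gives only one averaged value per volume-design cell for a single intersection, the sketch below fits a two-way ANOVA without an interaction term (the full analysis in the text, run on the individual replications, also estimates the interaction), so its F-values necessarily differ from those reported above.

```python
# Two-way ANOVA (no interaction) on the Table 30 cell averages.
data = {
    "Conventional": [67, 121, 170],   # midday, p.m. peak, peak + 15%
    "Quadrant":     [64,  95, 135],
    "Median U":     [61, 119, 145],
    "Superstreet":  [74, 179, 245],
    "Jughandle":    [63, 139, 164],
    "Split":        [59, 114, 180],
    "Continuous":   [75, 106, 142],
}

designs = list(data)
n_designs, n_volumes = len(designs), 3
values = [v for col in data.values() for v in col]
grand = sum(values) / (n_designs * n_volumes)

# Sums of squares for each factor and for the residual error.
ss_total = sum((v - grand) ** 2 for v in values)
ss_design = n_volumes * sum(
    (sum(data[d]) / n_volumes - grand) ** 2 for d in designs)
ss_volume = n_designs * sum(
    (sum(data[d][i] for d in designs) / n_designs - grand) ** 2
    for i in range(n_volumes))
ss_error = ss_total - ss_design - ss_volume

df_design = n_designs - 1        # 6
df_volume = n_volumes - 1        # 2
df_error = df_design * df_volume # 12
ms_error = ss_error / df_error

f_design = (ss_design / df_design) / ms_error
f_volume = (ss_volume / df_volume) / ms_error
print(f"F(design) = {f_design:.2f}, F(volume) = {f_volume:.2f}")
```

With these averages, F(design) comes out near 3.9 against a 5% critical F(6,12) of roughly 3.0, so the design factor is significant even in this reduced data set, and the volume factor is far larger still. A follow-up means test such as Tukey's would then compare the individual design means using the square root of MS_error as the pooled error estimate.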

Example 21: Traffic Safety; Non-parametric Methods

Area: Traffic safety

Method of Analysis: Non-parametric methods (methods used when data do not follow assumed or conventional distributions, such as when comparing median values)

1. Research Question/Problem Statement: A city traffic engineer has been receiving many citizen complaints about the perceived lack of safety at unsignalized midblock crosswalks. Apparently, some motorists seem surprised by pedestrians in the crosswalks and do not yield to them. The engineer believes that larger and brighter warning signs may be an inexpensive way to enhance safety at these locations.

Question/Issue: Determine whether some treatment has an effect when the data to be tested do not follow known distributions. In this example, a non-parametric method is used to determine whether larger and brighter warning signs improve pedestrian safety at unsignalized midblock crosswalks. The null and alternative hypotheses are stated as follows:
Ho: There is no difference in the median values of the number of conflicts before and after the treatment.
Ha: There is a difference in the median values.

2. Identification and Description of Variables: The engineer would like to collect collision data at crosswalks with improved signs, but it would take a long time at a large sample of crosswalks to accumulate a reasonable sample of collisions to answer the question. Instead, the engineer collects data on conflicts: near-collisions in which one or both of the involved entities brakes or swerves within 2 seconds of a collision to avoid it. The research literature has shown that conflicts are related to collisions, and because conflicts are much more numerous than collisions, a good sample can be collected much more quickly.
Conflict data are not nearly as widely used as collision data, however, and the underlying distribution of conflict data is not clear. Thus, the use of non-parametric methods seems appropriate.

3. Data Collection: The engineer identifies seven test crosswalks in the city based on large pedestrian volumes and the presence of convenient vantage points for observing conflicts. The engineering staff collects data on traffic conflicts for 2 full days at each of the seven crosswalks with standard warning signs. The engineer then has larger and brighter warning signs installed at the seven sites. After waiting at least 1 month after sign installation at each site, the staff again collects traffic conflict data for 2 full days, making sure that weather, light, and as many other conditions as possible are similar between the before-and-after data collection periods at each site.

4. Specification of Analysis Technique and Data Analysis: A non-parametric statistical test is an efficient way to analyze data when the underlying distribution is unclear (as in this example using conflict data) and when the sample size is small (as in this example with its small number of sites). Several such tests, including the sign test and the Wilcoxon signed-rank test, are plausible in this example. (For more information about non-parametric tests, see NCHRP Project 20-45, Volume 2, Chapter 6, Section D, "Hypothesis About Population Medians for Independent Samples.") The decision is made to use the Wilcoxon signed-rank test because it is a more powerful test for paired numerical measurements than other tests, and this example uses paired (before-and-after) measurements. (Note that the Wilcoxon signed-rank test for paired samples is distinct from the Wilcoxon rank-sum test, which applies to independent samples.) The sign test is a popular non-parametric test for paired data but loses information contained in numerical measurements by reducing the data to a series of positive or negative signs.

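The mechanics of the signed-rank computation worked through below (rank the absolute differences, average any tied ranks, and sum the ranks of the positive differences) are easy to script. A minimal sketch, using the before-minus-after differences that appear in Table 31:

```python
def signed_rank_statistic(diffs):
    """Sum of the ranks of the positive differences, with tied
    absolute differences receiving the average of their ranks.
    (Zero differences, which the standard test drops, do not
    occur in this data set.)"""
    abs_sorted = sorted(abs(d) for d in diffs)

    def avg_rank(value):
        # Ranks are 1-based positions in the sorted absolute differences.
        positions = [i + 1 for i, a in enumerate(abs_sorted) if a == value]
        return sum(positions) / len(positions)

    return sum(avg_rank(abs(d)) for d in diffs if d > 0)

# Differences from Table 31 (standard signs minus larger/brighter signs).
x = signed_rank_statistic([15, 7, 2, 3, 7, -12, -16])
print(x)  # 16.0, compared against the one-tailed critical value of 24
```

For larger samples, statistical packages provide the same test with exact or approximate p-values (in SciPy, for example, scipy.stats.wilcoxon).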
Having decided on the Wilcoxon signed-rank test, the engineer arranges the data (see Table 31).

Table 31. Number of conflicts recorded during each (equal) time period at each site.

                              Site 1  Site 2  Site 3  Site 4  Site 5  Site 6  Site 7
Standard signs                  170     39      35      32      32      19      45
Larger and brighter signs       155     26      33      29      25      31      61
Difference                       15      7       2       3       7     -12     -16
Rank of absolute difference       6    3.5       1       2     3.5       5       7

The third row of the table is the difference between the two conflict counts at each site, and the last row ranks the sites from lowest to highest absolute difference. Site 3 has the smallest difference (35 - 33 = 2), while Site 7 has the largest (45 - 61 = -16). The Wilcoxon signed-rank test ranks the differences from low to high in terms of absolute value; in this case, that ordering is 2, 3, 7, 7, 12, 15, and 16. The test statistic, x, is the sum of the ranks that have positive differences. In this example, x = 1 + 2 + 3.5 + 3.5 + 6 = 16. Notice that all sites except Sites 6 and 7 had positive differences. Notice also that the tied differences (the two 7s) were assigned ranks equal to the average of the ranks they would have received had they differed slightly from each other.

The engineer then consults a table for the Wilcoxon signed-rank test to get a critical value against which to compare the test statistic. (Such a table appears in NCHRP Project 20-45, Volume 2, Appendix C, Table C-8.) The standard table for a sample size of seven shows that the critical value for a one-tailed test (testing whether there is an improvement) at a confidence level of 95% is x = 24.

5. Interpreting the Results: Because the calculated value (x = 16) is less than the critical value (x = 24), the engineer concludes that there is not a statistically significant difference between the number of conflicts recorded with standard signs and the number recorded with larger and brighter signs.

6. Conclusion and Discussion: Non-parametric tests do not require the engineer to make restrictive assumptions about an underlying distribution and are therefore good choices in cases like this, in which the sample size is small and the data collected do not have a familiar underlying distribution. Many non-parametric tests are available, so analysts should do some reading and searching before settling on the best one for any particular case. Once a non-parametric test has been selected, it is usually easy to apply.

This example also illustrates one of the potential pitfalls of statistical testing. The engineer's conclusion is that there is not a statistically significant difference between the number of conflicts recorded with standard signs and the number recorded with larger and brighter signs. That conclusion does not necessarily mean that larger and brighter signs are a bad idea at sites similar to those tested. Notice that in this experiment, the larger and brighter signs produced lower conflict frequencies at five of the seven sites, and the average number of conflicts per site was lower with the larger and brighter signs. Given that signs are relatively inexpensive, they may be a good idea at sites like those tested. A statistical test can provide useful information, especially about the quality of the experiment, but analysts must be careful not to interpret the results of a statistical test too strictly.

In this example, the greatest danger to the validity of the test result lies not in the statistical test but in the underlying before-and-after test setup. For the results to be valid, it is necessary that the only important change that affects conflicts at the test sites during data collection be
the new signs. The engineer has kept the duration between the before-and-after data collection periods short, which helps minimize the chances of other important changes. However, if there is any reason to suspect other important changes, these test results should be viewed skeptically and a more sophisticated test strategy should be employed.

7. Applications in Other Areas of Transportation Research: Non-parametric tests are helpful when researchers are working with small sample sizes or sample data wherein the underlying distribution is unknown. Examples of other areas of transportation research in which non-parametric tests may be applied include:
• Transportation Planning, Public Transportation—to analyze data from surveys and questionnaires when the scale of the response calls into question the underlying distribution. Such data are often analyzed in transportation planning and public transportation.
• Traffic Operations—to analyze small samples of speed or volume data.
• Structures, Pavements—to analyze quality ratings of pavements, bridges, and other transportation assets. Such ratings also use scales.

Resources

The examples used in this report have included references to the following resources. Researchers are encouraged to consult these resources for more information about statistical procedures.

Freund, R. J. and W. J. Wilson (2003). Statistical Methods. 2d ed. Burlington, MA: Academic Press. See page 256 for a discussion of Tukey's procedure.

Kutner, M. et al. (2005). Applied Linear Statistical Models. 5th ed. Boston: McGraw-Hill. See page 746 for a discussion of Tukey's procedure.

NCHRP CD-22: Scientific Approaches to Transportation Research, Vol. 1 and 2. 2002. Transportation Research Board of the National Academies, Washington, D.C.
This two-volume electronic manual developed under NCHRP Project 20-45 provides a comprehensive source of information on the conduct of research. The manual includes state-of-the-art techniques for problem statement development; literature searching; development of the research work plan; execution of the experiment; data collection, management, quality control, and reporting of results; and evaluation of the effectiveness of the research, as well as the requirements for the systematic, professional, and ethical conduct of transportation research.

For readers' convenience, the references to NCHRP Project 20-45 from the various examples contained in this report are summarized here by topic and location in NCHRP CD-22. More information about NCHRP CD-22 is available at http://www.trb.org/Main/Blurbs/152122.aspx.

• Analysis of Variance (one-way ANOVA and two-way ANOVA): See Volume 2, Chapter 4, Section A, Analysis of Variance Methodology (pp. 113, 119–31).
• Assumptions for residual errors: See Volume 2, Chapter 4.
• Box plots; Q-Q plots: See Volume 2, Chapter 6, Section C.
• Chi-square test: See Volume 2, Chapter 6, Sections E (Chi-Square Test for Independence) and F.
• Chi-square values: See Volume 2, Appendix C, Table C-2.
• Computations on unbalanced designs and multi-factorial designs: See Volume 2, Chapter 4, Section A, Analysis of Variance Methodology (pp. 119–31).
• Confidence intervals: See Volume 2, Chapter 4.
• Correlation coefficient: See Volume 2, Appendix A, Glossary, Correlation Coefficient.
• Critical F-value: See Volume 2, Appendix C, Table C-5.
• Desirable and undesirable residual plots (scatter plots): See Volume 2, Chapter 4, Section B, Figure 6.

• Equation fit: See Volume 2, Chapter 4, Glossary, Descriptive Measures of Association Between X and Y.
• Error distributions (normality, constant variance, uncorrelated, etc.): See Volume 2, Chapter 4 (pp. 146–55).
• Experiment design and data collection: See Volume 2, Chapter 1.
• Fcrit and F-distribution table: See Volume 2, Appendix C, Table C-5.
• F-test: See Volume 2, Chapter 4, Section A, Compute the F-ratio Test Statistic (p. 124).
• Formulation of formal hypotheses for testing: See Volume 1, Chapter 2, Hypothesis; Volume 2, Appendix A, Glossary.
• History and maturation biases (specification errors): See Volume 2, Chapter 1, Quasi-Experiments.
• Indicator (dummy) variables: See Volume 2, Chapter 4 (pp. 142–45).
• Intercept and slope: See Volume 2, Chapter 4 (pp. 140–42).
• Maximum likelihood methods: See Volume 2, Chapter 5 (pp. 208–11).
• Mean and standard deviation formulas: See Volume 2, Chapter 6, Table C, Frequency Distributions, Variance, Standard Deviation, Histograms, and Boxplots.
• Measured ratio or interval scale: See Volume 2, Chapter 1 (p. 83).
• Multinomial distribution and polychotomous logistical model: See Volume 2, Chapter 5 (pp. 211–18).
• Multiple (multivariate) regression: See Volume 2, Chapter 4, Section B.
• Non-parametric tests: See Volume 2, Chapter 6, Section D.
• Normal distribution: See Volume 2, Appendix A, Glossary, Normal Distribution.
• One- and two-sided hypothesis testing (one- and two-tail test values): See Volume 2, Chapter 4 (pp. 161 and 164–65).
• Ordinary least squares (OLS) regression: See Volume 2, Chapter 4, Section B, Linear Regression.
• Sample size and confidence: See Volume 2, Chapter 1, Sample Size Determination.
• Sample size determination based on statistical power requirements: See Volume 2, Chapter 1, Sample Size Determination (p. 94).
• Sign test and the Wilcoxon signed-rank (Wilcoxon rank-sum) test: See Volume 2, Chapter 6, Section D, and Appendix C, Table C-8, Hypothesis About Population Medians for Independent Samples.
• Split samples: See Volume 2, Chapter 4, Section A, Analysis of Variance Methodology (pp. 119–31).
• Standard chi-square distribution table: See Volume 2, Appendix C, Table C-2.
• Standard normal values: See Volume 2, Appendix C, Table C-1.
• tcrit values: See Volume 2, Appendix C, Table C-4.
• t-statistic: See Volume 2, Appendix A, Glossary.
• t-statistic using equation for equal variance: See Volume 2, Appendix C, Table C-4.
• t-test: See Volume 2, Chapter 4, Section B, How are t-statistics Interpreted?
• Tabularized values of t-statistic: See Volume 2, Appendix C, Table C-4.
• Tukey's test, Bonferroni's test, Scheffe's test: See Volume 2, Chapter 4, Section A, Analysis of Variance Methodology (pp. 119–31).
• Types of data and implications for selection of analysis techniques: See Volume 2, Chapter 1, Identification of Empirical Setting.

Abbreviations and acronyms used without definitions in TRB publications:

AAAE    American Association of Airport Executives
AASHO   American Association of State Highway Officials
AASHTO  American Association of State Highway and Transportation Officials
ACI–NA  Airports Council International–North America
ACRP    Airport Cooperative Research Program
ADA     Americans with Disabilities Act
APTA    American Public Transportation Association
ASCE    American Society of Civil Engineers
ASME    American Society of Mechanical Engineers
ASTM    American Society for Testing and Materials
ATA     American Trucking Associations
CTAA    Community Transportation Association of America
CTBSSP  Commercial Truck and Bus Safety Synthesis Program
DHS     Department of Homeland Security
DOE     Department of Energy
EPA     Environmental Protection Agency
FAA     Federal Aviation Administration
FHWA    Federal Highway Administration
FMCSA   Federal Motor Carrier Safety Administration
FRA     Federal Railroad Administration
FTA     Federal Transit Administration
HMCRP   Hazardous Materials Cooperative Research Program
IEEE    Institute of Electrical and Electronics Engineers
ISTEA   Intermodal Surface Transportation Efficiency Act of 1991
ITE     Institute of Transportation Engineers
NASA    National Aeronautics and Space Administration
NASAO   National Association of State Aviation Officials
NCFRP   National Cooperative Freight Research Program
NCHRP   National Cooperative Highway Research Program
NHTSA   National Highway Traffic Safety Administration
NTSB    National Transportation Safety Board
PHMSA   Pipeline and Hazardous Materials Safety Administration
RITA    Research and Innovative Technology Administration
SAE     Society of Automotive Engineers
SAFETEA-LU  Safe, Accountable, Flexible, Efficient Transportation Equity Act: A Legacy for Users (2005)
TCRP    Transit Cooperative Research Program
TEA-21  Transportation Equity Act for the 21st Century (1998)
TRB     Transportation Research Board
TSA     Transportation Security Administration
U.S.DOT United States Department of Transportation

TRB’s National Cooperative Highway Research Program (NCHRP) Report 727: Effective Experiment Design and Data Analysis in Transportation Research describes the factors that may be considered in designing experiments and presents 21 typical transportation examples illustrating the experiment design process, including selection of appropriate statistical tests.

The report is a companion to NCHRP CD-22, Scientific Approaches to Transportation Research, Volumes 1 and 2, which present detailed information on statistical methods.

Generate accurate APA citations for free

  • Knowledge Base
  • APA Style 7th edition
  • How to write an APA results section

Reporting Research Results in APA Style | Tips & Examples

Published on December 21, 2020 by Pritha Bhandari . Revised on January 17, 2024.

The results section of a quantitative research paper is where you summarize your data and report the findings of any relevant statistical analyses.

The APA manual provides rigorous guidelines for what to report in quantitative research papers in the fields of psychology, education, and other social sciences.

Use these standards to answer your research questions and report your data analyses in a complete and transparent way.

Instantly correct all language mistakes in your text

Upload your document to correct all your mistakes in minutes

upload-your-document-ai-proofreader

Table of contents

What goes in your results section, introduce your data, summarize your data, report statistical results, presenting numbers effectively, what doesn’t belong in your results section, frequently asked questions about results in apa.

In APA style, the results section includes preliminary information about the participants and data, descriptive and inferential statistics, and the results of any exploratory analyses.

Include these in your results section:

  • Participant flow and recruitment period. Report the number of participants at every stage of the study, as well as the dates when recruitment took place.
  • Missing data . Identify the proportion of data that wasn’t included in your final analysis and state the reasons.
  • Any adverse events. Make sure to report any unexpected events or side effects (for clinical studies).
  • Descriptive statistics . Summarize the primary and secondary outcomes of the study.
  • Inferential statistics , including confidence intervals and effect sizes. Address the primary and secondary research questions by reporting the detailed results of your main analyses.
  • Results of subgroup or exploratory analyses, if applicable. Place detailed results in supplementary materials.

Write up the results in the past tense because you’re describing the outcomes of a completed research study.

Scribbr Citation Checker New

The AI-powered Citation Checker helps you avoid common mistakes such as:

  • Missing commas and periods
  • Incorrect usage of “et al.”
  • Ampersands (&) in narrative citations
  • Missing reference entries

data analysis sample research paper

Before diving into your research findings, first describe the flow of participants at every stage of your study and whether any data were excluded from the final analysis.

Participant flow and recruitment period

It’s necessary to report any attrition, which is the decline in participants at every sequential stage of a study. That’s because an uneven number of participants across groups sometimes threatens internal validity and makes it difficult to compare groups. Be sure to also state all reasons for attrition.

If your study has multiple stages (e.g., pre-test, intervention, and post-test) and groups (e.g., experimental and control groups), a flow chart is the best way to report the number of participants in each group per stage and reasons for attrition.

Also report the dates for when you recruited participants or performed follow-up sessions.

Missing data

Another key issue is the completeness of your dataset. It’s necessary to report both the amount and reasons for data that was missing or excluded.

Data can become unusable due to equipment malfunctions, improper storage, unexpected events, participant ineligibility, and so on. For each case, state the reason why the data were unusable.

Some data points may be removed from the final analysis because they are outliers—but you must be able to justify how you decided what to exclude.

If you applied any techniques for overcoming or compensating for lost data, report those as well.

Adverse events

For clinical studies, report all events with serious consequences or any side effects that occured.

Descriptive statistics summarize your data for the reader. Present descriptive statistics for each primary, secondary, and subgroup analysis.

Don’t provide formulas or citations for commonly used statistics (e.g., standard deviation) – but do provide them for new or rare equations.

Descriptive statistics

The exact descriptive statistics that you report depends on the types of data in your study. Categorical variables can be reported using proportions, while quantitative data can be reported using means and standard deviations . For a large set of numbers, a table is the most effective presentation format.

Include sample sizes (overall and for each group) as well as appropriate measures of central tendency and variability for the outcomes in your results section. For every point estimate , add a clearly labelled measure of variability as well.

Be sure to note how you combined data to come up with variables of interest. For every variable of interest, explain how you operationalized it.

According to APA journal standards, it’s necessary to report all relevant hypothesis tests performed, estimates of effect sizes, and confidence intervals.

When reporting statistical results, you should first address primary research questions before moving onto secondary research questions and any exploratory or subgroup analyses.

Present the results of tests in the order that you performed them—report the outcomes of main tests before post-hoc tests, for example. Don’t leave out any relevant results, even if they don’t support your hypothesis.

Inferential statistics

For each statistical test performed, first restate the hypothesis , then state whether your hypothesis was supported and provide the outcomes that led you to that conclusion.

Report the following for each hypothesis test:

  • the test statistic value,
  • the degrees of freedom ,
  • the exact p- value (unless it is less than 0.001),
  • the magnitude and direction of the effect.

When reporting complex data analyses, such as factor analysis or multivariate analysis, present the models estimated in detail, and state the statistical software used. Make sure to report any violations of statistical assumptions or problems with estimation.

Effect sizes and confidence intervals

For each hypothesis test performed, you should present confidence intervals and estimates of effect sizes.

Confidence intervals are useful for showing the variability around point estimates. They should be included whenever you report population parameter estimates.

Effect sizes indicate the practical magnitude of a study's outcomes. Since they are estimates, it's recommended that you also provide confidence intervals for effect sizes.
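As one hedged illustration (pure Python, invented data; the large-sample standard-error formula for d is an approximation, and dedicated software computes a more exact interval from the noncentral t distribution), Cohen's d with an approximate 95% CI might be computed as:

```python
from math import sqrt
from statistics import mean, variance

def cohens_d_with_ci(x, y):
    """Cohen's d (pooled SD) with an approximate 95% CI (large-sample standard error)."""
    n1, n2 = len(x), len(y)
    sd_pooled = sqrt(((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / (n1 + n2 - 2))
    d = (mean(x) - mean(y)) / sd_pooled
    # Approximate standard error of d, then a normal-theory 95% interval around it.
    se = sqrt((n1 + n2) / (n1 * n2) + d**2 / (2 * (n1 + n2)))
    return d, (d - 1.96 * se, d + 1.96 * se)

# Invented example data for two independent groups.
d, (lo, hi) = cohens_d_with_ci([16, 18, 15, 17, 19], [12, 15, 11, 14, 13])
print(f"d = {d:.2f}, 95% CI [{lo:.2f}, {hi:.2f}]")
```

In the write-up, the estimate and its interval would then appear together, e.g. "d = 2.53, 95% CI [0.87, 4.19]".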

Subgroup or exploratory analyses

Briefly report the results of any other planned or exploratory analyses you performed. These may include subgroup analyses as well.

Subgroup analyses come with a high chance of false positive results, because performing a large number of comparison or correlation tests increases the odds of finding significant results by chance alone.
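One standard safeguard, sketched here in plain Python (the p values are invented), is a Bonferroni correction that scales each p value by the number of tests in the family:

```python
def bonferroni(p_values):
    """Bonferroni-adjust a family of p values: multiply each by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

# Invented raw p values from five subgroup comparisons.
raw = [0.004, 0.03, 0.2, 0.01, 0.5]
adjusted = bonferroni(raw)
print(adjusted)  # only the smallest raw p value stays below alpha = .05 after adjustment
```

Less conservative alternatives (e.g., Holm's step-down procedure) follow the same idea of controlling the family-wise error rate.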

If you find significant results in these analyses, make sure to appropriately report them as exploratory (rather than confirmatory) results to avoid overstating their importance.

While these analyses can be reported in less detail in the main text, you can provide the full analyses in supplementary materials.

To effectively present numbers, use a mix of text, tables, and figures where appropriate:

  • To present three or fewer numbers, try a sentence,
  • To present between 4 and 20 numbers, try a table,
  • To present more than 20 numbers, try a figure.
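Since the cutoffs above are mechanical, they can be sketched as a tiny helper (Python; the function name and thresholds merely encode the rule of thumb, not any official API):

```python
def suggested_format(n_numbers):
    """Suggest a presentation format from the sentence/table/figure rule of thumb."""
    if n_numbers <= 3:
        return "sentence"
    if n_numbers <= 20:
        return "table"
    return "figure"

for n in (2, 12, 45):
    print(n, "->", suggested_format(n))
```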

Since these are general guidelines, use your own judgment and feedback from others for effective presentation of numbers.

Tables and figures should be numbered and have titles, along with relevant notes. Make sure to present data only once throughout the paper and refer to any tables and figures in the text.

Formatting statistics and numbers

It’s important to follow capitalization, italicization, and abbreviation rules when referring to statistics in your paper. There are specific format guidelines for reporting statistics in APA, as well as general rules about writing numbers.

If you are unsure of how to present specific symbols, look up the detailed APA guidelines or other papers in your field.

It’s important to provide a complete picture of your data analyses and outcomes in a concise way. For that reason, raw data and any interpretations of your results are not included in the results section.

It’s rarely appropriate to include raw data in your results section. Instead, you should always save the raw data securely and make them available and accessible to any other researchers who request them.

Making scientific research available to others is a key part of academic integrity and open science.

Interpretation or discussion of results

This belongs in your discussion section. Your results section is where you objectively report all relevant findings and leave them open for interpretation by readers.

While you should state whether the findings of statistical tests lend support to your hypotheses, refrain from forming conclusions to your research questions in the results section.

Explanation of how statistics tests work

For the sake of concise writing, you can safely assume that readers of your paper have professional knowledge of how statistical inferences work.

In an APA results section, you should generally report the following:

  • Participant flow and recruitment period.
  • Missing data and any adverse events.
  • Descriptive statistics about your samples.
  • Inferential statistics, including confidence intervals and effect sizes.
  • Results of any subgroup or exploratory analyses, if applicable.

According to the APA guidelines, you should report enough detail on inferential statistics so that your readers understand your analyses. For each hypothesis test, report:

  • the test statistic value
  • the degrees of freedom
  • the exact p value (unless it is less than 0.001)
  • the magnitude and direction of the effect

You should also present confidence intervals and estimates of effect sizes where relevant.

In APA style, statistics can be presented in the main text or as tables or figures. To decide how to present numbers, you can follow APA guidelines:

  • To present three or fewer numbers, try a sentence,
  • To present between 4 and 20 numbers, try a table,
  • To present more than 20 numbers, try a figure.

Results are usually written in the past tense, because they are describing the outcome of completed actions.

The results chapter or section simply and objectively reports what you found, without speculating on why you found these results. The discussion interprets the meaning of the results, puts them in context, and explains why they matter.

In qualitative research, results and discussion are sometimes combined. But in quantitative research, it’s considered important to separate the objective results from your interpretation of them.

Cite this Scribbr article


Bhandari, P. (2024, January 17). Reporting Research Results in APA Style | Tips & Examples. Scribbr. Retrieved October 9, 2024, from https://www.scribbr.com/apa-style/results-section/


Methods Section of Research Paper: Writing Steps


The methods section of a research paper can feel a bit intimidating, but it's actually one of the most straightforward parts. Here, you're basically walking readers through the steps you took to conduct your research. Every choice, from your data collection techniques to how you analyzed the results, matters because it ensures others can follow your process or validate your findings.

When writing about materials and methods in a research paper, you're not just listing steps; you're explaining your approach and the reasoning behind it. Why did you choose that specific method? What made it the best fit for your study? By answering these questions, you add clarity and depth and give readers confidence in your work.

This guide breaks down how to write the methods section of a research paper so it's less overwhelming.

What Is the Methods Section of a Research Paper

The methods section explains how you conducted your research. It's a detailed explanation of the steps you took to gather data, what tools or techniques you used, and how you analyzed the information. The goal is to give readers enough detail so they can understand your process and, if needed, replicate the study themselves.

This section covers things like your research design (was it experimental, observational, or something else?), the participants or materials involved, the procedures you followed, and the methods used to analyze the results. It's all about transparency—making sure others know exactly what you did and why you did it. And what is the purpose of the methods section in a research paper, you may ask? Let's look into that below.




Why Is the Methods Section Important

Writing the methods section of a research paper is important because it shows the foundation of your research. If the methods aren't clear or reliable, then the entire study can be questioned. This section gives readers the confidence that your findings are valid by showing that your research was conducted in a logical, well-planned way. It's also key for reproducibility—other researchers should be able to follow your steps and get similar results if they replicate the study.

For example, imagine you're reading a paper about a new way to reduce anxiety through mindfulness exercises. If the methods part of a research paper doesn't clearly explain how the exercises were conducted, how many participants were involved, or how the results were measured, it's hard to trust the conclusion. But if the steps are detailed and make sense, you can see that the research was solid and potentially use it to build further studies or apply the findings in practice.

Methods Section of Research Paper: Structure Breakdown

Your methods section should follow a chronological order, starting with the first step (such as selecting participants) and ending with the last step (like analyzing the data). These sections are often divided into labeled subsections, each covering a specific part of the research process. The methods section outline can vary based on the type of project and discipline, but the following are common elements:

  • Study Design: Explains whether your study is experimental, observational, or something else. It's the framework guiding your investigation.
  • Setting and Subjects: Describes where the study took place and who or what was involved. Whether it's a group of participants, animals, or objects, this section covers all the details of your sample.
  • Data Collection: Here you explain how you gathered your data—through surveys, interviews, experiments, etc.—and the tools or instruments you used.
  • Data Analysis: Once you collected the data, how did you make sense of it? This part explains the methods you used to analyze your results, whether through statistical software or manual coding.
  • Ethical Approval: If your study involves human or animal subjects, you must explain how you obtained ethical clearance. This ensures your research was conducted responsibly and followed ethical guidelines.

These parts of a methods section are quite simple and straightforward; you can even start drafting it while your research is ongoing.

How to Write the Methods Section of a Research Paper

Writing the methods section doesn't have to be complicated. Follow the seven key steps below to structure this section effectively:

  • Start with an Overview
  • Describe Your Study Design
  • Explain the Setting and Subjects
  • Detail Your Data Collection Methods
  • Outline Your Data Analysis Approach
  • Address Ethical Considerations
  • Provide Enough Detail for Replication

Step 1. Start with an Overview

The first step is to give readers a general idea of how your research was conducted. This overview should be brief, but it should cover the essential aspects of your approach—like the type of study, the key methods you used, and the overall goal of the research.

For example, you might write: "This study used a mixed-methods approach to investigate the effects of virtual reality on improving memory recall in elderly participants. A combination of cognitive tests and participant feedback was used to measure the impact."

This overview gives readers the gist of your study—what you were testing, who was involved, and how you gathered data. By mentioning both cognitive tests and feedback, you're hinting at the blend of methods without giving everything away upfront. Keep it clear and straightforward, with just enough detail to hook your readers.

Step 2. Describe Your Study Design

The study design section answers the question: How did you structure your research? Be clear about whether your study was experimental, observational, qualitative, or something else. The goal is to explain the backbone of your research, giving readers a sense of how you planned to gather and interpret your data.

For example, you could say: "This study followed an experimental design, where participants were randomly assigned to either a virtual reality group or a traditional memory exercise group. The outcomes of both groups were compared to determine the effect of the intervention."

Don't go overboard with details here, but make sure you're clear about how your study was set up. This helps readers understand the logic behind your research and how you aimed to reach your conclusions.

Step 3. Explain the Setting and Subjects

This step is all about where and with whom the research took place. You'll need to describe the setting, whether it was in a lab, a classroom, or even online. Then, introduce your subjects or participants. Were they people, animals, or objects? And how did you select them? It's important to clarify these details so readers can understand the environment and who (or what) was involved.

For instance:

"The study took place in a community center with 40 elderly participants aged 65 and older. Participants were recruited through local senior groups and were chosen based on their willingness to engage in cognitive activities."

In the methods section of research paper, the setting and subjects are clearly defined, giving readers a good sense of where the research happened and who was part of it.

Step 4. Detail Your Data Collection Methods

Now, it's time to explain how you collected your data. This section is crucial because it shows readers the tools and techniques you used to gather information. Did you conduct surveys, run experiments, or hold interviews? Make sure to mention the methods clearly, including any specific equipment or instruments you used.

For example, you might write:

"Data were collected through two methods: cognitive tests that measured short-term memory performance and structured interviews that gathered qualitative feedback on the participants' experiences with virtual reality."

Notice how the sentence flows. You're not just listing methods; you're linking them together with a clear purpose. This helps create a smooth narrative while giving enough detail for others to understand how the data was gathered.

Step 5. Outline Your Data Analysis Approach

After collecting your data, the next step is to explain how you analyzed it. This is where you describe the methods or tools you used to make sense of the results. Were statistical tests involved? Did you categorize qualitative data? It's important to explain your approach so readers understand how you interpreted the findings.

For example: “Quantitative data from the cognitive tests were analyzed using a t-test to compare memory performance between the two groups. Qualitative data from the interviews were coded thematically to identify common patterns in participant feedback.”

Here, you're showing how both numbers and words were processed—giving a full picture of your analysis techniques. Keep it simple, but don't leave out important details.

Step 6. Address Ethical Considerations

Every study involving human or animal subjects must consider ethics. In this section, mention how you ensured the safety and rights of participants. Did you get informed consent? Was there any risk involved, and how was it minimized? Ethical approval shows that your research was conducted responsibly.

"All participants provided informed consent before the study began, and the research was approved by the local ethics committee. Participants were free to withdraw at any time, and confidentiality was strictly maintained throughout the study."

This reassures readers that you followed proper procedures, ensuring that the research was ethically sound.

Step 7. Provide Enough Detail for Replication

Lastly, give enough information so that someone else could replicate your study if needed. This doesn't mean overwhelming readers with every tiny detail, but the methods should be clear and specific enough that others can follow them. The goal is transparency.

For example: "Each participant completed three 30-minute virtual reality sessions over a two-week period, with cognitive tests administered before and after each session. The same protocol was followed for the traditional memory exercise group."

This kind of detail ensures that anyone trying to replicate your research can follow the same steps, reinforcing the reliability of your study.

How to Write a Methods Section APA

When writing the APA methods section, it's important to follow a clear format. Here's a brief overview of what to include:

  • Participants: In this part, describe the people who took part in your study. Start by explaining how you found them and the total number involved. For a proposal, mention how many participants you aim to include; for completed research, give the actual number. Include important details like age, gender, and other relevant characteristics. For example, you might say, "Participants (N = 150) will consist of 70 males and 80 females, aged 18-24."
  • Materials: This section details the tools and instruments used in your study. Make sure to provide enough information so someone else could replicate your research. If you use existing surveys or scales, mention their names and how they will be used. For example, you could write: "The Beck Depression Inventory (Beck et al., 1961) will measure participants' depressive symptoms. It includes 21 items rated on a scale from 0 to 3, where higher scores indicate more severe depression."
  • Procedure: Describe the steps that participants will follow in order. Clearly explain how they will be recruited, how you will obtain their consent, and what they will do during the study. For example: "Participants will be recruited through social media ads and will fill out an online consent form before starting the survey. They will answer demographic questions and then complete the main questionnaire on study habits."

In summary, the methods section in APA style should be organized into these subsections, each providing enough detail for others to replicate your study.

Methods Section of Research Paper Example

Below is a simplified real-life example to illustrate how the methods section is structured.

Tips and Pitfalls to Avoid

Even though writing the methods section of research paper can be straightforward, there are a few things to watch out for. Here are some tips to keep in mind—and some common pitfalls you'll want to avoid:

Be Clear, Not Vague - One common mistake is being too vague about what you did. For example, saying "participants completed a survey" doesn't give enough information. Instead, describe the survey, how it was administered, and what it measured. A better version would be: "Participants completed a 20-question survey on stress levels, administered online via Google Forms."

Avoid Overloading with Unnecessary Details - While it's essential to be detailed, don't drown your readers in irrelevant info. For example, if you're describing the software used for analysis, just mention it briefly unless it's crucial to understand your study. Instead of going into every setting you used, say: "Data was analyzed using SPSS, version 26."

Stay Consistent - One pitfall is inconsistencies between different sections of your paper. If you say you're going to measure five variables in your methods section, don't list six in your results. This creates confusion and makes your study harder to follow. Stay consistent throughout.

Don't Skip Ethical Considerations - Sometimes, writers forget to address ethics in their methods section. Make sure you include a brief mention of how you got informed consent and ensured confidentiality. For example, you could say: "All participants gave informed consent, and their data was anonymized to protect their privacy."

Proofread Carefully - Lastly, don't underestimate the power of proofreading. Small errors in your methods section can make a big difference. If a reader gets stuck on a typo or unclear phrase, they might miss your main point. So, go through your writing one more time to make sure everything flows smoothly.

What to Include in the Methods Section of a Research Paper

When writing the methods section, it's important to cover all key aspects without overwhelming your readers with unnecessary detail. Here's a checklist to guide you:

  • Participants or Subjects Start by describing who or what was involved in the study. This could include participants, animals, or specific materials. Provide relevant details like age, sex, and other characteristics, but keep it brief unless the specifics are crucial to your research.
  • Materials or Equipment Mention the tools, software, or equipment used in your study. Be sure to include important details like versions of software or manufacturers of equipment. But avoid unnecessary details—just the essentials.
  • Procedure Lay out the steps of your research, focusing on how you collected and analyzed data. Include enough detail for someone else to replicate your study, but don't bog it down with irrelevant minutiae.
  • Data Analysis Briefly explain how you analyzed your data. Mention any statistical tests or software used. If a specific method was used for data analysis, describe it concisely.
  • Ethical Considerations If applicable, include a short statement on how ethical standards were met, such as obtaining informed consent or ensuring confidentiality.

The methods section of a research paper is about precision and clarity. By including all the key elements—participants, materials, procedures, and data analysis—you provide readers with a clear roadmap of your research. Keep it simple, stay focused, and make sure someone could replicate your study from your description.



Annie Lambert


specializes in creating authoritative content on marketing, business, and finance, with a versatile ability to handle any essay type and dissertations. With a Master’s degree in Business Administration and a passion for social issues, her writing not only educates but also inspires action. On EssayPro blog, Annie delivers detailed guides and thought-provoking discussions on pressing economic and social topics. When not writing, she’s a guest speaker at various business seminars.



Data Analysis Research Proposals Samples For Students

132 samples of this type

While studying in college, you will inevitably have to write a bunch of Research Proposals on Data Analysis. Lucky you if putting words together and transforming them into relevant content comes naturally to you; if it's not the case, you can save the day by finding a previously written Data Analysis Research Proposal example and using it as a model to follow.

This is when you will certainly find WowEssays' free samples catalog extremely helpful, as it includes numerous expertly written works on a wide variety of Data Analysis Research Proposal topics. Ideally, you should be able to find a piece that meets your criteria and use it as a template to build your own research proposal. Alternatively, our competent essay writers can deliver an original Data Analysis Research Proposal model crafted from scratch according to your personal instructions.

Good Research Proposal About Statement: In My Field Of Information System Operation Management (Isom), The Current

Proposal for the current issue in my field research project.

There has been an increase in digital cyber attacks worldwide. Criminals target corporate databases because they contain sensitive company information which, once obtained, can be used against the company. Hackers normally attack databases to acquire sensitive information, such as credit card numbers and other personal details of unsuspecting customers, and use it to commit internet fraud.

Good Example Of Research Proposal On Plan For Developing Request For Proposal Document For Immunization Database In The

The request for proposal (RFP) document for this project will be designed to address the issues contained in the sections and subsections outlined below.

  • Cover page: This section will contain the name of the department (the department of public health, which seeks to recruit a technical team to develop the database); the title of the request for proposal; the date; the due date for submitting the proposal; and the officer to contact. The information will be written in that order, centered on the cover page.
  • Table of contents

The table of contents will list the sections within the document.

Creating The Critical Path Research Proposal Sample

Critical Path

Critical path description: The critical path in project management is the sequence of project activities that takes the longest time to complete. The project team has to give these activities close attention so that the completion date of the project is not affected.

The critical path of the system to be developed will follow several sequential stages.


Research Proposal On Texting & Driving

  • Database Design Research Proposal Examples
  • Free Research Proposal On Association Between Modes Of Delivery
  • Example Of Research Proposal On Database System Implementation And Importance
  • Storage And Processing Research Proposal Examples
  • Free Research Proposal On Other Details

Section 1: Design Document

Introduction

Cloud technology is the latest technology that helps businesses achieve their goals. It consists of a number of servers linked together to provide services to clients. Since a large number of servers are linked, clients have seemingly unlimited storage space. The expansion of businesses generates large volumes of data, and with cross-border expansions and varying time zones, businesses face problems in accessing data. These problems can be resolved by migrating to cloud computing. The two main strengths of cloud computing are scalability and virtualization. This report presents a case for the adoption of cloud technology.

Need for Cloud Technology

Good Research Proposal About GCU: RES 880

Dropping Out or Pushed Out: The Impact of High School Dropout Rate Relative to High Stakes Testing Policy in Wayne County, State of Michigan

Dissertation Prospectus

<Insert Chair Name>

  • Research Proposal On Target Population
  • Riordan ERP Research Proposal Sample
  • Executive Summary
  • Example Of Academic Research Proposal On Data Analysis
  • Research Proposal On Elementary School-Aged Obesity: Review Of Home-Based And School-Based Interventions Sustainability
  • Elementary School-Aged Obesity

Free Suggested By The Writer Research Proposal Sample

1. Organizations need access to transactional systems to generate, analyze, and consult information, but a company may face problems with response times; information distributed across heterogeneous systems can lead to inflexible and overly complex reports and results.

Historical Justification

  • Free Research Proposal About Data Analysis And Analytics Research
  • The Role Of The Senior Level Informatacist In The Informational Technology Change
  • Research Proposal On Face Reconstruction And Recognition Technology In 3D Format
  • Burglary Research Proposal Example
  • Introduction
  • Draw Topic & Writing Ideas From This Research Proposal On H0: “There Is No Significant Relationship Between Employee Turnover And Performance Appraisal.”
  • Proposal For Original Business Research
  • Research Proposal On Efficient 3D Face Reconstruction From Random Image And Recognition From Database
  • Research Proposal On A On Risks And Challenges Associated With The Adoption Of Cloud
  • Free Librarygrantproposer Research Proposal Sample
  • Executive Summary
  • Free Research Methodology 9 Research Proposal Example
  • Data Analysis Of Grief And Nursing Research Proposal
  • Central Question Research Proposal Example
  • Breast Feeding Research Proposal Sample
  • Contact Sports And Concussion Research Proposal Sample
  • Marital Satisfaction And Work-Family Balance Research Proposals Examples
  • Marital Satisfaction And Work-Family Conflict
  • Windows Network Proposal: A Top-Quality Research Proposal For Your Inspiration
  • The Relationship Between School Attendance And Academic Performance In Learning Research Proposal To Use For Practical Writing Help
  • Chapter 1: Introduction
  • Free Research Proposal About The Usefulness For Offender Profiling By Florida State Police Investigators
  • Sample Research Proposal On Windows Network Proposal

The Development Of New Brighton Tourism Destination Research Proposal Example

Tourism is one of the fastest-developing sectors in the world. Many countries with good beaches have found the need to develop tourism destinations that will help market their attraction sites. The research below shows the development of the New Brighton tourism destination. It offers an introduction to tourism, followed by background information on New Brighton. A review of different studies conducted on the same topic is analyzed in order to identify the research gaps. Moreover, the proposal gives the recommended method of data collection; primary data collection is preferred for this research.

Free Nursing Proposal Research Proposal Sample

Chapter One: Introduction
  • Background
  • Statement of the Problem
  • Research Objectives
  • Research Questions
  • Significance of the Study

Chapter Two: Literature Review
  • Introduction
  • Theoretical Literature
  • Empirical Literature
  • System Barriers
  • Healthcare Professional Barrier
  • Social Barriers
  • Patient Barrier

Chapter Three: Methodology
  • Introduction
  • Research Design
  • Model Specification
  • Study Area
  • Target Population
  • Data Type and Data Source
  • Data Collection
  • Data Analysis

References

  • Implementation Of Biometrics Technology For Security Applications In Immigration Research Proposal
  • Sustainability In The Event Tomorrowland: Methodologies Research Proposal Examples
  • Nutrition Vs Medication Research Proposal Sample
  • Research Proposal On Coping Strategies Used By African American Women To Deal With Racism And Sexism

INTRODUCTION

Research Proposal On Nursing Homes

Statement of the Problem

Old age is associated with several mental illnesses, which culminate into other psychosocial issues. For example, dementia and other related conditions such as the Creutzfeudz Jacob’s disease – as caused by advanced senility – have become common in the modern world. Usually, these diseases affect old people, which become more and more problematic as the age advances. The role of caring for the old people has therefore become very vital, usually requiring increased care and special treatment to these people.

Transitional Nursing For The Elderly Population Research Proposals Example

Nursing research.

Research Implementation Phase As mentioned earlier in the paper, in order to answer the five main research questions formulated for this research paper, four different types of data collection methods will be utilized; namely, secondary research of existing literature and primary research (survey-based questionnaires, interviews and direct observation). The implementation phase will include a brief overview of how the project will be executed on-ground, the expenditures that will be needed, as well as how the process of data analysis will be conducted.

Project Schedule and Timeframe

Direct practice improvement project prospectus research proposal sample.

<Insert Title here> <Insert Name> <Insert Submission Date> <Insert Chairperson Name>

Free Criminal Law Research Proposal Example

Example of research proposal on alcoholism and addictive personality, good example of direct practice improvement project prospectus research proposal, variables that influence psychological well-being following significant weight loss submitted by.

<Insert Name> <Insert Submission Date> <Insert Chairperson Name>

Effective Communication In The Workplace: The Relationship Between Employee Performance And Organizational Communication Research Proposal Samples

Literature review, good example of data analysis plans research proposal.

In order to gain a better and an in-depth understanding of a research question, the researcher must use a data analysis plan that satisfies all the needs of the research design and the respective variables. As such, data analysis for this research would be done using the SPSS software. The beauty of using this statistical analysis software is that it offers a wide variety of statistical features that can be applied for different types of variables, both demographic and study variables.

Plan for demographic data analysis

Evaluation on the effects of lilac-colored paper in readability research proposal sample, restatement of initial research questions, does religion and spirituality help veterans cope with military-related ptsd, and lower the risk of suicide: example research proposal by an expert writer to follow.

(Author, Department, University,

Corresponding Address and email)

Example of research proposal on cognitive behavioral therapy in women with depression, the lived experiences of staff nurses on burnout in oncology department research proposal example, the lived experiences of staff nurses on burnout in oncology department.

Research Question: What are the lived experiences of nurses in the Oncology Department? What are the coping experiences of nurses in the Oncology Department? Objectives of the study: Conceptual Framework Figure 1 Conceptual Framework

Research design

Growth opportunities: example research proposal by an expert writer to follow, section 1: project introduction, expertly crafted research proposal on research to investigate cultural visibility in influencing perinatal depression prevention methods among underserved women, example of a change proposal for branch m of the xyz company research proposal, sample research proposal on statement of the problem, investigating the significance of reputation management in real estate business during an economic crisis.

Investigating the Significance of Reputation Management in Real Estate Business during an Economic Crisis Introduction

Good Leveraging Big Data To Analyze Consumer Behavior In Digital And Retail Marketing Research Proposal Example

A research thesis proposal submitted to dr, data analysis and analytics research utilization project proposal research proposal examples, data analysis and analytics research, prevention research proposal sample, invasion of exotic channa argus in usa and its, research proposal on social implications that influence younger people to want to stay slim, overcoming communication barriers research proposal, executive proposal project research proposal samples.



Open access | Published: 08 October 2024

The levels of amino acid metabolites in serum induce the pathogenesis of atopic dermatitis by mediating the inflammatory protein S100A12

Yaqi Zhang, Heng Xu, Yang Tang, Yuhang Li & Fengjie Zheng

Scientific Reports volume  14 , Article number:  23435 ( 2024 ) Cite this article


Subjects: Drug development, Pathogenesis, Rheumatology, Risk factors

Atopic dermatitis (AD) is a chronic inflammatory skin disease affecting tens of millions of people globally. The causal relationship between metabolites and AD pathology has not yet been formally established, and the mediating mechanism by which metabolites affect AD has not been explored. This study aimed to determine the genetic relationship between metabolites and AD and to identify the pathways through which amino acid metabolites affect AD. We integrated the results of multiple GWAS analyses in a meta-analysis using METAL software. Using bidirectional two-sample Mendelian randomization (MR), we analyzed the causal relationships between metabolites and AD. The principal MR test of causal effects was conducted using inverse-variance weighted regression, and reverse MR analysis was used to exclude reverse causality. We also performed the MR-PRESSO test to detect and correct for possible pleiotropic effects, and used the Cochran Q test to assess heterogeneity. Two-step MR was used to analyze the mediating factors between amino acid metabolites and the onset of AD. The correlation between the mediating factor (the inflammatory protein S100A12) and immune cell infiltration was analyzed using the edgeR and GSVA software packages. Using single-cell sequencing data from skin tissues of patients with AD, we studied the regulatory role of the S100A12 gene in immune cells. Multiple drug databases and macromolecular docking were used to search for S100A12-targeting drugs. Bidirectional two-sample MR analyses indicated that twenty-two metabolites and one inflammatory protein (S100A12) were significantly associated with AD pathogenesis. S100A12 is a mediator of amino acid metabolites ( N 6-methyllysine; N 2-acetyl, N 6, N 6-dimethyllysine; and N 6, N 6-dimethyllysine) that are genetically associated with AD. S100A12 was positively correlated with the infiltration of multiple immune cell types in lesional AD skin. The amino acid metabolites N 6-methyllysine; N 2-acetyl, N 6, N 6-dimethyllysine; and N 6, N 6-dimethyllysine influence AD pathogenesis by mediating S100A12 expression.


Introduction

AD is a common, chronic inflammatory skin disease, and patients with AD typically exhibit recurrent eczema and severe itching of the skin 1 . In the United States, approximately one-quarter of adults with AD develop symptoms in adulthood 2 . The pathogenesis of AD remains elusive and complex; it may be caused by a combination of genetic and environmental factors that lead to skin barrier dysfunction and abnormal immune responses 3 , 4 . Metabolites are intermediates or end products of metabolic reactions that affect disease pathogenesis. Recent studies have shown that AD may be associated with metabolic disorders, including central obesity, diabetes, dyslipidemia and hypertension 5 , 6 .

Amino acid metabolites are related to the onset of AD. Tryptophan metabolites facilitate the resolution of skin inflammation in AD by restoring the barrier function of the epithelium and regulating immune and inflammatory responses by modulating the activation, survival, proliferation and differentiation of immune cells 7 . Bifidobacterium longum-mediated tryptophan metabolism relieves AD symptoms by activating the AHR-driven immune response 8 .

Metabolites often play a role in the immune response by affecting related inflammatory factors and inflammatory proteins. An association between metabolism and immunity has been reported since the 1960s 9 . In recent years, some studies have further demonstrated the interaction between metabolic pathways and the immune response 10 . For example, glutamate participates in multiple immune responses by promoting the polarization of macrophages in response to IL-4 stimulation 11 .

Circulating inflammatory proteins participate in the pathogenesis of various diseases such as autoimmune diseases by mediating abnormal inflammatory responses. Therefore, by analyzing the genetic variants associated with protein abundance, protein quantitative trait loci (pQTLs) could be used to identify proteins associated with the development of AD.

The mechanism by which serum metabolites affect the progression of AD at the genetic level has not been investigated. MR is a statistical method that utilizes genetic variants as instrumental variables (IVs) to test for causal relationships between exposures and outcomes 12 . Through MR and mediation analysis, we explored the genetic causality between serum metabolites and AD and revealed the underlying mechanism by which serum metabolites influence AD from the inflammatory protein perspective.

Study design

Figure  1 shows the detailed flowchart of this study.

Figure 1. Research process for identifying metabolites and inflammatory proteins causally associated with AD. The schematic in the lower right corner shows the design of the mediation Mendelian randomization analyses. First, a two-sample MR was performed to investigate the causal relationships between blood metabolites (exposure) and atopic dermatitis (outcome). Second, inflammatory proteins (mediators) were selected for subsequent mediation analyses. Last, a two-step MR analysis was conducted to detect potential mediating inflammatory proteins (Step 1, the effect of blood metabolites on inflammatory proteins; Step 2, the effect of inflammatory proteins on atopic dermatitis). The instrumental variables used for MR must satisfy three assumptions. First, the genetic variants must be robustly associated with blood metabolites. Second, the genetic variants must not be associated with any confounders. Third, the genetic variants affect the outcome only through the exposure and not through any direct causal pathway. Abbreviations: MR, Mendelian randomization; AD, atopic dermatitis; SNP, single nucleotide polymorphism; MAF, minor allele frequency.

Exposure data and IV selection

The genetic variants for 91 circulating inflammatory proteins were obtained from a GWAS involving 14,824 European ancestry participants 13 . In the ideal case, SNPs with a P value < 5e−8 should be selected for MR analysis. When only a few SNPs were available, we utilized a more relaxed threshold with an SNP P value < 1e−5. To identify independent SNPs, we performed LD pruning with a window size of 10,000 kb and an r2 threshold of 0.001 in PLINK and used the European 1000 Genomes Project as a reference panel 14 . Because the P value threshold used in the significance test was loose, we set the following criteria to obtain a more rigorous set of SNPs significantly associated with exposure: F statistics > 10 and minor allele frequency (MAF) > 0.01, which provide sufficient information for MR analysis 15 . Finally, we excluded the SNPs significantly associated with the outcome.
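As an illustration only (not the authors' actual R/PLINK pipeline), the instrument-strength screen described above can be sketched in Python. The per-SNP F statistic is approximated by the common single-instrument formula (beta/se)^2, and all summary statistics below are hypothetical:

```python
def f_statistic(beta: float, se: float) -> float:
    # Single-instrument F statistic, approximated as (beta / se)^2.
    return (beta / se) ** 2

def filter_instruments(snps, p_threshold=1e-5, f_threshold=10.0, maf_threshold=0.01):
    """Keep SNPs passing the significance, strength, and MAF criteria."""
    return [
        s for s in snps
        if s["pval"] < p_threshold
        and f_statistic(s["beta"], s["se"]) > f_threshold
        and s["maf"] > maf_threshold
    ]

# Hypothetical summary statistics for three SNPs
demo = [
    {"snp": "rs1", "beta": 0.12, "se": 0.02, "pval": 1e-9, "maf": 0.25},
    {"snp": "rs2", "beta": 0.02, "se": 0.02, "pval": 0.30, "maf": 0.40},
    {"snp": "rs3", "beta": 0.15, "se": 0.03, "pval": 1e-7, "maf": 0.005},
]
print([s["snp"] for s in filter_instruments(demo)])  # ['rs1']
```

Here rs2 fails the P value threshold and rs3 fails the MAF threshold, so only rs1 survives; LD pruning would then be applied to the surviving set.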

We obtained the human metabolome data from a large GWAS with 8,299 participants, which included 309 metabolite ratios and 1091 blood metabolites 16 . The exposure screening criteria used for MR causal analysis were as follows: (I) genome-wide significant association ( P value < 1e−5); (II) F statistics > 10 and MAF > 0.01 were used to correct weak IV bias; (III) kb = 500 and r2 = 0.1; and (IV) relevant SNPs significantly associated with outcome were removed.

The GWAS data of immune cells were obtained from the GWAS catalog (GCST90001391–GCST90002121), including 731 immunophenotypes 17 . When calculating the causal relationship between immune cells and AD, the exposure screening criteria used for MR causal analysis were as follows: (I) genome-wide significant association ( P value < 5e−8); (II) F statistics > 10 and MAF > 0.01 were used to correct weak IV bias; (III) kb = 10,000 and r2 = 0.001; and (IV) relevant SNPs significantly associated with the outcome were removed.

Outcome data and GWAS meta-analysis

To comprehensively evaluate the heritable loci for AD in Europeans, we conducted a meta-analysis of two GWAS datasets of AD using METAL software: FinnGen (n case = 13,473; n control = 336,589) and the UK Biobank (n case eur = 2904; n control eur = 412,489) 18 , 19 .
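For intuition, the inverse-variance (standard-error-weighted) scheme that METAL offers for combining per-cohort effect sizes can be sketched as follows. This is a simplified stand-in rather than METAL itself, and the per-cohort effects for the single SNP shown are hypothetical:

```python
import math

def ivw_meta(betas, ses):
    """Fixed-effect inverse-variance-weighted meta-analysis of one
    variant's effect across cohorts (METAL's standard-error scheme)."""
    weights = [1.0 / se ** 2 for se in ses]
    beta = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    return beta, se

# Hypothetical per-cohort effects for one SNP (e.g., FinnGen, UK Biobank)
beta, se = ivw_meta(betas=[0.10, 0.14], ses=[0.02, 0.04])
print(round(beta, 3), round(se, 3))  # 0.108 0.018
```

The more precisely estimated cohort dominates the combined estimate, and the pooled standard error is smaller than either cohort's alone.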

Mendelian randomization analysis

We utilized the TwoSampleMR (R package, v0.5.8) package to perform MR analysis between exposures and outcomes. The random-effect inverse variance–weighted (IVW) method and MR-Egger were our main analysis methods, and P values < 0.05 were considered to indicate statistical significance. The random-effect IVW method allows for heterogeneity among SNPs 20 .
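The IVW point estimate itself is straightforward: each SNP contributes a Wald ratio (beta_outcome / beta_exposure), and the ratios are combined with inverse-variance weights. Below is a minimal fixed-effect sketch with hypothetical summary statistics (the random-effects variant used in the paper additionally scales the standard error by the estimated heterogeneity); this is illustrative Python, not the TwoSampleMR implementation:

```python
import math

def ivw_estimate(beta_exp, beta_out, se_out):
    """Inverse-variance-weighted MR estimate from per-SNP summary stats.
    Uses first-order Wald-ratio standard errors (se_out / |beta_exp|)."""
    ratios = [bo / be for be, bo in zip(beta_exp, beta_out)]
    ses = [so / abs(be) for be, so in zip(beta_exp, se_out)]
    weights = [1.0 / s ** 2 for s in ses]
    beta = sum(w * r for w, r in zip(weights, ratios)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    return beta, se

# Hypothetical three-SNP instrument set
beta, se = ivw_estimate(
    beta_exp=[0.10, 0.20, 0.15],
    beta_out=[0.02, 0.05, 0.03],
    se_out=[0.01, 0.02, 0.015],
)
print(round(beta, 3))  # 0.217
```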

For sensitivity analysis, heterogeneity was assessed by Cochran’s Q test, with a P value > 0.05 considered indicative of no heterogeneity. Next, we utilized the MR-PRESSO 21 and MR-Egger regression tests to detect horizontal pleiotropic effects with the MRPRESSO (R package, v1.0) package. The MR-PRESSO global test detects horizontal pleiotropy, and the MR-PRESSO outlier test was used to remove outliers and correct for it. This process was repeated until all the statistical tests were not significant ( P value > 0.05). After removing the pleiotropic SNPs, the remaining SNPs were used for further MR analysis. Leave-one-out analysis was used to reveal whether a single SNP influenced the MR results. In addition, we used the PhenoScanner database to investigate whether the SNPs were associated with confounders (including nitrogen dioxide air pollution, particulate matter air pollution (PM 10 and PM 2.5 ), carbon monoxide, ultraviolet radiation, etc.) and removed potentially pleiotropic SNPs 22 . This analysis was conducted using the following parameters: the catalog was set to GWAS, and r2 was set to 0.8.
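Cochran's Q underlying the heterogeneity check can be computed directly from the per-SNP Wald ratios; under the null of no heterogeneity, Q follows a chi-square distribution with k − 1 degrees of freedom. A sketch with hypothetical summary statistics (illustrative only, not the package's code):

```python
def cochran_q(beta_exp, beta_out, se_out):
    """Cochran's Q heterogeneity statistic over per-SNP Wald ratios.
    Returns (Q, degrees of freedom); compare Q to a chi-square(df)
    distribution to obtain the P value."""
    ratios = [bo / be for be, bo in zip(beta_exp, beta_out)]
    weights = [(abs(be) / so) ** 2 for be, so in zip(beta_exp, se_out)]
    ivw = sum(w * r for w, r in zip(weights, ratios)) / sum(weights)
    q = sum(w * (r - ivw) ** 2 for w, r in zip(weights, ratios))
    return q, len(ratios) - 1

# Hypothetical three-SNP instrument set
q, df = cochran_q(
    beta_exp=[0.10, 0.20, 0.15],
    beta_out=[0.02, 0.05, 0.03],
    se_out=[0.01, 0.02, 0.015],
)
print(round(q, 3), df)  # 0.167 2
```

A small Q relative to the chi-square critical value (here Q ≈ 0.167 on 2 df) corresponds to a P value well above 0.05, i.e., no evidence of heterogeneity.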

Mediation analysis

For metabolites that causally associate with both AD and inflammatory proteins, we conducted a mediation analysis to quantify the effects of these metabolites on AD through inflammatory proteins. The “total” effect of exposure on the outcome encompasses both “direct” effects and any “indirect” effects mediated through one or more intermediary variables. In this study, the total effect was obtained through a standard univariate MR analysis, also known as the primary MR, in which the metabolome was the exposure and AD was the outcome. To disentangle direct and indirect effects, we utilized the results from a two-step MR approach: in the first step, the metabolome served as the exposure and inflammatory proteins as the outcome; in the second step, inflammatory proteins served as the exposure and AD as the outcome. We selected the product method to estimate the β value for the indirect effect and the Delta method to estimate the standard error (SE) and 90% confidence interval (CI). The product of the beta values from the two-step MR analysis was taken as the “indirect” effect, and the direct effect was the total effect minus the indirect effect.
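The product-method decomposition described above reduces to simple arithmetic: the indirect effect is βA × βB, the direct effect is the total minus the indirect, and the delta method gives a first-order standard error for the product. A sketch with hypothetical effect sizes (chosen only to be of the same order of magnitude as those reported later, not taken from the paper):

```python
import math

def mediation(beta_total, beta_a, se_a, beta_b, se_b):
    """Product-method mediation decomposition with a first-order
    delta-method SE for the indirect effect (beta_a * beta_b)."""
    indirect = beta_a * beta_b
    direct = beta_total - indirect
    se_indirect = math.sqrt(beta_a ** 2 * se_b ** 2 + beta_b ** 2 * se_a ** 2)
    proportion = indirect / beta_total
    return indirect, direct, se_indirect, proportion

# Hypothetical effects: metabolite -> S100A12 (beta_a) and
# S100A12 -> AD (beta_b), with a total metabolite -> AD effect
ind, direct, se_ind, prop = mediation(
    beta_total=0.001, beta_a=0.033, se_a=0.007, beta_b=0.006, se_b=0.003
)
print(round(prop * 100, 1))  # 19.8 (percent mediated)
```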

Reverse causality detection

To assess reverse causality, genetic instruments for AD were selected from the meta-analysis using the same screening criteria as above. We tested the causal association of AD with the two outcome sets (inflammatory proteins and the metabolome) using MR-IVW, MR-Egger, the weighted median, the simple mode and the weighted mode. In addition, we applied the Steiger directionality test to verify the causal relationship between exposures and outcomes.
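The Steiger directionality test compares the variance an instrument explains in the exposure with the variance it explains in the outcome; the hypothesized exposure-to-outcome direction is supported when the former is larger. A minimal sketch using the t-statistic-based r² approximation, with entirely hypothetical inputs:

```python
def snp_r2(beta, se, n):
    """Variance explained by one SNP, from its t statistic:
    r^2 = t^2 / (t^2 + n - 2)."""
    t2 = (beta / se) ** 2
    return t2 / (t2 + n - 2)

def steiger_direction(beta_exp, se_exp, n_exp, beta_out, se_out, n_out):
    """True if the instrument explains more variance in the exposure
    than in the outcome, supporting the exposure -> outcome direction."""
    return snp_r2(beta_exp, se_exp, n_exp) > snp_r2(beta_out, se_out, n_out)

# Hypothetical SNP: strong effect on the exposure, weak on the outcome
print(steiger_direction(0.12, 0.02, 8000, 0.01, 0.01, 350000))  # True
```

The full test additionally assigns a P value to the r² difference; this sketch only captures the directional comparison.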

Transcriptome analysis by RNA-seq

We downloaded the transcriptome sequencing data from the GSE193309 dataset of AD patients with lesional skin (n = 111) and healthy human skin (n = 112). The differential gene expression analysis of different groups was conducted using the edgeR (R package, v3.32.1) package. The infiltration levels of different groups of immune cells were quantified via ssGSEA via the GSVA (R package, v1.38.2) package.
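As rough intuition for what ssGSEA quantifies per sample (GSVA's actual statistic is a weighted Kolmogorov-Smirnov-like random walk, not reproduced here), a gene-set score can be approximated by the mean expression rank of the set's genes within that sample. The gene set and expression values below are hypothetical:

```python
def rank_score(expression, gene_set):
    """Crude per-sample enrichment score: mean expression rank of the
    gene-set genes, scaled to (0, 1] (1 = genes at the top of the
    profile). A simplified stand-in for ssGSEA, for intuition only."""
    genes = sorted(expression, key=expression.get)          # ascending
    rank = {g: i + 1 for i, g in enumerate(genes)}
    in_set = [rank[g] for g in gene_set if g in rank]
    return sum(in_set) / (len(in_set) * len(genes))

# Hypothetical sample: neutrophil-associated genes near the top
sample = {"S100A12": 9.1, "FCGR3B": 8.7, "CSF3R": 8.2, "ALB": 1.2, "MYH7": 0.8}
print(rank_score(sample, {"S100A12", "FCGR3B", "CSF3R"}))  # 0.8
```

A set concentrated among the most highly expressed genes scores near 1, mirroring how a high ssGSEA score indicates high relative infiltration of that cell type.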

Quality control and analysis of scRNA-seq of skin in AD

We obtained single-cell sequencing immune atlases of the skin from AD patients and healthy individuals (7 healthy people and 7 AD patients) 23 from the Human Cell Atlas database. First, we conducted quality control (QC) on the sequencing data, adhering to the same parameters used in the original study with Seurat (R package, v4.3.0) 24 . Cells with mitochondrial gene percentages over 20% were excluded, as were cells expressing fewer than 100 or more than 6000 genes. Doublets were removed using the scDblFinder (R package, v1.5.7) package 25 . The NormalizeData and ScaleData functions in the Seurat package were then applied for data normalization and scaling. Highly variable genes were identified using the FindVariableFeatures function, and the Harmony (R package, v1.2.0) package 26 was utilized to remove batch effects between samples and integrate the data. Finally, principal component analysis (PCA) and Uniform Manifold Approximation and Projection (UMAP) were employed for dimensionality reduction and clustering of cells.
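The cell-level QC thresholds quoted above (over 20% mitochondrial reads, or fewer than 100 / more than 6000 detected genes) amount to a simple per-cell filter. A sketch with hypothetical barcodes and metrics, not the Seurat implementation:

```python
def passes_qc(cell, max_mito=0.20, min_genes=100, max_genes=6000):
    """QC rule from the text: drop cells with >20% mitochondrial reads
    or with fewer than 100 / more than 6000 detected genes."""
    return (cell["pct_mito"] <= max_mito
            and min_genes <= cell["n_genes"] <= max_genes)

# Hypothetical per-cell QC metrics
cells = [
    {"barcode": "AAAC", "pct_mito": 0.05, "n_genes": 2500},  # keep
    {"barcode": "AAAG", "pct_mito": 0.35, "n_genes": 1800},  # mito-high
    {"barcode": "AACT", "pct_mito": 0.02, "n_genes": 80},    # too few genes
    {"barcode": "AACC", "pct_mito": 0.10, "n_genes": 7200},  # likely doublet
]
kept = [c["barcode"] for c in cells if passes_qc(c)]
print(kept)  # ['AAAC']
```

In the actual pipeline, dedicated doublet detection (scDblFinder) supplements the crude upper gene-count cutoff shown here.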

The Human Protein Atlas (HPA) database

The HPA database integrates proteomic, transcriptomic, and systems biology data, providing access to experimental images of tissues, cells, and organs. We obtained immunohistochemical images of S100A12 in different tissues from the HPA database.

Drug targets analysis and molecular docking for potential compounds

We utilized the Drug-Gene Interaction Database (DGIdb) to retrieve druggable proteins and search for potential drugs for AD-related proteins. In addition, we used the Chinese medicine database (HERB) to retrieve potential herbal compounds of AD-associated proteins and herbal medicines. To verify the molecular docking of traditional Chinese medicine (TCM) compounds related to S100A12 with the protein S100A12, we retrieved the 3D crystal structure of the S100A12 protein from the RCSB Protein Data Bank (PDB) database ( https://www.rcsb.org/ ) and used Pymol software (version 2.4.0) to remove ligand molecules and water molecules from the crystal structure. We downloaded the structure file of the TCM compound, converted the file from its original format to the mol2 format using OpenBabel software (version 2.3.2). Then, we uploaded the processed receptor (S100A12 protein) and ligands (TCM compounds in mol2 format) to SwissDock ( http://www.swissdock.ch/ ). We utilized SwissDock to perform the docking simulations, including removal of water molecules, addition of hydrogen atoms, assignment of charges, and setting rotatable bonds. Finally, we used Pymol again to visualize the docking results.

Two-sample MR and bidirectional MR analyses of metabolites and AD risk

To conduct a more comprehensive analysis of GWAS summary data for AD, we integrated GWAS data on AD from both the FinnGen database and UK Biobank; the meta-integrated GWAS data were then used as the outcome for an MR analysis (Fig. S1 ). We assessed the causal relationship between the metabolite levels and AD risk. After MR-PRESSO and MR‒Egger intercept stepwise removal of genetic instruments with horizontal pleiotropic effects, twenty-two significant AD-associated metabolites ( P value < 0.05) were detected. In addition to the unknown metabolites (X-12112, X-24494, X-11787 and X-11843), the identified metabolites involved several metabolic pathways: amino acids (citramalate; 3-methyl-2-oxobutyrate; 4-methyl-2-oxopentanoate; 3-(4-hydroxyphenyl)lactate; N 6-methyllysine; N 2-acetyl, N 6, N 6-dimethyllysine; cys-gly, oxidized; N 6, N 6-dimethyllysine; spermidine; and methylsuccinoylcarnitine), lipids (cortolone glucuronide (1) and 1-palmitoyl-2-arachidonoyl-GPE (16:0/20:4)), xenobiotics (2-hydroxyhippurate) and partially characterized molecules (metabolonic lactone sulfate). The identified metabolite ratios included the Aspartate to asparagine ratio, the Androsterone glucuronide to etiocholanolone glucuronide ratio, the Phenylalanine to phosphate ratio and the Spermidine to N-acetylputrescine ratio. Among these metabolites, seven were protective factors against atopic dermatitis (Metabolonic lactone sulfate, X-11787, Cortolone glucuronide (1), 2-hydroxyhippurate, Methylsuccinoylcarnitine, X-24494 and the Aspartate to asparagine ratio), and all the other metabolites were risk factors for disease (Fig.  2 A,B). There was no evidence of reverse causality. The sensitivity analysis demonstrated that the associated metabolites had a consistent direction.

Figure 2. Heatmap of 1400 serum metabolites with causal effects on AD. (A) Beta effect value of MR analysis results for metabolites and AD. (B) P value of MR analysis results for metabolites and AD.

Two-sample and bidirectional MR analyses of inflammatory proteins and AD

Similarly, we performed two-sample MR analysis to examine the association between inflammatory proteins and AD. IVW analysis indicated that S100A12 was positively associated with AD (beta = 0.006, p  = 0.03) (Fig.  3 ), and the analysis also revealed a positive correlation between natural killer cell receptor 2B4 levels and AD (beta = 0.002, p  = 0.02). However, the reverse analysis showed that AD had a genetic causal effect on natural killer cell receptor 2B4. We then applied the Steiger directionality algorithm to verify the causal relationship between S100A12 and AD; the Steiger test P value for all genetic instrumental variables was less than 0.05, indicating that the instrumental variables selected for MR analysis were correct. S100A12 is a member of the small calcium-binding S100 protein family 27 ; it can activate NF-κB signaling and induce the recruitment of neutrophils, monocytes and lymphocytes, thus causing inflammation and autoimmune diseases 28 .

Figure 3. The inflammatory protein S100A12 is significantly associated with the development of AD.

AD-related blood metabolites affect the expression of S100A12

To identify blood metabolites associated with the inflammatory protein S100A12, we further observed the causal relationships between metabolites and inflammatory factors. As shown in Figs.  4 and 5 , multiple metabolites, N 6-methyllysine (OR [95% CI] 1.038 [1.025, 1.052]; p  = 4.68E−08), N 2-acetyl, N 6, N 6-dimethyllysine (OR [95% CI] 1.038 [1.025, 1.051]; p  = 1.04E−07), X-24494 (OR [95% CI] 0.925 [0.886, 0.966]; p  = 1.21E−03), X-12112 (OR [95% CI] 1.032 [1.018, 1.045]; p  = 7.49E−06) and N 6, N 6-dimethyllysine (OR [95% CI] 1.033 [1.019, 1.046]; p  = 6.63E−06), affected the expression of S100A12, which was evaluated as a risk factor for AD. In addition, MR-Egger intercept analysis revealed no horizontal pleiotropy, and reverse analysis revealed no effect of inflammation on the metabolites. According to the sensitivity analyses, all five metabolites demonstrated consistent trends.

Figure 4. Five AD-related metabolites had causal effects on the expression of S100A12. (A) Scatter plot for the effect of N 6-methyllysine on AD. (B) Scatter plot for the effect of N 2-acetyl, N 6, N 6-dimethyllysine on AD. (C) Scatter plot for the effect of N 6, N 6-dimethyllysine on AD. (D) Scatter plot for the effect of X-24494 on AD. (E) Scatter plot for the effect of X-12112 on AD.

Figure 5. Forest plot of the influence of metabolites on S100A12 expression.

Amino acid metabolites influence AD through the inflammatory factor S100A12

Serum amino acid metabolites ( N 6, N 6-dimethyllysine, N 6-methyllysine, N 2-acetyl, N 6, N 6-dimethyllysine) have previously been found to be risk factors for AD. However, the detailed mechanism by which metabolites influence AD is not well understood. To explain the mediating mechanisms and factors of amino acid metabolites affecting AD, we used a two-step MR approach to identify the mediating factors of metabolites affecting AD (Fig.  6 ).

Figure 6. Schematic diagram of the mediation effect analysis. β represents the causal effect value (disease risk) of the exposure variable on the outcome variable in the MR analysis: βC, total effect of metabolites on AD; βA, effect of the exposure on the mediator; βB, effect of the mediator on the outcome; βC′, direct effect between metabolites and AD; βA × βB, effect mediated by the mediator.

The results showed that N 6-methyllysine; N 2-acetyl, N 6, N 6-dimethyllysine and N 6, N 6-dimethyllysine promoted AD progression by upregulating the expression of S100A12 (Fig.  7 ). The total effect between N 6, N 6-dimethyllysine and AD was 0.001, and the proportion of the mediation effect via S100A12 was approximately 19.2% (Fig.  7 A). The total effect between N 6-methyllysine and AD was 0.001, and the proportion of the mediation effect via S100A12 was approximately 22.8% (Fig.  7 B). The total effect beta value between N 2-acetyl, N 6, N 6-dimethyllysine and AD was 0.001, and the proportion of the mediation effect via S100A12 was 22.2% (Fig.  7 C).

Figure 7. Effects of metabolites on AD via S100A12. (A) N 6, N 6-dimethyllysine effect on AD mediated by S100A12. (B) N 6-methyllysine effect on AD mediated by S100A12. (C) N 2-acetyl, N 6, N 6-dimethyllysine effect on AD mediated by S100A12.

Amino acid metabolites influence AD disease by decreasing the proportion of naive CD4 T cells

Importantly, we also investigated whether 731 immune cell phenotypes mediated the causal relationship between amino acid metabolites and AD. Interestingly, we found that only one type of immune cell (HVEM on naïve CD4 +  T cell) mediated the causal effects of these three metabolites, and this immune cell was a protective factor for AD (Fig. S3 ). This type of immune cell typically represents a naive CD4 + T cell state and is a cell type that does not contribute to skin inflammation in AD patients.

S100A12 is associated with immune stimulation in AD lesion skin

We first analyzed the differential expression of genes in the skin of the healthy group and the lesional skin (LS) group and found that genes such as C19orf38, LILRA2, DOK2, TNF, COL6A5, and S100A12 were upregulated in patients in the LS group (Fig.  8 A), with S100A12 significantly upregulated (Fig.  8 C). We next assessed the relationship between S100A12 and immune activation and found a strong positive correlation between S100A12 and a variety of immune cells (activated CD4 T cells (r = 0.75), neutrophils (r = 0.72), and pDCs (r = 0.65)) (Fig.  8 B). The S100A12 high-expression group exhibited relatively high infiltration of multiple immune cell types (activated CD8 + T cells, activated CD4 + T cells, and effector memory CD4 + T cells) (Fig.  8 D).

Figure 8. S100A12 is significantly upregulated in lesional AD skin and significantly correlates with the infiltration of multiple activated immune cells. (A) Heatmap of differentially expressed genes between healthy and lesional AD skin samples based on the edgeR algorithm; (B) correlation of S100A12 with immune cell infiltration in AD lesional skin; (C) volcano plot of S100A12 significantly upregulated in AD lesional skin; (D) heatmap of the degree of immune cell infiltration between AD patients in the S100A12 high-expression group and AD patients in the S100A12 low-expression group.

S100A12 specifically expressed by monocytes can activate immune cells through AGER/TLR4 receptors

We obtained immunohistochemical images of S100A12 expression in various tissues from the HPA database; S100A12 expression was relatively pronounced in bone marrow and spleen, while its expression was weaker in skin (Fig. S2 ). Understanding which cells specifically express the S100A12 protein, and its functions there, can reveal its contributions to AD. To investigate the role of the S100A12 gene in immune cells within the skin tissues of patients with AD, we collected single-cell transcriptomic data from these patients. Using cell marker genes provided by the authors, we identified a total of 41 cellular subpopulations, which were primarily categorized into T cells, B cells, natural killer (NK) cells, Langerhans cells (LCs), dendritic cells, monocytes, macrophages, and mast cells (Fig.  9 A, B). Importantly, S100A12 is specifically expressed by monocytes and macrophages (Fig.  9 C), indicating its primary source and cellular specificity. Furthermore, studies have established that AGER and TLR4 are the primary receptors for the S100A12 ligand. Thus, we delved deeper into the expression patterns of AGER and TLR4 across various cell types. Notably, AGER is predominantly expressed in lymphocyte subsets (Fig.  9 D), whereas TLR4 is mainly expressed in macrophage subsets (Fig.  9 E). A more detailed look at the expression of the S100A12-AGER/TLR4 receptor-ligand pair across cell types showed that S100A12 is mainly released by monocytes and macrophages; it further activates migratory memory T cells (Tmm3) and dendritic cells (DC1) through the AGER receptor, and also acts on macrophages (Mac4) through TLR4.

Figure 9. S100A12 specifically expressed by monocytes can act on immune cells through AGER/TLR4 receptors. (A) UMAP plot of immune cells from patients with AD and healthy individuals. (B) Marker gene plot annotated with cell types. (C, D, E) Expression patterns of the S100A12, AGER, and TLR4 genes in the UMAP plot of immune cells. (F) Heatmap showing the expression of the S100A12, AGER, and TLR4 genes across different cell types. Abbreviations: central memory T cells (Tcm), tissue-resident memory T cells (Trm), cytotoxic T cells (CTL-c), cytotoxic T lymphocyte exhaustion cluster (CTL-ex), cytotoxic T lymphocyte activated cluster (CTL-ax), CD3D + T cell cluster (Tet), central memory Tregs (cmTreg), naive T cells (Tn), dendritic cells (DCs), Langerhans cell populations (LCs), inflammatory monocytes (InfMono), monocytes (mono), innate lymphoid cell population (ILC), migratory DC (migDC).

Drug prediction of the targets and molecular docking for potential compounds

S100A12 is a mediator of AD triggered by amino acid metabolites. To block the mediating factor linking blood amino acid metabolites to the onset of AD, we searched two databases, the Drug-Gene Interaction Database (DGIdb) and the traditional Chinese medicine database HERB, for potential drugs (Tables S1 and S2). The clinically approved drugs RIMEGEPANT, UBROGEPANT, METHOTREXATE, ATOGEPANT and EPTINEZUMAB target S100A12 and exert inhibitory or antagonistic effects on it. The potential TCM compound targeting S100A12 is citric acid.

To investigate the potential interaction between the candidate compound and the S100A12 protein, we performed computational molecular docking of citric acid with S100A12. The results revealed a hydrogen bond between citric acid and the ASP-61 residue of S100A12, which could explain the inhibitory effect of citric acid on the function of the S100A12 protein (Fig. 10).
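A common way to flag the kind of hydrogen bond reported above is a simple donor-acceptor distance check on the docked pose. The sketch below is hypothetical (the coordinates and atom labels are invented for illustration; a real analysis would parse the docking output structure with a structural library rather than hard-coding positions):

```python
import numpy as np

def hydrogen_bond_candidates(donors, acceptors, max_dist=3.5):
    """Return (donor, acceptor, distance) pairs within a typical
    donor-acceptor distance cutoff (~3.5 angstroms) used to flag
    candidate hydrogen bonds.

    donors/acceptors: dicts mapping atom labels -> xyz coordinates (angstroms)
    """
    hits = []
    for d_name, d_xyz in donors.items():
        for a_name, a_xyz in acceptors.items():
            dist = float(np.linalg.norm(np.asarray(d_xyz) - np.asarray(a_xyz)))
            if dist <= max_dist:
                hits.append((d_name, a_name, round(dist, 2)))
    return hits
```

With hypothetical coordinates for a citrate hydroxyl donor and the two carboxylate oxygens of an ASP residue, only the oxygen within the cutoff would be reported as a hydrogen-bond candidate. Production tools additionally check the donor-hydrogen-acceptor angle, which this sketch omits.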

figure 10

Visualization of the molecular docking results between citric acid and S100A12 using PyMOL software. The orange structure represents citric acid, while the cyan structure represents the S100A12 protein. Hydrogen bonds between the molecules are marked in yellow.

Therefore, these S100A12-targeted drugs and compounds are expected to limit the progression of AD pathology.

Discussion

In this study, we tested the relationships between metabolites and AD and found that a variety of amino acid metabolites (citramalate, 3-methyl-2-oxobutyrate, 4-methyl-2-oxopentanoate, 3-(4-hydroxyphenyl)lactate, N6-methyllysine, N2-acetyl,N6,N6-dimethyllysine, cys-gly (oxidized), N6,N6-dimethyllysine, spermidine and methylsuccinoylcarnitine), lipids (cortolone glucuronide (1) and 1-palmitoyl-2-arachidonoyl-GPE (16:0/20:4)) and xenobiotics (2-hydroxyhippurate) have potential causal relationships with AD. However, the detailed mechanisms of these metabolites in AD are not fully understood. To date, few metabolomics studies on AD have been reported. Several studies have demonstrated that 3-(4-hydroxyphenyl)lactate shares common genetic regulation with hepatic steatosis and hepatic fibrosis 29 . In addition, 3-(4-hydroxyphenyl)lactate has also been connected with the risk of severe bronchiolitis 30 .
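The causal estimates discussed here come from two-sample Mendelian randomization. As a hedged illustration of the core calculation (not the authors' actual analysis code), the fixed-effect inverse-variance-weighted (IVW) estimator of Burgess et al. combines per-SNP Wald ratios as follows:

```python
import numpy as np

def ivw_mr(beta_exp, beta_out, se_out):
    """Fixed-effect inverse-variance-weighted MR estimate.

    beta_exp : SNP-exposure effects (e.g. SNP -> metabolite level)
    beta_out : SNP-outcome effects  (e.g. SNP -> AD risk)
    se_out   : standard errors of the SNP-outcome effects
    Returns (beta_IVW, se_IVW).
    """
    beta_exp, beta_out, se_out = map(np.asarray, (beta_exp, beta_out, se_out))
    weights = beta_exp**2 / se_out**2   # inverse-variance weights
    ratio = beta_out / beta_exp         # per-SNP Wald ratios
    beta = np.sum(weights * ratio) / np.sum(weights)
    se = 1.0 / np.sqrt(np.sum(weights))
    return beta, se
```

In practice such analyses also apply MR-Egger and pleiotropy checks (as cited in the Methods references); this sketch shows only the headline IVW estimate.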

The relationships between the amino acid metabolites (N6-methyllysine, N2-acetyl,N6,N6-dimethyllysine and N6,N6-dimethyllysine) and AD have not yet been reported. Ottas et al. reported the serum metabolome of AD patients, which showed differences in the levels and proportions of acylcarnitines, phosphatidylcholines and the cleavage product of fibrinogen A-α 31 . Previous studies have shown that the levels of amino acids, biogenic amines, acylcarnitines, sphingomyelins and phosphatidylcholines, including metabolites such as glutamine, asparagine and asymmetric dimethylarginine, are greater in the lesional skin than in the normal skin of AD patients 32 . Tryptophan has also been identified as a crucial pathogenic metabolite in AD inflammation and pruritus 8 , 33 .

Importantly, we found that the inflammatory protein S100A12 mediates the causal effect between the amino acid metabolites (N6-methyllysine, N2-acetyl,N6,N6-dimethyllysine and N6,N6-dimethyllysine) and AD. This indicates that N6-methyllysine, N2-acetyl,N6,N6-dimethyllysine and N6,N6-dimethyllysine promote the development of AD by affecting the expression of S100A12 in blood. Furthermore, we explored the role of S100A12 and its receptors TLR4/AGER in activating immune cells. Using the immunohistochemical results from the HPA database, we found that S100A12 is prominently expressed in bone marrow and spleen, with weaker expression in skin. To delve deeper, we acquired single-cell sequencing data from skin tissues of patients with AD. Our analysis revealed that S100A12 is specifically expressed by monocytes and macrophages, and it exerts immune-activating effects on memory T cells, macrophages, and dendritic cells via TLR4/AGER. This finding underscores the immune-activating role of S100A12 in skin tissues. The expression of S100A12 by neutrophils and monocytes promotes inflammation through binding to the receptor for advanced glycation end products and subsequently activating the nuclear factor kappa B pathway 34 . In addition to its intracellular activity, S100A12 has various extracellular activities that mediate innate immune responses 35 . S100A12 expression is upregulated in inflamed synovial tissue, and the serum level of S100A12 is correlated with rheumatoid arthritis (RA) activity 36 . Serum S100A12 levels have been reported to be significantly upregulated in AD patients compared with healthy controls and to correlate with AD severity. Therefore, the antimicrobial protein S100A12 may be a potential autoantigen in AD patients 37 .
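Mediation claims of this kind typically rest on two-step MR: the indirect effect is the product of the metabolite-to-mediator and mediator-to-outcome estimates, and the proportion mediated is the indirect effect divided by the total effect. The sketch below (with a delta-method standard error for the product term) illustrates that logic; it is an assumption-laden illustration, not the authors' code:

```python
import numpy as np

def mediated_proportion(beta_total, beta_em, beta_mo, se_em, se_mo):
    """Two-step MR mediation via the product of coefficients.

    beta_total : total effect of the metabolite on AD
    beta_em    : metabolite -> mediator (e.g. S100A12) effect
    beta_mo    : mediator -> AD effect
    se_em/se_mo: standard errors of the two step estimates
    Returns (indirect effect, delta-method SE, proportion mediated).
    """
    indirect = beta_em * beta_mo
    # first-order delta method for the variance of a product
    se_indirect = np.sqrt(beta_em**2 * se_mo**2 + beta_mo**2 * se_em**2)
    return indirect, se_indirect, indirect / beta_total
```

For instance, with a hypothetical total effect of 0.4 and step estimates of 0.5 and 0.2, the indirect effect is 0.1 and the proportion mediated is 25%.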

Additionally, we investigated whether 731 immune cell phenotypes mediate the causal effect between the amino acid metabolites (N6-methyllysine, N2-acetyl,N6,N6-dimethyllysine and N6,N6-dimethyllysine) and atopic dermatitis. Our findings revealed that a decrease in HVEM expression on naïve CD4+ T cells mediates this effect, suggesting that these amino acid metabolites can significantly reduce the proportion of naïve CD4+ T cells. Furthermore, this finding also indicates a potential activation of naïve CD4+ T cells, driving their differentiation into effector CD4+ T cells. We further queried multiple compound databases for drugs targeting S100A12 that could block the mediating effect of amino acid metabolites on the onset of AD, including clinically approved medications (e.g., RIMEGEPANT, UBROGEPANT, METHOTREXATE) and herbal compounds (citric acid). Methotrexate is a systemic therapy used for moderate to severe AD 38 . A 5-year follow-up study on the treatment of severe AD with methotrexate demonstrated that methotrexate was effective and safe 39 . Methotrexate is also more persistent and less costly than cyclosporine (a calcineurin inhibitor approved for the treatment of AD) 40 . Consequently, we believe that these drugs could be beneficial for managing AD, and further clinical trials will be necessary to substantiate these findings.

In our study, we found that metabolites such as 3-methyl-2-oxobutyrate, 4-methyl-2-oxopentanoate, 3-(4-hydroxyphenyl)lactate, citramalate, N6-methyllysine, N2-acetyl,N6,N6-dimethyllysine, cys-gly (oxidized), N6,N6-dimethyllysine, 1-palmitoyl-2-arachidonoyl-GPE (16:0/20:4) and spermidine promote the development of AD, whereas metabolites such as cortolone glucuronide (1), 2-hydroxyhippurate and methylsuccinoylcarnitine inhibit disease development. Furthermore, N6-methyllysine, N6,N6-dimethyllysine and N2-acetyl,N6,N6-dimethyllysine influence the onset of AD by mediating the inflammatory protein S100A12. S100A12 was also associated with multiple immune cells (e.g., activated CD4 T cells and neutrophils). Therefore, metabolites may play a therapeutic role in AD by influencing immune cell infiltration via S100A12.

Data availability

The data used in this study are all publicly available. The FinnGen dataset is available at https://www.finngen.fi/en . The UK Biobank data are available at http://www.nealelab.is/blog/2017/7/19/rapid-gwas-of-thousands-of-phenotypes-for-337000-samples-in-the-uk-biobank . The genetic variants for 91 circulating inflammatory proteins were obtained from the EBI GWAS Catalog (accession numbers GCST90274758–GCST90274848). The GWAS summary data of the plasma metabolome were obtained from the EBI GWAS Catalog (accession numbers GCST90199621–GCST90201020). The transcriptome sequencing data of AD were obtained from the GSE193309 dataset. All other data are provided within the manuscript or supplementary information files.

References

Langan, S. M., Irvine, A. D. & Weidinger, S. Atopic dermatitis. Lancet 396 (10247), 345–360 (2020).

Lee, H. H. et al. A systematic review and meta-analysis of the prevalence and phenotype of adult-onset atopic dermatitis. J. Am. Acad. Dermatol. 80 (6), 1526–1532 (2019).

Hui-Beckman, J. W. et al. Endotypes of atopic dermatitis and food allergy. J. Allergy Clin. Immunol. 151 (1), 26–28 (2023).

Paller, A. S. et al. The atopic march and atopic multimorbidity: Many trajectories, many pathways. J. Allergy Clin. Immunol. 143 (1), 46–55 (2019).

Ali, Z. et al. Association between atopic dermatitis and the metabolic syndrome: A systematic review. Dermatology 234 (3–4), 79–85 (2018).

Wan, J. et al. Incidence of cardiovascular disease and venous thromboembolism in patients with atopic dermatitis. J. Allergy Clin. Immunol. 11 (10), 3123–3132 (2023).

Huang, Y. et al. Tryptophan, an important link in regulating the complex network of skin immunology response in atopic dermatitis. Front. Immunol. 14 , 1300378 (2023).

Fang, Z. et al. Bifidobacterium longum mediated tryptophan metabolism to improve atopic dermatitis via the gut-skin axis. Gut Microbes 14 (1), 2044723 (2022).

Oren, R. et al. Metabolic patterns in three types of phagocytizing cells. J. Cell Biol. 17 (3), 487–501 (1963).

O’Neill, L. A. J., Kishton, R. J. & Rathmell, J. A guide to immunometabolism for immunologists. Nat. Rev. Immunol. 16 (9), 553–565 (2016).

Kieler, M., Hofmann, M. & Schabbauer, G. More than just protein building blocks: how amino acids and related metabolic pathways fuel macrophage polarization. FEBS J. 288 (12), 3694–3714 (2021).

Pasman, J. A. et al. GWAS of lifetime cannabis use reveals new risk loci, genetic overlap with psychiatric traits, and a causal influence of schizophrenia. Nat. Neurosci. 21 (9), 1161–1170 (2018).

Zhao, J. H. et al. Genetics of circulating inflammatory proteins identifies drivers of immune-mediated disease risk and therapeutic targets. Nat. Immunol. 24 (9), 1540–1551 (2023).

Auton, A. et al. A global reference for human genetic variation. Nature 526 (7571), 68–74 (2015).

Burgess, S., Butterworth, A. & Thompson, S. G. Mendelian randomization analysis with multiple genetic variants using summarized data. Genet. Epidemiol. 37 (7), 658–665 (2013).

Chen, Y. et al. Genomic atlas of the plasma metabolome prioritizes metabolites implicated in human diseases. Nat. Genet. 55 (1), 44–53 (2023).

Orrù, V. et al. Complex genetic signatures in immune cells underlie autoimmunity and inform therapy. Nat. Genet. 52 (10), 1036–1045 (2020).

Kurki, M. I. et al. FinnGen provides genetic insights from a well-phenotyped isolated population. Nature 613 (7944), 508–518 (2023).

Sudlow, C. et al. UK biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLoS Med. 12 (3), e1001779 (2015).

Bowden, J., Davey Smith, G. & Burgess, S. Mendelian randomization with invalid instruments: Effect estimation and bias detection through Egger regression. Int. J. Epidemiol. 44 (2), 512–525 (2015).

Verbanck, M. et al. Detection of widespread horizontal pleiotropy in causal relationships inferred from Mendelian randomization between complex traits and diseases. Nat. Genet. 50 (5), 693–698 (2018).

Kamat, M. A. et al. PhenoScanner V2: An expanded tool for searching human genotype-phenotype associations. Bioinformatics 35 (22), 4851–4853 (2019).

Liu, Y. et al. Classification of human chronic inflammatory skin disease based on single-cell immune profiling. Sci. Immunol. 7 (70), eabl9165 (2022).

Stuart, T. et al. Comprehensive integration of single-cell data. Cell 177 (7), 1888–1902 (2019).

Germain, P.-L. et al. Doublet identification in single-cell sequencing data using scDblFinder. F1000Research 10 , 979 (2021).

Korsunsky, I. et al. Fast, sensitive and accurate integration of single-cell data with Harmony. Nat. Methods 16 (12), 1289–1296 (2019).

Zhang, X. et al. Characterization and engineering of S100A12-heparan sulfate interactions. Glycobiology 30 (7), 463–473 (2020).

Gonzalez, L. L., Garrie, K. & Turner, M. D. Role of S100 proteins in health and disease. Biochim. Biophys. Acta BBA Mol. Cell Res. 1867 (6), 118677 (2020).

Caussy, C. & Loomba, R. Gut microbiome, microbial metabolites and the development of NAFLD. Nat. Rev. Gastroenterol. Hepatol. 15 (12), 719–720 (2018).

Hasegawa, K. et al. Circulating 25-hydroxyvitamin D, nasopharyngeal airway metabolome, and bronchiolitis severity. Allergy 73 (5), 1135–1140 (2018).

Ottas, A. et al. Blood serum metabolome of atopic dermatitis: Altered energy cycle and the markers of systemic inflammation. PloS One 12 (11), e0188580 (2017).

Ilves, L. et al. Metabolomic analysis of skin biopsies from patients with atopic dermatitis reveals hallmarks of inflammation, disrupted barrier function and oxidative stress. Acta Derm. Venereol. 101 (2), adv00407 (2021).

Li, W. & Yosipovitch, G. The role of the microbiome and microbiome-derived metabolites in atopic dermatitis and non-histaminergic itch. Am. J. Clin. Dermatol. 21 (Suppl 1), 44–50 (2020).

Nazari, A. et al. S100A12 in renal and cardiovascular diseases. Life Sci. 191 , 253–258 (2017).

Yang, Z. et al. S100A12 provokes mast cell activation: A potential amplification pathway in asthma and innate immunity. J. Allergy Clin. Immunol. 119 (1), 106–114 (2007).

Foell, D. et al. Expression of the pro-inflammatory protein S100A12 (EN-RAGE) in rheumatoid and psoriatic arthritis. Rheumatology 42 (11), 1383–1389 (2003).

Mikus, M. et al. The antimicrobial protein S100A12 identified as a potential autoantigen in a subgroup of atopic dermatitis patients. Clin. Transl. Allergy 9 , 6 (2019).

Din, A. T. et al. Dupilumab for atopic dermatitis: The silver bullet we have been searching for?. Cureus 12 (4), e7565 (2020).

Gerbens, L. A. A. et al. Methotrexate and azathioprine for severe atopic dermatitis: A 5-year follow-up study of a randomized controlled trial. Br. J. Dermatol. 178 (6), 1288–1296 (2018).

Flohr, C. et al. Efficacy and safety of ciclosporin versus methotrexate in the treatment of severe atopic dermatitis in children and young people (TREAT): A multicentre parallel group assessor-blinded clinical trial. Br. J. Dermatol. 189 (6), 674–684 (2023).

Acknowledgements

We thank all participants and investigators involved in the UK Biobank and the FinnGen study, and Yiheng Chen et al. and Jinghua Zhao et al. for providing GWAS summary statistics.

This work is supported by National Natural Science Foundation of China (82374321).

Author information

These authors contributed equally: Yaqi Zhang and Heng Xu.

Authors and Affiliations

School of Traditional Chinese Medicine, Beijing University of Chinese Medicine, Beijing, China

Yaqi Zhang, Yang Tang, Yuhang Li & Fengjie Zheng

School of Life Sciences, Beijing University of Chinese Medicine, Beijing, China

Contributions

All authors contributed to the study conception and design. Y.Z. and H.X. were responsible for data collection. Y.Z., H.X. and Y.T. were responsible for data analysis and visualization. Y.L. and F.Z. supervised the subject matter and were responsible for writing. All authors have read and approved the final version of the manuscript and consent to its publication.

Corresponding authors

Correspondence to Yuhang Li or Fengjie Zheng .

Ethics declarations

Competing interests

The authors declare no competing interests.

Ethics approval and consent to participate

This study did not require institutional review board approval because it was based on only published or publicly available data.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary Information 1. Supplementary Information 2. Supplementary Information 3. Supplementary Information 4.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/ .

About this article

Cite this article.

Zhang, Y., Xu, H., Tang, Y. et al. The levels of amino acid metabolites in serum induce the pathogenesis of atopic dermatitis by mediating the inflammatory protein S100A12. Sci Rep 14 , 23435 (2024). https://doi.org/10.1038/s41598-024-74522-1

Received : 16 March 2024

Accepted : 26 September 2024

Published : 08 October 2024

DOI : https://doi.org/10.1038/s41598-024-74522-1


Keywords

  • Mendelian randomization
  • Atopic dermatitis
  • Metabolites
  • Inflammatory proteins


Comprehensive durability assessment of in-service concrete bridges based on combination weighting and extenics cloud model

  • Open access
  • Published: 09 October 2024
  • Volume 6, article number 540 (2024)

  • Jie Cai 1 ,
  • Xiaoxiao Liu 1 &
  • Zhipeng Wang 1  

In recent years, with the continuous development of urban transportation, the durability problems of concrete bridge structures have become increasingly prominent. Many in-service concrete bridges exhibit issues such as cracks, excessive deflection, and a significant reduction in load-bearing capacity during their service life. To meet the requirements of sustainable development, it is urgent to assess the durability of bridges accurately. Therefore, based on a review of the literature and relevant standards and specifications, this paper establishes a scientific and reliable evaluation index system for the durability of in-service concrete bridges and calculates the index weights using a combination weighting method. The durability of the bridges is assessed using extension cloud model theory, and MATLAB is used for efficient and accurate computation and analysis, ultimately determining the durability grades of the bridges. The method was applied to the durability assessment of five in-service concrete bridges, and the results were in high agreement with the measured data, demonstrating its effectiveness and accuracy. The research results provide a scientific basis for the management and maintenance of concrete bridges, which can help extend their service life. Future studies will expand the sample to cover more bridge types in order to comprehensively assess and enhance bridge durability.

Article Highlights

A scientific and reliable durability evaluation index system for in-service concrete bridges has been established.

Weight optimization: combined weights are derived from grey relational weights and entropy weights.

Integration of extension theory with the cloud model.

1 Introduction

The construction of bridges represents a pivotal aspect of the advancement of national infrastructure and a fundamental element of transportation infrastructure. In recent years, numerous countries have been confronted with the issue of bridge ageing, with the number of old bridges increasing at an alarming rate [ 1 ]. An analysis and study of the durability of in-service bridges has therefore become particularly necessary. The durability of a bridge can be defined as the ability of its structural components, under normal use and maintenance conditions, to withstand the combined effects of gradual deterioration from the natural environment, traffic volume, and other factors, so that the safety, functionality, and technical performance of the bridge are maintained [ 2 , 3 , 4 , 5 ]. If defects can be detected in a timely manner and appropriate repairs and reinforcements carried out, the maintenance costs of bridges can be greatly reduced, and the huge losses and adverse social impacts caused by structural damage can be prevented [ 6 , 7 ]. A series of bridge collapse accidents, both domestic and foreign, have occurred in rapid succession, including the Liaoning Tianzhuangtai Bridge and the Jilin Fusong Jinshan Bridge. The overall collapse of these structures has been attributed to a number of factors, with the primary cause being durability problems [ 8 ].

Mark Alexander [ 9 ] focuses on the durability of reinforced concrete structures and reinforcement corrosion. He states that corrosion of reinforcement is the greatest threat to the durability of reinforced concrete, and that the service life of reinforced concrete structures depends on their ability to resist the main deterioration mechanisms over an acceptable and 'predictable' time period. Yu-Chen Ou et al. [ 10 ] proposed a seismic assessment method for corroded reinforced concrete bridges based on nonlinear static pushover analysis, seismically evaluating the effects of chloride erosion on 11 reinforced concrete bridges of different structural types in Taiwan Province at both the material level and the structural level. The results showed that corrosion of the transverse reinforcing bars, which have a thinner cover layer, begins 2–32 years after construction and usually occurs before the longitudinal reinforcing bars begin to corrode. Martina Šomodíková et al. [ 11 ] point out that combining advanced analytical methods with the nonlinear finite element method is an effective tool for assessing existing bridges. The article describes a probabilistic method to assess the load-carrying capacity of bridges degraded over time and applies it to a 60-year-old reinforced concrete bridge, demonstrating the importance of studying the degradation patterns of concrete bridges. Alessandro Nettis et al. [ 12 ] investigated the vulnerability of simply supported prestressed concrete bridges constructed in Italy and Europe since the 1960s, focusing on the effects of reinforcement corrosion and traffic loading on the degradation of bridge capacity. The study provides an in-depth discussion of how these factors affect the vulnerability of bridges and a valuable reference for transport authorities in assessing the safety of existing bridges. Laura Anania et al.
[ 13 ] investigated the collapse of post-tensioned prestressed concrete bridges and found that the collapse was caused by the fracture of the reinforcement, the steel wires inside of which were completely corroded. Numerical analyses further confirmed the effect of reinforcement corrosion on concrete bridges. This suggests that future research on concrete bridge durability should focus on bridge degradation induced by reinforcement corrosion. J. R. Mackechnie et al. [ 14 ] suggested that improving durability can extend the service life of concrete bridges and that the durability performance of bridges can largely be discerned by understanding the microstructure of concrete and the underlying deterioration mechanisms. According to the research above, the durability problem of concrete bridges is particularly urgent. Therefore, this paper provides a comprehensive assessment of the durability of in-service concrete bridges.

Currently, many scholars both domestically and internationally have conducted extensive research on the durability issues of in-service concrete bridges, but these research methods are relatively simple and their efficiency is low. Huang Ya'e [ 15 ] used the extension analytic hierarchy process to calculate the weights of evaluation indices and designed durability assessment software for sea-crossing tied-arch bridges based on extension set theory using Python programming tools. This provides a reference for the durability assessment of in-service concrete bridges; however, its assessment results are greatly influenced by subjective weights. Liu Junli [ 16 ] proposed a comprehensive assessment method for concrete bridge durability that combines the analytic hierarchy process with evidence theory. A system for assessing the durability of concrete bridges was established and its feasibility verified, laying the foundation for subsequent research on concrete bridge durability. Pan Tao [ 17 ], based on research into the durability of prestressed concrete bridges, proposed a bridge structure durability assessment method based on a fuzzy neural network. The study analyzed the calculation results of a self-organizing neural network and a BP neural network model written in MATLAB and conducted experimental verification. The results indicated that the fuzzy neural network is well suited to assessing the durability of bridge structures. Liu Hanbing [ 18 ] and colleagues proposed a new method for assessing the condition of the superstructure of reinforced concrete bridges based on particle swarm optimization of fuzzy c-means clustering. The disadvantage of their method is that it requires a large number of training samples drawn from existing field test data of old bridges; the advantage is that it exploits the accelerated convergence of the particle swarm algorithm, which greatly improves the effectiveness of the clustering.
Li Gong [ 19 ] and colleagues addressed the prominent issue of concrete durability in cold and arid regions by establishing a concrete durability assessment model based on grey relational theory. They verified that the model's calculation results were consistent with experimental results, effectively solving the problem of insufficient durability of concrete specimens throughout the entire experimental period. Chen et al. [ 20 ] adopted the analytic hierarchy process and, based on the principle of fuzzy comprehensive evaluation, used a neural network adaptive fuzzy inference system as the evaluation engine to establish a durability and safety assessment system. Ma Jibing [ 21 ] and colleagues established a comprehensive durability assessment model and index system for mid- and low-rise concrete arch bridges based on the analytic hierarchy process. Additionally, they explored the fuzzy closeness evaluation method of this model using fuzzy comprehensive evaluation theory. He Weinan [ 22 ] and colleagues emphasized that the durability of bridge suspenders is directly related to the safety of the entire bridge. Based on an investigation into the durability damage of bridge suspender structures, they established a comprehensive evaluation system for the durability of bridge suspenders and defined the evaluation standards. They proposed a durability assessment method for suspenders based on set pair analysis. By calculating the weights of the durability assessment indices using the analytic hierarchy process, they accurately quantified the durability level of bridge suspender structures through set pair analysis operations. Numa J. Bertola et al. [ 23 ] proposed a risk-based approach to assessing the condition of bridges using visual inspection data. The methodology combines the state of deterioration of the bridge components with the impact of the damage on the overall structural safety in order to assess the condition of the bridge. 
However, this approach usually results in an overly conservative assessment of structural damage, which may lead to unnecessary rehabilitation interventions. Liu et al. [ 7 ] integrated the topological method, derived from material element theory and correlation functions, into the durability assessment of reinforced concrete structures and developed a material element model specifically designed to evaluate the durability of these structures. Li Qingfu [ 24 ] and colleagues conducted a comprehensive and accurate durability assessment of in-service cable-stayed bridge structures. They established an evaluation index system for the durability of stay cables, using a combination of IAHP (Improved Analytic Hierarchy Process) and CRITIC (Criteria Importance Through Intercriteria Correlation) to allocate weights to the evaluation indices. Using the confidence criterion, they performed a durability assessment of the stay cables of the Jiahui Bridge as an example. Comparison with the SPA (Set Pair Analysis) and MEE (Maximum Entropy Estimation) methods demonstrated that this approach yields more accurate durability assessment results for bridge stay cables. Xu Baosheng [ 25 ] and colleagues addressed the issue that most existing evaluation methods rely on single, often biased, weights. They employed a combination of subjective and objective weighting to avoid the limitations of single-method weighting, thus reflecting both the decision-makers' subjective intentions and the objective attributes of the data.

Currently, although research on the durability damage mechanism of in-service concrete bridges is relatively mature and progress has been made in detecting and evaluating the durability damage of individual structural components, there are still fewer studies on the durability assessment of the entire concrete bridge. In addition, most of the existing studies suffer from ambiguity and randomness. In this context, it is particularly important to analyze different methods for determining the weights of indicators. The subjective weighting method comprehensively considers various factors, but it relies heavily on the subjective judgment of decision-makers. Consequently, the weight allocation can become unstable or biased due to the decision-maker's personal preferences. Neural networks, while capable of providing good assessments with the help of big data, depend on the learning samples for accuracy. If the sample size is too large or too small, corresponding issues may arise. The fuzzy comprehensive evaluation method can reflect the fuzziness between various evaluation indices but inherently contains a degree of subjectivity. Grey theory is an effective method but has certain limitations. To improve the accuracy of weight quantification, this paper introduces a combination weighting method based on grey relational analysis and the entropy weight method. This approach addresses the mutual influence between multiple indices and the fuzziness and objectivity in determining weights. Given that the bridge durability assessment system is a complex system, this paper employs the extension cloud model to assess the durability of in-service concrete bridges under the condition of combined weighting. 
This method qualitatively and quantitatively analyzes and objectively describes the characteristics of bridge durability, and it effectively combines the objective description of extension theory with the dual-uncertainty reasoning of the cloud model, making the durability assessment of in-service bridges more reasonable and reliable. The flow chart of the evaluation method is shown in Fig. 1.

Figure 1. Durability assessment flowchart for in-service concrete bridges

2 Establishment of the durability evaluation index system for in-service concrete bridges

2.1 Principles for selecting evaluation indicators

The durability of concrete bridges has long been a significant concern in both the engineering and academic communities. It is the result of the combined effects of multiple influencing factors, and there are often inconsistencies and contradictions among the corresponding indicators. To ensure that the durability of concrete bridges can be accurately assessed, a reasonable and comprehensive evaluation index system must be established. Based on an extensive review of the literature [26, 27, 28, 29], an analysis of defects in in-service concrete bridges, and relevant standards such as the 'Design Specifications for Durability of Highway Concrete Structures' [30], the 'Technical Condition Evaluation Standards for Highway Bridges' [31], and the 'Highway Bridge and Culvert Maintenance Specifications' [32], the evaluation index system for the durability assessment of in-service concrete bridges has been established.

2.2 Determination of the evaluation index system

Based on the principles for selecting evaluation indicators, the main factors influencing concrete bridge durability, and the analysis of bridge defects, the durability of in-service concrete bridges is set as the target level. The primary indicators are bridge appearance, reinforcement condition, concrete condition, other influencing factors, and environmental factors. Below these are 13 secondary indicators, including bridge deck pavement, expansion joint condition, and protective layer thickness, among others. The resulting evaluation index system for the durability of in-service concrete bridges is shown in Fig. 2.

Figure 2. Durability assessment index system for in-service concrete bridges

3 Research method

3.1 Grey relational analysis method

In real-life and practical engineering applications, systems and problems with incomplete data and information are common. To address these issues, Professor Deng Julong proposed Grey System Theory in 1982, introducing Grey Relational Analysis within it. The core idea of Grey System Theory is to infer the behavior and structure of a system from partially known information. Grey systems lie between white systems (completely determined) and black systems (completely undetermined), dealing with systems whose information is incomplete or partially known. Grey Relational Analysis is a method for measuring the strength of the relationships among the various factors in a system: by calculating the relational degrees between factors, one can determine the relationships and levels of influence among variables. The method can handle situations with incomplete data while still providing intuitive and reliable results, thereby offering scientific support for decision-making in real-life and engineering applications [33, 34].

3.1.1 Calculation of grey relational coefficients

The basic idea of grey relational analysis is to judge the degree of correlation between different sequences by evaluating the geometric similarity of the sequence curves. After dimensionless processing of the variables, grey relational analysis is used to determine the relationship between the reference sequence (parent sequence) and the comparison sequences (subsequences) [35]. The grey relational coefficient of each subsequence relative to the reference sequence on the \(k\)-th indicator is calculated using MATLAB.
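The coefficient formula itself did not survive extraction; assuming the standard Deng formulation (with \(X_0\) the reference sequence, \(X_k\) the \(k\)-th comparison sequence, and \(i\) indexing data points), Eq. (1) reads:

```latex
\xi_k(i) = \frac{\min\limits_{k}\min\limits_{i}\left|X_0(i)-X_k(i)\right| \;+\; \rho\,\max\limits_{k}\max\limits_{i}\left|X_0(i)-X_k(i)\right|}{\left|X_0(i)-X_k(i)\right| \;+\; \rho\,\max\limits_{k}\max\limits_{i}\left|X_0(i)-X_k(i)\right|}
\tag{1}
```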

In the formula, \(\mathop {\min }\limits_{k} \mathop {\min }\limits_{i} \left|X_{0}(i) - X_{k}(i)\right|\) and \(\mathop {\max }\limits_{k} \mathop {\max }\limits_{i} \left|X_{0}(i) - X_{k}(i)\right|\) are the minimum and maximum absolute differences between the reference and comparison sequences, respectively; \(\rho\) is the distinguishing coefficient: the smaller \(\rho\) is, the greater the resolution, and the larger \(\rho\) is, the lower the resolution. Generally, \(\rho\) is taken as 0.5.

3.1.2 Calculation of grey relational degree

By averaging the correlation coefficients of each data point, the relational degree of each index can be obtained, that is
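The averaging formula is not reproduced in this text; a reconstruction consistent with the surrounding definitions (Eq. (2)), where \(\xi_k(i)\) is the relational coefficient of the \(k\)-th indicator at the \(i\)-th data point, is:

```latex
r_k = \frac{1}{n}\sum_{i=1}^{n}\xi_k(i)
\tag{2}
```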

where \(n\) is the number of data points.

3.1.3 Calculation of weights for grey relational evaluation indicators

By normalizing the relational degrees of each indicator, the weights of the indicators can be obtained. The calculation formula is:
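The normalization formula is missing from this text; the standard form, consistent with the definitions that follow, is:

```latex
W_{1k} = \frac{r_k}{\sum_{j=1}^{m} r_j}
\tag{3}
```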

where \(m\) is the total number of indicators, and \(W_{1k}\) is the weight of the \(k\) -th indicator.

3.2 Entropy weight method

3.2.1 Calculation of evaluation indicator proportions

Because the calculation formula of the entropy weight method involves logarithms, this paper adopts a non-negative translation to handle values less than or equal to zero, following SPSSAU (Statistical Product and Service Software Automatically) and the literature. Based on the standardized data, the proportion of the \(i\)-th sample value under the \(k\)-th evaluation indicator is calculated [36], that is
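The proportion formula is not reproduced here; the standard form, consistent with the definition below, is:

```latex
p_{ik} = \frac{x_i^{*}(k)}{\sum_{i=1}^{n} x_i^{*}(k)}
\tag{4}
```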

where \(x_{i}^{*} (k)\) represents the \(i\) -th sample value under the \(k\) -th evaluation indicator after standardization.

3.2.2 Calculation of information entropy for each evaluation indicator
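Sections 3.2.2 and 3.2.3 reference formulas that are not reproduced in this text. The standard entropy weight method definitions are given below; the notation \(W_{2k}\), chosen by analogy with \(W_{1k}\) above, is an assumption:

```latex
E_k = -\frac{1}{\ln n}\sum_{i=1}^{n} p_{ik}\ln p_{ik}
\tag{5}

W_{2k} = \frac{1 - E_k}{\sum_{j=1}^{m}\left(1 - E_j\right)}
\tag{6}
```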

3.2.3 Calculation of entropy weights for each evaluation indicator

3.3 Combination weighting based on grey relational analysis and entropy weight method

Using the Lagrange multiplier method [ 36 ] for combined weighting can integrate both the objective information of various indicators and subjective judgments, thereby obtaining a more scientific and reasonable weight distribution. This approach also enhances the model's adaptability and robustness to different types of data and complex decision-making environments.

The specific steps to determine the weights of the durability evaluation indicators for in-service concrete bridges based on the Grey Relational-Entropy Weight Method are as follows:

The comprehensive weighting using the Lagrange multiplier method is performed as follows:

Based on the weights obtained from the Grey Relational Analysis and the Entropy Weight Method, the comprehensive weights of the indicators are calculated using formulas ( 7 ) and ( 8 ).
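Equations (1) through (8) are referenced but not reproduced in this text. The following Python sketch is illustrative only (the authors used MATLAB and the SPSSAU platform): it implements the standard grey relational and entropy weight calculations and combines them with the square-root-product rule, a common closed-form result of the Lagrange-multiplier formulation. The function names, the toy data, and the exact combination rule are assumptions, not the paper's code.

```python
import numpy as np

def grey_relational_weights(X, x0, rho=0.5):
    """Grey relational weights (Eqs. 1-3). X: (m indicators, n data points),
    x0: reference sequence of length n; sequences assumed dimensionless."""
    diff = np.abs(X - x0)                  # |X_0(i) - X_k(i)|
    d_min, d_max = diff.min(), diff.max()  # two-level min/max differences
    xi = (d_min + rho * d_max) / (diff + rho * d_max)  # coefficients, Eq. (1)
    r = xi.mean(axis=1)                    # relational degrees, Eq. (2)
    return r / r.sum()                     # normalized weights, Eq. (3)

def entropy_weights(X):
    """Entropy weights (Eqs. 4-6). X: (n samples, m indicators), all values > 0
    (the paper applies a non-negative translation first)."""
    p = X / X.sum(axis=0)                  # proportions p_ik, Eq. (4)
    e = -(p * np.log(p)).sum(axis=0) / np.log(X.shape[0])  # entropy, Eq. (5)
    d = 1.0 - e                            # degree of divergence
    return d / d.sum()                     # entropy weights, Eq. (6)

def combined_weights(w1, w2):
    """Combined weighting: sqrt-product form, a common closed-form solution
    of the Lagrange-multiplier formulation (assumed here for Eqs. 7-8)."""
    w = np.sqrt(w1 * w2)
    return w / w.sum()

# Toy example: 3 indicators observed on 4 bridges (hypothetical numbers).
scores = np.array([[0.9, 0.8, 0.7, 0.6],
                   [0.5, 0.6, 0.4, 0.5],
                   [0.3, 0.2, 0.4, 0.3]])
w_grey = grey_relational_weights(scores, x0=np.ones(4))
w_ent = entropy_weights(scores.T)
w_comb = combined_weights(w_grey, w_ent)
```

All three weight vectors are normalized to sum to one, so they can be compared directly, as in Table 5 and Fig. 4 of the case study.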

3.4 Extension cloud model

Extension theory is an emerging discipline founded by scholars such as Cai Wen of Guangdong University of Technology. Based primarily on formal models, it explores methods for extending things and is characterized by strong logic and mathematization. Extension theory aims to resolve incompatibilities and contradictions, providing an innovative and regular approach to such problems. It comprises three parts: matter-element theory, extension set theory, and extension logic, each contributing to the systematic analysis and resolution of complex problems [37, 38, 39].

According to extension theory, based on matter-element theory, the matter-element to be evaluated can be represented by an ordered triple \(R = (N,C,V)\), where \(N\) is the thing, \(C\) is the characteristic, and \(V\) is the value of characteristic \(C\) for \(N\) [40]. In traditional matter-element extension, \(V\) is a determined value. However, the evaluation of the durability of in-service concrete bridges involves randomness and fuzziness, and its grade boundaries likewise exhibit randomness and fuzziness. Therefore, the digital characteristics of the cloud model \((Ex,En,He)\) are used to replace the original characteristic value \(V\) [41]. Here, \(Ex\) is the expected value, the most representative point within the domain space, essentially the best sample. \(En\) is the entropy, which measures the uncertainty of the qualitative concept; it is determined by the fuzziness and randomness of the concept and reflects both the dispersion of the cloud droplets and the value range recognized by the concept. \(He\) is the hyper-entropy, which measures the uncertainty of the entropy and reflects the thickness of the cloud; its value can be adjusted based on the fuzzy threshold of the indices.

Extension cloud theory refers to the coupling of the matter-element model and the cloud model, utilizing matter-element theory to analyze the cloud model. It can be represented using a matrix as follows:
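The matrix itself is not reproduced in this text; one common layout, with rows for the assessment indices and columns for the grades (the double-subscript indexing is an assumed generalization of the \(Ex_j\), \(En_j\), \(He_j\) notation below), is:

```latex
R = \begin{bmatrix}
       & O_1 & \cdots & O_m \\
C_1    & (Ex_{11}, En_{11}, He_{11}) & \cdots & (Ex_{1m}, En_{1m}, He_{1m}) \\
\vdots & \vdots & \ddots & \vdots \\
C_n    & (Ex_{n1}, En_{n1}, He_{n1}) & \cdots & (Ex_{nm}, En_{nm}, He_{nm})
\end{bmatrix}
```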

In the equation, \(O\) is the assessment grade set matrix; \(C_i (i = 1,2,...,n)\) is the assessment index; \(Ex_j\), \(En_j\), and \(He_j\) are the cloud parameters for each grade of the assessment index \(C_i\).

3.4.1 Determining the grade interval cloud model

All in-service bridge durability assessment indices are divided into 5 grades. Let the grade interval values for each index be \(\left[ {D_{\min} ,D_{\max} } \right]\) , where the expected value \(Ex\) is taken as the middle value. Due to the fuzziness and uncertainty of the association degree, \(En\) and \(He\) are obtained through the relational expression, thus determining the grade interval cloud model.
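The relational expressions are not reproduced here; a construction commonly used in extension cloud studies, consistent with the definitions below, is (the divisor 6 follows the common "3En" rule; some studies use other relations):

```latex
Ex = \frac{D_{\min} + D_{\max}}{2}, \qquad
En = \frac{D_{\max} - D_{\min}}{6}, \qquad
He = p
```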

In the equation, \(D_{\max}\) and \(D_{\min}\) are the upper and lower limits of each grade interval, respectively; \(p\) is the fuzzy constant, which can be adjusted according to the fuzziness, discreteness, randomness and actual situation of the index. In this paper, 0.01 is adopted.

According to the extension cloud model generation algorithm, the normal cloud model of the indices is simulated using the digital parameters of the cloud model for concrete bridge durability assessment indices.

3.4.2 Determining the correlation degree of the extension cloud model

Each assessment index value \(x\) is considered a cloud droplet. A normally distributed random number \(En^{\prime}\) with an expected value \(En\) and a standard deviation \(He\) is generated. The cloud correlation degree between each index value \(x\) and the normal cloud model is then calculated.
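The correlation formula is not reproduced in this text; assuming the standard normal-cloud correlation degree, consistent with the definitions below, Eq. (11) reads:

```latex
k = \exp\!\left(-\frac{(x - Ex)^{2}}{2\,(En^{\prime})^{2}}\right)
\tag{11}
```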

In the equation, \(k_i\) is the correlation degree between the index value \(x\) and the extension cloud model; \(Ex\) is the expected value; \(En^{\prime}\) is a random number following a normal distribution.

The judgment matrix \(G_{mn}\) formed based on the cloud correlation degree \(k_i\) is:

In the equation, \(G_{mn}\) is the correlation degree between the evaluated index and the grade interval extension cloud model; \(m\) is the number of evaluation indices; \(n\) is the number of evaluation grades.

3.4.3 Determining the comprehensive evaluation grade

Combining the comprehensive weight values of the concrete bridge durability assessment indices, the comprehensive evaluation vector \(B\) is calculated as follows:
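The formula is not reproduced here; a reconstruction consistent with the judgment matrix and weight vector defined above (Eq. (13)) is:

```latex
B = V \cdot G_{mn} = \left(b_1, b_2, \ldots, b_n\right)
\tag{13}
```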

In the equation, \(V_i\) is the set of comprehensive weight vectors for each index.

The fuzzy grade characteristic value of the evaluation \(r\) is calculated as follows:
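The formula is missing from this text; assuming the common weighted-mean form of the grade characteristic value, Eq. (14) reads:

```latex
r = \frac{\sum_{i=1}^{n} b_i f_i}{\sum_{i=1}^{n} b_i}
\tag{14}
```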

In the equation, \(b_i\) is the \(i\)-th component of the vector \(B\); \(f_i\) is the grade corresponding to that component.

3.4.4 Calculating credibility

Due to the fuzziness and randomness of the correlation degree \(k\) , it is necessary to increase the number of iterations to reduce the impact of random factors. The grade characteristic expectation \(Ex_r\) and the grade characteristic entropy \(En_r\) are calculated as follows:
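The formulas are not reproduced here; a reconstruction consistent with the definitions below takes \(Ex_r\) as the mean of the characteristic values over the iterations and \(En_r\) as their dispersion (the standard-deviation form is an assumption; some studies use a mean-absolute-deviation backward-cloud form instead):

```latex
Ex_r = \frac{1}{q}\sum_{i=1}^{q} r_i(x), \qquad
En_r = \sqrt{\frac{1}{q}\sum_{i=1}^{q}\left(r_i(x) - Ex_r\right)^{2}}
```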

In the equation, \(q\) is the number of iterations, set to 1000; \(r_i(x)\) is the characteristic value of index \(x\) in the \(i\) -th iteration.

The credibility factor \(\theta\) is defined as follows:
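The defining formula is missing from this text; the ratio commonly used for this credibility factor, consistent with the interpretation that follows, is:

```latex
\theta = \frac{En_r}{Ex_r}
```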

The value of \(\theta\) reflects the degree of deviation of the evaluation results and is inversely related to credibility: the smaller \(\theta\) is, the higher the credibility of the evaluation results.
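The evaluation chain of Section 3.4 (grade interval clouds, cloud correlation degrees, the comprehensive vector \(B\), the grade characteristic value \(r\), and the credibility \(\theta\)) can be sketched in Python as follows. This is illustrative only: the midpoint/width cloud construction, the function names, and the hypothetical weights and index values are assumptions, and the paper's own computations were done in MATLAB.

```python
import numpy as np

rng = np.random.default_rng(42)

def grade_cloud(d_min, d_max, p=0.01):
    """Cloud parameters (Ex, En, He) for a grade interval [d_min, d_max];
    midpoint/width-over-6 construction assumed, with He = fuzzy constant p."""
    return (d_min + d_max) / 2.0, (d_max - d_min) / 6.0, p

def cloud_correlation(x, ex, en, he):
    """Cloud correlation degree of index value x (Eq. 11):
    En' ~ N(En, He^2), k = exp(-(x - Ex)^2 / (2 En'^2))."""
    en_p = rng.normal(en, he)
    return float(np.exp(-(x - ex) ** 2 / (2.0 * en_p ** 2)))

def evaluate(values, weights, intervals, q=1000):
    """Repeat the evaluation q times: judgment matrix G of correlation degrees,
    B = W . G (Eq. 13), grade characteristic value r as the B-weighted mean
    grade (Eq. 14); returns Ex_r, En_r and credibility theta = En_r / Ex_r."""
    grades = np.arange(1, len(intervals) + 1)
    clouds = [grade_cloud(lo, hi) for lo, hi in intervals]
    rs = []
    for _ in range(q):
        G = np.array([[cloud_correlation(v, *c) for c in clouds]
                      for v in values])
        B = np.asarray(weights) @ G
        rs.append((B * grades).sum() / B.sum())
    rs = np.asarray(rs)
    ex_r = rs.mean()
    en_r = rs.std()
    return ex_r, en_r, en_r / ex_r

# Hypothetical example: two indices scored on a 0-100 scale, five grades
# ordered from Class I (best) to Class V (worst).
intervals = [(80, 100), (60, 80), (40, 60), (20, 40), (0, 20)]
ex_r, en_r, theta = evaluate([92, 88], [0.6, 0.4], intervals)
```

With both index values well inside the top interval, the grade characteristic expectation lands near 1 (Class I) and the credibility factor is very small, mirroring the small \(\theta\) values reported for the case-study bridges.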

4 Basis for durability assessment of in-service concrete bridges

A reasonable classification of the evaluation grades for the durability of in-service concrete bridges facilitates the assessment of bridge durability. By referring to relevant standards and literature, and considering the degree of damage and defects in the bridges, the durability of the bridges is classified into five grades, as shown in Table  1 . The grading criteria for the evaluation indicators of the durability of in-service concrete bridges are determined based on these classifications, as shown in Table  2 .

In Table 2, a positive indicator is one for which a larger value is better, and a negative indicator is one for which a smaller value is better.

5 Case study

Referring to the literature [42] and analyzing concrete bridge inspection reports, and in accordance with the evaluation system determined above, inspection data were collected for four in-service concrete highway bridges in Wuhan, China. The bridges are summarized in Table 3.

Among the design loads in Table 3, Highway Class I denotes a high bearing-capacity standard; Vehicle Load Super Class 20 permits a gross vehicle weight of more than 100 tons; and Vehicle Load Class 18 permits a maximum vehicle weight of 18 tons. For Bridge 2 and Bridge 4, the durability analysis focuses mainly on the main bridges, whose lengths are 2458 m and 1876.1 m, respectively.

A field analysis was conducted on Changfeng Bridge in Wuhan, Hubei Province, to obtain its relevant detection data. This bridge, a large-scale bridge crossing the Han River, is located on the western section of the Third Ring Road in Wuhan and was completed and opened to traffic in 2001. The bridge starts from the Cihui Interchange on the Third Ring Road in the north, crosses the Han River to the south, and lands on the north side of the original bridge toll station. The bridge spans from the Hankou shore abutment No. 0 to the Hanyang shore abutment No. 35, with a total length of 1130 m. The main bridge is a concrete-filled steel tube tied-arch bridge with a total length of 372 m. The overall diagram of the bridge is shown in Fig.  3 .

Figure 3. Overall picture of Changfeng Bridge

The measured data for the durability assessment indices of the five in-service concrete bridges are shown in Table  4 .

Among the measured values of the durability evaluation indicators in Table 4, the indicators \(C_1\), \(C_2\), \(C_8\), \(C_{10}\), \(C_{11}\), and \(C_{12}\) involve a quantification process that may be influenced by the subjective judgment of the inspectors. To reduce the influence of subjective factors on the results, the final quantified values of these indicators are taken as the averages of the values assigned by multiple inspectors.

5.1 Determining the weights of assessment indices

Using MATLAB programming software and based on Eqs. ( 1 ) to ( 3 ), the grey relational weights of the indices are calculated, as shown in Table  5 .

Using the SPSSAU platform and based on Eqs. ( 4 ) to ( 6 ), the entropy weights of the durability assessment indices for in-service concrete bridges are calculated, as shown in Table  5 .

The comprehensive weights of grey relational and entropy weights are calculated using the Lagrange multiplier method. Based on Eqs. ( 7 ) to ( 8 ), the comprehensive weights of each index are obtained. The results are shown in Table  5 .

According to Table 5, the \(C_7\) concrete carbonation indicator has the largest share of the combined weight and thus a particularly significant impact on the durability of in-service reinforced concrete bridges. It is closely followed by the \(C_5\) reinforcement corrosion and \(C_{13}\) chloride content indicators, which also have relatively large combined weights. This weight allocation is consistent with engineering experience: in-service concrete bridges are usually exposed to atmospheric environments and are susceptible to carbonation through direct contact with carbon dioxide in the air, and they are also susceptible to chloride ions in such environments, leading to corrosion of the reinforcement and thus affecting durability. The calculated combined weights can therefore be considered scientific and reliable.

The indicator weight distribution derived from Table 5 is shown in Fig. 4.

Figure 4. Weight distribution of indicators

As shown in Fig. 4, there are significant differences between the grey relational weights and the entropy weights for some indices, and certain indices exhibit inconsistent trends. Grey relational analysis involves a degree of subjectivity in its considerations, whereas the entropy weight method determines weights from the degree of data dispersion, reflecting objectivity. Combining both methods balances subjective and objective factors, resulting in more reasonable final weights. Overall, the comprehensive weights more accurately reflect the true importance of each evaluation index, avoiding the potential biases of a single weighting method and enhancing the rationality and accuracy of the evaluation results.

5.2 Extension cloud model for in-service concrete bridges

The durability evaluation model for in-service concrete bridges is established, and grading follows the durability evaluation indices defined above. According to Eq. (10), the normal cloud model characteristic values (expectation, entropy, hyper-entropy) of each evaluation index are calculated; see Table 6. A cloud droplet diagram can be generated from these characteristic values; this paper takes indicator \(C_1\), bridge deck pavement, as an example, as shown in Fig. 5.

Figure 5. Cloud droplet diagram for index \(C_1\)

Table 6 lists the normal cloud model characteristic values for each durability evaluation index of in-service concrete bridges, and Fig. 5 visualizes the cloud droplet diagram of the \(C_1\) indicator as an example. From this figure, the grade distribution of the \(C_1\) indicator and its correlation characteristics can be clearly understood.

5.3 Calculation of index correlation degree

Based on the characteristic values (expected value, entropy, hyper-entropy) obtained from the cloud model, MATLAB is used to perform the calculations according to Eqs. (11) and (12), yielding the cloud correlation degrees for the durability assessment grades of the five in-service concrete bridges. The correlation-degree results for the example bridge are shown in Table 7.

Table 7 shows the cloud correlation degrees of the evaluation grades for each durability evaluation index; from these results, the comprehensive correlation degrees are further obtained, as shown in Table 8.

5.4 Determining the evaluation grade

Based on Eqs. ( 13 ) to ( 16 ), the comprehensive correlation degree and credibility are obtained, and the evaluation grade is determined. The results are shown in Table  8 .

As can be seen from Table 8, Bridge 1 is assessed as Class I, Bridge 3 as Class II, and Bridge 2, Bridge 4, and Changfeng Bridge as Class III; the credibility factor is less than 0.01 in every case, indicating that the results are highly credible. The evaluation results produced by the MATLAB-based model coincide with the measured data, verifying the accuracy of the model. According to the classification of durability assessment grades in Table 1, Bridge 1 (Class I) is essentially free of damage, can be used normally, and requires only routine maintenance. Bridge 2, Bridge 4, and Changfeng Bridge (Class III) show moderate damage; although they can still function normally, the damaged parts require major repairs. Bridge 3 (Class II) shows minor damage with no significant impact on normal use and requires only minor localized repairs.

6 Discussion

With bridge deterioration becoming increasingly common, the number of bridges in urgent need of rehabilitation and strengthening keeps growing, making durability studies of in-service concrete bridges imperative. Against this background, this study proposes an efficient method for the comprehensive assessment of bridge durability, which has significant theoretical and practical value.

In this study, a combined weighting method based on grey relational analysis and the entropy weight method is proposed on the basis of the constructed evaluation index system. The results in Table 5 and Fig. 4 show that this method not only overcomes the limitations of single weighting methods but also provides a more scientific and reasonable way to determine weights for bridge durability assessment. Its introduction enriches and improves the theoretical system of bridge durability assessment, provides a reference for future related research, and promotes further improvement of the field.

By carrying out the modeling and evaluation on the MATLAB platform, this study significantly improves evaluation efficiency. The model is constructed by coupling extension theory with the cloud model, which not only considers the interrelationships among the durability evaluation indicators of in-service concrete bridges but also respects the relative independence of each indicator. Compared with traditional evaluation methods, this combined model performs well in handling the complex relationships among evaluation indicators and in coping with ambiguity and uncertainty, making the evaluation results more scientific and accurate.

As can be seen from the analysis results in Table  8 , the durability assessment results of in-service concrete bridges in this study are highly consistent with the measured results, which verifies the effectiveness and practicality of the proposed method. The method can be effectively applied in the durability assessment of actual bridges, providing a scientific decision-making basis for the bridge management department. By accurately evaluating the durability status of bridges, the relevant authorities can rationally arrange maintenance and reinforcement programs, which can effectively extend the service life of bridges and reduce the maintenance costs. This not only optimizes the bridge management strategy but also provides a strong guarantee for the sustainability of the infrastructure.

7 Conclusions

The durability assessment of in-service concrete bridges is an important component of bridge construction. With the increasing number of deficient bridges each year, the maintenance and renovation of bridges have become urgent. Therefore, proposing a scientific and effective method for evaluating the durability of in-service concrete bridges is of great significance. Based on extensive literature review, this paper adopts the extension model to assess the durability of in-service concrete bridges, with the main conclusions as follows:

Based on the analysis of durability defects in in-service concrete bridges, a two-tier evaluation index system for the durability of in-service concrete bridges was established. Thirteen indicators were selected to assess the durability of concrete bridges: deck pavement, expansion joint conditions, protective layer thickness, rebar distribution, rebar corrosion, concrete strength, concrete carbonation, bearing conditions, traffic volume, foundation settlement, guardrail and railing inspection, drainage system conditions, and chloride ion content.

In the extension cloud evaluation system, the weight of each index directly affects the effectiveness of the bridge durability assessment, and using a single weighting reduces the referential value of the results. Therefore, this study employs the Lagrange multiplier method to combine the weights from grey relational analysis and the entropy weight method, enhancing the scientific basis of the index weighting. This approach also improves the stability and robustness of the extension model, making the evaluation results more reliable.

This study employs an extension cloud model with combined weighting to assess the durability of in-service concrete bridges. The evaluation results are compared with actual bridge data, showing high consistency, which ensures the accuracy of the assessment results. The extension cloud model effectively integrates extension theory and cloud model theory, combining the advantages of both. It excels in handling fuzziness and uncertainty and adequately reflects the complexity and dynamism of the evaluated objects.

In this paper, the durability assessment system and assessment methods of in-service concrete bridges have been studied to a certain extent, but there are still some issues that need to be further improved due to the limitations of data collection and the authors' knowledge level.

The durability assessment system established in this paper mainly relies on literature collection and standardised theoretical analysis. Future research can combine methods such as finite element simulation to determine the influencing factors of bridge durability in more depth, so as to define and select the assessment indexes more precisely.

In the assessment of bridge durability, there is a certain degree of subjectivity in the quantification process of the indicator values, which may affect the accuracy of the assessment results. Subsequent studies should further refine the quantification process of these indicators and explore more objective and standardised quantification methods to improve the accuracy and reliability of the assessment results.

Different types of bridges (e.g., steel, prestressed concrete, and reinforced concrete bridges) have significant differences in durability assessment. Due to the limitations of the sample, this study only analysed the durability of concrete bridges in the same region. Future research should establish a more comprehensive assessment system for different types of bridges in order to improve the applicability and effectiveness of the model and to provide a broader reference basis for the durability assessment of various bridge structures.

Data availability

Data is provided within the manuscript or supplementary information files.

Kawamura K, Miyamoto A, Frangopol DM, Kimura R. Performance evaluation of concrete slabs of existing bridges using neural networks. Eng Struct. 2003;25(11):1455–77. https://doi.org/10.1016/S0141-0296(03)00112-3 .


Chen YJ. Research on durability evaluation method of reinforced concrete bridges based on fuzzy C-means clustering (in Chinese). Jilin University (2015). Retrieved from https://kns.cnki.net/kcms2/article/abstract?v=MuRVhOLgpmsgc8kUcNudRXwDUlhLurvta5E-yR6gHPTT075H0yj6cnFL3CZW2TUcuv1dcrw1sM-rTFkssjE9oM-cj45vh2fcJHhI3MDARQtapeGdHXEFk0qURSEi6FlJak0mMSMyXJ8=&uniplatform=NZKPT&language=CHS

Smith G, Kogler R, McNeill D. Upgrading bridge durability. Mater Perform. 2011;50(1):50–3. https://doi.org/10.1002/suco.201100022 .

Alexander MG, Ballim Y, Stanish K. A framework for use of durability indexes in performance-based design and specifications for reinforced concrete structures. Mater Struct. 2008;41(6):921–36. https://doi.org/10.1617/s11527-007-9295-0 .

Li Q, Zhou J, Feng J. Safety risk assessment of highway bridge construction based on cloud entropy power method. Appl Sci. 2022;12(17):8692. https://doi.org/10.3390/app12178692 .

Melchers RE, Chaves IA. Durability of reinforced concrete bridges in marine environments. Struct Infrastruct Eng. 2020;16(1):169–80. https://doi.org/10.1080/15732479.2019.1604769 .

Liu JZ, Xu JY, Bai EL, Gao ZG. Durability evaluation analysis of reinforced concrete structures based on extension method. Adv Mater Res. 2011;163:3354–8. https://doi.org/10.4028/www.scientific.net/AMR.163-167.3354 .

Naijie C, Xueying B. Durability evaluation of highway bridges based on improved similarity weight and unascertained measure theory. J Railway Sci Eng. 2018;15(10):2541–8. https://doi.org/10.19713/j.cnki.43-1423/u.2018.10.013 .

Alexander M, Beushausen H. Durability, service life prediction, and modelling for reinforced concrete structures—review and critique. Cem Concr Res. 2019;122:17–29. https://doi.org/10.1016/j.cemconres.2019.04.018 .

Ou YC, Fan HD, Nguyen ND. Long-term seismic performance of reinforced concrete bridges under steel reinforcement corrosion due to chloride attack. Earthq Eng Struct Dyn. 2013;42(14):2113–27. https://doi.org/10.1002/eqe.2316 .

Šomodíková M, Lehký D, Doležel J, Novák D. Modeling of degradation processes in concrete: probabilistic lifetime and load-bearing capacity assessment of existing reinforced concrete bridges. Eng Struct. 2016;119:49–60. https://doi.org/10.1016/j.engstruct.2016.03.065 .

Nettis A, Ruggieri S, Uva G. Corrosion-induced fragility of existing prestressed concrete girder bridges under traffic loads. Eng Struct. 2024;314: 118302. https://doi.org/10.1016/j.engstruct.2024.118302 .

Anania L, Badalà A, D’Agata G. Damage and collapse mode of existing post-tensioned precast concrete bridge: the case of Petrulla viaduct. Eng Struct. 2018;162:226–44. https://doi.org/10.1016/j.engstruct.2018.02.039 .

Mackechnie JR, Alexander MG. Using durability to enhance concrete sustainability. J Green Build. 2009;4(3):52–60. https://doi.org/10.3992/jgb.4.3.52 .

Ya'e H. Durability evaluation and structural safety warning study of sea-crossing tied arch bridges. Master's thesis, Zhejiang Ocean University, Zhoushan, China (2023). https://doi.org/10.27747/d.cnki.gzjhy.2022.000232

Junli L. Durability evaluation and prediction of concrete bridges (Master's thesis, Hunan University, Changsha, China) (2014). Retrieved from https://kns.cnki.net/kcms2/article/abstract?v=JgtjNxUAsgeFkRpjfKdTujdoaTHHT6sH0fN1vtgKrFcZMbici9DeULZlOEn7rY_cSraD8J8pUBWZpP6GRtJyxsDdlB5ECkN6ScfQ6bj_AOIiGDorL7d2Ba3HOx0fWM8C86kz1Fsq__Tb3pTqFDxum27Dn2MNvhvAe2anmOkmbTQrnCjpvkOslSzd1gN52g4c3tx6J2XsIyM=&uniplatform=NZKPT&language=CHS

Tao P. Durability analysis of prestressed bridges based on Matlab fuzzy neural network. (Master's thesis, Southeast University, Nanjing, China) (2022). https://doi.org/10.27014/d.cnki.gdnau.2020.004355

Liu H, Wang X, Jiao Y, He X, Wang B. Condition evaluation for existing reinforced concrete bridge superstructure using fuzzy clustering improved by particle swarm optimisation. Struct Infrastruct Eng. 2017;13(7):955–65. https://doi.org/10.1080/15732479.2016.1227854 .

Gong L, Liang Y, Zhang B, Gong X, Yang Y. Durability evaluation of concrete in cold and arid regions based on grey relational theory. Adv Civ Eng. 2022;2022:6287810. https://doi.org/10.1155/2022/6287810 .

Chen W, Dan DH, Sun LM. Safety and durability assessment of cable-supported bridge by using ANFIS. In: Proceedings of the international conference on health monitoring of structure, vol. 1, pp. 725–731, Nanjing, China (2007).

Ma J, Pu Q, Song D, Yang G. Fuzzy nearness comprehensive assessment of durability of existing half-through and through concrete arch bridges. J Highway Transp Res Dev. 2008;2:143.

Google Scholar  

He W, Sun X, Li C. Research on durability evaluation method of bridge slings based on set pair analysis. In: IOP conference series: earth and environmental science, vol. 474, p. 072081. IOP Publishing (2020). https://doi.org/10.1088/1755-1315/474/7/072081

Bertola NJ, Brühwiler E. Risk-based methodology to assess bridge condition based on visual inspection. Struct Infrastruct Eng. 2023;19(4):575–88. https://doi.org/10.1080/15732479.2021.1959621 .

Li Q, Zhang T, Yu Y. Evaluation of the durability of bridge tension cables based on combination weighting method-unascertained measure theory. Sustainability. 2022;14(14):7147. https://doi.org/10.3390/su14127147 .

Xu B, Qi N, Zhou J, et al. Reliability assessment of highway bridges based on combined empowerment–TOPSIS method. Sustainability. 2022;14(14):7793. https://doi.org/10.3390/su14137793 .

Li Q, Yu Y. Durability evaluation of concrete bridges based on the theory of matter element extension—entropy weight method—unascertained measure. Math Probl Eng. 2021;2021:2646723. https://doi.org/10.1155/2021/2646723 .

Danjie S, Xingwang L, Hongming H. Application of improved TOPSIS method based on grey relational analysis in bridge evaluation. J Hebei Agric Univ. 2018;41(2):116–21. https://doi.org/10.13320/j.cnki.jauh.2018.0021 .

Jing C, Xueying B, Yanlong Z. Durability evaluation of in-service concrete bridges based on fuzzy extension analytic hierarchy process. J Saf Environ. 2015;15(4):16–20. https://doi.org/10.13637/j.issn.1009-6094.2015.04.003 .

Li Z, Dang Y, Tang Z, et al. Optimal overlays for preservation of concrete in cold climate: decision-making by the method of fuzzy comprehensive evaluation combined with AHP. J Infrastruct Preserv Resil. 2021;2:1–16. https://doi.org/10.1186/s43065-021-00046-x .

Ministry of Transport of the People's Republic of China. (2019). Code for durability design of highway concrete structures (JTG/T 3310-2019).

Ministry of Transport of the People's Republic of China. (2018). Technical specifications for highway condition assessment (JTG 5210-2018).

Ministry of Transport of the People's Republic of China. (2021). Specifications for highway bridge and culvert maintenance (JTG 5120-2021).

Kuo Y, Yang T, Huang GW. The use of grey relational analysis in solving multiple attribute decision-making problems. Comput Ind Eng. 2008;55(1):80–93. https://doi.org/10.1016/j.cie.2007.12.002 .

Liu S. Grey system theory and its application. 3rd ed. Beijing: Science Press; 2014.

Jin Z, Peipei N, Yuxin M, et al. Performance evaluation research based on binary semantic grey relational extension. J Jiangsu Univ Sci Technol Nat Sci Ed. 2023;37(1):92–7. https://doi.org/10.20061/j.issn.1673-4807.2023.04.015 .

Ji M, Xizhe L. Evaluation of state characteristics of low voltage distribution network areas based on G2-entropy weight method. Electr Power Autom Equip. 2017;37(1):41–6. https://doi.org/10.16081/j.issn.1006-6047.2017.01.007 .

Xingwang P, Huimin L, Wenlong L, Hai M, Xuan L. Risk assessment of bridge detection operation based on entropy weight and matter-element extension theory. China Saf Sci J. 2019;29(8):42. https://doi.org/10.16265/j.cnki.issn1003-3033.2019.08.007 .

Li Q, Zhou HD, Zhang H. Durability evaluation of highway tunnel lining structure based on matter element extension-simple correlation function method-cloud model: a case study. Math Biosci Eng. 2021;18(4):4027–54. https://doi.org/10.3934/mbe.2021202 .

Cai W. Topology and its applications. Chinese Science Bulletin, 673–682 (1998). Retrieved from https://kns.cnki.net/kcms/detail/detail.aspx?dbcode=CJFD&dbname=CJFD9899&filename=KXTB199907000&uniplatform=NZKPT&v=q5wWatQV0E66C_n5vjpboWCYMTm8elp7X_y1UyFoKGtPz5-g3UMJljkjR2BgNuFj (accessed on 16 May 2022).

Yan J, Feng C, Li L. Sustainability assessment of machining process based on extension theory and entropy weight approach. Int J Adv Manuf Technol. 2014;71:1419–31. https://doi.org/10.1007/s00170-013-5532-6 .

Wang X, Yu H, Lv P, Wang C, Zhang J, Yu J. Seepage safety assessment of concrete gravity dam based on matter-element extension model and FDA. Energies. 2019;12:502. https://doi.org/10.3390/en12030502 .

Yang H. Research on durability evaluation method of in-service reinforced concrete bridges based on fuzzy neural network (Master's thesis, Jilin University, Changchun, China). (In Chinese) (2013). Retrieved from https://kns.cnki.net/kcms2/article/abstract?v=JgtjNxUAsgezbk9b4PN3dGWOQ-E4dztiiuBVwpnnUK6cNB6L4xNGTYX0d60cfni2imvaevuJ3FvKGCNLTfwWsjUFh2WdYPByWcHozgqajQlxadNNAvAFP4Ua40QKxk9xh_DW84QlZOJ_KXbIyUV7bAf81mq_CxK8FPaCHlhSxtD5Y3awuw-L7_QN0mTlaeV9lUyUVIsn1uE=&uniplatform=NZKPT&language=CHS

Download references

Acknowledgements

The authors gratefully acknowledge the important contributions of the researchers whose work is cited in this paper.

Funding

No external funding was used.

Author information

Authors and Affiliations

School of Civil Engineering, Architecture and Environment, Hubei University of Technology, Wuhan, 430068, China

Jie Cai, Xiaoxiao Liu & Zhipeng Wang

Contributions

JC and XL designed the study, carried out the evaluation analysis, and are the lead contributing authors. ZW guided the use of the MATLAB software and edited and analyzed the code. XL wrote the manuscript, which was reviewed by all authors.

Corresponding author

Correspondence to Xiaoxiao Liu.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

All authors agreed to publish this manuscript.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/ .

About this article

Cai, J., Liu, X. & Wang, Z. Comprehensive durability assessment of in-service concrete bridges based on combination weighting and extenics cloud model. Discov Appl Sci 6, 540 (2024). https://doi.org/10.1007/s42452-024-06250-0

Received: 17 July 2024

Accepted: 03 October 2024

Published: 09 October 2024

DOI: https://doi.org/10.1007/s42452-024-06250-0

Keywords

  • In-service concrete bridges
  • Durability assessment
  • Grey theory
  • Entropy weight method
  • Matter-element extension theory
