Chapter 13: Data Processing, Data Analysis, and Statistical Testing
- What is the difference between measurement validity and interview validation?
Validity is a general term closely associated with research. It indicates how sound a research study is, and it applies specifically to the design and methods of the research. The measurement validity of an assessment is the degree to which it measures what it is supposed to measure (McDaniel & Gates, 2013). Interview validation, by contrast, is the process of ascertaining that interviews were actually conducted as specified.
- Assume that Sally Smith, an interviewer, completed 50 questionnaires. Ten of the questionnaires were validated by calling the respondents and asking them one opinion question and two demographic questions over again. One respondent claimed that his age category was 30–40, when the age category marked on the questionnaire was 20–30. On another questionnaire, in response to the question “What is the most important problem facing our city government?” the interviewer had written, “The city council is too eager to raise taxes.” When the interview was validated, the respondent said, “The city tax rate is too high.” As a validator, would you assume that these were honest mistakes and accept the entire lot of 50 interviews as valid? If not, what would you do?
As the validator in this case, I would treat these as honest mistakes and accept the entire lot of 50 interviews as valid. However, I would contact some additional respondents to gain a little more assurance. Discrepancies of this kind are honest errors that can easily occur while data are being collected.
- What is meant by the editing process? Should editors be allowed to fill in what they think a respondent meant in response to open-ended questions if the information seems incomplete? Why or why not?
The editing process is the process of ascertaining that the questionnaires completed by participants were filled out properly and completely. Even if the information seems incomplete, editors should not be allowed to fill in what they think a respondent meant in response to open-ended questions. Doing so can completely change the participant's opinion and distort the actual results (Ferguson, 2008).
- Give an example of a skip pattern on a questionnaire. Why is it important to always follow the skip patterns correctly?
A skip pattern on a questionnaire is a sequence in which later questions are asked based on a respondent’s answer to an earlier question. An example of a skip pattern is as follows:
- Do you feel tense or depressed? (Yes, No). This question has two possible answers, so there are two sets of follow-up questions: one for respondents who answer yes and one for those who answer no. Respondents are asked to skip the questions associated with the other answer and move to the set that matches their own response.
It is important to always follow skip patterns correctly because a participant who skips to the wrong set of questions produces incorrect and invalid data.
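The routing described above can be sketched in a few lines of Python. This is only an illustration: the question IDs, the wording of Q2 through Q5, and the routing table are hypothetical and do not come from any actual questionnaire.

```python
# Hypothetical questionnaire: IDs and wording are illustrative assumptions.
QUESTIONS = {
    "Q1": "Do you feel tense or depressed? (Yes/No)",
    "Q2": "How often do you feel this way?",           # asked only after "Yes"
    "Q3": "Have you sought help for these feelings?",  # asked only after "Yes"
    "Q4": "What helps you stay relaxed?",              # asked only after "No"
    "Q5": "What is your age category?",                # asked of everyone
}

# Skip pattern: the answer to Q1 decides which question comes next.
SKIP_RULES = {"Q1": {"Yes": "Q2", "No": "Q4"}}
# Questions without a skip rule simply lead to a fixed next question.
DEFAULT_NEXT = {"Q2": "Q3", "Q3": "Q5", "Q4": "Q5", "Q5": None}


def route(answers):
    """Return the ordered list of question IDs a respondent should be asked."""
    path, current = [], "Q1"
    while current is not None:
        path.append(current)
        if current in SKIP_RULES:
            current = SKIP_RULES[current][answers[current]]
        else:
            current = DEFAULT_NEXT[current]
    return path


yes_path = route({"Q1": "Yes"})
no_path = route({"Q1": "No"})
print(yes_path)
print(no_path)
```

A respondent who answers "No" but is asked Q2 and Q3 anyway would contribute answers that were never meant to be collected, which is the invalid data the answer above warns about.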
- It has been said that, to some degree, coding of open-ended questions is an art. Would you agree or disagree? Why? Suppose that, after coding a large number of questionnaires, the researcher notices that many responses have ended up in the “Other” category. What might this imply? What could be done to correct this problem?
The statement that coding open-ended questions is an art can be regarded as valid. In the case described, it is evident that the researcher did not pay enough attention to developing the open-ended questions carefully, and as a result many responses ended up in the “Other” category (Pink, 2010; Stinson & Fisher, 2006). To correct this, the researcher needs to pay full attention to the basic and important features of open-ended questions and to the particular coding requirements.
- What is the purpose of logical cleaning data? Give some examples of how data can be logically cleaned. Do you think that logical cleaning is an expensive and unnecessary step in the data tabulation process? Why or why not?
Logical cleaning of data is the final computerized error checking carried out before the tabulation and statistical analysis of survey results. Different software packages, such as SAS and SPSS, are available for this purpose. For example, a check can flag a code that falls outside the valid range for a question. It is an important step in the tabulation process, not an unnecessary expense, because checking the data at this stage minimizes later problems in the findings and results of the study.
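As a rough illustration of what such computerized checks might look like, the Python sketch below flags two kinds of logical error: a code outside the valid range and an answer to a question that should have been skipped. The field names, valid codes, and records are all hypothetical; packages such as SAS and SPSS provide their own facilities for this, which are not shown here.

```python
# Hypothetical codebook: field names and valid codes are assumptions.
VALID_CODES = {
    "q1_depressed": {1, 2},            # 1 = Yes, 2 = No
    "age_category": {1, 2, 3, 4, 5},
}

# Hypothetical coded records; None means the question was left blank.
records = [
    {"id": 101, "q1_depressed": 1, "q2_frequency": 3, "age_category": 2},
    {"id": 102, "q1_depressed": 2, "q2_frequency": 4, "age_category": 3},
    {"id": 103, "q1_depressed": 1, "q2_frequency": 2, "age_category": 9},
]


def logical_errors(record):
    """Return a list of logical problems found in one coded record."""
    errors = []
    # Range check: every coded field must hold one of its valid codes.
    for field, valid in VALID_CODES.items():
        if record[field] not in valid:
            errors.append(f"{field}: code {record[field]} out of range")
    # Skip-pattern check: q2 should be blank when q1 was answered "No" (2).
    if record["q1_depressed"] == 2 and record.get("q2_frequency") is not None:
        errors.append("q2_frequency: answered but should have been skipped")
    return errors


flagged = {}
for record in records:
    problems = logical_errors(record)
    if problems:
        flagged[record["id"]] = problems

for record_id, problems in flagged.items():
    print(record_id, problems)
```

Records flagged this way would be checked against the original questionnaires before tabulation begins.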
- It has been said that a cross tabulation of two variables offers the researcher more insightful information than does a one-way frequency table. Why might this be true? Give an example.
A cross tabulation of two variables offers the researcher more insightful information than a one-way frequency table because it is a simple to understand yet powerful analytical tool. It is widely used in marketing research to examine the responses to one question in relation to the responses to one or more other questions. For example, a study of a city's consumers might cross-tabulate their willingness to consider hospitalization against their age.
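The difference between the two kinds of table can be seen in a small Python sketch. The responses below are invented for illustration only, and the age groups are an assumption rather than categories from any real study.

```python
from collections import Counter

# Hypothetical responses: (age group, would consider the hospital?)
responses = [
    ("18-34", "Yes"), ("18-34", "Yes"), ("18-34", "No"),
    ("35-54", "Yes"), ("35-54", "No"), ("35-54", "No"),
    ("55+", "No"), ("55+", "No"), ("55+", "Yes"), ("55+", "No"),
]

# One-way frequency table: counts for a single question.
one_way = Counter(answer for _, answer in responses)

# Cross tabulation: counts for each (age group, answer) combination.
cross_tab = Counter(responses)
age_groups = sorted({age for age, _ in responses})

# Row percentages: share of each age group answering "Yes".
row_pct_yes = {}
for age in age_groups:
    yes = cross_tab[(age, "Yes")]
    no = cross_tab[(age, "No")]
    row_pct_yes[age] = 100 * yes / (yes + no)

print("One-way:", dict(one_way))
print(f"{'Age':<8}{'Yes':>6}{'No':>6}{'% Yes':>8}")
for age in age_groups:
    yes = cross_tab[(age, "Yes")]
    no = cross_tab[(age, "No")]
    print(f"{age:<8}{yes:>6}{no:>6}{row_pct_yes[age]:>8.1f}")
```

The one-way table reports only the overall split between "Yes" and "No", while the cross tabulation shows how that split differs from one age group to the next.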
- Illustrate the various alternatives for using percentages in one-way frequency tables. Explain the logic of choosing one alternative method over another.
Researchers can use cross tabulation as an alternative to percentages in one-way frequency tables. As mentioned earlier, cross tabulation lists the responses to one question in comparison with the responses to other questions, so the data can be validated and tabulated in a more appropriate manner.
- Explain the differences among the mean, median, and mode. Give an example in which the researcher might be interested in each of these measures of central tendency.
The mean is the sum of the values for all observations of a variable divided by the number of observations. The median is the value below which 50% of the observations fall, and the mode is the value that occurs most frequently. Consider this example: “A total of 10 beer drinkers (drink one or more cans, bottles, or glasses of beer per day on the average) were interviewed in a mall-intercept study. They were asked how many cans, bottles, or glasses of beer they drink in an average day.” Here a researcher would be interested in all three measures of central tendency, which are:
- Mode: 2 cans/bottles/glasses
- Median: 2 cans/bottles/glasses
- Mean: 3 cans/bottles/glasses
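The three measures can be computed with Python's standard `statistics` module. The ten values below are hypothetical: the individual responses are not listed above, so this data set was simply chosen to be consistent with the mode, median, and mean just reported.

```python
import statistics

# Hypothetical responses from 10 beer drinkers (cans/bottles/glasses per day).
# Assumption: chosen only to match the summary values reported in the text.
drinks_per_day = [1, 1, 2, 2, 2, 2, 2, 3, 5, 10]

mode = statistics.mode(drinks_per_day)      # most frequent value
median = statistics.median(drinks_per_day)  # middle of the sorted values
mean = statistics.mean(drinks_per_day)      # sum divided by count

print("Mode:", mode)
print("Median:", median)
print("Mean:", mean)
```

In this data set the single heavy drinker pulls the mean above the median and the mode, which is why a researcher may want to look at all three.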
- Explain the notions of mathematical differences, managerially important differences, and statistical significance. Can results be statistically significant and yet lack managerial importance? Explain your answer.
Mathematical differences refer to the idea that if two numbers are not exactly the same, they are different, although this does not mean that the difference is important or statistically significant. Statistical significance refers to a difference that is large enough to be unlikely to have occurred because of chance or sampling error. Lastly, managerially important differences are results that are sufficiently different from a managerial perspective. Results can be statistically significant and yet lack managerial importance. For example, the difference in consumer responses to two different packages in a test market might be statistically significant yet lack managerial importance (Nedarc, 2011).
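The package example can be made concrete with a small Python sketch. Everything numeric here is hypothetical: the sample sizes, the preference shares, and the five-percentage-point threshold for managerial importance are assumptions chosen for illustration, and the two-proportion z-test is one common way to test such a difference, not necessarily the one the sources use.

```python
import math


def two_proportion_z(p1, n1, p2, n2):
    """Two-sided z-test for the difference between two sample proportions."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical test market: share of respondents preferring each package.
share_a, n_a = 0.51, 100_000
share_b, n_b = 0.50, 100_000

z, p_value = two_proportion_z(share_a, n_a, share_b, n_b)

# Assumption: management only acts on a gap of 5 percentage points or more.
MANAGERIAL_THRESHOLD = 0.05

statistically_significant = p_value < 0.05
managerially_important = abs(share_a - share_b) >= MANAGERIAL_THRESHOLD

print(f"z = {z:.2f}, p-value = {p_value:.6f}")
print("Statistically significant:", statistically_significant)
print("Managerially important:", managerially_important)
```

With samples this large, a one-point gap is very unlikely to be due to sampling error, yet it remains too small to justify a packaging change under the assumed threshold.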
- Describe the steps in the procedure for testing hypotheses. Discuss the difference between a null hypothesis and an alternative hypothesis.
The steps for testing a hypothesis are stating the hypotheses, choosing the appropriate test statistic, developing a decision rule, calculating the value of the test statistic, and stating the conclusion. A null hypothesis states that there is no significant difference between specified populations and that any observed difference is due to sampling or experimental error. An alternative hypothesis is the hypothesis that is contrary to the null hypothesis.
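The five steps can be followed through in a short Python sketch. It assumes a one-sample z-test with a known population standard deviation, and every number in it (the hypothesized mean, the sample summary, the significance level) is hypothetical.

```python
import math
from statistics import NormalDist

# Step 1: state the hypotheses.
#   H0: the population mean is 3.0 drinks per day.
#   Ha: the population mean is not 3.0 drinks per day.
mu0 = 3.0

# Hypothetical sample summary (assumed values, not real survey data).
sample_mean = 3.4
sigma = 1.5   # population standard deviation, assumed known
n = 100

# Step 2: choose the test statistic (a z-test, because sigma is assumed known).

# Step 3: develop the decision rule for a two-sided test at alpha = 0.05.
alpha = 0.05
critical_z = NormalDist().inv_cdf(1 - alpha / 2)

# Step 4: calculate the value of the test statistic.
z = (sample_mean - mu0) / (sigma / math.sqrt(n))

# Step 5: state the conclusion.
reject_null = abs(z) > critical_z

print(f"critical z = {critical_z:.3f}, calculated z = {z:.3f}")
print("Reject the null hypothesis" if reject_null
      else "Fail to reject the null hypothesis")
```

Fixing the decision rule in step 3 before calculating the statistic in step 4 keeps the conclusion from being adjusted to fit the data.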
- Distinguish between a type I error and a type II error. What is the relationship between the two?
A type I error, also called an alpha error, is the rejection of the null hypothesis when, in fact, it is true. A type II error, or beta error, is the failure to reject the null hypothesis when, in fact, it is false. Both are defined relative to the null hypothesis, and for a given sample size they trade off against each other: lowering the probability of a type I error raises the probability of a type II error.
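The relationship between the two error types can be illustrated numerically. The sketch below assumes a one-sided z-test with a known standard deviation and a hypothetical true mean under the alternative; all values are invented for illustration.

```python
import math
from statistics import NormalDist


def type_ii_error(alpha, mu0, mu_true, sigma, n):
    """Probability of failing to reject H0: mu = mu0 when the mean is mu_true.

    Assumes a one-sided z-test (Ha: mu > mu0) with known sigma.
    """
    se = sigma / math.sqrt(n)
    # Reject H0 when the sample mean exceeds this cutoff.
    cutoff = mu0 + NormalDist().inv_cdf(1 - alpha) * se
    # Beta is the chance that the sample mean falls below the cutoff
    # when the true mean is mu_true.
    return NormalDist(mu_true, se).cdf(cutoff)


# Hypothetical setting: H0 mean 3.0, true mean 3.3, sigma 1.5, n = 100.
betas = {
    alpha: type_ii_error(alpha, mu0=3.0, mu_true=3.3, sigma=1.5, n=100)
    for alpha in (0.10, 0.05, 0.01)
}

for alpha, beta in betas.items():
    print(f"alpha = {alpha:.2f} -> beta = {beta:.3f}")
```

With the sample size held fixed, each reduction in alpha produces a larger beta. Note that alpha and beta do not sum to 1.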
Bibliography
Ferguson, D. P. (2008). An introduction to the data editing process. National Agricultural Statistics Service.
McDaniel, C., & Gates, R. (2013). Data processing, data analysis and statistical testing. In Marketing Research Essentials (8th ed., pp. 325-363). Hoboken, New Jersey: John Wiley and Sons, Inc.
Nedarc. (2011). Hypothesis testing. Retrieved from Nedarc: http://www.nedarc.org/statisticalhelp/advancedstatisticaltopics/hypothesisTesting.html
Pink, R. (2010). 7 steps to prepare data for analysis. Inquisium Blog.
Stinson, L. L., & Fisher, S. K. (2006). Overview of data editing procedures in surveys administered by the Bureau of Labor Statistics: Procedures and implications. First International Computer-Assisted System Information Computing Conference.