Create effective surveys - Bauer Research
Wynne Chin, Professor, Decision & Information Sciences on Partial Least Squares
Creating effective survey questions is more science and less art due to improved algorithmic methodologies developed by Wynne Chin, Professor of Decision & Information Sciences at the University of Houston Bauer College of Business. His graphical software program reduces huge numbers of questions into manageable concepts and allows a line to be drawn between a concept and question to produce a score on the effectiveness of the question.
The program relies on a refined version of Partial Least Squares (PLS) technology and addresses two issues: extracting meaningful data from small samples and reducing huge numbers of questions into manageable concepts. Chin’s been sharing it with fellow academics for the past thirteen years in what he calls the “longest beta testing in history.”
Because the software is graphically driven (the user just links objects together in a drawing), all that’s needed are the questions in a regular text file, with the names of the variables and the codes for each response. Chin says, “You’re trying to see how good the questions are and can determine, for example, the main reasons why customers decide to buy a new product.” In his example, there are four concepts (Advantages, Enjoyment, Easy to Use and Decision to Use) and 28 questions.
On the computer screen the concepts appear as circles and the questions as squares. Drawing a line between any square (a question) and a circle (a concept such as Decision to Use) produces a score. The closer the score is to 1, the better the question. A bad question will get a zero or negative score.
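As a rough illustration of what such a score can look like, the sketch below correlates each question with a single component built from all the questions in a concept. It is a simplified stand-in, not the algorithm inside Chin's software: using the first principal component as the concept score is an assumption, and the function name `item_loadings` and the simulated responses are hypothetical.

```python
import numpy as np

def item_loadings(responses):
    """Correlate each question (column) with one component score.

    Assumption: the concept score is the first principal component of
    the standardized questions. PLS estimates its weights differently.
    """
    x = np.asarray(responses, dtype=float)
    z = (x - x.mean(axis=0)) / x.std(axis=0)     # standardize each question
    corr = np.corrcoef(z, rowvar=False)          # question-by-question correlations
    _, eigvecs = np.linalg.eigh(corr)            # eigenvalues come back ascending
    w = eigvecs[:, -1]                           # weights of the largest component
    if w.sum() < 0:                              # fix the arbitrary sign
        w = -w
    composite = z @ w                            # one concept score per respondent
    return np.array([np.corrcoef(z[:, j], composite)[0, 1]
                     for j in range(z.shape[1])])

# Simulated data (hypothetical): 500 respondents, three questions.
rng = np.random.default_rng(0)
concept = rng.normal(size=500)                       # unobserved concept
good_q1 = concept + rng.normal(scale=0.3, size=500)  # tracks the concept closely
good_q2 = concept + rng.normal(scale=0.3, size=500)
bad_q = rng.normal(size=500)                         # unrelated to the concept

loadings = item_loadings(np.column_stack([good_q1, good_q2, bad_q]))
```

In this simulation the two questions that track the concept should score close to 1 and the unrelated question close to zero, which mirrors how a poorly worded question shows up with a low score.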
The software allows the user simultaneously to determine how good the concepts are and how good the questions are for forming that concept and does so with a non-subjective score. It pinpoints whether the issues in a survey are useful at predicting a buy response. Models can be as complex as 400 concepts, each having more than 20 questions.
Chin’s 2003 article “A Partial Least Squares (PLS) Latent Variable Modeling Approach for Measuring Interaction Effects” appeared in Information Systems Research. It details his research for the academic audience.
PLS was created in the 1960s by Herman Wold, a Swedish economist. By 1980, rudimentary PLS software ran on mainframe computers. Chin became aware of the technique in 1982 and outlined his software in the mid-1980s. Chin says, “I used it primarily for my own research during my dissertation. For the conditions I was facing – at that point, I had a small sample size and so many cases of data points – I still wanted to make sense of the research. That’s one of the major benefits of this methodology – you can get meaningful results with limited data.” In 1992, he started releasing it without charge to researchers around the world.
“I didn’t want to send it out until it was robust enough that I wouldn’t have to keep answering emails from people trying to make sense of their data analysis or understand how to run the software,” says Chin.
There are two groups using PLS: those using a shortened version of the underlying algorithm that can handle only two concepts and those using Professor Chin’s software that can handle multiple concepts, create scores on effectiveness and provide meaningful results from small samples.
Sample size is a constant issue for researchers. Large corporations have the financial resources to gather questionnaires from large numbers of people. Academics often are limited to surveying only a small number of people because not many people are willing to take the time to complete a 100-question survey without compensation.
Since his initial sharing within his own MIS field, use has expanded to approximately a thousand researchers around the world in disparate fields, such as marketing, political science, medicine, and ecology. Some researchers are devoting book chapters to the methodology using examples from his software. A number of doctoral students at Bauer College use the technique because they have a lot of questions, a small sample size and a very complex model. Chin’s software gives them the opportunity to see how good their measures are and tie the links.
“My policy has been very simple,” says Chin. “At universities where there are only three or four users, I give it to them for free when they sign an agreement. If they wish the entire college – all their students – to use it, we ask for a simple set-up fee. I’m not in any rush to sell the software until I’m sure that it has all the right functions.”
This technique belongs to the second generation of multivariate techniques. Second generation techniques allow the same analyses as first generation techniques, but they give you the flexibility to create statistical models that are closer to the level of complexity of the data you’re trying to model.
Often the first generation techniques make certain assumptions that are constraining. One of the most constraining is that the measures being used are error free. When someone reports their age or income, it’s assumed to be accurate. The greater the degree of error in any of the questions, the less accurate the results. One factor may appear more important than another simply because of how messy the measures were.
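The effect of messy measures can be sketched with a small simulation. The example below is illustrative only, using simulated data and hypothetical variable names. It shows how random error in a predictor pulls an ordinary regression slope toward zero when the technique assumes the measure is error free.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 5000

# Simulated "true" predictor and an outcome that depends on it (slope 0.8).
true_x = rng.normal(size=n)
y = 0.8 * true_x + rng.normal(scale=0.5, size=n)

# The same predictor as it might be recorded in a survey, with random
# error whose variance equals the variance of the true value.
noisy_x = true_x + rng.normal(scale=1.0, size=n)

def ols_slope(x, y):
    """Slope of a simple least-squares regression of y on x."""
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)

slope_clean = ols_slope(true_x, y)   # estimated from the error-free measure
slope_noisy = ols_slope(noisy_x, y)  # estimated from the error-laden measure
```

Because the error variance here equals the true variance, the slope estimated from the noisy measure should come out at roughly half the slope estimated from the clean one, even though the underlying relationship has not changed.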
The key second generation aspect of PLS is that it blends two different fields: psychometrics and econometrics. Psychometrics takes a muddied area like satisfaction or attitude and factors out the errors in the metrics. Econometrics uses regression for prediction with the assumption of error-free measurement.
More specifically, the second generation combines the measurement aspects of psychometrics and the prediction aspects of econometrics. Chin’s PLS-Graph is a counterpart to the more widespread covariance-based approach. Its distinction is that it is components based: it produces actual index scores. If there are a lot of measures of satisfaction, they can be combined to create a single satisfaction score (a component), which can be very useful.
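An index score of this kind might be sketched as follows. This is an illustration under stated assumptions, not the procedure PLS-Graph uses: the weights are supplied by hand here, whereas PLS estimates them from the data, and the function name `index_score`, the weights, and the sample ratings are all hypothetical.

```python
import numpy as np

def index_score(responses, weights):
    """Combine several measures into one score per respondent.

    Assumption: the weights are given. They are rescaled to sum to 1 so
    the index stays on the same scale as the original ratings.
    """
    x = np.asarray(responses, dtype=float)
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return x @ w   # weighted average across the measures

# Three respondents answering three satisfaction questions on a 1-5 scale.
ratings = np.array([
    [5, 4, 5],
    [2, 3, 2],
    [4, 4, 3],
])
scores = index_score(ratings, weights=[0.5, 0.3, 0.2])
```

Each respondent ends up with a single satisfaction number that can then be used like any other variable in a predictive model.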
A few marketing consulting firms have their own proprietary versions of PLS but many more don’t know the technology. Chin receives questions from businesses as they try to read the scholarly literature in this area. “I’ve set them up with a short-term version and they give me feedback. Some are academics who went into the corporate world. I’ve trained a number of firms including some Fortune 500 companies,” he notes.
The tide’s shifting from covariance-based software to PLS-based software simply because of the reality of the data we have. It’s not just about the sample size. It’s about the different degrees of messiness, such as whether the data looks like a bell curve or not. PLS handles non-normal data better.
Chin says, “In the last three to five years I’ve seen an exponential demand for the software. As soon as one person shows another, they want it. Requests are reaching that level I dread. I guess it’s the academic equivalent of an actor’s fame in Hollywood. I’m glad more people are beginning to use it, but it means I spend two hours a night typing away at over one-hundred emails.”
The software can be very helpful if you:
- work with theoretical models that involve latent constructs
- have problems with variables that tap into the same issues
- want to account for measurement error
- have non-normal data
- have a small sample set
- wish to determine whether the measures you developed are valid and reliable within the context of the theory you are working in
- have formative as well as reflective measures
The PLS-Graph program can handle up to 400 indicators. Models with 50 to 100 are estimated in a matter of seconds. Other PLS versions exist but they run on the old DOS format mainframe. Chin’s software is purely graphical: just link it and draw.
If you’d like more information about Professor Wynne Chin’s graphical software program, contact him at wchin@uh.edu.
_______________________________________________
A Sample:
The screen print shows four concepts (circles) and three questions (squares) that relate to one concept, Decision to Use. The other concepts are Advantages, Easy to Use and Enjoyment. Drawing a line from each of the other circles to Decision to Use creates a pop-up score. The closer the score is to 1, the better. In this case, scores close to 1 indicate the main reasons for deciding to use the product.
You can see that Advantages has a score five times higher than Enjoyment of Software. The users may enjoy the product but benefits are much more influential in making them use it. This helps the researcher know what the customer has to be sold on and where to improve the product.
Additionally, each square has a number score that tells how good the question is. If a question has a low score then it might get thrown out. The software identifies whether the questions are effective at predicting a buy response.
One of the best examples is a set of questions about Voluntariness of using a certain voice mail system. This survey wants to determine if employees use voice mail voluntarily or because superiors require it. Before any line is drawn from question to concept (Voluntariness), a score already appears. When a line is drawn between each question and the concept, the score changes based on the effectiveness of the question.
A conjunction in the question muddies the response as shown by the zero score. The question asked “Does your boss expect or require you to use the voice mail?” We respond differently to “expect” and “require.” It may take two sets of questions – one about “expect” and one about “require.” This is how the software simultaneously determines how good the concepts are and how good the questions are for forming that concept.
Wynne Chin joined the faculty at the University of Houston in the fall of 1997. He received his doctorate in Computers and Information Systems from the University of Michigan, an M.S. in chemical engineering (biomedical option) from Northwestern University, an MBA from the University of Michigan, and a bachelor's degree in biophysics from U.C. Berkeley. He has published in journals such as MIS Quarterly, Information Systems Research, and Decision Sciences. Dr. Chin's substantive interests include modeling the individual IT adoption process, end-user satisfaction, and developing group process measures such as cohesion, satisfaction, and consensus to understand the impact of electronic meeting systems. More recently, he has begun work on cross-cultural analysis. His research is largely empirical and quantitative, relying on lab and Monte Carlo experiments as well as surveys. Methodologically, Dr. Chin focuses on construct development through the use of structural equation modeling (both covariance-based and partial least squares) as well as developing new causal modeling techniques for topics such as assessing interaction effects.




