
Mixed Methods

The Journal of Mixed Methods Research defines mixed methods as “research in which the investigator collects and analyses data, integrates the findings, and draws inferences using both qualitative and quantitative approaches or methods in a single study or program of inquiry”.

In many texts on mixed methods, this type of research is presented as a way to make peace between two “adversaries”: the supporters of quantitative research and the supporters of qualitative research. The argument is that during the last century these “adversaries” engaged in a so-called “paradigm war”. On one side are the quantitative purists, whose assumptions about research are in line with what we often label positivist philosophy: social observations should be treated as entities in much the same way that physical scientists treat physical phenomena, and the observer is separate from the entities that are subject to observation (Johnson and Onwuegbuzie, 2004). Here, any scientific inquiry should be objective, with the aim of making time- and context-free generalizations, where the real causes of scientific outcomes can be determined reliably and validly (Gulbrandsen, 2012, p. 48). On the other side are the qualitative purists, who reject positivism and argue for a range of alternatives, such as constructivism, idealism, relativism, humanism, hermeneutics, or postmodernism. Though the anti-positivists differ among themselves in many respects, they all argue for the existence of multiple, constructed realities, as opposed to the singular reality of positivism. As such, they all argue that the observer and the observed cannot be separated, because the (subjective) observer is the only source of the ‘reality’ that is to be observed (Guba, 1990). Beyond this, they also share the stance that time- and context-free generalizations are neither desirable nor possible and that research is value-bound, which makes it impossible to differentiate causes and effects (Johnson and Onwuegbuzie, 2004).

During the 1990s, a growing number of scholars started pointing out the inadequacy of the strict quantitative-qualitative division, arguing that the so-called “incompatibility thesis” (the claim that qualitative and quantitative research paradigms cannot and should not be mixed) (Howe, 1988) is faulty. Instead, these scholars argued, there should be a third way, and they started promoting mixed methods research as a new research paradigm that could point in this direction. In particular, they argued that although the two paradigms often portray themselves as opposites, they actually agree on several basic points (Phillips and Burbules, 2000): both use empirical data to address research questions, both aim to minimize confirmation bias and invalidity, and both attempt to provide justifiable claims about human activities and the environments in which they unfold. The middle road, then, according to Johnson and Onwuegbuzie (2004), is to acknowledge that what appears objective can vary across individuals, because what we observe is affected by our background knowledge, theories and experiences. Observation is, in other words, not a direct window into “reality”, and will thus not provide final proof. But this does not mean that all is relative; rather, what we obtain is probabilistic evidence.

So, why use mixed methods? In short, because it allows you to overcome the shortcomings of the individual methods (qualitative and quantitative) and to break down the confines of traditional perspectives (Gulbrandsen, 2012, p. 48). First and foremost, by mixing methods you are more likely to avoid the limitations of purely quantitative or purely qualitative studies. Quantitative studies are often criticized for not including context and for not giving participants a voice, while qualitative studies are often discounted for potential researcher bias, smaller sample sizes, and lack of generalizability (Miller et al., 2011). Mixed methods can include context and participants’ voices while still aiming for neutrality and generalizability. Second, mixed methods research makes triangulation possible (i.e. seeking convergence and confirmation of results from different methods studying the same phenomenon), and it allows the findings from one method to inform how the other is used.


How to use mixed methods? Well, there are two basic approaches: concurrent or sequential. The first implies that you conduct both the qualitative and the quantitative research simultaneously. The second implies that you first conduct one (e.g. quantitative) and then, based on the findings from the first, conduct the second (e.g. qualitative).
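To make the sequential approach a little more concrete, here is a minimal Python sketch of one way a quantitative phase could inform a qualitative one: respondents with unusually high or low survey scores are shortlisted for follow-up interviews. The respondent IDs, the scores, and the cutoff of one standard deviation are all invented for illustration, and picking extreme cases is only one of several possible sampling strategies.

```python
from statistics import mean, pstdev


def select_for_interviews(scores, threshold=1.0):
    """Return the IDs of respondents whose score lies more than `threshold`
    standard deviations from the mean (a hypothetical selection rule)."""
    values = list(scores.values())
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # no variation, so there are no extreme cases to follow up
    return sorted(
        rid for rid, score in scores.items()
        if abs(score - mu) / sigma > threshold
    )


# Phase 1 (quantitative): made-up survey scores on a 1-10 scale.
survey_scores = {"r1": 5, "r2": 6, "r3": 5, "r4": 9, "r5": 1, "r6": 4}

# Phase 2 (qualitative): interview the respondents flagged in phase 1.
interviewees = select_for_interviews(survey_scores)
print(interviewees)
```

In a concurrent design there would be no such hand-over step: both strands would be collected at the same time and only compared afterwards.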

In a review of the field of mixed methods, Tashakkori and Creswell (2007, p. 208) found that there are three dominant ways of framing research questions in mixed methods research.

  1. Here researchers create separate quantitative and qualitative questions, followed by an explicit mixed methods question. For example, if a study involves concurrent quantitative and qualitative data collection, this type of mixed question could ask, “Do the quantitative results and the qualitative findings converge?”. If a study is more sequential, the question might be “How do the follow-up qualitative findings help explain the initial quantitative results?” or “How do qualitative results explain (expand on) the experimental outcomes?”.
  2. Here researchers create an overarching mixed research question, which is then later broken down into separate quantitative and qualitative subquestions to answer in each strand or phase of the study. This is more frequent in concurrent studies than in sequential ones. Although this overarching question might be implicitly present, sometimes it is not explicitly stated. An example is Parmelee, Perkins, and Sayre’s (2007) study exploring “how and why the political ads of the 2004 presidential candidates failed to engage young adults”. The authors followed this implicitly stated question with three specific subquestions: “How does the interaction between audience-level and media-based framing contribute to college students’ interpretations of the messages found in political advertising?”, “To what extent do those interpretations match the framing found in the ads from the 2004 U.S. presidential election?” and “How can political ads be framed to better engage college students?”. As another example, in a concurrent design, a mixed methods question might be “What are the effects of Treatment X on the behaviors and perceptions of Groups A and B?”. Consequently, the component questions that are drawn from the overarching mixed question might be “Are Groups A and B different on Variables Y and Z?” (the quantitative strand) and “What are the perceptions and constructions of participants in groups A and B regarding treatment X?” (the qualitative strand).
  3. Here researchers create research questions for each phase of a study as the study evolves. If the first phase is a quantitative phase, the question would be framed as a quantitative question or hypothesis. If the second phase is qualitative, the question for that phase would be framed as a qualitative research question. This is found in sequential studies more than in concurrent studies.


Encyclopædia Britannica defines data mining as “knowledge discovery [through] the process of discovering interesting and useful patterns and relationships in large volumes of data. The field combines tools from statistics and artificial intelligence with database management to analyse large digital collections, known as data sets.” Put slightly differently, the term describes the process of extracting insights and knowledge from large data sets, for example from information that is available on, and extracted (mined) from, social media platforms.
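As a toy illustration of what “discovering patterns and relationships” can mean in practice, the following Python sketch counts which hashtags tend to appear together in the same post. It is simple co-occurrence counting, a very reduced cousin of frequent-pattern mining, and not a method taken from any of the sources mentioned here. The posts, the hashtags, and the `min_support` cutoff are made up for the example; real social media data would first have to be collected and cleaned.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(posts, min_support=2):
    """Count how often each pair of hashtags appears in the same post and
    keep the pairs seen in at least `min_support` posts."""
    counts = Counter()
    for tags in posts:
        # Deduplicate and sort so ("a", "b") and ("b", "a") count as one pair.
        for pair in combinations(sorted(set(tags)), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}


# Made-up posts, each reduced to the hashtags it contains.
posts = [
    ["election", "debate", "youth"],
    ["election", "debate"],
    ["youth", "climate"],
    ["election", "youth"],
]

print(frequent_pairs(posts))
```

With only four posts the result is trivial, but the same counting logic is what turns a large collection of posts into a short list of recurring combinations worth a closer look.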

We will soon provide you with more on this topic, but until then, check out this introductory article by Chen et al. (1996). Also check out SIGKDD, the Association for Computing Machinery’s Special Interest Group on Knowledge Discovery and Data Mining, and its publications on data mining.