
Aggregated Approaches to Identifying Community and its Constituent Elements in Formal Blended Learning Environments

Richard A. Schwier
Ben K. Daniel
University of Saskatchewan


This paper describes a variety of methods, techniques and procedures combined to determine
whether an online learning community exists, to isolate its constituent characteristics and understand
interactions among them, and to build a dynamic model of an online learning community. It presents an analysis
of data from a three-year research program on virtual learning communities, including user perceptions of
community (sense of community indices); interaction analysis (density, intensity, reciprocity); content
analysis (transcripts, interviews, and focus groups); paired-comparison analysis (Thurstone scaling); and
community modeling techniques (Bayesian Belief Network analysis).


This paper describes a set of approaches used to measure and understand the characteristics of community
discussed in greater detail in Schwier and Daniel (2007). The categories of analysis included identifying a sense of
community, isolating characteristics of community, comparing characteristics of community, and building a
dynamic Bayesian Belief Network model of a community. The procedures of the analysis involved eliciting
summative judgments by participants, identifying the variables constituting a virtual learning community, and making
paired comparisons to determine the relative importance of the various characteristics. Finally, we used the data to
construct a computational model, one that not only represents the interrelationships among variables
but can also be used to project the effect on the community when the values of its constituent elements change as
a result of new evidence. This paper presents, in general terms, the key results of the studies undertaken. For a detailed
description of the studies, the methods employed, and the findings, see Schwier and Daniel (2007).

Research Context

Our analyses draw on data generated over three years of online communication among groups of graduate
students in Educational Communications and Technology as they participated in seminars on the foundations of
educational technology and instructional design. Each offering of the classes spanned an entire semester or academic
year. The classes were small graduate seminars with enrolments from six to thirteen students, and each class met
primarily online, but with monthly group meetings. While most students were able to attend the group meetings
regularly, every class cohort had members who participated exclusively or mostly from a distance. Given the
blended nature of all of the classes, we confine our conclusions to similar environments, and emphasize that these
results cannot be generalized to environments that are entirely online, or entirely face-to-face. However, the methods
employed in the studies can be extended to study similar learning environments. The measures employed to determine
whether a community existed among the students included content analysis, focus groups, and interviews to ascertain
a sense of community index and intensity of interaction; paired comparisons using Thurstone scaling to determine
the relative importance of the identified characteristics of virtual learning communities; and a Bayesian Belief Network
for building a dynamic computational model, with evidence-based scenarios to query and update the model.

Sense of Community Indices

In a gross sense, to determine whether a community existed, we employed the "Sense of Community Index
(SCI)" (Chavis, undated), a classic instrument employed broadly in the field of community psychology (Chavis &
Wandersman, 1990; Chipuer & Pretty, 1999; Obst & White, 2004). The SCI measures
an individual's psychological sense of community. We administered the SCI at the beginning and end of a year-long
course, and ran a simple t-test on the data to see if there was any change in measures of the group's sense of
community by the end of the course. The t-test results suggested significant positive growth in the sense of
community scores from the beginning to the end of the course (p < .01). Given that the reliability of the SCI is often
questioned despite its long use, we anticipated using the Classroom Community Scale (CCS) proposed by Rovai
and Jordan (2004). The CCS is similar in format and intent to the SCI, but it boasts a higher
reliability estimate for the full scale (Cronbach's alpha = .93) and the subscales (connectedness = .92; learning =
.87), partially attributable to the higher number of items on the scale.
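As a minimal sketch of the pre/post comparison described above, the following computes a paired-samples t statistic using only the standard library. The text says only that a "simple t-test" was run, so the paired form is an assumption, and the scores below are hypothetical placeholders rather than data from the study.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(before, after):
    """Return (t, degrees of freedom) for matched before/after scores."""
    if len(before) != len(after):
        raise ValueError("each respondent needs a score at both time points")
    diffs = [post - pre for pre, post in zip(before, after)]
    n = len(diffs)
    # t is the mean difference divided by the standard error of that mean.
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1


# Hypothetical SCI totals for five respondents (NOT data from the study).
sci_start = [6, 7, 5, 8, 6]
sci_end = [7, 9, 8, 10, 8]

t_stat, df = paired_t(sci_start, sci_end)
print(f"t({df}) = {t_stat:.3f}")
```

The resulting statistic would then be compared against a t distribution with the stated degrees of freedom to obtain a p value.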

Density and Intensity of Interaction

Fahy, Crawford and Ally (2001) proposed several useful measures for describing interaction, collectively called the
Transcript Analysis Tool (TAT). The TAT includes methods of measuring density, intensity and persistence of
interactions in transcripts of online discussions. We drew on their recommendations and extended some of them to
analyze interactions in our data, particularly transcripts of asynchronous discussions. We explored how individuals
in the class were connected to each other using density as a measurement proxy. Fahy, Crawford and Ally (2001)
defined density as "the ratio of the actual number of connections observed, to the total potential number of possible
connections." It is given by the formula Density = 2a / (N(N - 1)), where "a" is the number of observed interactions
between participants, and "N" is the total number of participants. Density is a measure of how connected
individuals are to others in a group, and the idea is that a higher degree of connection is a positive indicator of community.
For our own calculations, we included only peripheral (voluntary or additional) communications between
people by eliminating all instances of required postings and responses. We felt that peripheral interaction would
provide a stronger measure of community, given that required communications among students might inflate the
actual density value. In the case of one of our groups, we found a density ratio of .78, suggesting that 78% of
the possible connections were made: with N = 13, Density = 122/(13 × 12) = 122/156 = .782 (the reported value is
obtained when the numerator, 2a in the formula, equals 122). Although there are no baseline data to make
judgments about the existence of community, this level of density did seem to suggest a strong level of connection
among participants.
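The density formula can be sketched in a few lines. The function follows the formula as quoted from Fahy, Crawford and Ally (2001); the example call assumes a = 61 (so that 2a = 122) with N = 13, which is one reading that reproduces the reported ratio of .782, not a figure stated in the text.

```python
def density(observed_connections: int, participants: int) -> float:
    """Density = 2a / (N(N - 1)): observed over potential connections."""
    if participants < 2:
        raise ValueError("density needs at least two participants")
    return 2 * observed_connections / (participants * (participants - 1))


# Assumption: a = 61 and N = 13, chosen because 2a = 122 reproduces .782.
group_density = density(61, 13)
print(round(group_density, 3))
```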
Fahy, Crawford and Ally also recommend considering measures of intensity to determine whether participants
are authentically engaged with each other, not merely carrying out their responsibilities in a course. They argue that
it is a useful measure of involvement because it involves measures of persistence and dedication to being connected
to others in the group. One measure of intensity is "levels of participation," or the degree to which the number of
postings observed in a group exceeds the number of required postings. In this case, students were required to make
490 postings as part of the course requirements, and they actually made 858 postings, yielding a level of
participation ratio of 1.75. Another important measure for the purpose of understanding community was Fahy,
Crawford and Ally's "S-R ratio", a formula to measure the parity of communication among participants. We referred
to this as a measure of "reciprocity", and we felt that truly engaged groups who form communities will exhibit high
degrees of reciprocity. Details of our analysis are reported in Schwier and Daniel (2007).
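The level of participation reported above is a simple ratio of actual to required postings. A minimal sketch, using the two counts given in the text (the function name is illustrative):

```python
def participation_level(actual_postings: int, required_postings: int) -> float:
    """Ratio of postings made to postings required; above 1 means extra effort."""
    if required_postings <= 0:
        raise ValueError("required_postings must be positive")
    return actual_postings / required_postings


# Counts reported in the text: 858 postings made against 490 required.
level = participation_level(858, 490)
print(round(level, 2))
```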

Characteristics of a Model of a Virtual Learning Community

Building on a previous model of virtual learning community (see Schwier, 2001), we identified fourteen
characteristics of community that grew out of the theoretical model, from the analysis of interactions among
participants, from a content analysis of transcripts of communication among community participants, and from
interviews and focus groups. While the process to this point was disciplined at each step, the intention was to draw
out characteristics that might be important in formal virtual learning communities; the purpose was not to validate or
compare the relative significance of any of the characteristics. The next step in the process was to try to determine
the relative importance of the characteristics that were drawn from these various sources. We had a good sense of
many of the characteristics that made up the communities we observed, but we did not have any reliable
information about which characteristics were important, which were trivial, and which might be more important than
others. Figure 1 presents the characteristics observed (for the definitions of the variables, see Schwier & Daniel,

Figure 1. Characteristics of a virtual learning community (Schwier, in press)

To address this question, we developed a paired-comparison treatment that asked participants to compare each
characteristic of a VLC to every other characteristic and choose the characteristic they believed was more important
to the community. Twenty-three students who had completed their coursework volunteered to participate in the
study. The fourteen characteristics were compared against each other, resulting in 91 paired-comparisons in the
treatment. Authorware Professional™ was used to develop the treatment, and the treatment was administered on
Windows-based PC workstations. In the design of the treatment, care was taken to avoid response bias and
contamination from fatigue by presenting each pair in random order and by alternating the upper-lower orientation
of each characteristic in relation to the characteristic against which it was being compared. After completing the
comparisons, participants were asked to describe how they made their decisions generally, and if there were factors
that influenced their decisions.
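The structure of the paired-comparison treatment can be sketched as follows. The treatment itself was built in Authorware, so this is only an illustration of the design: every unordered pair appears once, presentation order is shuffled, and upper/lower placement alternates. Alternating placement by trial position is a simplifying assumption, and the function and field names are hypothetical.

```python
import random
from itertools import combinations

CHARACTERISTICS = [
    "Trust", "Learning", "Participation", "Mutuality", "Intensity",
    "Social Protocols", "Reflection", "Autonomy", "Awareness", "Identity",
    "Future", "Technology", "Historicity", "Plurality",
]


def build_trials(items, seed=None):
    """Return shuffled paired-comparison trials with alternating placement."""
    rng = random.Random(seed)
    pairs = list(combinations(items, 2))  # each unordered pair exactly once
    rng.shuffle(pairs)                    # random presentation order
    trials = []
    for position, (first, second) in enumerate(pairs):
        # Swap upper/lower placement on every second trial.
        if position % 2:
            first, second = second, first
        trials.append({"upper": first, "lower": second})
    return trials


trials = build_trials(CHARACTERISTICS, seed=1)
print(len(trials))  # 91 comparisons for 14 characteristics
```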

Thurstone Analysis

In order to discriminate among the variables, Schwier & Daniel (2007) developed a paired-comparison
treatment that required participants to compare each characteristic of a VLC to every other characteristic and choose
the characteristic they believed was more important to the community. This was based on Thurstone's method of
paired comparisons, a method of analysis that generates a scale ranking and scale points among variables that can be
used to plot a visual representation of distances between and among variables.
Thurstone (1927) postulated that for each of the items being compared and among all subjects, a preference
will exist, and that for each item the preference will be distributed normally around that item's most frequent or
modal response. A person's preference for each item versus every other item is obtained, and the more people that
select one item of a pair over the other item, the greater the preference for, or perceived importance of, that item, and
thus the greater its scale weight. Thurstone's Law of Comparative Judgment circumvents potential ceiling effect
problems by forcing individuals to rank items two at a time rather than all at once (Manitoba Centre for Health
Policy, 2005). Given the results of all possible paired comparisons of the variables under study, scale values can be
plotted on a line to provide a graphic illustration of the relative value of each variable, represented by its relative
distance from the other variables (the greater the distance between any two variables on the scale, the greater the
differences between those two variables).
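To make the scaling step concrete, here is a minimal sketch of Thurstone's Case V solution: preference proportions are converted to normal deviates and averaged per item. Case V is an assumption here, since the study followed procedures from Misanchuk (1988) that are not detailed in this paper, and the three items and vote counts are hypothetical.

```python
from statistics import NormalDist, mean


def thurstone_case_v(items, wins, judges):
    """Scale values from pairwise counts; wins[(a, b)] = judges preferring a to b."""
    inverse_normal = NormalDist().inv_cdf
    scale = {}
    for a in items:
        z_scores = []
        for b in items:
            if a == b:
                z_scores.append(0.0)  # an item is never preferred to itself
                continue
            count = wins[(a, b)] if (a, b) in wins else judges - wins[(b, a)]
            # Clip unanimous results so the inverse normal stays finite.
            p = min(max(count / judges, 0.5 / judges), 1 - 0.5 / judges)
            z_scores.append(inverse_normal(p))
        scale[a] = mean(z_scores)
    return scale


# Hypothetical counts from 20 judges (NOT data from the study).
example_wins = {("A", "B"): 15, ("A", "C"): 18, ("B", "C"): 14}
scale = thurstone_case_v(["A", "B", "C"], example_wins, judges=20)
for item, value in sorted(scale.items(), key=lambda kv: -kv[1]):
    print(f"{item}: {value:+.3f}")
```

Because each deviate for a pair is mirrored by its negative, the scale values are centred on zero, which matches the way the scale points in this paper spread around a mean near zero.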
The scale is descriptive, and there are no post-hoc tests available to identify significant differences among
variables. But the scale values provide a convenient metric for assigning initial weights to variables in modeling
exercises. In the study of the fundamental variables of virtual learning communities, Schwier and Daniel (2007)
compared each VLC characteristic with the others, following procedures outlined by Misanchuk (1988). The data
were then converted into a line drawing that depicted differences between elements along a line. Greater differences
were shown spatially as larger distances between points on the line. The outcome of the comparison and the ranking
of the variables are shown in Table 1.

Table 1. Thurstone Scale Rankings and Scale Points for Each of the Fourteen VLC Characteristics.

Characteristic       Thurstone Scale Ranking    Thurstone Scale Point

Trust                         1                        0.7341
Learning                      2                        0.5806
Participation                 3                        0.3182
Mutuality                     4                        0.2671
Intensity                     5                        0.2425
Social Protocols              6                        0.1852
Reflection                    7                        0.1523
Autonomy                      8                        0.0155
Awareness                     9                       -0.0785
Identity                     10                       -0.1939
Future                       11                       -0.2474
Technology                   12                       -0.5033
Historicity                  13                       -0.7309
Plurality                    14                       -0.7701

As a result of this analysis, we were able to obtain measures that could be used to understand the association
and interplay of community characteristics in a VLC, and we could also use the Thurstone Scale points to assign
weights to these characteristics when we attempted to construct a dynamic model of virtual learning communities.
Reviewing the results, it is apparent that there are at least three clusters of characteristics. Trust and learning were
considered by the participants to be the most important characteristics of a VLC. A large cluster of characteristics
gathered around the mean scale point, and while they differed from each other, we treated them as a group because
of their central position relative to the other points. Technology, historicity and plurality were ascribed much lower
status than the other characteristics, and one might argue as a result that they should be eliminated from the model
entirely. After reviewing comments, it was apparent that even those characteristics that were positioned at the low
end of the Thurstone scale still had a role to play in the construction of community, however marginal that influence
might be.
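One way to see the three clusters in the Table 1 scale points is to cut the ranked list at its two widest gaps. This gap heuristic is an illustrative assumption, not the procedure the authors describe (they grouped the characteristics by inspection), but it happens to yield the same grouping.

```python
SCALE_POINTS = [
    ("Trust", 0.7341), ("Learning", 0.5806), ("Participation", 0.3182),
    ("Mutuality", 0.2671), ("Intensity", 0.2425), ("Social Protocols", 0.1852),
    ("Reflection", 0.1523), ("Autonomy", 0.0155), ("Awareness", -0.0785),
    ("Identity", -0.1939), ("Future", -0.2474), ("Technology", -0.5033),
    ("Historicity", -0.7309), ("Plurality", -0.7701),
]


def split_at_largest_gaps(ranked, n_clusters):
    """Cut a descending ranked list at its (n_clusters - 1) widest gaps."""
    gaps = [(ranked[i][1] - ranked[i + 1][1], i) for i in range(len(ranked) - 1)]
    cuts = sorted(i for _, i in sorted(gaps, reverse=True)[: n_clusters - 1])
    clusters, start = [], 0
    for cut in cuts:
        clusters.append([name for name, _ in ranked[start:cut + 1]])
        start = cut + 1
    clusters.append([name for name, _ in ranked[start:]])
    return clusters


clusters = split_at_largest_gaps(SCALE_POINTS, 3)
for cluster in clusters:
    print(cluster)
```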
We were also reluctant to eliminate characteristics at this point in the research because we are still gathering
primary data from new groups. Our confidence in the relative positions of these characteristics, and ultimately our
judgments about their inclusion in a model of VLC, will grow as our analysis continues. At what point will we be
satisfied that we've identified the important characteristics and measured their relative importance? Probably never,
given that VLCs are dynamic environments that are also situated in particular learning contexts. But we will
continue to gather data to develop and refine models, and our tools and the sophistication of our observations will
mature over time too.

Building Models of Online Learning Communities

Bayesian networks, Bayesian models or Bayesian belief networks (BBNs) can be classified as part of the
probabilistic graphical model family. Graphical models provide an elegant and mathematically sound approach to
represent uncertainty. The BBN approach combines advances in graph theory and probability. BBNs are graphs
composed of nodes and directional arrows (Pearl, 1988). Nodes in BBNs represent variables, and directed edges
(arrows) between pairs of nodes indicate relationships between variables. The nodes in a BBN are usually drawn as
circles or ovals. Further, BBNs offer a mathematically rigorous way to model a complex environment that is
flexible, able to mature as knowledge about the system grows, and computationally efficient (Druzdzel & Gaag,
2000; Russell & Norvig, 1995).
Bayesian networks can be used to model probabilistic relationships among variables. In some cases, their
graphical structure can be loosely interpreted as the result of direct causal dependencies between variables. In
domains with many causal relations, such as medical diagnosis (diseases cause symptoms), human experts are
usually able to express their domain knowledge in the graphical structure of the network. For example, in a model
for medical diagnosis, the parameters of the network are the conditional probabilities of effects given the state of
their direct causes.
Bayesian Belief Network (BBN) techniques are increasingly being used for understanding and simulating
computational models of complex social systems. BBN models enable reasoning when there is uncertainty (Pearl,
1998). They combine the advantages of an intuitive visual representation with a sound mathematical basis in
Bayesian probability. The motivation to build the Bayesian model of a virtual learning community is to be able to
perform a number of simulations and observe the influence of variables in the model with the goal of determining
and understanding the relationships among these variables that are critical to learning in communities.
The first step in building a BBN is to identify key variables that represent a domain (Druzdzel & Gaag, 2000;
Pearl, 1988; Russell & Norvig, 1995). The variables identified in our model are drawn from an analysis of online
transcripts, interviews and email traffic that were subjected to grounded theory analysis, and the identified variables
were then subjected to a Thurstone analysis to identify their relative weights. Once the variables
were identified, the second step involved mapping them into a graph (see Figure 2) based upon coherent
qualitative reasoning.
Druzdzel and Henrion (1993) proposed a transformation of a causal Bayesian network into a qualitative
probabilistic network (QPN), in which the relation between two adjacent nodes is denoted as positive (+), negative (-),
null (0) or unknown (?); there are also relations that involve more than two nodes, such as positive or negative synergies.
The main advantage of QPNs is that they simplify the construction of models, because they do not require the elicitation
of numerical parameters; as a consequence, their main disadvantage is the lack of precision in the results, especially
because very often the combination of "positive" and "negative" influences leads to "unknown" relations. The motivation
for this approach is based on the fact that people usually reason in qualitative terms.
In our case, we used the Thurstone analysis as a starting point to identify relative positions of variables of
virtual communities, and we then used qualitative reasoning to subjectively identify those variables that are of
interest and influence in the model and isolate those that are less likely to have an impact on the overall performance
of the model. We caution the reader that our reasoning is based on our teaching and research experience into virtual
learning environments, and it may contain epistemological, contextual and personal bias. However, the initial
precision of the relationships among variables is less important to developing a model than is the identification of
key variables that was accomplished by using the grounded theory approach mentioned earlier. Precision is built by
tuning the model and observing how variables interact over time and across contexts in the BBN. In other words, a
BBN is built iteratively, and as the number of iterations increases, the model is tuned to render an increasingly
accurate network of relationships among key variables.
In this study, we used qualitative reasoning to infer causal relationships among the variables identified in the
study, resulting in relationships among variables that could be charted. For instance, one can qualitatively and
inductively reason that in virtual learning communities, participation and learning are variables whose
interaction is mediated by technology as another variable (i.e., it is hard to imagine learning online
without any participation, and equally, participation is often mediated by technology); therefore, technology is
assigned to be a parent of participation. Similarly, participation can influence awareness in various ways, which in
turn can lead to the development of trusting relationships. Since awareness can contribute to trust and distrust, trust
is set to be a child of awareness.
Furthermore, one can reason that technology influences awareness in different ways. For example, imagine a
learning environment in which each individual has a profile (electronic portfolio) and the information is made
available to others in the community; this can create sense of awareness about who is who, or who knows what, in
that community. Similarly, technology may influence intensity in a weak positive manner. For example, poor
technology might have negative outcomes on engagement. In other words, people might not be willing to use
technology that does not work well for them, or that they find awkward to use.
Extending this type of qualitative reasoning resulted in the BBN shown in Figure 2. In the model, those nodes
that contribute to causality align themselves in “parent” to “child” relationships, where parent nodes are causes and
child nodes are effects. For example, trust is the child of mutuality, awareness and intensity, which are in turn
children of participation and technology (see Figure 2). The criterion for determining causality among the variables
is a reflection of our qualitative reasoning process (soft data), which can be validated using empirical evidence (hard
data).
Figure 2. BBN representation of relationships among virtual learning community variables.
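The parent and child relationships described in the text can be written down as a small directed graph and checked for cycles. This sketch encodes only the edges stated in the prose; the full network in Figure 2 contains more variables, so the dictionary below is a partial, assumed reading rather than the authors' complete model.

```python
from graphlib import TopologicalSorter

# PARENTS[child] = set of parent nodes, taken from the prose description only.
PARENTS = {
    "technology": set(),
    "participation": {"technology"},
    "mutuality": {"participation", "technology"},
    "awareness": {"participation", "technology"},
    "intensity": {"participation", "technology"},
    "trust": {"mutuality", "awareness", "intensity"},
}

# static_order() lists parents before children and raises CycleError if the
# graph is not acyclic, which a Bayesian network structure must be.
order = list(TopologicalSorter(PARENTS).static_order())
print(order)
```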

The third step in building the model involved assigning initial probabilities to the network. In general, BBN
initial probabilities can be obtained from domain experts, secondary statistics or they can be taken from observations
and subjective intuition. It is also possible that initial probabilities can be learned from raw data. In addition to
learning prior probabilities, it is sometimes necessary to examine the structure of the network. In our case, the initial
probabilities were obtained using approaches discussed in Daniel, Zapata-Rivera and McCalla (2003), and the
structure and the degree and strength of influence among the variables were determined by examining the distances
between the variables of virtual learning communities along the Thurstone scale. This approach enabled us to
cluster those variables that were closely aligned on the Thurstone scale and use weighted threshold values (Daniel,
McCalla, & Schwier, 2005) to generate the conditional probability table. The relationships and the degree of
influence among the variables were further described qualitatively. In the results of Thurstone scaling, those
variables that were observed to cluster around the mean scale point were given a high degree of influence.

Generating the Conditional Probability Values

The initial conditional probabilities were generated by examining qualitative descriptions of the influence
between two or more variables and the strength of their relationships in the model (Daniel, Zapata-Rivera, McCalla,
2003; Daniel, McCalla, & Schwier, 2005). Each probability describes the strength of a relationship. The
various degrees of influence among variables are represented in the model by the letters S (strong), M (medium),
and W (weak). The signs + and - represent positive and negative relationships. The elicitation of the initial
probabilities followed the approach discussed in Daniel, Zapata-Rivera and McCalla
(2003), but the strengths of the relationships and the influence of each variable were based on the results of the
Thurstone scaling and the relative position of each variable along the scale. For instance, technology was ranked
near the bottom of the scale (12th of 14), so it carries a threshold probability value of 0.6 and was assigned the
symbol for a weak positive influence (W+). The
sign means that there is some kind of influence, but because of its low ranking along the scale, the influence is a
weak one.
The probability values were obtained by adding weights to the values of the variables depending on the number
of parents and the strength of the relationship between particular parents and children. For example, if there are
positive relationships between two variables, the weights are determined by
establishing a threshold value associated with each degree of influence. The threshold values correspond to the
highest probability value that a child could reach under a certain degree of influence from its parents. For example, assuming
that Participation and Technology have positive and strong relationships with Awareness, evidence of good
technology and high participation will result in a conditional probability value of 0.98 for Awareness = Exists.
The weight contributed by each parent is obtained by subtracting a base value (1 / number of parents; 0.5 in this
case, with two parents) from the
threshold value associated with the degree of influence (i.e., threshold value for strong = 0.98) and dividing the result
by the number of parents (i.e., (0.98 - 0.5) / 2 = 0.24). Table 2 lists the threshold values and weights used in this
example. The value α = 0.02 leaves some room for uncertainty when considering evidence coming from positive
and strong relationships.

Table 2. Threshold values and weights with two parents

Degree of influence    Thresholds                   Weights

Strong                 1 - α = 1 - 0.02 = 0.98      (0.98 - 0.5) / 2 = 0.48 / 2 = 0.24
Medium                 0.8                          (0.8 - 0.5) / 2 = 0.3 / 2 = 0.15
Weak                   0.6                          (0.6 - 0.5) / 2 = 0.1 / 2 = 0.05

Daniel, Zapata-Rivera and McCalla (2003)
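The weights in Table 2 follow directly from the thresholds. A minimal sketch of that arithmetic, using the base value as the text defines it (1 / number of parents); the function and constant names are illustrative:

```python
ALPHA = 0.02
THRESHOLDS = {"strong": 1 - ALPHA, "medium": 0.8, "weak": 0.6}


def influence_weight(threshold: float, n_parents: int) -> float:
    """Weight one parent adds: (threshold - base) / number of parents."""
    base = 1 / n_parents  # base value as defined in the text (0.5 for two parents)
    return (threshold - base) / n_parents


weights = {
    degree: round(influence_weight(threshold, n_parents=2), 2)
    for degree, threshold in THRESHOLDS.items()
}
print(weights)
```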

This assumes that participation and technology have positive strong relationships with awareness and there is
evidence of positive participation and technology in a particular community. Given these assumptions, weights will
be added to the conditional probability table of awareness every time participation = high or technology = good. For
example, the conditional probability value associated with awareness given that there is evidence of participation =
high and technology = good is 0.98. This value is obtained by adding to the base value the weights associated with
participation and technology (0.24 each). Table 3 shows a complete conditional probability table for this example.

Table 3. An example of conditional probability table for two parents with strong, positive relationships

Participation                 High             Low

Technology                    Good    Bad      Good    Bad
Awareness Exists              0.98    0.74     0.74    0.5
Awareness Does Not Exist      0.02    0.26     0.26    0.5

The calculations for the various states of the relationships among the three variables (awareness, participation and
technology), and the corresponding values used in Table 3, are given below:
P(Awareness = Exists | Participation = High & Technology = Good) = 0.5 + 0.24 + 0.24 = 0.98
P(Awareness = DoesNotExist | Participation = High & Technology = Good) = 1 - 0.98 = 0.02
P(Awareness = Exists | Participation = High & Technology = Bad) = 0.5 + 0.24 = 0.74
P(Awareness = DoesNotExist | Participation = High & Technology = Bad) = 1 - 0.74 = 0.26
P(Awareness = Exists | Participation = Low & Technology = Good) = 0.5 + 0.24 = 0.74
P(Awareness = DoesNotExist | Participation = Low & Technology = Good) = 1 - 0.74 = 0.26
P(Awareness = Exists | Participation = Low & Technology = Bad) = 0.5
P(Awareness = DoesNotExist | Participation = Low & Technology = Bad) = 1 - 0.5 = 0.5
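The conditional probability table for this example can be generated mechanically: start from the base value and add a parent's weight whenever that parent is in its positive state. The sketch below covers only this two-parent, strong-positive case; the variable and state names are illustrative.

```python
from itertools import product

BASE = 0.5
WEIGHTS = {"participation": 0.24, "technology": 0.24}  # strong, two parents
POSITIVE_STATE = {"participation": "High", "technology": "Good"}
STATES = {"participation": ("High", "Low"), "technology": ("Good", "Bad")}


def build_cpt():
    """Map each (participation, technology) pair to P(Awareness) values."""
    cpt = {}
    for p_state, t_state in product(STATES["participation"], STATES["technology"]):
        p_exists = BASE
        if p_state == POSITIVE_STATE["participation"]:
            p_exists += WEIGHTS["participation"]
        if t_state == POSITIVE_STATE["technology"]:
            p_exists += WEIGHTS["technology"]
        p_exists = round(p_exists, 2)
        cpt[(p_state, t_state)] = {
            "Exists": p_exists,
            "Does Not Exist": round(1 - p_exists, 2),
        }
    return cpt


cpt = build_cpt()
for parents, row in cpt.items():
    print(parents, row)
```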

Querying the Network

Querying a BBN refers to the process of updating the conditional probability table and making inferences based
on new evidence. One way of updating a BBN is to develop a number of detailed scenarios that can be used to query
the model. A scenario refers to a written synopsis of inferences drawn from observed phenomena or empirical data.
Druzdzel and Henrion (1993) described a scenario as an assignment of values to those variables in a Bayesian network
which are relevant for a certain conclusion, ordered in such a way that they form a coherent story, a causal story
which is compatible with the evidence of the story. The use of scenarios in Bayesian networks is drawn from
psychological research (Pennington & Hastie, 1988). This research shows that humans tend to interpret and explain
any social situation by weighing up the most credible stories that include hypotheses to test and understand social
phenomena. In Bayesian terms, a hypothesis is the assignment of a value to a discrete variable or group of variables.
Although a scenario can describe all of the nodes in a model, it is more reasonable to include only the nodes relevant
for a certain situation. If there is a certain focal hypothesis, say for instance H, selected by the user, the relevant
nodes are those that affect the posterior probability of H given the observed evidence e. Otherwise, the relevant
nodes are all those whose probabilities depend on e. The explanation of the model, therefore, consists of showing the
evidence (i.e., the scenarios that are most compatible with the hypothesis and those that are incompatible with the

Furthermore, updating a BBN using scenarios is an attempt to understand the significance of various
relationships among variables in a network. Based on the results of Thurstone scaling we have observed a large
cluster of variables around the mean scale point. These were then chosen to construct the Bayesian Belief network.
Although the variables obtained in the earlier analysis can be treated as a group because of their central position
relative to the other points, it is difficult to measure their individual importance relative to others in the same cluster
or in other clusters in the VLC model. We built simple scenarios to further infer their relative influence and
significance to learning within the network. The approach described in Table 3 uses both qualitative and quantitative
data in building the Bayesian network to model imprecise and nebulous domains (Daniel, Zapata-Rivera &
McCalla, 2003). In addition, the probability distribution enables us to query the model and observe changes as they
propagate to generate new posterior probability values (P), which we can then use to make logical inferences about
the state of the model from changes in its variables. For example, imagine a community where there is a reasonably
high level of participation among individuals (e.g., p = 0.98), and a high presence of mutuality, implying learners are
constantly engaged in reciprocal relationships through exchanging messages, sharing experiences, stories,
information and knowledge. Querying the model (presented in Figure 2) with this scenario reveals increased learning
with a posterior probability value of P(learning) = 0.763.
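To illustrate what querying with evidence means, the sketch below works on the three-node fragment from Table 3 only (participation and technology as parents of awareness). It assumes the two parents are independent and uses hypothetical uniform priors of 0.5; it does not reproduce the posterior values reported for the full network, which depend on variables and tables not given in this paper.

```python
# P(Awareness = Exists | participation, technology), from Table 3.
CPT_AWARENESS = {
    ("High", "Good"): 0.98,
    ("High", "Bad"): 0.74,
    ("Low", "Good"): 0.74,
    ("Low", "Bad"): 0.50,
}


def p_awareness(p_high: float, t_good: float) -> float:
    """Marginal P(Awareness = Exists) given beliefs about both parents."""
    total = 0.0
    for (p_state, t_state), p_exists in CPT_AWARENESS.items():
        weight_p = p_high if p_state == "High" else 1 - p_high
        weight_t = t_good if t_state == "Good" else 1 - t_good
        total += weight_p * weight_t * p_exists
    return total


def p_high_given_awareness(p_high: float, t_good: float) -> float:
    """Bayes' rule: P(Participation = High | Awareness = Exists)."""
    return p_high * p_awareness(1.0, t_good) / p_awareness(p_high, t_good)


prior = p_awareness(0.5, 0.5)                  # before any evidence
updated = p_awareness(1.0, 0.5)                # participation observed as High
diagnostic = p_high_given_awareness(0.5, 0.5)  # reasoning from effect to cause
print(round(prior, 2), round(updated, 2), round(diagnostic, 3))
```

Entering evidence about a parent raises the belief in awareness, and observing awareness in turn raises the belief that participation was high; the scenarios described in this section apply the same propagation across the whole network.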
Another scenario we employed to tune the model involved a formal virtual learning community in which an
effective level of participation guided by explicit social protocols was observed. In addition, individuals were
constantly engaged in open discourse (mutuality), and the issues were addressed in both depth and breadth
(intensity). Further, we assumed that high intensity in discourse encouraged individuals to reflect deeply on
the issues being discussed. Results of querying the model using this scenario revealed a higher probability of
learning, P(learning) = 0.779, a difference of 0.016 compared to the probability of learning in the presence of
effective participation and mutuality alone. This result is intuitively appealing, given interview data that suggested
that a combination of these factors encouraged depth in the discussion and in learning (Schwier & Daniel, 2006).
In practice, virtual learning communities should encourage freedom of expression and mutual respect, and they
should value diversity. Building on the notion of individual freedom in a virtual learning community, we were
interested in observing the impact of autonomy on trust and learning, given effective participation and good
technology. Autonomy seems to be very influential; the network revealed a higher probability of trust, P(trust) = 0.924, and
a correspondingly high probability of learning, P(learning) = 0.794, when autonomy was elevated.
Given the central importance of trust as a prerequisite condition of learning, we were interested in
understanding the impact of all the variables on trust and learning. In this scenario all the variables in the first layer
(technology and participation) and second layer (mutuality, intensity, social protocols, reflection, autonomy and
awareness) in the model were set to their highest probability values. This scenario increased the posterior
probabilities of trust, P(t) = 0.944, and learning, P(l) = 0.810. This result suggests that the variables in the network
can collectively have considerable and yet varying effects on trust and learning, depending on differing scenarios.
Although the results of the Thurstone analysis rank trust as the most important variable in a virtual learning
community, our analysis suggests that when trust is associated directly with learning, but without the positive
influence of its parent variables (mutuality, intensity, social protocols, reflection, autonomy and awareness), the
probability of learning remains low, P(l) = 0.629. This result holds even in the presence of good technology and
effective participation. Previous research has emphasized the value of trust in enhancing the sense of a community.
Prusak and Cohen (2001) suggested that trust enables people to work together, collaborate, and smoothly exchange
information and share knowledge without time wasted on negotiation and conflict. In virtual learning communities,
however, we argue that without mutuality, intensity, social protocols, reflection, and awareness, the impact of trust
on learning may be minimal.
Based on different experiences, experts’ knowledge, intuition and hunches, a large number of scenarios can be
developed to query this model. Querying the model using logical scenarios, whether based on empirical data or
experts’ experiences, offers a disciplined method of examining the cumulative effect of making changes anywhere
in the network and also for speculating about how any particular change can alter the values of related variables. The
BBN is still, at its core, a tool for speculation, but over time and as data are added to inform the variables and their
interrelationships, the network can be "tuned" to provide robust and precise ways to make decisions about how to
support learning in virtual learning communities.
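The idea of querying a network with alternative scenarios can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the model described in this paper: the four-variable structure, the variable names used as dictionary keys, and every probability value below are hypothetical and chosen only to make the arithmetic easy to follow. Inference is done by brute-force enumeration of the joint distribution, which is practical only for very small networks.

```python
from itertools import product

# Hypothetical toy network (NOT the authors' model or numbers):
#   participation -> mutuality
#   participation -> intensity
#   mutuality, intensity -> learning
# Every variable is binary: True = "high", False = "low".

P_PARTICIPATION = 0.6  # illustrative prior P(participation = high)

# Illustrative P(child = high | participation)
P_MUTUALITY = {True: 0.8, False: 0.3}
P_INTENSITY = {True: 0.7, False: 0.2}

# Illustrative P(learning = high | mutuality, intensity)
P_LEARNING = {
    (True, True): 0.9,
    (True, False): 0.7,
    (False, True): 0.6,
    (False, False): 0.2,
}

VARIABLES = ("participation", "mutuality", "intensity", "learning")


def bernoulli(p_high, value):
    """Probability of `value` for a binary variable with P(high) = p_high."""
    return p_high if value else 1.0 - p_high


def joint(a):
    """Joint probability of one full assignment, using the chain rule."""
    return (
        bernoulli(P_PARTICIPATION, a["participation"])
        * bernoulli(P_MUTUALITY[a["participation"]], a["mutuality"])
        * bernoulli(P_INTENSITY[a["participation"]], a["intensity"])
        * bernoulli(P_LEARNING[(a["mutuality"], a["intensity"])], a["learning"])
    )


def query(target, evidence):
    """P(target = high | evidence), summing the joint over consistent assignments."""
    numerator = 0.0
    denominator = 0.0
    for values in product((True, False), repeat=len(VARIABLES)):
        a = dict(zip(VARIABLES, values))
        if any(a[name] != value for name, value in evidence.items()):
            continue  # assignment contradicts the scenario evidence
        p = joint(a)
        denominator += p
        if a[target]:
            numerator += p
    return numerator / denominator


# Each scenario is a set of observed variable states (the evidence).
scenarios = {
    "no evidence": {},
    "participation + mutuality": {"participation": True, "mutuality": True},
    "participation + mutuality + intensity": {
        "participation": True,
        "mutuality": True,
        "intensity": True,
    },
}

results = {name: query("learning", ev) for name, ev in scenarios.items()}
for name, p in results.items():
    print(f"{name}: P(learning = high) = {p:.3f}")
```

In this toy network, each additional piece of favourable evidence raises the posterior probability of learning, which mirrors the qualitative pattern of the scenarios reported above. A model of realistic size would normally be queried with a dedicated Bayesian network tool rather than by enumeration.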


The most important point in this paper is not the specific methods we chose to perform the analyses. A host of
other tools are available to researchers, and they can be used to generate related but different interpretive angles on
data (e.g., social network analysis tools). Of greater significance is that the methods flow from definition to analysis
to prediction, so they have some intuitive and practical appeal. We recognize that we are at the beginning of
learning about how to understand online learning communities, and so we make no claims that these methods
represent a definitive set of tools for that job, but we think that considering the full cycle from definition to modeling
is important. Much of the research to date that examines online learning communities looks closely at only a few
variables, and much of the literature is highly speculative. We think that there is a need to systematically isolate
features of communities, try to determine their relative importance, and then build models that can be used to test
inferences in new environments and inform the science of design in online learning.

References


Chavis, D.M., & Wandersman, A. (1990). Sense of community in the urban environment: A catalyst for participation
and community development. American Journal of Community Psychology, 18, 55-81.
Chavis, D.M. (undated). The sense of community index. Retrieved August 31, 2005 from
Chipuer, H.M., & Pretty, G.M.H. (1999). A review of the sense of community index. Journal of Community
Psychology, 27, 643-658.
Daniel, B.K., McCalla, G.I., & Schwier, R.A. (2005). Data mining and modeling social capital in virtual learning
communities. Proceedings of the 12th International Conference on Artificial Intelligence in Education,
Amsterdam, 18-22 July, 2000-208
Daniel, B.K., Zapata-Rivera, D. J., & McCalla, G. I. (2003). A Bayesian computational model of social capital in
virtual communities. In M. Huysman, E. Wenger, & V. Wulf (Eds.), Communities and technologies
(pp.287-305). London: Kluwer.
Druzdzel, M.J., & van der Gaag, L.C. (2000). Building probabilistic networks: "Where do the numbers come from?" Guest
editor's introduction. Data Engineering, 12(4), 481-486.
Druzdzel, M.J., & Henrion, M. (1993). Efficient reasoning in qualitative probabilistic networks. Proceedings of the
11th Annual Conference on Artificial Intelligence (AAAI-93), Washington, D.C., pp. 548-553.
Fahy, P.J., Crawford, G., Ally, M., Cookson, P., Keller, V., & Prosser, F. (2000). The development and testing of a
tool for computer-mediated conferencing transcripts. Alberta Journal of Educational Research, 46 (1), 85-
Manitoba Centre for Health Policy (2005). The Thurstone manual. Retrieved August 31, 2005 from
Misanchuk, E.R. (1988). Step-by-step procedure for constructing a Thurstone Scale. Annual Management Program
of the Canadian Association for University Continuing Education, Halifax, Nova Scotia.
Obst, P.L., & White, K.M. (2004). Revisiting the sense of community index: A confirmatory factor analysis. Journal
of Community Psychology, 32(6), 691-705.
Pearl, J. (1988). Probabilistic reasoning in intelligent systems: Networks of plausible inference. San Mateo, CA:
Morgan Kaufmann.
Pennington, N., & Hastie, R. (1988). Explanation-based decision making: Effects of memory structure on judgment.
Journal of Experimental Psychology: Learning, Memory, and Cognition, 14(3), 521-533.
Prusak, L., & Cohen, D. (2001). In good company: How social capital makes organizations work. Boston, MA:
Harvard Business School Press.
Rovai, A.P., & Jordan, H.M. (2004). Blended learning and sense of community: A comparative analysis with
traditional and fully online graduate courses. International Review of Research in Open and Distance
Learning, August, 2004. Retrieved April 26, 2005 from
Russell, S. & Norvig, P. (1995). Solution manual for "artificial intelligence: a modern approach." Englewood
Cliffs, NJ: Prentice Hall.
Schwier, R.A. (2001). Catalysts, emphases and elements of virtual learning communities: Implications for research
and practice. The Quarterly Review of Distance Education, 2(1), 5-18.

Schwier, R.A. (in press). A typology of catalysts, emphases and elements of virtual learning communities. In R.
Luppicini (Ed.), Trends in distance education: A focus on communities of learning. Greenwich, CT:
Information Age Publishing.
Schwier, R.A., & Daniel, B.K. (2007). Did we become a community? Multiple methods for identifying community
and its constituent elements in formal online learning environments. In N. Lambropoulos, & P. Zaphiris
(Eds.), User-evaluation and online communities (pp. 29-53). Hershey, PA: Idea Group Publishing.
Thurstone, L.L. (1927). A law of comparative judgement. Psychological Review, 34, 273-286.