ORIGINAL ARTICLES
ISSN 2389-8186
E-ISSN 2389-8194
Vol.7, No. 2-1
July-December 2020
doi: https://doi.org/10.16967/23898186.666
pp. 19-30
rpe.ceipa.edu.co
* The authors are very grateful to Tecnológico Nacional de México for supporting this work. Also, this research paper was sponsored by
the CONACYT.
** Master in Administrative Engineering. Tecnológico Nacional de México, Veracruz, México. E-mail: iscjpo@gmail.com.
ORCID: 0000-0002-2694-2827. Google Scholar: https://scholar.google.com/citations?user=z2ZZQOoAAAAJ&hl=es
*** PhD in Computer Science. Tecnológico Nacional de México, Veracruz, México. E-mail: lrodriguezm@ito-depi.edu.mx.
ORCID: 0000-0002-9861-3993. Google Scholar: https://scholar.google.es/citations?user=2hZw4HAAAAAJ&hl=es.
**** PhD in Computer Science. Centro de Investigación en Matemáticas CIMAT, Zacatecas, México. E-mail: jmejia@cimat.mx.
ORCID: 0000-0003-0292-9318. Google Scholar: https://scholar.google.com/citations?user=gdlrfgEAAAAJ&hl=es
***** PhD in Engineering Sciences. Universidad del Papaloapan, Oaxaca, México. E-mail: imachorro@unpa.edu.mx.
ORCID: 0000-0002-3822-4478. Google Scholar: https://scholar.google.es/citations?view_op=list_works&hl=es&user=fE8k294AAAAJ.
****** PhD in Electrical Engineering. Tecnológico Nacional de México, Veracruz, México. E-mail: galorh@orizaba.tecnm.mx.
ORCID: 0000-0003-3296-0981. Google Scholar: https://scholar.google.es/citations?hl=es&user=8Zgf4KwAAAAJ.
******* PhD in Science in Electrical Engineering. Tecnológico Nacional de México, Veracruz, México. E-mail: ujuarezm@orizaba.tecnm.mx.
ORCID: 0000-0002-5911-3136. Google Scholar: https://scholar.google.com/citations?hl=en&user=6Ko_q38AAAAJ
Towards Association Rule-based Item Selection Strategy in Computerized Adaptive Testing*
JOSUÉ PACHECO-ORTIZ**
LISBETH RODRÍGUEZ-MAZAHUA***
JEZREEL MEJÍA-MIRANDA****
ISAAC MACHORRO-CANO*****
GINER ALOR-HERNÁNDEZ******
ULISES JUÁREZ-MARTÍNEZ*******
How to cite this article:
Pacheco-Ortiz, J. et al. (2020). Towards Association Rule-based Item Selection Strategy in Computerized Adaptive Testing. Revista Perspectiva Empresarial, 7(2-1), 19-30.
Received: August 20, 2020
Accepted: December 7, 2020
ABSTRACT
One of the most important stages of Computerized Adaptive Testing (CAT) is the selection of items, for which various methods are used, each with certain weaknesses at the time of implementation. Therefore, in this paper, we propose the integration of Association Rule Mining as an item selection criterion in a CAT system. We present an analysis of association rule mining algorithms, namely Apriori, FP-Growth, PredictiveApriori and Tertius, on two data sets in order to identify the advantages and disadvantages of each algorithm and choose the most suitable one. We compare the algorithms in terms of the number of rules discovered, average support and confidence, and speed. According to the experiments, Apriori found rules with greater confidence and support in less time.
KEY WORDS
Computerized adaptive testing, association rules, e-learning, intelligent
systems.
Hacia una estrategia de selección de ítems basada en reglas
de asociación en pruebas adaptativas computarizadas
RESUMEN
Una de las etapas más importantes de las pruebas adaptativas informatizadas
es la selección de ítems, en la cual se utilizan diversos métodos que presentan ciertas
debilidades al momento de su aplicación. Así, en este trabajo, se propone la integración de
la minería de reglas de asociación como criterio de selección de ítems en un sistema CAT.
Se presenta el análisis de algoritmos de minería de reglas de asociación como Apriori, FP-Growth, PredictiveApriori y Tertius en dos conjuntos de datos con el fin de conocer las ventajas
y desventajas de cada algoritmo y elegir el más adecuado. Se compararon los algoritmos
teniendo en cuenta el número de reglas descubiertas, el soporte y confianza promedios y la
velocidad. Según los experimentos, Apriori encontró reglas con mayor confianza y soporte
en un menor tiempo.
PALABRAS CLAVE
pruebas adaptativas informatizadas, reglas de asociación, e-learning,
sistemas inteligentes.
ARTÍCULOS
JOSUÉ PACHECO-ORTIZ, LISBETH RODRÍGUEZ-MAZAHUA, JEZREEL MEJÍA-MIRANDA, ISAAC MACHORRO-CANO,
GINER ALOR-HERNÁNDEZ, ULISES JUÁREZ-MARTÍNEZ
Revista Perspectiva Empresarial, Vol. 7, No. 2-1, julio-diciembre de 2020, 19-30
ISSN 2389-8186, E-ISSN 2389-8194
A uma estratégia de seleção de itens baseada em regras
de associação em provas adaptativas informatizadas
RESUMO
Uma das etapas mais importantes das provas adaptativas informatizadas
é a seleção de itens, na qual se utilizam diversos métodos que apresentam certas
debilidades no momento da sua aplicação. Assim, neste trabalho, se propõe a
integração da mineração de regras de associação como critério de seleção de itens
num sistema CAT. Se apresenta a análise de algoritmos de mineração de regras de
associação como Apriori, FP-Growth, PredictiveApriori e Tertius em dois conjuntos de
dados com o m de conhecer as vantagens e desvantagens de cada algoritmo e eleger
o mais adequado. Se compararam os algoritmos tendo em conta o número de regras
descobertas, o suporte e conança em média e a velocidade. Segundo os experimentos,
Apriori encontrou regras com maior conança e suporte num menor tempo.
PALAVRAS CHAVE
provas adaptativas informatizadas, regras de associação,
e-learning, sistemas inteligentes.
Introduction
Computerized Adaptive Testing —CAT— (Chen, Chao and Chen, 2019) has revolutionized the traditional way of evaluating, since it dynamically selects and administers the most appropriate questions depending on the previous answers given by the examinees. One of the central components of a CAT is the item selection criterion (Miyazahua and Ueno, 2019). Although the most widely used criterion is Fisher’s Maximum Information (Albano et al., 2019), it presents several weaknesses that generate a certain degree of mistrust, for example, bias in the item selection, estimation errors at the start of the exam, or the same question being displayed repeatedly to the examinee (Sheng, Bingwei and Jiecheng, 2018; Du, Li and Chang, 2018; Lin and Chang, 2019; Yigit, Sorrel and de la Torre, 2019; Ye and Sun, 2018). Therefore, this paper proposes the development of a CAT system that uses association rules for the selection of items. The aim is to use the potential advantages of association rules to find relationships between the questions answered correctly or incorrectly and the questions answered correctly, and thus present the most appropriate questions (those most likely to be answered correctly) in the tests, according to the responses of the examinee, considering the best rules (obtained from the database of students who took the same test previously), that is, those with the greatest support and confidence.
Several research projects have used association
rule mining —ARM— with different algorithms in
their development, for example, in Rubio Delgado
et al. (2018), authors applied Apriori, FP-Growth,
PredictiveApriori and Tertius, grouping them

to compare them, so Apriori and FP-Growth were

values, whereas for PredictiveApriori and Tertius

time, the number of rules generated, the support

cases. In contrast, Wang et al. (2018) worked with
the Apriori algorithm, occupying for the comparison
  
whose set of generated rules were debugged based
on the minimum Lift, Chi-squared test and minimum
improvement. While in Prajapati, Garg and Chauhan


execution time and conviction were used in the
process of comparing the Distributed Frequent
Pattern Mining —DFMP—, Count Distribution
Algorithm —CDA— and Fast Distributed Mining
—FDM— algorithms.
The objective of this paper is to present a comparative analysis of various ARM algorithms in order to select the most suitable one for implementation in the CAT system that is being developed. This paper is structured as follows: (i) the introduction; (ii) background and some of the works related to this research; (iii) the integration of ARM in the CAT process and the comparison method that was followed; (iv) the results and the analysis

Background and related works
Over the years, in different projects, various
tools have been applied in the development of the

parameter logistic model for item calibration (Lee
et al., 2018); maximum likelihood estimation for the
evaluator’s skill estimation (Albano et al., 2019);
and root mean square differences as an evaluation
criterion (Stafford et al., 2019), among others.
      
has been done to solve the problems presented
by Fisher’s Maximum Information, using other
selection strategies, for example, Bayesian networks
(Tokusada and Hirose, 2016), Greedy algorithm
(Bengs, Brefeld and Krohne, 2018), Kullback-Leibler
Information (Chen et al., 2017), Minimum Expected
Subsequent Variance (Rodríguez-Cuadrado et al.,
2020), to mention a few. Although these strategies have achieved favorable results, most have been evaluated only in simulation studies and not in real applications.
We propose using ARM as an item selection

associations or correlations between the elements
or objects (in this case, test answers given by other
students in the past) of a database, it has many

can occur between correct/incorrect answers and
correct ones; (ii) they will determine the suitable
item according to the examinee's answer; and
(iii) the items presented to the examinees will be
selected considering interestingness metrics widely
used in related works. ARM has been used in various

(Dahdouh et al., 2019) and online learning (Gu,
Zhou and Yan, 2018), offering positive results in each case. However, to the best of our knowledge, its use as an item selection strategy for CAT has not yet been reported; therefore, this project integrates ARM into the item selection stage of the CAT. The expected outcome at the end of

both CAT and association rules in the educational

a system that is not only adaptive, but also learns
and evolves according to the experiences that it
accumulates over time.
Methodology
The following subsections specify the integration of ARM in the CAT process and the method followed to compare the ARM algorithms. The first subsection describes the proposed CAT process. The second subsection describes the data used for the comparison. The third subsection describes the evaluation of the algorithms.
CAT process with ARM as item selection criterion
The process followed by the proposed CAT is shown in Figure 1. It begins with an initial estimate of the student's knowledge, which is used to select and present the first item; once the student answers it, a new knowledge estimate is made. While the stop criterion is not met, the next item is chosen using association rules: if the answer to the previous question was correct, a question with a higher level of complexity is selected; otherwise, one with a lower level of complexity is selected. The item is then presented to the student and his/her knowledge estimate is recalculated. This cycle is repeated until the stop criterion is met. When this happens, all the test information is saved, the final grade is displayed to the student, who then logs out, and new association rules are obtained and saved automatically to be used the next time a student takes the exam.
Figure 1. Integration of association rule mining in the item selection phase of the CAT process. Source: authors' own elaboration.
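The cycle in Figure 1 can be sketched in code. This is only a minimal illustration under stated assumptions, not the authors' implementation: the `Rule` structure, the `difficulty` map, the fixed-length stop criterion (`max_items`) and the proportion-correct knowledge estimate are all hypothetical choices made for the example. The final step of the process, mining and saving new rules after the test, is omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    antecedent: tuple   # (item_id, answer), e.g. ("Item5", 1)
    consequent: str     # item expected to be answered correctly
    support: float
    confidence: float


def next_item(last_item, last_answer, rules, difficulty, asked):
    """Choose the next item from stored rules (hypothetical policy)."""
    level = difficulty[last_item]
    candidates = []
    for rule in rules:
        if rule.antecedent != (last_item, last_answer):
            continue
        if rule.consequent in asked:
            continue
        target = difficulty[rule.consequent]
        # Correct answer -> harder item; incorrect answer -> easier item.
        wanted = target > level if last_answer == 1 else target < level
        if wanted:
            candidates.append(rule)
    if not candidates:
        return None
    # Prefer the rule with the highest confidence, then the highest support.
    best = max(candidates, key=lambda r: (r.confidence, r.support))
    return best.consequent


def run_cat(first_item, answer_fn, rules, difficulty, max_items=5):
    """Run the adaptive loop until no rule applies or max_items is reached."""
    asked, answers = [], {}
    item = first_item
    while item is not None and len(asked) < max_items:
        answers[item] = answer_fn(item)   # 1 = correct, 0 = incorrect
        asked.append(item)
        item = next_item(item, answers[item], rules, difficulty, set(asked))
    # Assumed estimate: proportion of correct answers.
    estimate = sum(answers.values()) / len(answers) if answers else 0.0
    return asked, estimate
```

A real system would replace the proportion-correct estimate with the knowledge estimation method of the CAT and would load `rules` from the rule store built from previous examinees.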
Collection and Preparation of Data
Information from pencil-and-paper tests corresponding to three units of the Database course of the Master's in Computer Systems was used to create a database in MySQL. From the database records, two binary matrices were created to serve as the basis for the application of the ARM algorithms. In each binary matrix, questions are represented by the columns and examinees by the rows, where 1 corresponds to a correct answer and 0 corresponds to an incorrect answer. The first matrix,
called Exa1 corresponds to the answers of the

  
Exa2 corresponds to the answers of the second

students. According to the Waikato Environment for Knowledge Analysis (WEKA),
the two data sets were analyzed based on their
characteristics and it was observed that they did
not need any other processing, so they were ready
for the next step of the process.
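The construction of such a binary matrix can be illustrated with a short sketch. The record layout `(student_id, item_id, is_correct)` and the function name are assumptions made for the example; the source does not describe the MySQL schema. Items a student did not answer are treated as incorrect here, which is also an assumption.

```python
def build_binary_matrix(records, items):
    """Build a students x items matrix of 0/1 values.

    records: iterable of (student_id, item_id, is_correct) tuples
             (hypothetical layout).
    items:   ordered list of item identifiers (the matrix columns).
    """
    rows = {}
    for student, item, correct in records:
        # Unanswered items default to 0 (assumption).
        row = rows.setdefault(student, {i: 0 for i in items})
        row[item] = 1 if correct else 0
    students = sorted(rows)
    matrix = [[rows[s][i] for i in items] for s in students]
    return students, matrix


records = [
    ("s1", "Item1", True), ("s1", "Item2", False),
    ("s2", "Item1", False), ("s2", "Item2", True),
]
students, matrix = build_binary_matrix(records, ["Item1", "Item2"])
print(students, matrix)  # ['s1', 's2'] [[1, 0], [0, 1]]
```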
Evaluation of Algorithms
There are several metrics to evaluate
association rule mining algorithms, among which

Laplace measure, certainty factor, odds ratio and
cosine similarity (Yan, Zhang and Zhang, 2009).

bi-improve, bi-support (Ju et al., 2015), Items-based Distance —ID—, and Data Rows-based Distance (Djenouri et al., 2014). However, the most widely used metrics are support and confidence, which are used in this project, together with the time factor and the number of rules.
For the comparative analysis of the association rule algorithms, the following four criteria were considered:
(i) Confidence: the degree of certainty of the detected association.
(ii) Support: the percentage of transactions from the database that the given rule satisfies.
(iii) Time: the time in milliseconds spent in the construction of a model.
(iv) Rules: the number of rules obtained.
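As an illustration of how support and confidence can be computed for a rule over a binary answer matrix, the sketch below uses the standard textbook definitions: support is the fraction of rows where antecedent and consequent both hold, and confidence is that count divided by the number of rows where the antecedent holds. The function name and column-index convention are assumptions for the example, and the values reported by a particular mining tool may be defined slightly differently.

```python
def support_confidence(matrix, a_col, a_val, b_col, b_val=1):
    """Support and confidence of the rule  a_col=a_val ==> b_col=b_val."""
    n = len(matrix)
    n_a = sum(1 for row in matrix if row[a_col] == a_val)
    n_ab = sum(
        1 for row in matrix if row[a_col] == a_val and row[b_col] == b_val
    )
    support = n_ab / n if n else 0.0
    confidence = n_ab / n_a if n_a else 0.0
    return support, confidence


# Four examinees, two items; rule: Item1=1 ==> Item2=1
answers = [[1, 1], [1, 0], [0, 1], [1, 1]]
sup, conf = support_confidence(answers, a_col=0, a_val=1, b_col=1)
print(sup, round(conf, 2))  # 0.5 0.67
```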
The purpose of this comparison process is to identify the algorithm that provides rules that meet two conditions: (i) rules with one antecedent and one consequent, and (ii) rules with a value of the consequent equal to 1 (correct answer), for example:
Item5=1 ==> Item6=1 or Item3=0 ==> Item4=1
where the value 1 means the question was answered correctly and 0 means it was answered incorrectly. All rules should have the highest possible confidence and support, and be found in the shortest possible time.
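The rule shape described above (one antecedent, one consequent with value 1) is small enough to enumerate directly. The sketch below is a brute-force illustration of that filter and ranking, not one of the four algorithms compared in this paper; the function name and thresholds are assumptions for the example.

```python
from itertools import permutations


def single_antecedent_rules(matrix, names, min_sup=0.0, min_conf=0.0):
    """List rules 'A=v ==> B=1' that pass the thresholds, best first."""
    n = len(matrix)
    rules = []
    for a, b in permutations(range(len(names)), 2):
        for a_val in (0, 1):
            n_a = sum(1 for row in matrix if row[a] == a_val)
            if n_a == 0:
                continue  # antecedent never occurs, rule is undefined
            n_ab = sum(
                1 for row in matrix if row[a] == a_val and row[b] == 1
            )
            sup, conf = n_ab / n, n_ab / n_a
            if sup >= min_sup and conf >= min_conf:
                text = f"{names[a]}={a_val} ==> {names[b]}=1"
                rules.append((text, sup, conf))
    # Rank by confidence first, then by support.
    rules.sort(key=lambda r: (r[2], r[1]), reverse=True)
    return rules


data = [[1, 1], [1, 1], [0, 1], [0, 0]]
found = single_antecedent_rules(data, ["Item1", "Item2"], 0.5, 0.6)
for text, sup, conf in found:
    print(text, sup, round(conf, 2))
```

With these toy data, two rules pass the thresholds and `Item1=1 ==> Item2=1` ranks first with confidence 1.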
Four association rule algorithms were applied

(i) Apriori (Agrawal, Imielinski and Swami,

mining. It generates rules through an incremental
process that searches for frequent relationships
between attributes bounded by a minimum

run under certain criteria, such as upper and lower
coverage limits, and to accept sets of items that

and order criteria to display the rules, as well as

we want to show.



items and their support, value that allows us to
organize the sets in a descending way. The method
proposes good selectivity and substantially
reduces the cost of the search, given that it starts
by looking for the shortest frequent patterns and
then concatenating them with the less frequent

frequent patterns.

algorithm achieves a favorable computational
performance due to its dynamic pruning technique
that uses the upper bound of all rules of the
supersets of a given set of elements. In addition,
through a backward bias of the rules, it manages
to eliminate redundant ones that are derived from
the more general ones. For this algorithm, it is
necessary to specify the number of rules that are
required.




      
parameters that allow its application to multiple
domains.
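To make the incremental search used by Apriori, the first of the four algorithms above, more concrete, the following sketch mines frequent itemsets level by level. It is a simplified textbook version written for illustration and is not the implementation used in the experiments; encoding each answer as a string such as "Item5=1" is an assumption for the example.

```python
from itertools import combinations


def apriori_frequent_itemsets(transactions, min_sup):
    """Return {itemset: support} for all itemsets with support >= min_sup."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    items = {i for t in transactions for i in t}
    current = {frozenset([i]) for i in items}   # level-1 candidates
    frequent = {}
    k = 1
    while current:
        # Count how many transactions contain each candidate.
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        level = {c: cnt / n for c, cnt in counts.items() if cnt / n >= min_sup}
        frequent.update(level)
        k += 1
        # Join step: merge frequent sets into candidates of size k.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        current = {
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


rows = [
    {"Item1=1", "Item2=1"},
    {"Item1=1", "Item2=1", "Item3=1"},
    {"Item1=1"},
    {"Item2=1", "Item3=1"},
]
freq = apriori_frequent_itemsets(rows, min_sup=0.5)
for itemset, sup in sorted(freq.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), sup)
```

Rules are then derived from the frequent itemsets by keeping those whose confidence reaches the minimum threshold.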
To make the comparison easier to follow, the algorithms were grouped based on their characteristics. A comparison was carried out between Apriori and FP-Growth, since both allow specifying the minimum confidence (min_conf) and support (min_sup), to obtain 4 different groups of rules (15, 20, 25, and 50) with one antecedent and one consequent, where the value of the latter is equal to 1. For each case, the time in milliseconds consumed in the execution and the average confidence and support were obtained. For the comparison of PredictiveApriori and Tertius, it was also necessary to specify the number of rules required, obtaining for each case the time in milliseconds used, as well as the average confidence and support. For the comparison of the four algorithms, the number of rules generated, time spent and support are taken into account. Each evaluation was executed 100 times to estimate the average time for the construction of the models.

were considered.
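The repeated timing described above can be sketched as a small harness. The name `build_model` stands for any callable that builds a rule model and is hypothetical; the source does not say how the time was measured, so the use of `time.perf_counter` is an assumption.

```python
import time


def average_build_time_ms(build_model, runs=100):
    """Average wall-clock time in milliseconds over `runs` executions."""
    total = 0.0
    for _ in range(runs):
        start = time.perf_counter()
        build_model()
        total += (time.perf_counter() - start) * 1000.0
    return total / runs


# Example with a placeholder workload instead of a real rule miner.
calls = []
avg_ms = average_build_time_ms(lambda: calls.append(sum(range(1000))))
print(f"{avg_ms:.4f} ms over {len(calls)} runs")
```

Averaging over many runs reduces the effect of scheduling noise, which matters when individual build times are only a few milliseconds.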
Results and Discussion
Tables 1 and 2 show the comparison between
Apriori and FP-Growth for the Exa1 and Exa2 data
sets.
Table 1. Test results for Apriori and FP-Growth for the Exa1 data set (time in milliseconds)
Min_conf/              15 Rules            20 Rules            25 Rules            50 Rules
Min_sup    Algorithm   Conf.  Sup.  Time   Conf.  Sup.  Time   Conf.  Sup.  Time   Conf.  Sup.  Time
0.7/0.5    Apriori     1      0.96  3      1      0.94  5      0.99   0.93  4      0.99   0.87  9
           FP-Growth   1      0.96  5      0.98   0.95  5      0.99   0.93  5      -      -     -
0.7/0.6    Apriori     1      0.96  6      1      0.94  6      0.99   0.93  8      0.99   0.87  17
           FP-Growth   1      0.96  7      0.98   0.95  4      0.99   0.93  6      -      -     -
0.8/0.3    Apriori     1      0.96  5      1      0.94  5      0.99   0.93  5      0.99   0.87  14
           FP-Growth   1      0.96  6      0.98   0.95  6      0.99   0.93  6      -      -     -
0.8/0.6    Apriori     1      0.96  5      1      0.94  6      0.99   0.93  6      0.99   0.87  10
           FP-Growth   1      0.96  3      0.98   0.95  4      0.99   0.93  6      -      -     -
0.9/0.5    Apriori     1      0.96  6      1      0.94  5      0.99   0.93  4      0.99   0.87  24
           FP-Growth   1      0.96  5      0.98   0.95  4      0.99   0.93  6      -      -     -
0.9/0.9    Apriori     1      0.96  7      0.98   0.95  4      -      -     -      -      -     -
           FP-Growth   -      -     -      -      -     -      -      -     -      -      -     -
Source: authors' own elaboration.
Table 2. Test results for Apriori and FP-Growth for the Exa2 data set (time in milliseconds)
Min_conf/              15 Rules            20 Rules            25 Rules            50 Rules
Min_sup    Algorithm   Conf.  Sup.  Time   Conf.  Sup.  Time   Conf.  Sup.  Time   Conf.  Sup.  Time
0.7/0.5    Apriori     0.99   0.90  3      0.97   0.90  3      0.98   0.88  6      0.97   0.85  6
           FP-Growth   0.99   0.90  4      0.97   0.90  5      0.99   0.86  7      -      -     -
0.7/0.6    Apriori     0.99   0.90  4      0.97   0.90  4      0.98   0.88  5      0.97   0.85  8
           FP-Growth   0.99   0.90  5      0.97   0.90  5      0.99   0.86  7      -      -     -
0.8/0.3    Apriori     0.99   0.90  5      0.97   0.90  3      0.98   0.88  7      0.97   0.85  7
           FP-Growth   0.99   0.90  6      0.97   0.90  5      0.99   0.86  6      -      -     -
0.8/0.6    Apriori     0.99   0.90  7      0.97   0.90  4      0.98   0.88  5      0.97   0.85  7
           FP-Growth   0.99   0.90  6      0.97   0.90  4      0.99   0.86  6      -      -     -
0.9/0.5    Apriori     0.99   0.90  5      0.97   0.90  4      0.98   0.88  4      0.97   0.85  6
           FP-Growth   0.99   0.90  4      0.97   0.90  7      0.99   0.86  7      -      -     -
Source: authors' own elaboration.
As observed in Table 1, Apriori obtained 15 and 25 rules faster than FP-Growth in more cases.

   
Moreover, Apriori was the only algorithm that obtained 15 and 20 rules with a value of 0.9 for both min_conf and min_sup, and 50 rules

and Rules criteria, Apriori is better than FP-Growth
for the Exa1 data set.
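The speed comparison for the Exa1 data set can be checked with a simple tally of the times reported in Table 1. The sketch below copies the 15-rule and 25-rule times for the five threshold settings in which both algorithms produced rules; the helper name `tally` is introduced only for this example.

```python
# Build times in milliseconds from Table 1 (Exa1), settings 0.7/0.5 to 0.9/0.5.
apriori = {15: [3, 6, 5, 5, 6], 25: [4, 8, 5, 6, 4]}
fp_growth = {15: [5, 7, 6, 3, 5], 25: [5, 6, 6, 6, 6]}


def tally(a_times, f_times):
    """Count settings where Apriori is faster, slower, or tied."""
    wins = sum(a < f for a, f in zip(a_times, f_times))
    losses = sum(a > f for a, f in zip(a_times, f_times))
    ties = len(a_times) - wins - losses
    return wins, losses, ties


for n_rules in (15, 25):
    print(n_rules, tally(apriori[n_rules], fp_growth[n_rules]))
# prints: 15 (3, 2, 0)
#         25 (3, 1, 1)
```

Apriori is faster in three of the five settings for both groups of rules, which is consistent with the statement that it was faster in more cases.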
Likewise, Table 2 shows that Apriori is faster than FP-Growth in most cases for 15, 20 and 25 rules. For the group of 25 rules, although FP-Growth has a higher level of confidence in its rules, Apriori has a higher level of support in them. In addition, for the group of 50 rules, Apriori was the only algorithm that obtained results. Therefore, considering the Time, Confidence, Support and Rules criteria, Apriori is also better than FP-Growth for the Exa2 data set.
The comparisons between PredictiveApriori and Tertius for the Exa1 and Exa2 data sets are shown in Tables 3 and 4, respectively. As observed, although PredictiveApriori’s support and confidence are higher than those of Tertius, its execution time is far longer (thousands of milliseconds versus tens), and the system should occupy as little time as possible in generating the rules that are the basis for selecting the next item.
Figures 2 to 5 show the comparison between
the four algorithms with regard to support and time,
respectively. The results indicate that the algorithm
that generates rules with better support within the
Exa1 and Exa2 data sets and in less time is Apriori.
Therefore, it is the best algorithm for the data sets.
The two analyses carried out in this section show that the Apriori algorithm is the one that presents the best results in each of the data sets. For example, in the Exa1 data


Item13=1 ==> Item7=1
Table 3. Test results for PredictiveApriori and Tertius for the Exa1 data set (time in milliseconds)
Algorithm          Rules  Confidence  Support  Time
PredictiveApriori  15     1           0.90     16959
Tertius            15     0.81        0.41     22
PredictiveApriori  20     1           0.87     48558
Tertius            20     0.79        0.40     29
PredictiveApriori  25     1           0.84     22674
Tertius            25     0.79        0.41     25
PredictiveApriori  50     1           0.73     58364
Tertius            50     0.79        0.43     47
Source: authors' own elaboration.
Table 4. Test results for PredictiveApriori and Tertius for the Exa2 data set (time in milliseconds)
Algorithm          Rules  Confidence  Support  Time
PredictiveApriori  15     1           0.79     18223
Tertius            15     0.82        0.44     33
PredictiveApriori  20     1           0.73     8539
Tertius            20     0.82        0.45     34
PredictiveApriori  25     1           0.68     8528
Tertius            25     0.79        0.44     26
PredictiveApriori  50     0.99        0.69     13706
Tertius            50     0.79        0.41     52
Source: authors' own elaboration.
Figure 2. Comparison of Apriori, FP-Growth, PredictiveApriori and Tertius algorithms in terms of support for Exa1 data. Source: authors' own elaboration.