
Integrated Intelligent Research (IIR) International Journal of Business Intelligent
Volume: 01 Issue: 01 June 2012, Pages No. 11-14
ISSN: 2278-2400

Knowledge Identification using Rough Set Theory in Software Development Processes
R. Rameshkumar, C. Jothi Venkateswaran
Research Scholar, Bharath University, Salaiyur, Chennai
Head, PG and Research Department of Computer Science, Presidency College (Autonomous), Chennai
ramesh116@hotmail.com

Abstract- Knowledge processing systems underpin the strength of an organization in the global business race. Industries are adopting knowledge management systems for their human capital. The level of interaction among employees in the industry increases knowledge creation, identification, representation and utilization. The complexity of the knowledge discovery process varies with the domain, the nature of the applications, the organizational system and many organizational policies. Processing time and data volume are to be reduced for decision support and the knowledge discovery process, using rough set theory equivalence association in the software development process and the Information Technology organization. The target factor variables that influence knowledge processing in the organization are determined, and the variables are identified based on the equivalence association of all combinations of the factors. The paper observes a software development project that produced a non-deterministic result. It aims to find the relations among variables that contribute the most knowledge to the successful completion and delivery of the project and thereby improve delivery in the software development process. The activity variables, in turn, determine the set of activities carried out by the professional group and encourage the group to pay more attention to the selected activities.

Keywords: KDD, software development, variable reduction

I. INTRODUCTION

Knowledge discovery in databases (KDD) is the process of identifying needed information as a result of data processing. The KDD approach is presented at a high conceptual level in [1], where it is decomposed into a few iterative steps. This paper attempts to implement that approach in the software development process. Quality in software development is achieved by using different methods to complete all the activities of the proposed project or product. Several models are used in software engineering to deliver a particular product. Different phases are involved in the development process, and they are common to the engineering model adopted for the analysis. The Waterfall model, as the simplest process model, is analysed by finding the dependencies between activities and constructing a network of those dependency values using association matrices and an association map. This sheds light on the many relations and implications of the waterfall model for a particular selected project. This paper describes dependency-level analysis for the reduction of variables using data processing techniques such as rough set theory.

II. BACKGROUND OF THE RESEARCH

Various approaches and practices are adopted in the data mining process, such as classification, regression, clustering, rule generation, discovering association rules, summarization, dependency modeling and sequence analysis. There are many methods, such as fuzzy logic, neural networks, genetic algorithms, genetic programming and rough sets. Each of them can analyze a problem in its own domain; the methodologies can also be used together to solve complex problems, and more and more research combines them to find new critical features. Using these techniques and approaches, data mining can find a huge number of patterns in a database [1]. Rough set theory suits the analysis of different types of uncertain data; it can deal with large data sets to reduce superfluous information and extract knowledge from rules. Rough set theory has been developed and applied in data mining and the knowledge discovery process [2,3,4,5,6]. It has been applied to the analysis of many issues, including medical diagnosis, engineering reliability, expert systems, the empirical study of material data [7], machine diagnosis [8], travel demand analysis and data mining [9]. Research has also addressed the effect of attributes/features on the combination values of decisions that insurance companies make to satisfy customers’ needs [10]. Rough set theory can unify with fuzzy theory and is
transformed from the crisp one to a fuzzy one, called Alpha Rough Set Theory. Rough set theory is a useful method for analyzing data and reducing information in a simple way. This shows that rough set theory can be used for pre-processing in the data mining process, which leads to an approach to business solutions that reduces processing time and is cost effective. This paper uses the approach to find the effective attributes for evaluating software developers' activities: which set of attributes should be considered for quality software production and for providing skill sets to the employees. The activities carried out for the different projects, and their relational impact, are converted using the associative mapping process presented below.

III. CONSTRUCTION OF ACTIVITY ASSOCIATION MATRIX (AAM)

Activity Relation: Each activity in phase x is taken and its impact on the next phase (x+1) is considered. If activity a1 creates an impact in the next phase, that impact is called an activity relation, and such activities are called dependent activities. If activity a1 does not create any impact in the next phase, it is called an independent activity.

Independent activity association matrix
The Feasibility analysis set (P1a) is represented as columns and the Requirement analysis set (P2a) as rows, and a two-dimensional association matrix is framed with the following conditions.

Condition 1:
If an activity of the Feasibility analysis set (P1a) creates an impact on an activity of the Requirement analysis set (P2a), the value is set to ‘1’. These activities are considered related or dependent activities.

Condition 2:
If an activity of the Feasibility analysis set (P1a) does not create an impact on an activity of the Requirement analysis set (P2a), the value is set to ‘0’. These activities are considered isolated or independent activities.

Condition 3:
If an activity of the Feasibility analysis set (P1a) partially creates some impact on the Requirement analysis set (P2a), it is called a partially dependent activity. If its impact on the process is dominant, the value is set as for a dependent activity; otherwise it is treated as an isolated activity. In certain cases these partial activities are treated as X (don’t care condition). The matrix is given below:

Cyclic Avoidance: The activity relationships are determined only between phase x and the next phase x+1. No relationships are determined between activities within the same phase, so there is no cyclic path in the relationship determination.

Multi level Relationship: If there is a relation between phase x and phase x+2 through phase x+1, then a multi-level relationship occurs.

Feed Forwarded approach: The activity relationships are determined only for the phases x, x+1, x+2, etc., in the forward direction. No relationship is determined from phase x+1 to its previous phase x, i.e. there are no backward relationships.

IV. ASSOCIATION MAP CONSTRUCTION

Direct relationship function: The association map is constructed for phases x and x+1 if there is a direct relationship between them. If phase x has a relation with x+1, then a path exists between phases x and x+1 in the association map.

Routed relationship function:
Single hidden phase: This type of routed relationship is constructed if there is a path between phase x and phase x+2 through phase x+1, which lies between those two phases.
Multilevel hidden phase: This type of relationship occurs only when there is a relationship between phase x and phase x+3 or x+4, with the path passing through x+1, x+2, etc.

Network construction
The network is constructed from the association map by the following steps:
(a) Phase as a layer: In this network construction the phases are treated as layers. Since the waterfall model has 6 phases, they form the 6 layers of the network.
(b) Activities as nodes: Each activity of a phase is treated as a node of the network. Phase 1 of the waterfall model has three activities, and those three are the nodes of the first layer of the network. Likewise, the remaining phases and their activities are treated as the respective layers and nodes.
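The association matrix conditions of Section III and the direct and routed relationships of Section IV can be illustrated with a minimal sketch. It is not the authors' implementation: the activity names, the impact values and the `dominant` flag used to resolve the don't-care entries are all hypothetical, chosen only to show the mechanics. Rows hold the activities of the next phase and columns those of the current phase, following the P1a-as-columns, P2a-as-rows layout described above.

```python
# Hypothetical activity names; the paper does not list the individual activities.
PHASES = [
    ["f1", "f2", "f3"],  # phase 1: Feasibility analysis set (P1a), three activities
    ["r1", "r2"],        # phase 2: Requirement analysis set (P2a)
    ["d1", "d2"],        # phase 3: a later phase
]

# Made-up impacts between consecutive phases: 1 = impact, "X" = partial impact.
# A pair that is absent means "no impact" (0).
IMPACTS = {
    ("f1", "r1"): 1,
    ("f2", "r1"): "X",
    ("f2", "r2"): 1,
    ("r1", "d2"): 1,
}

def association_matrix(src, dst, impacts):
    """Rows are activities of the next phase, columns those of the current phase."""
    return [[impacts.get((s, d), 0) for s in src] for d in dst]

def resolve(matrix, dominant=False):
    """Turn every 'X' into 1 if partial impacts are judged dominant, else 0."""
    return [[(1 if dominant else 0) if v == "X" else v for v in row] for row in matrix]

def forward_edges(phases, impacts, dominant=False):
    """Edges only from phase x to phase x+1: no intra-phase or backward edges."""
    edges = set()
    for src, dst in zip(phases, phases[1:]):
        matrix = resolve(association_matrix(src, dst, impacts), dominant)
        for i, d in enumerate(dst):
            for j, s in enumerate(src):
                if matrix[i][j] == 1:
                    edges.add((s, d))
    return edges

def routed(edges, start):
    """Activities of later phases reachable from `start`, directly or via hidden phases."""
    reached, frontier = set(), {start}
    while frontier:  # terminates because the network is feed-forward (acyclic)
        frontier = {d for (s, d) in edges if s in frontier}
        reached |= frontier
    return reached

m12 = association_matrix(PHASES[0], PHASES[1], IMPACTS)
edges = forward_edges(PHASES, IMPACTS)
print(m12)                          # matrix between phase 1 and phase 2, with 'X' kept
print(sorted(routed(edges, "f1")))  # direct (x+1) and routed (x+2) relations of f1
```

Because edges are created only between consecutive phases, the cyclic-avoidance and feed-forward properties hold by construction, and a routed relationship through a single hidden phase is simply a two-edge path in this network.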

Associative relationship matrices are constructed from the network model. The activities of each employee were observed, and each employee's contribution to each activity in the software development phases is presented as a unit matrix. This matrix representation has seven phases, and four activities are considered for evaluation in each phase. The employees' activities in these phases are observed and presented along with their performance.

V. RESULT INTERPRETATION
The development environment was observed, and a sample of the data is partially presented according to the observed performance. The number of activities in which a developer is involved, and hence the developer's performance, varies from one developer to another. The collected data sample is presented below.

S.No  Emp. ID  Regularity  Task completion  Accuracy  Team involvement  Reporting  Performance
1 1000 100 82 98 84 90 90.8
2 1004 80 93 82 93 86 86.8
3 1007 92 91 90 81 90 88.8
4 1011 100 93 95 92 80 92
5 1013 86 95 100 92 93 93.2
6 1015 87 80 99 80 97 88.6
7 1019 99 81 89 95 95 91.8
8 1024 91 80 96 94 90 90.2
9 1028 82 99 87 89 88 89
10 1030 93 87 81 99 98 91.6
11 1122 89 90 90 80 83 86.4
12 1128 82 97 90 84 81 86.8
13 1133 100 100 99 95 82 95.2
14 1140 88 82 98 80 85 86.6
15 1148 94 87 98 89 91.8
16 1152 98 87 83 96 90 90.8
17 1161 83 98 83 87 83 86.8
18 1166 88 100 98 97 82 93
19 1175 82 97 92 80 91 88.4
20 1182 80 84 99 88 95 89.2
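
In every complete row of the table, the Performance value equals the arithmetic mean of the five scores, for example (100 + 82 + 98 + 84 + 90) / 5 = 90.8 for employee 1000. The sketch below checks this and then shows how the table could be examined in rough-set terms by computing indiscernibility classes and the degree of dependency of Performance on the condition attributes. It is an illustration under stated assumptions, not the authors' procedure: the two-level banding at a cut of 90 and all function names are hypothetical, and employee 1148 is left out because one of its scores is missing.

```python
from collections import defaultdict

ATTRS = ["regularity", "task_completion", "accuracy", "team_involvement", "reporting"]

# Complete rows of the table: the five scores followed by Performance.
ROWS = {
    1000: (100, 82, 98, 84, 90, 90.8), 1004: (80, 93, 82, 93, 86, 86.8),
    1007: (92, 91, 90, 81, 90, 88.8),  1011: (100, 93, 95, 92, 80, 92),
    1013: (86, 95, 100, 92, 93, 93.2), 1015: (87, 80, 99, 80, 97, 88.6),
    1019: (99, 81, 89, 95, 95, 91.8),  1024: (91, 80, 96, 94, 90, 90.2),
    1028: (82, 99, 87, 89, 88, 89),    1030: (93, 87, 81, 99, 98, 91.6),
    1122: (89, 90, 90, 80, 83, 86.4),  1128: (82, 97, 90, 84, 81, 86.8),
    1133: (100, 100, 99, 95, 82, 95.2), 1140: (88, 82, 98, 80, 85, 86.6),
    1152: (98, 87, 83, 96, 90, 90.8),  1161: (83, 98, 83, 87, 83, 86.8),
    1166: (88, 100, 98, 97, 82, 93),   1175: (82, 97, 92, 80, 91, 88.4),
    1182: (80, 84, 99, 88, 95, 89.2),
}

def band(score, cut=90):
    """Assumed discretisation: 'high' at or above `cut`, otherwise 'low'."""
    return "high" if score >= cut else "low"

# Discretised decision table: condition attributes plus the decision 'performance'.
TABLE = {
    emp: dict(zip(ATTRS + ["performance"], map(band, row)))
    for emp, row in ROWS.items()
}

def partition(attrs):
    """Indiscernibility classes: employees that agree on every attribute in `attrs`."""
    classes = defaultdict(set)
    for emp, values in TABLE.items():
        classes[tuple(values[a] for a in attrs)].add(emp)
    return list(classes.values())

def dependency(attrs, decision="performance"):
    """Degree of dependency: fraction of employees lying in the positive region."""
    positive = sum(
        len(cls) for cls in partition(attrs)
        if len({TABLE[e][decision] for e in cls}) == 1
    )
    return positive / len(TABLE)

# Sanity check on the raw data: Performance is the mean of the five scores.
assert all(abs(sum(r[:5]) / 5 - r[5]) < 1e-9 for r in ROWS.values())

full = dependency(ATTRS)
print("all attributes:", full)
for a in ATTRS:
    rest = [b for b in ATTRS if b != a]
    # An attribute is dispensable if dropping it leaves the dependency unchanged.
    print(a, "dispensable" if dependency(rest) == full else "indispensable")
```

The outcome depends on the chosen cut: a different banding changes the indiscernibility classes and may change which attributes appear dispensable, so the result should be read as a property of the assumed discretisation rather than of the raw scores.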

Even when the activities and the skill sets are identical, employee performance differs from one employee to another according to the relational activities and the skills.

VI. CONCLUSION

This paper addresses developers' skill sets and their performance according to their contribution to the development phases. Performance in the development-phase activities differs from one developer to another based on the domain and the skill set. The internal phase activities are considered a highly determining factor for the development activity. When evaluating performance, the performance factors differ according to the number of phases and the skill set of the employee. This leads to the conclusion that business environment evaluation and industrial service could be done through the analysis of
employee involvement in the software development. The rough set theory implementation is carried out as the proposed model for determining the variable set.

As per the number of activities in the different phases and their performance, the chart is presented below.

REFERENCES
[1] Usama M. Fayyad, Gregory Piatetsky-Shapiro, Padhraic Smyth, and Ramasamy Uthurusamy, editors. Advances in Knowledge Discovery & Data Mining, chapter 1, pages 1–36. MIT Press, Cambridge, MA, 1996.
[2] Pawlak, Z. “Rough classification,” International Journal of Man–Machine Studies, Vol. 20, No. 5. Pp 469–483. 1984.
[3] Pawlak, Z. Rough Sets, Kluwer Academic Publishers. 1991.
[4] Pawlak, Z., “Rough set and data analysis,” Proceedings of the Asian 11-14 Dec. Pp 1–6. 1996.
[5] Pawlak, Z., “Rough classification,” Int. J. Human-Computer Studies, Vol. 51, No. 15. Pp 369-383. 1999.
[6] Pawlak, Z., 2005, “Rough sets and flow graphs,” Rough Sets, Fuzzy Sets, Data Mining and Granular Computing, LNAI Vol. 3641. Pp 1-11.
[7] Jackson, A.G., Leclair, S.R., Ohmer, M.C., Ziarko, W. and Al-kamhwi, H. 1996, “Rough sets applied to materials data,” ACTA Mater, Vol. 44, No. 11. Pp 4475-4484.
[8] Zhai, L.Y., Khoo, L.P., and Fok, S.C. 2002, “Feature extraction using rough set theory and genetic algorithms: an application for the simplification of product quality evaluation,” Computers & Industrial Engineering, Vol. 43, No. 4. Pp 661-676.
[9] Li, R., and Wang, Z.O. 2004, “Employees’ behaviors,” European Journal of Operational Research, Vol. 157, No. 2. Pp 439-448.
[10] Grzymala, J. and Siddhave, S. Rough Set Approach to Rule Induction from Incomplete Data. Proceedings of IPMU’2004, the 10th International Conference on Information Processing and Management of Uncertainty in Knowledge-Based Systems. 2004.
[11] Wang, X., Yang, J., Teng, X., Xiang, W., Jensen, R., Feature selection based on rough sets and particle swarm optimization. Pattern Recognition Letters, 2007, pp. 459-471.
