
Early studies on bug prediction focused on software complexity measured in lines of code. We aim to predict whether a particular file associated with a change is buggy or not. Traditionally, all techniques follow the four steps below (a rough code sketch of the whole pipeline follows the list):

 

1)    Training Data Extraction: For each change, label it as buggy or clean by mining the project's revision history and issue tracking system. A buggy change contains one or more bugs, while a clean change contains none.


2)    Feature Extraction: Extract the values of various features from each change. Many different features have been used in past change classification studies.

3)    Model Learning: Build a model using a classification algorithm, based on the labeled changes and their corresponding features.

4)    Model Application: For a new change, extract the values of its features and input them into the learned model to predict whether the change is buggy or clean.
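As a rough illustration of the four steps above, the following sketch assumes the labeled changes have already been mined into a pandas DataFrame whose columns are the 14 basic change measures listed later plus a 0/1 buggy label; the file name, column names, and the use of scikit-learn's logistic regression are illustrative assumptions rather than a prescribed implementation.

import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Step 1: training data extraction is assumed done; 'changes.csv' is a
# hypothetical export of changes labeled buggy (1) or clean (0).
changes = pd.read_csv("changes.csv")

# Step 2: feature extraction -- here, the 14 basic change measures.
FEATURES = ["NS", "ND", "NF", "Entropy", "LA", "LD", "LT", "FIX",
            "NDEV", "AGE", "NUC", "EXP", "REXP", "SEXP"]
X = changes[FEATURES].values
y = changes["buggy"].values

X_train, X_new, y_train, y_new = train_test_split(X, y, test_size=0.2)

# Step 3: model learning on the labeled changes.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Step 4: model application -- predict whether unseen changes are buggy.
predicted_labels = model.predict(X_new)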

 

 

The 14 basic change measures:

 

Name      Description
NS        The number of modified subsystems
ND        The number of modified directories
NF        The number of modified files
Entropy   Distribution of modified code across each file
LA        Lines of code added
LD        Lines of code deleted
LT        Lines of code in a file before the change
FIX       Whether or not the change is a defect fix
NDEV      The number of developers that changed the modified files
AGE       The average time interval between the last and the current change
NUC       The number of unique changes to the modified files
EXP       Developer experience
REXP      Recent developer experience
SEXP      Developer experience on a subsystem
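For concreteness, one way to carry these 14 measures through a pipeline is a small record type; the dataclass below is purely an illustrative container and not part of any described implementation.

from dataclasses import dataclass

@dataclass
class ChangeMeasures:
    """The 14 basic change measures for a single change."""
    ns: int        # modified subsystems
    nd: int        # modified directories
    nf: int        # modified files
    entropy: float # distribution of modified code across files
    la: int        # lines of code added
    ld: int        # lines of code deleted
    lt: int        # lines of code in the file before the change
    fix: bool      # whether the change is a defect fix
    ndev: int      # developers that changed the modified files
    age: float     # average interval between the last and the current change
    nuc: int       # unique changes to the modified files
    exp: float     # developer experience
    rexp: float    # recent developer experience
    sexp: float    # developer experience on the subsystem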

 

 

 

 

The framework mainly contains two phases: a model building phase and a prediction phase. In the model building phase, we aim to build a classifier using deep learning techniques from historical changes with known labels. In the prediction phase, we use this classifier to predict whether an incoming change is buggy or not.

 

Our framework extracts the 14 basic features from a training set of changes that could determine whether a change is buggy or not. We perform data preprocessing to normalize and resample the data; because clean changes outnumber buggy ones, we apply random undersampling. Then a deep learning network, namely a Deep Belief Network (DBN), is used to generate and integrate more advanced features, which are combinations of the initial features. Based on the advanced features, we build a logistic regression model to predict whether a change is buggy or not.
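A compact sketch of this framework is given below. Since scikit-learn does not ship a full Deep Belief Network, two stacked BernoulliRBM layers are used here as a stand-in for the DBN that generates the advanced features; the layer sizes, iteration counts, and the assumption that X_bal and y_bal already hold the undersampled training changes are illustrative choices, not the configuration of the original work.

from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import BernoulliRBM
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

# X_bal, y_bal: the undersampled training changes (14 features, 0/1 labels).
framework = Pipeline([
    ("normalize", MinMaxScaler()),                 # scale features into [0, 1]
    ("rbm1", BernoulliRBM(n_components=20, learning_rate=0.05, n_iter=50)),
    ("rbm2", BernoulliRBM(n_components=12, learning_rate=0.05, n_iter=50)),
    ("classifier", LogisticRegression(max_iter=1000)),
])
framework.fit(X_bal, y_bal)

# Prediction phase: label an incoming change (a length-14 vector of measures).
is_buggy = framework.predict(incoming_change.reshape(1, -1))

Scaling before the RBM layers matters here because BernoulliRBM expects its inputs to lie in the [0, 1] range.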

 

Data preprocessing:

 

As the magnitudes of the 14 features are not of the same order, we perform data normalization on them. In this approach, the data is scaled to a fixed range, usually 0 to 1. The cost of having this bounded range, in contrast to standardization, is that we end up with smaller standard deviations, which can suppress the effect of outliers.

A Min-Max scaling is typically done via the following equation:

X_{norm} = \frac{X - X_{min}}{X_{max} - X_{min}}
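In code, this scaling is a one-liner per column; the small numpy helper below applies it to the whole feature matrix, with an added epsilon guard (an assumption, not part of the formula) for constant columns.

import numpy as np

def min_max_normalize(X, eps=1e-12):
    """Scale every feature column of X into the range [0, 1]."""
    x_min = X.min(axis=0)
    x_max = X.max(axis=0)
    return (X - x_min) / (x_max - x_min + eps)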

 

Random undersampling:

 

The random undersampling method randomly chooses observations from the majority class and eliminates them until the data set is balanced. This step is important for defect prediction because it keeps the learned classifier from being biased toward the majority class and can thus improve its performance.
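A minimal numpy sketch of random undersampling, assuming y is a 0/1 label vector in which clean changes (label 0) form the majority class:

import numpy as np

def random_undersample(X, y, seed=0):
    """Drop random clean (majority) samples until both classes are the same size."""
    rng = np.random.default_rng(seed)
    buggy = np.where(y == 1)[0]
    clean = np.where(y == 0)[0]
    kept_clean = rng.choice(clean, size=len(buggy), replace=False)
    keep = rng.permutation(np.concatenate([buggy, kept_clean]))
    return X[keep], y[keep]

The imbalanced-learn package offers an equivalent ready-made RandomUnderSampler if a library solution is preferred.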

 

RBM:

 

The Boltzmann Machine is a stochastic recurrent neural network with stochastic binary units and undirected edges between units. Unfortunately, learning for Boltzmann machines is impractical and does not scale. As a result, the Restricted Boltzmann Machine (RBM) was introduced [10]; it has one layer of hidden units and forbids connections between hidden units, which allows for a more efficient learning algorithm [6]. The structure of an RBM is depicted in the following figure:

Figure 3: The structure of RBM

 

 

Given these configurations, probability distributions over hidden and/or visible units are defined in terms of the energy function. For binary visible units v and hidden units h, the energy is

E(v, h) = -\sum_i a_i v_i - \sum_j b_j h_j - \sum_{i,j} v_i w_{ij} h_j

and the joint probability is

P(v, h) = \frac{1}{Z} e^{-E(v, h)}

where a_i and b_j are the visible and hidden biases, w_{ij} are the connection weights, and Z is the partition function obtained by summing e^{-E(v, h)} over all configurations.

Then, the maximum likelihood learning algorithm can train the network by simply alternating between updating all the hidden units in parallel and all the visible units in parallel. To speed up learning for an RBM, the contrastive divergence algorithm is used; the general idea is to update all the hidden units in parallel starting from the visible units, reconstruct the visible units from the hidden units, and finally update the hidden units again.
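The following numpy sketch shows a single CD-1 update for one RBM layer, matching the three steps just described (hidden update, reconstruction, second hidden update); the sigmoid units, learning rate, and mini-batch handling are illustrative assumptions.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(v0, W, a, b, lr=0.05, rng=np.random.default_rng(0)):
    """One contrastive-divergence (CD-1) step on a mini-batch v0 (batch x visible)."""
    # Up pass: update hidden units in parallel from the visible data.
    h0_prob = sigmoid(v0 @ W + b)
    h0 = (rng.random(h0_prob.shape) < h0_prob).astype(float)
    # Down pass: reconstruct the visible units from the sampled hidden units.
    v1_prob = sigmoid(h0 @ W.T + a)
    # Up pass again: update the hidden units from the reconstruction.
    h1_prob = sigmoid(v1_prob @ W + b)
    # Approximate gradient: data statistics minus reconstruction statistics.
    batch = v0.shape[0]
    W += lr * (v0.T @ h0_prob - v1_prob.T @ h1_prob) / batch
    a += lr * (v0 - v1_prob).mean(axis=0)
    b += lr * (h0_prob - h1_prob).mean(axis=0)
    return W, a, b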

 

Deep Belief Networks:

 

As its name indicates, a Deep Belief Network (DBN) is a multi-layer belief network. Each layer is a Restricted Boltzmann Machine, and the layers are stacked on top of each other to construct the DBN. The first step of training a DBN is to learn a layer of features from the visible units using the Contrastive Divergence (CD) algorithm [8]. The next step is to treat the activations of the previously trained features as visible units and learn features of features in a second hidden layer. Finally, the whole DBN is trained once the learning for the final hidden layer is finished.

Figure 5: Greedy learning for DBN

This simple greedy learning algorithm works for training a DBN because training each RBM layer with the CD algorithm finds a local optimum, and the next stacked RBM layer takes those trained values as input and again searches for a local optimum. At the end of this procedure, a good overall solution is likely because each layer has been consistently trained toward its optimum.
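Reusing the cd1_update and sigmoid helpers from the earlier RBM sketch, greedy layer-wise DBN training can be outlined as follows; the layer sizes and epoch count are illustrative assumptions.

import numpy as np

def train_dbn(X, layer_sizes=(20, 12), epochs=50, lr=0.05, seed=0):
    """Greedy layer-wise DBN training: train one RBM per layer with CD-1,
    then feed its hidden activations to the next layer as 'visible' data."""
    rng = np.random.default_rng(seed)
    layers, data = [], X
    for n_hidden in layer_sizes:
        n_visible = data.shape[1]
        W = 0.01 * rng.standard_normal((n_visible, n_hidden))
        a = np.zeros(n_visible)
        b = np.zeros(n_hidden)
        for _ in range(epochs):
            W, a, b = cd1_update(data, W, a, b, lr=lr, rng=rng)
        layers.append((W, a, b))
        # Activations of this layer become the visible units of the next one.
        data = sigmoid(data @ W + b)
    return layers, data

The returned data matrix holds the activations of the top hidden layer, i.e. the advanced features on which the logistic regression model is then trained.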