Online Discussion Forums (ODFs) have become increasingly popular among virtual communities. Modeling the semantic relevance between questions and answers is essential for detecting precise answers in ODFs. Traditional methods of modeling semantic relevance suffer from sparse word features because of the limited sentence length of short texts. This research proposes a deep belief network model for modeling the semantic relevance between non-factoid question-answer pairs in the form of short texts. The model is capable of improving its performance in detecting the textual similarity between question-answer pairs without any hand-annotation work.
The experimental results show that the proposed deep belief network model outperforms the traditional approaches on the forum corpora.
The question answering (QA) problem has attracted much attention in the natural language processing (NLP) field. QA can be divided into factoid QA and non-factoid QA (Nam and Claudia, 2018). Most QA research focuses on detecting the exact answer to a given factoid question.
Hence there is a lack of automatic QA systems that answer non-factoid questions precisely. For resolving non-factoid questions, user-generated question-answer pairs are of great significance (Nam and Claudia, 2018). These natural QA pairs are typically created during communication via social media, online forums, community-based question answering (cQA) websites, and so on.
An ODF is an e-learning platform that gives students the ability to post questions and answers in discussion threads (Alabo and Emmah, 2014). According to Baoxun et al. (2010), the data in ODFs have two noticeable features.
Semantic similarity is a metric used to measure the extent to which two concepts are similar in meaning.
Approaches to calculating semantic similarity can be classified into corpus-based models and knowledge-based models.
Traditional methods of modeling semantic relevance suffer from sparse word features because of the limited sentence length of short texts in ODFs. Textual features and word co-occurrence features commonly used in factoid answer-quality prediction are irrelevant to discussion forums (Ze et al., 2015). Hence there is a need for an effective methodology to model semantic relevance in non-factoid question-answer pairs while overcoming these problems.
Since user-generated non-factoid questions and their answers in ODFs are usually short texts, this research considers a method of modeling the semantic similarity between two short texts. The objective is to present a deep belief network (DBN) that models the semantic relevance between question-answer pairs.
The significance of naturally generated question-answer pairs was not familiar to many researchers until recent years. Most research in the QA domain has focused on three main tasks: first, ranking answer quality (Jeon et al., 2006); second, finding available similar questions when a new question is given (Jeon et al., 2006); and third, finding the best answer providers in a given community network (Liu et al., 2005). Jeon et al. (2006) presented a framework for calculating the quality of answers in ODFs. A translation-based approach to ranking candidate answers was developed by Bernhard and Gurevych (2009). The DBN-based approach presented in this paper is partly similar to the methodology proposed by Surdeanu et al. (2008). Both Ding et al. (2008) and Cong et al. (2008) proposed methodologies for detecting question and answer contexts using ranking algorithms and conditional random fields. Much recent research uses social features to improve performance in semantic modeling (Jeon et al., 2006; Hong et al., 2009). The lexical gap between questions and answers is one of the central problems, and statistical machine translation (SMT) approaches have become a common solution for filling it.
A Deep Belief Network (DBN) is a generative graphical model composed of multiple layers of hidden variables, with connections between the layers. By reconstructing a large number of questions from their candidate answers, the DBN models the semantic relationship between questions and answers while learning a high-level hidden semantic representation from the answers.
The Restricted Boltzmann Machine (RBM), the building block of a DBN, is a neural network consisting of two layers that can model a group of binary vectors. Dimensionality-reduction approaches based on RBMs show good performance, and the DBN model is able to capture the hidden semantic information inside word-count vectors.
Figure 1 shows the basic structure of an RBM. The bottom layer represents the visible (input) vector v, while the top layer represents the hidden features h, which act as feature detectors. The weight matrix W contains the connection weights between the visible units and the hidden units. Given an input vector v, the trained RBM produces hidden features h, which can be used to reconstruct v with minimal error.
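To make this structure concrete, the following minimal sketch (Python with NumPy; the class name, initialization, and layer sizes are illustrative assumptions, not from the paper) shows how a weight matrix W maps a visible vector v to hidden features h and back to a reconstruction:

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    class RBM:
        def __init__(self, n_visible, n_hidden, seed=0):
            rng = np.random.default_rng(seed)
            # Small random weights; b_vis and b_hid are the visible/hidden biases.
            self.W = 0.01 * rng.standard_normal((n_visible, n_hidden))
            self.b_vis = np.zeros(n_visible)
            self.b_hid = np.zeros(n_hidden)

        def hidden_probs(self, v):
            # Probability that each hidden feature detector turns on given v.
            return sigmoid(self.b_hid + v @ self.W)

        def reconstruct(self, h):
            # Bernoulli rates for rebuilding the visible vector from h.
            return sigmoid(self.b_vis + h @ self.W.T)

A trained RBM of this form reconstructs its input with minimal error, i.e. rbm.reconstruct(rbm.hidden_probs(v)) should be close to v.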
The proposed DBN is trained by reconstructing the question from its answers, since an answer has a strong semantic link with its question and contains more information than the question. The main objective of training is to minimize the reconstruction error.
The DBN with RBM for QA Pairs
Figure 2 shows the design of the DBN model. The model consists of three layers, each of which is an RBM. First, binary feature vectors are created based on word-occurrence measurements in the visible answer vector. These are used in the bottom layer to compute the hidden features in the hidden units of the RBM. The question vector is then reconstructed from this set of hidden features. This process can be modeled as follows:
P(h_j = 1 \mid a) = \sigma\big(b_j + \sum_i w_{ij} a_i\big)    (1)
P(q_i = 1 \mid h) = \sigma\big(b_i + \sum_j w_{ij} h_j\big)    (2)
where a is the visible feature vector of the answer and h is the hidden feature vector used to reconstruct the question; h_j is the j-th element of the hidden feature vector and q_i is the i-th element of the question vector; w_{ij} is the symmetric interaction weight between word i and hidden feature j; b_i denotes the bias of the model for word i, and b_j denotes the bias of hidden feature j; \sigma is the logistic sigmoid function.
The bottom layer generates the hidden features using Equation (1) for the input answer vectors in the training set. With these hidden features, Equation (2) is used to recreate the Bernoulli rates for each word in the question vectors. Equation (1) is then executed once again to activate the hidden features. The training steps of the middle and top layers are similar to those of the bottom layer. During pre-training, the lowest layer is greedily pre-trained for 200 passes through the entire training set, and each of the remaining two layers is greedily pre-trained for 50 passes. The weights of the DBN are then fine-tuned by applying the cross-entropy error function and gradient-descent optimization. After inputting the vectors of the question and its candidate answers, the model performs a level-by-level calculation to obtain the corresponding feature vectors. It then computes the distance between the mapped question vector and each candidate answer vector; the candidate answer with the smallest distance is selected as the best answer for the given question.
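The ranking step can be sketched as follows (an illustration only: the paper says "distance" without naming the metric, so Euclidean distance is an assumption here, and each trained layer is represented as a (W, b_hid) pair):

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def map_through_layers(vec, layers):
        # Level-by-level calculation through the stacked RBM layers.
        for W, b_hid in layers:
            vec = sigmoid(b_hid + vec @ W)
        return vec

    def best_answer(question_vec, candidate_vecs, layers):
        # Map the question and each candidate into the learned feature space,
        # then pick the candidate whose mapped vector is nearest the question's.
        q = map_through_layers(question_vec, layers)
        distances = [np.linalg.norm(q - map_through_layers(a, layers))
                     for a in candidate_vecs]
        return int(np.argmin(distances))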
Due to feature sparsity, it is essential to reduce the dimensionality of the feature vectors. As a solution, two types of binary word features are implemented.
Word frequency: Words in the corpora are chosen based on a professional dictionary to confirm that they are meaningful in the computer-knowledge field; all selected words are then ranked according to their frequencies.
Occurrence of several function words: For non-factoid questions, function words are relatively meaningful for determining whether a short text is an answer or not, as in the example below.
Question: The screen of my computer turns black after booting. I can't enter bios. What should I do now?
Answer: You may firstly unplug the hard disc. Then try to boot. If it works, then you should set your CD-ROM as the boot driver in bios and reinstall the system.
An example of function word selection
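Combining the two feature types yields one binary vector per short text. A minimal sketch follows; the vocabulary size and the function-word list (loosely drawn from the example answer above) are illustrative assumptions:

    from collections import Counter

    # Hypothetical function words for troubleshooting-style answers.
    FUNCTION_WORDS = ["should", "may", "try", "then", "firstly", "if"]

    def build_vocab(texts, top_k=2000):
        # Feature 1: rank words by frequency and keep the top_k
        # (the paper additionally filters by a professional dictionary).
        counts = Counter(w for t in texts for w in t.lower().split())
        return [w for w, _ in counts.most_common(top_k)]

    def binary_features(text, vocab):
        # Binary occurrence flags over the vocabulary, plus Feature 2:
        # occurrence flags for each function word.
        words = set(text.lower().split())
        return ([1 if w in words else 0 for w in vocab]
                + [1 if w in words else 0 for w in FUNCTION_WORDS])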
The dataset is a forum corpus provided by Carnegie Mellon University, Pittsburgh, containing naturally generated non-factoid questions and answers. We extracted 5000 human-generated QA pairs as the training set; no manual work was needed to label the best answers, since the best answer is nominated by the owner of each question. For the test set, we randomly selected 500 threads from the dataset and manually labeled the best answers in the forum threads. Three semantic-relevance computation methods frequently used by researchers to rank candidate answers serve as the baselines in this research.
Precision (P) and Mean Reciprocal Rank (MRR) are used to evaluate the performance of the DBN model against the baseline methods discussed above.
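For reference, a minimal computation of both measures (assuming each test question has a single labeled best answer that appears in its ranked candidate list; precision is taken here as precision at rank 1, which the paper does not state explicitly):

    def precision_at_1(ranked_lists, gold):
        # Fraction of questions whose top-ranked candidate is the labeled best answer.
        hits = sum(1 for ranks, g in zip(ranked_lists, gold) if ranks[0] == g)
        return hits / len(gold)

    def mean_reciprocal_rank(ranked_lists, gold):
        # Average of 1 / (1-based rank of the labeled best answer).
        return sum(1.0 / (ranks.index(g) + 1)
                   for ranks, g in zip(ranked_lists, gold)) / len(gold)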
Table 1: Relevancy measures for the baseline methods and the DBN
Table 1 lists the results obtained on the forum dataset using the baseline methods and the DBN model. The results show a significant difference when fine-tuning is applied. The improvement of the DBN model over the baselines comes from fine-tuning and from training the model to learn the semantic relationships between the words in question-answer pairs from the training set. The baseline results show that traditional semantic-relevance methods can barely achieve good results.
The experimental results indicate that the DBN model outperforms the traditional approaches. In most cases, the content words in questions are morphologically dissimilar from the words with identical meanings in the answers, which degrades the performance of both the cosine-similarity method and KL-divergence. Compared with cosine similarity, KL-divergence achieved higher precision and MRR. HowNet is unable to calculate the semantic similarity between QA pairs with high precision. Future work will improve the performance of the proposed DBN model by adopting both textual and non-textual features for modeling semantic relevance in non-factoid QA pairs.