Online Discussion Forums (ODFs) have become increasingly popular among virtual communities. Modeling the semantic relevance between questions and answers is essential for detecting precise answers in ODFs. Traditional methods of modeling semantic relevance suffer from sparse word features because of the limited sentence length of short texts. This research proposes a deep belief network model for modeling the semantic relevance between non-factoid question-answer pairs in the form of short texts. The model is capable of improving its performance in detecting the textual similarity between question-answer pairs without any hand-annotation work.
The experimental results show that the proposed deep belief network model outperforms the traditional approaches on the forum corpora.
The question answering (QA) problem has attracted much attention in the natural language processing (NLP) field. QA can be divided into factoid QA and non-factoid QA (Nam and Claudia, 2018). Most QA research focuses on detecting the exact answer to a given factoid question.
Hence there is a lack of automatic QA systems that answer non-factoid questions precisely. For resolving non-factoid questions, user-generated question-answer pairs are of great significance (Nam and Claudia, 2018). These natural QA pairs are typically created during communication via social media, online forums, community-based question answering (cQA) websites, and so on.
An ODF is an e-learning platform that gives students the ability to post questions and answers in discussion threads (Alabo and Emmah, 2014). According to Baoxun et al. (2010), the data in ODFs have two noticeable features.
Semantic similarity is a metric used to measure the extent to which two concepts are similar in meaning.
Approaches to calculating semantic similarity can be classified into corpus-based models and knowledge-based models.
Traditional methods of modeling semantic relevance suffer from sparse word features because of the limited sentence length of short texts in ODFs. Textual features and word co-occurrence features commonly used in factoid answer-quality prediction are irrelevant to discussion forums (Ze et al., 2015). Hence there is a need for an effective methodology to model semantic relevance in non-factoid question-answer pairs while overcoming these problems.
Since user-generated non-factoid questions and their answers in ODFs are usually short texts, this research considers a method of modeling the semantic similarity between two short texts. The objective is to present a deep belief network (DBN) that models the semantic relevance between question-answer pairs.
The significance of naturally generated question-answer pairs was not familiar to many researchers until recent years. Most research in the QA domain has focused on three main tasks: first, ranking answer quality (Jeon et al., 2006); second, finding available similar questions when a new question is given (Jeon et al., 2006); and third, finding the best answer providers in a given community network (Liu et al., 2005). Jeon et al. (2006) presented a framework for calculating the quality of answers in ODFs. A translation-based approach to ranking candidate answers was developed by Bernhard and Gurevych (2009). The DBN-based approach presented in this paper is partly similar to the methodology proposed by Surdeanu et al. (2008). Both Ding et al. (2008) and Cong et al. (2008) proposed methodologies for detecting question and answer contexts using ranking algorithms and conditional random fields. Much recent research uses social features to improve performance in semantic modeling (Jeon et al., 2006; Hong et al., 2009). The lexical gap between questions and answers is one of the central problems, and statistical machine translation (SMT) approaches have become a common solution for filling it.
A Deep Belief Network (DBN) is a generative graphical model composed of multiple layers of hidden variables, with connections between the layers. By reconstructing a large number of questions from their candidate answers, the DBN models the semantic relationship between questions and answers while learning a high-level hidden semantic representation from the answers.
The Restricted Boltzmann Machine (RBM), the building block of a DBN, is a neural network consisting of two layers that can model a group of binary vectors. Dimensionality-reduction approaches based on RBMs show good performance, and the DBN model is able to capture the hidden semantic information inside word-count vectors.
Figure 1 shows the basic structure of an RBM. The bottom layer represents the visible (input) vector v, while the top layer represents the hidden features h, which act as feature detectors. The weight matrix W contains the connection weights between the visible units and the hidden units. Given an input vector v, the trained RBM produces hidden features h, which can be used to reconstruct v with minimal error.
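To make this structure concrete, the following minimal sketch (Python with NumPy; the class name, initialization, and layer sizes are illustrative assumptions, not from the paper) shows how a weight matrix W maps a visible vector v to hidden features h and back to a reconstruction:

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    class RBM:
        def __init__(self, n_visible, n_hidden, seed=0):
            rng = np.random.default_rng(seed)
            # Small random weights; b_vis and b_hid are the visible/hidden biases.
            self.W = 0.01 * rng.standard_normal((n_visible, n_hidden))
            self.b_vis = np.zeros(n_visible)
            self.b_hid = np.zeros(n_hidden)

        def hidden_probs(self, v):
            # Probability that each hidden feature detector turns on given v.
            return sigmoid(self.b_hid + v @ self.W)

        def reconstruct(self, h):
            # Bernoulli rates for rebuilding the visible vector from h.
            return sigmoid(self.b_vis + h @ self.W.T)

A trained RBM of this form reconstructs its input with minimal error, i.e. rbm.reconstruct(rbm.hidden_probs(v)) should be close to v.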
The proposed DBN is trained by reconstructing the question from its answers, since an answer has a strong semantic link with its question and contains more information than the question. The main objective of training is to minimize the reconstruction error.
The DBN with RBM for QA Pairs
Figure 2 shows the design of the DBN model. The model consists of three layers, each of which is an RBM. First, binary feature vectors are created based on word-occurrence measurements in the visible answer vector. These are used in the bottom layer to compute the hidden features in the hidden units of the RBM. The question vector is then reconstructed from this set of hidden features. This process can be modeled as follows:
P(h_j = 1 \mid a) = \sigma\big(b_j + \sum_i w_{ij} a_i\big)    (1)
P(q_i = 1 \mid h) = \sigma\big(b_i + \sum_j w_{ij} h_j\big)    (2)
where a is the visible feature vector of the answer and h is the hidden feature vector used to reconstruct the question; h_j is the j-th element of the hidden feature vector and q_i is the i-th element of the question vector; w_{ij} is the symmetric interaction weight between word i and hidden feature j; b_i denotes the bias of the model for word i, and b_j denotes the bias of hidden feature j; \sigma is the logistic sigmoid function.
The bottom layer generates the hidden features using Equation (1) for the input answer vectors in the training set. With these hidden features, Equation (2) is used to recreate the Bernoulli rates for each word in the question vectors. Equation (1) is then executed once again to activate the hidden features. The training steps of the middle and top layers are similar to those of the bottom layer. During pre-training, the lowest layer is greedily pre-trained for 200 passes through the entire training set, and each of the remaining two layers is greedily pre-trained for 50 passes. The weights of the DBN are then fine-tuned by applying the cross-entropy error function and gradient-descent optimization. After inputting the vectors of the question and its candidate answers, the model performs a level-by-level calculation to obtain the corresponding feature vectors. It then computes the distance between the mapped question vector and each candidate answer vector; the candidate answer with the smallest distance is selected as the best answer for the given question.
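The ranking step can be sketched as follows (an illustration only: the paper says "distance" without naming the metric, so Euclidean distance is an assumption here, and each trained layer is represented as a (W, b_hid) pair):

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def map_through_layers(vec, layers):
        # Level-by-level calculation through the stacked RBM layers.
        for W, b_hid in layers:
            vec = sigmoid(b_hid + vec @ W)
        return vec

    def best_answer(question_vec, candidate_vecs, layers):
        # Map the question and each candidate into the learned feature space,
        # then pick the candidate whose mapped vector is nearest the question's.
        q = map_through_layers(question_vec, layers)
        distances = [np.linalg.norm(q - map_through_layers(a, layers))
                     for a in candidate_vecs]
        return int(np.argmin(distances))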
Due to feature sparsity, it is essential to reduce the dimensionality of the feature vectors. As a solution, two types of binary word features are implemented.
Word frequency: Words in the corpora are chosen based on a professional dictionary to confirm that they are meaningful in the computer-knowledge field; all selected words are then ranked according to their frequencies.
Occurrence of several function words: For non-factoid questions, function words are relatively meaningful for determining whether a short text is an answer or not, as in the example below.
Question: The screen of my computer turns black after booting. I can't enter bios. What should I do now?
Answer: You may firstly unplug the hard disc. Then try to boot. If it works, then you should set your CD-ROM as the boot driver in bios and reinstall the system.
An example of function word selection
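Combining the two feature types yields one binary vector per short text. A minimal sketch follows; the vocabulary size and the function-word list (loosely drawn from the example answer above) are illustrative assumptions:

    from collections import Counter

    # Hypothetical function words for troubleshooting-style answers.
    FUNCTION_WORDS = ["should", "may", "try", "then", "firstly", "if"]

    def build_vocab(texts, top_k=2000):
        # Feature 1: rank words by frequency and keep the top_k
        # (the paper additionally filters by a professional dictionary).
        counts = Counter(w for t in texts for w in t.lower().split())
        return [w for w, _ in counts.most_common(top_k)]

    def binary_features(text, vocab):
        # Binary occurrence flags over the vocabulary, plus Feature 2:
        # occurrence flags for each function word.
        words = set(text.lower().split())
        return ([1 if w in words else 0 for w in vocab]
                + [1 if w in words else 0 for w in FUNCTION_WORDS])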
The dataset is a forum corpus provided by Carnegie Mellon University, Pittsburgh, containing naturally generated non-factoid questions and answers. We extracted 5000 human-generated QA pairs as the training set; no manual work was needed to label the best answers, since the best answer is nominated by the owner of each question. For the test set, we randomly selected 500 threads from the dataset and manually labeled the best answers in the forum threads. Three semantic-relevance computation methods frequently used by researchers to rank candidate answers serve as the baselines in this research.
Precision (P) and Mean Reciprocal Rank (MRR) are used to evaluate the performance of the DBN model against the baseline methods discussed above.
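For reference, a minimal computation of both measures (assuming each test question has a single labeled best answer that appears in its ranked candidate list; precision is taken here as precision at rank 1, which the paper does not state explicitly):

    def precision_at_1(ranked_lists, gold):
        # Fraction of questions whose top-ranked candidate is the labeled best answer.
        hits = sum(1 for ranks, g in zip(ranked_lists, gold) if ranks[0] == g)
        return hits / len(gold)

    def mean_reciprocal_rank(ranked_lists, gold):
        # Average of 1 / (1-based rank of the labeled best answer).
        return sum(1.0 / (ranks.index(g) + 1)
                   for ranks, g in zip(ranked_lists, gold)) / len(gold)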
Table 1: Relevancy measures for the baseline methods and the DBN
Table 1 lists the results obtained on the forum dataset using the baseline methods and the DBN model. The results show a significant difference when fine-tuning is applied. The improvement of the DBN model over the baselines comes from fine-tuning and from training the model to learn the semantic relationships between the words in question-answer pairs from the training set. The baseline results show that traditional semantic-relevance methods can barely achieve good results.
The experimental results indicate that the DBN model outperforms the traditional approaches. In most cases, the content words in questions are morphologically dissimilar from the words with identical meanings in the answers, which degrades the performance of both the cosine-similarity method and KL-divergence. Compared with cosine similarity, KL-divergence achieved higher precision and MRR. HowNet is unable to calculate the semantic similarity between QA pairs with high precision. Future work will improve the performance of the proposed DBN model by adopting both textual and non-textual features for modeling semantic relevance in non-factoid QA pairs.