THE SEMANTIC ENGINE: A BLACK BOX

5 September 2019 by Ellysse

How are semantic engines structured? What NLU tools are they composed of? Is there an infallible semantic engine? Starting from these questions, let’s explore the anatomy of semantic engines and understand how that anatomy shapes the configuration of virtual assistants.
With the resources available to us, it is possible to observe how NLP engines behave on the domains we supply. The performance evaluation phase can be used to understand how well a virtual agent has been trained and how well it classifies the intents of a given domain. It can also be a useful tool for comparing how different semantic engines work on identical domains and test sets, or how certain semantic engines behave on specific tasks with test sets built ad hoc.
In a comparative test, we measured the performance of Google Dialogflow, BUP, IBM Watson and Microsoft LUIS on three different domains (FAQs concerning car insurance, use of the electronic register, and the management of bulky waste collection). With the classification threshold set to zero, it was possible to measure the accuracy of each semantic engine on each domain.
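To make the role of the classification threshold concrete, here is a minimal sketch of how accuracy could be computed from an engine's answers. It is not the procedure used in the test: the function name `accuracy`, the tuple layout, the intent names and the confidence values are all assumptions chosen for illustration. With a threshold of zero, the engine's top-scoring intent is always accepted, so accuracy reflects only whether that intent is the expected one.

```python
def accuracy(predictions, threshold=0.0):
    """Share of test sentences classified correctly.

    predictions: list of (expected_intent, predicted_intent, confidence).
    This layout is hypothetical; real engines return richer responses.
    """
    if not predictions:
        return 0.0
    correct = 0
    for expected, predicted, confidence in predictions:
        # Below the threshold, the answer is treated as "no intent matched".
        accepted = predicted if confidence >= threshold else None
        if accepted == expected:
            correct += 1
    return correct / len(predictions)


# Made-up toy data, not results from the comparative test.
toy_results = [
    ("open_claim", "open_claim", 0.91),
    ("open_claim", "renew_policy", 0.40),
    ("renew_policy", "renew_policy", 0.35),
    ("renew_policy", "renew_policy", 0.80),
]

print(accuracy(toy_results, threshold=0.0))  # 0.75
print(accuracy(toy_results, threshold=0.5))  # 0.5
```

Raising the threshold in this toy example discards a correct but low-confidence answer, which is why a zero threshold gives a cleaner view of the classifier itself.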
After setting up a medium-sized training set and test set for each domain, we tried to formulate sentences that use synonyms of words not present in the training set, to include sentences with typos, and to deliberately insert sentences that could easily create ambiguity between multiple intents.
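As an illustration of the second point, test sentences with typos can be derived from clean ones. The sketch below assumes one simple kind of typo, swapping two adjacent letters; the helper names `swap_adjacent` and `typo_variants` are hypothetical, and the typos in the test described here may well have been written by hand.

```python
def swap_adjacent(sentence, index):
    """Return the sentence with the characters at index and index + 1 swapped."""
    if index < 0 or index + 1 >= len(sentence):
        return sentence
    chars = list(sentence)
    chars[index], chars[index + 1] = chars[index + 1], chars[index]
    return "".join(chars)


def typo_variants(sentence):
    """Build one variant per pair of adjacent, different letters."""
    variants = []
    for i in range(len(sentence) - 1):
        a, b = sentence[i], sentence[i + 1]
        # Skip spaces, punctuation and doubled letters (swapping them changes nothing).
        if a.isalpha() and b.isalpha() and a != b:
            variants.append(swap_adjacent(sentence, i))
    return variants


print(swap_adjacent("claim", 1))  # calim
```

Each variant keeps the expected intent of the original sentence, so it can be added to the test set without relabeling.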
Unsurprisingly, the domain that returns the best performance is the one with the largest training set. On average, the semantic engines that obtain the highest accuracy across the three domains are BUP and IBM Watson, although Dialogflow stands out on the third domain in particular. Microsoft LUIS, by contrast, seems to have more difficulty recognizing typing errors.
This example was conducted on fairly small test sets, but it shows how performance measurement can be an important factor not only when evaluating the chatbot itself, but also as a way to look inside our virtual assistant. Choosing the semantic engine that best meets the needs of our chatbot is one of the most delicate steps in the configuration path, hence the importance of comparing engines and evaluating their effectiveness on more specific tasks.
