Question-Answer based Enterprise Data Retrieval
Introduction
Modern enterprises accumulate a substantial amount of data over time. This data may be structured, residing in databases and data warehouses, or unstructured, such as emails, office messages, memos and other documents. In day-to-day operations it is highly desirable to be able to query this data arbitrarily and get the required information as quickly as possible. However, retrieving data from different enterprise databases and text blobs is a complex task. For databases, it usually requires implementing queries that fetch data from tables, join it together and display it on a web or mobile user interface. Each such query requires an end-to-end flow: a front-end page where the user selects columns, conditions and filters via drop-down menus, and a back-end API that retrieves the data. For unstructured data such as text, the process is even more involved, since the data first has to be converted into a form suitable for storage in a database before it can be queried.
This project aims to provide a simple question-and-answer data retrieval interface that can be invoked through both text and speech. The user asks questions in natural language (English), either by typing or by speaking, and receives the relevant information either displayed in a chat-like interface or played back as speech. The solution uses Speech-to-Text, Natural Language Processing (NLP), Knowledge Graphs (KG) and Text-to-Speech technologies. It supports several types of queries, including simple database item retrieval, queries with multiple filters and conditions, listings, and several commonly used aggregations. As an additional feature, it also extracts questions from incoming emails, retrieves their answers from past replies to automatically generate email responses, and presents those responses to a user for approval before they are sent back to the sender.
Customer Story
Our customer is a supplier of grocery items to several grocery chains throughout the United States. It books orders from grocery stores, contacts suppliers and vendors for specific items, orders items in the required quantities, and then ensures smooth order delivery while taking a markup as an agent; its business is thus that of a middleman between large grocery store chains and vendors. The customer approached us through one of our Google Cloud consulting partners and asked for a solution that would let its sales representatives retrieve information about orders, suppliers, inventory items, current prices, quantities ordered and so on as quickly as possible, with filtering conditions such as items whose price exceeds a given value. They wanted the interface to be primarily voice based for convenience, with text input as a fallback.
They also had more than three years of email data containing specific questions and the answers given in responses. They wanted to know whether the answers to questions asked frequently in past emails could be retrieved automatically to generate draft responses, presented to an email-handling user such as an office secretary for approval, and then sent back to their customer. This would save a significant amount of time: according to the customer, a secretary spent several hours each day sifting through roughly 100 emails to answer them based on information that had already been used in previous answers many times before.
Our Solution
We designed, developed and deployed an end-to-end solution to this problem on Google Cloud Platform (GCP). The customer's web application, with its back-end database, was already running on GCP. We added our question-and-answer application as an add-on feature, implemented as a microservice that exposes its API to the main application.
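To make the integration point concrete, the sketch below shows what a minimal question-and-answer endpoint of such a microservice could look like. FastAPI, the route name, the request fields and the answer_question() helper are illustrative assumptions, not the production interface.

```python
# Minimal sketch of the Q&A microservice boundary (FastAPI assumed for illustration).
# The route name, request schema, and answer_question() helper are hypothetical.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class Question(BaseModel):
    text: str      # question as plain text (already transcribed if it arrived as speech)
    user_id: str   # sales representative asking the question

class Answer(BaseModel):
    text: str      # textual answer; the client may convert it to speech

def answer_question(text: str) -> str:
    # Placeholder: the real implementation runs intent classification, NER and a
    # Cypher query against the knowledge graph (see the later sketches).
    return f"Received question: {text}"

@app.post("/qa/ask", response_model=Answer)
def ask(question: Question) -> Answer:
    return Answer(text=answer_question(question.text))
```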
We first analyzed the relevant data tables in the customer's database. Based on that analysis, we built a Knowledge Graph capturing the relationships between entities such as orders, items, vendors, dates, quantities and prices.
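The graph itself can be populated directly from the relational tables. The sketch below shows one plausible way to load order rows into Neo4j; the node labels, relationship types, property names and the sample row are illustrative assumptions rather than the customer's actual schema.

```python
# Sketch of loading order rows from the relational database into the knowledge graph.
# Node labels, relationship types and property names are illustrative assumptions.
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

LOAD_ORDER = """
MERGE (v:Vendor {name: $vendor})
MERGE (i:Item   {name: $item})
MERGE (o:Order  {id: $order_id})
SET   o.date = date($order_date)
MERGE (o)-[:CONTAINS {quantity: $quantity, unit_price: $price}]->(i)
MERGE (i)-[:SUPPLIED_BY]->(v)
"""

def load_order(tx, row):
    tx.run(LOAD_ORDER, **row)

# Placeholder row; in practice these come from the customer's order tables.
rows = [
    {"order_id": "O-1001", "order_date": "2023-05-02", "vendor": "Acme Foods",
     "item": "Olive Oil 1L", "quantity": 240, "price": 7.85},
]

with driver.session() as session:
    for row in rows:
        session.execute_write(load_order, row)
```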
The user asks questions via speech input on a web page or by typing into a text box. For speech input, the captured audio is first converted to text using a Speech-to-Text model. The resulting text is passed to an NLP Intent Classification model that classifies the question into one of several predefined categories. For each category, or intent, we built a query structure that traverses the KG and retrieves the requested information; Cypher was used as the query language, with Neo4j as the graph database. We also created placeholders in these query structures for the named entities extracted from the question.
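A minimal sketch of this front end is shown below, with Whisper (one of the frameworks we used) assumed for transcription and two illustrative intents. The model size, intent names, graph schema and audio file path are assumptions for illustration.

```python
# Sketch of the speech front end and per-intent Cypher templates.
# Model size, intent names and the graph schema are illustrative assumptions.
import whisper

stt_model = whisper.load_model("base")                       # openai-whisper
transcript = stt_model.transcribe("question.wav")["text"]    # speech -> text

# One parameterised Cypher query per intent; the $-placeholders are filled later
# with entities extracted by the NER model.
CYPHER_TEMPLATES = {
    "item_price": """
        MATCH (i:Item {name: $item})
        RETURN i.name AS item, i.current_price AS price
    """,
    "orders_by_vendor": """
        MATCH (o:Order)-[:CONTAINS]->(:Item)-[:SUPPLIED_BY]->(v:Vendor {name: $vendor})
        RETURN o.id AS order_id, o.date AS order_date
        ORDER BY o.date DESC
    """,
}
```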
Alongside this classification model, we also trained a Named Entity Recognition (NER) model, using the data available in the customer's database as the dataset and labeling it according to the database columns for the different entities such as customer, vendor, item and sales representative.
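One simple way to derive such training data is to slot database values into question templates and emit BIO tags, roughly as sketched below. The entity labels and template sentences are illustrative assumptions.

```python
# Sketch of turning database column values into weakly labeled NER training examples
# (BIO tags). Entity labels and the template sentence are illustrative assumptions.
TEMPLATES = [
    ("what is the price of {item} from {vendor}",
     {"item": "ITEM", "vendor": "VENDOR"}),
]

def make_example(template, slots, values):
    tokens, tags = [], []
    for word in template.split():
        if word.startswith("{") and word.endswith("}"):
            slot = word[1:-1]
            entity_words = values[slot].split()
            tokens.extend(entity_words)
            tags.extend([f"B-{slots[slot]}"] + [f"I-{slots[slot]}"] * (len(entity_words) - 1))
        else:
            tokens.append(word)
            tags.append("O")
    return {"tokens": tokens, "ner_tags": tags}

# Values would be pulled from the item and vendor tables; placeholders here.
example = make_example(*TEMPLATES[0], {"item": "olive oil", "vendor": "Acme Foods"})
# {'tokens': ['what', ..., 'olive', 'oil', 'from', 'Acme', 'Foods'],
#  'ner_tags': ['O', ..., 'B-ITEM', 'I-ITEM', 'O', 'B-VENDOR', 'I-VENDOR']}
```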
With these two models trained, constructing the relevant query from an incoming question is straightforward: the question text is passed first through the intent classifier and then through the NER model, and the extracted entities are inserted into the query structure for that specific intent.
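Putting the pieces together, the flow can be sketched as below, building on the CYPHER_TEMPLATES dictionary from the earlier sketch. The local model paths and label names are illustrative assumptions; the Hugging Face pipeline API and the Neo4j Python driver are used as published.

```python
# End-to-end sketch: classify the intent, extract entities, fill the Cypher template,
# and run it against Neo4j. Model paths and label names are illustrative assumptions;
# CYPHER_TEMPLATES is the dictionary defined in the earlier sketch.
from transformers import pipeline
from neo4j import GraphDatabase

intent_clf = pipeline("text-classification", model="./intent-classifier")  # fine-tuned locally
ner = pipeline("token-classification", model="./ner-model",
               aggregation_strategy="simple")

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

def answer(question: str):
    intent = intent_clf(question)[0]["label"]                      # e.g. "item_price"
    params = {e["entity_group"].lower(): e["word"] for e in ner(question)}
    cypher = CYPHER_TEMPLATES[intent]
    with driver.session() as session:
        return [record.data() for record in session.run(cypher, **params)]

print(answer("what is the current price of olive oil"))
```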
The answers are returned to the client service, which converts the textual answer to speech using a Text-to-Speech model, or simply displays it as text if the original question was typed.
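A minimal sketch of the playback path with NVIDIA NeMo is shown below. The FastPitch and HiFi-GAN checkpoint names follow NVIDIA's published pretrained models; the answer text, output path and sampling rate are assumptions.

```python
# Sketch of the answer playback path with NVIDIA NeMo TTS (FastPitch + HiFi-GAN).
# Checkpoint names follow NVIDIA's published pretrained models; the answer text,
# output path and sampling rate are assumptions.
import soundfile as sf
from nemo.collections.tts.models import FastPitchModel, HifiGanModel

spec_generator = FastPitchModel.from_pretrained("nvidia/tts_en_fastpitch")
vocoder = HifiGanModel.from_pretrained("nvidia/tts_hifigan")

answer_text = "The current price of olive oil is 7 dollars 85 cents per litre."
tokens = spec_generator.parse(answer_text)
spectrogram = spec_generator.generate_spectrogram(tokens=tokens)
audio = vocoder.convert_spectrogram_to_audio(spec=spectrogram)

sf.write("answer.wav", audio.to("cpu").detach().numpy()[0], samplerate=22050)
```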
For email question answering, we fine-tuned a BERT-based Transformer model on two years' worth of email data to generate answers from email body text and attachments.
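For illustration, the sketch below shows how a fine-tuned extractive QA model could draft an answer to an incoming question from the text of a past reply. The model path, email snippets and question are illustrative assumptions.

```python
# Sketch of extractive question answering over past email threads with a fine-tuned
# BERT-style model. The model path, email text and question are illustrative assumptions.
from transformers import pipeline

email_qa = pipeline("question-answering", model="./email-qa-model")

incoming_question = "What is your lead time on frozen goods?"
past_reply = (
    "Thanks for your order. Please note that frozen goods ship within 5 business days "
    "of confirmation, while dry goods usually ship within 2 business days."
)

draft = email_qa(question=incoming_question, context=past_reply)
# draft["answer"] holds the extracted span (e.g. "within 5 business days");
# a secretary approves the drafted reply before it is sent back to the customer.
print(draft["answer"], draft["score"])
```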
Key Technologies Used
- Knowledge Graphs and Ontologies
- Speech-to-Text
- Cypher (query language for graphs)
- Intent Classification model to identify the category of the question
- NER model to extract entities from the question
Key Frameworks
- Neo4j (Graph Database)
- Whisper (Speech-to-Text)
- Huggingface Transformers (Intent Classification, NER and email question/answering models)
- NeMo (NVIDIA Text-to-Speech)
Customer Benefits
This project saves a significant amount of development time for the ad hoc queries required by customer representatives working in the field as sales agents. Representatives can get valuable information simply by typing a short text query or speaking into a voice interface, obtaining the information they need for their operations and decision making. Over fifty types of questions are supported, with responses ranging from simple entity names and particulars to lists of item records with the relevant fields. Had these queries been implemented through conventional user interfaces, they would have required a significant amount of ongoing development effort by the web application engineering team.