Elevating Enterprise Search: Leveraging Azure AI Search and OpenAI for Next-Gen Information Retrieval
Imagine this: You are in a management meeting, and the CEO is asking for an update on recent app usage statistics across cohorts compared to the timing of recent app feature releases. Instead of digging through reports to find an answer (which you might have to do by taking the question “offline”), you enter a question into an enterprise chatbot. Within seconds, you receive a detailed answer, pulling in data from multiple sources, allowing you to further the discussion.
Or imagine you are a doctor at a large hospital treating a patient with abnormal symptoms. Instead of spending time searching through medical files for similar cases, or through journals discussing these symptoms, you "consult" an intelligent medical bot: you describe the medical issue in detail, and it comes back with a series of diagnostic tests to aid you in forming a differential diagnosis, prior patient cases with similarities, and contact information for a physician who has worked on similar cases, so you can confer with them.
This is the promise of a next-gen search experience with Azure AI Search and OpenAI.
The Promise of AI Search
By now, you have tested ChatGPT and other large language models and have been impressed with their capabilities. You are also looking into ways that generative AI can be used to address complex business problems and user queries.
Generative AI is impressive in the way it allows users to pose questions and requests in natural language and receive responses and answers in natural language. It's leaps and bounds beyond the typical keyword-based search experience most users are accustomed to today.
According to recent studies, the average employee wastes about two hours per day, roughly a quarter of the workweek, just searching for information. And that's because traditional keyword searches need to match exactly what the user is searching for. If you search for files related to trade show handouts, the phrases you use in your search matter. For example, if you search for the term "handouts" but the file you are looking for is called "brochure" or "hand out", your search query fails. You might try other search techniques to find the file, or you might give up and just create a new one from scratch.
Generative search is different because it builds on a technique called vector search. The core concept is that words and data are converted into numerical vectors (embeddings), with similar concepts and data having values that are close together and dissimilar concepts having values that are further apart.
Without getting too complex, imagine going to a library and looking for a book on medieval warfare. You might "chat" with a librarian who uses the Dewey Decimal system to provide a list of books closely related to the topic you are researching, while avoiding topics that are "further away." Now imagine that this library has every book ever written and a vector-based organization system that can find related concepts across every book, phrase, and word, and you get a sense of the scale of knowledge at play.
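To make the idea concrete, here is a minimal sketch of vector similarity using toy three-dimensional "embeddings". The vectors and words are invented for illustration; real systems use embedding models that produce vectors with hundreds or thousands of dimensions.

```python
import math

# Toy "embeddings" -- hand-made for illustration only.
embeddings = {
    "handout":  [0.9, 0.1, 0.0],
    "brochure": [0.8, 0.2, 0.1],   # similar concept, nearby vector
    "invoice":  [0.1, 0.9, 0.3],   # unrelated concept, distant vector
}

def cosine_similarity(a, b):
    """Similarity of two vectors: 1.0 = same direction, near 0 = unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Rank every term by how close it is to the "handout" query vector.
query = embeddings["handout"]
ranked = sorted(embeddings,
                key=lambda k: cosine_similarity(query, embeddings[k]),
                reverse=True)
print(ranked)  # "brochure" ranks above "invoice"
```

This is why a search for "handouts" can still surface a file named "brochure": the ranking is based on closeness in vector space, not on matching keywords.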
LLMs Alone Are Not the Answer
On their own, large language models like OpenAI's GPT series are not the full answer to building powerful enterprise search experiences. Most models are trained on billions of data points, which takes time, so they have fixed training cutoff dates (October 2023 for many current OpenAI models). They are good for questions and answers based on generally and publicly available knowledge, but they fall short when you want responses to include organization-specific, relevant, up-to-date information for users or employees.
Retrieving Specialized Knowledge with Retrieval Augmented Generation
Retrieval Augmented Generation (RAG) is a technique that enhances AI-generated responses by combining a language model’s capabilities with real-time information retrieved from additional sources.
A generative AI query is like asking a hotel concierge for their opinion about the best restaurants in town. They will use their past knowledge, dining experience, and feedback from other guests to give their recommendations. A RAG query is similar, but the concierge also takes the time to look up recent restaurant reviews, checks the city food inspector's database for any violations, and checks the restaurants' online reservation systems for availability before making their final recommendations.
Combining large language models with a Retrieval Augmented Generation system allows for incredibly powerful search and information retrieval experiences that can support use cases like highly intelligent customer support bots, legal research, medical diagnosis support, e-commerce recommendation systems, business intelligence, financial advisory, and more.
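The RAG pattern can be sketched in a few lines: retrieve the most relevant documents, fold them into the prompt, and let the model answer from that grounded context. Everything here is a simplified stand-in: `retrieve` scores by word overlap where a real system would use vector search, and `generate` fakes the LLM call rather than contacting a real model.

```python
# Toy document store -- invented content for illustration.
DOCUMENTS = [
    "Q3 app usage grew 12% among the 18-24 cohort after the v2.1 release.",
    "The v2.1 feature release shipped on July 15 with a redesigned dashboard.",
    "Cafeteria menu for the week of July 15.",
]

def retrieve(query, docs, top_k=2):
    """Score documents by word overlap with the query (stand-in for vector search)."""
    q_words = set(query.lower().split())
    scored = sorted(docs,
                    key=lambda d: len(q_words & set(d.lower().split())),
                    reverse=True)
    return scored[:top_k]

def generate(prompt):
    """Stand-in for an LLM call; a real system would send `prompt` to the model."""
    return f"[model answer grounded in]: {prompt}"

def rag_answer(query):
    # Retrieval step: fetch fresh, organization-specific context.
    context = "\n".join(retrieve(query, DOCUMENTS))
    # Augmentation step: instruct the model to answer only from that context.
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
    # Generation step.
    return generate(prompt)

print(rag_answer("How did app usage change after the v2.1 release?"))
```

The key point is that the model never has to "remember" the usage statistics; they are looked up at query time and handed to it, which is what keeps answers current past any training cutoff.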
Azure AI Search and OpenAI Integration Make These Scenarios Possible
Azure AI Search is a powerful enterprise intelligence platform that lets organizations pull together various information sources and expose them to employees and users in a secure manner for search. By integrating OpenAI with it, enterprises can build natural language experiences like chatbots that understand users' requests and then use internal knowledge to provide more up-to-date, relevant, and accurate information. There are many advantages to using this combination.
- Enterprise Adoption and Trust of Azure: Microsoft Azure is already used by many organizations as the backbone of their data infrastructure, providing a ready-to-integrate platform to work from.
- Market Acceptance of OpenAI: OpenAI's models are at this point among the most widely adopted and trusted generative AI models, and OpenAI's close relationship with Microsoft provides more resources to ensure interoperability.
- Safety and Compliance: OpenAI provides APIs that prevent sensitive data from being “ingested” into its training models, and Azure provides a strong security structure to safeguard data from being exposed to users who do not have access rights to it.
Overall Benefits of Integrating Azure AI Search and OpenAI:
- Enhanced Accuracy and Up-to-Date Information: Users can get access to information that is current, bypassing the limitations of knowledge cutoff dates. It also reduces AI "hallucinations": because answers are grounded in retrieved documents, the system is less likely to make up information when its training data has no answer.
- Improved Contextual Relevance: The information users find will be more relevant to their specific questions, and because the integrated datasets are closer to their role or need, search queries can provide the answers they need.
- Enhanced User Experience: Users will be able to change their search behavior as they become more familiar with using a chat interface to find information. Instead of "hunting" through specialized data sources with specific keywords, they will ask straightforward questions and get direct answers in one single interface.
- Personalization: With proper data classification, setup, and training, results can be personalized to users in a way that keeps their queries private and secure. For example, an employee could ask the chat interface how their health situation is affected by recent employer healthcare plan changes, without the chatbot bringing other employees' confidential data into the query.
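The personalization point above rests on security trimming: every document carries an access list, and the search layer filters results by the caller's groups before anything reaches the language model. Azure AI Search supports this with filter expressions on the index; the in-memory filter below is a simplified, invented stand-in to show the shape of the idea.

```python
# Toy index: each document records which groups may see it.
DOCS = [
    {"id": 1, "text": "Company healthcare plan changes for 2025",
     "allowed_groups": {"all-employees"}},
    {"id": 2, "text": "Jane Doe's confidential benefits claim",
     "allowed_groups": {"hr-admins"}},
]

def search(query, user_groups):
    """Return matching documents, but only those the caller may see."""
    # Security trimming happens first, before any relevance matching.
    visible = [d for d in DOCS if d["allowed_groups"] & user_groups]
    return [d for d in visible
            if any(w in d["text"].lower() for w in query.lower().split())]

# A regular employee's search never surfaces the confidential claim.
print([d["id"] for d in search("healthcare plan", {"all-employees"})])
```

Because the trimming happens at retrieval time, the chatbot's grounded prompt simply never contains documents the user lacks rights to, so the model cannot leak them.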
AI Search Readiness Engagement: A Step-by-Step Plan
Our approach to AI Search Readiness is like our guidance for Copilot Readiness, with the following stages recommended:
- Consolidate: Ensure all the data you want to expose to search is available to Azure AI Search or available to be accessed through integrations.
- Clean: Audit, organize, and remove outdated content. This is a good time to review unstructured files and data and remove outdated material (e.g., handouts from over 10 years ago).
- Secure: Ensure that only those team members who need to search for data are granted access, and that their access is limited to the areas required by their role.
- Label: Implement data classification labels so that specific data types are blocked from specific users or eliminated from search indexing altogether.
- Train: Train users on how to get the best results from your new search approaches. They are used to keyword-based search, but they can be much more effective with natural-language queries and questions.
- Governance: Devise policies and procedures to ensure ongoing compliance with security policies.
Conclusion
Integrating Azure AI Search with OpenAI's generative AI capabilities marks a transformative step in enterprise information retrieval. This powerful combination addresses the limitations of traditional keyword-based searches and standalone large language models by providing up-to-date, contextually relevant information through natural language interactions. Retrieval Augmented Generation (RAG) enhances the AI's ability to deliver precise answers by accessing real-time data, reducing inefficiencies like the average two hours per day employees spend searching for information.
If you would like to learn more about the potential of Azure AI Search and OpenAI for your organization, watch our on-demand webinar "Supercharge your Search". We discuss these concepts, along with a real-world example from the City of Windsor's website.