Microsoft Unveils Vector Search in Preview and Voice Cloning in Full Release
Microsoft's Inspire conference put AI announcements front and center. Vector Search, now in preview in Azure Cognitive Search, and the voice-cloning feature Custom Neural Voice stood out. These tools promise to improve data search and deliver personalized natural-language responses.

During its annual Inspire conference, Microsoft unveiled a set of new AI-powered features intended to extend the capabilities of its Azure platform. The spotlight was on Vector Search, now available in preview through Azure Cognitive Search. Built on machine learning, Vector Search promises a faster search experience by capturing the meaning and context of unstructured data, such as images and text.
The technique behind Vector Search, vectorization, is gaining momentum in the search field. It converts words or images into series of numbers known as vectors, which encode their meaning. This numeric representation allows mathematical processing, so machines can compare and organize data. As a result, machines can recognize that related words such as 'king' and 'queen' sit close together in the 'vector space' and quickly locate them in databases containing millions of words. Vector search has been adopted by companies such as Qdrant and SeMI Technologies, as well as tech giants like Amazon and Google.
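The idea of words being "close" in vector space can be illustrated with a minimal sketch. The three-dimensional vectors below are invented for illustration (real embeddings come from a trained model and have hundreds or thousands of dimensions), and the function names are hypothetical. Cosine similarity is one common closeness measure; the article does not say which measure Microsoft uses.

```python
import math

# Toy vectors, invented for illustration only. Real embeddings are
# produced by a trained model and are much higher-dimensional.
EMBEDDINGS = {
    "king": [0.90, 0.80, 0.10],
    "queen": [0.88, 0.82, 0.15],
    "apple": [0.10, 0.20, 0.95],
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(word, embeddings=EMBEDDINGS):
    """Return the other word whose vector is most similar to `word`."""
    query = embeddings[word]
    scored = [
        (other, cosine_similarity(query, vec))
        for other, vec in embeddings.items()
        if other != word
    ]
    return max(scored, key=lambda pair: pair[1])[0]


print(nearest("king"))  # prints "queen" for these toy vectors
```

A production system would not scan every vector as this sketch does; it would rely on an index built for approximate nearest-neighbor lookup.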
To differentiate itself from competitors, Microsoft's offering includes pure vector search, hybrid retrieval, and advanced reranking. The company says the tool can be used in apps and services to deliver personalized responses in natural language, offer product recommendations, and help identify patterns in data. Other uses include building search-integrated, chat-based apps, converting images into vector representations with Azure AI Vision, and retrieving relevant information from large datasets to support process and workflow automation. Vector Search also integrates with other Azure Cognitive Search capabilities, among them faceted navigation and filters.
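Hybrid retrieval generally means merging a keyword-based result list with a vector-based one. The sketch below uses reciprocal rank fusion, one widely used merging method; the article does not specify how Microsoft combines results, so this is an illustration of the general idea rather than Microsoft's implementation. The function name, document IDs, and the constant k=60 are assumptions.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document IDs into one list.

    Each document earns 1 / (k + rank) from every list it appears in,
    so documents ranked highly by more than one retriever rise to the top.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort by descending score; break ties alphabetically for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from a keyword search and a vector search.
keyword_hits = ["doc_a", "doc_b", "doc_c"]
vector_hits = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)  # ['doc_b', 'doc_a', 'doc_d', 'doc_c']
```

Here doc_b and doc_a lead because both retrievers returned them. A reranking stage, if used, would then reorder the top of this fused list with a more expensive relevance model.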
Microsoft is also rolling out the Document Generative AI solution. It combines Microsoft's existing AI-powered document processing services, including Azure Form Recognizer, with the Azure OpenAI Service, Microsoft's fully managed, enterprise-focused offering that gives businesses access to AI technology from OpenAI. Through its ongoing commercial partnership with OpenAI, Microsoft has added controls and governance features to the technology.
Built on OpenAI's latest AI language models, the Document Generative AI solution processes files for tasks such as summarizing reports, extracting values, mining knowledge, and generating new types of documents. The processed files can also serve as grounding for responses, in the manner of OpenAI's ChatGPT. For instance, customers can upload invoices, contracts, and bills, and employees can then ask about service guarantees and specific line items. The solution returns answers as text, images, or tables, with citations that link to the original content.
Microsoft added that the solution can be used for natural-language interactions with documents and for content generation, including newsletters, blog posts, summaries, and captions. According to Microsoft, it supports intelligent document chat, writing assistance, comprehensive search, query support, document translation, and more. All of these document tasks are handled by OpenAI's models.
In a related announcement, Microsoft said that OpenAI's Whisper, an automatic speech recognition model, will soon be integrated into its family of AI speech services and the Azure OpenAI Service. Enterprise customers will be able to transcribe and translate audio content and to produce batch transcriptions at scale.
Among other announcements at Inspire, Microsoft launched a public preview of Real-time Diarization, an AI-driven speech service that identifies which of several people is speaking in real time. Microsoft also broadened access to Custom Neural Voice, an AI tool that can closely mimic an actor's voice or create original synthetic voices. Access to this feature was previously limited. Even with broader availability, Microsoft requires customers to apply and be approved before using it. Customers must also obtain the voice talent's consent and agree to a code of conduct.
Microsoft also provides watermarking and detection tools to help identify audio clips created with Custom Neural Voice. These tools alone, however, do not resolve the licensing and consent issues surrounding voice cloning technology, and Microsoft has chosen to stay out of that debate.
While tools like Vector Search and Custom Neural Voice are transforming the tech world, platforms such as AppMaster, recognised as a High Performer in No-code Development Platforms by G2, appeal to users who want to create backend, web, and mobile applications with minimal coding. In a rapidly changing tech landscape, it will be fascinating to see how AI capabilities continue to evolve.


