Azure Cosmos DB joins the AI toolchain

Microsoft CEO Satya Nadella described the arrival of large language models (LLMs) like GPT-4 as “a Mosaic moment,” comparable to the arrival of the first graphical web browser. Unlike that original Mosaic moment, when Microsoft was late to the browser wars and was forced to buy its first web development tooling, the company has taken a pole position in AI, rapidly rolling out AI technologies across enterprise and consumer products.

One key to understanding Microsoft is its view of itself as a platform company. That internal culture drives it to deliver tools and technologies for developers and foundations on which developers can build. For AI, that starts with the Azure OpenAI APIs and extends to tools like Prompt Engine and Semantic Kernel, which simplify the development of custom experiences on top of OpenAI’s Transformer-based neural networks.

As a result, much of this year’s Microsoft Build developer event is focused on how you can use this tooling to build your own AI-powered applications, taking cues from the “copilot” model of assistive AI tooling that Microsoft has been rolling out across its Edge browser and Bing search engine, in GitHub and its developer tooling, and for enterprises in Microsoft 365 and the Power Platform. We’re also learning where Microsoft plans to fill in gaps in its platform and make its tooling a one-stop shop for AI development.

LLMs are vector processing tools

At the heart of a large language model like OpenAI’s GPT-4 is a massive neural network that works with a vector representation of language, looking for vectors similar to those that describe its prompts and creating and refining the optimal path through a multidimensional semantic space that results in a comprehensible output. It’s similar to the approach used by search engines, but where search is about finding vectors similar to those that answer your queries, LLMs extend the set of semantic tokens that make up your prompt (and the prompt used to set the context of the LLM in use). That’s one reason why Microsoft’s first LLM products, GitHub Copilot and Bing Copilot, build on search-based services: they already use vector databases and indexes, providing context that keeps LLM responses on track.

Unfortunately for the rest of us, vector databases are relatively rare, and they are built on very different principles from familiar SQL and NoSQL databases. They’re perhaps best thought of as multidimensional extensions of graph databases, with data transformed and embedded as vectors with direction and magnitude. Vectors make finding similar data fast and accurate, but they require a very different way of working than other forms of data.
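The core operation a vector database performs is similarity search: given a query vector, find the stored vectors that point in the most similar direction. Here is a minimal sketch of that idea using cosine similarity over a hypothetical toy corpus (the document names, vectors, and four-dimensional embeddings are invented for illustration; real embedding models produce hundreds or thousands of dimensions, and real vector databases use approximate indexes rather than a brute-force scan):

```python
import numpy as np

# Hypothetical toy corpus: each document is represented by an embedding
# vector. These 4-dimensional values are made up for illustration only.
doc_vectors = {
    "refund policy": np.array([0.9, 0.1, 0.0, 0.2]),
    "shipping times": np.array([0.1, 0.8, 0.3, 0.0]),
    "return window": np.array([0.8, 0.2, 0.1, 0.3]),
}

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def nearest(query: np.ndarray, corpus: dict) -> str:
    """Brute-force scan: return the corpus entry most similar to the query.
    A vector database replaces this scan with an indexed lookup."""
    return max(corpus, key=lambda name: cosine_similarity(query, corpus[name]))

# A query vector pointing the same way as the refund document (any positive
# scalar multiple has cosine similarity 1.0 with it) lands on that document.
query = np.array([1.8, 0.2, 0.0, 0.4])
print(nearest(query, doc_vectors))  # -> refund policy
```

Note that cosine similarity ignores vector length and compares only direction, which is why it is the usual choice for comparing embeddings.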

If we’re to build our own enterprise copilots, we need our own vector databases, as they allow us to extend and refine LLMs with our domain-specific data. Maybe that data is a library of common contracts, decades’ worth of product documentation, or even all of your customer support queries and answers. If we could store that data in just the right way, it could be used to build AI-powered interfaces to your business.
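The usual pattern for this is retrieval-augmented prompting: relevant snippets of your own data are retrieved and prepended to the user’s question before it goes to the LLM, grounding the response. The sketch below shows only the prompt-assembly step; the snippet store and the crude word-overlap `retrieve()` function are hypothetical stand-ins for a real embedding-based vector-database query:

```python
# Hypothetical snippets of domain data, e.g. pulled from support documents.
SNIPPETS = [
    "Refunds are issued within 14 days of a returned item being received.",
    "Standard shipping takes 3 to 5 business days within the US.",
]

def retrieve(question: str, snippets: list, k: int = 1) -> list:
    """Toy retrieval: rank snippets by word overlap with the question.
    A real system would compare embedding vectors in a vector database."""
    words = set(question.lower().split())
    ranked = sorted(snippets,
                    key=lambda s: -len(words & set(s.lower().split())))
    return ranked[:k]

def build_prompt(question: str) -> str:
    """Assemble a grounded prompt: retrieved context first, then the question."""
    context = "\n".join(retrieve(question, SNIPPETS))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

print(build_prompt("How long do refunds take?"))
```

The key design point is that the LLM never needs to be retrained on your data; the vector database supplies fresh, domain-specific context at query time.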

Copyright © 2023 IDG Communications, Inc.
