Providence Health: Scaling ML/AI Projects with Databricks Mosaic AI

Providence Health’s extensive network spans 50+ hospitals and numerous other facilities across multiple states, which makes it challenging to predict patient volume and daily census within specific departments. This information is critical to making informed decisions about short-term and long-term staffing needs, patient transfers, and general operational awareness. In the early stages of Databricks adoption, […]

Enhancing, improving and productionisation of LLM powered data governance

Introduction: In the initial article, LLM Powered Data Classification, we described how we integrated Large Language Models (LLMs) to automate governance-related metadata generation. The LLM integration enabled us to resolve challenges in Gemini, such as restrictions on the customisation of machine learning classifiers and limited resources for training a customised model. Gemini is a […]

Season’s Speedings: Databricks SQL Delivers 4x Performance Boost Over Two Years

As the season of giving approaches, we at Databricks have been making our list and checking it twice, but instead of toys and treats, we’ve been wrapping up powerful performance improvements for our users. Having analyzed billions of production queries and listened closely to our community’s wishes, we’re excited to deliver a package of enhancements that […]

Announcing the General Availability of Materialized Views and Streaming Tables for Databricks SQL

We’re excited to announce that materialized views (MVs) and streaming tables (STs) are now Generally Available in Databricks SQL on AWS and Azure. Streaming tables offer simple, incremental ingestion from sources like cloud storage and message buses with just a few lines of SQL. Materialized views precompute and incrementally update the results of queries so […]

Aimpoint Digital: Leveraging Delta Sharing for Secure and Efficient Multi-Region Model Serving in Databricks

When serving machine learning models, the latency between requesting a prediction and receiving a response is one of the most critical metrics for the end user. Latency includes the time a request takes to reach the endpoint, be processed by the model, and then return to the user. Serving models to users that are based […]

How we reduced peak memory and CPU usage of the product configuration management SDK

Introduction: GrabX is Grab’s central platform for product configuration management. It can control any component within Grab’s backend systems through configurations hosted directly on GrabX. GrabX clients read these configurations through an SDK, which reads them asynchronously and with eventual consistency. As a result, it takes […]

Announcing General Availability: Publish to Microsoft Power BI Service from Unity Catalog

We’re excited to announce the General Availability of Publish to Microsoft Power BI Service from Unity Catalog, an integration that makes it easy to create Power BI web reports from your Unity Catalog data in just a few clicks. This feature enables seamless catalog integration and data model sync, allowing you to publish datasets directly […]

Implementing Star Schema in Databricks

We are updating this blog to show developers how to leverage the latest features of Databricks and the advancements in Spark. Most data warehouse developers are very familiar with the ever-present star schema. Introduced by Ralph Kimball in the 1990s, a star schema is used to denormalize business data into dimensions (like time and product) […]

Turbocharging GPU Inference at Logically AI

Founded in 2017, Logically is a leader in using AI to augment clients’ intelligence capability. By processing and analyzing vast amounts of data from websites, social platforms, and other digital sources, Logically identifies potential risks, emerging threats, and critical narratives, organizing them into actionable insights that cybersecurity teams, product managers, and engagement leaders can act […]

LLM-assisted vector similarity search

Introduction: As the complexity of data retrieval requirements continues to grow, traditional search methods often struggle to provide relevant and accurate results, especially for nuanced or conceptual queries. Vector similarity search has emerged as a powerful technique for finding semantically similar information. It refers to finding vectors in a large dataset that are most similar […]
