Ensuring data reliability and observability in risk systems

Grab has an in-house Risk Management platform called GrabDefence which relies on ingesting large amounts of data gathered from upstream services to power our heuristic risk rules and data science models in real time.

Fig 1. GrabDefence aggregates data from different upstream services

As Grab’s business grows, so does the amount of data. It becomes imperative […]

Building Enterprise GenAI Apps with Meta Llama 3 on Databricks

We are excited to partner with Meta to release the latest state-of-the-art large language model, Meta Llama 3, on Databricks. With Llama 3 on Databricks, enterprises of all sizes can deploy this new model via a fully managed API. Meta Llama 3 sets a new standard for open language models, providing both the community and […]

Announcing General Availability of Next-Generation Lakeview Dashboards

The next generation of Databricks SQL (DBSQL) dashboards, also known as Lakeview Dashboards, is now generally available on AWS and Azure. This new dashboarding experience is optimized for ease of use, scalable and secure distribution, governance, and performance. “Lakeview dashboards have been critical to the latest product suite our team brought to market. We used […]

Use Ray on Databricks for new scalable AI applications

We released the public preview of Ray support last year, and since then hundreds of Databricks customers have been using it for a variety of use cases such as multi-model hierarchical forecasting, LLM fine-tuning, and reinforcement learning. Today, we are excited to announce the general availability of Ray support on Databricks. Ray is now included as part of […]

Accelerated DBRX Inference on Mosaic AI Model Serving

In this blog post we dive into inference with DBRX, the open state-of-the-art large language model (LLM) created by Databricks (see Introducing DBRX). We discuss how DBRX was designed from the ground up for both efficient inference and advanced model quality, we summarize how we achieved cutting-edge performance on our platform, and end with […]

Grab Experiment Decision Engine – a Unified Toolkit for Experimentation

This article introduces the GrabX Decision Engine, an internal open-source package that offers a comprehensive framework for designing and analysing experiments conducted on online experiment platforms. The package encompasses a wide range of functionalities, including a pre-experiment advisor, a post-experiment analysis toolbox, and other advanced tools. In this article, we explore the motivation behind […]

Turning observations into actionable insights for enhanced decision making

Iris (/ˈaɪrɪs/), a name inspired by the Olympian mythological figure who personified the rainbow and served as the messenger of the gods, is a comprehensive observability platform for Extract, Transform, Load (ETL) jobs. Just as the mythological Iris connected the gods to humanity, our Iris platform bridges the gap between raw data and meaningful insights, […]

How to protect Data Exfiltration with Azure Databricks to help ensure Cloud Security

In the previous blog, we discussed how to securely access Azure Data Services from Azure Databricks using Virtual Network Service Endpoints or Private Link. Given a baseline of those best practices, in this article we walk through detailed steps on how to harden your Azure Databricks deployment from a network security perspective in order to […]

PySpark 2023: New Features and Performance Improvement

With the releases of Apache Spark 3.4 and 3.5 in 2023, we focused heavily on improving PySpark performance, flexibility, and ease of use. This blog post walks you through the key improvements. Here’s a rundown of some of the most important features added in Apache Spark 3.4 and 3.5 in 2023: Spark Connect introduces a […]
