Grab Experiment Decision Engine – a Unified Toolkit for Experimentation

Introduction: This article introduces the GrabX Decision Engine, an internal open-source package that offers a comprehensive framework for designing and analysing experiments conducted on online experiment platforms. The package encompasses a wide range of functionalities, including a pre-experiment advisor, a post-experiment analysis toolbox, and other advanced tools. In this article, we explore the motivation behind […]

Turning observations into actionable insights for enhanced decision making

Introduction: Iris (/ˈaɪrɪs/), a name inspired by the Olympian mythological figure who personified the rainbow and served as the messenger of the gods, is a comprehensive observability platform for Extract, Transform, Load (ETL) jobs. Just as the mythological Iris connected the gods to humanity, our Iris platform bridges the gap between raw data and meaningful insights, […]

How to Protect Against Data Exfiltration with Azure Databricks to Help Ensure Cloud Security

In the previous blog, we discussed how to securely access Azure Data Services from Azure Databricks using Virtual Network Service Endpoints or Private Link. Given a baseline of those best practices, in this article we walk through detailed steps on how to harden your Azure Databricks deployment from a network security perspective in order to […]

PySpark 2023: New Features and Performance Improvement

With the releases of Apache Spark 3.4 and 3.5 in 2023, we focused heavily on improving PySpark performance, flexibility, and ease of use. This blog post walks you through the key improvements. Here’s a rundown of some of the most important features added in Apache Spark 3.4 and 3.5 in 2023: Spark Connect introduces a […]

State Reader API for Spark Structured Streaming on Databricks

Databricks Runtime 14.3 includes a new capability that allows users to access and analyze Structured Streaming’s internal state data: the State Reader API. The State Reader API sets itself apart from well-known Spark data formats such as JSON, CSV, Avro, and Protobuf. Its primary purpose is facilitating the development, debugging, and troubleshooting of stateful Structured […]

Implementing LLM Guardrails for Safe and Responsible Generative AI Deployment on Databricks

Introduction: Let’s explore a common scenario: your team is eager to leverage open source LLMs to build chatbots for customer support interactions. As the model handles customer inquiries in production, it might go unnoticed that some inputs or outputs are potentially inappropriate or unsafe. And only in the midst of an internal audit—if you […]

Announcing the General Availability of Databricks Feature Serving

Today, we are excited to announce the general availability of Feature Serving. Features play a pivotal role in AI Applications, typically requiring considerable effort to be computed accurately and made accessible with low latency. This complexity makes it harder to introduce new features to improve the quality of applications in production. With Feature Serving, you […]

eBay’s Responsible AI Principles

eBay is committed to the responsible use of AI. We see unique opportunities to develop AI-powered customer tools and services, which must be implemented safely while meeting our community’s needs. We have adopted the following key principles: 1. Inclusivity, Equity, and Fairness: eBay strives to enable equitable and fair AI experiences. Building with an inclusivity […]
