Enabling near real-time data analytics on the data lake

Introduction: In the domain of data processing, data analysts run their ad hoc queries on the data lake. The lake serves as an interface between our analytics and production environments, preventing downstream queries from impacting upstream data ingestion pipelines. To ensure efficient data processing in the data lake, choosing appropriate storage formats is crucial. […]


The journey of building a comprehensive attribution platform

The Grab superapp offers a comprehensive array of services from ride-hailing and food delivery to financial services. This creates multifaceted user journeys, traversing homepages, product pages, checkouts, and interactions with diverse content, including advertisements and promo codes. Background: Why ads and attribution matter in our superapp. Ads are crucial for Grab in driving user […]


Meet the Winners of the 5th eBay University Machine Learning Challenge

Five university students are headed to eBay for summer internships this year after claiming top prize at the 2023 eBay University Machine Learning Challenge. The annual competition asks students to dream up innovative solutions to real-world ecommerce problems, rewarding top teams with valuable work experience while recruiting promising new talent to the company. This […]


Databricks adds new migration Brickbuilder Solutions to help customers succeed with AI

For the past two years, Databricks has collaborated with leading consulting partners to build innovative solutions for industry, migration, and data and AI use cases. Based on a foundation of proven customer deployments, Databricks Brickbuilder Solutions and Accelerators package together the experience and knowledge of our partners to help businesses unlock the full potential […]


Lauren Wilcox Named 2023 ACM Distinguished Member

Today, we’re pleased to share that Lauren Wilcox, Sr. Director of Responsible AI at eBay, was named a 2023 ACM Distinguished Member by the Association for Computing Machinery (ACM). This prestigious recognition is awarded to those who have made significant contributions to the field of computing. Wilcox was nominated based on her research contributions […]


Grab’s approach to content moderation

In the fast-paced world of on-demand delivery, maintaining safe marketplaces is a complex undertaking. Grab, a leading superapp in Southeast Asia, operates GrabFood and GrabMart, two popular marketplaces that connect consumers with a wide range of food and daily necessities. With more than 100k listings for different items updated daily by our merchants across eight different […]


Rethinking Stream Processing: Data Exploration

Introduction: In this digital age, companies collect large volumes of data that enable the tracking of business metrics and performance. Over the years, data analytics tools for data storage and processing have evolved from the days of Excel sheets and macros to more advanced MapReduce-model tools like Spark, Hadoop, and Hive. This evolution […]


Announcing Ray Autoscaling support on Databricks and Apache Spark™

Ray is an open-source unified compute framework that simplifies scaling AI and Python workloads in a distributed environment. Since we introduced support for running Ray on Databricks, we’ve witnessed numerous customers successfully deploying their machine learning use cases, which range from forecasting and deep reinforcement learning to fine-tuning LLMs. With the release of Ray […]


LLM Training and Inference with Intel Gaudi 2 AI Accelerators

At Databricks, we want to help our customers build and deploy generative AI applications on their own data without sacrificing data privacy or control. For customers who want to train a custom AI model, we help them do so easily, efficiently, and at a low cost. One lever we have to address this challenge […]


Parameterized queries with PySpark | Databricks Blog

PySpark has always provided wonderful SQL and Python APIs for querying data. As of Databricks Runtime 12.1 and Apache Spark 3.4, parameterized queries support safe and expressive ways to query data with SQL using Pythonic programming paradigms. This post explains how to make parameterized queries with PySpark and when this is a good design […]
