How We Export Billion-Scale Graphs on Transactional Graph Databases

eBay’s graph database, NuGraph, helps many of eBay’s internal teams make real-time business decisions through relationship analysis. But as the graph dataset grows, it becomes increasingly challenging to validate the quality of the graph data, check the relationship topology, and draw insights from the graph. For example, eBay’s biggest internal graph has more than 15 billion […]

Python Dependency Management in Spark Connect

Managing the environment of an application in a distributed computing environment can be challenging. Ensuring that all nodes have the necessary environment to execute code and determining the actual location of the user’s code are complex tasks. Apache Spark™ offers various methods such as Conda, venv, and PEX; see also How to Manage Python Dependencies […]

Graph modelling guidelines

Introduction: Graph modelling is a highly effective technique for representing and analysing complex and interconnected data across various domains. By deciphering relationships between entities, graph modelling can reveal insights that might otherwise be difficult to identify using traditional data modelling approaches. In this article, we will explore what graph modelling is and guide you through […]

Introducing Python User-Defined Table Functions (UDTFs)

Apache Spark™ 3.5 and Databricks Runtime 14.0 have brought an exciting feature to the table: Python user-defined table functions (UDTFs). In this blog post, we’ll dive into what UDTFs are, why they are powerful, and how you can use them. What are Python user-defined table functions (UDTFs)? A Python user-defined table function (UDTF) is a […]

Arrow-optimized Python UDFs in Apache Spark™ 3.5

In Apache Spark™, Python User-Defined Functions (UDFs) are among the most popular features. They empower users to craft custom code tailored to their unique data processing needs. However, the current Python UDFs, which rely on cloudpickle for serialization and deserialization, encounter performance bottlenecks, particularly when dealing with large data inputs and outputs. In Apache Spark […]

eBay’s first Chief AI Officer Nitzan Mekel-Bobrov Recognized in Insider’s AI 100 List

Insider recently compiled its first AI 100 list, a compilation of some of the most important, innovative and influential leaders in the world of artificial intelligence. The list includes representatives from many top-tier technology companies as well as startups, research organizations and labs. eBay’s Chief AI Officer, Nitzan Mekel-Bobrov, was included on the list of […]

eBay Exec on How Artificial Intelligence Will Bring a ‘Paradigm Shift’ to Ecommerce

Insider recently published a story analyzing AI’s role in the evolution of ecommerce, sharing insight from our own Chief AI Officer Nitzan Mekel-Bobrov. Nitzan says that a larger paradigm shift is on its way, and that our platform’s massive data scale is helping eBay take the lead in generative AI for ecommerce. Nitzan discussed the […]

Announcing MLflow 2.8 LLM-as-a-judge metrics and Best Practices for LLM Evaluation of RAG Applications, Part 2

Today we’re excited to announce that MLflow 2.8 supports our LLM-as-a-judge metrics, which can help save time and costs while providing an approximation of human-judged metrics. In our previous report, we discussed how the LLM-as-a-judge technique helped us boost efficiency, cut costs, and maintain over 80% consistency with human scores in the Databricks Documentation AI Assistant, […]

Big Book of MLOps Updated for Generative AI

Last year, we published the Big Book of MLOps, outlining guiding principles, design considerations, and reference architectures for Machine Learning Operations (MLOps). Since then, Databricks has added key features simplifying MLOps, and Generative AI has brought new requirements to MLOps platforms and processes. We are excited to announce a new version of the Big Book […]
