An Upgraded Machine Learning Monitoring System

Background: In 2019, eBay started an initiative to upgrade the monitoring platform to handle increased monitoring signals. We decided to make these upgrades in order to cope with the vast number of queries our system encounters, which in turn revealed several engineering challenges to be overcome. The new platform, Sherlock.io, aligns with the Prometheus tech […]

Multi-Objective Ranking for Promoted Auction Items

Background: Last year, eBay Ads launched a new campaign type, Promoted Listings Express (PLX), which lets eBay sellers boost visibility for their auction-style listings with just a few clicks and a single, flat, upfront fee. Over the past year, our research team worked to optimize how we merchandise these promoted auction items. The way in […]

New Buyer Groups Tool Brings Personalized Marketing to eBay Sellers

eBay attracts buyers for its wide selection and value, and for the global connection provided by the comprehensive marketplace. That connection between the buyer and seller can be hugely beneficial for engagement, but buyers would prefer to not be bombarded with unorganized, unwanted communication. Sellers want an easy way to engage with their buyers, which […]

Increase A/B Testing Power by Combining Experiments

Say you’ve had an experiment that produced some surprising results, so you replicated it with a new experiment. Or say you’ve got a number of separate experiments for multiple channels, yielding different reports from the same hypothesis. In the past, this would have potentially provided under-powered experiment reports without sound evidence. But there’s a more […]

Why and How eBay Pivoted to OpenTelemetry

Introduction: Observability provides the eyes and ears to any organization. A major benefit of observability is preventing the loss of revenue by efficiently surfacing ongoing issues in critical workflows that could potentially impact customer experience. The observability landscape is an ever-changing one, and recent developments in the OpenTelemetry world forced us to rethink our […]

Building Patient Cohorts with NLP and Knowledge Graphs

Check out the solution accelerator to download the notebooks referred to throughout this blog. Cohort building is an essential part of patient analytics. Defining which patients belong to a cohort, testing the sensitivity of various inclusion and exclusion criteria on sample size, building a control cohort with propensity score matching techniques: these are just some of the processes […]

Databricks State Rebalancing Structured Streaming Enhancement Preview

In light of the accelerated growth and adoption of Apache Spark Structured Streaming, Databricks announced Project Lightspeed at Data + AI Summit 2022. Among the items outlined in the announcement was a goal of improving latency in Structured Streaming workloads. In this post we are excited to go deeper into just one of the ways […]

How to Profile PySpark

In Apache Spark™, declarative Python APIs are supported for big data workloads. They are powerful enough to handle most common use cases. Furthermore, PySpark UDFs offer more flexibility since they enable users to run arbitrary Python code on top of the Apache Spark™ engine. Users only have to state “what to do”; PySpark, as a […]

Admin Isolation on Shared Clusters

This blog was co-authored by David Meyer, SVP Product Management at Databricks, and Joosua Santasalo, a security researcher with Secureworks. At Databricks, we know the security of the data processed in our platform is essential to our customers. Our Security & Trust Center chronicles investments in internal policies and processes (like vulnerability management and […]

Improved Performance and Value With Databricks Photon and Azure Lasv3 Instances Using AMD 3rd Gen EPYC™ 7763v Processors

Databricks has partnered with AMD to support a new chip that lets you run your queries faster, saving you time and money. Combining the latest technologies from Azure Databricks and AMD, users can now take advantage of the new Lasv3-series VMs with the Databricks Runtimes to reduce the total cost of ownership (TCO) and achieve […]
