How eBay’s New Search Feature Was Inspired By Window Shopping

We live in a world of discovery where visual appetite reigns supreme. Window shopping, infinite scroll lists, and micro engagements using simple visual cues are the norm. Search engines traditionally interpret a textual query input and match items and/or documents ranked by their relevance to the input query. The relevance of the retrieved results is […]

Evolution of quality at Grab

To achieve our vision of becoming the leading superapp in Southeast Asia, we constantly need to balance development velocity with maintaining the high quality of the Grab app. Like most tech companies, we started out with the traditional software development lifecycle (SDLC), but as our app evolved, we soon noticed several challenges, such as a high number of feature bugs and […]

Determine the best technology stack for your web-based projects

In the current technology landscape, startups are developing rapidly. This usually leads to an increase in the number of engineers in teams, with the goal of increasing the speed of product development and delivery frequency. However, this growth often leads to a diverse selection of technology stacks being used by different teams within the same […]

Fine-Tuning Large Language Models with Hugging Face and DeepSpeed

Large language models (LLMs) are currently in the spotlight following the sensational release of ChatGPT. Many are wondering how to take advantage of models like this in their own applications. However, this is merely one of several advances in transformer-based models, many others of which are open and readily available for tasks like translation, classification, […]

Building the Lakehouse for Healthcare and Life Sciences – Processing DICOM images at scale with ease

One of the biggest challenges in understanding patient health status and disease progression is unlocking insights from the vast amounts of semi-structured and unstructured data types in healthcare. DICOM, which stands for Digital Imaging and Communications in Medicine, is the standard for the communication and management of medical imaging information. Medical images, encompassing modalities like […]

How eBay Made Its New Accessibility Tool — And Made It Available to All

There is sometimes a fundamental gap between the engineering and design teams when creating a new product. Designers want their work to be accessible, but many of the available tools are cumbersome, confusing, and come with processes that aren’t well-defined. This can lead to designers delivering their work to engineers without fully baked accessibility, which […]

Unsupervised Outlier Detection on Databricks

Kakapo (KAH-kə-poh) implements a standard set of APIs for outlier detection at scale on Databricks. It integrates the extensive PyOD library of outlier detection algorithms with MLflow for tracking and packaging models and with Hyperopt for exploring vast, complex, and heterogeneous search spaces. The views expressed in this article are privately […]

Migrating from Role to Attribute-based Access Control

Grab has always regarded security as one of our top priorities; this is especially important for data platform teams. We need to control access to data and resources in order to protect our consumers and ensure compliance with various, continuously evolving security standards. Additionally, we want to keep the process convenient, simple, and easily scalable […]

Scalable Spark Structured Streaming for REST API Destinations

Spark Structured Streaming is the widely used open-source engine at the foundation of data streaming on the Databricks Lakehouse Platform. It can elegantly handle diverse logical processing at volumes ranging from small-scale ETL to the largest Internet services. This power has led to adoption in many use cases across industries. Another strength of Structured Streaming […]

Securing GitOps pipelines

Grab’s real-time data platform team, Coban, has been managing infrastructure resources via Infrastructure-as-code (IaC). Through the IaC approach, Terraform is used to maintain infrastructure consistency, automation, and ease of deployment of our streaming infrastructure. With Grab’s exponential growth, there needs to be a better way to scale infrastructure automatically. Moving towards GitOps processes […]
