Engineering Archives - Page 21 of 30

Unifying Your Data Ecosystem with Delta Lake Integration

May 9, 2023, posted by admin

As organizations mature their data infrastructure and accumulate more data than ever before in their data lakes, open and reliable table formats such as Delta Lake become a critical necessity. Thousands of companies are already using Delta Lake in production, and open-sourcing all of Delta Lake (as announced in June 2022) has further increased […]

Securing Databricks cluster init scripts

May 2, 2023, posted by admin

This blog was co-authored by Elia Florio, Sr. Director of Detection & Response at Databricks and Florian Roth and Marius Bartholdy, security researchers with SEC-Consult. Protecting the Databricks platform and continuously raising the bar with security improvements is the mission of our Security team and the main reason why we invest in our bug […]

Safer deployment of streaming applications

May 2, 2023, posted by admin

The Flink framework has gained popularity as a real-time stateful stream processing solution for distributed stream and batch data processing. Flink also provides data distribution, communication, and fault tolerance for distributed computations over data streams. To fully leverage Flink’s features, Coban, Grab’s real-time data platform team, has adopted Flink as part of our service offerings. In […]

eBay’s Blazingly Fast Billion-Scale Vector Similarity Engine

May 1, 2023, posted by admin

Introduction: Often, ecommerce marketplaces provide buyers with listings similar to those previously visited by the buyer, as well as a personalized shopping experience based on profiles, past shopping histories and behavior signals such as clicks, views and additions to cart. These are vital to the shopping experience, and so it’s equally vital that we continuously […]

Databricks ❤️ Hugging Face – The Databricks Blog

April 26, 2023, posted by admin

Generative AI has been taking the world by storm. As the data and AI company, we have been on this journey with the release of the open source large language model Dolly, as well as the internally crowdsourced dataset licensed for research and commercial use that we used to fine-tune it, the databricks-dolly-15k. Both the […]

Processing data simultaneously from multiple streaming platforms using Delta Live Tables

April 25, 2023, posted by admin

One of the major imperatives of organizations today is to enable decision making at the speed of business. Business teams and autonomous decisioning systems often require all the information they need to make decisions and respond quickly as soon as their source events happen – in real time or near real time. Such information, known […]

PyTorch on Databricks – Introducing the Spark PyTorch Distributor

April 20, 2023, posted by admin

Background and Motives: Deep Learning algorithms are complex and time-consuming to train, but are quickly moving from the lab to production because of the value these algorithms help realize. Whether using pre-trained models with fine-tuning, building a network from scratch or anything in between, the memory and computational load of training can quickly […]

Spark Connect Available in Apache Spark 3.4

April 18, 2023, posted by admin

Last year Spark Connect was introduced at the Data and AI Summit. As part of the recently released Apache Spark™ 3.4, Spark Connect is now generally available. We have also recently re-architected Databricks Connect to be based on Spark Connect. This blog post walks through what Spark Connect is, how it works, and how to […]

Message Center – Redesigning the messaging experience on the Grab superapp

April 17, 2023, posted by admin

Since 2016, Grab has been using GrabChat, a built-in messaging feature, to connect our users with delivery-partners or driver-partners. However, as the Grab superapp grew to include more features, the limitations of the old system became apparent. GrabChat could only handle two-party chats because that’s what it was designed to do. To make our messaging […]

Introducing Apache Spark™ 3.4 for Databricks Runtime 13.0

April 14, 2023, posted by admin

Today, we are happy to announce the availability of Apache Spark™ 3.4 on Databricks as part of Databricks Runtime 13.0. We extend our sincere appreciation to the Apache Spark community for their invaluable contributions to the Spark 3.4 release. To further unify Spark, bring Spark to applications anywhere, increase productivity, simplify usage, and add new […]
