Augmenting Presto & Trino with Autonomous Indexing
Presto and Trino have quickly become the tools of choice for data-driven companies looking for the agility and flexibility of a data lake architecture.
Originating at Facebook, both open-source projects have been proven at scale in a variety of use cases at Airbnb, Comcast, Facebook, Netflix, Twitter, and Uber, and both have vibrant communities of contributors addressing problems and improving the product.
Eliminating the challenges of brute force
Both Presto and Trino rely heavily on full scans to process queries. In fact, 80% of compute resources are “wasted” on ScanFilter, which means partitioning alone does not reduce data reads enough to optimize query performance across multiple workloads.
This is a major challenge when performance requirements are strict or interactive: meeting them requires a large cluster, and the compute resources quickly spiral out of control in both operations and budget. A second challenge arises when several workloads coexist and a single partitioning strategy has to enable fast analytics for all use cases.
In these cases, indexing is much more effective. In this webinar we’ll introduce Varada’s autonomous indexing solution, which can reduce cluster size by an order of magnitude or deliver 10x-100x faster queries on the existing footprint. You can also expect a 40%-60% TCO reduction, thanks to the highly effective resource utilization of Varada’s indexing technology.
Varada can be deployed as a Presto / Trino connector to existing clusters, or as a standalone cluster.
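
To make the scan-versus-index contrast concrete, here is a minimal toy sketch in Python. It is not Varada's, Presto's, or Trino's implementation. The table layout, the column names, and the helper functions (scan_filter, build_index, index_lookup) are all hypothetical and exist only for illustration. The sketch assumes a table partitioned by date and a query that filters on a different column, so partition pruning cannot help and a scan has to read every row.

```python
# Toy illustration only: NOT Varada's, Presto's, or Trino's implementation.
from collections import defaultdict

# Hypothetical table of (event_date, user_id, amount), partitioned by event_date.
# 10 days x 100 users = 1000 rows.
rows = [(f"2021-01-{d:02d}", u, d * u) for d in range(1, 11) for u in range(100)]


def scan_filter(rows, user_id):
    """Full scan: read every row and keep the ones matching the predicate."""
    rows_read = 0
    matches = []
    for row in rows:
        rows_read += 1
        if row[1] == user_id:
            matches.append(row)
    return matches, rows_read


def build_index(rows):
    """Map each user_id to the positions of its rows (a simple inverted index)."""
    index = defaultdict(list)
    for pos, row in enumerate(rows):
        index[row[1]].append(pos)
    return index


def index_lookup(rows, index, user_id):
    """Read only the rows the index points to."""
    positions = index.get(user_id, [])
    return [rows[p] for p in positions], len(positions)


index = build_index(rows)
scan_matches, scan_reads = scan_filter(rows, 42)
idx_matches, idx_reads = index_lookup(rows, index, 42)

# Both approaches return the same rows; they differ in how many rows they read.
print(f"scan read {scan_reads} rows, index read {idx_reads} rows")
```

In this toy setup the filter on user_id cuts across every date partition, so the scan touches all rows while the index touches only the matching ones. Real engines work on columnar files and splits rather than Python lists, so the numbers here illustrate the idea and say nothing about actual speedups.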