  • Integration - Data Integration - Data Virtualization

    Trino Series: Caching Strategies and Query Performance Tuning

    Introduction: Enhancing Trino Performance

    In our journey with Trino, we’ve explored its setup, integrated it with multiple data sources, added real-time data, and expanded to cloud storage. To wrap up, we’ll focus on strategies to improve query performance. Specifically, we’ll implement caching techniques and apply performance tuning to optimize queries for frequent data access. This final post aims to equip you with tools for building a highly responsive and efficient Trino-powered analytics environment.

    Goals for This Post

    Implement Caching for Frequent Queries: Set up a local cache for repeated queries to reduce data retrieval times and resource consumption.

    Tune Query…

  • Data Virtualization - Data Integration

    Trino Series: Advanced Integrations with Cloud Storage

    Introduction: Scaling Data with Cloud Storage

    In the previous blogs, we explored building a sample project locally, optimizing queries, and adding real-time data streaming. Now, let’s take our Trino project a step further by connecting it to cloud storage, specifically Amazon S3. This integration will showcase how Trino can handle large datasets beyond local storage, making it suitable for scalable, cloud-based data warehousing. By connecting Trino to S3, we can expand our data analytics project to manage vast datasets with flexibility and efficiency.

    Project Enhancement Overview

    Goals for This Blog Post

    Integrate Amazon S3 with Trino: Configure Trino to access…

  • Data Virtualization - Data Integration - Integration

    Trino Series: Optimizing and Expanding the Sample Project

    Introduction: Building on the Basics

    In our last blog, we set up a local Trino project for a sample use case, Unified Sales Analytics, which allowed us to query across PostgreSQL and MySQL databases. Now, we’ll build on this project by introducing optimizations for query performance, configuring advanced settings, and adding a new data source to broaden the project’s capabilities. These enhancements will simulate a real-world scenario where data is frequently queried, requiring efficient processing and additional flexibility.

    Project Enhancement Overview

    Goals for This Blog Post

    Optimize Existing Queries: Improve query performance by using Trino’s advanced optimization features.

    Add a New Data Source:…

  • Data Virtualization - Data Integration - Integration

    Trino Series: Building a Sample Project on Local Installation

    Why a Trino Series Instead of Presto?

    If you followed the initial post in this series, you may recall we discussed the history of Presto and its recent transformation into what is now known as Trino. Originally developed as Presto at Facebook, this powerful SQL query engine has seen an incredible journey. The transition to Trino represents the evolution of PrestoSQL into a more robust, community-driven platform focused on advanced distributed SQL features. The rebranding to Trino wasn’t merely a name change; it reflects a shift toward greater community collaboration, improved flexibility, and extended support for analytics across a wide variety…

  • Data Integration - Data Virtualization - Integration - Big Data - Enterprise Application Integration

    PRESTO / Trino Basics

    Introduction: My Journey into Presto

    My interest in Presto was sparked in early 2021 after an enriching conversation with Brian Luisi, PreSales Manager at Starburst. His insights into distributed SQL query engines opened my eyes to the unique capabilities and performance advantages of Presto. Eager to dive deeper, I joined the Presto community on Slack to keep up with developments and collaborate with like-minded professionals. This blog series is an extension of that journey, aiming to demystify Presto and share my learnings with others curious about distributed analytics solutions.

    What is PRESTO

    Presto is a high-performance, distributed SQL query…