feat(database): Add support for Trino as a new database driver #1313
This commit introduces a new file `build_trino.go` that sets up the build constraints for the Trino database. It imports the necessary Trino database driver from the `github.com/golang-migrate/migrate/v4` package, enabling migration support for Trino within the CLI.
Pull Request Overview
This pull request adds support for Trino as a new database driver for the migrate tool, enabling schema migration management for data sources connected through the Trino distributed SQL query engine. The implementation allows users to manage database schemas for modern data platforms like data lakes built on Iceberg, Parquet, and S3.
Key changes include:
- Implementation of a complete Trino database driver with full migration lifecycle support
- Connection string handling with automatic scheme conversion from `trino://` to HTTP/HTTPS based on SSL configuration
- Comprehensive test suite with Docker-based integration tests and unit tests
Reviewed Changes
Copilot reviewed 12 out of 13 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| `database/trino/trino.go` | Core driver implementation with migration table management, version tracking, and advisory locking |
| `database/trino/trino_test.go` | Integration tests using a Docker container for live Trino testing |
| `database/trino/unit_test.go` | Unit tests for driver registration, configuration, and interface compliance |
| `go.mod` | Dependency updates including the Trino Go client and related packages |
| `internal/cli/build_trino.go` | Build tag file for conditional Trino driver inclusion |
| `database/trino/examples/migrations/` | Example migration files demonstrating Trino-specific SQL patterns |
| `database/trino/README.md` | Documentation for connection strings, parameters, and usage guidelines |
| `README.md` | Updated main README to list Trino as a supported database |
| `Makefile` | Added Trino to the DATABASE variable for the build system |
```go
		SELECT table_name
		FROM information_schema.tables
		WHERE table_catalog = '%s'
		AND table_schema = '%s'
		AND table_type = 'BASE TABLE'`,
```
SQL injection vulnerability: the catalog and schema names are directly interpolated into the SQL string without proper escaping or parameterization. Use parameterized queries or properly escape the values.
```go
	query = fmt.Sprintf("SELECT version, dirty FROM %s.%s.%s ORDER BY sequence DESC LIMIT 1",
		t.config.MigrationsCatalog, t.config.MigrationsSchema, t.config.MigrationsTable)
```
SQL injection vulnerability: catalog, schema, and table names are directly interpolated into the SQL string. These values should be properly escaped or validated to prevent SQL injection attacks.
```go
	migrationsTable := fmt.Sprintf("%s.%s.%s",
		t.config.MigrationsCatalog, t.config.MigrationsSchema, t.config.MigrationsTable)

	insertQuery := fmt.Sprintf("INSERT INTO %s (version, dirty, sequence) VALUES (?, ?, ?)", migrationsTable)
```
SQL injection vulnerability: the table name is constructed using string concatenation without proper escaping. While the values themselves use parameterized queries, the table name should also be properly escaped.
```go
	migrationsTable := fmt.Sprintf("%s.%s.%s",
		t.config.MigrationsCatalog, t.config.MigrationsSchema, t.config.MigrationsTable)

	// Use CREATE TABLE IF NOT EXISTS for safe concurrent table creation
	createQuery := fmt.Sprintf(`
		CREATE TABLE IF NOT EXISTS %s (
			version BIGINT NOT NULL,
			dirty BOOLEAN NOT NULL,
			sequence BIGINT NOT NULL
		)`, migrationsTable)
```
SQL injection vulnerability: the table name is directly interpolated into the CREATE TABLE statement without proper escaping. This could allow SQL injection if the catalog, schema, or table names contain malicious content.
```go
	dropQuery := fmt.Sprintf("DROP TABLE IF EXISTS %s.%s.%s",
		t.config.MigrationsCatalog, t.config.MigrationsSchema, tableName)
```
SQL injection vulnerability: catalog, schema, and table names are directly interpolated into the DROP TABLE statement without proper escaping. This is particularly dangerous as it's used in a loop to drop multiple tables.
This pull request introduces a new database driver for Trino, enabling migrate to manage schema migrations on data sources connected through the Trino query engine.
Purpose
Trino is a distributed SQL query engine designed to query large data sets distributed over one or more heterogeneous data sources. By adding Trino support to `migrate`, we empower users to manage database schemas for a wide array of modern data platforms, such as data lakes built on Iceberg, Parquet, and S3, directly through a familiar migration workflow.

This driver is particularly valuable for data-intensive applications where schema evolution needs to be version-controlled and synchronized across different underlying systems.
Implementation Details

New Trino Driver (`database/trino/trino.go`)
- Implements the `database.Driver` interface.
- Uses the official Trino Go client (`trinodb/trino-go-client`) to interact with the Trino REST API.
- Creates a `schema_migrations` table to track migration versions and state.

Connection Handling
- Accepts a `trino://` connection string, which is automatically converted to `http://` or `https://` based on the `ssl` query parameter.
- Defaults to `https://` for all connections unless explicitly disabled via `?ssl=false`.

Core Functionality
- `Open()`: Establishes a connection to the Trino coordinator.
- `Run()`: Executes migration scripts (SQL statements).
- `SetVersion()` / `Version()`: Manages the migration version and dirty status in the `schema_migrations` table.
- `Drop()`: Cleans up the schema by dropping all tables within it.
- `Lock()` / `Unlock()`: Provides an in-memory advisory lock to prevent concurrent migration runs.

Configuration
- Supports `catalog`, `schema`, `x-migrations-table`, and other settings.

Integration Tests (`database/trino/trino_test.go`)
- Uses the official Docker image (`trinodb/trino`) to run tests against a live Trino instance.

Documentation
- A `README.md` has been added to the `database/trino` directory, providing clear instructions on how to use the driver, format the connection string, and configure the available parameters.

This new driver significantly expands the capabilities of `migrate`, bringing modern, large-scale data platforms into its ecosystem.
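For illustration, a migration run against this driver might look like the following. The exact URL shape, parameter names, and paths are inferred from the description above and are placeholders; consult the driver's `README.md` for the authoritative format.

```shell
# Hypothetical invocation; host, catalog, and schema values are placeholders.
migrate -source file://database/trino/examples/migrations \
        -database 'trino://user@localhost:8080?catalog=hive&schema=default&ssl=false' up
```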