
feat: BigQuery as OLAP engine #9161

Open
k-anshul wants to merge 11 commits into main from bigquery_olap

Conversation

Member

@k-anshul k-anshul commented Apr 1, 2026

closes https://linear.app/rilldata/issue/PLAT-450/metrics-views-on-bigquery

Added

TODOs to be addressed in follow-ups:

  • Exports are broken
  • Remove the conversion of civil.Date to time.Time in the Rill driver and handle the type wherever required

Checklist:

  • Covered by tests
  • Ran it and it works as intended
  • Reviewed the diff before requesting a review
  • Checked for unhandled edge cases
  • Linked the issues it closes
  • Checked if the docs need to be updated. If so, create a separate Linear DOCS issue
  • Intend to cherry-pick into the release branch
  • I'm proud of this work!

@k-anshul k-anshul self-assigned this Apr 1, 2026

rangeSQL := fmt.Sprintf(
"SELECT min(%[1]s) as `min`, max(%[1]s) as `max`, %[2]s as `watermark` FROM %[3]s %[4]s",
Member Author


This is not an efficient query, even when it runs on the partition column.

Member Author


An optimization could check whether this is the table's partition column and, if so, read the min/max directly from the partition metadata.
Given that this is a frequently executed query, I think it can be done in a follow-up. @begelundmuller thoughts?

Contributor


If the optimization can be done in a fast/cheap/safe way, then yeah it sounds good to me

@k-anshul k-anshul requested a review from begelundmuller April 2, 2026 13:08
Comment on lines +58 to +61
// MaxBytesBilled is the maximum number of bytes billed for a query. This is a safety mechanism to prevent accidentally running large queries.
// Set this to -1 for project defaults.
// Only applies to dashboard queries and does not apply when ingesting data from BigQuery into Rill.
MaxBytesBilled int64 `mapstructure:"max_bytes_billed"`
Contributor


Is it normal for BI tools to offer this for BigQuery? If we do, I think it should be much higher than 10 GB by default (which at the high usage-based list price is only $0.0625).

Another danger of setting a default limit is that it may break existing projects that use a BigQuery query for partition discovery.
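The "-1 means project defaults" semantics from the doc comment could be kept out of the query path with a tiny helper like the sketch below (a hypothetical function, not code from the PR). The Go BigQuery client documents that a MaxBytesBilled value less than 1 falls back to the project default, so mapping -1 to 0 before assigning it to the query config preserves the intended behavior.

```go
package main

import "fmt"

// resolveMaxBytesBilled maps the connector config value to what gets assigned
// to the BigQuery query config: -1 (or any negative) falls through to the
// project defaults, since the Go client treats values < 1 as "use project
// default"; positive values become a hard per-query billing cap.
// Hypothetical helper sketched for this discussion.
func resolveMaxBytesBilled(cfg int64) int64 {
	if cfg < 0 {
		return 0 // project defaults
	}
	return cfg
}

func main() {
	fmt.Println(resolveMaxBytesBilled(-1), resolveMaxBytesBilled(10<<30))
}
```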

ctx, cancel := context.WithTimeout(ctx, drivers.DefaultQuerySchemaTimeout)
defer cancel()

res, err := c.Query(ctx, &drivers.Statement{
Contributor


Did you check if BigQuery's dry run feature can be used for this? IIRC they have some of the best dry run support of any database.

Comment on lines +596 to +599
if d == DialectBigQuery && isFullJoin {
// BigQuery requires plain equality for FULL joins
// TODO: find a better way to handle this
return fmt.Sprintf("coalesce(CAST(%s AS STRING), '__rill_sentinel__') = coalesce(CAST(%s AS STRING), '__rill_sentinel__')", lhs, rhs)
Contributor


Does this TODO need to be addressed?

(Since optimizing small things like this for BigQuery perhaps doesn't really make sense, consider removing the isFullJoin param and just using this approach for the BigQuery dialect for any join type)

Comment on lines +873 to +874
case DialectBigQuery:
// BigQuery uses UNION ALL for generating time series

Comment on lines +944 to +951
// BigQuery converts time.Time type to TIMESTAMP which is not compatible with DATE type dimensions
// so we need to convert it back to civil.Date if the dimension type is DATE
// TODO: remove conversion of civil.Date in the rill driver and handle it wherever required and remove this conversion here
if result.Schema.Fields[0].Type.Code == runtimev1.Type_CODE_DATE {
if t, ok := dimVal.(time.Time); ok {
dimVal = civil.DateOf(t)
}
}
Contributor


Does this TODO need to be addressed?

Did you consider moving this conversion into the BigQuery driver (i.e. iterate over the args passed to Query and Exec?)

Comment on lines +643 to +645
// Store the time dimension's data type so it's available for downstream queries (e.g. Schema validation).
e.metricsView.Dimensions = append(e.metricsView.Dimensions, &runtimev1.MetricsViewSpec_Dimension{
Name: e.metricsView.TimeDimension,
Contributor


My previous comment applies here as well. I thought we already had this

Comment on lines +536 to +538
if d == DialectBigQuery && typeCode == runtimev1.Type_CODE_DATE {
return "DATE(?)"
}
Contributor


Is this also needed if you cast to civil in Query? Or is there any other way we could push this into the driver and avoid knowing the time type in advance?

ctx := t.Context()

_, currentFile, _, _ := goruntime.Caller(0)
projectPath := filepath.Join(currentFile, "..", "testdata", "ad_bids_bigquery")
Contributor


The testruntime/testdata directory is only used for legacy tests. Is it really necessary here? Or can the same be accomplished with a normal NewInstanceWithOptions call like we use for new tests that defines the file(s) needed for the specific test inline?


inst := &drivers.Instance{
Environment: "test",
OLAPConnector: "bigquery",
Contributor


The hard-coded OLAP connector option predates the time when we supported changing the OLAP connector in rill.yaml. It shouldn't be necessary in new tests – can you just put olap_connector: bigquery in the test's rill.yaml?
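Per the suggestion, the test project's rill.yaml could declare the connector directly instead of the hard-coded instance option (a minimal sketch; any other keys the test needs would sit alongside it):

```yaml
# rill.yaml for the test project
olap_connector: bigquery
```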

@@ -180,33 +181,157 @@ func (q *TableHead) generalExport(ctx context.Context, rt *runtime.Runtime, inst
}

func (q *TableHead) buildTableHeadSQL(ctx context.Context, olap drivers.OLAPStore) (string, error) {
Contributor


It seems like there's a huge complexity increase in this function. Two questions:

  1. We don't run TableHead very often, so is it necessary to optimize it so hard? In general, I would assume people who connect a BI tool to a data warehouse are fine with a SELECT * FROM tbl LIMIT 100 query being run.
  2. If it really is necessary, is it possible to combine it into one nested query and push it into the dialect somehow?
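The simple baseline from point 1 can be sketched as below: a plain head query with BigQuery's backtick quoting and no per-column optimization. A hypothetical helper for comparison, not code from the PR.

```go
package main

import "fmt"

// simpleTableHeadSQL builds the straightforward head query suggested in
// point 1: select everything, limit the rows, and let BigQuery do the rest.
func simpleTableHeadSQL(project, dataset, table string, limit int) string {
	return fmt.Sprintf("SELECT * FROM `%s.%s.%s` LIMIT %d", project, dataset, table, limit)
}

func main() {
	fmt.Println(simpleTableHeadSQL("my-project", "my_dataset", "events", 100))
}
```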

