SQL¶
marimo lets you can mix and match Python and SQL: Use SQL to query Python dataframes (or databases like SQLite and Postgres), and get the query result back as a Python dataframe.
To create a SQL cell, you first need to install additional dependencies, including duckdb:
pip install marimo[sql]
Examples
For example notebooks, check out
examples/sql/
on GitHub.
Example¶
In this example notebook, we have a Pandas dataframe and a SQL cell that queries it. Notice that the query result is returned as a Python dataframe and usable in subsequent cells.
Creating SQL cells¶
You can create SQL cells in one of three ways:
Right-click an “add cell” button (“+” icon) next to a cell and choose “SQL cell”
Convert a empty cell to SQL via the cell context menu
Click the SQL button that appears at the bottom of the notebook
This creates a “SQL” cell for you, which is syntactic sugar for Python code. The underlying code looks like:
output_df = mo.sql(f"SELECT * FROM my_table LIMIT {max_rows.value}")
Notice that we have an output_df
variable in the cell. This contains
the query result, and is a Polars DataFrame (if you have polars
installed) or
a Pandas DataFrame (if you don’t). One of them must be installed in order to
interact with the query result.
The SQL statement itself is an f-string, letting you
interpolate Python values into the query with {}
. In particular, this means
your SQL queries can depend on the values of UI elements or other Python values,
and they are fit into marimo’s reactive dataflow graph.
Reference a local dataframe¶
You can reference a local dataframe in your SQL cell by using the name of the Python variable that holds the dataframe. If you have a database connection with a table of the same name, the database table will be used instead.
Since the output dataframe variable (_df
) has an underscore, making it private, it is not referenceable from other cells.
Reference the output of a SQL cell¶
Defining a non-private (non-underscored) output variable in the SQL cell allows you to reference the resulting dataframe in other Python and SQL cells.
Querying files, databases, and APIs¶
In the above example, you may have noticed we queried an HTTP endpoint instead of a local dataframe. We are not only limited to querying local dataframes; we can also query files, databases such as Postgres and SQLite, and APIs:
-- or
SELECT * FROM 's3://my-bucket/file.parquet';
-- or
SELECT * FROM read_csv('path/to/example.csv');
-- or
SELECT * FROM read_parquet('path/to/example.parquet');
For a full list you can check out the duckdb extensions. You can also check out our examples on GitHub.
Interactive tutorial¶
For an interactive tutorial, run
marimo tutorial sql
at your command-line.
Examples¶
Check out our examples on GitHub.