I just watched the author of this feature and blog post give a talk at the DataCouncil conference in Oakland, and it is obvious what a huge amount of craft, ingenuity, and care went into building it. Congratulations to Hamilton and the MotherDuck team for an awesome launch!
Woohoo! Glad you noticed that. Hamilton is amazing.
I really like DuckDB's notebooks for exploration and this feature makes them even more awesome, but the fact that I can't share, export, or commit them into a git repo feels extremely limiting. It's neat-ish that it dogfoods and stores them in a DuckDB database. It even seems to store historical versions, but I can't really do anything with it.
In DuckDB UI and MotherDuck.
Awesome video of the feature: https://youtu.be/aFDUlyeMBc8
Disclaimer: I’m a co-founder at MotherDuck.
I hope this doesn't work with DELETE queries.
Maybe in the next version they could also implement support for DROP, with autocorrect for the nearest (not yet dropped) table name.
LLM-powered queries that run in agent mode, so it can answer questions about your data before you know what to ask.
That's actually not a bad idea, to have LLM autocomplete when you write queries, especially if you first add a comment at the top saying what you want to achieve:
-- Select all orders for users registered in the last year, and compute average earnings per user
SELECT ...
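To make that concrete, the kind of query an LLM might fill in from that comment could look something like this (table and column names are made up):

  SELECT u.user_id,
         avg(o.amount) AS avg_earnings
  FROM users u
  JOIN orders o ON o.user_id = u.user_id
  WHERE u.registered_at >= current_date - INTERVAL 1 YEAR
  GROUP BY u.user_id;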
DELETED 0 rows. Did you mean `where 1=1`? (click accept to re-run with new where clause)
For clarity: Instant SQL won't automatically run queries that write or delete data or metadata. It only runs queries that read data.
Can't it just run inside a transaction that isn't committed?
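For illustration only (not how Instant SQL is described as working), the idea would be something like wrapping the statement in a transaction that never commits; table and column names are made up:

  BEGIN TRANSACTION;
  DELETE FROM users WHERE last_login < DATE '2020-01-01';
  SELECT count(*) FROM users;  -- inspect the effect
  ROLLBACK;                    -- nothing is persisted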
Young Bobby Tables at it again.
ROFL
ROFL FROM jokes WHERE thats_a_new_one;
CTE inspection is amazing. I spend too much time doing that manually.
Me too (author of the post here). In fact, I was watching a seasoned data engineer at MotherDuck show me how they would attempt to debug a regex in a CTE. As a longtime SQL user, I felt the pain immediately; haven't we all been there before? Instant SQL followed from that.
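A toy example of the kind of query being described, where you would want to see what the regex inside the CTE produces before the outer query runs (names made up):

  WITH parsed AS (
    SELECT raw_line,
           regexp_extract(raw_line, '(\d{4})-\d{2}-\d{2}', 1) AS year
    FROM logs
  )
  SELECT year, count(*) FROM parsed GROUP BY year;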
First time seeing the FROM at the top of the query, and I'm not sure how I feel about it. It seems useful, but I'm so used to select...from.
I'm assuming it's more of a user preference, like putting commas in front of the field instead of after it?
You can use any valid DuckDB syntax you want! I prefer to put FROM first just because I think it's better, but Instant SQL works with traditional select __ from __ queries.
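For anyone who hasn't seen it, DuckDB accepts a FROM-first form alongside the traditional one; for example (table name made up):

  -- traditional
  SELECT city, fare FROM trips;

  -- FROM-first, also accepted by DuckDB
  FROM trips SELECT city, fare;

  -- SELECT can even be omitted entirely (implies SELECT *)
  FROM trips;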
Yes, it comes from a desire to impose intuition from other contexts onto something instead of building intuition with that thing.
SQL is a declarative language. The ordering of the statements was carefully thought through.
I will say it's harmless, though: the clauses don't depend on each other for meaning, so it's fine to allow them to be reordered without changing the meaning of the query. But that's true of lots and lots of things in programming, and just having a convention is usually better than allowing anything.
For example, you could totally allow this to be legal:
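  def
  for x in whatever:
      print(x)
  print_whatever(whatever):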
There's nothing ambiguous about it, but why? If you are used to seeing it one way it just makes it more confusing to read, and if you aren't used to seeing it the normal way, you should at least somewhat master something before you try to improve it through cosmetic tweaks.
I think you see this all the time: people try to impose their own comfort onto things for no actual improvement.
A fun function in DuckDB (which I think they're using here) is `json_serialize_sql`. It returns a JSON AST of the SQL.
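A quick sketch of what that looks like (the json extension needs to be available):

  SELECT json_serialize_sql('SELECT a FROM tbl WHERE b > 1');
  -- returns a JSON document describing the statement's parse tree;
  -- it can be turned back into SQL text with json_deserialize_sql(...)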
Indeed, we are! We worked with DuckDB Labs to add the query_location information, which we're also enriching with the tokenizer to draw a path through the AST to the cursor location. I've been wanting to do this since forever, and now that we have it, there's actually a long tail of inspection / debugging / enrichment features we can add to our SQL editor.
This is a very cool feature. I don't know how useful it is or how I'd use it right now but I think I am going to get into some benchmarking and performance tweaking soon and this could be handy.
It's neat, but the CTE selection bit errors out more often than not and erroneously selects more than the current CTE.
Can you say more? Where does it error out? Sounds like a bug; if you could post an example query, I bet we can fix that.
Will this be available in duckdb -ui?
Are MotherDuck's editor features available on-prem? My understanding is that MotherDuck is a data warehouse SaaS.
It is already available in the local DuckDB UI! Let us know what you think!
-Customer software engineer at MotherDuck
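For anyone looking for it, the local UI can be launched roughly like this in recent DuckDB versions with the ui extension (check the docs for your version):

  INSTALL ui;
  LOAD ui;
  CALL start_ui();
  -- or straight from the shell: duckdb -ui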
Does local DuckDB UI work without an internet connection?
Delete From dbo.users w...
(129304 rows affected)
The blog specifically says that they're getting the SQL AST so presumably they would not execute something like a DELETE.
Correct. We only enable fast previews for SELECT statements, which is the actual hard problem. That said, at some point we're likely to also add support for previewing a CTAS before you actually run it.
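For anyone unfamiliar, a CTAS is a CREATE TABLE AS SELECT statement like the one below (names made up); previewing it would presumably mean running just the inner SELECT without creating the table:

  CREATE TABLE daily_totals AS
  SELECT order_date, sum(amount) AS total
  FROM orders
  GROUP BY order_date;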
Cool. Now, there's this thing called a joke...
This is such a bizarre feature.
On first glance possibly, on second glance not at all.
First, repeated data-analyst queries are a big usage driver in SQL databases: think iterating on the code and executing it again.
Another huge factor in the same vein is running dev pipelines on limited data to validate that a change works when modelling complex data.
This is currently a front-end feature, but underneath lies effective caching.
The underlying tech drives down usage cost, which is a big thing for data practitioners.
What about it is bizarre?
Please finally add q language with proper integration to your tables so that our precious q-SQL is available there. Stop reinventing the wheel, let's at least catch up to the previous generation (in terms of convenience). Make the final step.
What is q-SQL?