How to generate SQL statements with ChatGPT (forta.com)
67 points by rmason on July 12, 2023 | 45 comments


I've spent a lot of time on this problem space for our business. My current conclusions are as follows:

Writing the actual SQL query is not the hard part for us. The hardest part is exhaustively expressing our intention in any language at all, followed closely by its apprentice - the incomplete requirements monster.

I've found that if I can coach a team member into giving me a proper natural language statement that we would use as an ideal training example for a SQL bot, then they are probably able to just go ahead and write the final query by hand. Overall, it would likely take less time & cognitive overhead.

Don't get me wrong - I see the automation potential too. I've used it for lots of really crappy bulk refactors and it's great. But, it absolutely cannot think for you and magic-in the requirements that your customer ultimately left on the floor.

Generative SQL output is the opposite of what you want when you are at the frontier of understanding and incomplete requirements. For us, this is kind of where the biggest use case would have come from. We don't need much help with the boring stuff. We need a really smart asshole to call us out when there appears to be a logical gap somewhere. ChatGPT almost seemed to be this at launch, but I think it's been RLHF'd into a potato by now. I stopped using it for anything serious around late May.


Subqueries, windowing, and partitioning are annoyingly tedious areas if you only use that sort of SQL occasionally (<1/month), and the GPT-4 API seems to massively help for turning plain text into the appropriate stuff. I know enough to be confident in it after running some tests of the query, but not enough to knock it out cold without a few minutes' research.
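
For a flavor of what I mean, a hypothetical "latest order per customer" - easy to verify once written, fiddly to write cold if you only do it twice a year:

  -- assumes a hypothetical orders(customer_id, order_id, created_at) table
  SELECT customer_id, order_id, created_at
  FROM (
      SELECT customer_id, order_id, created_at,
             ROW_NUMBER() OVER (PARTITION BY customer_id
                                ORDER BY created_at DESC) AS rn
      FROM orders
  ) ranked
  WHERE rn = 1;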


> "...anything serious..."

My problem is that I simply cannot trust any data coming from GPT - it doesn't have real-time info, it constantly tells me what it CAN'T DO, and it's never able to tell me what it CAN DO.

The platform is extremely powerful, and what we are seeing is that the wave of AI LLMs/apps coming out has all the devs/companies frantically scrambling to figure out how to both weaponize and monetize it.


>"My problem is that I simply cannot trust any data coming from GPT"

I just used it to write a shell function which extracts a value from an .ini file, taking the section and name as parameters. Works perfectly. Otherwise I would have had to manually dick around with an awk input string, wrecking my brain in the process.


Function validation is not the same as information validation, or validating links to information about the real world.

I would EXPECT/DEMAND your results when the output is a function - other kinds of results are not functions, and can't be verified the same way.


That issue seems entirely mitigated by thorough testing.


If you know SQL why is this better than just writing SQL? If you don’t know SQL how do you know it’s correct?

I had to forbid my coworkers from using ChatGPT to generate SQL because it kept doing things slightly wrong - like missing parens when dealing with AND and OR in a WHERE clause.
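
A made-up example of the trap, since AND binds tighter than OR:

  -- missing parens: parses as role = 'admin' OR (role = 'editor' AND active = 1),
  -- so it returns every admin, active or not
  SELECT * FROM users
  WHERE role = 'admin' OR role = 'editor' AND active = 1;

  -- what was intended
  SELECT * FROM users
  WHERE (role = 'admin' OR role = 'editor') AND active = 1;

Both run without error, which is exactly why it slips through review.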


Most people can read more complicated text than they can write.

Most people who need to generate a query can run it and look at the results.

I use chatgpt to help me, particularly with languages I don’t write in everyday. And I often find myself arguing with chatgpt that what it wrote is wrong, because I can read the query, but don’t know how to write the query I need.


I agree with your sentiment, but it occurred to me that code might actually be the reverse - you can write more complicated code than you can read.


A nice corollary to Kernighan's law on debugging.

I guess it can go either way. Sometimes ChatGPT4 is great and saves me time, but the cycle of "explain what I want, get the output, test/compile it, post the errors, try again" doesn't always converge quickly enough to not be more irritating than actually learning the issues in the code I'm trying to interface with. So I'll try it a bit, but if it goes back and forth more than 3 or 4 times, I'll give up and dig in to study the thing.


> If you know SQL why is this better than just writing SQL? If you don’t know SQL how do you know it’s correct?

You test it.

And what do you mean by “know SQL?” No one knows everything about SQL. Virtually everyone writing SQL is in between the two extreme cases you provide.


If you understand SQL enough to know if the query ChatGPT produced is valid - you probably could have written it yourself. I don’t see how prompting chatgpt and testing its results is faster or better than just writing SQL.


I can recognize many more faces than I can describe/draw. It is easier to read a foreign language than to write it (active vs. passive vocabulary).

It is the same with googling, looking for a solution on StackOverflow: understanding/judging whether the solution is good may be faster than constructing it yourself.


Imo it depends on what you're prompting. If you yourself do not fully understand what you want and/or which information you have to provide for the model to 'understand' enough context, prompts will not give good results.

I usually find ChatGPT very good for converting conceptual ideas into a concrete piece of 'boilerplate' to work off of. It can help you jump gaps in your process, but in the end you still have to make the jumps yourself.


> If you understand SQL enough to know if the query ChatGPT produced is valid

Like I said, you need to test it. Not read it. Test it.


Writing tweaks to complex statements is tedious and I’m lazy? I often just feed in existing queries and scripts and say “use the below as a template, but do X,Y, and Z instead.” GPT-4 does this correctly the majority of the time and for all but a handful of circumstances, can typically be coaxed into the correct thing with a couple of back-and-forths.
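
A made-up example of the kind of tweak I mean (Postgres-flavored):

  -- the template: daily signups
  SELECT created_at::date AS day, COUNT(*) AS signups
  FROM users
  GROUP BY day;

  -- "use the above as a template, but weekly, verified users only, with a running total"
  SELECT date_trunc('week', created_at) AS week,
         COUNT(*) AS signups,
         SUM(COUNT(*)) OVER (ORDER BY date_trunc('week', created_at)) AS running_total
  FROM users
  WHERE verified
  GROUP BY week;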


> If you know SQL why is this better than just writing SQL? If you don’t know SQL how do you know it’s correct?

This line of reasoning applies for all areas of knowledge, and I wish more people would realize this about LLMs.


I know (some) French, still use ChatGPT or Google Translate a lot to get me started if I want to send a note to someone in French. I still do some hand-waving afterwards to ensure it's correct but that's about it.

I'd argue that this process applies to any GPT-type generator, it serves a starting point so you're not staring at a blank piece of paper, but it won't give you the final result.


The benefit of using GPT to produce something for another human is the other human can understand you even if the statement is slightly weird or incorrectly worded.

SQL doesn’t have the same tolerance.


You send the produced query through a parser and a validator. The parser ensures it is syntactically correct, the validator ensures it is only accessing allowed data.
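
For the parser half you can often lean on the database itself - e.g. in Postgres, PREPARE parses and analyzes a statement without executing it (illustrative, hypothetical table):

  -- fails on a syntax error or unknown column before touching any data
  PREPARE llm_check AS
      SELECT customer_id, total
      FROM orders
      WHERE customer_id = $1;
  DEALLOCATE llm_check;

The "allowed data" half needs a real validator that walks the parsed query, since passing the parser says nothing about which tables the query reaches.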


The SQL it produced for my coworkers ran without error - it just produced undesired results.


It sounds like your concern is that an LLM is not capable of understanding an English explanation of the query and producing the correct query. I personally have not had many issues here, and the time saved by tweaking a generated query is larger than writing the query myself from scratch. And LLMs will only improve, so these tweaks should require less and less time.


I feel the same way when someone gives me an MR without suitable unit tests. How do they know it is correct?


Yesterday I had a case where I needed to create analytics tables in ClickHouse based on TypeScript types, for around 150 types.

Thought I'd give it a try and gave ChatGPT 2 examples of existing tables and the corresponding TS types, and asked it if it understood what it should do next. It explained the existing logic and also found 2 hidden edge cases (e.g. if a type was boolean, the column was UInt8, to store 0/1).

Then I pasted all the types in, in batches of 15, and it generated the tables with only 2-3 corrections needed, probably saving me around 1-2h of manual checking and/or creating. It even understood that certain fields are not LowCardinality and therefore didn't use it for them.
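
The shape of the mapping, roughly (simplified sketch, not my actual schema):

  -- from a hypothetical TS type: { userId: string; isActive: boolean; country: string }
  CREATE TABLE analytics.user_events
  (
      user_id   String,
      is_active UInt8,                  -- boolean -> UInt8 (0/1)
      country   LowCardinality(String)  -- few distinct values
  )
  ENGINE = MergeTree
  ORDER BY user_id;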


If you can export your data to SQLite first (my https://datasette.io/tools/db-to-sqlite tool can help with that), you can upload the SQLite database directly to ChatGPT Code Interpreter and have it run the queries for you, since sqlite3 is in the Python standard library.

If you do this, it will automatically retry with a new SQL query if it gets any error messages.


I listened to you on the latentspacepod the other day and tried this method myself.

  1. Used GPT-4 to build functions that pulled in sample datasets from my BigQuery.

  2. Used GPT-4 to insert them into a sqlite database.

  3. Uploaded the zip file into Code Interpreter.

  4. Now I'm in an environment to quickly self-iterate on GPT-4-generated SQL.

It felt incredibly powerful. Previously I was developing with the latency of 100GB+ tables – now the sqlite db responds instantly, and I don't even need to write the code.


It's really good at generating sqlite db's[0] too.

0 - https://chat.openai.com/share/5534e0b2-4010-4da6-99b9-a3aa02...


This works well for simple use cases, but quickly breaks down when faced with real-life scenarios.

Database schemas might contain hundreds of tables, and they may not always have intuitive names. Also, the relationships between the tables aren't always clear, and there are always company-specific mystery clauses that you might need to apply (such as excluding certain users) when running a query.
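
A made-up example of the kind of mystery clause I mean:

  -- every "active users" number at the company is secretly this
  SELECT COUNT(*)
  FROM users
  WHERE deleted_at IS NULL
    AND is_internal = 0  -- exclude employees and test accounts
    AND id NOT IN (SELECT user_id FROM fraud_flags);

None of that is discoverable from the schema alone.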

Anyone building a chat interface for databases has probably realized this by now.


This is solvable by augmenting the database schema with comments.

When you integrate your database with an LLM, you'll notice the LLM will produce flawed queries based on wrinkles in your database schema. This is because the LLM relies on conventional understanding of how the schema is probably tied together. When you see the flawed queries, you augment the schema with a comment that explains why the schema has a wrinkle. The LLM takes that into consideration and the resulting queries are improved.

A concrete example[1]: I found that when querying the Sakila movie rental database, the generated query would frequently attempt to join the `rental` table to the `film` table through a nonexistent `film_id` column on the rental table. By adding the linked comment, the LLM stopped doing that.

1. https://github.com/amoffat/HeimdaLLM/blob/dev/notebooks/saki...
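
For reference, the correct path in Sakila goes through inventory, since rental has no film_id:

  SELECT f.title, r.rental_date
  FROM rental r
  JOIN inventory i ON i.inventory_id = r.inventory_id
  JOIN film f ON f.film_id = i.film_id;

And the schema comment is roughly of this shape (Postgres-style syntax; the actual comment text is in the linked notebook):

  COMMENT ON TABLE rental IS
      'Has no film_id; join to film through inventory (inventory_id -> film_id).';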


This is where the function-calling feature comes in: you can define functions that need to be called with some data, and then you control the database searching.

For instance, you can have a function with "Id" and "type" parameters, where type is an enum of ["post", "comment"], that returns the latest row, and another that returns all rows.

Then asking "for userID 123 find me the latest post" will call the function and use the results to answer the question.

Though this isn't always perfect: it can decide that it can't find the information (ignoring that the functions exist), or, if it disagrees with the function response, it will just completely ignore it.

It also has issues with "find me the latest post and comment for ID 456", where it will call the function with "post" and then state that it doesn't have any comments data. It will find it after further prompting.


Yes! You can also instruct the LLM to constrain the generated statement with specific clauses, so that the notion of "my data" in the English query comes with implicit constraints. This can be made safe using a validation scheme, which is what HeimdaLLM[1] does. It guarantees that generated queries only join to allowed tables, that they have required constraints, that they only select allowable columns, and that they only execute allowed functions (and more).

The end result is that you can produce safe SQL queries on behalf of your users, allowing your users to use natural language to query their data in your database.

For example, in this demo[2], I ask it:

   query("how much have i spent renting movies, broken down by month")
And it responds correctly, guaranteed to be constrained to my data.

1. https://github.com/amoffat/HeimdaLLM 2. https://github.com/amoffat/HeimdaLLM/blob/dev/notebooks/demo...
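
For a concrete sense of the output, the validated statement looks roughly like this (SQLite-flavored sketch, not verbatim from the demo):

  SELECT strftime('%Y-%m', payment_date) AS month,
         SUM(amount) AS total_spent
  FROM payment
  WHERE customer_id = :customer_id  -- the required "my data" constraint the validator enforces
  GROUP BY month;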


I really don't see how English is a better query language than SQL. Yes, SQL has a bunch of sharp corners you have to learn, but plain English can be very ambiguous. Creating an interface that is simple at low query complexity is easy. A good interface is one that still makes sense when queries get complicated.

And to be fair, even SQL is not that good at that.


> I really don't see how English is a better query language than SQL.

People have been trying to develop "natural" programming languages since the advent of programming, and I think we can say that it's pretty much never going to work. The things that make natural language desirable (such as semantic ambiguity, individualized expression, inherent adaptability, etc) are almost exactly the things that you absolutely don't want to get involved when interfacing with computers. Programming languages lack these traits, which makes them predictable and dependable in a way natural language simply isn't, which is why they're good for interfacing with computers.


Wow, blast from the past. Ben Forta's books taught me ColdFusion WAY back in the late '90s. Very cool to see him still active in this space.


> "What happened to the company?"

> "It collapsed because nobody knew how to maintain the queries—they generated everything with ChatGPT."


Isn't that how it works right now with Excel? Pretty sure every company relies on an obscure spreadsheet that no one knows how to maintain.


I wouldn't be surprised, but there's at least someone around who could unwind it if necessary.

There's a big difference between limited access to domain experts and no access (or so limited that the costs are exorbitant or the individual is overwhelmed with opportunities).

My time horizon for these sorts of failures is ~10-20 years out. By that point, the younger generations coming into the workforce will have been fully conditioned (or just discouraged) out of thinking and building skills, wholly reliant on AI. I look at it as a "gradually, then all of a sudden" problem.


I could totally see these types of scenarios being played out irl.


This isn't limited to SQL either; I'm working on chaining together natural language queries to create Elasticsearch queries.

Fortunately, the queries themselves are small, so the rest of the tokens are spent on context that includes index names and schema.

This doesn't necessarily scale well in other contexts and can get pricy quick!


Anyone can do something like this now by searching Google or asking Clippy. The act of double-checking what the interface suggests is the same as doing the thinking and work manually. Perhaps more intelligent error messages would be helpful, or performance improvement suggestions.


The difficulty at my job is that many tables have no documentation, the transform code is inaccessible, and they were created by different teams, by people who aren't here any more.

By the time I figure out what to write in the prompt, most of my work is done.


Lol, a lot of the time I just copy-paste the CREATE syntax into ChatGPT and ask for help with how things are connected - it understands it pretty damn well.

At some point some startup is going to come up with a backend-in-a-box where you just describe the tables and rules, and it will spit out all the GraphQL mutations and queries, React components, an auth layer, etc...

ChatGPT out of the box can do all of that right now, but it needs a coordinating tool, perhaps to even make unit tests and ALTER statements (for changes) and, mostly, so that we don't blow out the context. That's the key issue - context. If context were unlimited, then we could just emit everything in one go. So the initial structure and some prompting should lead to everything else.


Yup! Pretty much magic!


> How to generate SQL statements with ChatGPT

Bobby Tables would like to have a word with you. /s


> select * from massive table where locking up production = 1

openai: sorry k thx bai



