
ejones · 2025-02-26T18:39:56 1740595196

Amazing work. I'm interested in the choice of WASM - presumably any target that can run DOOM could've been used? Of which there are innumerable choices I assume. Was it for symbolic reasons or genuinely the most useful target?

dimitropoulos · 2025-02-26T20:36:35 1740602195

love this feedback - will definitely talk about it in the next videos.

you're gonna laugh.. but the answer is "ignorance". I had no idea what I was doing and had literally never touched WebAssembly before but thought it'd be a good place to start. Then it just stuck.

Hilariously, later a friend explained to me "Dimitri, this would have been a LOT easier if you had just targeted ASSEMBLY. IT WAS RIGHT THERE IN THE NAME". haha. oh well! ignorance is bliss

ejones · 2025-02-26T21:32:43 1740605563

Ah nice! Well, hats off, this is really impressive. As other commenters mentioned, the extent to which it's documented and the restricted scope probably helped.

dimitropoulos · 2025-02-26T21:58:58 1740607138

exactly! knowing what I know now, actually WebAssembly was probably just about the best thing I could have accidentally picked!

rvnx · 2025-02-26T18:46:29 1740595589

WASM is one of the easier platforms to port as the Virtual Machine is well documented and there are actual implementations in many languages that can be used for debugging and comparing the results.

even in pure JS: https://github.com/evanw/polywasm

wvenable · 2025-02-26T18:45:39 1740595539

WASM is the easiest target because you don't have to emulate an entire computer.

svieira · 2025-02-26T21:01:37 1740603697

But in this case he kind of did anyway (at least the video makes reference to "L1 Instructions Cache").

wvenable · 2025-02-26T22:13:57 1740608037

But that's all CPU -- he doesn't have to emulate the rest of the computer (video card, IO systems, etc). You provide WASM with your own interface to the outside world.

dimitropoulos · 2025-02-27T13:19:14 1740662354

(not arguing, really just want to hear your thoughts to this!) so re: video card - but I did write what I don't know what else to call other than "a graphics driver" (i.e. it takes Doom palette pixel values and converts them to something the user sees on their screen with ASCII art). what else would you call that? or are you saying video card would have to be at the level of VSCode or my operating system that actually lights up physical pixels on my screen.
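For readers wondering what such a "graphics driver" step might look like, here is a minimal sketch, not the author's actual code. It assumes the frame is a flat, row-major list of palette indices and that the palette is a list of RGB tuples; the `RAMP` characters and the function names are hypothetical choices made for illustration.

```python
# Hypothetical sketch: turn palette-indexed pixels into ASCII art by brightness.
RAMP = " .:-=+*#%@"  # dark -> bright; an arbitrary character ramp


def luminance(rgb):
    """Approximate perceived brightness (0..255) using Rec. 601 luma weights."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b


def frame_to_ascii(pixels, width, palette):
    """pixels: flat row-major palette indices; palette: list of (r, g, b)."""
    lines = []
    for start in range(0, len(pixels), width):
        chars = []
        for idx in pixels[start:start + width]:
            level = luminance(palette[idx]) / 255.0
            # Clamp so a fully white pixel maps to the last ramp character.
            chars.append(RAMP[min(int(level * len(RAMP)), len(RAMP) - 1)])
        lines.append("".join(chars))
    return "\n".join(lines)


if __name__ == "__main__":
    demo_palette = [(0, 0, 0), (255, 255, 255), (128, 128, 128)]
    print(frame_to_ascii([0, 1, 2, 1], 2, demo_palette))
```

Nothing here touches hardware: it is a pure function from game output to text, which is roughly the distinction the replies below draw between a host-provided interface and emulating a video card.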

wvenable · 2025-02-27T16:43:57 1740674637

If instead of WASM you decided to emulate a different DOOM target like a PC, then you'd have to emulate the actual VGA graphics hardware and enough of the other PC hardware to run the game. That level of emulation is a difficult project on its own.

dimitropoulos · 2025-02-27T17:36:30 1740677790

got it, ok sweet. thank you so much for explaining. I don't know a ton about these programming circles, so I don't want to say the wrong thing if I can avoid it. Sounds like you're saying that no WASM runtime is, in this sense, qualifying - which makes sense!

ejones · on Aug 6, 2024

FWIW, llama.cpp has always had a JSON schema -> GBNF converter, although it launched as a companion script. Now I think it's more integrated in the CLI and server.

But yeah, I mean, GBNF or other structured output solutions would of course allow you to supply formats other than JSON schema. It sounds conceivable, though, that OpenAI could expose the grammars directly in the future.
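To make the schema-to-grammar idea concrete, here is a minimal sketch of a converter that emits a GBNF-style grammar. It is not llama.cpp's converter: it assumes a flat object schema whose properties are all required, emitted in declaration order, and limited to `string`, `integer`, and `boolean`; the simplified string rule also ignores escape sequences. The names `PRIMITIVES` and `schema_to_grammar` are hypothetical.

```python
import json

# Simplified grammar rules for a few JSON primitive types (assumption:
# strings contain no escapes, integers have no exponent).
PRIMITIVES = {
    "string": r'"\"" [^"\\]* "\""',
    "integer": '"-"? [0-9]+',
    "boolean": '"true" | "false"',
}


def schema_to_grammar(schema):
    """Convert a flat JSON object schema into a GBNF-style grammar string."""
    if schema.get("type") != "object":
        raise ValueError("this sketch only handles flat object schemas")
    rules = {}
    members = []
    for name, prop in schema["properties"].items():
        kind = prop["type"]
        if kind not in PRIMITIVES:
            raise ValueError(f"unsupported type: {kind}")
        rules[kind] = PRIMITIVES[kind]
        # First dumps() makes the JSON key; the second quotes it as a literal.
        key = json.dumps(json.dumps(name))
        members.append(f'{key} ":" ws {kind}')
    body = ' "," ws '.join(members)
    lines = [f'root ::= "{{" ws {body} ws "}}"']
    lines += [f"{rule} ::= {rhs}" for rule, rhs in sorted(rules.items())]
    lines.append("ws ::= [ \\t\\n]*")
    return "\n".join(lines)


if __name__ == "__main__":
    person = {
        "type": "object",
        "properties": {"name": {"type": "string"}, "age": {"type": "integer"}},
    }
    print(schema_to_grammar(person))
```

Because the output is just a grammar, nothing stops you from hand-writing rules for formats a JSON schema cannot describe, which is the point about non-JSON formats above.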

behnamoh · on Aug 6, 2024

I think for certain tasks it's still easier to write the grammar directly. Does converting from JSON to a CFG limit the capabilities of the grammar? i.e., are there things JSON can't represent that a context free grammar can?

ejones · on Aug 6, 2024

You might be right that they're similarly powerful. In some cases, an arbitrary output format might in and of itself be desirable. Like it might result in token savings or be more natural for the LLM. For instance, generating code snippets to an API or plain text with constraints.

And this is more esoteric, but technically in the case of JSON I suppose you could embed a grammar inside a JSON string, which I'm not sure JSON schema can express.

ejones · on Aug 6, 2024

Similar approach to llama.cpp under the hood - they convert the schema to a grammar. Llama.cpp's implementation was specific to the ggml stack, but what they've built sounds similar to Outlines, which they acknowledged.

ejones · on April 1, 2022

Awesome to see work in the DB wire compatible space. On the MySQL side, there was MySQL Proxy (https://github.com/mysql/mysql-proxy), which was scriptable with Lua, with which you could create your own MySQL wire compatible connections. Unfortunately it appears to have been abandoned by Oracle and IIRC doesn't work with 5.7 and beyond. I used it in the past to hack together a MySQL wire adapter for Interana (https://scuba.io/).

I guess these days the best approach for connecting arbitrary data sources to existing drivers, at least for OLAP, is Apache Calcite (https://calcite.apache.org/). Unfortunately that feels a little more involved.

westurner · on April 1, 2022

NYTimes/DBSlayer (2007) wraps MySQL in JSON: https://open.nytimes.com/introducing-dbslayer-64d7168a143f

ODBC > Bridging configurations: https://en.wikipedia.org/wiki/Open_Database_Connectivity#Bri...

awesome-graphql > tools: security: https://github.com/chentsulin/awesome-graphql#tools---securi...

... W3C SOLID > Authorization and Access Control: https://github.com/solid/solid-spec#authorization-and-access...

"Hosting SQLite Databases on GitHub Pages" (2021) re: sql.js-httpvfs, DuckDB https://news.ycombinator.com/item?id=28021766

[edit] TIL the MS ODBC 4.0 spec is MIT Licensed, on GitHub, and supports ~challenge/response token auth: "3.2.2 Web-based Authentication Flow with SQLBrowseConnect" https://github.com/microsoft/ODBC-Specification/blob/master/...

> Applications that are unable to allow drivers to pop up dialogs can call SQLBrowseConnect to connect to the service.

> SQLBrowseConnect provides an iterative dialog between the driver and the application where the application passes in an initial input connection string. If the connection string contains sufficient information to connect, the driver responds with SQL_SUCCESS and an output connection string containing the complete set of connection attributes used.

> If the initial input connection string does not contain sufficient information to connect to the source, the driver responds with SQL_NEED_DATA and an output connection string specifying informational attributes for the application (such as the authorization url) as well as required attributes to be specified in a subsequent call to SQLBrowseConnect. Attribute names returned by SQLBrowseConnect may include a colon followed by a localized identifier, and the value of the requested attribute is either a single question mark or a comma-separated list of valid values (optionally including localized identifiers) enclosed in curly braces. Optional attributes are returned preceded with an asterix (*).

> In a Web-based authentication scenario, if SQLBrowseConnect is called with an input connection string containing an access token that has not expired, along with any other required properties, no additional information should be required. If the access token has expired and the connection string contains a refresh token, the driver attempts to refresh the connection using the supplied refresh token.
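As a rough illustration of the attribute format described in the quoted passage, the sketch below parses a browse-result connection string into structured entries. It is not an ODBC API: `parse_browse_result` and the sample string are hypothetical, and the naive split on `;` and `,` assumes values contain neither character.

```python
def parse_browse_result(out_conn_str):
    """Parse 'KEY:Label=?;*KEY2:Label2={a,b}' into a list of attribute dicts."""
    attrs = []
    for part in filter(None, out_conn_str.split(";")):
        name, _, value = part.partition("=")
        optional = name.startswith("*")  # a leading asterisk marks optional
        keyword, _, label = name.lstrip("*").partition(":")
        if value == "?":
            choices = None  # free-form value requested from the application
        elif value.startswith("{") and value.endswith("}"):
            choices = [v.strip() for v in value[1:-1].split(",")]
        else:
            choices = [value]
        attrs.append(
            {
                "keyword": keyword,
                "label": label or None,  # localized identifier, if present
                "optional": optional,
                "choices": choices,
            }
        )
    return attrs


if __name__ == "__main__":
    sample = "HOST:Server={red,blue};UID:Login ID=?;*APP:AppName=?"
    for attr in parse_browse_result(sample):
        print(attr)
```

An application would fill in the required attributes and pass the resulting string to the next SQLBrowseConnect call, repeating until the driver reports success.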

Also TIL ODBC 4.0 supports: https://github.com/microsoft/ODBC-Specification :

> Semi-structured data – Tables whose schema may not be defined or may change on a row-by-row basis

> Hierarchical Data – Data with nested structure (structured fields, lists)

> Web Authentication model

ejones · on March 25, 2022

In fact it looks like there are already browser extensions that do this for Markdown:

https://chrome.google.com/webstore/detail/markdown-viewer/ck... https://chrome.google.com/webstore/detail/markdown-preview-p...

ejones · on Sept 14, 2021

There's a proposal for this: https://github.com/tc39/proposal-partial-application

abecedarius · on Sept 14, 2021

That's more specialized: it can express

    foo(42, ?)

but not what you'd write within this pipe syntax as

    foo(42, ^) + 1

The tradeoff is that it doesn't need some extra delimiter since the function call is what delimits it. Perhaps that's a better tradeoff, I'm not sure; but for sure we shouldn't have both.
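The distinction is easy to see in a language that already has both forms. The sketch below is a loose Python analogy, not JavaScript and not either proposal's semantics; `foo` and the variable names are made up for illustration.

```python
from functools import partial


def foo(a, b):
    """Stand-in function for the example."""
    return a * 10 + b


# Rough analogue of `foo(42, ?)`: the call itself delimits the placeholder,
# so fixing an argument is all that can be expressed.
call_with_42 = partial(foo, 42)

# Rough analogue of `foo(42, ^) + 1`: the placeholder sits inside a larger
# expression, so something must delimit the whole body (here, a lambda).
call_plus_one = lambda x: foo(42, x) + 1

if __name__ == "__main__":
    print(call_with_42(5))
    print(call_plus_one(5))
```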

ejones · on March 31, 2021

It just navigated the page itself by assigning window.location.href, which isn't subject to the same restrictions as popups.

ejones · on July 17, 2020

Yeah, the inclusion criteria are pretty broad:

  The snapshot will include every repo with any commits
  between the announcement at GitHub Universe on November 13th
  and 02/02/2020, every repo with at least 1 star and any
  commits from the year before the snapshot (02/03/2019 -
  02/02/2020), and every repo with at least 250 stars.

from https://archiveprogram.github.com/#arctic-code-vault

sdesol · on July 17, 2020

This would make sense, as it looks like money wasn't a huge factor. If money wasn't an issue, time and effort would be, and you can easily reduce effort by storing pretty much everything.

ejones · on June 18, 2020

Apparently the Saguache Crescent in Colorado is (as of this video in 2016) still using the linotype, the last known newspaper in the US to do so: https://www.youtube.com/watch?v=DNa9XRoNRUM

ejones · on Aug 8, 2011

That, or use a split-frame browser extension (https://addons.mozilla.org/en-US/firefox/addon/fox-splitter-...) (http://www.chromeplugins.org/plugins/google-chrome-dual-view...)