A while ago I wrote a Rust / tree-sitter program to detect reify usage in Clojure code: https://github.com/borkdude/analyze-reify (partly to learn Rust, but the performance is also really nice). I wanted to extend this to more advanced patterns and do it in Clojure itself, because that's what I'm most productive in.
The reason I wanted to know more about this specific pattern is to find out whether it's relevant enough for scripting. If so, I should probably support it in my Clojure interpreter (https://github.com/borkdude/sci), which is used in babashka (https://github.com/borkdude/babashka), a fast-starting scripting environment for Clojure.
@Borkdude I love your project & the name! I work on a similar tool (https://github.com/returntocorp/semgrep), and we've recently switched to using tree-sitter under the hood so that we can add new languages faster. Since Semgrep tries to support many languages, we can't use the native language parsers, so I was interested in https://github.com/sogaiu/tree-sitter-clojure. How mature (or buggy) would you say the tree-sitter-clojure implementation is? We've found that the quality of tree-sitter implementations can vary wildly across languages.
it's unclear to me what exactly they rely on in tree-sitter grammars so it's hard to say whether tree-sitter-clojure would be helpful for them.
i think tree-sitter-clojure does what it does pretty well based on testing across a large number of code files from clojars and recent generative testing runs.
however, it deliberately does not try to do certain things (e.g. identify definitions, special forms, etc.) for various reasons. the approach is very similar to that of rewrite-clj* and parcera.
This could be cool for doing similarity searches as well as generating Markov probability trees of how APIs or language features are used.
I was thinking of doing something similar for a RISC-V CPU, where the CPU would generate compressed histograms of common multi-instruction sequences (across loop boundaries) so that one would know which superinstructions to synthesize.
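To make the histogram idea concrete, here is a rough software-only sketch of counting instruction n-grams over an execution trace. The trace, the opcode names, and the `ngram_histogram` helper are all made up for illustration; a real design would do this in hardware and with some form of compression, which this does not attempt.

```python
from collections import Counter


def ngram_histogram(instructions, n):
    """Count every run of n consecutive opcodes in an instruction trace."""
    # zip over n shifted views of the trace yields each sliding window as a tuple
    windows = zip(*(instructions[i:] for i in range(n)))
    return Counter(windows)


# Hypothetical trace of a loop body executed twice, so some windows
# span the loop boundary (bne -> lw).
trace = ["lw", "addi", "sw", "bne", "lw", "addi", "sw", "bne"]

hist = ngram_histogram(trace, 2)
candidates = hist.most_common(3)  # most frequent pairs are superinstruction candidates
```

Because the trace is a dynamic execution order rather than static program text, sequences that cross the loop's back edge are counted like any other.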
What is your motivation for making this? Refactoring? Code indexing? If the latter, is it for academic analysis, defect detection, or something else?
I could see this being a really nifty search interface for massive code libraries.
Are you discussing this library on another Clojure communication channel (Discord, Slack, etc.), and if so, where?