For the curious, several things I noticed while running the Anaconda-distributed Shakti:
1) The k7 implementation is obviously a completely different codebase from current kdb+. I ran the following query on kdb+ and k7:
a1:1000000?1000000;b1:1000000?1000000;
\t select #a by b from (+`a`b!(a1;b1));
The difference is about an order of magnitude (k7 takes 650 msec, kdb+ takes 77 msec). Also, on k7 the time increases with each subsequent execution of the same query ==> memory leak? Looks like it's at a very early stage.
2) null number (0N) and infinity are now represented as non-ascii symbols Ø and ∞, also parsed as such.
3) type operator (@) returns symbols (`i, `j) for ints and longs etc, was returning shorts before. Interesting how do we distinguish arrays/scalars now.
4) default numeric type (e.g. 12345) is now int, was long.
5) entering overflow numeric literal returns Ø, was throwing exception.
6) k in Anaconda is an unstripped 675K Linux x64 executable; stripped it is 220K, while kdb+ is 657K as sold.
> 3) type operator (@) returns symbols (`i, `j) for ints and longs etc, was returning shorts before. Interesting how do we distinguish arrays/scalars now.
This doesn't seem to be in the spirit of the language. Testing whether a type is an array or not is so common, and adding this function (li, li2) to a "standard library" would mean introducing a "standard library" in the first place. Maybe we overlooked some other obvious solution.
A concept in K that I wish more languages had is indexing with a list of indices. It always blows people's minds when I tell them how sorting works in K. You use grade (<) to return the indices that will sort the array if indexed in that order. Then you index the original list with that list of indices. They do it in the string examples in this tutorial.
Sorting a list by getting the sorted indices (permutation) and then applying them might seem a bit roundabout at first glance. Then you realize that you can apply the permutation to something else. For example, to sort a table by the values of a column, you get the column, "sort" the column, then apply the permutation to the original table (all columns of the table).
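For readers who don't have k at hand, here is a rough Python analogue of the grade-then-index idea described above. It is only a sketch: `grade_up` and `index` are hypothetical helper names standing in for K's `<` and list indexing, and the column-wise dict is just an assumed stand-in for a table.

```python
def grade_up(values):
    """Return the indices that would sort `values` ascending (a stand-in for K's grade, `<`)."""
    return sorted(range(len(values)), key=lambda i: values[i])


def index(values, indices):
    """Index a list with a list of indices."""
    return [values[i] for i in indices]


# A tiny "table" stored column-wise: a dict of equal-length lists.
table = {
    "name": ["carol", "alice", "bob"],
    "age": [35, 28, 41],
}

# Grade one column, then apply the same permutation to every column.
perm = grade_up(table["age"])
sorted_table = {col: index(vals, perm) for col, vals in table.items()}

print(perm)          # [1, 0, 2]
print(sorted_table)  # {'name': ['alice', 'carol', 'bob'], 'age': [28, 35, 41]}
```

The point is that the permutation is a value in its own right: it is computed once from one column and can then be reused on any other list of the same length.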
If you want to rank results differently, you'd apply a function to each element of the original vector that computes the rank / weight of that element. Then you sort by weights, and apply the result to the original vector. Something like:
This is also how filtering works, you map a function that returns 0 or 1 over the list. Then you can call where (&) and it will give you the indices where the function evaluates true. Index by that and you'll filter a list.
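The two ideas above (rank by computed weights, filter via where) can be sketched in Python as well. This is an illustration under assumptions, not k code: `grade_down` and `where` are hypothetical helpers mimicking K's `>` and `&` on a 0/1 list, and the word list and the "length" weight are made up for the example.

```python
def grade_down(weights):
    """Indices that sort `weights` descending (a stand-in for K's `>`); ties keep their order."""
    return sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)


def where(flags):
    """Indices at which a 0/1 list is 1 (a stand-in for K's where, `&`)."""
    return [i for i, flag in enumerate(flags) if flag]


words = ["kdb", "apl", "shakti", "j"]

# Ranking: compute a weight per element, grade the weights, index the original.
weights = [len(w) for w in words]
ranked = [words[i] for i in grade_down(weights)]

# Filtering: map a predicate to 0/1, take where, index the original.
flags = [int(len(w) == 3) for w in words]
kept = [words[i] for i in where(flags)]

print(ranked)  # ['shakti', 'kdb', 'apl', 'j']
print(kept)    # ['kdb', 'apl']
```

In both cases the original list is never reordered or scanned by a special-purpose routine; everything reduces to "produce a list of indices, then index with it".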
It's a very good tutorial (it's not easy to find information about k7), but it is a bit outdated. For example, % is now inverse rather than sqrt, and it is not possible to evaluate parse trees with !. I am sure the differences are small, but be prepared to be surprised by some descriptive error messages like:
'nyi
A beginner is going to have a bad time with any language. Sure, 'nyi is not very descriptive, but an error message is not the place to teach total beginners. Unless you put some effort into learning a language (or already know something similar), compiler errors will be difficult to figure out anyway.
The most blatant example is the infamous "discards qualifiers" error, but even in a programming language with nice and descriptive error messages like Go, you will have a hard time figuring out what an interface type error is about until you actually read the manual and learn what an interface is.
A good error message doesn't have to be descriptive, but if it isn't, it absolutely must be Googleable -- i.e. unique enough so that someone who doesn't understand what it means can paste it into a search engine and find out.
If I Google "'nyi", I get a bunch of results about the New York Islanders hockey team. Not helpful. If I try to be more specific and Google "'nyi k7", the closest thing to a helpful result I get is a link to this discussion on HN.
What's important is that the APL language family isn't domain-specific, suitable only for particular tasks in a narrow niche. On the contrary, these are languages of computation, allowing a rather short path from thought to implementation of an algorithm, compared to other languages.
I'm sure K knowledge and experience give the practitioner powerful tools for a wide variety of problems in many areas.
The mentioned gitlab repo doesn't seem to exist anymore. One would have to download kdb+ from kx, but I am not sure if this wouldn't be K4 only. Alternatively, one could play with oK (K6) which would also give you the nice graphical iKe.
Not really sure... people started talking about it on the mostly dead /r/apljk subreddit last week, which caught my attention. It appears Arthur Whitney left Kx Systems and started his own company again. I'm immediately suspicious when I see "blockchain" in something, but Arthur is a bit of a legend. I don't fully understand why it is bundled with Anaconda (scientific and data analysis oriented distribution of Python), but I assume a library is utilized there for machine learning or something like that (hopefully someone can comment and tell us what). You can type "/" in the REPL to bring up the help menu showing the various symbols and what they mean monadically (when taking one argument) or dyadically (when taking two arguments). As usual though, their website seems to have virtually zero documentation, as if they want to stay obscure. I'd love it if someone in the industry who can afford to use these products could tell me what Shakti looks like it would be used for. Is it mostly the same as kdb+ with a few added features?
Haha. Same here for Red. I understand why they did it (it seems like they wanted blockchain and cryptocurrency to be their killer app), but I'm worried it was just a costly detour. That is their decision of course, and best of luck to them. The other thing they're doing (if I understand correctly) is creating a second code optimizer that is a paid product. So the default compiler creates slower code than the commercial one, I think.
I'm also still waiting on kOS, but I'll never be able to afford it, so not sure why I'm waiting on it.
> Anaconda (scientific and data analysis oriented distribution of Python)
While that was the original and still by far most common use case of Anaconda, it has greatly expanded over the past couple of years into a more general purpose software environment packaging tool.
I think the reason it's in Anaconda is that they're trying to go after the Python crowd? Regain some of the data analysis market share that stuff like NumPy took from APL.
I see a lot of niche languages like Dyalog APL doing the same thing (building bridges to Python), but if I really have to use Python, I'll just use Python. Not a hybrid of the two. Maybe that's just me though.
Think about it as an extension to e.g. K/Q's strengths. What if you would like to visualise certain aspects of quote/trade data? It could take a very long time to process the data in Python and it is difficult to visualise data from within kdb+. Seems to be a good match.
I know what you're getting at, but how hard can it be for a software company that makes millions (my assumption) to add a charting library as a native option with a verb for linechart, scatterplot, barplot...etc? They could just bundle the C++ JUCE library and then their users could create all sorts of GUIs and charts. I've seen other commercial products do this and it seems to work well, but I can understand them not wanting to include 3rd party code and pay a licensing fee. Still...I'd rather have a native way to do charting within a language rather than spin off a Python interpreter with Matplotlib. It just doesn't sound like a rapid prototyping process if you have to do that. I think I read before that they used to have a simple chart library, but removed it to keep the product tiny. I assume they know best, but don't know their reasoning. Thanks for the discussion though.
This reminds me of using F# to call out to R. If I have to do all that work, I might as well just use R :). I realize there are some edge cases where that is nice though and that not everyone is as bothered by that as me. It seems like the context switching is inefficient.
They used to have it years ago in K2, a dynamic GUI that updated as the values updated. Arthur's brother wrote it.
It used an interesting idea that expanded on the ideas of dependencies and triggers to also include GUI updates. There were some special attributes that described the layout of the variable, and you did `show$val to display it. As the value of val changed (and all other values it touched or included), the GUI would update for you.
Well, I think you already guessed the reasoning correctly: the use cases of kdb+ are all so concerned with performance that it even matters whether the whole binary fits in the CPU cache. That's certainly the reason why the GUI stuff was removed after K2. The same goes for the removal of almost all system calls in the pre-kOS prototypes. It must be very interesting to work on such a system, squeezing out every possible bottleneck.
Didn't know that. Thanks! Does Shakti not have the same problem with Python? Or is Python only mostly for visualization once you get back your dataset?
K7, like its ancestor APL, is an extremely powerful language with a great signal-to-noise ratio. However, it is kind of a "write-only" language. When trying to read code not written by you (or written by you a couple of months ago), you struggle to understand it...
Agree with this. The problem is that there's no widely adopted style guide for k or q. A lot of code you find on the web is written in a minified fashion. At least put in a new line and indent whenever you can.
K is interesting in that, unlike something like Java, the hard part is not remembering the standard library. You start to learn idioms for how to do certain things as you use it more. Then you pattern match different parts of the code and it can be very readable.
The fact that every function/operator has multiple meanings, often very different, depending on the actual data passed to it doesn't help readability. It tries too hard to use only symbols, which are limited on a normal keyboard, and so it crams multiple functionalities into the same symbol.
The language looks generally good for its goal, except these two operators
|/y / maximum
&/y / minimum
Those two characters have no connection with the operation they stand for and seem to be randomly picked from the available symbols on the keyboard. Why not simply max y and min y, which anybody can read and understand?
At least for boolean operations, | (or) can be thought of as the maximum of a vector of 0s (false) and 1s (true). Likewise, & (and) is the minimum of a vector of 0s (false) and 1s (true).
But I don't know if that was the motivation / justification for naming these operators / functions.
if you squint a bit, `x or y` returns the one that's greater, i.e. `0 or 1 == 1`; and `x and y` returns the one that's smaller, i.e. `0 and 1 == 0`. so min/max kind of are a generalization of and/or
Those are “or”=max, “and”=min. They are sometimes used in mathematics (I have used them), but written with the “low angle” and “up angle” symbols (∨ and ∧). Not too frequently though.
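The or/max and and/min correspondence discussed above is easy to check mechanically. The following Python sketch (not k; `reduce(max, ...)` is only an assumed analogue of `|/` and `reduce(min, ...)` of `&/`) verifies it on all 0/1 pairs and then applies the same reductions to ordinary numbers.

```python
from functools import reduce

# On 0/1 values, max agrees with logical "or" and min with logical "and".
for x in (0, 1):
    for y in (0, 1):
        assert max(x, y) == int(x or y)
        assert min(x, y) == int(x and y)

bits = [0, 1, 1, 0]
any_set = reduce(max, bits)  # "or" over the list, analogous to |/bits
all_set = reduce(min, bits)  # "and" over the list, analogous to &/bits

# The same reductions generalize from booleans to arbitrary numbers.
readings = [3, 7, 2]
largest = reduce(max, readings)
smallest = reduce(min, readings)

print(any_set, all_set)   # 1 0
print(largest, smallest)  # 7 2
```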
There are a few expected (or, better said, rumored) features that have been left out. Some of them will eventually come (views), some of them probably won't (like using the Unicode symbol for sqrt). I was surprised when I saw in nsl a shift-reduce parser that took operator precedence into account, but it looks like it is not being used. I'd like to have lexical scoping too, but due to how function values work, I think it very unlikely.
It’s not a port, it’s a separate implementation. The difference in meaning between these two words is sufficiently important that it’s worth pointing out IMO.
Porting software means to adapt an existing piece of software to a different computing environment.
Ports are based on source code and/or assets from the original piece of software that is being ported.
As such a port would be derivative work, meaning that the authors would not be able to release their software without permission from the copyright holder(s) of the software that was being ported.
In the 1990s Morgan Stanley's full suite of fixed income portfolio management and trading applications was written in APlus, which was another of Arthur Whitney's APL variants. When I joined, MS was moving away from it, but the developers who'd been using it were very much in the "you'll take APlus from my cold, dead fingers" camp.
Not K7, but KDB+ is the de facto time series store for financial data. It's faster than anything open source by orders of magnitude and extremely expressive. However, it's extremely expensive. If KDB had a much more user-friendly licensing scheme, they would have seen much more adoption.
J and K are different languages. They are similar in that both use standard ASCII keys and are in the APL family of languages. The author of K, Arthur Whitney, wrote part of a prototype of the J interpreter with Roger Hui, or so I think. The primary author of J is none other than Ken Iverson (the Turing Award winner and designer of the original APL), who wanted a free version of APL for the masses where people wouldn't complain about the weird non-ASCII APL symbols. It also primarily uses a tacit function train style of programming, like data flow languages, where you don't even necessarily need variables (pretty cool). This tacit coding style has since been added to Dyalog APL, where you can see folks like Aaron Hsu build some pretty cool applications using it. I'd imagine K is more performant than J though, and the source is famously small (just a few C files). J has a lot more stuff baked in, like Qt support. Both have a database library: Jd (for J) and kdb+ (for K). kdb+ is a fancy SSD-based database which uses K and a SQL-like DSL called Q. Its performance is pretty darn good and it is used in finance. I don't know enough about Jd.
I get the same gut reaction as you, but I tried reformatting the code a little bit and it's not _that_ terrible.
From the very little APL I've learnt, I know that operators always have either one parameter (named ω) or two (named α and ω), and their inputs and outputs are always arrays.
If you need to write a lot of such functions, it makes sense to define some macros to help you:
#define V1(f) A f(w)A w; // create one-arg function
#define V2(f) A f(a,w)A a,w; // create two-arg function
#define DO(n,x) {I i=0,_n=(n);for(;i<_n;++i){x;}}
iota is the APL equivalent of Python's range function:
V1(iota) { // Define one-arg function iota.
I n = *w->p; // Get the value of the ω argument.
A z = ga(0, 1, &n); // Allocate output array.
DO(n, z->p[i] = i); // Assign increasing integers.
R z; // Return the result.
}
The plus function adds two arrays:
V2(plus) { // Define two-arg function plus.
I r = w->r, // Get the rank of ω.
*d = w->d, // Get a pointer to ω's data.
n = tr(r,d); // Get the size of ω's data.
A z = ga(0, r, d); // Allocate output array.
DO(n, z->p[i] = a->p[i] + w->p[i]);
// Add corresponding values of α and ω
R z; // Return the result.
}
Personally I cannot tolerate the lack of whitespace, but the APL guys are known to like to see their entire programs in one screenful. I can understand that some people like to write code like this.
This comes up a lot in these discussions and I believe Arthur better understands 6 pages of code written like that than someone else can comprehend 30 pages of the equivalent idiomatic code. He writes APL/K for a living, so making the C fit closer makes more sense in my eyes as it matches his thought process better. I sympathize with hating scrolling in how it hurts your ability to see everything at once.
I'm trying to learn how to use A+ from aplusdev.org to do some simple work for my personal use. It's an earlier language designed by Arthur Whitney, and it's GPL too. It's included in Debian-derived distros, btw.
GNU APL is a full-fledged, modern implementation of the official ISO APL 2 standard, supporting things like Unicode and modern terminals, and is currently maintained by very friendly people, while A+ is a very old, partial implementation of APL that requires a dedicated font and 8-bit encoding to work, and has not been maintained for decades. (IIRC)
Documentation for APL 2 can be found online, but I would recommend getting a second hand copy of the classic Gilman Rose book. I bought mine online for a few dollars.
Sorry for the late reply. I seem to have the 1976 edition (I didn't think of checking the edition before buying mine! DOH!) I would say it was good enough for learning the language, although the references to mainframes and terminals are obsolete. (Picturesque, if you like that sort of thing, but obsolete.) If you can get the latest edition (1991) it's probably better.
I ended up buying that book because, from all the online resources I was able to find at the time, I failed to understand the gist of the language, or how you are supposed to think in order to solve problems in APL. That way of thinking is profoundly different from all other languages, and the book taught it very well. I've also been using paper books to learn programming since forever, so it's a workflow I'm used to.
That being said, there are some free resources that you could try first:
The latter is about Dyalog APL, a dialect of APL, so not all of it will be applicable to GNU APL.
About the font, any Unicode font will do, but for graphical consistency I would recommend downloading this one. You can use it with any modern terminal application:
About the keyboard, if you are on Linux you can just add one of several APL variants to your existing national keyboard. You will have to designate an APL key (eg. the useless menu key) to use as a special alt-like key to input APL symbols. If you're not comfortable typing blindly, you can buy a keyboard with APL symbols on it, or alternatively buy or print stickers to apply to yours.
I spent an entire summer learning APL for fun (that was way before GNU APL existed, so I had to struggle with various proprietary interpreters, either freeware or demoware, on top of the weird language + font + keyboard!) I then went on to compete in Code Golf challenges and other random things using it and had a blast!
I still miss APL. IMHO none of the successors (J, K) come close to the beauty of its symbols. It's also a very different way of writing algorithms (sometimes called "multi-dimensional array programming") that is not found in modern programming languages.
Feel free to contact me (username + gmail) for any questions you may have.