K7 Tutorial

sannysanoff · on March 18, 2019

For the curious, several facts I noticed by running the Anaconda-distributed Shakti:

1) The k7 implementation is obviously a completely different codebase from current kdb+. I ran the following query on kdb+ and k7:

  a1:1000000?1000000;b1:1000000?1000000;
  \t select #a by b from (+`a`b!(a1;b1));

The difference is about an order of magnitude (k7 takes 650 msec, kdb+ 77 msec). Also, on k7 the time increases on subsequent executions of the same query ==> memory leak? Looks like it's at a very early stage.

2) null number (0N) and infinity are now represented as non-ascii symbols Ø and ∞, also parsed as such.

3) type operator (@) returns symbols (`i, `j) for ints and longs etc, was returning shorts before. Interesting how do we distinguish arrays/scalars now.

4) default numeric type (e.g. 12345) is now int, was long.

5) entering overflow numeric literal returns Ø, was throwing exception.

6) the k binary in Anaconda is an unstripped 675K Linux x64 executable; stripped it is 220K, while kdb+ is 657K as sold.
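
For readers who don't read k: the query in point 1 counts the rows for each distinct value of b. A rough Python sketch of the same computation, on a tiny fixed input rather than a million random values (the names `b1` and `counts` are only illustrative, and this says nothing about how k7 or kdb+ implement the query):

```python
from collections import Counter

# Tiny stand-in for the column b1 from the query above (illustrative data).
b1 = [3, 1, 3, 2, 1, 3]

# "select #a by b": number of rows per distinct value of b, keys in ascending order.
counts = dict(sorted(Counter(b1).items()))
print(counts)
```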

pvitz · on March 18, 2019

One might add that by typing "\", the K7 description gets printed (similar to the description on kparc.com).

SifJar · on March 19, 2019

> 3) type operator (@) returns symbols (`i, `j) for ints and longs etc, was returning shorts before. Interesting how do we distinguish arrays/scalars now.

Upper/lower case:

   @1
  `i
   @1 2
  `I
   @`a
  `n
   @`a`b
  `N

sannysanoff · on March 19, 2019

I saw that, but how do you make a simple check for whether it is uppercase?

SifJar · on March 19, 2019

I guess you can check the ASCII value of the char:

   li:{(*$@x)within "AZ"}
   li 1
  0
   li 1 2 3
  1

Or can have a list of uppercase chars as names & check with 'in':

   li2:{(@:y)in x}[`$,:'`c$"A"+!26]
   li2 1 2 3
  1
   li2 1
  0

'li2' seems to be considerably faster:

   \t:10000 li 1 2 3
  13
   \t:10000 li2 1 2 3
  6

There may well be a better way though

sannysanoff · on March 20, 2019

This does not seem to be in the spirit of the language. Testing whether a type is an array or not is so common, and adding this function (li, li2) to a "standard library" would introduce a "standard library". Maybe we overlooked some other obvious solution.

chrispsn · on March 18, 2019

Perhaps the performance difference is due to resource limitations in the Shakti trial?

toastking · on March 18, 2019

A concept in K that I wish more languages had is indexing with a list of indices. It always blows people's minds when I tell them how sorting works in K. You use grade (<) to return the indices that will sort the array if indexed in that order. Then you pass in that list as the indices to the list. They do it in the string examples in this tutorial.
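
For anyone who hasn't seen it, here is a minimal Python sketch of the idea. The helper names `grade_up` and `index` are made up for illustration; in K these are the built-in `<` and plain indexing.

```python
def grade_up(xs):
    """Return the indices that would sort xs ascending (the role of k's monadic <)."""
    return sorted(range(len(xs)), key=lambda i: xs[i])

def index(xs, idx):
    """Index a list with a list of indices (the role of k's xs[idx])."""
    return [xs[i] for i in idx]

data = [30, 10, 20]
perm = grade_up(data)            # positions of the smallest, next smallest, ...
sorted_data = index(data, perm)  # the data rearranged by that permutation
print(perm, sorted_data)
```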

i_don_t_know · on March 18, 2019

Sorting a list by getting the sorted indices (permutation) and then applying them might seem a bit roundabout at first glance. Then you realize that you can apply the permutation to something else. For example, to sort a table by the values of a column, you get the column, "sort" the column, then apply the permutation to the original table (all columns of the table).
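
A small Python sketch of that table case, assuming the table is stored as a dict of equal-length columns (the column names `sym` and `price` are invented for the example):

```python
# A table as a dict of columns (illustrative data).
table = {"sym": ["c", "a", "b"], "price": [3.0, 1.0, 2.0]}

# "Grade" the price column: the row order that sorts it ascending.
key_col = table["price"]
perm = sorted(range(len(key_col)), key=lambda i: key_col[i])

# Apply the same permutation to every column of the table.
sorted_table = {name: [col[i] for i in perm] for name, col in table.items()}
print(sorted_table)
```

The permutation is computed once from one column and reused for all of them, which is why the two-step approach pays off.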

geocar · on March 18, 2019

Another great concept is that function application and array indexing have the same syntax.

    a.map(f)

is basically† just:

    f[a]

and:

    a.map(x => m[x])

is still just:

    m[a]

†: Rank notwithstanding. If that bothers you that f might take an atom, pretend I said f@/:a and m@/:a instead.
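
To make the point concrete in a language without this unification, here is a rough Python sketch. The helper `at` is hypothetical; it only emulates the each-right form mentioned in the footnote.

```python
def at(f, a):
    """Apply f to each element of a, whether f is a function or a lookup table."""
    if callable(f):
        return [f(x) for x in a]   # function application
    return [f[x] for x in a]       # list or dict indexing

double = lambda x: x * 2
m = [10, 20, 30]

print(at(double, [1, 2]))   # function applied to each element
print(at(m, [2, 0]))        # list indexed by each element
print(at({"a": 1}, ["a"]))  # dict looked up by each element
```

In K no such helper is needed, because the caller does not have to know which of the three cases it is dealing with.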

icen · on March 18, 2019

A related operator is group (=), which creates a map between the unique elements of a list to their locations.

     x: 10 ? `a`b`c
     x
    `a`c`c`c`c`a`b`a`a`c
     =x
    a|0 5 7 8  
    b|,6       
    c|1 2 3 4 9
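
A Python sketch of what group computes, using the same ten symbols as above (the function name `group` is just a label for this example):

```python
def group(xs):
    """Map each distinct element of xs to the list of positions where it occurs."""
    out = {}
    for i, x in enumerate(xs):
        out.setdefault(x, []).append(i)
    return out

x = ["a", "c", "c", "c", "c", "a", "b", "a", "a", "c"]
g = group(x)
print(g)
```

Note that this dict keeps keys in first-seen order, whereas the k output above lists them sorted.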

faitswulff · on March 18, 2019

How does grade work? Can you customize how it ranks results?

i_don_t_know · on March 18, 2019

If you want to rank results differently, you'd apply a function to each element of the original vector that computes the rank / weight of that element. Then you sort by weights, and apply the result to the original vector. Something like:

input[<{...compute weight of element x...}'input]
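
The same pattern sketched in Python, with a hypothetical `sort_by` helper and string length as the example weight:

```python
def sort_by(weight, xs):
    """Sort xs by weight(x): compute weights, grade them, index the original."""
    weights = [weight(x) for x in xs]                            # weight of each element
    perm = sorted(range(len(xs)), key=lambda i: weights[i])      # grade the weights
    return [xs[i] for i in perm]                                 # apply to the input

words = ["ccc", "a", "bb"]
by_length = sort_by(len, words)
print(by_length)
```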

toastking · on March 18, 2019

This is also how filtering works, you map a function that returns 0 or 1 over the list. Then you can call where (&) and it will give you the indices where the function evaluates true. Index by that and you'll filter a list.
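
Sketched in Python, assuming a 0/1 flag list (the helper `where` stands in for k's `&` on booleans only):

```python
def where(flags):
    """Return the indices at which flags is true (k's & applied to a 0/1 list)."""
    return [i for i, f in enumerate(flags) if f]

xs = [3, 8, 1, 9]
flags = [int(x > 2) for x in xs]  # map the predicate over the list
idx = where(flags)                # indices where the predicate held
kept = [xs[i] for i in idx]       # index by them to filter
print(flags, idx, kept)
```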

yiyus · on March 18, 2019

It's a very good tutorial (it's not easy to find information about k7), but it is a bit outdated. For example, % is inverse and not sqrt now, and it is not possible to evaluate parsed trees with !. I am sure the differences are small, but be prepared to be surprised with some descriptive error messages like: 'nyi

pvitz · on March 18, 2019

'nyi = not yet implemented

yjftsjthsd-h · on March 18, 2019

And that's very obvious once it's been pointed out, but how long would it take a beginner to figure that out?

pvitz · on March 18, 2019

I agree that the official K documentation was/is thin (after the K2 manual). But if you are interested in it, John Earnest wrote some nice documentation:

https://github.com/JohnEarnest/ok/tree/gh-pages/docs

yiyus · on March 18, 2019

A beginner is going to have a bad time with any language. Sure, 'nyi is not very descriptive, but an error message is not the place to teach total beginners. Unless you put some effort into learning a language (or already know something similar), compiler errors will be difficult to figure out anyway.

The most blatant example is the infamous "discards qualifiers" error but, for example, in a programming language with nice and descriptive error messages like Go, you will have a hard time figuring out what an interface type error is about until you actually read the manual and learn what an interface is.

smacktoward · on March 18, 2019

A good error message doesn't have to be descriptive, but if it isn't, it absolutely must be Googleable -- i.e. unique enough so that someone who doesn't understand what it means can paste it into a search engine and find out.

If I Google "'nyi", I get a bunch of results about the New York Islanders hockey team. Not helpful. If I try to be more specific and Google "'nyi k7", the closest thing to a helpful result I get is a link to this discussion on HN.

avmich · on March 18, 2019

What's important is that APL language family isn't a domain-specific one, suitable for some particular tasks in a narrow niche. To the contrary, they are languages of computations, allowing rather short paths from thought to implementation of an algorithm, compared to other languages.

I'm sure K knowledge and experience ensure the practitioner powerful tools for wide variety of problems in many areas.

a-saleh · on March 18, 2019

Still, for me it is quite hard to imagine what I would create with APL.

Usually, when trying a language, my instinct is to implement a web-service. Or a simple gui-app, with some library.

With APL, I am looking at it and wonder. It looks powerful. But what should I do with it? Make a compiler? A database? How does I/O even work there?

toastking · on March 18, 2019

In K, usually the I/O primitives you have are the "colon" operators, stuff like 4:.

pvitz · on March 18, 2019

The mentioned gitlab repo doesn't seem to exist anymore. One would have to download kdb+ from kx, but I am not sure if this wouldn't be K4 only. Alternatively, one could play with oK (K6) which would also give you the nice graphical iKe.

chrispsn · on March 18, 2019

K7 is now available as a trial from Shakti Software, Arthur Whitney’s new company. https://anaconda.org/shaktidb

There is a Google Group: https://groups.google.com/forum/#!forum/shaktidb

ah- · on March 18, 2019

I didn't really follow that development, what led to the creation of Shakti?

4thaccount · on March 18, 2019

Not really sure... people started talking about it on the mostly dead /r/apljk subreddit last week which caught my attention. It appears Arthur Whitney left Kx Systems and started his own company again. I'm immediately suspicious when I see "blockchain" in something, but Arthur is a bit of a legend. I don't fully understand why it is bundled as an Anaconda (scientific and data analysis oriented distribution of Python), but I assume a library is utilized there for machine learning or something like that (hopefully someone can comment and tell us what). You can type in "\" in the REPL to bring up the help menu showing the various symbols and what they mean monadically (when taking one argument) or dyadically (taking two arguments). As usual though, their website seems to have virtually zero documentation like they want to stay obscure. I'd love if someone in the industry that can afford to use these products can tell me what Shakti looks like it would be used for. Is it the same as kdb+ mostly with a few added features?

mhd · on March 18, 2019

Yeah, the mere mention of blockchain/crypto was also the reason why my interest in Red waned suddenly...

Still waiting for kOS, too.

4thaccount · on March 18, 2019

Haha. Same here for Red. I understand why they did it (it seems like they wanted blockchain and crypto currency to be their killer app), but I'm worried it was just a costly detour. That is their decision of course though and best of luck. The other thing they're doing (if I understand correctly) is creating a second code optimizer that is a paid product. So the default compiler created slower code than the commercial one I think.

I'm also still waiting on kOS, but I'll never be able to afford it, so not sure why I'm waiting on it.

pvitz · on March 18, 2019

According to the Google Group, Alexander Belopolsky (involved in pyq) is a member of Shakti. That would explain the Python relation.

chrispsn · on March 18, 2019

I've played around with the Python integration - it's pretty smooth. See, eg, https://groups.google.com/d/msg/shaktidb/lQ3XSvFPDhw/DOlmmFg...

dagw · on March 18, 2019

> Anaconda (scientific and data analysis oriented distribution of Python)

While that was the original and still by far most common use case of Anaconda, it has greatly expanded over the past couple of years into a more general purpose software environment packaging tool.

4thaccount · on March 18, 2019

Thanks for pointing that out!

toastking · on March 18, 2019

I think the reason it's in anaconda is they're trying to go after the python crowd? Regain some of the data analysis market share that stuff like numpy took from APL.

4thaccount · on March 18, 2019

I see a lot of niche languages like Dyalog APL doing the same thing (building bridges to Python), but if I really have to use Python, I'll just use Python. Not a hybrid of the two. Maybe that's just me though.

pvitz · on March 18, 2019

Think about it as an extension to e.g. K/Q's strengths. What if you would like to visualise certain aspects of quote/trade data? It could take a very long time to process the data in Python and it is difficult to visualise data from within kdb+. Seems to be a good match.

4thaccount · on March 18, 2019

I know what you're getting at, but how hard can it be for a software company that makes millions (my assumption) to add a charting library as a native option with a verb for linechart, scatterplot, barplot...etc? They could just bundle the C++ JUCE library and then their users could create all sorts of GUIs and charts. I've seen other commercial products do this and it seems to work well, but I can understand them not wanting to include 3rd party code and pay a licensing fee. Still...I'd rather have a native way to do charting within a language rather than spin off a Python interpreter with Matplotlib. It just doesn't sound like a rapid prototyping process if you have to do that. I think I read before that they used to have a simple chart library, but removed it to keep the product tiny. I assume they know best, but don't know their reasoning. Thanks for the discussion though.

This reminds me of using F# to call out to R. If I have to do all that work, I might as well just use R :). I realize there are some edge cases where that is nice though and that not everyone is as bothered by that as me. It seems like the context switching is inefficient.

jnordwick · on March 19, 2019

They used to have it years ago in K2, a dynamic GUI that updated as the values updated. Arthur's brother wrote it.

It used an interesting idea that expanded on the ideas of dependencies and triggers to also include GUI updates. There were some special attributes that described the layout of the variable, and you did `show$val to display it. As the value of val changed (and all other values it touched or included), the GUI would update for you.

Think Excel merged with K and you get the idea.

pvitz · on March 18, 2019

Well, I think you guessed the reasoning already correctly: The use cases of kdb+ all are so much concerned with performance, that it even matters if the whole binary fits in the CPU cache. That's certainly the reason why the GUI stuff was removed after K2. Similarly was the removal of almost all system calls in the pre-kOS prototypes. It must be very interesting to work on such a system, squeezing every possible bottleneck.

4thaccount · on March 18, 2019

Didn't know that. Thanks! Does Shakti not have the same problem with Python? Or is Python only mostly for visualization once you get back your dataset?

vadiml · on March 18, 2019

K7, like its ancestor APL, is an extremely powerful language with a great signal-to-noise ratio. However, it is kind of a "write-only" language. When trying to read code not written by you (or written by you a couple of months ago), you struggle to understand it...

icen · on March 18, 2019

I think that this is a matter of practice. I write k/q almost daily, and don't struggle to read other people's code, or my own.

Most code isn't clever oneliners, and is quite readable.

anonu · on March 18, 2019

Agree with this. The problem is that there's no widely adopted style guide with k or q. A lot of code you find on the web is written in a minified fashion. At least put a new line and indent whenever you can.

toastking · on March 18, 2019

K is interesting in that unlike something like Java the hard part is not remembering the standard library. You start to learn idioms for how to do certain things as you use it more. Then you pattern match different parts of the code and it can be very readable.

stefano · on March 18, 2019

The fact that every function/operator has multiple meanings, often very different, depending on the actual data passed to it doesn't help readability. It tries too hard to use only symbols, which are limited on a normal keyboard, and so it crams multiple functionalities into the same symbol.

pmontra · on March 18, 2019

The language looks generally good for its goal, except these two operators

     |/y / maximum
 
     &/y / minimum

Those two characters have no connection with the operation they stand for and seem to be randomly picked from the available symbols on the keyboard. Why not simply max y and min y, which anybody can read and understand?

i_don_t_know · on March 18, 2019

At least for boolean operations, | (or) can be thought of as the maximum of a vector of 0s (false) and 1s (true). Likewise, & (and) is the minimum of a vector of 0s (false) and 1s (true).

But I don't know if that was the motivation / justification for naming these operators / functions.

uryga · on March 18, 2019

if you squint a bit, `x or y` returns the one that's greater, i.e. `0 or 1 == 1`; and `x and y` returns the one that's smaller, i.e. `0 and 1 == 0`. so min/max kind of are a generalization of and/or
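
A quick Python sketch of that generalization, with `functools.reduce` playing the role of k's over (`/`). The helper name `over` is only for illustration.

```python
from functools import reduce

def over(f, xs):
    """Fold a two-argument function across a list, like k's f/xs."""
    return reduce(f, xs)

bits = [0, 1, 0]
nums = [3, 42, 7]

# On 0/1 values, max behaves as "or" and min behaves as "and".
any_bit = over(max, bits)
all_bits = over(min, bits)

# On general numbers, the same folds give maximum and minimum.
largest = over(max, nums)
smallest = over(min, nums)
print(any_bit, all_bits, largest, smallest)
```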

lelf · on March 18, 2019

Paraphrasing

  fold[max] y  # maximum
  fold[min] y  # minimum

> Those two characters have no connection with the operation they stand for

  | - logical-or,  max
  & - logical-and, min

    0|1
  1
    1|42
  42

de_Selby · on March 18, 2019

Sure they do, in most C-like languages || is OR and && is AND.

For OR, from

1 | 0 = 1

It's a small step to generalise to

N | 0 = N (~max)

and then

N | N-1 = N (max)

The same goes for &

pfortuny · on March 18, 2019

Those are “or”=max, “and”=min. They are sometimes used in mathematics (I have used them), but with the “low angle” and “up angle”. Not too frequently though.

quadcore · on March 18, 2019

Note that |/y is | (or) over y. Hence, max.

SifJar · on March 18, 2019

max and min are also available as keywords in k7/shakti

beaumayns · on March 18, 2019

I was hoping we'd finally get proper lexical scope in this version of K. Alas, seems to not be the case.

yiyus · on March 18, 2019

There are a few expected (or, better said, rumored) features that have been left out. Some of them will eventually come (views), some of them probably won't (like using the unicode symbol for sqrt). I was surprised when I saw in nsl a shift-reduce parser that took into account operator precedence, but it looks like it is not being used. I'd like to have lexical scoping too, but due to how function values work, I see it as very unlikely.

beaumayns · on March 18, 2019

Unlikely due to how they're implemented, or their semantics?

chrispsn · on March 18, 2019

According to this thread, it makes IPC harder: https://www.reddit.com/r/apljk/comments/82vcs4/comment/dyxb0...

mark_l_watson · on March 18, 2019

I liked the example of 1 line of code generating a table of random stock data.

But, this is not free or open source?

toastking · on March 18, 2019

There is an open source port of K called Kona: https://github.com/kevinlawler/kona

codetrotter · on March 18, 2019

It’s not a port, it’s a separate implementation. The difference in meaning between these two words is sufficiently important that it’s worth pointing out IMO.

Porting software means to adapt an existing piece of software to a different computing environment.

Ports are based on source code and/or assets from the original piece of software that is being ported.

As such a port would be derivative work, meaning that the authors would not be able to release their software without permission from the copyright holder(s) of the software that was being ported.

quickthrower2 · on March 18, 2019

Can anyone comment on how this compares to R? Seems to have similar capabilities.

et2o · on March 18, 2019

The syntax seems vaguely similar, but I don't really see other commonalities. R's strength is its incredible ecosystem.

asimjalis · on March 18, 2019

What are the financial applications that this is used for? Does anyone have specific examples?

bwanab · on March 18, 2019

In the 1990s Morgan Stanley's full suite of fixed income portfolio management and trading applications were written in APlus which was another of Arthur Whitney's APL variants. When I joined, MS was moving away from it, but the developers who'd been using it were very much in the "you'll take APlus from my cold, dead fingers" camp.

alfalfasprout · on March 18, 2019

Not K7, but KDB+ is the de-facto time series store for financial data. It's faster than anything open source by orders of magnitude and extremely expressive. However, it's extremely expensive. If KDB had a much more user-friendly licensing scheme they would have seen much more adoption.

didsomeonesay · on March 18, 2019

Interesting previous discussion of K programming languages (up to K5):

https://news.ycombinator.com/item?id=16500908 (2005)

surajs · on March 18, 2019

Gitlab link to download it is a 404!?

ngcc_hk · on March 18, 2019

Difference between J7 and K7?

4thaccount · on March 18, 2019

J and K are different languages. They are similar in that both use standard ASCII keys and are in the APL family of languages. The author of K, Arthur Whitney, wrote part of a prototype with Roger Hui for the J interpreter, or so I think. The primary author of J is none other than Ken Iverson (the Turing award winner and designer of the original APL), who wanted a free version of APL for the masses where people wouldn't complain about the weird non-ASCII APL symbols. It also primarily uses a tacit function train style of programming, like data flow languages, where you don't even necessarily need variables (pretty cool). This tacit coding style has since been added to Dyalog APL, where you can see folks like Aaron Hsu build some pretty cool applications using it. I'd imagine K is more performant than J though, and the source is famously small (just a few C files). J has a lot more stuff baked in, like Qt support. They both have a database library: Jd (for J) and Kdb+ (for K). Kdb+ is a fancy SSD based database which uses K and a SQL-like DSL called Q. Its performance is pretty darn good and it is used in finance. I don't know enough about Jd.
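
To give a feel for the tacit "train" style, here is a rough Python emulation of a J fork, where `(f g h) y` means `(f y) g (h y)`. The helper `fork` is made up for this sketch; J does this with plain juxtaposition.

```python
from operator import truediv

def fork(f, g, h):
    """J-style fork: (f g h) y  is  (f y) g (h y)."""
    return lambda y: g(f(y), h(y))

# Arithmetic mean with no named argument: sum divided by count,
# analogous to J's  +/ % #
mean = fork(sum, truediv, len)
print(mean([1, 2, 3, 4]))
```

The definition of `mean` never mentions its argument, which is the "no variables" property described above.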

etatoby · on March 18, 2019

> [K's] source is famously small (just a few C files)

Yes, but they look like this:

https://github.com/tangentstorm/j-incunabulum/blob/master/ji...

Call me crazy, but I'm wary of code written like this.

rbonvall · on March 18, 2019

I get the same gut reaction as you, but I tried reformatting the code a little bit and it's not _that_ terrible.

From the very little APL I've learnt, I know that operators always have either one parameter (named ω) or two (named α and ω), and their inputs and outputs are always arrays.

If you need to write a lot of such functions, it makes sense to define some macros to help you:

    #define V1(f) A f(w)A w;      // create one-arg function
    #define V2(f) A f(a,w)A a,w;  // create two-arg function
    #define DO(n,x) {I i=0,_n=(n);for(;i<_n;++i){x;}}

iota is the APL equivalent of Python's range function:

    V1(iota) {            // Define one-arg function iota.
        I n = *w->p;          // Get the value of the ω argument.
        A z = ga(0, 1, &n);   // Allocate output array.
        DO(n, z->p[i] = i);   // Assign increasing integers.
        R z;                  // Return the result.
    }

The plus function adds two arrays:

    V2(plus) {            // Define two-arg function plus.
        I  r = w->r,          // Get the rank of ω.
          *d = w->d,          // Get a pointer to ω's data.
           n = tr(r,d);       // Get the size of ω's data.
        A z = ga(0, r, d);    // Allocate output array.
        DO(n, z->p[i] = a->p[i] + w->p[i]);
                              // Add corresponding values of α and ω
        R z;                  // Return the result.
    }

Personally I cannot tolerate the lack of whitespace, but the APL guys are known to like to see their entire programs in one screenful. I can understand that some people like to write code like this.

4thaccount · on March 18, 2019

This comes up a lot in these discussions and I believe Arthur better understands 6 pages of code written like that than someone else can comprehend 30 pages of the equivalent idiomatic code. He writes APL/K for a living, so making the C fit closer makes more sense in my eyes as it matches his thought process better. I sympathize with hating scrolling in how it hurts your ability to see everything at once.

svnpenn · on March 18, 2019

is it a private repo? this is asking me to login

https://gitlab.com/k7db/k

anonu · on March 18, 2019

I think so - I can't clone

patrickg_zill · on March 18, 2019

Trying to learn how to use A+ from aplusdev.org to do some simple work for my personal use. A previous language designed by Arthur Whitney and GPL also. Included in Debian derived distros btw.

etatoby · on March 18, 2019

I would highly recommend GNU APL over A+.

GNU APL is a full-fledged, modern implementation of the official ISO APL 2 standard, supporting things like Unicode and modern terminals, and is currently maintained by very friendly people. While A+ is a very old, partial implementation of APL that requires a dedicated font and 8-bit encoding to work, and has not been maintained for decades. (IIRC)

Documentation for APL 2 can be found online, but I would recommend getting a second hand copy of the classic Gilman Rose book. I bought mine online for a few dollars.

patrickg_zill · on March 18, 2019

Is there a particular edition of the Gilman Rose book you would recommend?

etatoby · on March 27, 2019

Sorry for the late reply. I seem to have the 1976 edition (I didn't think of checking the edition before buying mine! DOH!) I would say it was good enough for learning the language, although the references to mainframes and terminals are obsolete. (Picturesque, if you like that sort of thing, but obsolete.) If you can get the latest edition (1991) it's probably better.

I ended up buying that book because from all the online resources I was able to find at the time, I failed to understand the gist of the language, or how you are supposed to think in order to solve problems in APL. Which is profoundly different from all other languages, and which the book taught very well. I've also been using paper books to learn programming since forever, so it's a workflow I'm used to.

That being said, there are some free resources that you could try first:

http://misc.aplteam.com/robertson/APL1&2.pdf - http://misc.aplteam.com/robertson/APL3&4.pdf - https://www.dyalog.com/uploads/documents/MasteringDyalogAPL....

The latter is about Dyalog APL, a dialect of APL, so not all of it will be applicable to GNU APL.

About the font, any Unicode font will do, but for graphical consistency I would recommend downloading this one. You can use it with any modern terminal application:

https://www.dyalog.com/uploads/files/download.php?file=fonts...

About the keyboard, if you are on Linux you can just add one of several APL variants to your existing national keyboard. You will have to designate an APL key (eg. the useless menu key) to use as a special alt-like key to input APL symbols. If you're not comfortable typing blindly, you can buy a keyboard with APL symbols on it, or alternatively buy or print stickers to apply to yours.

I spent an entire summer learning APL for fun (that was way before GNU APL existed, so I had to struggle with various proprietary interpreters, either freeware or demoware, on top of the weird language + font + keyboard!) I then went on to compete in Code Golf challenges and other random things using it and had a blast!

I still miss APL. IMHO none of the successors (J, K) come close to the beauty of its symbols. It's also a very different way of writing algorithms (sometimes called "multi-dimensional array programming") that is not found in modern programming languages.

Feel free to contact me (username + gmail) for any questions you may have.

ne01 · on March 18, 2019

Off topic, but is it just me, or do you also dig the simple text format of the website? -- so easy to read!

jchw · on March 18, 2019

Wish it didn't have manual line breaks, though; it's kind of terrible on mobile.

Stratoscope · on March 18, 2019

  The first sentence
  reminds me a bit of the story of
  Mel.

http://catb.org/jargon/html/story-of-mel.html

twohoursprog · on March 18, 2019

Any good programmer should be able to learn to program k7 at the level of this tutorial in two hours.