Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

The code (which is a few percent faster than quicksort) is on the last page:

  procedure QUICKSORTAGENT(input state)
  2: Let i = 1, j = 2, l = 3, h = 4
  3: if FunctionID = None then
  4: return vh ← Function1(vl ← vl, vh ← vh)
  5: else if FunctionID = 1 then . QuickSort
  6: if vl < vh then
  7: if prev = None then
  8: return vi ← Function2(vl ← vl, vh ← vh)
  9: else if prev = (vi ← Function2(vl ← vl, vh ← vh)) then
  10: return AssignVar(j, i)
  11: else if prev = AssignVar(j, i) then
  12: return MoveVar(i, -1)
  13: else if prev = MoveVar(i, -1) then
  14: if vi > vl then
  15: return vi ← Function1(vl ← vl, vh ← vi)
  16: else
  17: return MoveVar(j, +1)
  18: end if
  19: else if prev = (vi ← Function1(vl ← vl, vh ← vi)) t hen 
  20: return MoveVar(j, +1)
  21: else if prev = MoveVar(j, +1) and vj < vh then
  22: return vh ← Function1(vl ← vj , vh ← vh)
  23: else
  24: return Return(h)
  25: end if
  26: else
  27: return Return(h)
  28: end if
  29: else . Function ID = 2, Partition
  30: if prev = None then
  31: return AssignVar(i, l)
  32: else if prev = AssignVar(i, l) then
  33: return AssignVar(j, l)
  34: else if vj < vh then
  35: if prev = Swap(i, j) then
  36: return MoveVar(i, +1)
  37: else if (prev = AssignVar(j, l) or prev = MoveVar(j, 
  +1)) and A[vj ] < A[vh] then
  38: if vi 6= vj then
  39: return Swap(i, j)
  40: else
  41: return MoveVar(i, +1)
  42: end if
  43: else
  44: return MoveVar(j, +1)
  45: end if
  46: else if prev = MoveVar(j, +1) then
  47: return Swap(i, h)
  48: else
  49: return Return(i)
  50: end if
  51: end if
  52: end procedure


This reminds me of programming basic on an Acorn Electron. Line numbers and I don't remember indentation, though it was a few decades ago.


The Electron had a new enough version of BBC Basic that "proper" procedures and UDFs were supported (in a limited fashion, but well enough considering the constraints of the machine) so you could pretty much ignore line numbers, in fact by using a text editor and TYPEing the result into BASIC you could write code without them (the interpreter needed them, the TYPE trick made it put them in for itself).

But yes, this was uncommon so line numbers were a major thing.

And indentation was optional, you could add extra spaces to the start of lines and they would be kept and would have no effect on execution, but you would be wasting precious bytes of RAM and if using one of the heavier display modes (0, 1 or 2) you would only have 8.5Kbyte to play with for all the code and run-time storage (your variables, BASIC's call stack).


The source code in the PDF has indentation, it is a bit easier to read.


Ready for that COPY key!


Was this "discovered" by their system or was it hand-crafted by the authors from their custom instruction set elements ?


No, that is the training agent implemented as an instance of quicksort. I don't know why the OP reports that it's "a few percent faster than quicksort" because I can't find where that is claimed in the paper.

In fact, as far as I understand it the paper claims that the learned model (i.e. the student agent) has learned to reproduce the behaviour of this teacher agent after training for a smaller number of steps on average than what the teacher needs to sort a list. That is what the entire claim about superior efficiency rests on (there is no example of a learned model and no asymptotic analysis of the program such a model represents).


As I understand it they designed quicksort like that to be able to train with it. It is quite clear from the video where it is called "quick sort agent" compared to the model one and function1 and function2 is in the stack trace.

"We found adding the previously executed action to the input state st is sufficient to handle dis-ambiguation for this quick sort implementation. Alg. 8 shows the converted quick sort scripted agent"


Right! I was hoping to see the body of one of their system induced algorithms. They don't seem to have included any - or maybe I am mistaken and their system is opaque and does not allow a generated algo. to be read out...




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: