The change of focus is perhaps hard to understand without more context. Basically, there are two really different modes of programming:
Most of what we see on HN is about building applications, servers, websites etc. Big, monolithic things that take weeks to years and are deployed to somewhere else and used by lots of people. Most programming tools are built for this kind of work, where the time between writing the code and actually using it is days or months.
But the people we want to make programming accessible to are mostly knowledge workers. Their work is characterised by a mixture of manual work and automation, throw-away code and tools rather than applications. It's better supported by Excel, SQL, shell scripting etc than by the big languages and IDEs.
We realised that we can do much more good focusing on that kind of programming.
I am currently working as an Actuarial Analyst, but I have also worked several years as a programmer.
As an analyst the tools I use are Excel, Access, SAS Enterprise Guide and Oracle SQL Developer. One of the big problems I face is that we have no good way to abstract away a process and really make it reusable.
My general workflow is using SAS to pull data from multiple sources, then combining the data and running it through a series of logic/calculations. I then take the resulting data and copy it to Excel for some additional analysis or a report. This might be for a monthly/quarterly report or an analysis that needs to be updated with the additional runout of data.
But these steps are all tightly coupled together. If I want to rerun the same logic on a different data set, or an updated data set I will copy and paste all of the files, update the queries. I have no way to bundle them together so that I can easily reuse with different data sources, or refreshed data.
Really what I want is some way to encapsulate different sets of data transformations/calculations into functions so I can reuse them in different contexts and among different people.
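To make that concrete, here's a toy sketch in plain Python of what I mean by encapsulating the steps - the column names ("premium", "claims") and the loss-ratio calculation are just made-up illustrations, not my actual process:

```python
# Sketch: wrap each transformation step in a function so the same
# logic can be re-run against a fresh or different data set.

def load_rows(source):
    # In practice this would pull from SAS/Oracle; here it's just passed in.
    return list(source)

def add_loss_ratio(rows):
    # One reusable calculation step: claims / premium per record.
    return [dict(r, loss_ratio=r["claims"] / r["premium"]) for r in rows]

def run_report(source):
    # The whole pipeline, decoupled from any one data source,
    # so re-running on refreshed data is just another call.
    return add_loss_ratio(load_rows(source))

q1 = [{"premium": 100.0, "claims": 40.0}]
q2 = [{"premium": 200.0, "claims": 90.0}]
print(run_report(q1)[0]["loss_ratio"])  # 0.4
print(run_report(q2)[0]["loss_ratio"])  # 0.45
```

The point is that swapping in an updated data set means changing one argument, not copy/pasting and re-editing a pile of files and queries.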
Take a look at my EasyMorph (http://easymorph.com). It's a visual replacement for scripted data transformations. People use it to replace SAS and Visual Basic scripting. It also allows creating reusable modules. Contact me at <hnusername>@easymorph.com if it looks interesting to you.
Hey, I clicked through, read the tutorial, got excited about your examples... tried to download and found out it was Windows only! I would have totally evaluated it further if there were an OS X/Linux option.
Speaking of Tableau (which was founded on the concept of VizQL), how is this different? Doesn't tableau basically enable knowledge workers to create data-centric web applications?
Does this have an API that can be called from .Net?
I'm really liking some of the things Microsoft is doing with Power Query, but I don't like how it is (afaik) only callable from Excel or PowerBI online. I'd like similar capability, but more open, and could be called via scripting, from SQLCLR, etc.
Another big hitch: Microsoft has not published proper APIs for manipulating PowerPivot models in Excel, and I don't think they intend to - I've heard one 3rd party has reverse engineered the APIs (you can decompile the .NET binaries, but I haven't had the time to look at it yet).
Would you perhaps have any info/reference about where I could learn more about this reverse engineered PowerPivot API? This sounds pretty exciting. :D
To expand on my question, I'd heard (from Rob 'PowerPivotPro' Collie, former product manager on the project IIRC) that the core had been written in 'unmanaged code' (probably C++?), so I believe reverse engineering it would be a significantly larger effort than just opening some of its DLLs in say DotPeek, at least from as far as I've been able to tell.
The PowerPivot engine itself I imagine is in unmanaged code, but the code that just writes datasets and whatnot to the model is (from what I've heard) managed code, and indeed you can decompile the libraries and see all sorts of things - I've only looked around for about 10 minutes or so. And I had just read on some obscure thread that someone had successfully found the undocumented API call to write to the model, which is what I'm wanting to do - but I don't even know what the product name is that supposedly does this, sorry.
Right, makes sense, I'll try and check out what's available then. :)
By writing to the model, you mean programmatically adding new measures or the like?
My interest is in programmatically querying models using DAX, though to this end I'd also like to look in the direction of Microsoft's DirectQuery mode in SQL Server, which supposedly did DAX-to-SQL conversion.
If one could use such a conversion plus MDX to start querying models on an Apache Spark cluster through pivot table/chart interfaces...
Not even measures, I'm just wanting to be able to create tables, define relations, etc, with the accompanying sql or m script. I'm hopeful they'll let us do that some day, but I still don't quite believe they've changed their stripes entirely.
Currently EasyMorph supports integration through the command line only. We do not plan to have an API for the desktop client, but we will definitely make an EasyMorph Server API if we reach that point.
We use Pentaho Kettle for those kinds of transformations. It's FOSS, and connects to a whole bunch of programs and formats.
It's a graphical tool - you drag-n-drop modules, then configure and connect them, though it can also run scripts (it has JavaScript, Java, Bash and Ruby support, besides SQL, of course) - but after configuring the transformation/job, you can also run it on the terminal, which is useful for periodically re-running it.
I've been doing a lot of work with Kettle as well, and it is a handy tool (albeit with a few warts).
What I think would be handy for use in an organizational setting, where "business users" might want to use some of the transforms, would be a way to publish transforms somewhere, making them discoverable and accessible to others. I don't want to make it sound like I'm talking about UDDI or anything (although, thinking about it, maybe you could use that), but just an easy way for a Joe Business User to get a list of available transforms, some explanation of what they do, what input they take, what they output, etc. And maybe a way to make changes to the "small stuff" (like the input and output path, for example) without having to load up Spoon and edit the ktr that way. Since transforms can be parameterized, that should be doable...
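As a toy sketch of what such a catalogue could look like (plain Python, with invented names - nothing to do with Kettle's actual internals), the core is just transforms registered alongside their metadata:

```python
# Sketch of a discoverable transform catalogue: each entry records what
# the transform does, what it takes, and what it produces, so a business
# user can browse the list without opening Spoon.

TRANSFORMS = {}

def register(name, description, inputs, outputs):
    # Decorator that files the function away with its metadata.
    def wrap(fn):
        TRANSFORMS[name] = {
            "description": description,
            "inputs": inputs,
            "outputs": outputs,
            "run": fn,
        }
        return fn
    return wrap

@register("dedupe", "Remove duplicate rows", inputs=["rows"], outputs=["rows"])
def dedupe(rows):
    seen, out = set(), []
    for r in rows:
        key = tuple(sorted(r.items()))
        if key not in seen:
            seen.add(key)
            out.append(r)
    return out

# What "Joe Business User" would see:
for name, meta in TRANSFORMS.items():
    print(f"{name}: {meta['description']} ({meta['inputs']} -> {meta['outputs']})")
```

Bolting a small web UI on top of something like that - list, describe, set the small parameters, run - is the bit that seems to be missing.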
You could also picture combining this with something like a Yahoo Pipes-style web interface, to let you define your own chains of transforms and operations as well. And hell, a web-based interface for editing ktr files would be a pretty interesting thing as well, if somebody would build it.
The databricks platform should solve exactly your problem - reusable data pipelining/transformation. I saw a demo of it last night and it was extremely slick. Their product is amazing, it makes data pipelining incredibly easy compared to setting up a hadoop cluster and running hive/etc. (I don't work for them - but if any databricks employee sees this, please hire me!) It runs on a spark cluster over AWS, which is much more modern and powerful than SAS/excel/sql. Since you know how to program already, it shouldn't be too hard to pick up spark (even has python bindings)
@rgoddard - May be a bit overkill but check out Immuta (www.immuta.com). Its a data platform, built for data scientists, that enables you to query across many disparate sets of data using familiar patterns such as SQL, file system, etc. Our SQL interface allows you to hook to Excel, Tableau, Pentaho...so you could write your abstracted logic and connect to many data sources or mashed up analytic results. contact me at matt@immuta.com if you're interested after reading through the site.
I'd divide it as essentially "User of Packages" vs. "Writer of Packages", and of course, the dichotomy is not actually a clear one. But the choice was to suggest that there are some people for whom the programming language and its libraries are not an end in themselves, and "Just crack it open and write your own thing for X..." is essentially a non-starter.
I think it's useful to distinguish between the two groups, because not only do they have different skill sets, but they have different motivations. For example I will never directly be evaluated on the performance or style of my code in the way a programmer might be - only the paper that code helped me write.
Was there a problem inherent to building large applications that you found intractable, or is the shift solely due to focusing on entry-level accessibility?
We built a Foursquare clone recently and the BOOM guys built an extended version of HDFS and Hadoop (http://db.cs.berkeley.edu/papers/eurosys10-boom.pdf). It works out pretty well. The shift is not really about accessibility either - I've watched people do some pretty advanced data work in fields like physics and biology.
It's more about making computers into personal tools. If you look at the tools the average person uses - email, excel, google etc - they all work really well individually but they are really hard to extend or compose. Each application is a world unto itself and doesn't play with the outside world. What would really help people work is not the ability to build their own applications but the ability to move data around and glue tools together. It's kind of like applying the unix philosophy to office suites.
The shift is primarily due to the fact that relatively few people seemed to want to build large applications, including ourselves.
There are definitely some differences between building large apps and these more communication/analysis tasks, but we think the foundation itself applies to both. The language is an adaptation of the Dedalus[1] semantics, which the BOOM lab did some amazing things in distributed systems with [2]. If it can build a clone of hadoop, chances are it can build most things. We've built a number of our compilers in Eve, several of our editors were bootstrapped, we've built numerous examples, most recently a complete clone of Foursquare. Before we'd want others to try and do that though, we need our tooling to get a bit better. We expect that the Eve editor will get there eventually kind of out of necessity - we're going to bootstrap a lot of it this time too, starting with the compiler.
My take from the abstract of Dedalus is "and adds an explicit notion of logical time to the language"
If you're interested in building a tool that relies on distributed communication and data flow it makes sense to bake a notion of logical time into the system. Boolean logic has no notion of time. If you propose x != y, say, you could be saying that throughout the lifetime of the system x is never equal to y, or you could be comparing x to y at this instant in time. It depends of course on whether these are constants and/or variables.
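To illustrate the ambiguity (a plain-Python sketch, not Dedalus syntax - the idea of stamping every fact with a logical timestep is borrowed from it):

```python
# Making time explicit turns "x != y" into two distinct claims.
# Facts carry a logical timestep: (name, time) -> value.

facts = {
    ("x", 0): 1, ("y", 0): 2,
    ("x", 1): 3, ("y", 1): 3,
}

def differ_at(t):
    # "x != y at instant t"
    return facts[("x", t)] != facts[("y", t)]

def always_differ(times):
    # "x != y throughout the lifetime of the system"
    return all(differ_at(t) for t in times)

print(differ_at(0))           # True
print(always_differ([0, 1]))  # False: they collide at t=1
```

With timestamps in the facts themselves, the two readings are just two different queries rather than an ambiguity baked into the logic.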
Type theory shows that different logics map to different type systems so what may be holding programming back is that the logic of a system is not _dynamically_ selectable as the system evolves. Most (all?) programming languages have a simple boolean logic, mutable state, and just tons and tons of syntactic sugar on top of that. Obviously languages like Haskell and Clojure are more advanced (algebraic data types in the former and immutable data structures in the latter) but they still have a fixed/static way of being in the world if you know what I mean.
Natural language shows us that humans use many different types of logic contextually. Logic is not monolithic, maybe Eve is an admission of this?
Sorry if this makes no sense, it's just a hunch that's been percolating for a while.
One of the things we are still working on is expressing non-monotonic logic nicely (things like "birds can fly, but penguins can't, but Harry the Rocket Penguin can"). It's unpleasant in standard datalog but I think we can provide a nicer interface.
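As a sketch of the behaviour meant here (plain Python, not Eve syntax - one simple way to get defaults with exceptions is to let more specific rules override more general ones):

```python
# Default reasoning with exceptions: "birds fly, penguins don't,
# Harry the Rocket Penguin does." Rules are ordered from most
# general to most specific; the last matching rule wins.

rules = [
    ("bird", True),            # birds can fly by default
    ("penguin", False),        # ...but penguins can't
    ("rocket_penguin", True),  # ...except rocket penguins
]

def can_fly(kinds):
    answer = None
    for kind, flies in rules:
        if kind in kinds:
            answer = flies  # later (more specific) rules override
    return answer

print(can_fly({"bird"}))                               # True
print(can_fly({"bird", "penguin"}))                    # False
print(can_fly({"bird", "penguin", "rocket_penguin"}))  # True
```

Expressing that same override structure declaratively, without stratification gymnastics, is the part that's unpleasant in standard datalog.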
10 minutes into the Rich Hickey video and I can see why you responded to me with a link to it. This is indeed what I'm getting at; so many languages use fundamentally the same underlying logical and state model. Rich mentions single-dispatch, stateful OO. To that I would add boolean logic. Our systems are so riven by it we don't even see it. And I reckon it doesn't have to be that way! I can totally see why Eve is written in Rust from the 10 minutes of this talk that I've seen, and I can see that it is the incidental complexity of managing the lifetime of objects in your head in C++ that has forced this shift. Or, as per Hickey, Clojure-wards.
Still though, both Rust and Clojure presume an omnipresent bivalent atemporal logical discourse. If you're working on a different logic (or sets of logics?) in Eve then why not make them dynamically user-selectable at run-time in an intuitive manner :) Granted, I have _no earthly idea_ in practice what this means, but when you reflect on how humans manipulate concepts internally you see that we have the machinery for this built into us -- or learnt somehow at a very early age. Tapping into this fluid logical apparatus would be ever so neat.
Is the endgame here making eve applications automatically distributed or parallelized?
I ask because the monotonic logic that Dedalus excels at expressing is quite limiting. Unless you are in an execution environment where operation ordering/synchronization is expensive (i.e. among a set of distributed processes), the nice order-independent properties that CALM analysis gives you don't really buy you much.
I can't see us using CALM for anything in the near future. The focus right now is just on making the basic programming experience smooth. We chose Dedalus because the discrete, synchronous model of time makes it easy to separate things which are truly stateful from things which are not and to handle both in a live, interactive environment. CALM is just a bonus.
It seems like this might be a good fit with what Sandstorm is doing (making personal servers easy to use). Even if I'm writing a program for myself, I still want to access it from multiple computers and share the results.
Granted, shell is useful for concise interactive one-liners. Granted, Java is a terrible replacement for shell. But there's no kind of programming where shell is better than Perl, Python or Ruby - "big languages" which, unlike Java, were designed to be useful for scripting by people who aren't full-time programmers. Because they are "big languages" they also have the property that you can build bigger things on them, starting from your messy prototypes. It's better not to even try that with shell or Excel.
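For instance, word-frequency counting - the classic `sort | uniq -c | sort -rn` one-liner - stays nearly as short in Python, but leaves room to grow into a real tool (a toy sketch):

```python
# Equivalent of: tr ' ' '\n' < words.txt | sort | uniq -c | sort -rn
# Starts as a few lines; can later grow options, tests, a CLI, etc.

from collections import Counter

def word_counts(lines):
    return Counter(word for line in lines for word in line.split())

print(word_counts(["a b a", "b a"]).most_common(1))  # [('a', 3)]
```

The prototype and the eventual "bigger thing" live in the same language, which is the property shell and Excel lack.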
That is a much better and saner explanation than these "revolutionizing programming"-grandiosity talks.
Eve will always tend to be misunderstood on Hacker News or Reddit. It simply is not aimed at professional developers, nor intended to build production systems.
My opinion is that programming is literally defined by editing text files with difficult to comprehend source code. Anything that is easier or deviates from that is by definition not programming and therefore something users do.
The rationalization is that anyone using an easier tool to program computers without as much code must not be able to use code. Therefore us programmers are better than them.
It's very similar to the earlier era of punchcard programmers scoffing at assembly language programmers. Or hand-tool craftsmen sneering at mass-produced, component-based manufacturing.
That infantile belief system will persist until super-intelligent AIs revise it or (maybe) the next generation wises up.