It's fun to wake up to find a project that I started on the top of HN! These days, I'm no longer very involved with the day-to-day of the project.
Now that it's no longer a young project, here are some musings about Huginn and responses to people's comments in this thread, in no particular order.
I've found that Huginn excels as a scheduled web scraper with lightweight filtering. That's what I use it for. On the other hand, while you can write custom code in it, Huginn is pretty poor at implementing any sort of complex logic, and is even worse at bidirectional syncing between systems, which is something people often want it to do, but for which it wasn't designed.
If IFTTT or Zapier meet your needs, awesome! No need to run and monitor your own service. I personally choose to run Huginn on my own hardware in large part so that I'm comfortable giving it website cookies and passwords.
Some examples of what I use Huginn for these days:
- Watching Twitter in realtime for high-standard-deviation spikes in certain keywords: e.g. "san francisco emergency" or "san francisco tsunami warning", which send me a push notification, or "huginn open source", which goes to a digest email (and I imagine will trigger because of this thread).
- Watching Twitter for rare terms and sending me a digest of all tweets that match them. Also sending me all tweets from a few Twitter users who post rarely, but that I don't want to miss.
- Scraping a number of rarely updated blogs that don't have email newsletters and emailing me when they change. Some use RSS, most are just simple HTML scraping.
- Pulling deals from the frontpage and forums of slickdeals and craigslist and filtering them for certain keywords.
- Sending an early morning email if it's going to rain today.
- Watching ebay for some rare items.
- Sending my wife and me an email on Saturday morning with local yardsales from craigslist.
- Watching the HN and producthunt front pages for certain keywords.
Basically, anytime I find myself checking a website more than a few times, I spend 20 minutes making a Huginn Agent to do it for me.
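As a toy illustration of the "high standard deviation spike" idea above, here's roughly what that test looks like in a few lines of Python. This is a hypothetical sketch with made-up numbers, not Huginn's actual implementation (Huginn has agents that do this kind of thing for you):

```python
from statistics import mean, stdev

def is_spike(history, current, threshold=3.0):
    """Return True if `current` exceeds the historical mean by more
    than `threshold` standard deviations (a simple z-score test)."""
    if len(history) < 2:
        return False  # not enough data to estimate variance
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return current > mu  # flat history: any increase is notable
    return (current - mu) / sigma > threshold

# Hypothetical hourly counts of tweets matching "san francisco emergency"
counts = [4, 6, 5, 7, 5, 6, 4, 5]
print(is_spike(counts, 6))   # a normal hour -> False
print(is_spike(counts, 40))  # a sudden burst -> True, fire the alert
```

The normal-hour case stays quiet, and only a genuinely unusual burst triggers the notification, which is what keeps these alerts from being noisy.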
I think one reason Huginn has worked well for me is that I don't try to make it do too much. I use it for scraping data and gentle filtering, and that's about it. It's been super helpful for alerting me to interesting content for The Orbital Index, my current project, a weekly space-industry newsletter. (Last issue: http://orbitalindex.com/archive/2019-12-10-Issue-42/)
I'm excited to check this out, but I wanted to congratulate you on a truly excellent project name. Having spent many, many hours in naming struggles, I truly appreciate the perfection.
And for those unfamiliar, the (also amazingly named) historian Snorri Sturluson explains: "Two ravens sit on his (Odin’s) shoulders and whisper all the news which they see and hear into his ear; they are called Huginn and Muninn. He sends them out in the morning to fly around the whole world, and by breakfast they are back again. Thus, he finds out many new things and this is why he is called ‘raven-god’ (hrafnaguð)." [1]
There is a lovely two player board game called Odin's Ravens in which players acting as Muninn and Huginn race against each other by playing landscape cards. [1]
I wasn't aware of the full details of the lore so thank you for bringing that up!
I thought, what a weird coincidence that they'd choose this very similar weird word... but now that you point out it's the name of one of Odin's ravens, I guess it's not quite as uncommon a word.
Little known fact: this project is widely used by journalists who can't code (at The New York Times, among others) for a variety of tasks, e.g. monitoring web pages like Trump's policy positions, scraping press releases, or filtering very specific news alerts.
Great to see you at the top of HN today tectonic! I'm at Pivotal in Sydney Australia now. Thanks again for the impromptu interview all of those years ago!
- Process data through shell scripts or JavaScript
- Better filtering
- Liquid templating
- Completely private
I save thousands of dollars by using Huginn.
It's incredibly powerful and quite frankly I don't trust Zapier with my data and visibility of what I'm doing because there are commercial implications.
Run it in the cloud on AWS or on an old box at home.
Not disputing any of the other alternatives, but you can create custom integrations on Zapier. It's mostly intended for publishing them publicly, but you don't have to.
Yes, and it's running thousands of requests and posts per minute on AWS. But you don't even need to put it in the cloud; it runs perfectly at home on an old box running Ubuntu.
I have one box acting as a router, NAS, and DVB-T receiver, and another box as an HTPC. Both are based on desktop motherboards, and I also run Kubernetes (kubeadm) on them, with the more CPU-intensive tasks assigned to the non-HTPC box so they don't interrupt movie playback. But yes, I think it is not common; it takes time to set up and maintain.
I have been running huginn on my home server for a while. I've mainly used it to filter RSS feeds and then generate new feeds with the filtered items. Another use-case for me is ingesting webcomic RSS feeds (or scrape a page) and post the comics to a private Telegram channel. Once I also had an agent scraping a page and notifying me if something changed (realestate listing).
I have tried a couple of alternatives, e.g. node-red, but none really worked the way I wanted them to for these cases. huginn is incredibly flexible and (at least for me) the mental model of its workflow makes a lot of sense.
Sadly more and more pages want you to go through their app/site and make it a bit difficult to work with, e.g. getting content from an instagram account.
One thing I have not figured out about huginn, and which all of these automation tools seem to lack, is loops. E.g. I have a page an agent scrapes, from which I want to output the src of an image tag, but I also want to check whether a certain condition on the page holds (e.g. a "next page" button exists) and, if so, first output the found src but then also re-invoke the agent with a new input element. It would then scrape the next page, and so on, until it no longer finds the button.
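For what it's worth, the loop I'm describing is trivial to express in a plain script, which is what makes its absence in these tools so frustrating. A hypothetical sketch (fake in-memory pages stand in for real HTTP fetches, and the regexes are just for illustration):

```python
import re

# Fake "pages": each has an image src and, optionally, a next-page link.
PAGES = {
    "/p1": '<img src="/img/1.png"><a href="/p2">next page</a>',
    "/p2": '<img src="/img/2.png"><a href="/p3">next page</a>',
    "/p3": '<img src="/img/3.png">',  # no "next page" button: stop here
}

def fetch(url):
    # Stand-in for an HTTP GET; a real agent would use requests/urllib.
    return PAGES[url]

def scrape_all(start):
    """Follow 'next page' links until none is found, collecting img srcs."""
    srcs, url = [], start
    while url:
        html = fetch(url)
        m = re.search(r'<img src="([^"]+)"', html)
        if m:
            srcs.append(m.group(1))
        nxt = re.search(r'<a href="([^"]+)">next page</a>', html)
        url = nxt.group(1) if nxt else None  # loop ends when the button is gone
    return srcs

print(scrape_all("/p1"))  # ['/img/1.png', '/img/2.png', '/img/3.png']
```

In an event-driven tool, the equivalent would be an agent that emits an event targeting itself whenever the "next page" condition matches, which is exactly the self-loop these systems tend not to support.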
Zapier has exponentially more integrations than this or anything else, but is surprisingly difficult to use, and more so since they updated their UI. Editing Zaps is pure torture because refresh is so difficult. Exception handling is pure pain. In most instances the breadth of API calls is so narrow, and so rarely updated by vendors that you end up switching to a custom integration. I've also noticed vendors rapidly expanding their native integrations, sidestepping the need for a request broker.
The "retail" integration space remains underserved and if one of the enterprise players decided to go down-market with a better UI and deeper integrations - they'd mop the floor clean in 18 months.
> The "retail" integration space remains underserved and if one of the enterprise players decided to go down-market with a better UI and deeper integrations - they'd mop the floor clean in 18 months.
I think it's a tough market to be "winner" of. Novices are going to want a stupid simple GUI ("wizard mode", as someone else in thread mentioned). Power users are going to want to be able to toss in some code at some point in a workflow to do some fancy ETL you don't support out of the box. When you hit a certain level of complexity, an edge case or integration an automation product doesn't support, or perhaps even an amount of spend that you start looking at annually as painful, it's likely you consider pulling all of your workflows out and have a software engineer build something bespoke for your business line.
Yeah, I wonder if this space is sort of like the to-do/task-management space. It's conceptually one thing, but in practice people have such different needs that no one product can be everything to everybody.
> The "retail" integration space remains underserved
This is accurate, there's a significant opportunity in this space but it won't stay that way for long.
There's a well-known entrepreneur I know with significant exits and capital entering this space that's going after Zapier's market and I'm certain he's not alone.
> There's a well-known entrepreneur I know with significant exits and capital entering this space that's going after Zapier's market and I'm certain he's not alone.
Having seen how the sausage is made and what it takes to build and support such a product, I will say that past success is no guarantee of success in this market. It is a grind, day in and day out, of dealing with queues, API integrations that fail or degrade either loudly or silently with no visibility into them, non-technical users who are not happy when the magic isn't working, blobs of data that just shouldn't be, spending hours tracking down single edge cases or failures, and so on.
It's not "write some code, here's your $1B exit" (but you will be writing a lot of code!). It is constant tending of a complex piece of machinery, and failures are painful (outages somewhat, data loss exceptionally so).
Anyone attempting to take on this space is going to have their hands more than full and they're far more likely to fail than succeed.
Personally I wouldn't even try, mostly because the tools themselves enable easier, lower-hanging fruit that's much more likely to succeed, and that's how I use them.
If you want to make money "using" these tools, then Huginn is your best option; it gives you a competitive advantage over competitors who are using Zapier to do similar things.
He should make this effort a marketplace like Zapier, but let the financing come from the SaaS vendors by offering them the opportunity to share revenue through reseller agreements. In Zapier's model, this would mean Zapier would be free for customers so long as a reseller agreement is in place. This would reduce the adoption friction of using the service, and for Zapier it would bundle every SaaS contract onto their infrastructure. Even Google offers resellers of its SaaS GSuite a whopping 20% revenue share.
Zapier can sell equity for an insane valuation IMHO. Money isn't their problem I think. I think - may be wrong - the problem is management. My sense is they are hampered by prior success bias and lack of what I'll call "grey hair" experience that provides insight into the business case their "retail" customers face. Again might be wrong, but something suggests arrogance to me.
"Once a day, ask 5 people for a funny cat photo; send the results to 5 more people to be rated; send the top-rated photo to 5 people for a funny caption; send to 5 final people to rate for funniest caption; finally, post the best captioned photo on my blog."
I'm still laughing :) wth! (My fear is that this might actually be sustainable with ads.)
The Airflow landing page that you linked to lists many integrations, but when you click through, only a small subset of them appear in the integrations section of the docs. I guess the docs need some more work.
This, as well as related projects like n8n and node-red, is a very cool project. I always wonder what people use it for in real life though. It seems like a lot of trouble (setting up, learning curve, maintenance) for an action that usually takes a couple of seconds, like checking the weather or opening Twitter.
I actually use Huginn to check the weather, compare it with weather conditions for the previous two days and send the result to my Telegram as a simple text ("It should be a bit warmer than yesterday in the first half of the day").
I do this because I have a hard time understanding whether wearing a coat would be enough for "-5 °C, NW wind 30 km/h". Comparison with previous days helps a lot, since I already know I was feeling OK in my coat yesterday.
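The comparison logic itself is tiny. A minimal sketch of the idea (the thresholds and phrasing here are made up, not what my actual agent uses):

```python
def compare_to_yesterday(today_c, yesterday_c):
    """Turn two temperatures (degrees C) into a relative, human-friendly phrase."""
    diff = today_c - yesterday_c
    if abs(diff) < 2:
        return "about the same as yesterday"
    size = "a bit" if abs(diff) < 5 else "much"
    direction = "warmer" if diff > 0 else "colder"
    return f"{size} {direction} than yesterday"

print(compare_to_yesterday(-3, -5))   # a bit warmer than yesterday
print(compare_to_yesterday(-12, -5))  # much colder than yesterday
```

The rest of the workflow is just fetching the forecast on a schedule, remembering yesterday's value, and sending the resulting sentence to Telegram.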
Ever notice when someone criticizes a company on HN, and suddenly the CEO shows up and engages after 6 months of inactivity? Likely they are using a tool like this along with Google Alerts, etc, to monitor sites for mentions of the company and maybe flag negative sentiment. Also, same thing but for politics.
Yes, very interesting indeed! Either there is a big conspiracy to uncover or said CEO reads HN and if something is number 1 and is related to his product he pays close attention. We will maybe never find out the truth...
From your defensive tone it sounds like you think I was talking about you, but I really wasn't. It's a phenomenon I've observed here and on Reddit and elsewhere for many years. And it seems like a perfect fit for a tool such as yours, is it not?
Very sorry! As I was the only "CEO" in here whose product got criticized, I thought it must have been me ;-)
Yes, it would actually be perfect for that, but I never had the time to set something like that up. Maybe someday in the future. For now, I only have a simple Google Alert set up and regularly check HN and Google Analytics, which is a good indicator of when it got mentioned somewhere.
I have a few:
- watch GoFundMe campaigns and post funding updates on Telegram
- watch multiple tech blogs and post updates on Telegram
- watch multiple funny blogs (xkcd, monkeyuser, lightroast, daily wtf) and post updates on Telegram
- watch English idiom/word/phrase of the day feeds and post on Telegram
- watch Canada's Express Entry draws and post new draws on Telegram
- watch IELTS exam dates/locations/seat availability and post updates on Telegram
These have required some setup but little to no maintenance after.
For those using expensive/advanced connectors like Zapier, Tray.io, etc., I find that https://n8n.io serves as a far more welcoming open-source alternative that is worth looking at.
Thanks a lot rvz for throwing it in the mix. I am the creator of n8n, so I just wanted to mention that it is not "OSI-approved open source", as the Commons Clause is attached. More information about that in the FAQ: https://docs.n8n.io/#/faq?id=license
haha, no, I would have done that even without it. It was always important to me to make that clear, even though it obviously did not work exactly as intended.
About the "robust" part: a startup I know had a lot of issues with Zapier and switched to n8n, and they are quite happy, as they have not had any issues since. I am sure you are right and you did your research, but I would love to hear what exactly you are referring to, as it will help me improve n8n.
The same with the limitations. Anything specific?
Yes, I did, but have not gotten to it yet. I am also not sure these integrations would then be the "best" to use. I think it would be great to have at least "something" if nothing exists yet, but it would not be as good and easy as the regular integrations and so would not fit perfectly into n8n, as I really want to make it as easy as possible for users.
Please elaborate. I'm not questioning what you say is true, but I'm interested in a tool like this one and would love to know more before choosing one or another.
I do not have that much experience with Huginn, but you are probably right. As n8n is new and Huginn has been around for many years, it would be surprising if they were equal in every way.
Anyway, I am quite sure that most of what can be done with Huginn can, with a limited amount of work, also be done in n8n, as creating new nodes is very simple.
(I am the creator of n8n)
One use case for Zapier (from a developer/company standpoint) is to allow customers to connect their existing services to actions inside your own app. For instance, if a customer updates a CRM record, you can have a custom zap update a record in your own SaaS platform.
To pull that off with huginn, is it as simple as connecting it up to Singer.io? Or would it require a big marketplace of huginn agents for popular integrations?
Also checkout Node Red [0], it's fairly popular in the automation space. It's rather sparse by default, but after adding in some community nodes (or making some yourself) it's pretty useful.
I imagine things have changed, but I tried node-red maybe 3 years ago and found it really difficult to set up and get working.
Once I finally had it working, I found it really difficult to debug "tasks" (I forget the correct, node-red verbiage), for example to find out why an HTTP call was failing.
I also found the available connectors to be lacking, and a lot I tried were quite basic.
I gave this a go today and managed to install huginn on my Synology NAS by simply searching for the Docker container. I then set up 3 agents to scrape a Shopify webstore JSON endpoint that I’m always checking for inventory, have huginn parse the JSON, and send me an SMS via Twilio if inventory changes. Took about 2 hours, wasn’t too bad. The huginn Twilio docs seemed dated.
Used Python's SimpleHTTPServer and ngrok to replicate a JSON URL and play with the triggers to test it all before pointing it at a real website.
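If you want a dynamic payload rather than a static file, a few lines of Python can serve a canned JSON response for agents to poll during testing. This is a hypothetical stand-in for the real Shopify endpoint (payload and port are made up):

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Canned payload standing in for the real inventory JSON.
FAKE_INVENTORY = {"product": "widget", "available": 3}

class FakeShopHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = json.dumps(FAKE_INVENTORY).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # silence per-request logging

def serve_fake_inventory(host="127.0.0.1", port=8000):
    """Blocks forever; point a Huginn WebsiteAgent (via ngrok, if Huginn
    runs elsewhere) at http://host:port/ while testing."""
    HTTPServer((host, port), FakeShopHandler).serve_forever()
```

Edit FAKE_INVENTORY between polls to simulate an inventory change and confirm the trigger fires before going live.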
I've seen the word "rake", so I guess... is this written in Ruby? If so, how's the performance?
My home server is pretty minimal and lightweight: a Raspberry Pi. Do you think it will run it fine? (I'm gonna want to try this anyway, didn't know about it until now and it looks amazing!)
Yeah, there's only an x86 Docker image on Docker Hub for Huginn. You could probably just build an ARM image from whatever Dockerfile the x86 version was built from.
I’m glad Huginn is on the front page. It’s an awesome project that has been in use for a long time now. I was testing Huginn to see how skimpy I can be and still run it on a free tier. I was able to run it on OpenShift's free tier a while ago, when their allowance was generous, but it looks like that's hard now. I will try running it on a GCP instance and see if it works.
Is there any way to implement agents in python rather than JavaScript/ruby? Looks interesting but I don’t want to invest energy in building fluency in these other scripting languages.
I run Huginn in a Kubernetes cluster with most of its heavy lifting done by HTTP calls to OpenFaaS running in the same cluster. This let me write complex logic in Python but have Huginn handle the execution, triggers, filtering, and delivery of the data. Among other things I use it to scrape Twitter for interesting Kubernetes news and follow the retweets back to people who link to their own websites so that our editorial department can vet them for invitation into our writing program. Any particular author/account only appears once. It generates around 40 new candidates each day. I also use it to scrape the Chilean public alert website (onemi.cl) for alerts in my region ( forest fire, major weather), run those through the translate API and then send the English translations to my and my wife’s phones via Pushover.
These are things that I want to control and customize, and I vastly prefer running my own systems over trusting any third party.
You could write python externally (cloud functions/aws lambda) and then use huginn to call that over HTTP. This doesn't require any coding. Huginn has existing agents that you can easily configure to make HTTP requests (PostAgent and WebsiteAgent).
And, depending what you are planning to do with huginn, simply sending data from your python script to huginn using the WebhookAgent may be sufficient.
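As a sketch, the Python side of that can be as small as a function that takes the event payload Huginn POSTs and returns a JSON-able result. The handler shape and keyword list here are hypothetical, not any specific cloud-function framework's API:

```python
import json

def handle(event):
    """Hypothetical cloud-function body: receive an event payload from
    Huginn, do the complex logic Python is good at, return data for
    Huginn to ingest."""
    text = event.get("text", "")
    keywords = [w for w in ("kubernetes", "huginn") if w in text.lower()]
    return {"matched": keywords, "interesting": bool(keywords)}

# What Huginn's PostAgent might send, and what it would get back:
payload = {"text": "New Kubernetes release announced"}
print(json.dumps(handle(payload)))
```

Huginn then treats the returned JSON as just another event, so the Python logic slots into the middle of a workflow without Huginn needing to know anything about how it's implemented.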
> Create Amazon Mechanical Turk workflows as the inputs, or outputs, of agents (the Amazon Turk Agent is called the "HumanTaskAgent"). For example: "Once a day, ask 5 people for a funny cat photo; send the results to 5 more people to be rated; send the top-rated photo to 5 people for a funny caption; send to 5 final people to rate for funniest caption; finally, post the best captioned photo on my blog."
Interesting, I should switch to that for some location-based action with my phone instead of IFTTT and see if I can make it work with Hauk or something similar.
Seems like all of these self-hosted automation apps struggle to incorporate Facebook adapters. Does anyone know of one that would allow me to read Tweets and post ones containing some particular text to a Facebook page (not my personal timeline/page but one I created for my project)?
I like to think of Zapier as "wizard mode" - if you're doing something within their pattern, planned workflow, capabilities, etc it is a great option.
If you need to do something more advanced or outside that narrow scope, you need to go to the APIs (or Huginn) itself.
The important questions are: How many people need to get into advanced mode? How big of an effort/skill/understanding jump is it to go there? (aka will it hurt too much to switch?)
I've come across Huginn many times in the past and every time, I tell myself I'm going to give it a shot. I still haven't even seen it loaded in a web browser...
After fighting the Docker images for half an hour, I usually realize that I could've just written a Python script to do whatever I was trying to do by then and proceed to do just that.
Sorry for being lazy! Maybe I'll get around to it next year...
What do you mean remote agents? Can it be accessed over HTTP? Then yes.
One of the great features of huginn is the diagram. That's where there are green and red indicators. Red indicates both errors and when a longer than expected period has passed. This makes quickly inspecting your workflows easy.
Sorry. I meant running an agent on a machine that is separate from the machine running huginn. The machine running the agent is not accessible publicly but has internet access (doesn't accept connections but can initiate them). huginn would be accessible publicly.
I'm probably missing something here, but what is the value of this when you could write a script that would do just about the same? (Serious question, I want to know the answer, not rhetorical)
Huginn is more of a framework for writing and running these scripts. It has some built-in modules to make the process easier. E.g., the webhook agent is an easy way of getting POST data into a script without bootstrapping a webserver yourself; you can then just raise an event and have another script handle it. No need to write and manage everything individually; you can run everything in one place.
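For example, pushing data into a webhook agent from an outside script is a single POST to its per-agent URL. A sketch (the URL pattern matches what Huginn generates as far as I recall; the host, IDs, and secret are placeholders):

```python
import json
import urllib.request

def huginn_webhook_request(base_url, user_id, agent_id, secret, payload):
    """Build (but don't send) a POST request for a Huginn WebhookAgent."""
    url = f"{base_url}/users/{user_id}/web_requests/{agent_id}/{secret}"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = huginn_webhook_request(
    "https://huginn.example.com", 1, 42, "supersecret",
    {"source": "my-script", "message": "inventory changed"},
)
print(req.full_url)
# urllib.request.urlopen(req)  # actually send it
```

The webhook agent then emits the payload as an event, and any downstream agents handle it from there.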
I've deployed Huginn a few times but never gotten off the ground with it. I found agents a bit too cumbersome to configure, but maybe I just need to give it another shot.
I would not market this as an alternative to Zapier since it goes after a different market. The type of people who use Zapier will see a link to github.com and not even consider it.
I think this should be positioned more like "A high powered Zapier for technical users" or something similar.
> How does configuration work? Is it just configurable through the web interface, or can I update/set workflows with some external script?
Primarily Web interface. Though, the CommanderAgent can configure and trigger other agents. You can use that in combination with your external script for configuration.
> Is configuration saved in text files which I can put under version control and back up, or is it some inaccessible binary blob?
Agent configurations are stored in the database and can be exported as text files for your backups.
> How much does it break with each update? And how often does it receive bigger and smaller updates?
Making sure updates don't break existing installations is one of the top priorities of maintainers. Most of the updates are small/incremental. And I haven't had any problems with the updates breaking things.
> How does this compare to something like home assistant or node-red, if you know this software.
I looked at node-red years ago and determined it wasn't as mature a project. Things could've changed since then. And I haven't heard of home assistant.
> Anything you consider missing or lacking on some area?
The UI is a bit basic. Though, I think this can also be looked at as a feature instead of a flaw.
Mostly assisting in curating news about niche technologies (angular, react).
What's a "major issue"?
At this point, I feel that I'm well versed in when to use huginn. That helps me avoid most issues.
I use Zapier all the time for the things it does better/easier. That said, often, one end of the "zap" is integrated with a workflow in huginn. For instance, I commonly use Zapier when I need data gathered by huginn to be continuously logged into a spreadsheet.
Additionally, puppeteer(node) is another fantastic automation tool, which I use to do things that you probably shouldn't try to do with huginn. But, I connect my puppeteer scripts to larger workflows using huginn.
Thanks to the maintainers, I think that huginn does what it was designed to do very well.
@dsander I immediately recognized your username from GitHub. Thanks for all the work you've put in!
I run Huginn on Heroku. I haven't had any significant issues since it's been running. It's been so long since I've set it up that I would need to review my notes to tell you any more about how it's deployed.