There are no install instructions in the .zip file, but there is a GitHub wiki. That wiki, however, has obsolete instructions. There is an open issue, though, and the comments on that issue include fixes for them.
I stopped messing with it once I saw how they wanted me to install something into root's crontab.
My hypothesis: postgresql has none of these jazzy monitoring tools because it never crashes.
Sure, as traffic grows, slow queries need to be optimized, crappy N+1 ORMs need to be replaced, and boxes need more RAM and faster disks, but when was the last time you heard of anyone complain about postgresql or blame it for any downtime?
/snark
I've certainly had PostgreSQL crash and all it took was someone in a Verizon datacenter accidentally yanking out the cables connecting the server to its RAID arrays. It even crashed a second time when someone "tripped over a cable" and took out the arrays again and a third time when the redundant power supplies turned out not to be...
OTOH, complete loss of storage resulted in sub-second data loss with automatic recovery the first two times, on a five-year-old Pg installation, and minimal issues the third time. The master had a three-year uptime on the main server process when it went down. Overall I'm pretty comfortable with that track record.
I want to be able to scale out by adding more machines. I want to be able to fail over automatically to another data center when the first one goes down. I have yet to see a straightforward way to accomplish this with PG. Their wiki lists a bunch of tools related to this, but as far as I can tell they are either abandoned or don't cover this. I can't understand all the positive things I read about PG.
The promotion is done by rebinding the URL of the database and restarting the app. This shares a mechanism neatly with changing passwords, which is one reason we decided it was worth throwing out network-transparent orthodoxy when it came to HA: the clients must be controlled anyway to deal with security considerations.
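A minimal sketch of what that looks like operationally, assuming the app reads its connection string from an environment variable (DATABASE_URL and myapp below are placeholder names, not anything from the comment above):

    # point the app at the newly promoted database, then bounce it
    export DATABASE_URL="postgres://app:secret@new-primary.example.com:5432/appdb"
    systemctl restart myapp    # or however the app is supervised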
One approach I used when we migrated data centres, to avoid having to manage the timing of IP address changes in our apps, was to use haproxy.
We configured slaves in the new data centre, set up an haproxy there that used the old data centre databases as the main backend and the new data centre databases as the backup, changed the apps to point at haproxy, shut down the masters and let haproxy shift all the traffic, then promoted the slaves once we were certain they'd caught up. We had a period of seconds where the sites were effectively read-only, but that was it.
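Roughly, the haproxy side of that setup can be sketched like this (hostnames and ports are illustrative, not the real ones); the "backup" keyword is what keeps the new data centre's databases idle until the old masters stop passing health checks:

    listen postgres
        bind 0.0.0.0:5432
        mode tcp
        option tcplog
        # old DC keeps the traffic while it is reachable
        server old_dc_master old-dc-db.example.com:5432 check
        # "backup": only receives connections once the server above is down
        server new_dc_slave  new-dc-db.example.com:5432 check backup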
We're planning to roll out haproxy with keepalived or ucarp to mediate between a lot more of our backend services.
> I want to be able to failover automatically to another data center when the first one goes down. I have yet to see a straight forward way to accomplish this with PG
Of all the issues that could cause a machine to decide that a failover is needed, most of the root causes actually make a failover undesirable: a hardware failure, for example (where failover is good), is way less likely than unreachability due to load (where failover is disastrous), and unless you are very careful, an automated solution will act the same way in both cases.
Add to that the huge cost of failing back, during which time there's no more slave to fail over to: until 9.4 is released, failing back requires you to copy all the data back to the failed master at the file-system level to bring it back up as a slave.
After 9.4, re-synchronizing an old failed master to the new master will actually be possible in most cases (and a mistaken failover is usually one of those cases).
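(If I read that right, it refers to what pg_rewind makes possible, relying on wal_log_hints or data checksums being enabled; a rough failback sketch under that assumption, with made-up paths and hostnames:)

    # the failed master must be shut down cleanly first
    pg_ctl -D /var/lib/postgresql/9.4/main stop
    # rewind its data directory so it can re-sync from the new master
    pg_rewind --target-pgdata=/var/lib/postgresql/9.4/main \
              --source-server="host=new-master.example.com port=5432 user=repl dbname=postgres"
    # then configure it as a slave of the new master and start it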
In case of an emergency, first make sure that a failover would actually help (if you're down because of high load or a misconfiguration of your system, failing over won't help but will only make things worse), then fail over manually.
As I said, there are way fewer possible emergencies where failing over would help than there are emergencies where failing over would actually cause more damage.
This is valid for all non-master-master database configurations I've had to deal with so far, but, again, it's even more pronounced with Postgres because of the very time-consuming (and bandwidth-consuming, which can mean costly when you cross the public internet) failback, during which you have nowhere to fail over to again.
If you really, really want to do it, have a look at pgpool (http://www.pgpool.net/), which can automatically fail over to a slave and which can also load-balance reads across one or more slaves. It's quite the out-of-the-box solution.
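A minimal pgpool.conf sketch of that kind of setup (hostnames and the failover script are invented; parameter names are from pgpool-II as I recall them):

    backend_hostname0 = 'db-master.example.com'
    backend_port0     = 5432
    backend_weight0   = 1
    backend_hostname1 = 'db-slave.example.com'
    backend_port1     = 5432
    backend_weight1   = 1
    # spread read-only queries across the backends
    load_balance_mode = on
    # placeholder script that promotes a slave when the master disappears
    failover_command  = '/usr/local/bin/promote_slave.sh %d %H'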
As for scaling horizontally at the DB level, that can be achieved with foreign data wrappers calling out to other servers, but it isn't built in.
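As a sketch of what "calling out" looks like with postgres_fdw (server name, credentials and columns are invented for illustration):

    CREATE EXTENSION postgres_fdw;

    CREATE SERVER shard_2
        FOREIGN DATA WRAPPER postgres_fdw
        OPTIONS (host 'shard2.example.com', port '5432', dbname 'appdb');

    CREATE USER MAPPING FOR app_user
        SERVER shard_2
        OPTIONS (user 'app_user', password 'secret');

    -- queries against this table are shipped to the remote node
    CREATE FOREIGN TABLE events_shard_2 (
        id         bigint,
        user_id    bigint,
        created_at timestamptz
    )
    SERVER shard_2
    OPTIONS (schema_name 'public', table_name 'events');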
While Postgres doesn't rival Oracle and MS SQL Server in feature checkboxes, it's a very solid DBMS with many advanced SQL features, and it's free and open source. You can do a lot without hitting the scaling problems.
Well, in the world of free software you have a choice - an easily-scaled database system that offers you few guarantees and is very hard to code safely against, or one that is harder to scale but easier to code safely against. That doesn't make PG bad - it's just a choice you make. If you want to pay, you can get a little bit closer to having both.
Of course you hear lines like the one you quoted all the time - that's because they're true. The overwhelming majority of applications don't have needs beyond a master-slave pair of DB systems, and trends are moving further in that direction every day as RAM gets cheaper. Stack Overflow runs on a single not-that-beefy master-slave DB setup.
Typical web applications are insanely amenable to read caches, and that's where you do most of your scaling. If you're at the point where your scaling needs truly exceed a large single DB system, I'd hope you have some money to throw at the problem.
Imagine my users table has a unique constraint on email address. Or my hotel_room_reservations table has to guarantee I never store overlapping reservations for the same room. In either case, on insert and on update, I need to check the new or updated row vs a bunch of other rows.
Those scenarios make horizontal scaling not so simple. Sure, get rid of those constraints, and you can insert as fast as your network can ship the bytes. But that's not a database anymore. That's a glorified flat file.
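For reference, both constraints are trivial to state on a single node; the reservations one is an exclusion constraint, which needs btree_gist for the equality part (table layout here is purely illustrative):

    CREATE EXTENSION btree_gist;

    CREATE TABLE users (
        id    serial PRIMARY KEY,
        email text NOT NULL UNIQUE
    );

    CREATE TABLE hotel_room_reservations (
        id      serial PRIMARY KEY,
        room_id int NOT NULL,
        during  tstzrange NOT NULL,
        -- no two rows may have the same room and overlapping ranges
        EXCLUDE USING gist (room_id WITH =, during WITH &&)
    );

Enforcing either of these means looking at other rows at write time, which is exactly what naive horizontal partitioning gives up.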
I agree that the documentation still needs a lot of work.
However, it doesn't say or suggest to set up any crontab as root. As the only requirement is to have a stored procedure called regularly, you can use the local postgres user's crontab, for example, or any other place. I will clarify this point in the documentation as soon as possible.
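For example, something along these lines in the postgres user's crontab would do (the database and function names below are placeholders, not the project's actual ones; check the docs for the real call):

    # edit with: sudo -u postgres crontab -e
    # call the maintenance procedure every 5 minutes
    */5 * * * * psql -d opm -c "SELECT run_collection()" >/dev/null 2>&1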
I'm involved in both projects. Here's some insight:
OPM and POWA are not related. They were born at different times and were developed by two different R&D teams at DALIBO, a French PostgreSQL company. The fact that they've been released publicly at almost the same time is merely a coincidence.
OPM is designed to help a DBA manage a very large number of servers. It provides answers to the questions: "Is there a problem somewhere on my PostgreSQL servers? Which instances need improvement?"
POWA is more focused on showing the "real-time" activity of one specific instance. It provides answers to questions like: "Which queries are slowing down my server right now?"
If you're familiar with Oracle products, the difference between OPM and POWA is more-or-less the difference between Oracle AWR and Oracle Grid Control.
In a nutshell, if you only have one or two PostgreSQL servers in production, then POWA is easier to install and probably enough for you. If you want more stats and an overview of your servers, then OPM is a better choice.
Both projects are autonomous and moving fast. We're just pushing code on github without direct economic goals or marketing plans :-) In the long run we'll see which one finds its user base and how they evolve.
That being said, we also have other people working on similar PostgreSQL tools, in particular:
pgBadger ( http://dalibo.github.io/pgbadger/ ) is a log analyzer that provides rich reports of database activity. Basically it answers the question: "Which queries should be optimized?" (A typical invocation is sketched after the next item.)
pgCluu ( http://pgcluu.darold.net/ ) is an audit/reporting tool, more focused on system and resource utilization. It answers questions such as: "What's the load on my CPUs? Is my memory swapping?"
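To give a concrete taste of the pgBadger invocation mentioned above (the log path is just an example, and the log needs to include query durations, e.g. via log_min_duration_statement):

    pgbadger /var/log/postgresql/postgresql-9.3-main.log -o /var/www/reports/pgbadger.html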
So basically, regarding PostgreSQL monitoring, we're just moving in every possible direction at the same time... Some say it's a "waste of time"; we prefer the term "Darwinian Approach" :P
Minor grammar nitpick: "Open PostgreSQL Monitoring is a free software" is incorrect English, as "software" is uncountable: we don't say "a software", but a program, a system, or something like that. You could also say "is free software".
I am unsure what some of the graphs measure and what the labels mean when viewing the demo. Tooltips, descriptions, etc. would be helpful in my opinion. Overall it seems like a great start though.
For all the PGSQL graphs, the meaning of each graph and label is (only :/) documented in the check_pgactivity "--help" output, as that is the direct source of these graphs. Sadly, we haven't had time yet to write proper documentation on the wiki page or in the UI. We'll add better documentation as soon as possible.
The SYSTM graphs correspond to standard Nagios system checks.