Posts Tagged Kettle

The problem with Kettle/PDI

It’s no secret that I’m a fan of Kettle (Pentaho Data Integration), but there is one big problem. It’s missing more than a bit of functionality, and that gap is the show stopper for widespread adoption by many customers.

For most needs, Kettle is faster, better, and allows more agile development than Informatica. Spend a week with it and you’ll see. (I'm not dismissing the other commercial players but, realistically, Infa rules the space where Kettle wants and needs to be.)

A useful feature of Kettle in some tactical situations is that it neither needs nor wants a monolithic centralized ETL server. If you’ve got a little RAM to dedicate to a JVM, it is off to the races. The other day, I was able to enjoy the benefits of a tool over hand-coded scripts when pulling data from an ERP, from a MES and from some standalone sensors on robotic welders. There is visibility into the transformation/integration logic, and much of the work will be reused in the future. Cool. A great use of Kettle as it is today.

Where Informatica excels and Pentaho doesn’t show up for the party (or apparently even know there is a party) is in the monitoring of independent and related processes on a constant basis. The Pentaho crowd will say “oh that’s easy, Kettle captures all of the operational metadata to do it” (it does), but they miss the point! As simple as it may seem to the Java devs recording it, using the repository data is daunting to many. Few of their customers will start from scratch to build the reports and analytic tools to monitor dependencies, jobs and transformations. Instead they go to Informatica.
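To make the gap concrete, here is a minimal sketch of the kind of report a customer would have to build for themselves. Everything in it is an assumption for illustration: the `job_log` table, its `jobname`, `errors` and `startdate` columns, and the sample rows are hypothetical stand-ins, not Kettle's actual logging schema, and an in-memory SQLite database stands in for wherever the log data really lives.

```python
import sqlite3


def failed_runs_by_job(conn):
    """Return (jobname, total_runs, failed_runs, last_start) per job, worst first."""
    sql = """
        SELECT jobname,
               COUNT(*) AS total_runs,
               SUM(CASE WHEN errors > 0 THEN 1 ELSE 0 END) AS failed_runs,
               MAX(startdate) AS last_start
        FROM job_log
        GROUP BY jobname
        ORDER BY failed_runs DESC, jobname
    """
    return conn.execute(sql).fetchall()


def demo_connection():
    """Build a throwaway in-memory database with hypothetical log rows."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table layout; a real log table would need to be inspected first.
    conn.execute(
        "CREATE TABLE job_log (jobname TEXT, errors INTEGER, startdate TEXT)"
    )
    conn.executemany(
        "INSERT INTO job_log VALUES (?, ?, ?)",
        [
            ("load_erp", 0, "2008-01-01 02:00:00"),
            ("load_erp", 3, "2008-01-02 02:00:00"),
            ("load_mes", 0, "2008-01-02 03:00:00"),
        ],
    )
    return conn


if __name__ == "__main__":
    for row in failed_runs_by_job(demo_connection()):
        print(row)
```

One query is easy enough. The daunting part is everything around it: dependencies between jobs, alerting, history and a front end that non-developers can actually read.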

A call to Matt Casters or some clever open source developer (related or not to the Pentaho crowd): build a dashboard and reports (in Pentaho’s framework) that do half of what Workflow Manager does and you will soar!!!


Experts???

I was just looking over reviews of integration tools by Gartner and Forrester … pretty scary.

Two faults with the Gartner report:

The first is the exclusion of open source contenders Pentaho and Jasper. Gartner does qualify that they only analyze tools from companies with a minimum revenue or a minimum number of production customers. While Pentaho may not yet have 300 subscribers to the commercial version (I have worked for two companies that do, or 0.67% of the threshold), I’d bet just about anything that there were way over 300 customers supported by the community in 2007, when the Gartner report came out.

The other flaw with the Gartner report is in its naming: they review monolithic ENTERPRISE data integration tools with ENTERPRISE criteria but title the report Magic Quadrant for Data Integration Tools, 2007.

The Forrester Wave™ Evaluation of the Enterprise ETL Market is more appropriately named but kinda scary … I don’t know much about Forrester, but recognizing OWB as an option is a joke, and Pervasive is given some credit. While pretty powerful, Pervasive is just a mash of some small-scale tools; the last time I used it, it required several different scripting languages depending on which part of the suite you were in!
