
Enterprise Data Integration: The State of the Art

Recently I had to get up to speed on Cast Iron data integration solutions, now part of IBM. I have to admit, I went into it a bit pessimistic about the very notion of “point-and-click programming.” The basic value proposition of Cast Iron is to make it easy to move enterprise data from one place to another, for example, synchronizing an Oracle database of accounts and customers with your database.


You do this by creating Projects with Orchestrations, which are made up of Activities that talk to and listen to Endpoints like HTTP, FTP, SMTP and Database. Most Activities can map and transform data before passing control to the next Activity. There are also control flow Activities like If/Then, Do/While and Try/Catch. This kind of visual activity building isn’t unique to Cast Iron, of course, and it makes pretty pictures. The diagram component used in Cast Iron Studio, a Java desktop app, is quite nice to work with.

Within Activity blocks you can map data by dragging and dropping:


I learned that the mapping interface is essentially an XSLT generator. When you peek a bit under the covers of Cast Iron, you’ll discover that it’s all about converting data to XML representations, building XSLTs and converting XML back to database calls or strings or whatever your end point needs. But Cast Iron hides all that from you, mostly. Once you’ve built your Orchestration and debugged it to your satisfaction, there’s a server (“appliance”) that you can push the Project to, and it’ll host the whole thing. Or you can have Cast Iron host it for you in “the cloud.”
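
The round trip described above (record to XML, transform, XML back to a record) can be sketched in a few lines of Python. This is only an illustration of the idea, not how Cast Iron implements it: Python's standard library has no XSLT processor, so a plain `transform` function stands in for the generated stylesheet, and the field names and `FIELD_MAP` are made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping: source element name -> target element name.
FIELD_MAP = {"CUST_NAME": "name", "CUST_EMAIL": "email"}


def row_to_xml(row: dict) -> ET.Element:
    """Serialize a source record (e.g. a database row) as an XML element."""
    root = ET.Element("row")
    for name, value in row.items():
        ET.SubElement(root, name).text = str(value)
    return root


def transform(source: ET.Element) -> ET.Element:
    """Stand-in for the generated XSLT: rename mapped fields, drop the rest."""
    target = ET.Element("customer")
    for src_name, dst_name in FIELD_MAP.items():
        node = source.find(src_name)
        if node is not None:
            # Trim whitespace while copying, as a simple "transform" step.
            ET.SubElement(target, dst_name).text = (node.text or "").strip()
    return target


def xml_to_record(element: ET.Element) -> dict:
    """Convert the transformed XML back into what the endpoint needs."""
    return {child.tag: child.text for child in element}


source_row = {
    "CUST_NAME": " Ada Lovelace ",
    "CUST_EMAIL": "ada@example.com",
    "LEGACY_FLAG": "Y",  # unmapped, so it is dropped
}
record = xml_to_record(transform(row_to_xml(source_row)))
print(record)  # {'name': 'Ada Lovelace', 'email': 'ada@example.com'}
```

Dragging a line between two fields in the mapping UI amounts to adding one entry to something like `FIELD_MAP`; the tool's job is to hide everything else.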

It’s all actually very neat once it comes together and works. And it’s true, for basic data integrations no programming is required, and you can just point-and-click your way to data integration nirvana. Sounds easy, right?

Reflecting on this technology, I’m struck by one thing: data integration is still hard.

We had a small but diverse group in the training session. Most participants introduced themselves as developers or systems analysts. But even with a lot of hand-holding and a very knowledgeable and effective instructor, these fairly basic exercises often proved challenging. At various points, each person got stuck and needed help to get unstuck. Plus, these exercises seemed nowhere near the level of complexity that real-world data integrations would have.

Making changes required lots of hunting, clicking, dragging, typing little bits of text, waiting and repeating. Even putting aside numerous UI annoyances and glitches, the experience of building even a fairly basic Orchestration with a small handful of Activities can be pretty frustrating. Looking around the room, I felt that while people were pleased when they finally got a lab exercise working, they were concerned that they needed so much help from the instructor, and were often stymied trying to work things out on their own. Although my background made the exercises pretty straightforward for me, it was clear this stuff still isn’t quite within easy reach.

Makes me wonder: can data integration really be easy one day?

What if you could run software that works like this:

  • Asks you: Where’s your data?
  • Asks you: Where do you want to move your data?
  • Gathers and verifies all authentication information, then gets to work:
    • Performs inspection of data sources and targets, including random sampling,
    • Algorithmically generates an ETL Orchestration, highlighting areas of greatest uncertainty,
    • Includes logging, email/SMS alerts, error handling, intelligent upserts, etc., all heuristically and algorithmically determined,
    • Automatically creates “staging/sandbox” environments based on the data targets, then shows you previews of the data integrations so you don’t have to build your own staging environments,
  • Lets you make deeper customizations by hand if needed,
  • Once you’re satisfied, allows one-click deployment and activation
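
To make the “generate a mapping and highlight uncertainty” step a little more concrete, here is a deliberately naive sketch. It assumes the only evidence available is column names and uses Python's `difflib` to score name similarity; the column names, the `propose_mapping` helper and the 0.7 threshold are all hypothetical. A real tool would also need the sampled data, types and keys mentioned above.

```python
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase and strip punctuation so 'E-Mail' and 'email' compare equal."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def similarity(a: str, b: str) -> float:
    """Similarity of two column names, from 0.0 (unrelated) to 1.0 (same)."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def propose_mapping(source_cols, target_cols, threshold=0.7):
    """For each target column, pick the most similar source column.

    Returns (target, source, score, needs_review) tuples; needs_review is
    True when the best score falls below the (arbitrary) threshold.
    """
    proposals = []
    for target in target_cols:
        best = max(source_cols, key=lambda s: similarity(s, target))
        score = similarity(best, target)
        proposals.append((target, best, round(score, 2), score < threshold))
    return proposals


source = ["CUST_NAME", "EMAIL_ADDR", "CREATED_DT"]
target = ["customer_name", "email", "signup_date"]
for tgt, src, score, needs_review in propose_mapping(source, target):
    flag = "REVIEW" if needs_review else "ok"
    print(f"{tgt:15} <- {src:12} score={score:.2f} {flag}")
```

Even this toy version shows where the difficulty lies: name matching handles the obvious cases, and everything it flags for review is exactly the part that still needs a person.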

Perhaps a very naive vision, but this is what Usable Data Integration would be to me. Although tools like Cast Iron are nice, and they pay a lot of lip service to simplicity, they are still definitely not “simple” except in the simplest scenarios. Yes, the “secret sauce” would be that magical algorithm in the middle step.

Maybe integration has innate complexity and can never be made simple?

I’d like to think and dream that integration can be simple. It’s just a hard problem.



Human Tolerant Software

Suppose you have a date range picker in your user interface, like this:

[Image: date range picker]

This picker acts as a filter for selecting records of things that are themselves ranges of dates.  For example, project tasks that have start and end dates.

The usability question is, should the filtered records be ones that (A) fall inside the selected date range, or (B) intersect the selected date range?

Answer: B

One guiding instinct is to err on the side of showing more information when given two otherwise equal options.  However, I feel this “rule” is weak because at some extremes, you wouldn’t want to follow it.

My key rationale is that from a usability standpoint, software should be tolerant of human imprecision.

[Image: Win95 vs. WinXP Start buttons]

Consider the classic example of how the Microsoft Windows designers snatched defeat from the jaws of victory in Windows 95 by not allowing you to click the bottom-most, left-most pixel to invoke the Start menu. You have to actually move the mouse a few pixels up from there, a source of great frustration, particularly to novice, sticky mouse ball, twitchy, funky mousepad, visually impaired, screen misaligned, elderly and disabled computer users the world over.  See also Fitts’s law.

In our example, our designer has two real choices:

(A) Expect the human to always know and properly pick the "bounding dates" and exclude anything that doesn’t lie inside the circumscribed dates.  The worst case here is that the human is wrong, and when that happens, records that she/he should be seeing will be missing.  And in some cases she/he may not even know that records are missing.


(B) Be more tolerant and expect that maybe the human accidentally picked 1/2 instead of 1/1, or maybe the human forgot that there’s a task that started on 12/31.  Show more.  Worst case: we show too much.  Then the user has to refine his/her search, or simply ignore the extra record(s).
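
In code, the two choices differ only in the comparison used. The sketch below is an illustration with made-up tasks and years, not taken from any particular product; the `Task` class and both predicate names are assumptions. It replays the scenario above: the user picks 1/2 instead of 1/1 and has forgotten about a task that started on 12/31.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Task:
    name: str
    start: date
    end: date


def inside(task: Task, range_start: date, range_end: date) -> bool:
    """Option A: the task must lie entirely within the picked range."""
    return range_start <= task.start and task.end <= range_end


def intersects(task: Task, range_start: date, range_end: date) -> bool:
    """Option B: the task only has to overlap the picked range."""
    return task.start <= range_end and range_start <= task.end


# Hypothetical records; the years are arbitrary.
tasks = [
    Task("Year-end report", date(2009, 12, 31), date(2010, 1, 5)),
    Task("Kickoff", date(2010, 1, 1), date(2010, 1, 3)),
    Task("Planning", date(2010, 1, 4), date(2010, 1, 10)),
    Task("Launch", date(2010, 3, 1), date(2010, 3, 2)),
]

# The user meant 1/1 through 1/15 but picked 1/2 as the start date.
picked_start, picked_end = date(2010, 1, 2), date(2010, 1, 15)

strict = [t.name for t in tasks if inside(t, picked_start, picked_end)]
tolerant = [t.name for t in tasks if intersects(t, picked_start, picked_end)]

print(strict)    # ['Planning']
print(tolerant)  # ['Year-end report', 'Kickoff', 'Planning']
```

With option A, a one-day slip silently hides two tasks. With option B, the same slip costs nothing, and the unrelated “Launch” task in March is still filtered out.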

Which software would you prefer to use?  The one that expects you to be a machine?  Or the one that expects you’re human?
