Monday 31 December 2012

ODI11g - Which Repository architecture to go for?

Leaving licensing aspects aside, and with my Sunopsis past following me, I often wonder what architecture would be best to recommend to an ODI newcomer.
Maybe the question appears quite simple for some people to answer, and maybe I am paying too much attention to this point.
To put things in perspective, there is no better way than taking several cases (at least two) that are completely different and far away from one another, in such a way that they set the limits of the exercise.
In the Sunopsis period, the sales pitch, supported by a designer-seat-driven licensing model, was to position the product as the solution for multiple Data Integration flows to multiple targets. The ODI repository (with a Work repository of type Development) could then be implemented as a single central metadata repository, enabling central impact analysis and central metadata management. This was, and still is, supported by the fact that Data Models (as ODI objects) can be shared across projects: they are not bound to a Project (as ODI object) in a 1:1 but in a 1:M relation, as the sketch below illustrates.
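To make that 1:M relation concrete, here is a minimal sketch in plain Python (the model and project names are hypothetical; ODI itself keeps these links in its repository tables):

    from dataclasses import dataclass, field

    @dataclass
    class DataModel:
        name: str       # a shared ODI Data Model, reverse-engineered once

    @dataclass
    class Project:
        name: str
        models: list = field(default_factory=list)   # references, not copies

    sales_model = DataModel("SALES_DWH")
    crm_load = Project("CRM_LOAD", models=[sales_model])
    fin_etl = Project("FIN_ETL", models=[sales_model])

    # Central impact analysis becomes possible because both projects
    # point at the very same metadata object (1 model : M projects).
    assert crm_load.models[0] is fin_etl.models[0]
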
In the Oracle period, which started more than five years ago, the product has become part of the Fusion Middleware offering and interacts with many other products that expose similar functionalities. Oracle has indeed been assembling products in its Fusion Middleware offering that were, for some of them, previously competitors with overlapping functionalities (e.g. transformations in data integration).
Within the OFM approach, the differentiating (or core) functionalities of these products are kept to fill in the puzzle. ODI within the Fusion Middleware offering could ultimately be seen as the E-LT mass transformation engine. In this context, it could be reduced to a technical enabler for this function. Within an OFM solution implementation, each data integration flow or process requiring mass data manipulation could call on its own dedicated ODI implementation to serve its goals.
The two limit ODI architectures would then look, at one end, like a single centralized metadata repository managed and organized from a centralisation perspective, and at the other end like a free, open setup where each developer or group of developers locally manages their own project-related, non-centrally-managed ODI implementation (implicitly with their own set of repositories, or even a single repository when using the R.C.U-tility). The sketch below contrasts the two.
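A minimal sketch of the two limit topologies, again in plain Python with purely hypothetical repository names, may help visualise the contrast:

    # Limit case 1: one central Master repository, one shared Work
    # repository; all projects develop in the same place.
    centralized = {
        "master": "MASTER_CENTRAL",
        "work": ["WORK_SHARED"],
    }

    # Limit case 2: each project or team runs its own, locally
    # managed repository pair, invisible to the others.
    decentralized = [
        {"master": "MASTER_PROJ_A", "work": ["WORK_PROJ_A"]},
        {"master": "MASTER_PROJ_B", "work": ["WORK_PROJ_B"]},
    ]
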
Experience has often demonstrated that the optimal choice for a given case is to be found between these two limit cases.
Further, one could initially find oneself in front of a chaotic, unmanaged implementation wherein several projects started as 'Proofs of Concept' see their lifecycle extend beyond the PoC period. A federating process could later take place to eventually bring them all under control within a centrally managed approach.
In order to evaluate appropriate alternatives, one would have to draw up a list of decision drivers and evaluate each extreme possibility against these drivers (a rough scoring sketch follows the list below).
For the sake of illustration, one could start with a couple of driving functions such as:
. the ability to evaluate the impact of a change in a data model
    when a datastore (as ODI object) is reverse-engineered in several Work Repositories for several projects,
    it is most difficult to trace the impact of a change to this object
. managing the security of the ODI objects (projects, models, ...)
    having one Work repository per development team requires no fine-tuned security setup
. sharing the model metadata (technical and business related)
. sharing the in-house developed or modified Knowledge Modules
    when using a single Work repository, in-house developed KMs are easily shareable when placed under
    [Others].[Knowledge Modules] in the Designer Navigator.
    But different projects might require different KM frames or patterns, reducing the need to share.
. managing upgrades and the homogeneity of the ODI application itself
    Here the lifecycles of the different projects the ODI components are incorporated into may differ.
. facilitating the metadata exchange with other metadata repositories (e.g. OBIEE)
. the IT organisation itself (centralized or decentralized: geographically, Operations versus BI, ...)
. the internal policy regarding version control, using the ODI repository or an external tool
. the developers' quest for autonomy with regard to Data Model/Datastore specifications
    e.g. when a developer has to wait for someone else to set up items in the Data Models before he/she
    can work further on the project
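As announced above, here is a rough scoring sketch of how the two limit architectures could be rated against such drivers. The weights and marks are purely illustrative assumptions, not measured values; each organisation would have to set its own.

    # Hypothetical weights (importance for a given organisation) and
    # marks (how well each limit architecture serves a driver, 0-5).
    drivers = {
        # driver:                    (weight, centralized, decentralized)
        "impact analysis":           (5, 5, 1),
        "object security":           (3, 2, 4),
        "shared model metadata":     (4, 5, 2),
        "shared in-house KMs":       (3, 4, 2),
        "upgrade homogeneity":       (2, 4, 2),
        "metadata exchange (OBIEE)": (3, 4, 2),
        "developer autonomy":        (4, 2, 5),
    }

    def score(arch):
        """Weighted sum for one architecture (0=centralized, 1=decentralized)."""
        return sum(weight * marks[arch] for weight, *marks in drivers.values())

    print("centralized  :", score(0))   # 91 with these assumed figures
    print("decentralized:", score(1))   # 61 with these assumed figures

With these made-up figures the centralized limit wins, but shifting the weights (say, towards developer autonomy) can easily reverse the outcome, which is precisely why the drivers have to be evaluated per case.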

Other aspects, like aligning the ODI repository architecture to a Dev/Acc/Prod organization, will also have to be taken into account, as well as sharing the metadata of each repository type (Master, Work) between these environments.
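As a sketch of that alignment, assuming the common pattern of one Master repository governing one Work repository per environment (in ODI a Work repository is of type Development or Execution; the names below are hypothetical):

    environments = {
        "DEV": {"work_repo": "WORK_DEV", "type": "Development"},
        "ACC": {"work_repo": "WORK_ACC", "type": "Execution"},
        "PRD": {"work_repo": "WORK_PRD", "type": "Execution"},
    }
    # Scenarios are promoted from one environment to the next.
    promotion_path = ["DEV", "ACC", "PRD"]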

I personally believe that whatever repository architecture is eventually retained and implemented, investing upfront time and effort in setting up naming conventions, and migration or exchange principles for KMs, projects and more, will pay for itself in the long term.
Usually sooner than expected.
Re-organizing developments afterwards to align them with a 'new' architecture definition will require much time and attention.
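To illustrate the kind of upfront investment meant here, a tiny sketch of a naming-convention check; the convention itself is a made-up example, not an ODI rule:

    import re

    # Hypothetical convention: <TYPE>_<DOMAIN>_<NAME> in upper case,
    # e.g. PRJ_SALES_DAILY_LOAD for a project.
    CONVENTION = re.compile(r"^(PRJ|MOD|KM|SCEN)_[A-Z0-9]+(_[A-Z0-9]+)+$")

    def check_name(name):
        """Return True when an ODI object name follows the convention."""
        return CONVENTION.match(name) is not None

    assert check_name("PRJ_SALES_DAILY_LOAD")
    assert not check_name("myProject")    # rejected up front, cheaply

Such a check can run as part of an export or review step; the point is that the rule exists and is agreed upon before the first project starts.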