Extracting IdM Configuration from a Running Instance
Posted by Mike Sat, 25 Apr 2009 02:16:00 GMT
An IdM customer recently found themselves in a pickle: they wanted to integrate additional resources into their IdM deployment, but did not have access to the original configuration files or CBE.
Here are the steps we took to extract configuration from their production repository:
- export configuration from running system via lh
- split export file, one file per top level element
- mangle XML objects (remove date-based attributes, GUIDs, etc.)
- repeat the above for a reference (out-of-the-box) IdM deployment
- diff XML object sets (custom deployment vs. reference deployment)
And the details...
Export Configuration
Use the lh command to extract the running configuration:
mpierson:$ export WSHOME=/var/lib/tomcat/webapps/idm
mpierson:$ $WSHOME/bin/lh console -c "export /tmp/export.xml"
Note: the export may take some time and consume significant cycles on both the host and the repository. The resulting export.xml file will be on the order of 100 MB, depending on the number of user/resource accounts.
Split Export File
To facilitate comparison of the custom configs to the reference deployment, split the export file into many files, one per top level waveset element:
mpierson:$ xsltproc split-waveset.xslt /tmp/export.xml
We used a simple XSL transform that writes each child of the top level waveset element to a file, using file names based on name or id attributes, if available.
Note: the example XSL transform will write the files to a directory called 'split', and there will potentially be many files created.
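For readers without the original stylesheet, a transform along these lines can be sketched as follows. This is a hypothetical reconstruction, not the split-waveset.xslt we used: it assumes xsltproc's support for the EXSLT exsl:document extension element, and the file-naming scheme (element name, a hyphen, then the name or id attribute with awkward characters replaced by underscores) is an assumption. The helper function write_split_xslt is illustrative only.

```shell
#!/bin/sh
# Hypothetical sketch: write a stylesheet that splits an export into one
# file per child of the top-level element, under ./split/.
# Assumes xsltproc (libxslt), which implements exsl:document.

write_split_xslt() {
    # $1: path of the stylesheet to create
    cat > "$1" <<'EOF'
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:exsl="http://exslt.org/common"
    extension-element-prefixes="exsl">
  <!-- Match the top-level element and visit each of its children. -->
  <xsl:template match="/*">
    <xsl:for-each select="*">
      <!-- Prefer the name attribute, then id, then the child's position. -->
      <xsl:variable name="key">
        <xsl:choose>
          <xsl:when test="@name"><xsl:value-of select="@name"/></xsl:when>
          <xsl:when test="@id"><xsl:value-of select="@id"/></xsl:when>
          <xsl:otherwise><xsl:value-of select="position()"/></xsl:otherwise>
        </xsl:choose>
      </xsl:variable>
      <!-- Replace space, slash, hash and colon so the key is file-safe. -->
      <exsl:document href="split/{name()}-{translate($key, ' /#:', '____')}.xml"
                     method="xml" indent="yes">
        <xsl:copy-of select="."/>
      </exsl:document>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
EOF
}
```

Calling write_split_xslt split-waveset.xslt and then running the xsltproc command above should populate the split directory. Two objects that share an element name and key would overwrite each other in this sketch, and if xsltproc cannot resolve the DOCTYPE in the export, its --novalid option skips DTD loading.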
Mangle XML Objects
Again, to facilitate comparison of the custom config objects to the reference deployment, we mangled the split XML files to remove date based attributes, owner/modifier attributes, and GUIDs:
mpierson:$ ./remove-transient-attrs.sh split-dir idm-key-prefix
... where split-dir is the name of the directory containing the XML files to be cleansed, and idm-key-prefix is the first 5-6 characters of the instance ID prefix (e.g. '123456' in "#ID#123456789ABC...")
Note: remove-transient-attrs.sh is a bash script that uses find and perl to strip the relevant attributes.
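The general shape of such a cleanup can be sketched as below. This is not the original script: it uses find with sed instead of perl, the attribute names (createDate, lastModDate, lastMod, repoMod, creator, lastModifier) are assumptions about what counts as transient in a given export, and the function names strip_transient and clean_dir are hypothetical. It removes only id attributes that start with the generated prefix, and it leaves name attributes alone.

```shell
#!/bin/sh
# Hypothetical sketch of a transient-attribute cleanup (sed, not perl).

# strip_transient PREFIX: filter XML on stdin to stdout.
strip_transient() {
    prefix=$1
    sed -E \
        -e "s/[[:space:]]+(createDate|lastModDate|repoMod|lastMod|creator|lastModifier)=('[^']*'|\"[^\"]*\")//g" \
        -e "s/[[:space:]]+id=('#ID#${prefix}[^']*'|\"#ID#${prefix}[^\"]*\")//g"
}

# clean_dir DIR PREFIX: rewrite every .xml file under DIR in place.
clean_dir() {
    dir=$1
    key=$2
    find "$dir" -type f -name '*.xml' | while IFS= read -r f; do
        # Write to a temp file first so a failed run never truncates the input.
        strip_transient "$key" < "$f" > "$f.tmp" && mv "$f.tmp" "$f"
    done
}
```

With these definitions, clean_dir split 123456 would play the role of the command shown above. Ids that do not start with the instance prefix, such as well-known object ids, are left untouched so they still line up between the two deployments.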
Repeat for Reference Deployment
Repeat steps 1 through 3 to produce a reference set of waveset objects. Ensure that your reference IdM deployment matches the version, including hotfixes, of the running instance being analyzed.
Catalog Differences in Custom Deployment
We applied diff recursively over the two sets of waveset objects:
mpierson:$ diff -N -r --brief -w reference-elements/ custom-elements/
The results of a diff will likely include a number of run-time objects, including User, Account, XmlData, and Syslog elements. Filter run-time objects from the diff for a better view of configuration:
mpierson:$ diff -N -r --brief -w reference-elements/ custom-elements/ |\
grep -v Account| grep -vi syslog |\
grep -v TaskInstance | grep -v "User-" |\
grep -v TaskResult | grep -v XmlData |\
grep -v WorkItem
We used a similar approach to produce a list of the XML objects that define the IdM configuration:
mpierson:$ diff -N -r --brief -w reference-elements/ custom-elements/ |\
grep -v Account| grep -vi syslog |\
grep -v TaskInstance | grep -v "User-" |\
grep -v TaskResult | grep -v XmlData |\
grep -v WorkItem |\
awk '{print $4;}' > files.txt
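The filter-and-extract pipeline can also be wrapped in a small function, sketched here under the assumption that the input is GNU diff --brief output. With -N, every reported difference takes the form "Files A and B differ", so the fourth field is the path in the custom tree. The function name list_config_files is hypothetical.

```shell
#!/bin/sh
# list_config_files: read `diff --brief` output on stdin, drop run-time
# objects, and print the second path of each "Files A and B differ" line.
list_config_files() {
    grep -Ev 'Account|TaskInstance|User-|TaskResult|XmlData|WorkItem' |
        grep -iv 'syslog' |
        awk '$1 == "Files" && $NF == "differ" { print $4 }'
}
```

Usage would be: diff -N -r --brief -w reference-elements/ custom-elements/ | list_config_files > files.txt. Picking the fourth field breaks if file names contain spaces, so this relies on the split step producing space-free names.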
Note: the results of this process provide a starting point for a rigorous reverse engineering endeavour. Manual inspection of the results and extensive testing are recommended!

Mat has pointed out that the sample script I refer to in the Mangle XML Objects section renders TaskDefinition objects unusable. Specifically, the name attributes in Activity elements are removed, which destroys the link to Transition elements.
I've updated the remove-transient-attrs.sh script to leave the name attribute in place.