<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/css" href="/stylesheets/rss.css"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/">
  <channel>
    <title>Tech-Blog: Category programming</title>
    <link>/articles/category/programming</link>
    <language>en-us</language>
    <ttl>40</ttl>
    <description>Mike Pierson's technology blog</description>
    <item>
      <title>Documenting Sun IDM XML Objects</title>
      <description>&lt;p&gt;&lt;i&gt;Note 1: this post details an approach for documenting the configuration objects of Sun's &lt;a href='http://www.sun.com/software/products/identity_mgr/index.xml'&gt;Identity Manager&lt;/a&gt; product.  It's probably not interesting unless you're an IDM developer.&lt;/i&gt;&lt;/p&gt;

&lt;p&gt;&lt;i&gt;Note 2: the following article applies to IDM 7.1.  Not sure about earlier or later versions.&lt;/i&gt;&lt;/p&gt;

&lt;p&gt;IDM's XML configuration objects have limited support for in-situ programmer comments.  Most high-level elements allow &amp;lt;&lt;i&gt;Comments&lt;/i&gt;&amp;gt; elements, while others support a &lt;i&gt;description&lt;/i&gt; attribute.  Herewith an approach that leverages the existing documentation aspects of the Waveset.dtd, generic XML comment notation, and the waveset XML data itself to generate IDM implementation documentation.&lt;/p&gt;

&lt;p&gt;A schematic that shows how the approach takes waveset XML objects and converts them to standard document formats:&lt;/p&gt;
&lt;p style='text-align: center; '&gt;&lt;img src='http://tech-blog.mpierson.net/files/Waveset2docbook.png'&gt;&lt;/p&gt;

&lt;p&gt;For the impatient, the XSL that transforms Waveset XML to Docbook XML is &lt;a href='http://tech-blog.mpierson.net/files/waveset2docbook.xsl'&gt;waveset2docbook.xsl&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;Documenting a Single XML Object&lt;/h3&gt;
&lt;p&gt;A recipe for generating documentation from an IDM configuration XML file:&lt;/p&gt;

&lt;ol&gt;

 &lt;li&gt;&lt;p&gt;Add a &lt;i&gt;&amp;lt;Comments&amp;gt;&lt;/i&gt; element or @description XML comment to your waveset object.  See below for the list of supported elements.&lt;/p&gt;&lt;/li&gt;

 &lt;li&gt;
  &lt;p&gt;Using your favourite XSL processor, transform the waveset XML object to &lt;a href='http://www.docbook.org/'&gt;Docbook&lt;/a&gt; format using &lt;a href='http://tech-blog.mpierson.net/files/waveset2docbook.xsl'&gt;waveset2docbook.xsl&lt;/a&gt;.&lt;br /&gt;
  I use xsltproc, e.g.:&lt;/p&gt;
  &lt;blockquote&gt;
    &lt;pre&gt;mpierson:$ xsltproc --stringparam fileName "custom/WEB-INF/config/MyResource.xml" \
  waveset2docbook.xsl custom/WEB-INF/config/MyResource.xml &gt; docs/MyResource.xml&lt;/pre&gt;
  &lt;/blockquote&gt;
  &lt;p&gt;(The &lt;i&gt;fileName&lt;/i&gt; stringparam allows the file's path in the CBE to be included in the documentation.)&lt;/p&gt;
 &lt;/li&gt;

 &lt;li&gt;
  &lt;p&gt;Add an XML declaration and DOCTYPE to the top of the generated Docbook file:&lt;/p&gt;
  &lt;pre&gt;
&amp;lt;?xml version="1.0"?&amp;gt;
&amp;lt;!DOCTYPE article
  PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN" 
  "http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd"&amp;gt;
  &lt;/pre&gt;
 &lt;/li&gt;
 
 &lt;li&gt;
   &lt;p&gt;Open the generated Docbook XML in OpenOffice using file type &lt;i&gt;Docbook&lt;/i&gt;.  This should work in versions 2.x and 3.x of OpenOffice.&lt;/p&gt;
   &lt;p&gt;- or -&lt;/p&gt;
   &lt;p&gt;Use the &lt;a href='http://docbook.sourceforge.net/'&gt;Docbook XSL stylesheets&lt;/a&gt; to convert your generated Docbook file to HTML or PDF (or &lt;a href='http://wiki.docbook.org/topic/formats'&gt;many other formats&lt;/a&gt;).&lt;/p&gt;
 &lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Using the above recipe, I've documented a sample LDAP resource adapter definition:&lt;/p&gt;
 &lt;blockquote&gt;&lt;a href='http://tech-blog.mpierson.net/files/MyResource.xml'&gt;MyResource.xml&lt;/a&gt; -&amp;gt; &lt;a href='http://tech-blog.mpierson.net/files/MyResource.dbk'&gt;MyResource.dbk&lt;/a&gt; -&amp;gt; &lt;a href='http://tech-blog.mpierson.net/files/MyResource.doc'&gt;MyResource.doc&lt;/a&gt;&lt;/blockquote&gt;

&lt;h3&gt;Documenting the Entire IDM Implementation&lt;/h3&gt;
&lt;p&gt;A more practical application of this approach to IDM docs is to produce a single 'as-built' document for all configured XML objects.  This is best achieved via a script that repeats the single-object procedure for each waveset XML object and appends each result to a single Docbook document.&lt;/p&gt;

&lt;p&gt;An example script that does just this (including the required Docbook XML declaration) is &lt;a href='http://tech-blog.mpierson.net/files/makeDocs.sh'&gt;here&lt;/a&gt;.  You'll see in the script that I like the generated docs to be included as an appendix in the project documentation.&lt;/p&gt;  
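&lt;p&gt;For illustration only, here is a minimal Python sketch of the per-object loop. It is not the linked makeDocs.sh: the function names, the &lt;i&gt;config_dir&lt;/i&gt; and &lt;i&gt;docs_dir&lt;/i&gt; parameters, and the one-output-file-per-object layout are assumptions, and stitching the results into a single Docbook document is left out.&lt;/p&gt;
&lt;pre&gt;
```python
import subprocess
from pathlib import Path


def build_xsltproc_command(stylesheet, source, output):
    """Return the xsltproc argument list for one waveset XML object."""
    return [
        "xsltproc",
        # pass the source path so it can be included in the documentation
        "--stringparam", "fileName", str(source),
        "--output", str(output),
        str(stylesheet),
        str(source),
    ]


def document_all(config_dir, docs_dir, stylesheet="waveset2docbook.xsl"):
    """Transform every *.xml file in config_dir, writing results to docs_dir."""
    docs = Path(docs_dir)
    docs.mkdir(parents=True, exist_ok=True)
    outputs = []
    for source in sorted(Path(config_dir).glob("*.xml")):
        output = docs / source.name
        # check=True stops the run at the first failed transform
        subprocess.run(build_xsltproc_command(stylesheet, source, output), check=True)
        outputs.append(output)
    return outputs
```
&lt;/pre&gt;
&lt;p&gt;Something like document_all("custom/WEB-INF/config", "docs") would then mirror the single-object recipe for every file in that directory, assuming xsltproc is on the path.&lt;/p&gt;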

&lt;h3&gt;Supported Waveset Elements&lt;/h3&gt;

&lt;p&gt;The current version of waveset2docbook.xsl supports a subset of the waveset.dtd, but includes most of the high-level elements.&lt;/p&gt;
 &lt;table cellpadding='5' cellspacing='1' border='1'&gt;
  &lt;tr&gt;
    &lt;th&gt;Element&lt;/th&gt;
    &lt;th&gt;Description Element&lt;/th&gt;
    &lt;th&gt;Other Aspects Documented&lt;/th&gt;
  &lt;/tr&gt;

  &lt;tr&gt;
    &lt;td&gt;Configuration:User Extended Attributes&lt;/td&gt;
    &lt;td&gt;none&lt;/td&gt;
    &lt;td&gt;extended attributes are enumerated&lt;/td&gt;
  &lt;/tr&gt;

  &lt;tr&gt;
    &lt;td&gt;Configuration:UserUIConfig&lt;/td&gt;
    &lt;td&gt;none&lt;/td&gt;
    &lt;td&gt;SummaryAttrNames, QueryableAttrNames, FindSearchAttrs, RepoIndexAttrs&lt;/td&gt;
  &lt;/tr&gt;

  &lt;tr&gt;
    &lt;td&gt;Configuration:Reconciliation Policy&lt;/td&gt;
    &lt;td&gt;none&lt;/td&gt;
    &lt;td&gt;
     reconciliation policy attributes (fetch timeout etc.), plus per-resource type configuration (correlation rule, confirmation rule, proxy user, etc.)
    &lt;/td&gt;
  &lt;/tr&gt;

  &lt;tr&gt;
    &lt;td&gt;LoginApp&lt;/td&gt;
    &lt;td&gt;@description&lt;/td&gt;
    &lt;td&gt;LoginModGroups are enumerated&lt;/td&gt;
  &lt;/tr&gt;

  &lt;tr&gt;
    &lt;td&gt;LoginModGroup&lt;/td&gt;
    &lt;td&gt;none&lt;/td&gt;
    &lt;td&gt;resource type, module type, control type, correlation rule, authentication parameters&lt;/td&gt;
  &lt;/tr&gt;

  &lt;tr&gt;
    &lt;td&gt;Organizations&lt;/td&gt;
    &lt;td&gt;@description&lt;/td&gt;
    &lt;td&gt;path from Top, policies&lt;/td&gt;
  &lt;/tr&gt;

  &lt;tr&gt;
    &lt;td&gt;Policy:Account Policy&lt;/td&gt;
    &lt;td&gt;&lt;i&gt;Description&lt;/i&gt; element&lt;/td&gt;
    &lt;td&gt;account ID policy, password policy&lt;/td&gt;
  &lt;/tr&gt;

  &lt;tr&gt;
    &lt;td&gt;Policy:String Quality Policy&lt;/td&gt;
    &lt;td&gt;&lt;i&gt;Description&lt;/i&gt; element&lt;/td&gt;
    &lt;td&gt;string quality policy attributes are enumerated&lt;/td&gt;
  &lt;/tr&gt;

  &lt;tr&gt;
    &lt;td&gt;Resource&lt;/td&gt;
    &lt;td&gt;@description&lt;/td&gt;
    &lt;td&gt;flat file format attributes, @prodRef for reference to IDM resource documentation, active sync attributes (proxy user, correlation rule, confirmation rule)&lt;/td&gt;
  &lt;/tr&gt;

  &lt;tr&gt;
    &lt;td&gt;Rule&lt;/td&gt;
    &lt;td&gt;&lt;i&gt;Comments&lt;/i&gt; element&lt;/td&gt;
    &lt;td&gt;rule type (correlation, confirmation, other)&lt;/td&gt;
  &lt;/tr&gt;

  &lt;tr&gt;
    &lt;td&gt;Configuration:Rule Library&lt;/td&gt;
    &lt;td&gt;none&lt;/td&gt;
    &lt;td&gt;documented rules are enumerated, including contents of the &lt;i&gt;Comments&lt;/i&gt; element in each rule&lt;/td&gt;
  &lt;/tr&gt;

  &lt;tr&gt;
    &lt;td&gt;EmailTemplate&lt;/td&gt;
    &lt;td&gt;&lt;i&gt;Comments&lt;/i&gt; element&lt;/td&gt;
    &lt;td&gt;status of 'html enabled' flag&lt;/td&gt;
  &lt;/tr&gt;

  &lt;tr&gt;
    &lt;td&gt;TaskDefinition&lt;/td&gt;
    &lt;td&gt;&lt;i&gt;Comments&lt;/i&gt; element&lt;/td&gt;
    &lt;td&gt;referenced sub-tasks, referenced forms&lt;/td&gt;
  &lt;/tr&gt;

  &lt;tr&gt;
    &lt;td&gt;Configuration:WFProcess (sub-task)&lt;/td&gt;
    &lt;td&gt;&lt;i&gt;Comments&lt;/i&gt; element&lt;/td&gt;
    &lt;td&gt;referenced sub-tasks, referenced forms&lt;/td&gt;
  &lt;/tr&gt;

  &lt;tr&gt;
    &lt;td&gt;Configuration:Custom Catalog&lt;/td&gt;
    &lt;td&gt;none&lt;/td&gt;
    &lt;td&gt;name and value of each message is listed&lt;/td&gt;
  &lt;/tr&gt;

  &lt;tr&gt;
    &lt;td&gt;TaskSchedule&lt;/td&gt;
    &lt;td&gt;&lt;i&gt;Description&lt;/i&gt; element&lt;/td&gt;
    &lt;td&gt;task to run, repetition count and unit&lt;/td&gt;
  &lt;/tr&gt;

  &lt;tr&gt;
    &lt;td&gt;User&lt;/td&gt;
    &lt;td&gt;@description&lt;/td&gt;
    &lt;td&gt;user form, admin groups, organizations&lt;/td&gt;
  &lt;/tr&gt;

  &lt;tr&gt;
    &lt;td&gt;User Form&lt;/td&gt;
    &lt;td&gt;&lt;i&gt;Comments&lt;/i&gt; element&lt;/td&gt;
    &lt;td&gt;referenced forms&lt;/td&gt;
  &lt;/tr&gt;

 &lt;/table&gt;
&lt;p style='margin-bottom: 2em'&gt;&lt;/p&gt;

&lt;h3&gt;OpenOffice And Docbook&lt;/h3&gt;

&lt;p&gt;It's worth noting that OpenOffice supports Docbook 'out-of-the-box', but &lt;a href='http://xml.openoffice.org/xmerge/docbook/supported_tag_table.html'&gt;not all elements are supported&lt;/a&gt;.  I've adapted the docbook XSL filter from OpenOffice 3.0 to do a better job of rendering the &lt;i&gt;&amp;lt;literallayout&amp;gt;&lt;/i&gt; elements generated by waveset2docbook.xsl. Download it &lt;a href='http://tech-blog.mpierson.net/files/docbooktosoffheadings.xsl'&gt;here&lt;/a&gt;; your results may vary.  The OpenOffice site has instructions for &lt;a href='http://xml.openoffice.org/xmerge/docbook/'&gt;customizing the XML filters&lt;/a&gt;, but I just used Tools -&amp;gt; XML Filter Settings in OO 3.0.&lt;/p&gt;

</description>
      <pubDate>Fri, 23 Jan 2009 13:23:00 -0500</pubDate>
      <guid isPermaLink="false">urn:uuid:1bee94eb-7885-473a-8148-5d5ab1ad6ff3</guid>
      <author>Mike</author>
      <link>/articles/2009/01/23/documenting-sun-idm-xml-objects</link>
      <category>programming</category>
      <category>idm</category>
      <category>docbook</category>
      <category>xsl</category>
      <trackback:ping>/articles/trackback/443</trackback:ping>
    </item>
    <item>
      <title>History</title>
      <description>&lt;p&gt;Here I give in to a Silly Internet Meme (&lt;a href='http://www.tbray.org/ongoing/When/200x/2008/04/15/History-Meme'&gt;Tim&lt;/a&gt; made me do it).&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;
mpierson@macbook:~$  history|\
awk '{a[$2]++} END{for(i in a){printf "%5d\t%s \n",a[i],i}}'|\
sort -rn|head
  195	ls 
  121	g 
   83	svn 
   75	cd 
   71	sudo 
   50	less 
   32	grep 
   28	xbacklight 
   22	ssh 
   20	./update.sh 

&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;&amp;#x201c;g&amp;#x201d; is an alias for &lt;code&gt;gvim&lt;/code&gt; and &amp;#x201c;update.sh&amp;#x201d; is an IDM build script.&lt;/p&gt;

&lt;p&gt;... and I've added Tim's &amp;#x201c;lh&amp;#x201d; to the macbook&lt;/p&gt;
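
&lt;p&gt;For anyone who doesn't read awk, the pipeline above amounts to roughly the following Python sketch. The function name and the &lt;i&gt;limit&lt;/i&gt; parameter are made up for illustration, and unlike the awk version it skips lines that have no command field.&lt;/p&gt;
&lt;pre&gt;
```python
from collections import Counter


def top_commands(history_lines, limit=10):
    """Count the second whitespace-separated field of each history line."""
    counts = Counter()
    for line in history_lines:
        fields = line.split()
        # fields[0] is the history number, fields[1] the command name
        if fields[1:]:
            counts[fields[1]] += 1
    # most frequent first, like sort -rn piped to head
    return counts.most_common(limit)
```
&lt;/pre&gt;
&lt;p&gt;Feeding it the output of the history builtin, one entry per line, gives back (command, count) pairs ready for printing.&lt;/p&gt;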

</description>
      <pubDate>Wed, 16 Apr 2008 09:34:00 -0400</pubDate>
      <guid isPermaLink="false">urn:uuid:2092be74-1491-40b5-b1a8-4f249e545390</guid>
      <author>Mike</author>
      <link>/articles/2008/04/16/history</link>
      <category>programming</category>
      <category>bash</category>
      <category>meme</category>
      <trackback:ping>/articles/trackback/407</trackback:ping>
    </item>
    <item>
      <title>Web 2.0</title>
      <description>&lt;p&gt;As Tim O'Reilly and Tim Bray &lt;a href="http://radar.oreilly.com/archives/2005/08/not_20.html"&gt;say&lt;/a&gt;: 'there's still a huge amount of disagreement about just what Web 2.0 means'.  Herewith, my summary of O'Reilly's piece &lt;a href="http://www.oreillynet.com/pub/a/oreilly/tim/news/2005/09/30/what-is-web-20.html"&gt;What Is Web 2.0&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;O'Reilly describes principles shared by 'Web 1.0' successes and interesting recent applications. See the &lt;a href="http://www.oreillynet.com/oreilly/tim/news/2005/09/30/graphics/figure1.jpg"&gt;meme map&lt;/a&gt; that came out of a brainstorming session at a FOO Camp conference.&lt;/p&gt;
&lt;ol&gt;
 &lt;li&gt;
  &lt;h3&gt;The Web As Platform&lt;/h3&gt;
  &lt;p&gt;Web as platform is an old idea but its implementation has been refined.  See Netscape vs. Google, DoubleClick vs. AdSense, Akamai vs. BitTorrent.&lt;/p&gt;
 &lt;/li&gt;
 &lt;li&gt;
  &lt;h3&gt;Harnessing Collective Intelligence&lt;/h3&gt;
  &lt;p&gt;Open Source software, open content, collaborative categorization, and viral marketing all rely on a collective intelligence.  Site attributes such as extensive (permanent) hyperlinks, low barriers to participation, organized content and meta data facilitate or enhance the effect of collective intelligence. Blogs are a special case of collective intelligence (and RSS a special attribute) in that the collective intelligence only emerges from a critical mass of blogs/articles.&lt;/p&gt;
 &lt;/li&gt;
 &lt;li&gt;
  &lt;h3&gt;Data is the Next Intel Inside&lt;/h3&gt;
  &lt;p&gt;Based on the way they approached their databases, MapQuest is a Web 1.0 story and Amazon is a Web 2.0 story.  MapQuest licensed map data from Tele Atlas, but did not enhance (e.g. user annotations) or control the data.  Amazon licensed ISBN data from R.R. Bowker and enhanced the data with data from publishers and customers.  MapQuest was soon joined in the marketplace by competing services (Yahoo, Google, MSN) and Amazon is the standard source for bibliographic data.&lt;/p&gt;
 &lt;/li&gt;
 &lt;li&gt;
  &lt;h3&gt;End of the Software Release Cycle&lt;/h3&gt;
  &lt;p&gt;In Web 2.0, software is delivered as a service, not a product.&lt;/p&gt;
  &lt;p&gt; O'Reilly suggests a number of fundamental changes to the business model of software companies.&lt;/p&gt;
  &lt;ul&gt;
   &lt;li&gt;&lt;i&gt;Operations must become a core competency.&lt;/i&gt;  Google has become expert at managing the servers that deliver its web services, and that expertise is closely guarded.&lt;/li&gt;
   &lt;li&gt;&lt;i&gt;Users must be treated as co-developers.&lt;/i&gt;  Release early and often (daily, hourly) and/or a perpetual beta.  Real time monitoring of user behaviour. &lt;/li&gt;
  &lt;/ul&gt;
 &lt;/li&gt;
 &lt;li&gt;
  &lt;h3&gt;Lightweight Programming Models&lt;/h3&gt;
  &lt;p&gt;Simple, lightweight service interfaces appear to be successful with the masses (i.e. the intelligent collective).  (One assumes that housingmaps.com enhances the value of Google maps?)&lt;/p&gt;
  &lt;p&gt;Three lessons identified:&lt;/p&gt;
  &lt;ul&gt;
   &lt;li&gt;Support lightweight programming models that allow for loosely coupled systems&lt;/li&gt;
   &lt;li&gt;Think syndication, not coordination&lt;/li&gt;
   &lt;li&gt;Design for 'hackability' and remixability&lt;/li&gt;
  &lt;/ul&gt;
 &lt;/li&gt;
 &lt;li&gt;
  &lt;h3&gt;Software Above the Level of a Single Device&lt;/h3&gt;
  &lt;p&gt;iTunes, TiVo, BlackBerry...&lt;/p&gt;
 &lt;/li&gt;
 &lt;li&gt;
  &lt;h3&gt;Rich User Experiences&lt;/h3&gt;
  &lt;p&gt;Google/Flickr/Basecamp are at the forefront, but Yahoo and others have made AJAX the basis for major product releases.&lt;/p&gt;
 &lt;/li&gt;
&lt;/ol&gt;

 &lt;p&gt;O'Reilly finishes with a summary of the core competencies of a Web 2.0 company:&lt;/p&gt;
 &lt;ul&gt;
  &lt;li&gt;Services, not packaged software, with cost-effective scalability&lt;/li&gt;
  &lt;li&gt;Control over unique, hard-to-recreate data sources that get richer as more people use them&lt;/li&gt;
  &lt;li&gt;Trusting users as co-developers&lt;/li&gt;
  &lt;li&gt;Harnessing collective intelligence&lt;/li&gt;
  &lt;li&gt;Leveraging the long tail through customer self-service&lt;/li&gt;
  &lt;li&gt;Software above the level of a single device&lt;/li&gt;
  &lt;li&gt;Lightweight user interfaces, development models, AND business models&lt;/li&gt;
 &lt;/ul&gt;
  

</description>
      <pubDate>Mon, 03 Oct 2005 15:24:00 -0400</pubDate>
      <guid isPermaLink="false">urn:uuid:a0a9ab60-db00-44a3-8c40-62401c2eaf76</guid>
      <author>Mike</author>
      <link>/articles/2005/10/03/web-2-0</link>
      <category>web</category>
      <category>programming</category>
      <category>Google</category>
    </item>
    <item>
      <title>McGrath on Documentation</title>
      <description>
&lt;p&gt;Go read this short &lt;a href="http://www.itworld.com/AppDev/902/nls_ebizluke050412/index.html"&gt;article&lt;/a&gt; by &lt;a href="http://seanmcgrath.blogspot.com/"&gt;Sean McGrath&lt;/a&gt; on the subject of test driven documentation.  Unit tests as documentation is not what Knuth had in mind when he coined the phrase &lt;a href="http://en.wikipedia.org/wiki/Literate_programming"&gt;Literate Programming&lt;/a&gt;, but it&amp;rsquo;s a step in the right direction.&lt;/p&gt;

</description>
      <pubDate>Tue, 12 Apr 2005 11:30:00 -0400</pubDate>
      <guid isPermaLink="false">urn:uuid:141b9391771dce0d9f6f0a4fb7224bc4</guid>
      <author>mop</author>
      <link>/articles/2005/04/12/literate-programming</link>
      <category>programming</category>
      <trackback:ping>/articles/trackback/118</trackback:ping>
    </item>
    <item>
      <title>XDoclet code generation</title>
      <description>

&lt;p&gt;X, as in eXtreme, not XML.  &lt;a href="http://xdoclet.sourceforge.net/xdoclet/index.html"&gt;XDoclet&lt;/a&gt; leverages metadata encoded within Java classes as Javadocs, generating content (Java classes, JSPs, etc.) as part of a build process.  The model is well suited to EJBs, Struts, as well as mixed content (generated plus hand crafted) files.  XDoclet is also easy to apply in ad-hoc situations.&lt;/p&gt;
&lt;p&gt;The premise is simple enough:  put a custom javadoc tag in a Java source file then apply an XDoclet transform to produce a helper class, a JSP, a unit test, whatever.  The transform can be as simple as an XDt template that references the custom javadoc tag, or a custom Java-based processor that applies complex logic to the tag-encoded metadata.&lt;/p&gt;

</description>
      <pubDate>Mon, 29 Nov 2004 20:43:00 -0500</pubDate>
      <guid isPermaLink="false">urn:uuid:3df8b04d2e214a82539cf9da33c9d711</guid>
      <author>mop</author>
      <link>/articles/2004/11/29/xdoclet</link>
      <category>programming</category>
      <trackback:ping>/articles/trackback/137</trackback:ping>
    </item>
    <item>
      <title>Web quickies</title>
      <description>
&lt;p&gt;Some tidbits from my &lt;a href="http://www.bloglines.com/"&gt;Bloglines&lt;/a&gt; RSS &lt;a href="http://www.bloglines.com/public/mpierson"&gt;subscriptions&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;&lt;a href="http://www.jutils.com/"&gt;Lint4J&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;"Lint4j ("Lint for Java") is a static Java source code analyzer that detects locking and threading issues, performance and scalability problems, and checks complex contracts such as Java serialization by performing type, data flow, and lock graph analysis."&lt;/p&gt;

&lt;h3&gt;&lt;a href="http://www.jot.com/"&gt;JotSpot&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;It&amp;rsquo;s Wiki++.  Typical intranet functionality is available to Wiki users. As seen on &lt;a href="http://weblog.infoworld.com/udell/"&gt;Jon Udell&amp;rsquo;s blog&lt;/a&gt;.  (I&amp;rsquo;ve added Jon&amp;rsquo;s blog to my roll.)&lt;/p&gt;

&lt;h3&gt;&lt;a href="http://www.pragmaticautomation.com/cgi-bin/pragauto.cgi/Monitor/BlogYourBuild.rdoc"&gt;Blogging Your Build&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Blogs aren&amp;rsquo;t just for people; your processes should be blogging too.  Oh yeah.&lt;/p&gt;

&lt;h3&gt;&lt;a href="http://wiki.osuosl.org/display/LNX/Debian+on+Dell+Servers"&gt;Debian on Dell Servers&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;ISOs and pointers for those brave enough to run Dell servers.&lt;/p&gt;

</description>
      <pubDate>Mon, 01 Nov 2004 18:22:00 -0500</pubDate>
      <guid isPermaLink="false">urn:uuid:ba0883a746797eeb121c87ea83b1f45b</guid>
      <author>mop</author>
      <link>/articles/2004/11/01/htmlQuickies-2</link>
      <category>blogs</category>
      <category>web</category>
      <category>Linux</category>
      <category>programming</category>
      <trackback:ping>/articles/trackback/112</trackback:ping>
    </item>
    <item>
      <title>Google-like searches with Lucene</title>
      <description>
&lt;p&gt;&lt;a href="http://jakarta.apache.org/lucene/"&gt;Lucene&lt;/a&gt; is a Java system for "high-performance, full-featured text search".  The software appears to be mature, and the community has produced a fair bit of &lt;a href="http://wiki.apache.org/jakarta-lucene"&gt;documentation&lt;/a&gt;.  A replacement for RDBMS-based searches?&lt;/p&gt;

&lt;p&gt;No doubt that the &lt;i&gt;searching&lt;/i&gt; is more intuitive, and would make it easier for users to perform keyword searches.  Not sure that a Google-like engine could match RDBMS for field-based searching and fancy list navigation.&lt;/p&gt;

&lt;ul&gt;
 &lt;li&gt;&lt;a href="http://www.companywebstore.de/tangentum/mirror/en/products/phonetix/index.html"&gt;Phonetix&lt;/a&gt; integrates phonetic algorithms into Lucene&lt;/li&gt;
 &lt;li&gt;&lt;a href="http://www.getopt.org/luke/"&gt;Luke&lt;/a&gt; provides a high level interface (Java and GUI) to Lucene&amp;rsquo;s generated indexes&lt;/li&gt;
 &lt;li&gt;limited &lt;a href="http://jakarta.apache.org/lucene/docs/benchmarks.html"&gt;benchmarks&lt;/a&gt; are available&lt;/li&gt;
&lt;/ul&gt;

</description>
      <pubDate>Wed, 20 Oct 2004 12:30:00 -0400</pubDate>
      <guid isPermaLink="false">urn:uuid:46b228e9f3e47c542d3742aa876f99bb</guid>
      <author>mop</author>
      <link>/articles/2004/10/20/lucene</link>
      <category>programming</category>
      <category>Google</category>
      <trackback:ping>/articles/trackback/120</trackback:ping>
    </item>
    <item>
      <title>HTML quickies</title>
      <description>
&lt;p&gt;Some clever solutions to file for a rainy day...&lt;/p&gt;

&lt;h3&gt;Javascript popup object&lt;/h3&gt;

&lt;p&gt;Matt Kruse seems to have done a good job creating a flexible &lt;a href="http://www.mattkruse.com/javascript/popupwindow/"&gt;Javascript object for browser popups&lt;/a&gt;.  It supports tool-tip style boxes, as well as traditional pop-up windows.&lt;/p&gt;
&lt;p&gt;&lt;a href="http://norman.walsh.name/"&gt;Norm Walsh&lt;/a&gt;, the don of DocBook, mentioned Matt&amp;rsquo;s work in his discussion of &lt;a href="http://norman.walsh.name/2004/09/10/annotations"&gt;DocBook annotations&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;Tag Soup&lt;/h3&gt;

&lt;p&gt;John Cowan wrote a SAX compatible parser for &amp;lsquo;nasty and brutish&amp;rsquo; HTML, called &lt;a href="http://mercury.ccil.org/~cowan/XML/tagsoup/"&gt;Tag Soup&lt;/a&gt;.  This lenient parser takes poorly formatted HTML snippets and parses them into a valid tree.  Seems like a must-have for any web application that allows users to enter HTML mark-up.&lt;/p&gt;
&lt;p&gt;Norm Walsh uses Tag Soup to parse comments authored by visitors to his blog.  Interesting that even the comments to Norm&amp;rsquo;s blog are syndicated.&lt;/p&gt;

</description>
      <pubDate>Fri, 10 Sep 2004 11:47:00 -0400</pubDate>
      <guid isPermaLink="false">urn:uuid:b20d1db9de494e26f24f213ceb035aa2</guid>
      <author>mop</author>
      <link>/articles/2004/09/10/htmlQuickies</link>
      <category>web</category>
      <category>programming</category>
      <trackback:ping>/articles/trackback/114</trackback:ping>
    </item>
    <item>
      <title>Converting a MS SQL Server database to PostgreSQL</title>
      <description>

&lt;p&gt;Herewith some notes from my attempt to migrate a database instance from Microsoft SQL Server 7 to PostgreSQL 7.3.  My journey began with Ian Harding&amp;rsquo;s &lt;a href="http://techdocs.postgresql.org/techdocs/sqlserver2pgsql.php"&gt;how-to&lt;/a&gt;, and it&amp;rsquo;s a good place to start.&lt;/p&gt;

&lt;h3&gt;export from SQL Server&lt;/h3&gt;
&lt;p&gt;The bcp utility is a quick and flexible command line tool that extracts raw table data (or a query result) to a file. It works pretty much as advertised, with the only tricky parts being the treatment of nulls and character encoding.  Ian suggested using the &lt;i&gt;-k&lt;/i&gt; parameter, which forces bcp to use a null character (\x00) to represent an empty field; it&amp;rsquo;s probably the right thing to do. Unfortunately bcp does not distinguish between empty fields (i.e. the value is null) and fields containing an empty string.  Character encoding can be dealt with in two ways: the &lt;i&gt;-c&lt;/i&gt; parameter will force all text data into ASCII text, or the &lt;i&gt;-w&lt;/i&gt; parameter will encode text as UTF-16.  The two-byte representation would be a no-brainer, except that the data swells to (almost) twice the original size.&lt;/p&gt;
&lt;p&gt;Here&amp;rsquo;s what I used for each table in the database:&lt;/p&gt;
&lt;pre&gt;
 bcp dbname..tablename out 'filename' -w -k -t "&amp;lt;f-end&amp;gt;" -r "&amp;lt;record-end&amp;gt;" -b 1000
&lt;/pre&gt;
&lt;p&gt;where &lt;i&gt;-b&lt;/i&gt; is the number of rows per transaction, and the &lt;i&gt;-t&lt;/i&gt; and &lt;i&gt;-r&lt;/i&gt; parameters indicate the field and record delimiters.  The key when choosing delimiters is to avoid conflicts with field values.&lt;/p&gt;


&lt;h3&gt;mangle the exported data&lt;/h3&gt;
&lt;p&gt;Here&amp;rsquo;s what I did to the export of each table (after moving files to a Linux box):&lt;/p&gt;
&lt;pre&gt;
 # transform to 8 bit encoding
 recode utf-16..utf-8 $1

 # TODO check for literal '\n', '\t'

 # replace back slash with forward slash
 perl -pi -e 's!\\!/!g' $1

 # replace tabs with literal '\t'
 perl -pi -e 's/\t/\\t/g' $1
 # replace line breaks with literal '\n'
 perl -pi -e 's/\n/\\n/g' $1

 # replace field delimiter with tabs
 perl -pi -e 's/&amp;lt;f-end&amp;gt;/\t/g' $1
 # replace record delimiters with line break
 perl -pi -e 's/&amp;lt;record-end&amp;gt;/\n/g' $1

 # remove Windoze carriage returns
 perl -pi -e 's/\r//g' $1

 # remove nulls
 perl -pi -e 's/\x00//g' $1
&lt;/pre&gt;

&lt;p&gt;Here&amp;rsquo;s the step by step explanation:&lt;/p&gt;
 &lt;ul&gt;
  &lt;li&gt;bcp exports Unicode using UTF-16, PostgreSQL expects UTF-8, and UTF-8 is easier to move around via SCP&lt;/li&gt;
  &lt;li&gt;the backslash character is significant when importing into PostgreSQL, and I couldn&amp;rsquo;t think of a reason to keep backslashes in a field value&lt;/li&gt;
  &lt;li&gt;PostgreSQL uses the backslash to encode tabs and line breaks within field values&lt;/li&gt;
  &lt;li&gt;obviously tabs and line breaks are used as delimiters&lt;/li&gt;
  &lt;li&gt;just housekeeping, I don&amp;rsquo;t think the carriage returns cause a problem&lt;/li&gt;
  &lt;li&gt;nulls seem to confuse PostgreSQL&amp;rsquo;s import process&lt;/li&gt;
 &lt;/ul&gt;

&lt;p&gt;Notes: next time around I&amp;rsquo;ll use sed instead of perl; this time I was too lazy to check the syntax of &lt;i&gt;recode&lt;/i&gt; for stream ops.  You&amp;rsquo;ll see that the null characters inserted by &lt;i&gt;bcp&lt;/i&gt; to represent empty fields are being stripped.  It could be that we don&amp;rsquo;t need the nulls in the exports, or that we should keep them in the export and convince PostgreSQL that they are significant.&lt;/p&gt;
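
&lt;p&gt;As a rough cross-check, the same sequence of substitutions can be sketched as a single Python function. This is an illustration rather than what I ran: the function and parameter names are made up, it works on text that has already been decoded from UTF-16, and the two delimiters are whatever was passed to the &lt;i&gt;-t&lt;/i&gt; and &lt;i&gt;-r&lt;/i&gt; parameters of bcp.&lt;/p&gt;
&lt;pre&gt;
```python
def bcp_to_copy_text(raw, field_delim, record_delim):
    """Apply the export clean-up steps, in order, to decoded bcp output."""
    text = raw.replace("\\", "/")            # backslash is significant to COPY
    text = text.replace("\t", "\\t")         # tabs inside field values
    text = text.replace("\n", "\\n")         # line breaks inside field values
    text = text.replace(field_delim, "\t")   # real field separators
    text = text.replace(record_delim, "\n")  # real record separators
    text = text.replace("\r", "")            # stray carriage returns
    text = text.replace("\x00", "")          # nulls written by bcp -k
    return text
```
&lt;/pre&gt;
&lt;p&gt;The order matters: tabs and line breaks that belong to field values have to be escaped before the delimiters are turned into real tabs and line breaks.&lt;/p&gt;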

&lt;h3&gt;create PostgreSQL schema&lt;/h3&gt;
&lt;p&gt;I used brute force.  It would be nice to build a &lt;i&gt;schema.sql&lt;/i&gt; script with ant and &lt;a href="http://blog.intouch.ca/mpierson/pyblosxom.cgi/makedata.html"&gt;makedata&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;import into PostgreSQL&lt;/h3&gt;
&lt;p&gt;The &lt;a href="http://www.postgresql.org/docs/7.3/static/sql-copy.html"&gt;COPY&lt;/a&gt; command allows table data to be imported from a local file.  The only tricky part is the interpretation of null field values; following Ian&amp;rsquo;s lead I&amp;rsquo;ve specified the empty string:&lt;/p&gt;
&lt;pre&gt;
 COPY tablename FROM 'filename' WITH NULL AS '';
&lt;/pre&gt;
&lt;p&gt;This approach worked for all tables except those that contained empty strings in columns defined as &amp;rsquo;NOT NULL&amp;rsquo;.  I kluged these tables by altering the schema: "... ALTER COLUMN xxx DROP NOT NULL".&lt;/p&gt;

&lt;h3&gt;update&lt;/h3&gt;
&lt;p&gt;It&amp;rsquo;s probably also a good idea to run the maintenance.sql script to clean up some tables before extracting.  Smaller is better.&lt;/p&gt;

</description>
      <pubDate>Wed, 11 Aug 2004 12:46:00 -0400</pubDate>
      <guid isPermaLink="false">urn:uuid:25f558fd59b12d8d2f7ca71efa42fda2</guid>
      <author>mop</author>
      <link>/articles/2004/08/11/sqlServer2PostgreSQL</link>
      <category>Linux</category>
      <category>programming</category>
      <trackback:ping>/articles/trackback/134</trackback:ping>
    </item>
    <item>
      <title>A brief survey of Java IDEs</title>
      <description>

&lt;p&gt;Herewith a record of my reaction to four Java IDEs.&lt;/p&gt;

&lt;h3&gt;&lt;a href="http://www.eclipse.org/"&gt;Eclipse&lt;/a&gt;&lt;/h3&gt;

&lt;p&gt;I&amp;rsquo;ve used Eclipse at various times to do some code quality analysis (managing &lt;i&gt;import&lt;/i&gt; statements mostly).  In the past it has been cumbersome to use, especially for our Ant/CVS based projects.  It&amp;rsquo;s getting better. Of course my hardware is better than it used to be;  IDEs certainly benefit from a beefy CPU and big monitor.  The CVS integration is pretty slick even when dealing with branches and remote repositories.  The integration with Ant is still a bit tricky, although I&amp;rsquo;m not sure there is a nice way to mesh Ant builds with a built-in compiler.  The built-in compiler appears similar to Jikes (maybe same code base?); perhaps there are plug-ins that offer a more in-depth analysis. I&amp;rsquo;ve yet to give the debugger in Eclipse a run-through.&lt;/p&gt;

&lt;p&gt;Overall rating: 7/10.&lt;/p&gt;

&lt;h3&gt;&lt;a href="http://www.borland.com/jbuilder/"&gt;JBuilder&lt;/a&gt;&lt;/h3&gt;

&lt;p&gt;Note that I was prejudiced against JBuilder because it&amp;rsquo;s a commercial product that offers no obvious advantage over the open source alternatives. After filling out several pledge-my-first-born registration forms and downloading 100MB of files, the installation process was painless.  Creating projects is intuitive and JBuilder is the only IDE I know of that is explicit with respect to character encoding -- smart.  Unfortunately I could not get JBuilder to check out the example project from CVS, so I did limited testing.  The web site implied that the IDE would integrate with Borland&amp;rsquo;s OptimizeIt tool -- I couldn&amp;rsquo;t verify this because of the CVS issue, but I did try OptimizeIt outside the IDE.  Profiling is probably a good idea if you have oodles of time and a lot of patience (I have neither).  The standard version of JBuilder is $500/seat, the enterprise version (EJBs etc) is $3500.&lt;/p&gt;

&lt;p&gt;Overall rating: 5/10.&lt;/p&gt;

&lt;h3&gt;&lt;a href="http://www.jetbrains.com/idea/"&gt;IntelliJ IDEA&lt;/a&gt;&lt;/h3&gt;

&lt;p&gt;I&amp;rsquo;ve heard IntelliJ referred to as the Cadillac of Java IDEs.  It has the nicest visuals of any of the candidates -- nice widgets, decent sized icons, and pleasant colours are important if you&amp;rsquo;re staring at them all day. IntelliJ integrates well with Ant and CVS, although there is a bug in the CVS client that corrupts jar files on checkout.  The big selling feature of IntelliJ is the set of code inspection tools: detection of unused methods and members, complexity measurements, scope analysis, etc.  Most of the issues identified by the inspection tools are presented with suggested fixes.  Impressive but overwhelming.  IntelliJ IDEA is $500/seat.&lt;/p&gt;

&lt;p&gt;Overall rating: 8/10.&lt;/p&gt;


&lt;h3&gt;&lt;a href="http://www.netbeans.org/"&gt;NetBeans&lt;/a&gt;&lt;/h3&gt;

&lt;p&gt;NetBeans is the IDE backed by Sun, and I hear there are some mind-altering plug-ins on the way from Tim Bray, but it&amp;rsquo;s not ready for prime time yet.  The file system paradigm used for projects was not a good fit with our CVS project, and the compiler was unable to build for lack of memory. I have heard that the NetBeans debugger is head and shoulders above the rest.&lt;/p&gt;

&lt;p&gt;Overall rating: 5/10.&lt;/p&gt;

</description>
      <pubDate>Wed, 28 Jul 2004 23:47:00 -0400</pubDate>
      <guid isPermaLink="false">urn:uuid:e38f1157fbba82a9997b885e7dd95cf4</guid>
      <author>mop</author>
      <link>/articles/2004/07/28/ideReview</link>
      <category>programming</category>
      <trackback:ping>/articles/trackback/116</trackback:ping>
    </item>
  </channel>
</rss>

