<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>TheContentGuy &#187; DITA</title>
	<atom:link href="http://thecontentguy.net/blog/category/dita/feed/" rel="self" type="application/rss+xml" />
	<link>http://thecontentguy.net</link>
	<description>all things unstructured</description>
	<lastBuildDate>Sat, 24 Jul 2010 05:00:00 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Webcast: DITA, Metadata Maturity &amp; the Case for Taxonomy</title>
		<link>http://thecontentguy.net/blog/2009/08/31/webcast-dita-metadata-maturity-the-case-for-taxonomy/</link>
		<comments>http://thecontentguy.net/blog/2009/08/31/webcast-dita-metadata-maturity-the-case-for-taxonomy/#comments</comments>
		<pubDate>Mon, 31 Aug 2009 23:59:10 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[DITA]]></category>
		<category><![CDATA[ECM]]></category>
		<category><![CDATA[XML]]></category>
		<category><![CDATA[search]]></category>
		<category><![CDATA[Earley & Associates]]></category>
		<category><![CDATA[maturity]]></category>
		<category><![CDATA[metadata]]></category>
		<category><![CDATA[taxonomy]]></category>
		<category><![CDATA[webcast]]></category>

		<guid isPermaLink="false">http://thecontentguy.net/blog/?p=454</guid>
		<description><![CDATA[Our research confirms that organizations that use XML authoring are more mature than their peers with respect to the adoption of best practices for search and metadata. However, the use of native DITA metadata capabilities is rare, and many are also missing out on opportunities to use taxonomy for reuse and improved findability. This webcast will explore the metadata capabilities within DITA and component content management systems, discuss two major benefits that can be achieved by using descriptive metadata and taxonomy, and recommend some best practices for getting started with metadata for component-oriented content.]]></description>
			<content:encoded><![CDATA[<p><strong>Webcast September 02, 2009, 1:00 &#8211; 2:00 PM EDT</strong><br />
$50 <a title="Register for DITA webcast" href="http://www.earley.com/webinars/dita" target="_blank">Register here</a></p>
<p><strong>Speakers</strong><br />
Robert Berry, Information Developer, IBM<br />
Michael Harris, Information Architect, IBM<br />
Erik Hennum, Information Model Architect, IBM<br />
Paul Wlodarczyk, Director Solutions Consulting, Earley &amp; Associates</p>
<p>Many organizations have turned to component-oriented content to create more sophisticated knowledge products, in more languages, at lower cost. For most organizations these days, component content is achieved by using DITA, the Darwin Information Typing Architecture. Finding content in your file system or content repository is hard enough when you’ve got simple text documents to deal with. When you’re using DITA and other component-oriented methods, you increase the difficulty by two or three orders of magnitude, because you’re looking for smaller needles in bigger haystacks. It’s logical that DITA users would turn to taxonomy and metadata to improve findability of their reusable content.</p>
<p>Our research confirms that organizations that use XML authoring are more mature than their peers with respect to the adoption of best practices for search and metadata. However, the use of native DITA metadata capabilities is rare, and many are also missing out on opportunities to use taxonomy for reuse and improved findability. We will explore the metadata capabilities within DITA and component content management systems, discuss two major benefits that can be achieved by using descriptive metadata and taxonomy, and recommend some best practices for getting started with metadata for component-oriented content.</p>
<p>Check out the preview on SlideShare:</p>
<div id="__ss_1892178" style="text-align: left; width: 425px;"><a style="font:14px Helvetica,Arial,Sans-serif;display:block;margin:12px 0 3px 0;text-decoration:underline;" title="September 2 Taxonomy CoP: DITA, Metadata Maturity, &amp; the Case for Taxonomy" href="http://www.slideshare.net/Earley/september-2-taxonomy-cop-dita-metadata-maturity-the-case-for-taxonomy">September 2 Taxonomy CoP: DITA, Metadata Maturity, &amp; the Case for Taxonomy</a><object width="425" height="355" data="http://static.slidesharecdn.com/swf/ssplayer2.swf?doc=taxocop-dita-preview-08-18-09-090821181316-phpapp01&amp;stripped_title=september-2-taxonomy-cop-dita-metadata-maturity-the-case-for-taxonomy" type="application/x-shockwave-flash"><param name="allowFullScreen" value="true" /><param name="allowScriptAccess" value="always" /><param name="src" value="http://static.slidesharecdn.com/swf/ssplayer2.swf?doc=taxocop-dita-preview-08-18-09-090821181316-phpapp01&amp;stripped_title=september-2-taxonomy-cop-dita-metadata-maturity-the-case-for-taxonomy" /><param name="allowfullscreen" value="true" /></object></div>
<div style="font-family: tahoma,arial; height: 26px; font-size: 11px; padding-top: 2px;">View more <a style="text-decoration:underline;" href="http://www.slideshare.net/">presentations</a> from <a style="text-decoration:underline;" href="http://www.slideshare.net/Earley">Earley</a>.</div>
]]></content:encoded>
			<wfw:commentRss>http://thecontentguy.net/blog/2009/08/31/webcast-dita-metadata-maturity-the-case-for-taxonomy/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>SDL Announces the Acquisition of XyEnterprise, New Unit is SDL XySoft</title>
		<link>http://thecontentguy.net/blog/2009/06/29/sdl-announces-the-acquisition-of-xyenterprise/</link>
		<comments>http://thecontentguy.net/blog/2009/06/29/sdl-announces-the-acquisition-of-xyenterprise/#comments</comments>
		<pubDate>Mon, 29 Jun 2009 16:39:32 +0000</pubDate>
		<dc:creator>paulwlodarczyk</dc:creator>
				<category><![CDATA[DITA]]></category>
		<category><![CDATA[ECM]]></category>
		<category><![CDATA[XML]]></category>
		<category><![CDATA[SDL]]></category>
		<category><![CDATA[XyEnterprise]]></category>

		<guid isPermaLink="false">http://thecontentguy.net/blog/?p=381</guid>
		<description><![CDATA[The merger creates organizational stability, a seasoned and unified management team, a strong combined customer base, complete product offerings, and deep XML technical expertise.  ]]></description>
			<content:encoded><![CDATA[<p><img class="alignleft size-full wp-image-385" title="sdlxysoft" src="http://thecontentguy.net/wp-content/uploads/2009/06/sdlxysoft.png" alt="sdlxysoft" width="243" height="48" />Today I learned that SDL has acquired XyEnterprise, maker of the Contenta XML content management suite and the XPP publishing platform.  This is a marriage of what many of us consider to be two of the strongest XML component content management platforms in the industry: Trisoft, a more recent entrant with a formidable offering that is particularly strong in DITA; and Contenta, which also has great DITA support along with a unique set of capabilities for S1000D and custom schemas / DTDs, and a tight, powerful publishing integration.</p>
<p>This is a very exciting development which assures the future of both product lines inside one XML powerhouse with all the resources of SDL behind it.  SDL plans to manage both products from a single business unit, to be called SDL XySoft.  The merger creates organizational stability, a seasoned and unified management team, a strong combined customer base, complete product offerings, and deep XML technical expertise.  We can expect the roadmap to develop quickly to align and rationalize the two platforms.   SDL has already disclosed intentions to focus Trisoft on DITA, and to aim Contenta at S1000D and DocBook. </p>
<p><span id="more-381"></span>The full text of the press release follows:</p>
<blockquote>
<h3 style="margin: auto 0in;"><span style="font-family: &quot;Arial&quot;,&quot;sans-serif&quot;; font-size: 12pt;" lang="EN-GB">SDL Announces the Acquisition of XyEnterprise, a Leader in XML Publishing and Component Content Management Software </span></h3>
<h3 style="margin: auto 0in;"><em style="mso-bidi-font-style: normal;"><span style="font-family: &quot;Arial&quot;,&quot;sans-serif&quot;; font-size: 10pt; font-weight: normal; mso-bidi-font-weight: bold;" lang="EN-GB">With this Acquisition, SDL Offers the Most Advanced Technologies in the Market for Sharing and Publishing Structured Component-based Content </span></em></h3>
<p class="MsoNormal" style="margin: 0in 0in 0pt;"><span class="venue"><strong style="mso-bidi-font-weight: normal;"><span style="font-family: &quot;Arial&quot;,&quot;sans-serif&quot;; font-size: 10pt;" lang="EN-GB">SDL Maidenhead, United Kingdom</span></strong></span><strong style="mso-bidi-font-weight: normal;"><span style="font-family: &quot;Arial&quot;,&quot;sans-serif&quot;; font-size: 10pt;" lang="EN-GB"> – <span class="datetime">June 29, 2009</span></span></strong></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt;"><span class="datetime"><span style="font-family: &quot;Arial&quot;,&quot;sans-serif&quot;; font-size: 10pt;" lang="EN-GB"><br />
SDL, the leading provider of Global Information Management (GIM) solutions, today announced the acquisition of XyEnterprise®, an award-winning leader in XML Component Content Management (CCM) and Dynamic Publishing solutions.</span></span><span style="font-family: &quot;Arial&quot;,&quot;sans-serif&quot;; font-size: 10pt;" lang="EN-GB"> With tight integration into SDL’s GIM technologies, this acquisition adds another building block for managing global content. SDL provides the industry’s only end-to-end solution for creating and managing global content. The technology provides a solution from authoring, to translation supply chain management, right through to the publishing of global content.</span></p>
<p class="MsoNormal" style="margin: 12pt 0in 0pt;"><span style="font-family: &quot;Arial&quot;,&quot;sans-serif&quot;; font-size: 10pt;" lang="EN-GB">The acquisition of XyEnterprise builds on the momentum created by SDL’s earlier acquisition of Trisoft™ in 2008. Since that acquisition, SDL has seen a growing demand for technologies that manage, reuse and deliver product information across a company’s base of global customers, including everything from service manuals to user documentation. Moving into XML standards such as DITA and S1000D, companies are seeking ways to create, translate and publish structured content once and share that information across their global organizations and customer base. Global companies that have already adopted SDL Trisoft Component Content Management (CCM) system include companies such as HACH, VMware, NetApp, FICO, ESRI, Micro Focus, Yokogawa, DAF Trucks and Atlas Copco. </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt;"><span style="font-family: &quot;Arial&quot;,&quot;sans-serif&quot;; font-size: 10pt;" lang="EN-GB"><span style="mso-spacerun: yes;"> </span></span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt;"><span style="font-family: &quot;Arial&quot;,&quot;sans-serif&quot;; font-size: 10pt;" lang="EN-GB">With the acquisition of XyEnterprise, SDL significantly expands its CCM and publishing product offerings and technical capabilities.<span style="mso-spacerun: yes;">  </span>XyEnterprise products are recognized in the industry for leadership and innovation and currently support more than two hundred enterprise companies with thousands of users. <br style="mso-special-character: line-break;" /><br style="mso-special-character: line-break;" /></span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; tab-stops: 135.0pt;"><span style="font-family: &quot;Arial&quot;,&quot;sans-serif&quot;; font-size: 10pt;" lang="EN-GB">In particular, XyEnterprise brings to the SDL portfolio both XML publishing products (XPP™ &#8211; XML Professional Publisher and LiveContent™, an intelligent interactive delivery solution), as well as a new XML standard (S1000D) with XyEnterprise’s Contenta® content management software. In addition, XyEnterprise brings mature R&amp;D, professional services and support teams into the larger SDL Group. </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; tab-stops: 135.0pt;"><span style="font-family: &quot;Arial&quot;,&quot;sans-serif&quot;; font-size: 10pt;" lang="EN-GB"> </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; tab-stops: 135.0pt;"><span style="font-family: &quot;Arial&quot;,&quot;sans-serif&quot;; font-size: 10pt;" lang="EN-GB">“XyEnterprise has a strong set of innovative products and is an impressive organization with expertise that will complement and enhance our current SDL Trisoft offerings,” said Mark Lancaster, Chairman and CEO of SDL. “XyEnterprise’s products and expertise will enable us to scale faster to meet the needs of our enterprise customers who are seeking competitive advantage in the global marketplace.”<span style="mso-spacerun: yes;">  </span></span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt;"><span style="font-family: &quot;Arial&quot;,&quot;sans-serif&quot;; font-size: 10pt;" lang="EN-GB"> </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt;"><span style="font-family: &quot;Arial&quot;,&quot;sans-serif&quot;; font-size: 10pt;" lang="EN-GB">“Our acquisition by SDL will significantly enhance our ability to execute the product vision and meet the needs of a growing base of customers,” said Kevin Duffy, CEO of XyEnterprise. “The backing by a large public company in a consolidating market and the ability to tightly integrate our products with SDL GIM technologies will accelerate our ability to deliver on our customers’ vision of delivering tailored content to their customers on a global basis.” </span></p>
<p><span style="font-family: &quot;Arial&quot;,&quot;sans-serif&quot;; font-size: 10pt;" lang="EN-GB">With the XyEnterprise acquisition, SDL PLC, a UK-based company, is creating a merged business unit of XyEnterprise and SDL Trisoft that will be branded SDL XySoft™. XyEnterprise Inc., the legal entity, will continue to do business as a US-based, Delaware company</span><span style="font-size: 10pt;" lang="EN-GB"><span style="font-family: Times New Roman;">. </span></span><span style="font-family: &quot;Arial&quot;,&quot;sans-serif&quot;; font-size: 10pt;" lang="EN-GB">The addition of this new business unit </span><span style="font-family: &quot;Arial&quot;,&quot;sans-serif&quot;; font-size: 10pt; mso-ansi-language: EN-US; mso-fareast-language: EN-US; mso-bidi-language: HE;">makes<span style="color: navy;"> </span></span><span style="font-family: &quot;Arial&quot;,&quot;sans-serif&quot;; font-size: 10pt;" lang="EN-GB">SDL the definitive leader in XML Component Content Management and Dynamic Publishing. Global companies selling products into ten or more markets need to manage ninety percent of their content in other languages. SDL’s integration of SDL XySoft products with SDL’s GIM technologies will mean that companies can truly manage global content efficiently, moving products to global markets faster and at lower costs.</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt;"><span style="font-family: &quot;Arial&quot;,&quot;sans-serif&quot;; font-size: 10pt;" lang="EN-GB"> </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt;"><span style="font-family: &quot;Arial&quot;,&quot;sans-serif&quot;; font-size: 10pt;" lang="EN-GB">SDL XySoft will have a combined management team from the SDL Trisoft and XyEnterprise organizations, with Kevin Duffy, President and CEO of XyEnterprise, as CEO of the newly combined business unit, reporting to Mark Lancaster, Chairman and CEO of SDL.<span style="mso-spacerun: yes;">  </span>SDL will continue to support both XyEnterprise and SDL Trisoft customers and products as the R&amp;D organization moves to a shared component development model.<span style="mso-spacerun: yes;">  </span></span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt;"><span style="font-family: &quot;Arial&quot;,&quot;sans-serif&quot;; font-size: 10pt;" lang="EN-GB"> </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt;"><span style="font-family: &quot;Arial&quot;,&quot;sans-serif&quot;; font-size: 10pt;" lang="EN-GB">Join us for an informational webinar on July 2nd, 2009 at 10am PDT/1pm EDT/6pm BST to learn more about the acquisition, the management team and the product roadmap. Register today at <a href="http://www.sdlxysoft.com/briefing"><span style="color: windowtext;">www.sdlxysoft.com/briefing</span></a>.  </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt;"><span style="font-family: &quot;Arial&quot;,&quot;sans-serif&quot;; font-size: 10pt;" lang="EN-GB"> </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt;"><span style="font-family: &quot;Arial&quot;,&quot;sans-serif&quot;; font-size: 10pt;" lang="EN-GB">You can also find additional information, as well as a podcast interview between Mark Lancaster and Kevin Duffy online at <a href="http://www.sdlxysoft.com/acquisition"><span style="color: windowtext;">www.sdlxysoft.com/acquisition</span></a>.  </span></p>
<h4 style="text-align: justify; margin: auto 0in;"><span style="font-family: &quot;Arial&quot;,&quot;sans-serif&quot;; font-size: 10pt;" lang="EN-GB">About SDL</span></h4>
<p class="MsoNormal" style="margin: 0in 0in 0pt;"><span style="font-family: &quot;Arial&quot;,&quot;sans-serif&quot;; font-size: 10pt;" lang="EN-GB">SDL is the leader in Global Information Management (GIM) solutions that empower organizations to accelerate the delivery of high-quality multilingual content to global markets. Its enterprise software and services integrate with existing business systems to manage the delivery of global information from authoring to publication and throughout the distributed translation supply chain. </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt;"><span style="font-family: &quot;Arial&quot;,&quot;sans-serif&quot;; font-size: 10pt;" lang="EN-GB"> </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt;"><span style="font-family: &quot;Arial&quot;,&quot;sans-serif&quot;; font-size: 10pt;" lang="EN-GB">Global industry leaders rely on SDL to provide enterprise software or hosted services for their GIM processes, including ABN-Amro, Best Western, Bosch, Canon, Chrysler, CNH, Hewlett-Packard, Microsoft, Philips, SAP, Sony, Sun Microsystems and Virgin Atlantic. </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt;"><span style="font-family: &quot;Arial&quot;,&quot;sans-serif&quot;; font-size: 10pt;" lang="EN-GB"> </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt;"><span style="font-family: &quot;Arial&quot;,&quot;sans-serif&quot;; font-size: 10pt;" lang="EN-GB">SDL has implemented more than 500 enterprise GIM solutions, has deployed over 170,000 software licenses across the GIM ecosystem and provides access to on-demand translation portals for 10 million customers per month. Over 1,000 service professionals deliver consulting, implementation and language services through its global infrastructure of more than 50 offices in 32 countries. For more information, visit www.sdl.com </span></p>
<h4 style="text-align: justify; margin: auto 0in;"><span style="font-family: &quot;Arial&quot;,&quot;sans-serif&quot;; font-size: 10pt;" lang="EN-GB">About XyEnterprise</span></h4>
<p class="MsoNormal" style="margin: 0in 0in 0pt;"><span style="font-family: &quot;Arial&quot;,&quot;sans-serif&quot;; font-size: 10pt;" lang="EN-GB">XyEnterprise develops, markets, and supports standards-based component content management, automated XML publishing and intelligent content delivery solutions that exceed customer expectations for reliability, productivity and ROI. XyEnterprise’s unmatched XML, DITA and S1000D expertise is the result of hundreds of successful deployments and is forged from lasting partnerships with customers for more than 20 years. Named one of KMWorld’s “100 Companies that Matter in Knowledge Management” for four consecutive years, XyEnterprise is committed to providing innovative solutions that deliver the highest possible performance and value to its customers, including dynamic publishing via the Web, IETMs and IETPs. XyEnterprise is headquartered in Wakefield, Mass. and has offices worldwide. For more information, please call 781.756.4400, visit <a href="http://www.xyenterprise.com"><span style="color: #0000ff;">www.xyenterprise.com</span></a> or follow on Twitter at <a href="http://twitter.com/xyenterprise"><span style="color: #0000ff;">http://twitter.com/xyenterprise</span></a>.</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt;"><span style="font-family: &quot;Arial&quot;,&quot;sans-serif&quot;; font-size: 10pt;" lang="EN-GB"> </span></p>
<h4 style="text-align: justify; margin: auto 0in;"><span style="font-family: &quot;Arial&quot;,&quot;sans-serif&quot;; font-size: 10pt;" lang="EN-GB">About SDL Trisoft</span></h4>
<p class="MsoNormal" style="margin: 0in 0in 0pt;"><span style="font-family: &quot;Arial&quot;,&quot;sans-serif&quot;; font-size: 10pt;" lang="EN-GB">SDL Trisoft is one of the worldwide leaders in Component Content Management (CCM) systems for technical writing organizations. SDL Trisoft’s software empowers global organizations to single source content, easily sharing, reusing and personalizing content in various publication formats and in multiple languages across global markets.</span></p>
<p class="MsoNormal" style="text-align: justify; margin: 0in 0in 0pt;"><span style="font-family: &quot;Arial&quot;,&quot;sans-serif&quot;; font-size: 10pt;" lang="EN-GB"> </span></p>
<p class="MsoNormal" style="text-align: justify; margin: 0in 0in 0pt;"><span style="font-family: &quot;Arial&quot;,&quot;sans-serif&quot;; font-size: 10pt;" lang="EN-GB">Through efficiency gains, SDL Trisoft customers are able to accelerate the time to deliver information to global markets, drive down the cost of content development and translation, provide more agility for the overall business and increase customer satisfaction through access to better information.</span></p>
<p class="MsoNormal" style="text-align: justify; margin: 0in 0in 0pt;"><span style="font-family: &quot;Arial&quot;,&quot;sans-serif&quot;; font-size: 10pt;" lang="EN-GB"> </span></p>
<p class="MsoNormal" style="text-align: justify; margin: 0in 0in 0pt;"><span style="font-family: &quot;Arial&quot;,&quot;sans-serif&quot;; font-size: 10pt;" lang="EN-GB">SDL Trisoft customers include a number of large consumer electronics and mobile communications companies as well as the following: Atlas Copco, DAF Trucks (a Paccar Company), Océ, Micro Focus, HACH, FICO, NetApp, VMware, Still, Blondé, Yokogawa, Maruboshi, Linde Material Handling, Nautilus, and Mitsubishi. SDL Trisoft headquarters are in Mechelen, Belgium.</span></p>
<p class="MsoNormal" style="text-align: justify; margin: 0in 0in 0pt;"><span style="font-family: &quot;Arial&quot;,&quot;sans-serif&quot;; font-size: 10pt;" lang="EN-GB"> </span></p>
<p class="MsoNormal" style="text-align: justify; margin: 0in 0in 0pt;"><span style="font-family: &quot;Arial&quot;,&quot;sans-serif&quot;; font-size: 10pt;" lang="EN-GB">Find out more on SDL Trisoft at <a href="http://www.sdltrisoft.com/"><span style="color: #0000ff;">www.sdltrisoft.com</span></a>.</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt;"><span style="font-family: &quot;Arial&quot;,&quot;sans-serif&quot;; font-size: 10pt;" lang="EN-GB"> </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt;"><span style="font-family: &quot;Arial&quot;,&quot;sans-serif&quot;; font-size: 10pt;" lang="EN-GB"><br />
<strong style="mso-bidi-font-weight: normal;">Contact Information:</strong><br style="mso-special-character: line-break;" /><br style="mso-special-character: line-break;" /></span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt;"><span style="font-family: &quot;Arial&quot;,&quot;sans-serif&quot;; font-size: 10pt;" lang="EN-GB">EMEA &#8211; Amy Hall (SDL)<br />
01628 410120<br />
<a href="mailto:amyhall@sdl.com"><span style="color: #0000ff;">amyhall@sdl.com</span></a><br style="mso-special-character: line-break;" /><br style="mso-special-character: line-break;" /></span></p>
<p class="MsoNormal" style="line-height: 12pt; margin: 0in -0.25in 0pt 0in; mso-line-height-rule: exactly;"><span style="font-family: &quot;Arial&quot;,&quot;sans-serif&quot;; font-size: 10pt;" lang="EN-GB">US </span><span style="font-family: &quot;Arial&quot;,&quot;sans-serif&quot;; font-size: 10pt; mso-ansi-language: EN-US; mso-bidi-font-family: 'Times New Roman';"><span style="mso-spacerun: yes;"> </span>- Mary Galoski Parsons</span></p>
<p class="MsoNormal" style="line-height: 12pt; margin: 0in -0.25in 0pt 0in; tab-stops: .5in 4.5in; mso-line-height-rule: exactly;"><span style="font-family: &quot;Arial&quot;,&quot;sans-serif&quot;; font-size: 10pt; mso-bidi-font-family: 'Times New Roman';" lang="EN-GB">1.781.756.5454</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt;"><span style="font-family: &quot;Arial&quot;,&quot;sans-serif&quot;; font-size: 10pt; mso-bidi-font-family: 'Times New Roman';" lang="EN-GB"><a href="mailto:mary.parsons@xyenterprise.com"><span style="color: #0000ff;">mary.parsons@xyenterprise.com</span></a></span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt;"><span style="font-family: &quot;Arial&quot;,&quot;sans-serif&quot;; font-size: 10pt; mso-bidi-font-family: 'Times New Roman';" lang="EN-GB"> </span></p>
<p class="MsoBodyText3" style="line-height: 12pt; margin: 0in -0.25in 0pt 0in; mso-line-height-rule: exactly;"><span style="font-size: x-small;"><span style="font-family: Tahoma;">Nate Tennant (Kirk Communications)<strong></strong></span></span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt;"><span style="font-family: &quot;Arial&quot;,&quot;sans-serif&quot;; font-size: 10pt; mso-ansi-language: EN-US; mso-bidi-font-family: 'Times New Roman';">1.603.766.4945<span style="mso-tab-count: 1;">  </span><br />
<a href="mailto:natet@kirkcommunications.com"><span style="color: #0000ff;">natet@kirkcommunications.com</span></a></span></p></blockquote>
]]></content:encoded>
			<wfw:commentRss>http://thecontentguy.net/blog/2009/06/29/sdl-announces-the-acquisition-of-xyenterprise/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>How to Turn Tagging into Cash: Take the Metadata Best Practices Survey</title>
		<link>http://thecontentguy.net/blog/2009/05/26/how-to-turn-tagging-into-cash-take-the-metadata-best-practices-survey/</link>
		<comments>http://thecontentguy.net/blog/2009/05/26/how-to-turn-tagging-into-cash-take-the-metadata-best-practices-survey/#comments</comments>
		<pubDate>Tue, 26 May 2009 15:07:21 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[DITA]]></category>
		<category><![CDATA[ECM]]></category>
		<category><![CDATA[XML]]></category>
		<category><![CDATA[search]]></category>
		<category><![CDATA[semantic technology]]></category>
		<category><![CDATA[Benchmarking]]></category>
		<category><![CDATA[content management]]></category>
		<category><![CDATA[Earley & Associates]]></category>
		<category><![CDATA[markup]]></category>
		<category><![CDATA[metadata]]></category>
		<category><![CDATA[semantic search]]></category>
		<category><![CDATA[survey]]></category>

		<guid isPermaLink="false">http://thecontentguy.net/blog/?p=310</guid>
		<description><![CDATA[We tag stuff to add meaning, and so that we and others – especially information systems – can find it.  But is your approach to tagging business content effective?  Find out - take the Metadata Best Practices Benchmarking Survey from Earley &#038; Associates and Taxonomy Strategies.]]></description>
			<content:encoded><![CDATA[<p>If you couldn’t tell by now, one of my particular interests is tagging, a.k.a. content classification, a.k.a. metadata.  We tag stuff to add meaning, and so that we and others – especially information systems – can find it.  But is your approach to tagging business content effective?  Find out &#8211; take the <strong><a title="Metadata Best Practices Benchmarking Survey" href="http://www.surveymonkey.com/s.aspx?sm=TEtPrAKwkiKIXhkey6revA_3d_3d" target="_blank">Metadata Best Practices Benchmarking Survey</a></strong> from Earley &amp; Associates and Taxonomy Strategies.</p>
<p><strong><span style="color: black; font-size: 14pt;  mso-bidi-font-size: 11.0pt; "><a title="Metadata Best Practices Benchmarking Survey" href="http://www.surveymonkey.com/s.aspx?sm=TEtPrAKwkiKIXhkey6revA_3d_3d" target="_blank"><span style="color: blue;"><span style="font-family: Calibri;">Take the Survey</span></span></a></span></strong></p>
<p><span id="more-310"></span>Depending upon context, “tagging” can mean one of three different things: tagging a document, tagging within a document, or tagging a content object.</p>
<p style="PADDING-LEFT: 30px"><strong>Tagging documents.</strong>  These days most of us think of tagging as the keywords we put on our documents – like our photos and websites – so that others can find them when they search.  User tags are fine for finding photos on Flickr, but for tagging to be effective in business we need to make it systematic, so that we avoid ambiguity and improve search recall and relevance.  So we’re increasingly “mature” in our approaches to tagging: We use taxonomy to organize our terms into classes and to manage the relationships between terms.  We develop thesauri and foreign language equivalents.  We integrate taxonomies and thesauri into search indexes for ECM, site search, and SEO.</p>
<p style="PADDING-LEFT: 30px"><strong>Tagging within a document.</strong>  I got interested in tagging in the early days of XML (back when we spelled it &#8220;S-G-M-L&#8221;), when we were tagging within documents.  By tagging unstructured content inside documents we could do really sophisticated things – not just multi-channel output.  For example, knowing that a paragraph in a document was a step in a service procedure or that a string of gibberish was a part number let us bring life to that content when we transformed it from markup into an interactive electronic technical manual.  <strong>Tagging let us turn books into diagnostic software.</strong></p>
<p style="PADDING-LEFT: 30px"><strong>Tagging reusable content objects.</strong> As content reuse matured with standards like DITA, organizations had more reusable components, with more people creating them in more departments.  Tagging reusable content objects became essential to actually reusing them – if you couldn’t find it, you’d never reuse it.  If you had a single service manual with 100 procedures, now you have at least 100 reusable content objects, so the search scope increased by two orders of magnitude.  At IBM, colleagues report having over a million DITA topics in more than six repositories, with over a dozen departments sharing content across thousands of publications.  <strong>Searching for content objects is like trying to find a needle in a haystack, except you’re trying to find the right needle, and you have more and smaller needles to search amongst, in more and increasingly bigger haystacks.</strong></p>
<p><strong>Measuring Metadata Maturity.</strong>  Each type of tagging can bring measurable benefits to your business.  Five years ago, <a title="Earley &amp; Associates" href="http://www.earley.com" target="_blank">Earley &amp; Associates</a> and <a title="Taxonomy Strategies" href="http://www.taxonomystrategies.com" target="_blank">Taxonomy Strategies</a> developed a survey to understand metadata maturity for various types of businesses.  Earley is conducting an updated survey to see how organizations have moved up the learning curve.  Since we have a baseline of responses from five years ago, we’ll be able to describe how metadata and taxonomy practices have matured over time.  The original survey focused on the impact of metadata best practices on knowledge management and e-commerce search.  We now recognize that metadata is also used by technical communicators – especially those who use XML and other technologies to create, manage, and publish reusable content across multiple channels.  This is the first time the survey reaches out to technical communicators, and we want to hear from you.</p>
<p>The survey is pretty detailed, so you might want to grab your favorite caffeinated beverage before you dig in.  As compensation for your time (about 15 minutes) Earley &amp; Associates is offering these nifty incentives:</p>
<ul type="disc">
<li style="line-height: 14.25pt; margin: 0in 0in 10pt; color: black; mso-margin-top-alt: auto; mso-margin-bottom-alt: auto; mso-list: l0 level1 lfo1; tab-stops: list .5in"><strong>A free pass to any future Earley &amp; Associates Community of Practice conference call</strong> (a $50 value).  These are monthly, and the next one is Wednesday June 2<sup>nd</sup> on <a title="Taxonomy Community of Practice - June 2009" href="http://www.earley.com/_June2009.asp" target="_blank">Taxonomy for Portals</a> featuring Giovanni Piazza, Chief Knowledge Officer of Ernst &amp; Young, and Ralph Poole of Earley &amp; Associates.</li>
<li style="line-height: 14.25pt; margin: 0in 0in 10pt; color: black; mso-margin-top-alt: auto; mso-margin-bottom-alt: auto; mso-list: l0 level1 lfo1; tab-stops: list .5in;"><strong>A $200 discount on registration to the <a title="Henry Stewart Digital Asset Management Conference" href="http://www.damusers.com/" target="_blank">Henry Stewart conference</a></strong> on digital asset management, June 1-2 in NYC.  Seth Earley will be there presenting preliminary results.</li>
<li style="line-height: 14.25pt; margin: 0in 0in 10pt; color: black; mso-margin-top-alt: auto; mso-margin-bottom-alt: auto; mso-list: l0 level1 lfo1; tab-stops: list .5in;"><strong>Free participation</strong> in a webcast reviewing the results of the survey (date TBA).</li>
</ul>
<p class="MsoNormal" style="line-height: 14.25pt; margin: 0in 0in 0pt;"><strong><span style="color: black; font-size: 14pt; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-bidi-font-size: 11.0pt; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri;"><a title="Metadata Best Practices Benchmarking Survey" href="http://www.surveymonkey.com/s.aspx?sm=TEtPrAKwkiKIXhkey6revA_3d_3d" target="_blank"><span style="color: blue;"><span style="font-family: Calibri;">Take the Survey</span></span></a></span></strong></p>
]]></content:encoded>
			<wfw:commentRss>http://thecontentguy.net/blog/2009/05/26/how-to-turn-tagging-into-cash-take-the-metadata-best-practices-survey/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Information Architecture for Your Office</title>
		<link>http://thecontentguy.net/blog/2008/09/06/information-architecture-for-your-office/</link>
		<comments>http://thecontentguy.net/blog/2008/09/06/information-architecture-for-your-office/#comments</comments>
		<pubDate>Sat, 06 Sep 2008 22:04:52 +0000</pubDate>
		<dc:creator>paulwlodarczyk</dc:creator>
				<category><![CDATA[DITA]]></category>
		<category><![CDATA[information architecture]]></category>

		<guid isPermaLink="false">http://paulwlodarczyk.wordpress.com/?p=23</guid>
		<description><![CDATA[DITA girl and JustSystems alum Amber Swope just had a great post on the Content Wrangler about using best practices for information architecture in deciding how to arrange her office and what to keep / hurl.  And you thought I  was OCD&#8230; but seriously, it&#8217;s a great primer on some of the basic techniques of Information [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignleft" style="margin: 2px;" title="Amber Swope, DITA Girl" src="http://thecontentwrangler.com/images/uploads/amberswope.jpg" alt="" width="150" height="150" />DITA girl and JustSystems alum <a title="Amber Swope, DITA Girl" href="http://www.linkedin.com/in/amberswope" target="_blank">Amber Swope</a> just had a great post on the <a title="The Content Wrangler" href="http://www.thecontentwrangler.com/" target="_blank">Content Wrangler </a>about using best practices for information architecture in deciding <a title="Information Architecture for My Office" href="http://www.thecontentwrangler.com/article/information_architecture_for_my_office/" target="_blank">how to arrange her office</a> and what to keep / hurl.  And you thought <strong><em>I</em></strong>  was OCD&#8230; but seriously, it&#8217;s a great primer on some of the basic techniques of Information Architecture, <em><strong>and</strong></em> she got excellent results.  And if you want to know how OCD <strong><em>I</em></strong> am just ask me about my workout log&#8230;</p>
]]></content:encoded>
			<wfw:commentRss>http://thecontentguy.net/blog/2008/09/06/information-architecture-for-your-office/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Automating CMS metadata &#8211; could that work?  How?</title>
		<link>http://thecontentguy.net/blog/2008/08/19/automating-cms-metadata-could-that-work-how/</link>
		<comments>http://thecontentguy.net/blog/2008/08/19/automating-cms-metadata-could-that-work-how/#comments</comments>
		<pubDate>Tue, 19 Aug 2008 20:40:21 +0000</pubDate>
		<dc:creator>paulwlodarczyk</dc:creator>
				<category><![CDATA[DITA]]></category>
		<category><![CDATA[ECM]]></category>
		<category><![CDATA[XML]]></category>
		<category><![CDATA[semantic technology]]></category>
		<category><![CDATA[auto-tagging]]></category>
		<category><![CDATA[categorization technology]]></category>
		<category><![CDATA[content classification]]></category>
		<category><![CDATA[content lifecycle]]></category>
		<category><![CDATA[content management]]></category>
		<category><![CDATA[metadata]]></category>
		<category><![CDATA[named entity analysis]]></category>
		<category><![CDATA[natural language processing]]></category>
		<category><![CDATA[taxonomy]]></category>
		<category><![CDATA[web services]]></category>

		<guid isPermaLink="false">http://paulwlodarczyk.wordpress.com/?p=16</guid>
		<description><![CDATA[In a previous post I asked the question, "What if a web service could automatically provide the CMS metadata when you go to check-in a new topic?"  In this post I'll discuss why you would want to do that, some of the candidate technologies, and what is necessary to make it real.]]></description>
			<content:encoded><![CDATA[<p>In a previous post I asked the question, &#8220;What if a web service could automatically provide the CMS metadata when you go to check-in a new topic?&#8221;  In this post I&#8217;ll discuss why you would want to do that, some of the candidate technologies, and what is necessary to make it real.<br />
<span id="more-13"></span><br />
<strong><img class="alignleft" title="Paper Tag" src="http://thecontentguy.net/wp-content/uploads/2008/08/tag-image.jpg" alt="" width="170" height="251" />Check-in, metadata, and taxonomies.</strong> Anyone who&#8217;s worked with a content or document management system knows this scenario:  You&#8217;re going to check in newly authored content, and a dialog box comes up asking you to enter some keywords to describe the content.  This is metadata – data about your data. It’s important because if you fill it in properly, other people (and you, too) can find your content. If you leave it blank, then other users will need to rely on a full-text search or some machine indexing to find your content.</p>
<p>Many organizations have a formal system for classifying content called a taxonomy. Think of it like the naming of the sections in the yellow pages directory – it provides consistent category names. This avoids the problem I call “the Yellow Pages Problem,” where some people call those guys who represent you in court “lawyers” while other people call them “attorneys at law” (or worse things). When an organization uses a taxonomy, everyone uses consistent category names – that is, if they actually use it.</p>
<p><strong>Compliance blues.</strong> Taxonomies can be configured into the CMS, so that category names can be selected in the check-in dialog box. While that saves the author the guesswork of remembering category names and avoids mistyping, it still requires an author to take action – and to get it right. This is a point of failure of many ECM initiatives: authors fail to classify content at check-in time, accept the default settings, or apply the wrong category (or too few categories when the content really crosses genres).</p>
<p>The problem is worse when the author isn’t a fulltime writer, but instead a business contributor who’s creating content as they serve their role in some business process. In these cases the author lacks the time, talent, or motivation to tag the content with the appropriate metadata. They may not see it as part of their job.</p>
<p><strong>Cure for the blues?</strong> So can this process be automated? Absolutely. Technologies have existed for some years now to analyze unstructured content. Algorithms involve some combination of statistical, linguistic, and structural analysis.</p>
<ul>
<li><em>Statistical methods</em> look at the document as a “bag of words” – words or phrases that occur more frequently, or that are statistically “improbable,” are treated as more important. Amazon uses SIPs – Statistically Improbable Phrases – to pull keywords out of books. This is purely statistical – the system doesn’t know what the words mean, just that they are “odd” and so probably meaningful.</li>
<li><em>Linguistic methods</em> actually analyze the natural language in the document. If you know what the subject, verb, and object are in a sentence, then you know what it is about. Linguistic methods have gotten better with improvements in algorithms and increases in computing power.</li>
<li><em>Structural methods</em> leverage underlying markup in documents, like XML structural tags or even styling or text flow (e.g. recognizing terms in headers).</li>
</ul>
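<p>To make the statistical approach above concrete, here is a minimal sketch in Python. It is not how Amazon or any commercial categorizer works; it simply assumes that a word is "improbable" when it is relatively frequent in the document but rare in a background sample of ordinary text. The function names, the smoothing choice, and the sample text are all illustrative assumptions.</p>
<pre>
```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def improbable_terms(document, background, top_n=3):
    """Rank words that are frequent in `document` but rare in `background`.

    Each score is the log-ratio of a word's relative frequency in the
    document to its add-one smoothed relative frequency in the background.
    """
    doc_counts = Counter(tokenize(document))
    bg_counts = Counter(tokenize(background))
    doc_total = sum(doc_counts.values())

    # Add-one smoothing over the combined vocabulary, so that words never
    # seen in the background still get a finite score.
    vocabulary = set(doc_counts) | set(bg_counts)
    bg_total = sum(bg_counts.values()) + len(vocabulary)

    scores = {}
    for word, count in doc_counts.items():
        p_doc = count / doc_total
        p_bg = (bg_counts[word] + 1) / bg_total
        scores[word] = math.log(p_doc / p_bg)

    # Highest scores first; ties are broken alphabetically for stable output.
    ranked = sorted(scores, key=lambda w: (-scores[w], w))
    return ranked[:top_n]
```
</pre>
<p>Run against a service procedure and a sample of everyday prose, a sketch like this tends to surface domain terms and push common words such as "the" down the ranking. As the post says, it knows nothing about what the words mean.</p>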
<p>These methods not only provide automated metadata tagging (document categorization), they can also determine what type of document is being analyzed (document classification). They can also be used to identify Named Entities – named people, places, things, and events. It’s one thing to say this document is a Legal Brief (document type or class). It’s another to say that Legal Brief is about Patent Infringement (a category). It’s another thing still to say that it’s a case between Palm and Xerox (named companies) about handwriting recognition (a named technology). Named entities can be extracted and listed in metadata. They can also be tagged in-line in an XML document (this is often called &#8220;auto-tagging&#8221; &#8211; a post for another day).</p>
<p>Named entities are not addressed by taxonomies, but rather by lists or directories of named entities.  A number of these named entity directories are available as web services. Several are kept evergreen by using Wikipedia to drive the ever-growing list of named entities.</p>
<p><strong>Making it real.</strong> So given this technology, how do you implement such a system?  My preferred method is to customize the authoring environment so that the “Save” dialog box in the editor of choice presents the ECM system’s check-in dialog.  This way the author does not take extra steps to check content in.</p>
<p>Also at check-in time, in the background, the customized editor performs a temporary save to the local file system, and automatically sends a copy of the document to a categorizer web service. This is a content categorizer application running on a server.  That categorizer service would apply the organization’s standard taxonomy to the document, using some classification algorithm to define one or more categories for the document. The results can be applied in either of two ways:</p>
<ul>
<li>Classify the document automatically with no user intervention. This can be done completely in the background with no user interface, even as part of an automated check-in workflow.</li>
<li>Classify the document automatically and have the user verify the results. This requires exposing the proposed metadata tags in the check-in dialog.</li>
</ul>
<p>Categorizers often provide some scoring of the certainty of a given tag; this score can be used to make the call about whether the automatic tag is applied, or whether it needs (or allows) end user verification or editing. Business requirements determine what the best approach or best combination is.</p>
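<p>The scoring idea above can be sketched in a few lines of Python. This is only an illustration under assumptions: the <code>ProposedTag</code> type, the 0.0 to 1.0 confidence scale, and both threshold values are hypothetical, and a real categorizer service would define its own score range and response format.</p>
<pre>
```python
from dataclasses import dataclass


@dataclass
class ProposedTag:
    category: str      # category name from the organization's taxonomy
    confidence: float  # categorizer's certainty, assumed to lie in 0.0-1.0


def route_tags(tags, auto_apply=0.9, needs_review=0.5):
    """Split proposed tags into auto-applied, review, and discarded lists.

    Both thresholds are hypothetical defaults; business requirements
    would determine the real values.
    """
    applied, review, discarded = [], [], []
    for tag in tags:
        if tag.confidence >= auto_apply:
            # Certain enough to apply silently in the background.
            applied.append(tag.category)
        elif tag.confidence >= needs_review:
            # Plausible, so show it in the check-in dialog for verification.
            review.append(tag.category)
        else:
            # Too uncertain to bother the author with.
            discarded.append(tag.category)
    return applied, review, discarded
```
</pre>
<p>Setting <code>auto_apply</code> above the maximum score would send every tag to the author for verification, while setting both thresholds to the same value would remove the review step entirely. Those are the two approaches listed above, and anything in between is the combination.</p>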
<p><strong>What are the barriers?</strong> The reason this technique isn’t used more often is the integration required between the authoring tools, the ECM solution, and the categorization technology. In today’s market these technologies are typically provided by independent software vendors, who have few incentives for bundling tightly integrated solutions (and wish to remain “vendor neutral” with their own technology). As the ECM marketplace continues to consolidate vertically we may see some content lifecycle vendors with more complete solutions (watch IBM and EMC). Services firms specializing in unstructured content and ECM can be one source for prepackaged solutions that combine ECM, authoring tools, and content classification into a seamless user experience – which is the key to success in deploying an automated solution.</p>
<p>At the end of the day, consideration of the needs and behavior of content authors and contributors (who are very often change-averse) is the most important step in adoption of a content lifecycle solution. Making content classification and categorization a “no brainer” through automation and a seamless user experience improves the likelihood of success.</p>
]]></content:encoded>
			<wfw:commentRss>http://thecontentguy.net/blog/2008/08/19/automating-cms-metadata-could-that-work-how/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Connecting the dots: How XML authoring enables the Semantic Web</title>
		<link>http://thecontentguy.net/blog/2008/08/15/connecting-the-dots-how-xml-authoring-enables-the-semantic-web/</link>
		<comments>http://thecontentguy.net/blog/2008/08/15/connecting-the-dots-how-xml-authoring-enables-the-semantic-web/#comments</comments>
		<pubDate>Fri, 15 Aug 2008 20:10:17 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[DITA]]></category>
		<category><![CDATA[XML]]></category>
		<category><![CDATA[semantic technology]]></category>
		<category><![CDATA[Calais]]></category>
		<category><![CDATA[Linked Data]]></category>
		<category><![CDATA[markup]]></category>
		<category><![CDATA[metadata]]></category>
		<category><![CDATA[natural language processing]]></category>
		<category><![CDATA[search]]></category>
		<category><![CDATA[Search Monkey]]></category>
		<category><![CDATA[semantic web]]></category>
		<category><![CDATA[web services]]></category>

		<guid isPermaLink="false">http://paulwlodarczyk.wordpress.com/?p=4</guid>
		<description><![CDATA[What if we start combining semantic web technologies and semantic document technologies?]]></description>
			<content:encoded><![CDATA[<p><a title="New, Improved *Semantic* Web!" href="http://flickr.com/photos/14829735@N00/303503677"><img class="alignleft" style="margin: 2px;" src="http://farm1.static.flickr.com/105/303503677_e83d70118f_m.jpg" alt="" width="193" height="240" /></a>I recently attended the <a title="Linked Data Planet" href="http://www.linkeddataplanet.com/" target="_blank">Linked Data Planet </a>conference where a number of pioneers in the field of Semantic Web shared their perspectives on the state of the art – and business – of helping the world tag their web pages for meaning.  For those of you in the dark about semantic mark-up, it lets authors annotate their web pages with metadata (HTML attributes that don’t get displayed in the document) that describe what those pages are about. <br />
<span id="more-7"></span><br />
So for example, when I say “New York” in an HTML document it&#8217;s ambiguous – do I mean the city, the state, the Yankees, the Mets, the Giants, the Jets, the song, the steak, the state of mind – you get the idea.  Words are ambiguous – except in the context of the language in which they occur.  So if I am writing about a sporting event <strong>you</strong> know from the context of the article that I mean the team, but the typical search engine does not.  To a search engine, New York is just a string that occurs in the document with some frequency. </p>
<p>There are two ways to make sense out of words in a document.  One is semantic analysis (I&#8217;ll leave that topic to another day).  The other is semantic tagging &#8211; adding metadata to a document.<br />
With metadata, I can define things precisely.  I can state that this document is about the sports team, not the steak.  I can do this by tagging the named entities in the document – the people, places, things, events, and facts – in an unambiguous way.  I can also set those entities into relationships with each other.  For example, a piece of text may refer to two companies involved in a merger.  So I can tag the document being about <strong>Company A</strong> (thing number one) and <strong>Company B</strong> (thing number two) involved in a <strong>merger</strong> (an event, but also a relationship between the two named entities). </p>
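<p>One way to picture tagged entities and relationships as data is the small Python sketch below. It does not follow any particular semantic web vocabulary; the <code>Entity</code> and <code>Relation</code> types and the type names used with them are hypothetical, chosen only to mirror the Company A and Company B example above.</p>
<pre>
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    label: str        # the text as it appears in the document
    entity_type: str  # what the author means by it, e.g. "Company"


@dataclass(frozen=True)
class Relation:
    subject: Entity
    predicate: str    # the event or relationship, e.g. "merger"
    obj: Entity


def mentions(relations, entity_type):
    """Return labels of all entities of a given type, in first-seen order."""
    seen = []
    for rel in relations:
        for entity in (rel.subject, rel.obj):
            if entity.entity_type == entity_type and entity.label not in seen:
                seen.append(entity.label)
    return seen
```
</pre>
<p>Because the type travels with the label, "New York" tagged as a sports team and "New York" tagged as a city are different values, so a search for one no longer matches the other.</p>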
<p>So semantic tagging adds meaning to documents that goes beyond the text, and it does it in an unambiguous way, which is handy.  But it has traditionally faced two large hurdles: (1) it’s been relatively expensive to add semantic markup (either with investments in labor or technology) and (2) there has been little mass market for consuming this markup.  Both of those hurdles are rapidly falling away. </p>
<p>Let’s address the second point first.  Yahoo has introduced <a title="Yahoo! Search Monkey" href="http://developer.yahoo.com/searchmonkey/" target="_blank">Search Monkey</a> – a new technology that rates web pages not on the keywords and number of links to the page (the “wisdom of crowds”) but on the semantic markup that is embedded in the page (the wisdom of the author).  This creates a substantial motive for adding the markup: Search Engine Optimization.  Semantic markup makes your content more likely to be found and more relevant to the searcher.</p>
<p>Great, so how do you add semantic markup?  For legacy content, you need to use some combination of people and automation to add markup to what you already wrote.  Using people to tag content requires specialized skills that are not in good supply.  Natural language processing technologies for auto-tagging content have been around since the late 90s in lab settings; auto-tagging products are emerging in new and interesting forms in the marketplace today. Thomson Reuters’ <a title="Thomson-Reuters Calais" href="http://www.opencalais.com/">Calais</a> open source project is a great example.  For a demo <a title="Calais Viewer Demo" href="http://sws.clearforest.com/calaisviewer/" target="_blank">click here</a> and try pasting some <a title="Terms of use" href="http://www.opencalais.com/terms" target="_blank">non-proprietary</a> text that describes what your company does (for example, I tried the “About Our Company” page we used in proposals at JustSystems and it accurately tagged all of the named companies, legal entities, products, technologies, countries, and cities, and correctly identified JustSystems’s acquisition of XMetaL from Blast Radius as a business event).</p>
<p>Adding semantic markup to new web content as it is created &#8211; making it available as data &#8211; is the way to go.  But what about other types of unstructured content, like documents, that might be published to the web and other channels?  We’ve been doing this with XML and SGML documents all along, using semantic tags to unambiguously flag specific pieces of text for future discovery.  This has ranged from tagging part numbers in a service manual (which could automate adding hyperlinks or improve search relevance), to tagging financial reports with XBRL to find specific facts within the MD&amp;A or footnotes of an annual report (which could prevent another Enron).  But the important concept here is this: when content is tagged, it can be treated as data.</p>
<p>More recent XML standards like <a title="DITA.XML.ORG" href="http://dita.xml.org/" target="_blank">DITA</a> help authors focus on creating granular content – primarily for content reuse.  But our customers are finding that DITA and other topic-oriented XML approaches are helping them break out of the document model – where loads of facts are locked-up within documents.  Think of a lengthy Policies and Procedures manual.  The historical reason it’s all bound in one book is for the convenience of publishing.  Today – with electronic publishing on the web, intranets, and portals – you really only want to publish a single policy or procedure as it is added or revised.  The book itself is obsolete when you can publish a procedure at a time. </p>
<p>In a DITA world, because of its granular nature, a single document (like a Policy manual that was one very large document in your document management system) may instead be managed as a collection of hundreds of DITA topics in your CMS or XML object store.  The document no longer exists as such; it becomes a collection of topics, more like records in a database.  To effectively manage large collections of DITA topics, you <strong>need</strong> to specify metadata for each topic – just so that you can find any given topic again.  So a typical DITA project would define the CMS metadata scheme and the taxonomy for classifying the DITA topics.  For those of us in the XML document world, this is old hat.</p>
<p>So all this makes me ask:</p>
<ul>
<li>What if we start combining semantic web technologies and semantic document technologies?</li>
<li>What if we combine technologies that auto-tag named entities with granular authoring approaches like DITA?</li>
<li>What if you could automatically tag named entities within the DITA topic you are creating, tagging as you type? </li>
<li>What if a web service could automatically provide the CMS metadata when you go to check-in a new topic?</li>
<li>What if the publishing tools that transform your DITA to HTML could automatically add the semantic markup to your HTML pages that are published from your DITA content?</li>
<li>How would that change how you publish business documents like policies and procedures to your employees?</li>
<li>How would it change how you create marketing content for your web site?</li>
<li>How would it change the way you create and manage your product technical content?</li>
</ul>
<p>Could the secret to the semantic web be right under our noses?</p>
]]></content:encoded>
			<wfw:commentRss>http://thecontentguy.net/blog/2008/08/15/connecting-the-dots-how-xml-authoring-enables-the-semantic-web/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>
