<?xml version="1.0" encoding="UTF-8"?>
<!-- generator="wordpress/2.3.3" --><rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" version="2.0">

<channel>
	<title>Data and the Web</title>
	<link>http://www.kirix.com/blog</link>
	<description>The Official Kirix Weblog</description>
	<pubDate>Tue, 23 Feb 2010 14:06:53 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.3.3</generator>
	<language>en</language>
			<atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" type="application/rss+xml" href="http://feeds.kirix.com/dataandtheweb" /><feedburner:info uri="dataandtheweb" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com/" /><item>
		<title>Announcing Kirix Strata 4.4</title>
		<link>http://feeds.kirix.com/~r/dataandtheweb/~3/WyNPSpSLCiU/</link>
		<comments>http://www.kirix.com/blog/2010/02/23/announcing-kirix-strata-44/#comments</comments>
		<pubDate>Tue, 23 Feb 2010 14:06:53 +0000</pubDate>
		<dc:creator>Ken Kaczmarek</dc:creator>
		
		<category><![CDATA[news/announcements]]></category>

		<guid isPermaLink="false">http://www.kirix.com/blog/2010/02/23/announcing-kirix-strata-44/</guid>
		<description><![CDATA[We&#8217;re pleased to announce a long-awaited new upgrade to Kirix Strata today.
Here are a few highlights of note:
1.  Back-end Stuff.  This version includes a TON of work on back-end processing, scripting, CSVs, etc.  So, a lot of the changes will be &#8220;invisible&#8221; to the average user, but will benefit everyone with speed improvements and a [...]]]></description>
			<content:encoded><![CDATA[<p>We&#8217;re pleased to announce a long-awaited <a href="http://www.kirix.com/download/download-a-free-trial.html" title="Kirix Strata Free Trial">new upgrade to Kirix Strata</a> today.</p>
<p>Here are a few highlights of note:</p>
<p><strong>1.  Back-end Stuff.</strong>  This version includes a TON of work on back-end processing, scripting, CSVs, etc.  So, a lot of the changes will be &#8220;invisible&#8221; to the average user, but will benefit everyone with speed improvements and a more robust data engine.</p>
<p><strong>2.  Expanded Relationship Filtering.</strong>  In the previous version, we had two options for filtering related records &#8212; either &#8220;leave filter off&#8221; or &#8220;filter all child records.&#8221;  In this version, we&#8217;ve added a third option to &#8220;mark filtered records&#8221; within the context of the entire table.  So, now, when you tile your two related tables horizontally and select Tools &gt; Related Records &gt; Mark Related Records (or select it from the icon dropdown on the toolbar), your related records in the child set will be highlighted in yellow.  In addition, we&#8217;ve also added a cursor marker to the parent table so you can track where you are in the child set.</p>
<p><strong>3.  Additional Aggregate Functions.</strong>  In addition to the existing aggregate functions (e.g., SUM(), AVG(), etc.), we have included new options for Standard Deviation, STDDEV(), and Variance, VARIANCE().  As with other aggregate functions, you&#8217;ll be able to use these in areas such as the query builder, relationships and grouping.</p>
<p><strong>4.  Expanded Table Statistics.</strong>  When you select Data &gt; Summarize, you&#8217;ll now also get the minimum and maximum field length of each field.</p>
<p><strong>5.  Import Templates.</strong>  We&#8217;ve added the ability to save import templates (File &gt; Import) to your project, which will save a few steps if you have complex imports.</p>
<p><strong>6.  Bug Fixes.</strong>  We&#8217;ve been able to knock out loads of small fixes throughout the software.</p>
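<p>As a rough illustration of what the STDDEV() and VARIANCE() functions from item 3 compute, here is a minimal Python sketch using the standard library.  It is not Strata code, the sample values are made up, and this post doesn&#8217;t state whether Strata uses the sample or the population form, so both are shown:</p>
<pre>
```python
import statistics

# Hypothetical column of values (illustrative data, not from Strata).
amounts = [2, 4, 4, 4, 5, 5, 7, 9]

# Population forms divide the squared deviations by n.
pop_variance = statistics.pvariance(amounts)
pop_stddev = statistics.pstdev(amounts)

# Sample forms divide the squared deviations by n - 1.
sample_variance = statistics.variance(amounts)
sample_stddev = statistics.stdev(amounts)

print("population:", pop_variance, pop_stddev)
print("sample:", sample_variance, sample_stddev)
```
</pre>
<p>The two forms differ noticeably on small sets like this one, so it&#8217;s worth comparing against a known result before relying on either.</p>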
<p>You can <a href="http://www.kirix.com/download/download-a-free-trial.html" title="Kirix Strata Free Trial">download the latest and greatest here</a>.  If you run into any issues or need help with anything, <a href="http://www.kirix.com/contact-us.html" title="Contact Kirix">please just let us know</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.kirix.com/blog/2010/02/23/announcing-kirix-strata-44/feed/</wfw:commentRss>
		<feedburner:origLink>http://www.kirix.com/blog/2010/02/23/announcing-kirix-strata-44/</feedburner:origLink></item>
		<item>
		<title>Further Sunlight on Government Data</title>
		<link>http://feeds.kirix.com/~r/dataandtheweb/~3/LpqoeWQ2QvQ/</link>
		<comments>http://www.kirix.com/blog/2009/07/20/further-sunlight-on-government-data/#comments</comments>
		<pubDate>Mon, 20 Jul 2009 19:14:45 +0000</pubDate>
		<dc:creator>Ken Kaczmarek</dc:creator>
		
		<category><![CDATA[data repositories]]></category>

		<category><![CDATA[government]]></category>

		<guid isPermaLink="false">http://www.kirix.com/blog/2009/07/20/further-sunlight-on-government-data/</guid>
		<description><![CDATA[In a previous post, we discussed some of the interesting things the US government is doing to make its data more widely available, culminating in the Data.gov website.  This website is now up and running and has definitely made some progress since we last discussed it.
Data.gov is broken down into three main catalogs:

Raw Data Catalog [...]]]></description>
			<content:encoded><![CDATA[<p><img src="http://www.kirix.com/blog/files/2009/07/sunbeams1.png" alt="sunbeams1.png" align="left" border="0" />In <a href="http://www.kirix.com/blog/2009/03/05/datagov/" title="Data.gov">a previous post</a>, we discussed some of the interesting things the US government is doing to make its data more widely available, culminating in the <a href="http://www.data.gov/" title="Data.gov">Data.gov</a> website.  This website is now up and running and has definitely made some progress since we last discussed it.</p>
<p>Data.gov is broken down into three main catalogs:</p>
<ol>
<li><strong>Raw Data Catalog</strong> (with data files available in XML, CSV, KML, etc.)</li>
<li><strong>Tools Catalog</strong> (list of tools built to work with various open data sets)</li>
<li><strong>Geodata Catalog</strong> (links to Federal geospatial data)</li>
</ol>
<p>They&#8217;ve also tried to make it easier to search for data sets, a task that, like video search, relies heavily on good, meaningful descriptions and related metadata.  It&#8217;s a hard nut to crack.  For example, government agencies tend to release data sets on an annual basis, so you&#8217;ll have, say, 5 different data sets (and counting) for the &#8220;Public Libraries Survey&#8221; from 2004 through 2008.  If your search terms aren&#8217;t specific enough, these repetitious items tend to clutter up the search results.  As Data.gov continues to add more data sets, hopefully they can refine this area further.</p>
<p>But, then again, <a href="http://sunlightlabs.com/blog/2009/07/15/kickoff-national-data-catalog/" title="National Data Catalog Post">maybe they won&#8217;t have to</a>.  The folks at <a href="http://sunlightlabs.com/" title="Sunlight Labs">Sunlight Labs</a>, whose mission is to build technology that makes government more transparent and accountable, have recently announced a project called <a href="http://sunlightlabs.com/blog/2009/07/15/kickoff-national-data-catalog/" title="National Data Catalog Post">The National Data Catalog</a>.  It will be a tool that aims to take the Data.gov concept and improve upon it.  From the announcement:</p>
<blockquote><p>&#8220;We think we can add value on top of things like Data.gov and the municipal data catalogs by autonomously bringing them into one system, manually curating and adding other data sources and providing features that, well, Government just can&#8217;t do. There&#8217;ll be community participation so that people can submit their own data sources, and we&#8217;ll also catalog non-commercial data that is derivative of government data like OpenSecrets. We&#8217;ll make it so that people can create their own documentation for much of the undocumented data that government puts out and link to external projects that work with the data being provided.&#8221;</p></blockquote>
<p>This should be interesting to watch.  As the Sunlight folks say in a <a href="http://sunlightlabs.com/blog/2009/07/17/datagov-great/" title="Data.gov is great">later post</a>, they are not out to replicate Data.gov, but to stand on its shoulders (similar to how, say, <a href="http://www.weather.com/" title="weather.com">Weather.com</a> relies on and improves upon the <a href="http://www.nws.noaa.gov/" title="US National Weather Service">National Weather Service</a>).  Given the nature of the beast, data sets need to be described really well in order to be both searchable and useful.  Hopefully the community aspect, in particular, can help give this data more utility.  If any tech-savvy folks are interested in either following the project or contributing code, <a href="http://www.pivotaltracker.com/projects/21785" title="Data Catalog Project Page">here&#8217;s the project page</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.kirix.com/blog/2009/07/20/further-sunlight-on-government-data/feed/</wfw:commentRss>
		<feedburner:origLink>http://www.kirix.com/blog/2009/07/20/further-sunlight-on-government-data/</feedburner:origLink></item>
		<item>
		<title>A Wee Bit of Housekeeping…</title>
		<link>http://feeds.kirix.com/~r/dataandtheweb/~3/MqqKsSgkau0/</link>
		<comments>http://www.kirix.com/blog/2009/07/17/a-wee-bit-of-housekeeping/#comments</comments>
		<pubDate>Fri, 17 Jul 2009 20:18:10 +0000</pubDate>
		<dc:creator>Ken Kaczmarek</dc:creator>
		
		<category><![CDATA[examples]]></category>

		<category><![CDATA[news/announcements]]></category>

		<category><![CDATA[videos]]></category>

		<guid isPermaLink="false">http://www.kirix.com/blog/2009/07/17/a-wee-bit-of-housekeeping/</guid>
		<description><![CDATA[We haven&#8217;t been doing much regular blogging lately, but we&#8217;re hoping this will change in the coming weeks.
In the meantime, we&#8217;ve recently done some housekeeping on our website, so if you haven&#8217;t visited recently we&#8217;d encourage you to do so. We&#8217;ve updated many pages with new content, but here are two sections in particular that [...]]]></description>
			<content:encoded><![CDATA[<p><img src="http://www.kirix.com/blog/files/2009/07/brooms1.png" alt="brooms2.png" align="left" border="0" />We haven&#8217;t been doing much regular blogging lately, but we&#8217;re hoping this will change in the coming weeks.</p>
<p>In the meantime, we&#8217;ve recently done some housekeeping on <a href="http://www.kirix.com/" title="Kirix Strata Home">our website</a>, so if you haven&#8217;t visited recently we&#8217;d encourage you to do so. We&#8217;ve <a href="http://www.kirix.com/showcase/overview.html" title="Kirix Strata Overview">updated</a> <a href="http://www.kirix.com/showcase/features.html" title="Kirix Strata Features">many</a> <a href="http://www.kirix.com/showcase/why-strata.html" title="Why Choose Strata?">pages</a> with new content, but here are two sections in particular that we&#8217;d steer you toward:</p>
<ul>
<li><a href="http://www.kirix.com/showcase/examples.html" title="Kirix Strata Examples">Examples Section.</a>  This is a long overdue section that puts together some quick examples of how Kirix Strata™ can be applied to common data problems.  The section is still a work in progress with more videos still to be produced.  However, we expect what we have now will prove useful to new and old Strata users alike.  <a href="http://www.kirix.com/showcase/examples.html" title="Kirix Strata Examples">Check it out.</a></li>
<li><a href="http://www.kirix.com/help/video-tutorials.html" title="Kirix Strata Video Tutorials">Video Tutorials and Archive.</a>  We&#8217;ve done a bunch of different videos and screencasts over the past year or so, but they&#8217;ve been posted all over our website.  This new section wrangles all of the videos together in one place for posterity.  The <a href="http://www.kirix.com/index.php?id=143#c646" title="Kirix Strata Tutorials">feature tutorials</a>, in particular, are worth viewing as they help give a more comprehensive look at how to use specific features in Strata.  <a href="http://www.kirix.com/help/video-tutorials.html" title="Kirix Strata Video Archive">Take a look.</a></li>
</ul>
<p>So, in a nod to <a href="http://www.imsdb.com/scripts/Matrix,-The.html" title="script">the Matrix</a>, where one cannot be told what it is, but one must see for oneself, we&#8217;ve tried to make some high quality video documentation available.  Stay tuned for more to come.  Enjoy!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.kirix.com/blog/2009/07/17/a-wee-bit-of-housekeeping/feed/</wfw:commentRss>
		<feedburner:origLink>http://www.kirix.com/blog/2009/07/17/a-wee-bit-of-housekeeping/</feedburner:origLink></item>
		<item>
		<title>wxWebConnect: Open-source Browser Library for wxWidgets</title>
		<link>http://feeds.kirix.com/~r/dataandtheweb/~3/mfjcrs_p1yo/</link>
		<comments>http://www.kirix.com/blog/2009/07/08/wxwebconnect-open-source-browser-library-for-wxwidgets/#comments</comments>
		<pubDate>Wed, 08 Jul 2009 19:12:02 +0000</pubDate>
		<dc:creator>Ken Kaczmarek</dc:creator>
		
		<category><![CDATA[browsers]]></category>

		<category><![CDATA[news/announcements]]></category>

		<guid isPermaLink="false">http://www.kirix.com/blog/2009/07/08/wxwebconnect-open-source-browser-library-for-wxwidgets/</guid>
		<description><![CDATA[
This is sort of out of the scope of this particular blog, but I thought I&#8217;d pass along the news that we just released another open-source library for wxWidgets users.  This one is called wxWebConnect and it&#8217;s a library for wxWidgets that enables developers to quickly integrate advanced web browser capabilities.
Basically, it wraps up [...]]]></description>
			<content:encoded><![CDATA[<p><img src="http://www.kirix.com/blog/files/2009/07/labs_home_connect.png" alt="labs_home_connect.png" border="0" /></p>
<p>This is sort of out of the scope of this particular blog, but I thought I&#8217;d pass along the news that we just released another open-source library for <a href="http://www.wxwidgets.org/" title="wxWidgets Home">wxWidgets</a> users.  This one is called <a href="http://www.kirix.com/labs/wxwebconnect.html" title="wxWebConnect Project Page">wxWebConnect</a> and it&#8217;s a library for wxWidgets that enables developers to quickly integrate advanced web browser capabilities.</p>
<p>Basically, it wraps up functionality exposed by the <a href="https://developer.mozilla.org/En/XULRunner" title="Mozilla XULRunner">Mozilla Foundation&#8217;s Gecko engine (XULRunner)</a> into a set of user-friendly classes to: embed browser controls, search web content, print web pages, interact with the DOM, implement custom content handling for different MIME types, issue POST calls using the current browser state, etc. Notably, with this library you can also embed all of your favorite Firefox browser plug-ins into your application. We&#8217;ve also gone out of our way to make sure that getting a browser control up and running in your application is as easy as possible.</p>
<p>More information can be found at the <a href="http://www.kirix.com/labs/wxwebconnect.html" title="wxWebConnect Project Page">wxWebConnect project page</a>.  Also, feel free to view some <a href="http://www.kirix.com/labs/wxwebconnect/screenshots.html" title="wxWebConnect Screenshots">screenshots and a short video demonstration too</a>.  If you&#8217;re a wxWidgets developer, give it a whirl and <a href="http://www.kirix.com/forums/" title="Kirix Forums - wxWebConnect">let us know what you think</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.kirix.com/blog/2009/07/08/wxwebconnect-open-source-browser-library-for-wxwidgets/feed/</wfw:commentRss>
		<feedburner:origLink>http://www.kirix.com/blog/2009/07/08/wxwebconnect-open-source-browser-library-for-wxwidgets/</feedburner:origLink></item>
		<item>
		<title>Announcing Kirix Strata 4.3</title>
		<link>http://feeds.kirix.com/~r/dataandtheweb/~3/mAgVwX5-cCA/</link>
		<comments>http://www.kirix.com/blog/2009/04/28/announcing-kirix-strata-43/#comments</comments>
		<pubDate>Tue, 28 Apr 2009 23:54:16 +0000</pubDate>
		<dc:creator>Ken Kaczmarek</dc:creator>
		
		<category><![CDATA[news/announcements]]></category>

		<guid isPermaLink="false">http://www.kirix.com/blog/2009/04/28/announcing-kirix-strata-43/</guid>
		<description><![CDATA[We&#8217;re pleased to announce that we just released a new upgrade to Kirix Strata, version 4.3! Kudos to our developers for adding a lot of nice features and bug fixes.  The full list of notes to this release is below the jump, but here are a few of the bigger changes:
External Database Connectivity
We&#8217;ve really improved [...]]]></description>
			<content:encoded><![CDATA[<p><img src="http://www.kirix.com/blog/files/2008/07/strata-icon.png" alt="Image - Strata Icon" align="left" border="0" />We&#8217;re pleased to announce that we just released a new upgrade to Kirix Strata, version 4.3! Kudos to our developers for adding a lot of nice features and bug fixes.  The full list of notes to this release is below the jump, but here are a few of the bigger changes:</p>
<h3>External Database Connectivity</h3>
<p>We&#8217;ve really improved the way that Strata works with external databases by optimizing our pass-through queries for databases like Oracle, SQL Server and MySQL.  In addition, queries in the query builder that reference external database tables pass the query through to the external database, significantly increasing the speed of queries on external databases.  Furthermore, you can now edit individual cells in Strata and have them update in your external database table.  This is very welcome news to folks that want to use Strata as a front-end to their external database tables.</p>
<p><em>UPDATE (04/30/2009):  Just a quick note of clarification: on the &#8220;read&#8221; side, you can work with external databases for things like sorting, filtering, marks, calculated fields, grouping and copying.  On the &#8220;write&#8221; side, we currently only have cell editing available, but will work on adding other features in the future such as append (i.e., insert record), delete, update, and some modify structure operations.  If you need these additional &#8220;write&#8221; features, please <a href="http://www.kirix.com/contact-us.html" title="Kirix Contact Us">send us a note</a> to let us know how you would plan on using them to help us prioritize our development efforts. Thanks!</em></p>
<h3>Improved SQL Support</h3>
<p>We added a console panel to allow direct querying of internal and external databases with SQL commands, as well as to provide feedback for database operations and scripts.  <a href="http://www.kirix.com/help/docs/using_structured_query_language_sql.htm" title="Console and SQL">You can learn more here.</a></p>
<h3>EBCDIC Conversion</h3>
<p>Strata now handles EBCDIC.  We haven&#8217;t added copybook support just yet, but you can either manually set your breaks using the text-import or create scripts to convert the EBCDIC file to ASCII format.  <a href="http://www.kirix.com/help/docs/using_the_source_view.htm" title="EBCDIC Conversion">You can learn more here.</a></p>
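<p>For the scripting route mentioned above, here is a minimal sketch of the general idea in Python rather than in Strata&#8217;s own scripting language.  It assumes the file uses EBCDIC code page 037 and plain fixed-length text records; the helper name, record length and sample data are all hypothetical:</p>
<pre>
```python
# Assumption: the data uses EBCDIC code page 037 (US/Canada). Other
# systems may use a different page, such as cp500.
EBCDIC_CODEC = "cp037"

def ebcdic_to_text(raw: bytes, record_length: int) -> list[str]:
    """Decode fixed-length EBCDIC records into a list of text lines."""
    records = []
    for start in range(0, len(raw), record_length):
        chunk = raw[start:start + record_length]
        records.append(chunk.decode(EBCDIC_CODEC))
    return records

# Build a tiny two-record sample in memory instead of reading a real file.
sample = "ACME      00042WIDGETS   00007".encode(EBCDIC_CODEC)
lines = ebcdic_to_text(sample, record_length=15)
print(lines)
```
</pre>
<p>Note that this only covers character data.  Binary or packed-decimal fields can&#8217;t be converted with a simple character decode and need field-by-field handling.</p>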
<h3>Fixed Length and Delimited Table Export</h3>
<p>We&#8217;ve also added Fixed-length export (this also works when using File &gt; Save As External).  In addition, we&#8217;ve expanded the text-delimited export so that you can specify your own delimiters, such as pipe-delimited and semi-colon delimited. You can learn more about the new <a href="http://www.kirix.com/help/docs/exporting_data_sets.htm" title="text-delimited export">text-delimited functionality</a> here.</p>
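<p>To make the two export formats concrete, here is a small Python sketch of what pipe-delimited and fixed-length output look like.  It only illustrates the file layouts, not how Strata produces them; the rows and column widths are made up:</p>
<pre>
```python
import csv
import io

# Hypothetical rows standing in for an exported table.
rows = [
    ["id", "name", "city"],
    ["1", "Ada", "Chicago"],
    ["2", "Grace", "New York"],
]

# Text-delimited export with a custom delimiter (a pipe here).
buffer = io.StringIO()
writer = csv.writer(buffer, delimiter="|", lineterminator="\n")
writer.writerows(rows)
pipe_text = buffer.getvalue()

# Fixed-length export: pad each field to an assumed column width.
widths = [4, 8, 10]
fixed_lines = [
    "".join(value.ljust(width) for value, width in zip(row, widths))
    for row in rows
]

print(pipe_text)
print("\n".join(fixed_lines))
```
</pre>
<p>Swapping the delimiter argument for a semi-colon gives the semi-colon delimited variant.</p>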
<h3>Handling Tablenames &amp; Fieldnames with Spaces</h3>
<p>One of our most common support questions relates to spaces in a fieldname (like &#8220;my field&#8221; instead of &#8220;my_field&#8221;).  We&#8217;ve now solved this issue by allowing spaces to be used by enclosing the name in brackets.  So, for example, these are now all valid expressions:</p>
<pre>
[Field  1] * [Field  2]
Field1 * [Field  2]
[Table 1].[Field 1]*[Table 2].Field2</pre>
<p><a href="http://www.kirix.com/help/docs/rules_for_creating_formulas.htm" title="Rules for Formulas">You can learn more here.</a></p>
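<p>If you generate expressions like these from your own tools, the naming rule can be sketched as a pattern: a name is either wrapped in brackets (and may then contain spaces) or is a bare identifier.  The Python below is only an illustration of that rule, not Strata&#8217;s actual parser, and the pattern and function name are hypothetical:</p>
<pre>
```python
import re

# Matches either a bracketed name such as [Field 1] or a bare
# identifier such as Field2. This pattern is an illustration only.
NAME_PATTERN = re.compile(r"\[([^\]]+)\]|([A-Za-z_][A-Za-z0-9_]*)")

def referenced_names(expression: str) -> list[str]:
    """Return the table and field names that appear in an expression."""
    names = []
    for bracketed, bare in NAME_PATTERN.findall(expression):
        # Exactly one of the two groups is non-empty for each match.
        names.append(bracketed or bare)
    return names

print(referenced_names("[Table 1].[Field 1]*[Table 2].Field2"))
```
</pre>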
<h3>Much Much More&#8230;</h3>
<p>There are plenty of other upgrades like project handling, new keyboard shortcuts, auto-fill group and sort dialogs, new script classes, etc.  You can check out all the changes below the jump.</p>
<p>Please <a href="http://www.kirix.com/download.html" title="Kirix Strata Download">download the latest Strata</a> (or just click &#8220;Check for Updates&#8221; in the Help menu), give it a whirl and let us know what you think!</p>
<p> <a href="http://www.kirix.com/blog/2009/04/28/announcing-kirix-strata-43/#more-118" class="more-link">(more&#8230;)</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.kirix.com/blog/2009/04/28/announcing-kirix-strata-43/feed/</wfw:commentRss>
		<feedburner:origLink>http://www.kirix.com/blog/2009/04/28/announcing-kirix-strata-43/</feedburner:origLink></item>
		<item>
		<title>Data.gov</title>
		<link>http://feeds.kirix.com/~r/dataandtheweb/~3/PX6MJtF0TGY/</link>
		<comments>http://www.kirix.com/blog/2009/03/05/datagov/#comments</comments>
		<pubDate>Fri, 06 Mar 2009 00:46:00 +0000</pubDate>
		<dc:creator>Ken Kaczmarek</dc:creator>
		
		<category><![CDATA[data mining]]></category>

		<category><![CDATA[data repositories]]></category>

		<category><![CDATA[government]]></category>

		<guid isPermaLink="false">http://www.kirix.com/blog/2009/03/05/datagov/</guid>
		<description><![CDATA[We recently posted an article about Vivek Kundra, who was named United States CIO this morning by the Obama administration.  He&#8217;s got $71 billion in IT spending under his care.  Hmm, that&#8217;s a lot of data browsers.
One interesting tidbit appeared in this Saul Hansell NY Times article:
Another initiative will be to create a [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.whitehouse.gov/omb/" title="OMB Seal"><img src="http://www.kirix.com/blog/files/2009/03/omb_small.png" alt="OMB Seal" align="left" border="0" /></a>We recently <a href="http://www.kirix.com/blog/2009/02/06/more-government-data-coming-to-a-browser-near-you/" title="Government Data Coming to You">posted an article about Vivek Kundra</a>, who was named United States CIO this morning by the Obama administration.  He&#8217;s got $71 billion in IT spending under his care.  Hmm, that&#8217;s a lot of <a href="http://www.kirix.com/" title="Kirix Strata, the data browser">data browsers</a>.</p>
<p>One interesting tidbit appeared in this <a href="http://bits.blogs.nytimes.com/2009/03/05/the-nations-new-chief-information-officer-speaks/" title="New York Times - Bits">Saul Hansell NY Times article</a>:</p>
<blockquote><p>Another initiative will be to create a new site, <strong>Data.gov</strong>, that will become a repository for all the information the government collects. He pointed to the benefits that have already come from publishing the data from the Human Genome Project by the National Institutes of Health, as well as the information from military satellites that is now used in GPS navigation devices.</p>
<p>“There is a lot of data the federal government has and we need to make sure that all the data that is not private, or restricted for national security reasons, can be made public,” [Kundra] said.</p></blockquote>
<p>In another bit of interesting news, Jonathan Stein at <a href="http://www.motherjones.com/politics/2009/03/congressional-data-mining-coming-soon" title="Mother Jones - Data Mining">Mother Jones</a> notes that <a href="http://honda.house.gov/" title="Mike Honda - California">Mike Honda (D-Calif)</a> added a provision to the recent appropriations bill that requires government entities to make their public data available in raw form:</p>
<blockquote><p>If the Senate passes the bill with the provision intact, citizens seeking information about Congress&#8217; activities—such as bill names and numbers, amendments, votes, and committee reports—won&#8217;t have to rely on government websites, which often filter information, are incomplete, or are difficult to use. Instead, the underlying data will be available to anyone who wants to build a superior site or tool to sift through it. &#8220;The language is groundbreaking in that it supports providing unfiltered legislative information to the public,&#8221; says Honda&#8217;s online communications director, Rob Pierson. &#8220;Instead of silo-ing the information, and only allowing access through a limited web form, access to the raw data will make it easier for people to learn what their government is doing.&#8221;</p></blockquote>
<p><a href="http://blog.wired.com/27bstroke6/2009/03/federal-bill-wo.html" title="Wired - API for Federal Legislation">Kim Zetter from Wired has more on the story here</a>.</p>
<p>Maybe once the data is made more accessible, some clever folks can put an interface on things that improves the complex aftermath of the &#8220;<a href="http://www.quotationspage.com/quote/27759.html" title="Laws and Sausages">laws and sausages</a>&#8221; routine.  I did my best to search for Honda&#8217;s three-sentence provision in the <a href="http://thomas.loc.gov/cgi-bin/query/z?c111:H.R.1105:" title="2009 Omnibus bill">latest omnibus bill</a> with no luck.  Anyone know what the actual provision stated? <em>[UPDATE:  Rob Pierson, Online Communications Director of Congressman Honda&#8217;s office, provided a link to an <a href="http://radar.oreilly.com/2009/03/bulk-data-downloads-government-transparency-breakthrough.html" title="O'Reilly - Government transparency">O&#8217;Reilly post</a> with the full text of the provision.  <a href="http://radar.oreilly.com/2009/03/bulk-data-downloads-government-transparency-breakthrough.html" title="O'Reilly - Government Transparency">Give the full article a read</a> &#8212; it&#8217;s quite worthwhile.]</em></p>
<p>And, for posterity, here are some of the data repositories mentioned in the articles above:</p>
<ul>
<li><a href="http://www.opencongress.org/" title="Open Congress">Open Congress</a></li>
<li><a href="http://www.govtrack.us/" title="GovTrack">GovTrack</a></li>
<li><a href="http://www.legistorm.com/" title="LegiStorm">LegiStorm</a></li>
<li><a href="http://www.maplight.org/" title="MAPLight">MAPLight</a></li>
<li><a href="http://thomas.loc.gov/" title="Thomas">Thomas</a></li>
<li><a href="http://www.opensecrets.org/" title="Open Secrets">OpenSecrets</a></li>
<li><a href="http://www.mvzilla.com/" title="MVzilla">MVzilla</a></li>
<li><a href="http://www.gpoaccess.gov/crecord/" title="US Government Printing Office">Government Printing Office</a></li>
<li><a href="http://www.theopenhouseproject.com/the-open-house-project-report/3-legislation-database/" title="Open House Project">The OpenHouse Project</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.kirix.com/blog/2009/03/05/datagov/feed/</wfw:commentRss>
		<feedburner:origLink>http://www.kirix.com/blog/2009/03/05/datagov/</feedburner:origLink></item>
		<item>
		<title>AWS Public Data Sets Continues to Expand</title>
		<link>http://feeds.kirix.com/~r/dataandtheweb/~3/Eg2MZClH8KU/</link>
		<comments>http://www.kirix.com/blog/2009/02/25/aws-public-data-sets-continues-to-expand/#comments</comments>
		<pubDate>Wed, 25 Feb 2009 18:02:52 +0000</pubDate>
		<dc:creator>Ken Kaczmarek</dc:creator>
		
		<category><![CDATA[data analysis]]></category>

		<category><![CDATA[data mining]]></category>

		<category><![CDATA[data repositories]]></category>

		<guid isPermaLink="false">http://www.kirix.com/blog/2009/02/25/aws-public-data-sets-continues-to-expand/</guid>
		<description><![CDATA[Previously, we posted some information on Amazon&#8217;s foray into making huge public data sets available to users of their web services.  Yesterday they announced some very sizable additions:

US Bureau of Transportation Statistics
DBpedia Knowledge Base (67 GB)
Freebase Data Dump (66 GB)
Genbank Genetic Sequence Database (250 GB)

If you use AWS, the announcement provides more info [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://aws.typepad.com/aws/2009/02/new-aws-public-data-sets-economics-dbpedia-freebase-and-wikipedia.html" title="AWS Public Data Sets Screenshot"><img src="http://www.kirix.com/blog/files/2009/02/aws_screenshot.png" alt="AWS Public Data Sets Screenshot" align="right" border="0" /></a>Previously, we posted some information on <a href="http://www.kirix.com/blog/2008/12/04/amazon-gets-into-the-public-data-sets-game/" title="Amazon AWS and public data sets">Amazon&#8217;s foray into making huge public data sets available</a> to users of their web services.  Yesterday they announced some very sizable additions:</p>
<ul>
<li><a href="http://www.bts.gov/" title="Bureau of Transportation Statistics">US Bureau of Transportation Statistics</a></li>
<li><a href="http://dbpedia.org/About" title="DBpedia">DBpedia</a> Knowledge Base (67 GB)</li>
<li><a href="http://download.freebase.com/datadumps/" title="Freebase">Freebase</a> Data Dump (66 GB)</li>
<li><a href="http://www.ncbi.nlm.nih.gov/Genbank/index.html" title="GenBank">Genbank</a> Genetic Sequence Database (250 GB)</li>
</ul>
<p>If you use AWS, <a href="http://aws.typepad.com/aws/2009/02/new-aws-public-data-sets-economics-dbpedia-freebase-and-wikipedia.html" title="AWS more datasets">the announcement provides more info on these datasets</a> as well as how to access them.  If you don&#8217;t use AWS, you can still access much of this data directly from the websites linked above.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.kirix.com/blog/2009/02/25/aws-public-data-sets-continues-to-expand/feed/</wfw:commentRss>
		<feedburner:origLink>http://www.kirix.com/blog/2009/02/25/aws-public-data-sets-continues-to-expand/</feedburner:origLink></item>
		<item>
		<title>Free E-Gov Conference (via webcast) on February 17, 2009</title>
		<link>http://feeds.kirix.com/~r/dataandtheweb/~3/z7HoyIigXuI/</link>
		<comments>http://www.kirix.com/blog/2009/02/11/free-e-gov-conference-via-webcast-on-february-17-2009/#comments</comments>
		<pubDate>Wed, 11 Feb 2009 21:41:32 +0000</pubDate>
		<dc:creator>Ken Kaczmarek</dc:creator>
		
		<category><![CDATA[government]]></category>

		<category><![CDATA[news/announcements]]></category>

		<category><![CDATA[semantic web]]></category>

		<guid isPermaLink="false">http://www.kirix.com/blog/2009/02/11/free-e-gov-conference-via-webcast-on-february-17-2009/</guid>
		<description><![CDATA[As a follow up to my previous post on e-government, just wanted to let those who are interested know that there&#8217;s a free conference offered next week that will get much more in-depth about the initiatives for changing the way government uses and disburses information.  The conference will also have a particular emphasis on using [...]]]></description>
			<content:encoded><![CDATA[<p>As a follow-up to <a href="http://www.kirix.com/blog/2009/02/06/more-government-data-coming-to-a-browser-near-you/" title="More government data coming your way">my previous post on e-government</a>, I just wanted to let those who are interested know that there&#8217;s a <a href="http://semanticommunity.wik.is/Semantic_Community-Semantic_Exchange_February_17%2c_2009?utm_source=streamsend&amp;utm_medium=email&amp;utm_content=2807382&amp;utm_campaign=From%20E-Gov%20to%20Connected%20Governance%3A%20the%20Role%20of%20Cloud%20Computing%2C%20Web%202.0%20and%20Web%203.0%20Semantic%20Technologies" title="e-gov conference">free conference</a> offered next week that will go into much more depth on the initiatives for changing the way government uses and distributes information.  The conference will also have a particular emphasis on using semantic technologies.</p>
<p>Here are the details:</p>
<blockquote><p><strong>From E-Gov to Connected Governance: the Role of Cloud Computing, Web 2.0 and Web 3.0 Semantic Technologies</strong></p>
<p>Tuesday, February 17, 2009.</p>
<p>Morning session: 8:30 am EST to 12:00 noon. Afternoon session: 1:00 pm EST to 4:00 pm EST.</p>
<p>Synopsis:  &#8220;We have a new administration that values transparency, citizen participation, collaboration, information sharing, and internet technology&#8230; The purpose of this conference is to operationalize this vision, demonstrate the kinds of changes that are coming to next stage web-based systems in government, and to map the role of  information and communication technologies (specifically, cloud computing, Web 2.0, and Web 3.0 semantic technologies) in the evolution of government information systems from e-gov (silos with web front ends) to connected governance (e.g. distributed social computing environments for collaborative work, information sharing, knowledge management, and participatory decision-making.)&#8221;</p>
<p><a href="http://project10x.com/dispatch.php?task=webin&amp;promo=sx2009wx01" title="e-gov webcast sign up">Webcast sign-up here</a> (or, if you are in the Washington, DC area, you could attend in person)</p>
<p>Further information about the conference <a href="http://semanticommunity.wik.is/Semantic_Community-Semantic_Exchange_February_17%2c_2009?utm_source=streamsend&amp;utm_medium=email&amp;utm_content=2807382&amp;utm_campaign=From%20E-Gov%20to%20Connected%20Governance%3A%20the%20Role%20of%20Cloud%20Computing%2C%20Web%202.0%20and%20Web%203.0%20Semantic%20Technologies" title="e-gov conference home">can be found here</a>.</p></blockquote>
]]></content:encoded>
			<wfw:commentRss>http://www.kirix.com/blog/2009/02/11/free-e-gov-conference-via-webcast-on-february-17-2009/feed/</wfw:commentRss>
		<feedburner:origLink>http://www.kirix.com/blog/2009/02/11/free-e-gov-conference-via-webcast-on-february-17-2009/</feedburner:origLink></item>
		<item>
		<title>More Government Data Coming to a Browser Near You…</title>
		<link>http://feeds.kirix.com/~r/dataandtheweb/~3/50e-ZPuML7s/</link>
		<comments>http://www.kirix.com/blog/2009/02/06/more-government-data-coming-to-a-browser-near-you/#comments</comments>
		<pubDate>Fri, 06 Feb 2009 21:14:02 +0000</pubDate>
		<dc:creator>Ken Kaczmarek</dc:creator>
		
		<category><![CDATA[data repositories]]></category>

		<category><![CDATA[government]]></category>

		<category><![CDATA[mashups]]></category>

		<guid isPermaLink="false">http://www.kirix.com/blog/2009/02/06/more-government-data-coming-to-a-browser-near-you/</guid>
		<description><![CDATA[It was intriguing to see how all this newfangled Web 2.0 technology was applied during the US presidential campaign this past year (organization, multimedia, etc.).  It&#8217;s also quite interesting to hear about some of the big ideas for how the new administration wants to change how government works.  And, not to be outdone, the opposition [...]]]></description>
			<content:encoded><![CDATA[<p><img src="http://www.kirix.com/blog/files/2009/02/file_catalog.png" alt="File Catalog" align="left" border="0" />It was intriguing to see how all this newfangled Web 2.0 technology was applied during the US presidential campaign this past year (organization, multimedia, etc.).  It&#8217;s also quite interesting to hear about <a href="http://www.techpresident.com/blog/entry/33593/tigr_s_moment_in_the_sunlight_noveck_kundra_mclaughlin_explain_how_obama_transition_is_using_tech_to_innovate" title="TIGR">some of the big ideas</a> for how the new administration wants to change how government works.  And, not to be outdone, <a href="http://www.programmableweb.com/api/gop.gov" title="GOP API">the opposition party is also getting into the Web 2.0 game</a>.</p>
<p>According to Nextgov, it appears that <a href="http://en.wikipedia.org/wiki/Vivek_Kundra" title="Vivek Kundra">Vivek Kundra</a>, current CTO of the District of Columbia, <a href="http://www.nextgov.com/nextgov/ng_20090204_5457.php" title="Vivek Kundra">is going to be given the nod as the next e-government liaison</a>.  From the article:</p>
<blockquote><p>Kundra also is a strong proponent of giving the public access to government data. &#8220;Why does the government keep information secret?&#8221; he rhetorically asked during an interview with Nextgov. &#8220;Why not put it all out in the government domain?&#8221; &#8220;[Since arriving in Washington], I&#8217;ve made all the government databases public. Every 311 call, every abandoned automobile, who has responded, etc. It provides high-level oversight of the daily tasks of government.&#8221;</p></blockquote>
<p>A more in-depth bio of Kundra can be found in <a href="http://www.washingtonpost.com/wp-dyn/content/article/2009/01/04/AR2009010401235.html" title="Vivek Kundra">this recent Washington Post article</a>.  A couple of the more intriguing things that he promoted in the District of Columbia were the <a href="http://data.octo.dc.gov/" title="DC Data Catalog">DC Data Catalog</a> and &#8220;<a href="http://www.appsfordemocracy.org/" title="Apps for Democracy Contest">Apps for Democracy</a>.&#8221;</p>
<p>The data catalog covers all kinds of DC data, from crime statistics to &#8212; ahem &#8212; the most recent roadkill pickups.  It&#8217;s also available in a wide variety of formats. &#8220;Apps for Democracy&#8221; was a mashup contest to see what kinds of apps could be developed to improve DC residents&#8217; access to data.  It was highly successful, producing 47 applications for a fraction of the cost of formally contracting out these projects.</p>
<p>Of course, changing a system as huge and bureaucratic as the federal government will not happen overnight, but it is encouraging to see more of a focus on making data available in a timely manner (and in usable formats).</p>
<p>For those interested in this sort of thing, I&#8217;d also recommend checking out the <a href="http://www.sunlightfoundation.com/" title="Sunlight Foundation">Sunlight Foundation</a>, which is focused on government transparency.  Also, <a href="http://www.techpresident.com/" title="TechPresident">TechPresident</a> and <a href="http://www.nextgov.com" title="NextGov">Nextgov</a> are both news sources focused on following all things e-gov.</p>
<p>Got any other interesting links on this topic?  Please feel free to post &#8217;em in the comments below.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.kirix.com/blog/2009/02/06/more-government-data-coming-to-a-browser-near-you/feed/</wfw:commentRss>
		<feedburner:origLink>http://www.kirix.com/blog/2009/02/06/more-government-data-coming-to-a-browser-near-you/</feedburner:origLink></item>
		<item>
		<title>Cooking the (Quick)Books</title>
		<link>http://feeds.kirix.com/~r/dataandtheweb/~3/inFjvWZd6nQ/</link>
		<comments>http://www.kirix.com/blog/2009/01/14/cooking-the-quickbooks/#comments</comments>
		<pubDate>Wed, 14 Jan 2009 23:05:35 +0000</pubDate>
		<dc:creator>Ken Kaczmarek</dc:creator>
		
		<category><![CDATA[data analysis]]></category>

		<category><![CDATA[dirty data]]></category>

		<category><![CDATA[spreadsheets]]></category>

		<guid isPermaLink="false">http://www.kirix.com/blog/2009/01/14/cooking-the-quickbooks/</guid>
		<description><![CDATA[Ah, tax season&#8230; could there be a more thrilling time of the year?
So, today I was reviewing a sales &#38; use tax form for the State of Illinois.  Since our governor really isn&#8217;t helping matters in our state these days, we felt the least we could do to help was to make sure to [...]]]></description>
			<content:encoded><![CDATA[<p><img src="http://www.kirix.com/blog/files/2009/01/illinois_st1.png" alt="Illinois ST-1 Image" align="left" />Ah, tax season&#8230; could there be a more thrilling time of the year?</p>
<p>So, today I was reviewing a <a href="http://www.revenue.state.il.us/" title="Illinois Department of Revenue">sales &amp; use tax form for the State of Illinois</a>.  Since <a href="http://en.wikipedia.org/wiki/Rod_Blagojevich" title="Wikipedia - Rod Blagojevich">our governor</a> really isn&#8217;t helping matters in our state these days, we felt the least we could do to help was to make sure to pay our taxes on time.</p>
<p>So, I was looking at our sales tax report in <a href="http://en.wikipedia.org/wiki/Quickbooks" title="Wikipedia - Quickbooks">QuickBooks</a> and, like a good accountant, quickly checked that its total matched the total revenues in the income statement. The two didn&#8217;t match.</p>
<p>Hmm&#8230; funny thing about accounting, things really ought to balance.</p>
<p>It was a small discrepancy, but after searching unsuccessfully for the difference, it was clear that the issue involved more than one transaction. And, unfortunately, there were just far too many transactions to try and come up with a solution manually.</p>
<p>So, since I happened to have this <a href="http://www.kirix.com/" title="Kirix Strata Home">data browser</a> lying around, I exported both reports as CSV files and opened them up in Kirix Strata™.</p>
<p>The QuickBooks CSVs were obviously meant for spreadsheet export (as they included subtotals and odd record breaks), so I quickly did some cleanup and then did a few gymnastics to compare the tables. Turns out there were a few manual journal entries that weren&#8217;t mapped to the sales tax codes required by QuickBooks. And here I was hoping to blame QuickBooks&#8230; oh well.</p>
<p>Running through this process was a 5-minute affair, but it made me wonder about all these other small data manipulation tasks that are out there. There have got to be millions, nay, billions, of these things &#8212; 5-minute, one-off, ad hoc data tasks that just can&#8217;t easily be solved with a spreadsheet alone (in this case, grouping or relationships were needed to do this quickly).</p>
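<p>For anyone who would rather script this kind of check, here is a rough sketch in Python of the same idea: strip the subtotal rows, group amounts by transaction, and compare the two reports. It is only an illustration, not what Strata does internally. The column names (<code>txn_id</code>, <code>amount</code>, <code>tax_code</code>), the helper functions, and the sample rows are all made up, and a real QuickBooks export will be laid out differently.</p>

```python
import csv
import io
from decimal import Decimal
from collections import defaultdict

# Hypothetical exports; a real QuickBooks CSV will have different columns.
INCOME_CSV = """txn_id,amount
1001,100.00
1002,250.00

Subtotal,350.00
1003,40.00
JE-7,15.50
Total,405.50
"""

SALES_TAX_CSV = """txn_id,amount,tax_code
1001,100.00,IL
1002,250.00,IL
1003,40.00,IL
Total,390.00,
"""

ZERO = Decimal("0")


def totals_by_txn(text):
    """Drop blank and subtotal/total rows, then sum amounts per transaction."""
    totals = defaultdict(Decimal)
    for row in csv.DictReader(io.StringIO(text)):
        txn = (row.get("txn_id") or "").strip()
        if not txn or txn.lower() in ("subtotal", "total"):
            continue  # report decoration, not a transaction
        totals[txn] += Decimal(row["amount"].strip())
    return dict(totals)


def unmatched(income, sales_tax):
    """Return each transaction whose amount differs between the two reports."""
    diffs = {}
    for txn in sorted(set(income) | set(sales_tax)):
        gap = income.get(txn, ZERO) - sales_tax.get(txn, ZERO)
        if gap != 0:
            diffs[txn] = gap
    return diffs


income = totals_by_txn(INCOME_CSV)
sales_tax = totals_by_txn(SALES_TAX_CSV)
print(unmatched(income, sales_tax))
```

<p>With the made-up rows above, the only entry flagged is the journal entry <code>JE-7</code>, which appears in the income report but not in the sales tax report. Amounts are kept as <code>Decimal</code> rather than floats so that cents compare exactly.</p>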
<p>What do people normally do in these situations? I fear that they probably spend hours working the problem manually. Got a similar story and/or solution? Feel free to share in the comments section below.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.kirix.com/blog/2009/01/14/cooking-the-quickbooks/feed/</wfw:commentRss>
		<feedburner:origLink>http://www.kirix.com/blog/2009/01/14/cooking-the-quickbooks/</feedburner:origLink></item>
	</channel>
</rss>
