Public Sector Open Data via Information Sharing and Enterprise Architecture

The title of this article is quite a mouthful: three very complex and broadly scoped disciplines mashed together. But that’s what’s happening all over, isn’t it, driven by consumers and their iPhones: mashing and manipulating information that has managed to leak through the risk-averse, highly regulated mantle of the government’s secure data cocoon, and instantly sharing it for further rendering, visualization or actual, productive use. It is mostly a "pull"-style information flow, at best constrained or abstracted by public sector EA methods and models, and at worst simply denied.

This demand for open data, however, is rapidly exposing both opportunities and challenges within government information-sharing environments behind the firewall, which in turn presents a fantastic opportunity and challenge for Enterprise Architects and Data Management organizations.

The recent "Open Data Policy" compels US Federal agencies to make as much non-sensitive, government-generated data as possible available to the public, via open standards in data structures (both human- and machine-readable), APIs (application programming interfaces) and browser-accessible functions. The public (including commercial entities) can in turn use this data to create new information packages and applications for all kinds of interesting and sometimes critical uses, from monitoring the health of public parks to predicting the arrival of city buses or the failure of city lights.

But there isn’t an "easy" button. Given the highly regulated and tremendously complex nature of integrated, older government systems and their maintenance contracts, significant internal change is very difficult, particularly to meet what amounts to a "suggested" and unfunded mandate (albeit one with long-term ROI) that lacks clear and measurable value objectives.

That doesn’t mean there aren’t whole bunches of citizens and government employees ready, willing and enthusiastic about sharing information and ideas that clearly deliver tangible, touchable public benefit. Witness the recent "Open Data Day DC", a yearly hackathon in the District of Columbia for collaborating on using open data to solve local DC issues, world poverty, and other open government challenges. Simply sharing information in ways that weren’t part of the original systems integration requirements or objectives has become a very popular, and in fact expected, behavior of the more progressive and (by necessity) collaborative agencies, such as the Department of Homeland Security (DHS).

The Information Sharing Environment is the nation’s most prominent and perhaps most active federal information sharing model, though its mission really generates "open data" products for a closed community (vs. the anonymous public), i.e. those who deal with sensitive national security challenges. For information sharing purposes, however, it’s a very successful, well-documented and replicable model for any context that includes multiple government entities and stakeholders (whether one agency or department, or a whole city or state). A pragmatic Information Sharing Environment, with enthusiastic, knowledgeable and authoritative champions, is also the first and most important leg of the stool that supports successful Open Data initiatives.

The second leg is Enterprise Architecture. Thinking of "open data" as the "demand" side of the equation, and "information sharing" as the conduit and source of "authorities" (i.e. policies, rules, governance, roles; internal and external), EA can represent the "supply". That is "represent" the supply, not "be" the supply; the supply itself consists of the actual agency assets, including data, budget, contracts, personnel, etc. EA can inform regarding what data is available where and when, with what constraints, in what format or representations, via which IT interfaces, and via which business or technology resources. What can or needs to be changed, or what will be impacted, for the supply to meet the demand? Perhaps reusable IT already exists that can be fully leveraged to meet the requirements, such as existing Oracle SOA, BPM and WebCenter assets?

The third leg, of course, is the inventory of data assets available. Data assets include not only the raw data, but also the metadata and registries, data access functions and APIs, data models and schemas, and the information technologies and systems that produce, manipulate, manage, protect and store the data. The inventory also covers the really neat, useful commercial and open source open data tools that help, whether they exist already or need to be created.
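To make the idea of an inventory a little more concrete, here is a minimal sketch of what one catalog entry might look like, and how an agency might pick out the assets that are ready to share. Everything in it is an assumption for illustration only: the `DataAsset` fields, the `publishable` helper, the "public"/"restricted" access levels and the sample entries are all made up, and do not follow any particular agency schema or metadata standard.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DataAsset:
    """One entry in a hypothetical data asset inventory."""
    title: str
    access_level: str                                   # "public" or "restricted" (illustrative values)
    formats: List[str] = field(default_factory=list)    # e.g. ["csv", "json"]
    api_url: str = ""                                   # empty when no API exists yet
    source_system: str = ""                             # system that produces the data


def publishable(assets: List[DataAsset]) -> List[DataAsset]:
    """Return assets marked public that have at least one published format."""
    return [a for a in assets if a.access_level == "public" and a.formats]


# Made-up sample inventory, loosely based on the examples mentioned earlier.
inventory = [
    DataAsset("Park inspections", "public", ["csv"], source_system="parks-db"),
    DataAsset("Case files", "restricted", ["pdf"]),
    DataAsset("Streetlight outages", "public", []),     # public, but nothing exposed yet
]

titles = [a.title for a in publishable(inventory)]
print(titles)
```

In this sketch the streetlight data is cleared for release but has no format yet, which is exactly the kind of gap between demand and supply that an inventory should make visible.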

So it conceptually works as follows, very abstractly mirroring the well-known "People, Process, Technology" business model:

People – An information-sharing environment and culture develops, enabling productive dialogue and guidance about proactively or reactively creating "open data" from enterprise assets to share with the public;
Process – An Enterprise Architecture method and framework is leveraged, to define and scope the "art of the possible" in leveraging enterprise data assets, in terms that enable compliant program and engineering planning; and
Information Technology – Useful, standards-based data products are cataloged and exposed to the public (better with some initial prototyping), meeting requirements and expectations, appropriately constrained by law, policy, regulations and investment controls.

Significant open data and open government initiatives can’t succeed and persist without all three perspectives, that is, all three domains of organizational expertise.