Joint DATA.GOV-ONTOLOG "Big Open Data" Session - Thu 2012_05_10    (3ANL)

Session Topic: "Fostering 'Big Open Data' in government through Open Collaboration - invited presentation on NYCFacets and introduction to New York City's 'Open' initiatives"    (3ANM)

Session Co-chair: JeanneHolm (Data.gov / NASA-JPL) & PeterYim (Ontolog / CIM3) - slides    (3ANN)

Panel Briefings from:    (3ANO)

Archives:    (3AVB)

Conference Call Details    (3AVJ)

Attendees    (3AWB)

Abstract:    (3AWR)

Fostering 'Big Open Data' in government through Open Collaboration - invited presentation on NYCFacets and introduction to New York City's 'Open' initiatives    (3AWS)

This is the first of two sessions jointly organized by the US federal data.gov initiative and Ontolog. This follows quite naturally from a few very exciting recent events, notably:    (3AWT)

During today's session, we will look at the NYCfacets app, the New York City open data initiative and contemplate how open collaborative community effort can help foster 'Big Open Data'.    (3AWX)

Agenda:    (3AWY)

Fostering 'Big Open Data' in government through Open Collaboration    (3AWZ)

Proceedings:    (3AX8)

Please refer to the above    (3AX9)

IM Chat Transcript captured during the session:    (3AXA)

 see raw transcript here.    (3AXB)
 (for better clarity, the version below is a re-organized and lightly edited chat-transcript.)
 Participants are welcome to make light edits to their own contributions as they see fit.    (3AXC)
 -- begin in-session chat-transcript --    (3AXD)
	PeterYim: Welcome to the    (3B7E)
	 = Joint DATA.GOV-ONTOLOG "Big Open Data" Session - Thu 2012-05-10 =    (3B7F)
	Session Topic: "Fostering 'Big Open Data' in government through Open Collaboration 
	 - invited presentation on NYCFacets and introduction to New York City's 'Open' initiatives"    (3B7G)
	Session Co-chair: JeanneHolm (Data.gov / NASA-JPL) & PeterYim (Ontolog / CIM3)    (3B7H)
	Panel Briefings:    (3B7I)
	*  ChrisMusialek (Data.gov / GSA) - "Empowering City Developers with Federal Data"    (3B7J)
	* AndrewNicklin (New York City) - "Opening up municipal government data: past, present, and future"    (3B7K)
	* JoelNatividad (Ontodia) - "Smart Cities and Big Open Data"    (3B7L)
	Session page: http://ontolog.cim3.net/cgi-bin/wiki.pl?ConferenceCall_2012_05_10    (3B7M)
	Mute control: *7 to un-mute ... *6 to mute    (3B7N)
	Can't find Skype Dial pad? ... it's under the "Call" dropdown menu as "Show Dial pad"    (3B7O)
	 == Proceedings: ==    (3B7P)
	PeterYim: Attention ALL: because of time constraints from some of our panelists, we will have to 
	start promptly today. ... therefore if you have any logistics questions, please be ready to ask them 
	as soon as you get online, and before we mute everyone!    (3B7Q)
	anonymous morphed into EdDodds    (3B7R)
	anonymous morphed into AndrewNicklin    (3B7S)
	PeterYim: Hi Andrew!    (3B7T)
	anonymous morphed into JoelNatividad    (3B7U)
	PeterYim: Hi Joel, Hi Terry ... and everyone!    (3B7V)
	anonymous morphed into jgabriel    (3B7W)
	jgabriel: Hi everyone!    (3B7X)
	JoelNatividad: Howdy everyone!    (3B7Y)
	anonymous morphed into DeirdreLee    (3B7Z)
	anonymous1 morphed into HasanSayani    (3B80)
	anonymous morphed into JeanneHolm    (3B81)
	anonymous morphed into DavidMason & MarieJeanMeurs    (3B82)
	anonymous1 morphed into SamiBaig    (3B83)
	DavidMason & MarieJeanMeurs: hey all    (3B84)
	JesseWang: Hi, all! Good morning, afternoon, evening!    (3B85)
	SamiBaig: Hi all    (3B86)
	anonymous morphed into chrismusialek    (3B87)
	anonymous morphed into TomTinsley    (3B88)
	anonymous1 morphed into EdDodds    (3B89)
	anonymous1 morphed into sdupd_glenn    (3B8A)
	JackPark: Hi    (3B8B)
	PeterYim: == JeanneHolm presenting the intro slides now ...    (3B8C)
	anonymous morphed into BobLojek    (3B8D)
	PeterYim: == ChrisMusialek presenting ...    (3B8E)
	JeanneHolm: ChrisMusialek is now presenting Empowering City Developers with Federal Data. Slides can 
	be downloaded at the session page (above)    (3B8F)
	JesseWang: what is the search engine used in data.gov? did you develop your own text/query 
	analyzer/parser?    (3B8G)
	JeanneHolm: Jesse--we are using Bing as the search engine as part of USA.gov's search capability. 
	One of the things we all want to improve on Data.gov is the search capability. It's currently 
	limited by both the complexity of the queries you can build, as well as the fact that it only 
	searches the metadata of the data tool or dataset. We are moving toward a federated model that would 
	allow us to search the metadata or other attributes of the tools and data sources that are made 
	accessible.    (3B8H)
	JackRing: Has Data.gov calibrated the false positives and false negatives achieved with keywords?    (3B8I)
	JeanneHolm: Jack--I'll have to check on the false positive and false negative calibration.    (3B8J)
	anonymous morphed into DeniseWarzel    (3B8K)
	DeirdreLee: Latest W3C Editor's draft of Data Catalog Vocabulary (dcat) (managed by W3C Government 
	Linked Data (GLD) WG): http://dvcs.w3.org/hg/gld/raw-file/default/dcat/index.html    (3B8L)
	anonymous1 morphed into MarkDixon    (3B8M)
	AndrewNicklin: Will data.gov look at third-party API key & rate management tools instead of rolling 
	your own?    (3B8N)
	JeanneHolm: Andrew--Data.gov is definitely looking at third-party API tools and any emergent 
	standards in this area. The intent is that we are a connector amongst a set of data of national 
	interest, even beyond just the federal government.    (3B8O)
	PeterYim: == AndrewNicklin presenting ...    (3B8P)
	anonymous1 morphed into ElizabethFlorescu    (3B8Q)
	AndrewNicklin: The site I mentioned for NYC open data standards is http://www.nyc.gov/datastandards    (3B8R)
	PeterYim: == JoelNatividad presenting ...    (3B8S)
	JackRing: What will be the relevance of Ward Cunningham's expedition into federating wikis?    (3B8T)
	JackRing: Jeanne, Pls do clarify the FP-FN results because both are likely to be dismal.    (3B8U)
	JeanneHolm: Jack--completely agree. Just need to verify what's been done.    (3B8V)
	EdDodds: I saw a story about crowdfunding movies this a.m. (passer.by) and it got me thinking: are 
	there any crowdfunded open data efforts anybody has heard of? Any fellowships sponsored by 
	foundations?    (3B8W)
	JeanneHolm: Ed--like a Kickstarter for open data? Interesting...    (3B8X)
	DeborahMcGuinness: I like this idea and would be happy to point our students to such a call. I would 
	also be willing to help sell such a message.    (3B8Y)
	EdDodds: Yes, I've seen a few stories proposing crowdfunding for investigative reporting 
	recently--haven't seen if they were actually successful    (3B8Z)
	JackPark: @Jeanne, re federation: I will be giving a talk at a bigdata meetup with these slides 
	http://www.slideshare.net/jackpark/big-datasciencemeetup-final    (3B90)
	JeanneHolm: Jack--Very interesting presentation. Is the meeting open for others to attend?    (3B91)
	JackRing: Jeanne, when you are ready to escape the limits of key words and rapidly assay data with 
	respect to large, complex, interconnected combinatorial networks I will be happy to offer some 
	insights. Long buried in highly classified systems but now patented and soon to be implemented in a 
	chip equivalent to 3,600 microprocessors on a grid.    (3B92)
	JackPark: @Jeanne, afaik it's sold out but contact me jackpark[at]topicquests.org - 
	http://www.meetup.com/Big-Data-Science/events/51766642/    (3B93)
	JeanneHolm: Thanks Jack.    (3B94)
	JackRing: @JackPark, I think your slides evidence great work. Thank you. Pls consider joining us at 
	the Symposium in July at San Jose, particularly the Sunday workshop. http://isss.org/world/index.php    (3B95)
	JackPark: @JackRing I would love to attend isss but it's just not in the budget; my friend Judith 
	Rosen is giving a tutorial I'd really like to attend. Your workshop, if it's open, I'll try to 
	attend. Many thanks    (3B96)
	JeanneHolm: Are there any questions for the speakers? We are about to go to question and answer...    (3B97)
	PeterYim: when we start, we will ask people to click on their "hand" buttons (lower right) ... and 
	queue folks up for Q&A and remarks ... make sure you test your voices first, and start by telling us 
	who you are.    (3B98)
	DeirdreLee: I have to head out now, but thanks for lots of interesting presentations    (3B99)
	JeanneHolm: Thanks Deirdre!    (3B9A)
	LeoObrst: Thanks all, must leave.    (3B9B)
	PeterYim: == open Q&A and discussion now ...    (3B9C)
	anonymous morphed into PavithraKenjige    (3B9D)
	JeanneHolm: PeterYim: Presentations were fantastic. Congratulations to federal people who started 
	the movement in opening up government data; to NY developers for providing open data; and to Joel 
	and everyone who provided technology in helping Joel's app stand out from the crowd.    (3B9E)
	JeanneHolm: PeterYim: Next week's discussion will focus on the technical details of implementing 
	some of these solutions.    (3B9F)
	JackRing: Is anyone concerned with cybersecurity/privacy?    (3B9G)
	AndrewNicklin: JackRing: there are two aspects to our approach to security. First is not letting out 
	sensitive info (comparatively easy); Second is - potentially - evaluating whether our data, when 
	combined with outside information poses more risks.    (3B9H)
	sdupd_glenn: We'd love to see some case studies of municipal opendata in order to pitch to 
	management the benefits of a public-facing GIS system coupled with ERP data (merged visually with 
	other public data)    (3B9I)
	sdupd_glenn: a lot of our staff understands the potential of all this, but are unable to articulate 
	its benefit to the higher ups who control the purse    (3B9J)
	JeanneHolm: Are challenges a good way to get developers to focus on and consume government data? Are 
	there better ways?    (3B9K)
	JeanneHolm: JoelNatividad responded: The first time we submitted to a challenge was just to do 
	something with our partners. The second time was really to accomplish something. It wasn't about the 
	money, but the recognition and ability to build something useful was what drew us.    (3B9L)
	sdupd_glenn: For the private sector, yes. For public agencies, the challenge is how to 
	incentivize the action of making data public in the first place    (3B9M)
	MikeBennett: I have to go now - thanks for great presentations    (3B9N)
	JeanneHolm: Thanks Mike!    (3B9O)
	EdDodds: It might be that the start up weekend or hackathon model of drawing everyone together 
	geographically for 48 hours (though I much prefer virtual innovation clusters such as Ontolog) might 
	be a tactic, especially if you could find sponsorships from firms who are likely to consume the 
	data, add their own and make a profit.    (3B9P)
	EdDodds: Nonprofits, community foundations, united way types also stand to benefit and could have 
	skin in the game    (3B9Q)
	JoelNatividad: And to Ed's point, that is what we want to do at Ontodia. We want to collaborate Open 
	Data with all kinds of databases, both public and private.    (3B9R)
	JoelNatividad: And do what Bloomberg did for Finance data, and do it for Open Data.    (3B9S)
	JeanneHolm: Ed--The hackathon model is good, but as you point out it's really important to have a 
	business model that brings those ideas to a sustainable service.    (3B9T)
	AndrewNicklin: @JeanneHolm, EdDodds: in terms of sustainability, we've also (informally, 
	unofficially) considered tiering access to our services such that the costs of operating open data 
	platform can be recovered from high-volume commercial users.    (3B9U)
	TerryLongstreth: @JeanneHolm - I agree that sustainability needs to be considered. Moreover, data 
	ages quickly, and there's little in today's talks about maintaining data quality and timeliness    (3B9V)
	JoelNatividad: @TerryLongstreth, in NYCFacets, that's why we derived "extrametadata" to characterize 
	and score each dataset    (3B9W)
	AndrewNicklin: @TerryLongstreth that's why automation is really important.    (3B9X)
	EdDodds: @JeanneHolm, all: strenuously agree. Caveat: just because something *should* be valuable 
	doesn't mean the market has "eyes to see" at the time a product is launched    (3B9Y)
	JoelNatividad: [in our "extrametadata"] we score it along freshness, sparseness, uniqueness, no of 
	downloads, views, etc., and we plan to make the scoring algorithm transparent and not opaque, so 
	publishers can respond; and in the future, we do plan to do time series as well, but not yet.    (3B9Z)
	JoelNatividad: @TerryLongstreth, we're actively tracking the wikidata effort and will sync up with 
	that    (3BA0)
	JoelNatividad: so "facts" and unstructured free form text are separated    (3BA1)
	anonymous1: In regards to competitions concerning opengov-- Chicago recently held a contest to 
	encourage app development and received a total of 60 submissions. Chicago's open data portal 
	stats include:    (3BA2)
	328 datasets; 470,000+ embeds; 1,000+ user views; 50+ apps    (3BA3)
	EdDodds: Toronto's @buzzdata tries to socialize static data sets (streams too maybe?) marketing, 
	news gathering, conferences, higher education all could benefit; but I think until we get a mass of 
	aggregated micropayments for data feeds the challenge to fund will continue    (3BA4)
	JackPark: Great conference. Many thanks to the speakers!    (3BA5)
	KingsleyIdehen: Please upload the slides to slideshare etc.    (3BA6)
	JeanneHolm: @Kingsley the slides are at 
	http://ontolog.cim3.net/cgi-bin/wiki.pl?ConferenceCall_2012_05_10 for today's call.    (3BA7)
	KingsleyIdehen: @JeanneHolm: I've seen the presentations, but just suggesting slideshare for broader 
	audience etc.    (3BA8)
	EdDodds: + 1 Kingsley    (3BA9)
	JoelNatividad: We will upload the nycfacets overview to slideshare as well    (3BAA)
	EdDodds: There may be a few up at http://www.slideshare.net/eddodds/ already    (3BAB)
	JackPark: A thought: having slides up at slideshare means they can be viewed without downloading.    (3BAC)
	sdupd_glenn: slideshare, yes please    (3BAD)
	sdupd_glenn: Can I please suggest that someone or agency put together a series of webinars for 
	agencies that know opendata is crucial but cannot get C-suite or management approval to get an 
	opendata program started in the first place (all of us local municipalities and districts)!    (3BAE)
	PeterYim: thank you all, great session!    (3BAF)
	JoelNatividad: Thanks everyone! Special mention to Peter for all the great work to make this 
	possible!    (3BAG)
	SamiBaig: Thank you all!    (3BAH)
	anonymous morphed into lisa h    (3BAI)
	sdupd_glenn: we local municipalities feel like we get to applaud the state and federal efforts but 
	have no funding or champions to help us get off the ground. We can't participate without your help!    (3BAJ)
	JeanneHolm: @sdupd_glenn I'm happy to help out with providing discussions on the value of open data 
	to cities and municipalities. Part of Cities.Data.gov (coming soon!) will be to do that as well. 
	Feel free to reach out to me at jholm@jpl.nasa.gov    (3BAK)
	sdupd_glenn: JeanneHolm: Great I'd love to discuss with your team. We've been in touch with you 
	already via Barbara Moreno    (3BAL)
	PeterYim: come back, same time next week, when we will cover the technical aspects of the same 
	subject next Thursday (May-17)    (3BAM)
	PeterYim: -- session ended: 11:18am PDT --    (3BAN)
 -- end of in-session chat-transcript --    (3AXE)

... More Questions    (3B2Y)

Additional Resources:    (3AXP)

Audio Recording of this Session    (3AXG)


For the record ...    (3AY0)

How To Join (while the session is in progress)    (3AY1)