Crowd Documentation: How Programmer Social Communities are Flipping Software Development
Chris Parnin, PhD Candidate @ GT
@chrisparnin, ninlabs.com, checkbox.io
About Me My “other thesis”
Types of Documentation
• Software documentation is created by a few and read by few.
• API documentation is created by a few and read by many.
API Documentation is Big Business
Is costly to create http://thirdblogfromthesun.com/2010/09/how-big-is-themsdn-library/
Help win platform wars
“It sucks, the documentation sucks, the examples are ridiculous. This is part of the reason there are so many junk apps on android as well. iOS stresses interface and user experience from the first time you open the dev portal, android... do they ever even mention it? Not that I have seen. Its all obscure code stuff and external libraries created by who knows who etc.”
http://forum.unity3d.com/threads/104567-iOS-vs-Androidfrom-a-dev-perspective/page2
APIs as the new business model
• Stripe
• Coinbase
• Twilio

// require the Twilio module and create a REST client
var client = require('twilio')('ACCOUNT_SID', 'AUTH_TOKEN');

// place a call; the callback receives an error or the response data
client.makeCall({
    to: '+16515556677',
    from: '+14506667788',
    url: 'http://www.example.com/twiml.php'
}, function(err, responseData) {
    if (err) {
        // responseData is not usable when the request fails
        console.error(err);
        return;
    }
    console.log(responseData.from); // outputs "+14506667788"
});
Traditional Forms 4000 pages of documentation
What are the sources of dev knowledge now?
Unofficial Microsoft Survey
When learning about an API, % of respondents who use each resource “Often”:
• 73.5% (2,224): [resource label not preserved in the transcript]
• 42.5% (1,289): Code completion (IntelliSense)
• 40.1% (1,212): Microsoft’s Official Documentation
Searching for Documentation
Examine search results for jQuery
[Figure: search results labeled by source type: unofficial doc, Stack Overflow, blog, ...]
1730 search results
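Labeling search results by source could be sketched roughly as follows. This is only an illustration, not the method used in the study: the function names, the bucket names, and the hostname rules are all assumptions, and a real analysis would likely need manual review of results that fall outside these simple rules.

```javascript
// Hypothetical classifier: buckets one search-result URL by its hostname.
// The hostname rules below are illustrative assumptions, not the study's coding scheme.
function classifyResult(url, officialHosts) {
  const host = new URL(url).hostname.replace(/^www\./, '');
  if (officialHosts.includes(host)) return 'official doc';
  if (host === 'stackoverflow.com') return 'Stack Overflow';
  if (/(^|\.)blog\.|blogspot\.com$|wordpress\.com$/.test(host)) return 'blog';
  return 'other';
}

// Counts how many results fall into each bucket.
function tally(urls, officialHosts) {
  const counts = {};
  for (const url of urls) {
    const kind = classifyResult(url, officialHosts);
    counts[kind] = (counts[kind] || 0) + 1;
  }
  return counts;
}

// Made-up result list for demonstration only.
const sample = [
  'https://api.jquery.com/click/',
  'https://stackoverflow.com/questions/1/example',
  'https://stackoverflow.com/questions/2/example',
  'https://someone.blogspot.com/2012/01/jquery-tips.html',
  'https://www.example.org/jquery-tutorial'
];
console.log(tally(sample, ['api.jquery.com']));
```

Passing the list of official hosts as a parameter keeps the sketch reusable across APIs, since each API has its own documentation site.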
Crowd Documentation Knowledge is created and curated by a mostly uncoordinated collective.
What sources do devs actually use?
A day of a dev
...1,316 days of developer browser history
Typical dev
[Figure: browsing-flow diagrams labeled “Typical Flow: C-Q” and “Monitor Flow: A-B”, with transition percentages between sources and “Direct” entry points; the individual values are not recoverable from the transcript]
Consistent with self-reported surveys.
What makes it different?
Twice as many examples can be found on Stack Overflow as in the official documentation guide.
• Developers may be getting as much as 50% of their documentation from Stack Overflow.
• More examples can be found on Stack Overflow than in the official documentation guide.
• In web searches, Stack Overflow questions are visited 2x-10x more often than official documentation.
User participation
[Figure: Percent of Contributions (0% to 100%) vs. User Percentile (0 to 100); plot labels: Askers, Advisors, GWT, Android, Java]
60% of questions answered by 5% of users.
• Diminishing game mechanisms
• 2 years to reach 80% coverage
• Failure of online communities
http://michael.richter.name/blogs/why-i-no-longercontribute-to-stackoverflow/
Encouraging User Behaviour with Achievements: An Empirical Study, Scott Grant and Buddy Betts
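A concentration figure such as “60% of questions answered by 5% of users” can be computed from per-user answer counts. The sketch below shows one way to do that; the function name `topShare` and the sample data are made up for illustration and are not taken from the study.

```javascript
// Share of all answers contributed by the top `fraction` of users.
// `answersPerUser` is a hypothetical array of per-user answer counts.
function topShare(answersPerUser, fraction) {
  const sorted = [...answersPerUser].sort((a, b) => b - a); // most active first
  const total = sorted.reduce((sum, n) => sum + n, 0);
  if (total === 0) return 0;
  const k = Math.ceil(sorted.length * fraction); // number of users in the top group
  const top = sorted.slice(0, k).reduce((sum, n) => sum + n, 0);
  return top / total;
}

// Made-up data: 20 users, one of whom wrote 60 of the 100 answers.
const counts = [60, 5, 5, 4, 4, 3, 3, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 0, 0];
console.log(topShare(counts, 0.05)); // prints 0.6
```

Evaluating `topShare` at many fractions would yield a cumulative curve of contributions against user percentile.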
What did developers think?
But then...everything you thought you knew was wrong
“DLLs are tossed over the fence to English majors”
“Documentation isn’t meant to be read by everyone, it’s meant to be authoritative”
“We sometimes take a break from documentation and try building something for a month or two. Then we realize how bad it is”
“We contact the top 100 Stack Overflow contributors”
“You don’t understand how much this has impacted us. No, really you don’t.”
Stakeholder Concerns
• Will we have a job in 5 years?
• Getting feedback from the crowd: sentiment analytics over the frustrating parts, frequent questions, confusing topics, etc.
• What should our voice be?
• How can we "regain" authority over the crowd (e.g. people suggesting unsupported/non-public API methods in Android development)? [*]
* https://code.google.com/p/android/issues/detail?id=62220
Why aren't developers participating?
Bar/coffee interviews
“I don’t blog because I fear being wrong on the internet... I share gists, or fix documentation and then tweet about it. Blogs are disconnected from official sources, like project documentation. I can't imagine someone reading what I write. I would rather associate the content with the documentation.”
“About 2 years ago, I used to answer a lot more. But now, it feels like most questions have already been answered -- it is saturated. That might change when something new comes out, some of the newer stuff we started working on, I might have a chance to answer questions on it.”
Massively distributed software engineering
In the next 50 years [slide annotation: “two decades”], as governments increasingly turn legal policy and services into source code and public APIs, often created in the timespan of a president’s term (e.g. the health care insurance marketplace), we must be prepared to build massively-sized software systems on a regular basis. This will often require the cooperation of many diverse stakeholders.
At our current pace, imagine how these would fare if needed in a few years:
• A government API to calculate taxes on all online purchases for any location.
• A distributed traffic regulation system for a network of self-driving cars and drones.
• ...
Crowd Programming
• We have seen “crowd documentation”; what might the other pieces look like?
Rather than having a rich standard API, JavaScript essentially has a “crowd API” assembled from Stack Overflow snippets and GitHub repositories.
