Multi-dimensional Exploration of API Usage
Coen De Roover (1), Ralf Lämmel (2), Ekaterina Pek (3)
(1) Software Languages Lab, Vrije Universiteit Brussel, Belgium
(2) Software Languages Team, University of Koblenz-Landau, Germany
(3) ADAPT Lab, University of Koblenz-Landau, Germany
Exploration Story: JHotDraw
what XML APIs are used and how extensively?
API domain cloud
➜ relatively few references to SAX and DOM

[Fig. 8. The API Cocktail of JHotDraw (cloud of API tags). Project jhotdraw: AWT, Swing, java.io, java.lang, java.util, JavaBeans, java.text, java.lang.reflect, DOM, java.net, java.util.regex, Java Print Service, java.util.zip, java.lang.annotation, java.math, java.lang.ref, java.util.concurrent, Java security, javax.imageio, SAX. Package org.jhotdraw.undo: Swing, java.lang, JavaBeans, java.io, AWT, java.util.]
[Fig. 9. Cocktail of domains for JHotDraw. It shows API domains for all of JHotDraw and also for its undo package. Project jhotdraw: GUI, Data, Basics, IO, Format, Component, Meta, XML, Distribution, Parsing, Control, Math, Output, Security, Concurrency. Package org.jhotdraw.undo: GUI, Basics, Component, IO.]
[Fig. 10. API Coupling for JHotDraw's interface org.jhotdraw.app.View.]

Let me start by making the concept of exploring API usage more concrete. Imagine you are a developer tasked with migrating JHotDraw (JH) from XML to JSON for persistence. The first thing you would like to know is which APIs for manipulating XML are used, and how extensively. You could gain these insights through the two tag clouds shown on the slide. The top one contains the domains of the APIs used by JH, the bottom one the actual APIs. The size of a tag corresponds to the number of references to the API or the domain.
So we can conclude that JH does use XML APIs, more concretely DOM and SAX, but not extensively. There are many more references to the AWT and Swing APIs from the GUI programming domain, for instance.
Exploration Story: JHotDraw
what elements of DOM are actually used?
table of referenced API elements (i.e., DOM slice)
➜ footprint of DOM in JHotDraw is but 94 refs to 19 distinct elements

The next insight to gain is whether the project uses the complete DOM API, or just a small subset. Given a table of referenced API elements, the latter seems to be the case. There are only 94 references to 19 distinct types and methods. Even better, no exotic API elements are used.
Exploration Story: JHotDraw
How is DOM usage distributed across JHotDraw?
table of referencing project elements (i.e., JHotDraw slice)
➜ local to 1/13 top-level packages

[Screenshot: slice of JHotDraw with DOM usage, showing the following method.]
    public void applyStylesTo(Element elem) {
        for (CSSRule rule : rules) {
            if (rule.matches(elem)) {
                rule.apply(elem);
            }
        }
    }

All good news so far, but it could still be the case that the API is used all over the project. Luckily, given a table of referencing project elements, the use of DOM is local to 4 classes in the org.jhotdraw.xml package. Our exploration therefore shows that migrating from XML to JSON is feasible.
Exploring API Usage: Quaatlas API Atlas
API metadata
  API: named collection of elements (98)
  API domain: named collection of APIs addressing the same domain (27)
  API facet: named collection of API elements addressing a particular concern

[Paper excerpt on the slide; the start of the passage is cut off.]
[...] of projects or APIs as well as specific packages, types, or methods thereof. For instance, we may be interested in #api for a specific project. Also, we may be interested in #ref for some part of an API. Further, these metrics can be configured to count only specific patterns. It is easy to see now that the given metrics are not even orthogonal because, for example, #derive can be obtained from #ref by only counting patterns for 'extends' and 'implements' relationships.

API Domains: We assume that each API addresses some programming domain such as XML processing or GUI programming. We are not aware of any general, widely adopted attempt to associate APIs with domains, but the idea appears to merit further research. We have begun collecting programming domains (or in fact, API domains) and tagging APIs appropriately. Let us list a few API domains and associate them with well-known Java APIs:
  GUI: GUI programming, e.g., Swing and AWT.
  XML: XML processing, e.g., DOM, JDOM, and SAX.
  Data: Data structures incl. containers, e.g., java.util.
  IO: File- and stream-based I/O, e.g., java.io and java.nio.
  Component: Component-oriented programming, e.g., JavaBeans.
  Meta: Meta-programming incl. reflection, e.g., java.lang.reflect.
  Basics: Basic language support, e.g., java.lang.String.
API domains are helpful in reporting API usage and quantifying API usage of interest in more abstract terms than the names of individual APIs, as will be illustrated in §VI.
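The remark that #derive can be obtained from #ref by counting only 'extends' and 'implements' patterns can be illustrated with a minimal sketch. The class MetricSketch, the Syntax enum, and the sample references are hypothetical names chosen for illustration; they are not taken from the paper or from the tool.

```java
import java.util.Arrays;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of API references; all names here are illustrative only.
class MetricSketch {
    // A few syntactic patterns through which a project can reference an API element.
    enum Syntax { EXTENDS, IMPLEMENTS, METHOD_CALL, FIELD_TYPE }

    static final class Ref {
        final String apiElement;
        final Syntax syntax;

        Ref(String apiElement, Syntax syntax) {
            this.apiElement = apiElement;
            this.syntax = syntax;
        }
    }

    // Counts the references whose syntactic pattern is among the given ones.
    static long countRefs(List<Ref> refs, Set<Syntax> patterns) {
        return refs.stream().filter(r -> patterns.contains(r.syntax)).count();
    }

    // #ref: every pattern counts.
    static long ref(List<Ref> refs) {
        return countRefs(refs, EnumSet.allOf(Syntax.class));
    }

    // #derive: the same count, restricted to 'extends' and 'implements'.
    static long derive(List<Ref> refs) {
        return countRefs(refs, EnumSet.of(Syntax.EXTENDS, Syntax.IMPLEMENTS));
    }

    // Made-up sample data, not measured from any project.
    static List<Ref> sample() {
        return Arrays.asList(
            new Ref("org.w3c.dom.Element", Syntax.FIELD_TYPE),
            new Ref("org.xml.sax.helpers.DefaultHandler", Syntax.EXTENDS),
            new Ref("java.lang.Runnable", Syntax.IMPLEMENTS),
            new Ref("java.util.List.add", Syntax.METHOD_CALL));
    }

    public static void main(String[] args) {
        List<Ref> refs = sample();
        System.out.println("#ref = " + ref(refs) + ", #derive = " + derive(refs));
    }
}
```

Both metrics share one counting routine and differ only in the set of patterns, which is why they are not orthogonal.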
[Fig. 4. Pseudocode describing the corpus (re)-engineering method. The listing on the slide starts at its line 2.]
    2. output : corpus
    3. for each name in candidateList :
    4.   (p_src, p_bin) = obtainProject(name);
    5.   patches = exploratoryBuild(p_src, p_bin);
    6.   timestamp = build(p_src, patches);
    7.   (java, classes, jars) = collectStats(p_src);
    8.   java' = filter(java);
    9.   (jars_built, jars_lib) = detectJars(timestamp, java', jars);
    10.  java'_compiled = detectJava(timestamp, java', classes, jars_built);
    11.  p'_src = (java'_compiled, jars_lib);
    12.  p'_bin = jars_built;
    13.  p' = (p'_src, p'_bin);
    14.  if validate(p') : corpus = corpus + p';

API Facets: An API may contain dozens or hundreds of types, each of which has many method members in turn. Some APIs use sub-packages to organize such API complexity, but those sub-packages are typically concerned with advanced API usage whereas the core facets of API usage are not distinguished in any operational manner. This makes it hard to understand API usage at a somewhat abstract level. Accordingly, we propose leveraging a notion of API facets in the sense of aspects or concerns supported by the API. In this paper, we assume that facets are represented as named collections of specific API types or methods. As an illustration, we name a few API facets of the typical DOM-like API such as DOM itself, JDOM, or dom4j:
  Input / Output: De-/serialization for DOM trees.
  Observation: Getter-like access and other 'read only' forms.
  Addition: Addition of nodes et al. as part also of construction.
  Removal: Removal of nodes et al. as a form of mutation.
  Namespaces: XML namespace manipulation.
  Nontrivial XML: Use of CDATA, PI, and other XML idiosyncrasies.
  Nontrivial API: Usage of types and methods that are beyond normal API usage. For instance, XML APIs may provide some framework for node factories or adapters for API integration.
API facets are helpful in communicating API usage to the user at a more abstract level than the level of individual types and methods, as will be illustrated in §VI.
We leverage knowledge of the APIs to identify (to name) API facets and to tag APIs appropriately. The idea of grouping API members, e.g., by their functional roles, has also been studied in related work on code completion; see §III.

V. THE QUAATLAS CORPUS FOR API-USAGE ANALYSIS
Our study requires a suitable corpus of mature, well-developed projects coming from different application domains. Arguably, such projects show sufficient and advanced API usage. We decided to restrict ourselves to open-source Java projects; in order to increase quality and reproducibility of our research, we decided to use an existing, established and curated collection of Java projects: the QUALITAS corpus [27], release 20101126r. As we discuss in §IV, API usage entails the ability to resolve types. However, QUALITAS does not guarantee the availability of a project's library types. The collection consists of source and binary forms as they are provided by the project developers.

gathered by studying API usage in a corpus of projects
  re-engineered Qualitas corpus to Eclipse projects that compile (79)
  dependencies resolved and separated from project files

In the paper, we present a similar exploration-based approach for understanding API usage. This approach relies on a lot of meta-data about APIs that we have made available in an API atlas. For 98 APIs, this atlas describes the individual packages, types, and methods the API consists of. A fine-grained description is necessary because libraries such as Google Guava or even java.util group different APIs together.
We also associated a domain with each API. This resulted in 27 API domains. Finally, we have started describing groups of elements within an API that address a particular concern. We gathered this meta-data by studying the APIs used in a corpus of 79 mature projects. We re-engineered the projects from the Qualitas corpus such that all their dependencies are resolved and separated from project files. This enables extracting precise API usage facts.
linked to 101

Note that the entire API atlas is available on the paper's website. There, we also present the meta-data in a human-readable format. One nice feature is that each API is linked to its description on the 101companies wiki, where you can also browse through small example programs that use the API.
Exploring API Usage: Exapus Platform
gathers API usage facts for a given corpus
  referenced element, referencing scope, syntactic pattern (e.g., super call)
computes exploration views on usage facts
  selection of API references
    by referenced elements: API name, element, meta-data ...
    by referencing elements: project name, element, syntactic pattern, ...
  organized as project or API slice
    project members + outgoing refs within their scope
    API members + incoming refs within their scope
  rendered as graph, table or cloud
    scaled and ordered by usage metrics: #ref, #elem, #derive, #proj, #api, ...

The actual exploration-based approach to understanding API usage is supported by a tool that extracts references to API elements from a single project or a corpus of projects. During an AST visit, the tool records for each reference it discovers the referenced element, the project scope in which this reference resides, and the syntactic form of the reference. This could be a method return type, a super call, or a type parameter, among others. The tool presents exploration views on the extracted facts, which can be configured along several dimensions. First of all, you can configure which API references to include in a view using conditions on the referenced element and the referencing element. For instance, a view could include only the exceptions defined by an API from the XML domain that are caught in the JH project. Next, you can choose to organize these references as a slice of project members with outgoing refs or as a slice of API members with incoming refs. Finally, you can have these slices rendered as a graph, table, or cloud scaled by a usage metric. For instance, a tag cloud could be scaled by the amount of subclassing along the border between a project and an API.
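The three dimensions described above (selection, organization as project or API slice, scaling by a metric) can be pictured with a small sketch over recorded facts. This is an illustration under stated assumptions, not the tool's implementation: the ViewSketch and Fact classes, the field names, the sample facts, and the ExampleParser scope are all hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical model of usage facts and views; not the actual Exapus data model.
class ViewSketch {
    // One fact per reference: where it occurs, what it refers to, and in which form.
    static final class Fact {
        final String project;   // referencing project
        final String scope;     // referencing project element
        final String api;       // API the referenced element belongs to
        final String element;   // referenced API element
        final String pattern;   // syntactic form of the reference

        Fact(String project, String scope, String api, String element, String pattern) {
            this.project = project;
            this.scope = scope;
            this.api = api;
            this.element = element;
            this.pattern = pattern;
        }
    }

    // Selects facts, groups them by the chosen key, and counts references per group (#ref).
    static Map<String, Long> slice(List<Fact> facts,
                                   Predicate<Fact> selection,
                                   Function<Fact, String> groupBy) {
        return facts.stream()
            .filter(selection)
            .collect(Collectors.groupingBy(groupBy, TreeMap::new, Collectors.counting()));
    }

    // Made-up facts for illustration only.
    static List<Fact> sampleFacts() {
        return Arrays.asList(
            new Fact("jhotdraw", "StyleManager.add", "java.util", "java.util.List.add", "method call"),
            new Fact("jhotdraw", "StyleManager.applyStylesTo", "DOM", "org.w3c.dom.Element", "parameter type"),
            new Fact("jhotdraw", "StyleManager.applyStylesTo", "DOM", "org.w3c.dom.Element", "method argument"),
            new Fact("informa", "ExampleParser.parse", "JDOM", "org.jdom.Element", "method call"));
    }

    // Project slice: project members with their outgoing references to DOM.
    static Map<String, Long> domProjectSlice() {
        return slice(sampleFacts(),
            f -> f.project.equals("jhotdraw") && f.api.equals("DOM"),
            f -> f.scope);
    }

    // API slice: API members with their incoming references from one project.
    static Map<String, Long> jhotdrawApiSlice() {
        return slice(sampleFacts(), f -> f.project.equals("jhotdraw"), f -> f.element);
    }

    public static void main(String[] args) {
        System.out.println(domProjectSlice());
        System.out.println(jhotdrawApiSlice());
    }
}
```

In this sketch, the same facts yield either slice; only the grouping key changes, which mirrors the choice between a project-centric and an API-centric view.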
What follows are some screenshots of the tool in action. At the far left, there is a list of predefined views. Their configuration can be edited in the top-right corner. Shown here is the configuration of a view that results in the tag cloud we saw earlier. At the top, you can select what referenced elements to include. Here, we include all of them using a wildcard pattern. At the bottom, you can select what referencing elements to include in a view. Here, we only include references from the JH project. Note that even though the tool has a dynamic IDE-like feel, it is actually completely web-based. We hope this will encourage others to explore and augment our API meta-data.
Here you see a project-centric table of outgoing references from JH to the Java collections API and DOM. We see, for instance, that the method add of StyleManager invokes the method add of java.util.List. At the bottom-left, you see a tag cloud for the currently selected project element. We see that there are more references to data APIs than to XML APIs in the StyleManager class. The source code for this class is shown at the bottom-right. API references are highlighted within the source code.
Finally, here you see an API-centric graph of references from JH to the APIs known to us. Nodes are APIs. Borders of the nodes are scaled by the relative number of referenced elements. So this is basically another rendering of the tag cloud you saw earlier. You could also choose to scale the borders of the nodes using a different metric, such as the amount of derivation that happens.
And of course, we also made this tool publicly available.
Insight: API Dispersion
intent: understand and compare dispersion of an API across the corpus
stakeholder: API developer
  choose compliance tests for API evolution
view: project-centric table
intelligence: usage metrics for quantitative comparison; API facets for qualitative comparison
[Fig. 5. JDOM's API Dispersion in QUAATLAS (project-centric table).]

So, what insights about API usage can one hope to gain through such a tool? And how should you configure the tool so that it produces the right view for each insight? In the paper, we discuss this in a structured manner for several API usage insights. The one shown here is concerned with how dispersed or widespread an API is across a corpus of projects. It can be gained by configuring the tool to produce a table of referencing project elements, together with some usage metrics. Here, we see JDOM's dispersion in the corpus. The table is sorted by the number of references each project contains. We see that the informa project has the most references, but that jspwiki references the most distinct API elements. We also see that this project is one of the few that contain subtypes of API elements. So who could benefit from this insight? This would be the developer of an API who needs to choose easy and difficult projects for compliance testing after an API evolution.
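A dispersion table of this kind can be sketched as one row per project with its number of references (#ref) and its number of distinct referenced elements (#elem), sorted by #ref. The DispersionSketch class and the sample data are hypothetical; the counts are made up to mirror the shape of the observation above and are not the figures from the paper.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a project-centric dispersion table; data is made up.
class DispersionSketch {
    static final class Row {
        final String project;
        final long refs;   // #ref: all references to the API
        final long elems;  // #elem: distinct referenced API elements

        Row(String project, long refs, long elems) {
            this.project = project;
            this.refs = refs;
            this.elems = elems;
        }
    }

    // Builds one row per project and sorts the rows by #ref, highest first.
    static List<Row> dispersion(Map<String, List<String>> referencedElementsByProject) {
        List<Row> rows = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : referencedElementsByProject.entrySet()) {
            List<String> elements = entry.getValue();
            rows.add(new Row(entry.getKey(), elements.size(), new HashSet<>(elements).size()));
        }
        rows.sort(Comparator.comparingLong((Row r) -> r.refs).reversed());
        return rows;
    }

    // Illustrative references to JDOM elements; not measured from the corpus.
    static Map<String, List<String>> sample() {
        Map<String, List<String>> data = new LinkedHashMap<>();
        data.put("jspwiki", Arrays.asList(
            "org.jdom.Element.getChild", "org.jdom.Element.getText",
            "org.jdom.Document.getRootElement"));
        data.put("informa", Arrays.asList(
            "org.jdom.Element.getChild", "org.jdom.Element.getChild",
            "org.jdom.Element.getChild", "org.jdom.Element.getText"));
        return data;
    }

    public static void main(String[] args) {
        for (Row row : dispersion(sample())) {
            System.out.println(row.project + " #ref=" + row.refs + " #elem=" + row.elems);
        }
    }
}
```

With this sample, one project leads on #ref while the other leads on #elem, which is the kind of contrast the table is meant to expose.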
Insight: API Footprint
intent: understand what API elements are actually used in a corpus or in specific project scopes
stakeholder: API or project developer
  API migration by project developer: target effort
  API evolution by API developer: minimize breaking changes
view: API-centric table or tree
intelligence: ordered or scaled by #ref
[Fig. 6. JDOM's API Footprint in QUAATLAS (API-centric table). Columns include Scope, Tags incl. facets, #proj, ... Callout: nontrivial JDOM API usage in velocity, org.apache.velocity.anakia.AnakiaJDOMFactory.]

The API footprint insight is dual to the API dispersion insight in the sense that it is gained through a slice of referenced API elements rather than through a table of referencing project elements. API developers might want to gain this insight for an entire corpus of projects to minimize the impact of breaking API changes. A project developer might want to gain this insight for a single project to decide whether a wrapper-based migration, where a wrapper of the new API has to be produced for each referenced element, is feasible.
Insight: API Coupling
intent: understand what APIs or API domains are used in smaller project scopes
stakeholder: project developer
view: API-centric cloud, usage metrics applied
intelligence:
  reveals potential code smell: too many APIs in small scope
  helps understand design and motivation for API dependencies
[Clouds on org.jhotdraw.app.AbstractView. API domains: Basics, Distribution, GUI, IO, Component. APIs: java.lang (string manipulation), java.net (view saving), Swing (view painting), JavaBeans (change notification), java.io (exceptions during saving). Figure caption: Coupling in JHotDraw for the interface org.jhotdraw.app.View.]

Shown here is an insight that is targeted more towards project developers who would like to understand what APIs are used together in a small project scope. This insight can be gained by configuring the tool to produce an API tag cloud for the currently selected project scope. The one on the slide is for the AbstractView class of JH, which seems to reference quite a lot of different APIs. For small project scopes, such as a method, this could be a code smell. For larger scopes, API tag clouds can also help understand the motivation behind API dependencies. Here, for instance, java.lang is referenced for string manipulation, java.net for saving a view to a URI, Swing for painting views, JavaBeans for change notifications, and java.io for handling exceptions during the saving of a view.
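The "too many APIs in a small scope" check can be sketched as counting the distinct APIs referenced within one scope and comparing the count against a threshold. The CouplingSketch class and the threshold value are assumptions made for illustration; the talk does not prescribe a particular limit. The API names are the ones shown on the slide for AbstractView, with made-up repetitions.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical coupling check; the threshold is an assumption, not from the paper.
class CouplingSketch {
    // Distinct APIs referenced within a scope, given the API of each reference.
    static Set<String> apisUsed(List<String> apiPerReference) {
        return new TreeSet<>(apiPerReference);
    }

    // Flags a scope that references more distinct APIs than the chosen limit.
    static boolean looksOvercoupled(List<String> apiPerReference, int maxApis) {
        return apisUsed(apiPerReference).size() > maxApis;
    }

    // APIs named on the slide for AbstractView; the repetitions are illustrative.
    static List<String> abstractViewRefs() {
        return Arrays.asList(
            "java.lang", "java.net", "Swing", "JavaBeans", "java.io", "Swing", "java.lang");
    }

    public static void main(String[] args) {
        List<String> refs = abstractViewRefs();
        System.out.println(apisUsed(refs) + " overcoupled(limit 3): " + looksOvercoupled(refs, 3));
    }
}
```

A sensible limit would depend on the size of the scope: a value that is suspicious for a single method may be unremarkable for a class or a package.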
Insight: API Profile
intent: understand what API facets are used in varying project scopes
stakeholder: project developer
view: API-centric cloud of API facets, usage metrics applied
intelligence:
  project scope: reveals API asbestos
  smaller scope: API usage scenarios
e.g., JDOM's profile in informa
[JDOM's API Profile for informa. Project informa: Observation, Input, Nontrivial XML, Manipulation, Exception, Renaming, Addition, Namespaces, Nontrivial API, Output. Package de.nava.informa.parsers: Observation, Input, Exception.]

The API profile insight is similar, but is gained through a cloud of the facets of a single API used within a project scope rather than complete APIs. At the top, we see the JDOM facets used within the entire informa project. Here, seldom used non-trivial parts of an API reveal that the project might be difficult to change. At the bottom, we see the JDOM facets used within a smaller scope of the project. Here, the displayed facets correspond to API usage scenarios: the parsers package reads XML files and observes XML nodes.
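Since facets are named collections of API types or methods, a facet profile for a scope can be sketched as counting, per facet, the references to elements that belong to it. The FacetSketch class is hypothetical, and the assignment of JDOM elements to facets below is an assumption made for illustration; it is not the tagging used in the API atlas.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical facet tagging; the facet membership shown here is an assumption.
class FacetSketch {
    // Each facet is a named collection of API elements.
    static Map<String, Set<String>> jdomFacets() {
        Map<String, Set<String>> facets = new LinkedHashMap<>();
        facets.put("Input", new HashSet<>(Arrays.asList(
            "org.jdom.input.SAXBuilder.build")));
        facets.put("Observation", new HashSet<>(Arrays.asList(
            "org.jdom.Element.getChild", "org.jdom.Element.getText")));
        facets.put("Addition", new HashSet<>(Arrays.asList(
            "org.jdom.Element.addContent")));
        facets.put("Exception", new HashSet<>(Arrays.asList(
            "org.jdom.JDOMException")));
        return facets;
    }

    // Counts references per facet; an element may belong to several facets.
    static Map<String, Integer> profile(List<String> referencedElements,
                                        Map<String, Set<String>> facets) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String element : referencedElements) {
            for (Map.Entry<String, Set<String>> facet : facets.entrySet()) {
                if (facet.getValue().contains(element)) {
                    counts.merge(facet.getKey(), 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    // Made-up references for a parser-like scope.
    static List<String> parserScopeRefs() {
        return Arrays.asList(
            "org.jdom.input.SAXBuilder.build",
            "org.jdom.Element.getChild",
            "org.jdom.Element.getText",
            "org.jdom.Element.getChild",
            "org.jdom.JDOMException");
    }

    public static void main(String[] args) {
        System.out.println(profile(parserScopeRefs(), jdomFacets()));
    }
}
```

The resulting counts are what a facet cloud would scale its tags by; facets without references in the scope simply do not appear.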
Conclusion
described several insights to be gained about API usage
  cocktail, dispersion, distribution, footprint, coupling, profile
provided Quaatlas API atlas
  re-engineered Qualitas projects for precise extraction of API usage
  added meta-data concerning APIs, API domains, API facets
presented multi-dimensional exploration model
  configurable views on API usage
  supported by IDE-like web-based platform Exapus
future work
  empirical research on understanding API usage through exploration
  support flow analyses in views
http://softlang.uni-koblenz.de/explore-API-usage

Multi-dimensional exploration of API usage - ICPC13 - 21-05-13

  • 1.
    Multi-dimensional Exploration of APIUsage Coen De Roover1, Ralf Lämmel2, Ekaterina Pek3 1 Software Languages Lab, Vrije Universiteit Brussel, Belgium 2 Software Languages Team, University of Koblenz-Landau, Germany 3 ADAPT Lab, University of Koblenz-Landau, Germany
  • 2.
    Exploration Story: JHotDraw ➜relatively few references to SAX and DOM what XML APIs are used and how extensively? Swing!!java.lang!!JavaBeans!!java.io!!AWT!!java.util Package org.jhotdraw.undo AWT!!Swing!!java.io java.lang java.util JavaBeans java.text java.lang.reflect!!DOM!!java.net java.util.regex!!Java Print Service!!java.util.zip!!java.lang.annotation java.math java.lang.ref java.util.concurrent Java security!!javax.imageio!!SAX JHotDraw’s API Cocktail Fig. 8. The API Cocktail of JHotDraw (cloud of API tags). View – A list as in the case of the API Footprint insight, except that it is narrowed down to a sub-API of interest. GUI IO!!For XML!!D JH GUI APIs for the API cloud cept for th a tory ents. heck API GUI!!Data!!Basics!! IO!!Format!!Component!!Meta!! XML!!Distribution!!Parsing!!Control!!Math!!Output!!Security!!Concurrency JHotDraw’s API Domain Cocktail GUI!!Basics!!Component!!IO Package org.jhotdraw.undo Project jhotdraw Fig. 9. Cocktail of domains for JHotDraw. java.lang!!java.net!!Swing!!JavaBeans!!java.io!! APIs API domains Coupling in JHotDraw for the interface org.jhotdraw.app.View Fig. 10. API Coupling for JHotDraw’s interface org.jhotdraw.app.View. software concepts. Consider Fig. 9 for illustration. It shows API domains for all of JHotDraw and also for its undo package. Thus, it presents the API cocktails of Fig. 8 in a API domain cloud ? Let me start by making the concept of exploring API usage more concrete. Imagine you are a developer tasked with migrating JH from XML to JSON for persistency. The first thing you would like to know is what APIs for manipulating XML are used, and how extensively these APIs are used. You could gain these insights through the two tag clouds shown on the slide. The top one contains the domains of the APIs used by JH, the bottom one the actual APIs. The size of a tag corresponds to the amount of references to the API or the domain. 
So we can conclude that XML apis are used by JH, more concretely DOM and SAX, but not extensively. There are a lot more references to the AWT and SWING APIs from the GUI programming domain, for instance.
  • 3.
    Exploration Story: JHotDraw ➜footprint of DOM in JHotDraw is but 94 refs to 19 distinct elements what elements of DOM are actually used? table of referenced API elements (i.e., DOM slice) ? The next insight to gain is whether the project uses the complete DOM API, or just a small subset. Given a table of referenced API elements, the latter seems to be the case. There are only 94 references to 19 distinct types and methods. Even better news, no exotic API elements are used.
  • 4.
    Exploration Story: JHotDraw Sliceof JHotDraw with DOM usage ➜ local to 1/13 top-level packages How is DOM usage distributed across JHotDraw? table of referencing project elements (i.e., JHotDraw slice) ? in the view of hundreds of API elements declared by the public void applyStylesTo(Element elem) {for (CSSRule rule : rules) {if (rule.matches(elem)) {rule.apply(elem);} } } usage. All good news so far, but it could still be the case that the API is used all over the project. Luckily, given a table of referencing project elements, the use of DOM is local to 4 classes in the org.jhotdraw.xml package. Our exploration therefore shows that migrating from XML to JSON is feasible.
  • 5.
    Exploring API Usage:Quaatlas API Atlas API metadata API named collection of elements (98) API domain named collection of APIs addressing the same domain (27) API facet named collection of API elements addressing a particular concern basic d API erface ot pay works. faces, n soft- vokes efixes, APIs. ssibly GUI g as a ckage types kages APIs. cause Java’s mmon ubsets which hould ysis is given y sort rectly ojects, tions of projects or APIs as well as specific packages, types, or methods thereof. For instance, we may be interested in #api for a specific project. Also, we may be interested in #ref for some part of an API. Further, these metrics can be configured to count only specific patterns. It is easy to see now that the given metrics are not even orthogonal because, for example, #derive can be obtained from #ref by only counting patterns for ‘extends’ and ‘implements’ relationships. API Domains: We assume that each API addresses some programming domain such as XML processing or GUI pro- gramming. We are not aware of any general, widely adopted attempt to associate APIs with domains, but the idea appears to merit further research. We have begun collecting program- ming domains (or in fact, API domains) and tagging APIs appropriately. Let us list a few API domains and associate them with well-known Java APIs: GUI: GUI programming, e.g., Swing and AWT. XML: XML processing, e.g., DOM, JDOM, and SAX. Data: Data structures incl. containers, e.g., java.util. IO: File- and stream-based I/O, e.g., java.io and java.nio. Component: Component-oriented programming, e.g., JavaBeans. Meta: Meta-programming incl. reflection, e.g., java.lang.reflect. Basics: Basic language support, e.g., java.lang.String. API domains are helpful in reporting API usage and quan- tifying API usage of interest in more abstract terms than the names of individual APIs, as will be illustrated in §VI. 
API Facets: An API may contain dozens or hundreds of types each of which has many method members in turn. Some APIs use sub-packages to organize such API complexity, but those sub-packages are typically concerned with advanced API usage whereas the core facets of API usage are not distinguished in any operational manner. This makes it hard to understand API usage at a somewhat abstract level. 2. output : corpus 3. for each name in candidateList : 4. (psrc, pbin ) = obtainProject(name); 5. patches = exploratoryBuild(psrc, pbin ); 6. timestamp = build(psrc, patches); 7. (java, classes, jars) = collectStats(psrc); 8. java0 = filter(java); 9. (jarsbuilt , jarslib) = detectJars(timestamp, java0 , jars); 10. java0 compiled = detectJava(timestamp, java0 , classes, jarsbuilt ); 11. p0 src = (java0 compiled , jarslib); 12. p0 bin = jarsbuilt ; 13. p0 = (p0 src, p0 bin ); 14. if validate(p0 ) : corpus = corpus + p0 ; Fig. 4. Pseudocode describing the corpus (re)-engineering method. Accordingly, we propose leveraging a notion of API facets in the sense of aspects or concerns supported by the API. In this paper, we assume that facets are represented as named collections of specific API types or methods. As an illustration, we name a few API facets of the typical DOM-like API such as DOM itself, JDOM, or dom4j: Input / Output: De-/serialization for DOM trees. Observation: Getter-like access and other ‘read only’ forms. Addition: Addition of nodes et al. as part also of construction. Removal: Removal of nodes et al. as a form of mutation. Namespaces: XML namespace manipulation. Nontrivial XML: Use of CDATA, PI, and other XML idiosyncrasies. Nontrivial API: Usage of types and methods that are beyond normal API usage. For instance, XML APIs may provide some framework for node factories or adapters for API integration. API facets are helpful in communicating API usage to the user at a more abstract level than the level of individual types and methods, as will be illustrated in §VI. 
We leverage knowledge of the APIs to identify (to name) API facets and to tag APIs appropriately. The idea of grouping API members, e.g., by their functional roles, has also been studied in related work on code completion; see §III. V. THE QUAATLAS CORPUS FOR API-USAGE ANALYSIS Our study requires a suitable corpus of mature, well- developed projects coming from different application domains. Arguably, such projects show sufficient and advanced API usage. We decided to restrict ourselves to open-source Java projects; in order to increase quality and reproducibility of our research, we decided to use an existing, established and cu- rated, collection of Java projects—the QUALITAS corpus [27], release 20101126r. As we discuss in §IV, API usage entails the ability to resolve types. However, QUALITAS does not guarantee the availability of a project’s library types. The collection consists of source and binary forms as they are provided by the project developers. be exten be added projects. Line 4 source a project w nature o The exp occur du stage, w in the b set is sm build scr or invoc to push explorato build the After modifica Java file types, fo we explo containe On lin that we line 9, w informat classify or as bui and the compiled source c types tog the binar The r rebuildin making s the meth and libra we add t This p the proc per proje coverage somethin 10 as an it on reg gathered by studying API usage in a corpus of projects re-engineered Qualitas corpus to Eclipse projects that compile (79) dependencies resolved and separated from project files In the paper, we present a similar exploration-based approach for understanding API usage. This approach relies on a lot of meta-data about APIs that we have made available in an API atlas. For 98 APIs, this atlas describes the individual packages/types/methods the API consists of. A fine-grained description is necessary as libraries such as Google Guava or even java.util group different APIs together. 
We also associated a domain with each API. This resulted in 27 API domains. Finally, we have started describing groups of elements within an API that address a particular concern. We gathered this meta-data by studying the APIs used in a corpus of 79 mature projects. We re-engineered the projects from the Qualitas corpus such that all their dependencies are resolved and separated from project files. This enables extracting precise API usage facts.
  • 6.
    linked to 101 Notethat the entire API atlas is available on the paper’s website. There, we also present the meta-data in a human-readable format. One nice feature there is that each API is linked to its description on the 101companies wiki where you can also browse through small example programs that use the API etc.
  • 7.
    Exploring API Usage:Exapus Platform scaled and ordered by usage metrics: #ref, #elem, #derive, #proj, #api, ... computes exploration views on usage facts selection of API references organized as project or API slice project members + outgoing refs within their scope API members + incoming refs within their scope rendered as graph, table or cloud by referenced elements: API name, element, meta-data ... by referencing elements: project name, element, syntactic pattern, ... gathers API usage facts for a given corpus referenced element, referencing scope, syntactic pattern (e.g., super call) The actual exploration-based approach to understanding API usage is supported by a tool that extracts references to API elements from a single project or a corpus of projects. During an AST visit, the tool records for each reference it discovers the referenced element, the project scope in which this reference resides, and the syntactic form of the reference. This could be a method return type, a super call, or a type parameter, .. The tool presents exploration views on the extracted facts, which can be configured along several dimensions. First of all, you can configure what API references to include in a view using conditions on the referenced element and the referencing element. For instance, only the exceptions defined by an API from the XML domain that are caught in the JH project. Next, you can choose to organize these references as a slice of project members with outgoing refs or as a slice of API members with incoming refs. Finally, you can have these slices rendered as a graph/table/cloud scaled by a usage metric. For instance, a tag cloud scaled by the amount of subclassing along the border between a project and an API.
  • 8.
    What follows aresome screenshots of the tool in action. At the far left, there is a list of predefined views. Their configuration can be edited in the top-right corner. Shown here is the configuration of a view that results in the tag cloud we saw earlier. At the top, you can select what referenced elements to include. Here, we include all of them using a wildcard pattern. At the bottom, you can select what referencing elements to include in a view. Here, we only include references from the JH project. Note that even though the tool has a dynamic IDE-like feel, it is actually completely web-based. We hope this will encourage others to explore and augment our API meta- data.
  • 9.
    Here you seea project-centric table of outgoing references from JH to the Java collections API and DOM. We see for instance that the method add of StyleManager invokes method add of java.util.List. At the bottom-left, you see a tag cloud for the currently selected project element. We see that there are more references to data APIs than to XML apis in the StyleManager class. The source code for this class is shown at the bottom-right. API references are highlighted within the source code.
  • 10.
    Finally, here you see an API-centric graph of references from JHotDraw to the APIs known to us. Nodes are APIs. The borders of the nodes are scaled by the relative number of referenced elements, so this is essentially another rendering of the tag cloud you saw earlier. You could also choose to scale the borders using a different metric, such as the amount of derivation that happens.
  • 11.
    And of course, we also made this tool publicly available.
  • 12.
    Insight: API Dispersion. Intent: understand and compare dispersion of an API across the corpus. Stakeholder: API developer. View: project-centric table. Intelligence: usage metrics for quantitative comparison; API facets for qualitative comparison. Fig. 5. JDOM’s API Dispersion in QUAATLAS (project-centric table). Choose compliance tests for API evolution.
    So, what insights about API usage can one hope to gain through such a tool? And how should you configure the tool so that it produces the right view for each insight? In the paper, we discuss this in a structured manner for several API usage insights. The one shown here is concerned with how dispersed, or widespread, an API is across a corpus of projects. It is gained by configuring the tool to produce a table of referencing project elements, together with some usage metrics. Here, we see JDOM’s dispersion in the corpus. The table is sorted by the number of references each project contains. We see that the informa project has the most references, but that jspwiki references the most distinct API elements. We also see that jspwiki is one of the few projects that contain subtypes of API elements. So who could benefit from this insight? The developer of an API who needs to choose easy and difficult projects for compliance testing after an API evolution.
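    As a rough sketch of how such a project-centric table could be computed, the snippet below derives #ref, #elem and #derive per project and sorts by #ref. The input tuples are invented, and treating "extends" and "implements" references as the derivation metric is an assumption of this sketch, not a definition taken from the tool.

```python
from collections import defaultdict

# (project, referenced API element, syntactic pattern); invented sample data.
REFS = [
    ("informa", "org.jdom.Element", "method call"),
    ("informa", "org.jdom.Element", "method call"),
    ("informa", "org.jdom.Element", "return type"),
    ("jspwiki", "org.jdom.Element", "method call"),
    ("jspwiki", "org.jdom.Document", "extends"),
]


def dispersion_table(refs, derive_patterns=("extends", "implements")):
    """Return rows (project, #ref, #elem, #derive), sorted by #ref in descending order."""
    rows = defaultdict(lambda: {"ref": 0, "elems": set(), "derive": 0})
    for project, element, pattern in refs:
        row = rows[project]
        row["ref"] += 1               # every reference counts towards #ref
        row["elems"].add(element)     # distinct referenced elements give #elem
        if pattern in derive_patterns:
            row["derive"] += 1        # subtyping across the API border gives #derive
    table = [(p, r["ref"], len(r["elems"]), r["derive"]) for p, r in rows.items()]
    return sorted(table, key=lambda t: t[1], reverse=True)


table = dispersion_table(REFS)
```

    With this made-up input, the project with the most references is not the one with the most distinct elements, which is the kind of contrast the dispersion table is meant to surface.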
  • 13.
    Insight: API Footprint. Intent: understand what API elements are actually used in a corpus or in specific project scopes. Stakeholder: API or project developer. View: API-centric table or tree. Intelligence: ordered or scaled by #ref. Fig. 6. JDOM’s API Footprint in QUAATLAS (API-centric table; columns: Scope, Tags incl. facets, #proj, ...). Figure annotation: nontrivial JDOM API usage in velocity, org.apache.velocity.anakia.AnakiaJDOMFactory. API migration by project developer: target effort. API evolution by API developer: minimize breaking changes.
    The API footprint insight is dual to the API dispersion insight in the sense that it is gained through a slice of referenced API elements rather than through a table of referencing project elements. API developers might want to gain this insight for an entire corpus of projects to minimize the impact of breaking API changes. A project developer might want to gain this insight for a single project to decide whether a wrapper-based migration, in which a wrapper around the new API has to be produced for each referenced element, is feasible.
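    The migration argument can be made concrete with a small sketch: the footprint is the set of referenced API elements with their #ref, and its size is the number of wrappers a wrapper-based migration would need. The `footprint` helper and the sample references are assumptions of this sketch.

```python
from collections import Counter

# (project, referenced API element); invented sample data.
REFS = [
    ("velocity", "org.jdom.Element"),
    ("velocity", "org.jdom.Element"),
    ("velocity", "org.jdom.Document"),
    ("informa", "org.jdom.Element"),
    ("informa", "org.jdom.Namespace"),
]


def footprint(refs, project=None):
    """Count references (#ref) per API element, for the whole corpus or for one project."""
    return Counter(elem for proj, elem in refs if project is None or proj == project)


corpus_footprint = footprint(REFS)                # what an API developer would look at
velocity_footprint = footprint(REFS, "velocity")  # what a project developer would look at
wrappers_needed = len(velocity_footprint)         # one wrapper per referenced element
```

    Ordering either counter by `most_common()` gives the "ordered by #ref" rendering mentioned on the slide.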
  • 14.
    Insight: API Coupling. Intent: understand what APIs or API domains are used in smaller project scopes. Stakeholder: project developer. View: API-centric cloud, usage metrics applied. Intelligence: reveals potential code smell (too many APIs in small scope); helps understand design and motivation for API dependencies. On org.jhotdraw.app.AbstractView: API domains: Basics, Distribution, GUI, IO, Component. APIs: java.lang (string manipulation), java.net (view saving), Swing (view painting), JavaBeans (change notification), java.io (exceptions during saving). Figure caption: Coupling in JHotDraw for the interface org.jhotdraw.app.View.
    Shown here is an insight that is targeted more towards project developers who would like to understand which APIs are used together in a small project scope. This insight is gained by configuring the tool to produce an API tag cloud for the currently selected project scope. The one on the slide is for the AbstractView class of JHotDraw, which references quite a lot of different APIs. For small project scopes, such as a method, this could be the sign of a code smell. For larger scopes, API tag clouds can also help understand the motivation behind API dependencies. Here, for instance, java.lang is referenced for string manipulation, java.net for saving a view to a URI, Swing for painting views, JavaBeans for change notifications, and java.io for handling exceptions during the saving of a view.
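    A minimal sketch of the coupling check: collect the distinct APIs referenced per scope and flag scopes that exceed a limit. The limit of 3 and the second scope are arbitrary choices for illustration; the talk does not state a threshold, and the tool presents a cloud rather than a yes/no verdict.

```python
from collections import defaultdict

# (referencing scope, referenced API). The AbstractView entries follow the slide;
# org.example.Helper is a made-up scope for contrast.
REFS = [
    ("org.jhotdraw.app.AbstractView", "java.lang"),
    ("org.jhotdraw.app.AbstractView", "java.net"),
    ("org.jhotdraw.app.AbstractView", "Swing"),
    ("org.jhotdraw.app.AbstractView", "JavaBeans"),
    ("org.jhotdraw.app.AbstractView", "java.io"),
    ("org.example.Helper", "java.lang"),
    ("org.example.Helper", "java.lang"),
]


def coupling(refs):
    """Map each scope to the set of distinct APIs it references."""
    apis = defaultdict(set)
    for scope, api in refs:
        apis[scope].add(api)
    return dict(apis)


def too_many_apis(refs, max_apis=3):
    """Return, in sorted order, the scopes that reference more than max_apis distinct APIs."""
    return sorted(scope for scope, apis in coupling(refs).items() if len(apis) > max_apis)


flagged = too_many_apis(REFS)
```

    Whether a flagged scope is actually a smell still depends on its size: five APIs in a class may be well motivated, while five APIs in a single method deserve a closer look.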
  • 15.
    Insight: API Profile. Intent: understand what API facets are used in varying project scopes. Stakeholder: project developer. View: API-centric cloud of API facets, usage metrics applied. Intelligence: at project scope, reveals API asbestos; at smaller scope, reveals API usage scenarios. Example: JDOM’s API Profile for informa. Facet cloud for package de.nava.informa.parsers: Observation, Input, Exception. Facet cloud for project informa: Observation, Input, Nontrivial XML, Manipulation, Exception, Renaming, Addition, Namespaces, Nontrivial API, Output.
    The API profile insight is similar, but it is gained through a cloud of the facets of a single API used within a project scope, rather than a cloud of complete APIs. At the top, we see the JDOM facets used within the entire informa project. Here, seldom-used non-trivial parts of an API reveal that the project might be difficult to change. At the bottom, we see the JDOM facets used within a smaller scope of the project. Here, the displayed facets correspond to API usage scenarios: the parsers package reads XML files and observes XML nodes.
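    To illustrate how the same facet meta-data yields different profiles at different scopes, the sketch below counts facets for references whose scope starts with a given prefix. The element-to-facet assignment, the class names inside the informa packages, and the `profile` helper are all hypothetical.

```python
from collections import Counter

# Hypothetical facet meta-data: referenced JDOM element -> facet.
FACETS = {
    "org.jdom.input.SAXBuilder": "Input",
    "org.jdom.Element.getChild": "Observation",
    "org.jdom.JDOMException": "Exception",
    "org.jdom.output.XMLOutputter": "Output",
}

# (referencing scope, referenced element); class names are made up.
REFS = [
    ("de.nava.informa.parsers.ParserA", "org.jdom.input.SAXBuilder"),
    ("de.nava.informa.parsers.ParserA", "org.jdom.Element.getChild"),
    ("de.nava.informa.parsers.ParserB", "org.jdom.Element.getChild"),
    ("de.nava.informa.parsers.ParserB", "org.jdom.JDOMException"),
    ("de.nava.informa.exporters.ExporterA", "org.jdom.output.XMLOutputter"),
]


def profile(refs, facets, scope_prefix):
    """Count references per API facet within the scopes that start with scope_prefix."""
    return Counter(
        facets[elem]
        for scope, elem in refs
        if scope.startswith(scope_prefix) and elem in facets
    )


project_profile = profile(REFS, FACETS, "de.nava.informa")
package_profile = profile(REFS, FACETS, "de.nava.informa.parsers")
```

    Narrowing the prefix from the project to one package drops the facets that the package never touches, which is what turns the cloud into a usage scenario.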
  • 16.
    Conclusion. Described several insights to be gained about API usage: cocktail, dispersion, distribution, footprint, coupling, profile. Provided the Quaatlas API atlas: re-engineered Qualitas projects for precise extraction of API usage; added meta-data concerning APIs, API domains, API facets. Presented a multi-dimensional exploration model, supported by the IDE-like web-based platform Exapus: configurable views on API usage. Future work: empirical research on understanding API usage through exploration; support for flow analyses in views. http://softlang.uni-koblenz.de/explore-API-usage