#acl BaderLabGroup:read,write,revert,delete All:read
#pragma section-numbers 2

= GOSlimmer Documentation =
GOSlimmer is a plug-in built for Cytoscape that allows the user to interactively manipulate a Gene Ontology (GO) network to create a GO slim term set that optimally covers a gene set of interest. In addition, it provides the capability to export a gene annotation file containing only the Gene Ontology terms included in the GO slim set.

<>

== Introduction ==
Due to the large size of the vocabularies, it can be difficult to work with Gene Ontology (GO) as a whole. In total, the three ontologies comprise approximately 25,000 terms. In many cases, the user may want only a small subset of these terms that adequately describes a gene set of interest. For this reason, Gene Ontology also provides slim sets, which are ‘slimmed down’ versions of the ontologies containing a subset of the ontology terms. Slim sets can be useful in giving a broad overview of the ontology content without the detail.

Gene Ontology provides a number of generic GO slim sets. However, if the user is working on a problem in one specific area of the ontology, there may not be an appropriate GO slim set available, and it may become necessary for the user to create a custom slim set. For that reason, we have developed GOSlimmer, a plug-in built for use with Cytoscape. Cytoscape is an open source software platform for biological network visualization and analysis available from http://www.cytoscape.org. GOSlimmer, when run under Cytoscape, allows the user to interactively manipulate a Gene Ontology (GO) network to create a GO Slim term set which optimally covers a gene set of interest.

<>

== Installation ==
 1. Download and install Cytoscape from http://www.cytoscape.org. The GOSlimmer plug-in is compatible with Cytoscape version 2.4 and above.
 1. Download the GOSlimmer plugin: [[attachment:GOSlimmer.jar]]
 1. Copy the GOSlimmer.jar file to your [Cytoscape_Home]/plugins directory.
 1. 
    Start Cytoscape. This can be done by double clicking the Cytoscape icon in your [Cytoscape_Home] directory, or via the command line.
   * On Unix/Linux or MacOS X, run: cytoscape.sh
   * On Windows, run: cytoscape.bat
 1. Verify that GOSlimmer appears in the Plugins menu of Cytoscape.

If GOSlimmer does not appear in the Plugins menu, verify that you have completed step one. You will have to restart Cytoscape to reload the plug-in.

<>

== Starting GOSlimmer ==
 1. Start Cytoscape.
 1. Load a Gene Ontology network in Cytoscape.
   a. Go to File→Import→Ontology and Annotation (Figure 3.1).
      || {{attachment:starting_goslimmer_2a.jpg}} ||
      || ~-'''Figure 3.1'''-~||
   a. Select a gene annotation (for example, ‘Gene Association file for Saccharomyces cerevisiae’). GOSlimmer will import its own annotation, so for the purpose of GOSlimmer, the file chosen here does not matter (Figure 3.2).
      || {{attachment:starting_goslimmer_2b.jpg}} ||
      || ~-'''Figure 3.2'''-~||
   a. Select your gene ontology network (for example, ‘Gene Ontology Full’) (Figure 3.3).
      || {{attachment:starting_goslimmer_2c.jpg}} ||
      || ~-'''Figure 3.3'''-~||
   a. Press the ‘Import’ button. Click on the ‘Close’ button in the ‘Loading Annotation’ window once the ontology and annotation have been successfully loaded. At this point, you should have a Gene Ontology network loaded in Cytoscape upon which GOSlimmer can operate (Figure 3.4).
      || {{attachment:starting_goslimmer_2.jpg}} ||
      || ~-'''Figure 3.4'''-~||
 1. Make sure the newly imported Gene Ontology network is selected in the ‘Network’ Cytoscape panel (it need not have a network view), then start GOSlimmer.
   * Go to Plugins→GOSlimmer→Start GOSlimmer (Figure 3.5).
     || {{attachment:starting_goslimmer_3.jpg}} ||
     || ~-'''Figure 3.5'''-~||

If GOSlimmer does not appear in the Plugins menu, see [[#Installation|Installation]] (Section 2).
Once GOSlimmer has been successfully started, the GOSlimmer panel will appear as a tab in the left-hand panel of Cytoscape, and three new sub network views will be created (Figure 3.6). These sub networks are derived from the main Gene Ontology network imported in step 2 with the nodes being divided into the three GO namespaces: Cellular Component, Biological Process and Molecular Function. For each of the three new network views, only the root node and its immediate children are visible, and they are hierarchically displayed.

|| {{attachment:starting_goslimmer_done.jpg}} ||
|| ~-'''Figure 3.6'''-~||

== GOSlimmer By Example ==
=== Example 1 – Manipulating a Gene Ontology network, selecting and deselecting nodes ===
 1. Start Cytoscape, load a gene ontology and start the GOSlimmer plug-in, as described in [[#StartingGOSlimmer|Starting GOSlimmer]] (Section 3). For the purpose of this example, we imported the ontology ‘Gene Ontology Full’.
 1. In the ‘Gene Association’ panel, select ‘Saccharomyces cerevisiae’ in the drop down box, and click the ‘GO’ button. GOSlimmer will apply the selected gene annotation information, and a number of sub panels in the GOSlimmer panel will become visible (Figure 4.1). Note that the GO term node layout is random, and may not be exactly as shown in the figure.
    || {{attachment:example_goslimmer_1_2.jpg}} ||
    || ~-'''Figure 4.1'''-~||
 1. Enlarge the network display area for the ‘Cellular Component’ ontology and zoom to display the entire current network.
 1. In the Cellular Component ontology network, right click on the ‘cell part’ node to display the node context menu and click on ‘Expand’ (Figure 4.3).
    || {{attachment:example_goslimmer_1_4a.jpg}} ||
    || ~-'''Figure 4.3'''-~||
    The children of the ‘cell part’ GO term node will be displayed in the ontology network (Figure 4.4). By default, the depth limit of expansion is 1, so only the immediate children will be displayed.
    This option can be changed in the [[#AdvancedViewSettings|Advanced View Settings panel]] (Section 5.1.4).
    || {{attachment:example_goslimmer_1_4b.jpg}} ||
    || ~-'''Figure 4.4'''-~||
 1. Right click on the ‘membrane’ node to display the node context menu, and click on ‘Expand’. The children of the ‘membrane’ GO node will be displayed in the ontology network (Figure 4.5). Notice that ‘membrane part’ is a child node of both ‘cell part’ and ‘membrane’.
    || {{attachment:example_goslimmer_1_5.jpg}} ||
    || ~-'''Figure 4.5'''-~||
 1. Right click on the ‘membrane’ node to display the node context menu, and click on ‘Collapse’. The child nodes of this node will be hidden in the display area (Figure 4.6). Notice that the ‘membrane part’ node (which has another visible parent, ‘cell part’) remains visible, but the line connecting ‘membrane’ and ‘membrane part’ is hidden.
    || {{attachment:example_goslimmer_1_6.jpg}} ||
    || ~-'''Figure 4.6'''-~||
 1. Now, suppose we are only interested in a few of the child nodes of the ‘cell part’ node. It is useful to ‘prune’ the other nodes so that they do not clutter up the display area unnecessarily. Right click on the ‘cell projection’ node to display the node context menu and select ‘Prune’. This will remove the ‘cell projection’ node (and any visible children) from the display area (Figure 4.7).
    || {{attachment:example_goslimmer_1_7.jpg}} ||
    || ~-'''Figure 4.7'''-~||
 1. Right click on the ‘cell part’ node and click on ‘Collapse’. The child nodes of this node will be hidden in the display area (Figure 4.8).
    || {{attachment:example_goslimmer_1_8.jpg}} ||
    || ~-'''Figure 4.8'''-~||
 1. Right click on the ‘cell part’ node and click on ‘Expand’. Notice that the ‘cell projection’ node (which was pruned in step 7) does not appear in the ontology network (Figure 4.9).
    || {{attachment:example_goslimmer_1_9.jpg}} ||
    || ~-'''Figure 4.9'''-~||
 1. Right click on the ‘cell part’ node and click on ‘Collapse’ (Figure 4.10).
    || {{attachment:example_goslimmer_1_10.jpg}} ||
    || ~-'''Figure 4.10'''-~||
 1. Right click on the ‘cell part’ node, and hold down the ‘Ctrl’ modifier key while selecting ‘Expand’. Notice that this time, the ‘cell projection’ node (and any other pruned nodes) will reappear in the ontology network (Figure 4.11).
    || {{attachment:example_goslimmer_1_11.jpg}} ||
    || ~-'''Figure 4.11'''-~||
 1. Right click on the ‘cell part’ node to display the node context menu, and select ‘Select’. This will add the node to the GO slim set. The selected node will be highlighted in blue, and the coverage statistics will be updated in the Cellular Component coverage statistics panel (Figure 4.12).
    || {{attachment:example_goslimmer_1_12.jpg}} ||
    || ~-'''Figure 4.12'''-~||
 1. Right click on the ‘cell part’ node and click on ‘Deselect’. This removes the node from the GO slim set. The deselected node will return to the default color, and the coverage statistics will be updated in the Cellular Component coverage statistics panel (Figure 4.13).
    || {{attachment:example_goslimmer_1_13.jpg}} ||
    || ~-'''Figure 4.13'''-~||
 1. You can continue to expand/collapse/prune nodes to manipulate the GO ontology, and select/deselect nodes to add/remove them from the GO slim set. Example 2 shows how to create and export a simple GO slim set.

=== Example 2 – Exporting a simple GO slim set ===
 1. Start Cytoscape, load a gene ontology, and start the GOSlimmer plug-in, as described in [[#StartingGOSlimmer|Starting GOSlimmer]] (Section 3). For the purpose of this example, we imported the ontology ‘Gene Ontology Full’.
 1. In the ‘Gene Association’ panel, select ‘Saccharomyces cerevisiae’ in the drop down box, and click the ‘GO’ button. GOSlimmer will apply the selected gene annotation information, and a number of sub panels in the GOSlimmer panel will become visible (Figure 4.14). Note that the GO term node layout is random, and may not be exactly as shown in the figure.
    || {{attachment:example_goslimmer_2_2.jpg}} ||
    || ~-'''Figure 4.14'''-~||
 1. Enlarge the network display area for the ‘Cellular Component’ ontology, and zoom to display the entire current network.
 1. In the Cellular Component ontology network, right click on the ‘extracellular region’ node to display the node context menu and click on ‘Select’ (Figure 4.16).
    || {{attachment:example_goslimmer_2_4.jpg}} ||
    || ~-'''Figure 4.16'''-~||
    The selected node will be highlighted in blue, and the coverage statistics will be updated in the Cellular Component coverage statistics panel (Figure 4.17).
    || {{attachment:example_goslimmer_2_4b.jpg}} ||
    || ~-'''Figure 4.17'''-~||
 1. Repeat step 4 for the root node ‘cellular_component’. You should now have 2 selected nodes in the Cellular Component ontology network (Figure 4.18).
    || {{attachment:example_goslimmer_2_5.jpg}} ||
    || ~-'''Figure 4.18'''-~||
 1. Click in the ‘Molecular Function’ coverage statistic panel to bring the molecular function ontology network into focus. Then enlarge the network display area for the ‘Molecular Function’ ontology, and zoom to display the entire current network.
 1. In the Molecular Function ontology network, right click on the ‘binding’ node to display the node context menu and click on ‘Select’. The selected node will be highlighted in blue, and the coverage statistics will be updated in the Molecular Function coverage statistics panel (Figure 4.20).
    || {{attachment:example_goslimmer_2_7.jpg}} ||
    || ~-'''Figure 4.20'''-~||
 1. Repeat step 7 for the root node ‘molecular_function’. You should now have 2 selected nodes in the Molecular Function ontology network (Figure 4.21).
    || {{attachment:example_goslimmer_2_8.jpg}} ||
    || ~-'''Figure 4.21'''-~||
 1. Click in the ‘Biological Process’ coverage statistic panel to bring the biological process ontology network into focus. Then enlarge the network display area for the ‘Biological Process’ ontology, and zoom to display the entire current network.
 1. 
    In the Biological Process ontology network, right click on the ‘metabolic process’ node to display the node context menu and click on ‘Select’. The selected node will be highlighted in blue, and the coverage statistics will be updated in the Biological Process coverage statistics panel (Figure 4.23).
    || {{attachment:example_goslimmer_2_10.jpg}} ||
    || ~-'''Figure 4.23'''-~||
 1. Repeat step 10 for the root node ‘biological_process’. You should now have 2 selected nodes in the Biological Process ontology network (Figure 4.24).
    || {{attachment:example_goslimmer_2_11.jpg}} ||
    || ~-'''Figure 4.24'''-~||
 1. Click on ‘Export Slim Set Term List File’ in the import/export panel. This will open a file save dialog (Figure 4.25). Select the location and name (e.g. selectedGOTerms.txt) and click save.
    || {{attachment:example_goslimmer_2_12.jpg}} ||
    || ~-'''Figure 4.25'''-~||
    This function will create a text file with the list of selected GO terms. Figure 4.26 shows the content of the file created in this example.
    || {{attachment:example_goslimmer_2_12b.jpg}} ||
    || ~-'''Figure 4.26'''-~||
 1. Click on ‘Export Remapped Gene Association File’. This will open a file save dialog (Figure 4.27). Select the location and name (e.g. gene_association_sample.txt) and click save.
    || {{attachment:example_goslimmer_2_13.jpg}} ||
    || ~-'''Figure 4.27'''-~||
    This function will create an output file in the gene annotation file format [[#AppendixA|(Appendix A)]]. This file can then be imported by software programs that accept annotation files, such as Cytoscape. Figure 4.28 shows the first few lines of the file created in this example.
    || {{attachment:example_goslimmer_2_13b.jpg}} ||
    || ~-'''Figure 4.28'''-~||

<>

=== Example 3 – Importing a user gene set ===
 1. Create a user gene set file to be imported into GOSlimmer using one of the following methods:
   a. Download the sample file used in this example: [[attachment:gene_ids_sample.txt]].
   a. 
      Create a new text file using your favorite text editor and enter the string identifiers for the genes, one on each line.
   a. Copy gene identifiers from the gene annotation file using Microsoft Excel.
     i. Download the desired gene annotation file from http://www.geneontology.org/GO.current.annotations.shtml
     i. Start Microsoft Excel, and open a blank worksheet.
     i. Go to Data->Get External Data->Import Text File (Figure 4.29).
        || {{attachment:example_goslimmer_3_1_c_iii.jpg}} ||
        || ~-'''Figure 4.29'''-~||
     i. In the ‘Choose a File’ dialog box, enable ‘All Documents’ and select the annotation file (Figure 4.30). Click the ‘Get Data’ button.
        || {{attachment:example_goslimmer_3_1_c_iv.jpg}} ||
        || ~-'''Figure 4.30'''-~||
     i. In Step 1 of the ‘Text Import Wizard’, select ‘Delimited’ (Figure 4.31) and click ‘Next’.
        || {{attachment:example_goslimmer_3_1_c_v.jpg}} ||
        || ~-'''Figure 4.31'''-~||
     i. In Step 2, ensure that ‘Tab’ is selected as the delimiter (Figure 4.32), and click ‘Next’.
        || {{attachment:example_goslimmer_3_1_c_vi.jpg}} ||
        || ~-'''Figure 4.32'''-~||
     i. In Step 3, click ‘Finish’ (Figure 4.33), and then ‘Ok’.
        || {{attachment:example_goslimmer_3_1_c_vii.jpg}} ||
        || ~-'''Figure 4.33'''-~||
     i. Once the data is loaded into the spreadsheet, select the data in Column B by clicking on the column header, and go to Edit->Copy (Figure 4.34).
        || {{attachment:example_goslimmer_3_1_c_viii.jpg}} ||
        || ~-'''Figure 4.34'''-~||
     i. Go to a new work sheet and paste the data (Edit->Paste) (Figure 4.35). If there are any empty rows at the top, delete them.
        || {{attachment:example_goslimmer_3_1_c_ix.jpg}} ||
        || ~-'''Figure 4.35'''-~||
     i. Insert a new row at the top, and enter ‘Gene Ids’ (Figure 4.36).
        || {{attachment:example_goslimmer_3_1_c_x.jpg}} ||
        || ~-'''Figure 4.36'''-~||
     i. Select the data in Column A by clicking on the column header, and go to Data->Filter->Advanced Filter (Figure 4.37).
        || {{attachment:example_goslimmer_3_1_c_xi.jpg}} ||
        || ~-'''Figure 4.37'''-~||
     i. 
        Click OK to use the first row as a column label.
     i. In the ‘Advanced Filter’ dialog, select ‘Filter the list, in-place’, check the ‘Unique records only’ checkbox (Figure 4.38), and click ‘OK’.
        || {{attachment:example_goslimmer_3_1_c_xiii.jpg}} ||
        || ~-'''Figure 4.38'''-~||
     i. Once the data has been filtered, you are left with a list of unique gene identifiers, with the duplicate rows hidden. Make sure the filtered list is still selected, then go to Edit->Copy (Figure 4.39).
        || {{attachment:example_goslimmer_3_1_c_xiv.jpg}} ||
        || ~-'''Figure 4.39'''-~||
     i. Go to Data->Filter->Show All (Figure 4.40), and then delete the highlighted rows.
        || {{attachment:example_goslimmer_3_1_c_xv.jpg}} ||
        || ~-'''Figure 4.40'''-~||
     i. Go to Edit->Paste to paste the filtered list, and delete the first row (‘Gene Ids’ label) (Figure 4.41).
        || {{attachment:example_goslimmer_3_1_c_xvi.jpg}} ||
        || ~-'''Figure 4.41'''-~||
     i. At this point, you can delete any of the rows of gene identifiers in which you are not interested. For the purpose of this example, we have kept only the first 15 gene identifiers and deleted the rest (Figure 4.42).
        || {{attachment:example_goslimmer_3_1_c_xvii.jpg}} ||
        || ~-'''Figure 4.42'''-~||
     i. Go to File->Save As. In the File save dialog (Figure 4.43), select ‘Text’ as the file format, enter a filename, select the file location, and click the ‘Save’ button. Click ‘Ok’ to save only the active sheet, and click ‘Yes’ to keep the workbook in the text format.
        || {{attachment:example_goslimmer_3_1_c_xviii.jpg}} ||
        || ~-'''Figure 4.43'''-~||
 1. Start Cytoscape, load a gene ontology, and start the GOSlimmer plug-in, as described in [[#StartingGOSlimmer|Starting GOSlimmer]] (Section 3). For the purpose of this example, we imported the ontology ‘Gene Ontology Full’.
 1. In the ‘Gene Association’ panel, select ‘Saccharomyces cerevisiae’ in the drop down box, and click the ‘GO’ button.
    GOSlimmer will apply the selected gene annotation information, and a number of sub panels in the GOSlimmer panel will become visible (Figure 4.44). Note that the node layout is random, and may not be exactly as shown in the figure.
    || {{attachment:example_goslimmer_3_3.jpg}} ||
    || ~-'''Figure 4.44'''-~||
 1. Enlarge the network display area for the ‘Cellular Component’ ontology, and zoom to display the entire current network.
 1. Click on the ‘Import Gene Set’ button in the ‘Import User Gene Set’ panel (Figure 4.46) to import a user gene set.
    || {{attachment:example_goslimmer_3_5a.jpg}} ||
    || ~-'''Figure 4.46'''-~||
    This will open a file open dialog (Figure 4.47). Navigate to the file you created in Step 1, and click ‘Open’.
    || {{attachment:example_goslimmer_3_5b.jpg}} ||
    || ~-'''Figure 4.47'''-~||
    GOSlimmer will parse the file and import the user gene set. The ‘Import User Gene Set’ panel will display statistics describing the number of genes successfully imported. In addition, the nodes will be sized to represent the user gene set, and the coverage will reflect the coverage of the user genes (Figure 4.48).
    || {{attachment:example_goslimmer_3_5c.jpg}} ||
    || ~-'''Figure 4.48'''-~||
 1. Click on each of the labels in the ‘Import User Gene Set’ panel. A popup window will appear listing the genes corresponding to the label. For example, clicking on the ‘15 User Genes’ label will result in a popup window listing all the user genes in the file (Figure 4.49).
    || {{attachment:example_goslimmer_3_6.jpg}} ||
    || ~-'''Figure 4.49'''-~||

=== Example 4 – Using the GO term generator ===
 1. Start Cytoscape, load a gene ontology, and start the GOSlimmer plug-in, as described in [[#StartingGOSlimmer|Starting GOSlimmer]] (Section 3). For the purpose of this example, we imported the ontology ‘Gene Ontology Full’.
 1. In the ‘Gene Association’ panel, select ‘Saccharomyces cerevisiae’ in the drop down box, and click the ‘GO’ button.
    GOSlimmer will apply the selected gene annotation information, and a number of sub panels in the GOSlimmer panel will become visible (Figure 4.50). Note that the node layout is random, and may not be exactly as shown in the figure.
    || {{attachment:example_goslimmer_4_2.jpg}} ||
    || ~-'''Figure 4.50'''-~||
 1. Enlarge the network display area for the ‘Cellular Component’ ontology, and zoom to display the entire current network.
 1. Expand the ‘Automatic GO Set Term Generator’ panel in the GOSlimmer side panel (Figure 4.52).
    || {{attachment:example_goslimmer_4_4.jpg}} ||
    || ~-'''Figure 4.52'''-~||
 1. Click the ‘Find Best Covering Terms’ button. This will generate a list of the 10 GO terms with the highest inferred gene coverage of the uncovered genes (Figure 4.53). Note that the root node has 100% coverage, and so will always be at the top of the list.
    || {{attachment:example_goslimmer_4_5.jpg}} ||
    || ~-'''Figure 4.53'''-~||
 1. In the list, click on one of the entries to highlight it, and then click the ‘Show’ button. This will display the corresponding GO term node in the ontology network, and highlight it in yellow (Figure 4.54).
    || {{attachment:example_goslimmer_4_6.jpg}} ||
    || ~-'''Figure 4.54'''-~||
 1. Right click on the highlighted node to display the node context menu, and click ‘Select’ to add it to the GO slim set (Figure 4.55).
    || {{attachment:example_goslimmer_4_7a.jpg}} ||
    || ~-'''Figure 4.55'''-~||
    The selected node will be highlighted in blue, and the coverage statistics will be updated in the Cellular Component coverage statistics panel (Figure 4.56).
    || {{attachment:example_goslimmer_4_7b.jpg}} ||
    || ~-'''Figure 4.56'''-~||
 1. In the ‘Automatic GO Set Term Generator’ panel, click on the ‘Find Best Covering Terms’ button again. This will generate a new list containing the GO terms with the highest coverage of uncovered genes (Figure 4.57).
    || {{attachment:example_goslimmer_4_8.jpg}} ||
    || ~-'''Figure 4.57'''-~||
 1. 
    Steps 6 and 7 can then be repeated iteratively to build a slim set. In this way, the user can use the list of GO terms with highest coverage as a guide in building a slim set.

== Running GOSlimmer ==
Once the GOSlimmer plug-in has been successfully installed and started, it can be used to interactively manipulate a Gene Ontology (GO) network. The user can create a GO Slim term set by selecting GO terms in each of the three newly created network views, and can investigate various levels of GO by expanding GO terms to show their child terms. Once a GO slim set has been created, the user can export a remapped version of the gene annotation file that remaps the genes associated with unselected terms to their closest selected ancestor along all ancestor paths.

GOSlimmer has a number of functions and tools available to aid in the creation of the GO Slim set, and to manipulate the view of the GO term network. These are found in the [[#GOSlimmerPanel|GOSlimmer panel]] and in the [[#ContextMenu|node context menu]].

<>

=== GOSlimmer Panel ===
When GOSlimmer is initially started, the GOSlimmer Panel has only one visible sub panel, the ‘Gene Association Import’ panel (Figure 5.1). Once the user has imported a gene annotation file, a number of other sub panels are made visible (Figure 5.2). These sub panels are discussed in Sections 5.1.1-5.1.6.
|| {{attachment:panel_initial_labeled.jpg}} || {{attachment:panel_all_labeled.jpg}} ||
|| ~-'''Figure 5.1:''' GOSlimmer panel after GOSlimmer plug-in is installed and started.-~ || ~-'''Figure 5.2:''' GOSlimmer panel after a gene annotation file is applied using the GOSlimmer initial panel.-~ ||

==== Gene Association Import Panel ====
|| {{attachment:panel_gene_association.jpg}} ||
|| ~-'''Figure 5.3:''' Gene Association Import panel showing the Advanced Options sub panel-~ ||

The ‘Gene Association Import’ panel (Figure 5.3) allows the user to specify the gene annotation information that is to be imported into the Gene Ontology namespace networks. This information can be imported using the predefined annotations for a number of species, or from a custom annotation file. For more information on the gene annotation file, see [[#AppendixA|Appendix A]]. Once the import has been completed, each node in the ontology networks will be annotated with the genes associated with the corresponding GO term. The genes associated with each GO node are available as a GO node attribute in Cytoscape. In addition, the imported file name is displayed in the text label of this panel.

To import a custom gene annotation file:
 a. Go to ‘Advanced Options’ → ‘Browse for annotation file…’
 a. Select the gene annotation file, and click ‘Open’.
 a. Click the ‘Go’ button beside the drop down box in the Gene Association Import Panel.

To import a predefined annotation:
 a. Select the desired species in the gene association file drop down box.
 a. Click on the ‘Go’ button.

<>

==== User Gene Import Panel ====
|| {{attachment:panel_user_gene_before.jpg}} || {{attachment:panel_user_gene_after.jpg}} ||
|| ~-'''Figure 5.4:''' ‘User Gene Import Panel’ before importing a set of user genes.-~ || ~-'''Figure 5.5:''' ‘User Gene Import Panel’ after importing a set of user genes.-~ ||

The ‘User Gene Import’ panel allows the user to import a file containing a custom gene set.
The genes in this set must be a subset of the genes already annotated to the GO terms. The file must contain the string identifiers for the genes (corresponding to the second column in the gene annotation file, for example 'S000005164') separated by new line characters. See [[#Example3|Example 3]] for details on how to create a user gene set file.

Initially, this panel contains only the ‘Import Gene Set’ button (Figure 5.4). To import a custom gene set, click on the ‘Import Gene Set’ button, select the desired file using the file selector, and click ‘Open’. GOSlimmer will then parse the user gene file, and display 3 text labels on the ‘User Gene Import Panel’ (Figure 5.5):
 * The total number of user genes found in this file. If this number is greater than 0, clicking on this label will open a window displaying the list of user genes in the file.
 * The number of user genes successfully mapped to GO terms. If this number is greater than 0, clicking on this label will open a window displaying the list of user genes successfully mapped.
 * The number of user genes that could not be mapped to GO terms. If this number is greater than 0, clicking on this label will open a window displaying the list of user genes that could not be mapped.

<>

==== Coverage statistic panels ====
|| {{attachment:panel_coverage_stats.jpg}} ||
|| ~-'''Figure 5.6:''' Coverage statistic panels with the list of selected GO terms expanded for the Biological Process network. In this case, the ‘metabolic process’ GO term has been selected in the Biological Process network, resulting in the displayed direct and inferred coverage statistics.-~ ||

The ‘Coverage Statistics’ panels display the percentage of genes that are annotated to one or more GO terms in the selected slim set, and hence are ‘covered’ by that set. Coverage statistics are important in constructing a slim set that optimally covers a gene set of interest. There are three coverage statistic panels, one for each of the GO namespaces (i.e.
Biological Process, Cellular Component and Molecular Function), and each displays the coverage statistics for its corresponding namespace.

Each of these panels contains two text labels (Figure 5.6). The first text label displays the inferred gene coverage, or the percentage of genes covered by the GO terms in the selected slim set and their descendants. The second label displays the direct gene coverage, or the percentage of genes directly covered by the GO terms in the selected slim set. In addition, each panel also contains a collapsible sub panel with a list of selected GO terms. Clicking on one of the GO terms in the list focuses the view on that particular GO term.

In addition, clicking inside one of the coverage statistic panels brings the network view for its corresponding ontology into focus. For example, clicking on the ‘Biological Process’ statistic panel brings the ‘Biological Process’ ontology network into focus in the network panel.

If a user gene set has been imported [[#ImportUserGene|(Section 5.1.2)]], then the coverage statistics can be displayed for either the user genes, or for the complete set of annotated genes. See [[#AdvancedViewSettings|Advanced View Options]] (Section 5.1.4) for more details on how to switch between these two modes.

<>

==== Advanced View Options Panel ====
|| {{attachment:panel_advanced_settings.jpg}} ||
|| ~-'''Figure 5.7:''' Advanced View Settings panel-~ ||

The ‘Advanced View Options’ panel (Figure 5.7) provides a number of advanced settings to manipulate the GOSlimmer view. It allows the user to determine the parameters used in calculating the size of the nodes, as well as the node labels and tool tips. In addition, the user can control which child nodes are displayed when a parent node is expanded, as well as which statistics are to be displayed in the [[#CoverageStatisticPanel|Coverage Statistic Panel]] (Section 5.1.3).
Sections 5.1.4.1-5.1.4.7 cover each of these options in detail.

===== Include child nodes when calculating node size =====
In GOSlimmer, the GO term node size is proportional to the gene coverage of that particular node. This option allows the user to determine whether direct or inferred coverage will be used to calculate the node size. If the option is unchecked, then the GO term node size will be proportional to the percentage of genes directly annotated to that term. If the option is checked, the GO term node size will be proportional to the percentage of genes annotated to the term and any of its descendants. When GOSlimmer is started, by default, this option is unchecked.

===== Show GO definition as node tool tip =====
This option allows the user to set the gene ontology definition as a tool tip for the GO term nodes. If this option is checked, the corresponding gene ontology definition will appear as a tool tip when the mouse is rolled over a GO term node. If it is unchecked, no tool tip will appear. This option is unchecked by default.

===== Size nodes according to the user gene set =====
This option allows the user to determine which gene set will be used to calculate the node size. If this option is unchecked, the GO term node size will be proportional to the percentage of genes in the complete set of annotated genes that are associated to that term. If it is checked, and a user set of genes has been imported ([[#ImportUserGene|Section 5.1.2]]), the GO term node size will be proportional to the percentage of genes in the user gene set that are annotated to the term. This option is unchecked by default when GOSlimmer is initially started, and is checked by default when a user gene set is imported.

===== Calculate coverage of user specified genes =====
This option allows the user to control the statistics to be displayed in the [[#CoverageStatisticPanel|Coverage Statistic Panels]] (Section 5.1.3).
If this option is unchecked, the direct and inferred coverage statistics displayed in each of the three statistic panels will be calculated using the complete set of annotated genes. If it is checked, and a user gene set has been imported ([[#ImportUserGene|Section 5.1.2]]), the statistics will be calculated using the user gene set. This option is unchecked by default when GOSlimmer is initially started, and is checked by default when a user gene set is imported.

===== Expand nodes with associated genes only =====
This option allows the user to determine the child nodes to be displayed when a parent node is expanded. If this option is checked, GOSlimmer will display only the child nodes that have associated genes (either directly, or inferred through their own children). If it is unchecked, all un-pruned child nodes will be displayed. This option is checked by default.

===== Label nodes with Ontology Term Name =====
This option allows the user to label the GO term nodes with their corresponding gene ontology names. If this option is checked, each node will be labeled with a formatted version of its GO name. If it is unchecked, the GO term nodes will be labeled with their gene ontology IDs. By default, this option is checked.

===== Expand nodes to specified depth =====
This option allows the user to limit and specify the depth of node expansion. If this option is checked, the expansion of a node will be limited to show only those descendants up to the depth specified in the accompanying text box. If it is unchecked, there is no limit to the expansion, and a node will be expanded to display all of its descendants. By default, this option is checked with a depth of expansion limit of 1 (i.e. the expansion of a node is limited to show only the immediate children of that node).
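
Several of the options above switch between direct and inferred coverage, and between the complete annotated gene set and the user gene set. As a rough illustration of how these quantities relate, the following Python sketch computes them on a small made-up ontology. It is not GOSlimmer's implementation: the term names, gene names and function names are all hypothetical, and the ontology is assumed to be a simple parent-to-children mapping.

```python
# Illustrative sketch only; not taken from the GOSlimmer source code.
# All term names, gene names and function names here are hypothetical.

def descendants(term, children):
    """Return every term reachable below `term` (excluding `term` itself)."""
    seen = set()
    stack = list(children.get(term, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(children.get(current, ()))
    return seen


def direct_genes(terms, annotations):
    """Genes annotated directly to any of the given terms."""
    genes = set()
    for term in terms:
        genes |= annotations.get(term, set())
    return genes


def inferred_genes(terms, children, annotations):
    """Genes annotated to the given terms or to any of their descendants."""
    expanded = set(terms)
    for term in terms:
        expanded |= descendants(term, children)
    return direct_genes(expanded, annotations)


def coverage(covered, gene_set):
    """Percentage of `gene_set` that appears in `covered`."""
    if not gene_set:
        return 0.0
    return 100.0 * len(covered & gene_set) / len(gene_set)


# Toy ontology: root -> A, B; A -> C; B -> C, D (C has two parents).
children = {"root": ["A", "B"], "A": ["C"], "B": ["C", "D"]}
annotations = {"A": {"g1"}, "C": {"g2"}, "D": {"g3"}}

all_genes = set().union(*annotations.values())  # complete set of annotated genes
user_genes = {"g2", "g3"}                       # an imported user gene set
slim = {"A"}                                    # currently selected slim set

print(coverage(direct_genes(slim, annotations), all_genes))
print(coverage(inferred_genes(slim, children, annotations), all_genes))
print(coverage(inferred_genes(slim, children, annotations), user_genes))
```

In this toy case the slim set {A} directly covers one of the three annotated genes, and covers two of three once the descendant C is included. Switching the reference set to the user genes changes only the denominator and the intersection, which is the effect of the two user gene set options described above.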
==== Import/Export Panel ====
|| {{attachment:panel_export.jpg}} ||
|| ~-'''Figure 5.8:''' Export panel-~ ||

The ‘Import/Export’ panel provides import and export functions that may be useful to the user when creating a GO slim set (Figure 5.8). There are currently three options:

 * [[#ExportGeneAssociation|'Export Remapped Gene Association File']], which creates a new gene association file containing only those GO terms in the selected slim set.
 * [[#ExportSlimSet|'Export Slim Set Term List File']], which records the list of selected GO terms to a text file.
 * [[#ImportSlimSet|'Import Slim Set Term List File']], which reads a text file containing a list of GO terms and adds those terms to the current slim set.

===== Export Remapped Gene Association File =====
This button allows the user to create a new gene association file containing only those GO terms in the selected slim set. This file can then be re-imported via Cytoscape as a gene annotation file for a gene network.

The new file is a remapped version of the original gene association file in which genes annotated to GO terms that are not in the slim set (unselected terms) are remapped to their closest selected ancestor along each ancestor path. Thus, for each entry in the original gene annotation file, if the specified GO term is in the slim set, the entry remains unchanged in the new annotation file. However, if the specified GO term is not in the slim set, then for each ancestor path an entry is created in the new annotation file, replacing the GO term with its closest selected ancestor.

Note that the root node for each of the three namespaces must be selected before exporting the remapped gene association file. Otherwise, some gene annotations may be lost. In addition, in order for the exported file to be re-imported as a gene annotation file via Cytoscape, the file name must begin with ‘gene_association’.

To clarify this export function, consider the following simplified example.
Figure 5.9 depicts a simple ontology with 5 terms and 3 annotated genes. In the network, the circular nodes represent the ontology terms and the rectangular nodes represent the genes. We have created a slim set containing the ontology terms Term1 and Term3, which are highlighted in blue in the figure. The original annotation file would contain 3 gene annotations:

 1. Gene1 and Term3
 1. Gene2 and Term4
 1. Gene3 and Term5

|| {{attachment:panel_export_example_genes.jpg}} ||
|| ~-'''Figure 5.9:''' Simple ontology with 5 terms (circular nodes) and 3 annotated genes (rectangular nodes). A slim set contains terms Term1 and Term3, which are highlighted in blue.-~ ||

Now, consider the remapped version of the annotation file created by the ‘Export Remapped Gene Association File’ function. We will look at each of the entries in the original gene association file in turn.

Consider the first annotation, Gene1 and Term3. Since Term3 is in the slim set, this entry remains the same in the new annotation file.

Next, consider the second annotation, Gene2 and Term4. Since Term4 is not in the slim set, we find its closest selected ancestor along all ancestor paths. There is only one ancestor path from Term4 to the root node (Term4→Term2→Term1), and the closest selected ancestor is the root node (Term1). Thus, the annotation of Gene2 and Term4 in the original annotation file is replaced by the annotation of Gene2 and Term1 in the new annotation file.

Finally, consider the annotation of Gene3 and Term5. There are two ancestor paths from Term5 to the root node (path 1: Term5→Term2→Term1, path 2: Term5→Term3→Term1). The closest selected ancestor along path 1 is the root node (Term1), and the closest selected ancestor along path 2 is Term3. Thus, the annotation of Gene3 and Term5 in the original annotation file is replaced by two entries in the new annotation file: the annotation of Gene3 and Term1, and the annotation of Gene3 and Term3.
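The walk-through above can also be written as a short Python sketch. It illustrates the remapping rule as described in this section and is not GOSlimmer's actual code: the function names and the `parents` dictionary (which encodes the ancestor paths of Figure 5.9 as stated in the text) are hypothetical.

```python
def closest_selected(term, parents, slim):
    """Return the closest selected ancestors of `term`, one per ancestor path.

    If `term` itself is in the slim set, it is returned unchanged.
    """
    if term in slim:
        return {term}
    found = set()
    for parent in parents.get(term, []):
        found |= closest_selected(parent, parents, slim)
    return found


def remap(annotations, parents, slim):
    """Rewrite (gene, term) pairs so that only slim set terms remain."""
    remapped = []
    for gene, term in annotations:
        for target in sorted(closest_selected(term, parents, slim)):
            remapped.append((gene, target))
    return remapped


# Ontology of Figure 5.9: each term maps to its direct parents.
parents = {
    "Term2": ["Term1"],
    "Term3": ["Term1"],
    "Term4": ["Term2"],
    "Term5": ["Term2", "Term3"],
}
slim = {"Term1", "Term3"}
annotations = [("Gene1", "Term3"), ("Gene2", "Term4"), ("Gene3", "Term5")]
remapped = remap(annotations, parents, slim)
```

In this sketch, a term with no selected ancestor yields an empty set and its annotation is dropped, which is why the root node of each namespace should be selected before exporting.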
Therefore, the new annotation file created by the export function will contain the following gene annotations:

 1. Gene1 and Term3
 1. Gene2 and Term1
 1. Gene3 and Term1
 1. Gene3 and Term3

===== Export Slim Set Term List File =====
This button allows the user to export the list of GO terms in the selected slim set. The export file is a tab-delimited text file containing the gene ontology ID and the gene ontology name of each selected term, one term per line. Figure 5.10 shows the contents of the output file created by this function when four GO terms were selected in the slim set.

|| {{attachment:panel_export_golist_example.jpg}} ||
|| ~-'''Figure 5.10:''' Contents of the output file created by the ‘Export Slim Set Term List File’ function after a slim set containing four GO terms was constructed.-~ ||

===== Import Slim Set Term List File =====
This button allows the user to import a list of GO terms into GOSlimmer and automatically add those terms to the current slim set. The import file must be a tab-delimited text file where the first field on each line is a GO term ID. Other fields may be present in the file, and are ignored by the import function. This allows the user to import either a text file containing only GO term IDs, or a GO slim set term file that was exported during a previous GOSlimmer session ([[#ExportSlimSet|Section 5.1.5.2]]).

The import function parses the input file to get a list of GO term IDs. It then adds each GO term to the current slim set, displaying it in the network if it is not currently visible. If any of the IDs are invalid, a warning message is displayed.
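As an illustration of the import file format, the following Python sketch reads the first tab-delimited field of each line and separates known GO term IDs from invalid ones. It is a minimal sketch under stated assumptions, not the plug-in's own parser: the function name `read_slim_term_ids` and the `known_ids` set are hypothetical, and skipping blank lines is an assumption.

```python
def read_slim_term_ids(lines, known_ids):
    """Split the GO term IDs found in `lines` into valid and invalid lists.

    lines     -- iterable of text lines from a term list file
    known_ids -- hypothetical set of GO term IDs present in the loaded ontology
    """
    valid, invalid = [], []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # assumption: blank lines are ignored
        # Only the first tab-delimited field is used; the rest is ignored.
        go_id = line.split("\t")[0].strip()
        if go_id in known_ids:
            valid.append(go_id)
        else:
            invalid.append(go_id)
    return valid, invalid
```

The `invalid` list corresponds to the IDs that would trigger the warning message described above.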
==== Automatic GO Set Term Generator Panel ====
|| {{attachment:panel_generator.jpg}} ||
|| ~-'''Figure 5.11:''' Automatic GO Set Term Generator-~ ||

The ‘Automatic GO Set Term Generator’ panel (Figure 5.11) allows the user to automatically generate a list of GO terms with the highest coverage of the remaining uncovered genes (i.e. the genes that are not associated with any of the GO terms in the current slim set). This generator can be used iteratively to aid in the construction of a slim set of GO terms that optimally covers a gene set of interest.

To use this tool, click on the ‘Find Best Covering Terms’ button. GOSlimmer first finds the list of genes that are not annotated to any of the GO terms in the current slim set, and then determines the GO terms in the network that have the highest inferred coverage of those genes. If a user gene set has been imported, it is that gene set that is considered for coverage. Otherwise, the complete set of annotated genes is used. The generator then displays the list of terms in the list box of this panel, with each term’s name, gene ontology ID and the number of uncovered genes covered by that GO term.

Once the list has been generated, the user can select an entry in the list, and click on the ‘Show’ button to interactively find the term in the network, and display it if it is not already visible. The user can then manually select the GO term to add it to the slim set. In addition, this panel also allows the user to specify the size of the list to generate, with the default being set to 10 GO terms.

Note that this tool functions on the ontology namespace network that is currently in focus, so to run it on a different namespace, the user must first bring that network into focus (for example, by clicking on its corresponding coverage statistic panel), before re-executing the utility.
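The ranking performed by ‘Find Best Covering Terms’ can be sketched in Python as follows. This is a hedged illustration of the procedure described above, not the plug-in's implementation: all names are hypothetical, the data structures are simplified dictionaries, and the sketch assumes that a gene counts as covered when it is annotated to a slim set term or to any of that term's descendants.

```python
def inferred_genes(term, children, direct):
    """Genes annotated to `term` or to any of its descendants."""
    genes = set(direct.get(term, ()))
    for child in children.get(term, []):
        genes |= inferred_genes(child, children, direct)
    return genes


def best_covering_terms(terms, children, direct, slim, gene_set, limit=10):
    """Rank unselected terms by how many uncovered genes they would cover.

    terms    -- all GO terms in the namespace network in focus
    children -- dict mapping a term to its child terms
    direct   -- dict mapping a term to the genes directly annotated to it
    slim     -- set of currently selected terms
    gene_set -- user gene set, or the complete set of annotated genes
    limit    -- size of the list to generate (10 by default, as in the panel)
    """
    covered = set()
    for term in slim:
        covered |= inferred_genes(term, children, direct)
    uncovered = set(gene_set) - covered

    scored = []
    for term in terms:
        if term in slim:
            continue
        count = len(inferred_genes(term, children, direct) & uncovered)
        if count > 0:
            scored.append((term, count))
    # Highest coverage first; ties are broken alphabetically (an assumption).
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:limit]
```

Calling the function again after adding one of the returned terms to `slim` reflects the iterative use of the generator described above.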
=== Node Context Menu ===
|| {{attachment:context_menu.jpg}} ||
|| ~-'''Figure 5.12:''' Biological Process network showing the node context menu for the ‘biological_process’ node (GO:0008150).-~ ||

The node context menu contains a set of actions that can be performed on a node. To display the menu, right click on a GO term node in the network. In addition to the default Cytoscape options (Visual Mapping Bypass, LinkOut), a number of GOSlimmer specific options are also available (Figure 5.12). These menu items are discussed in Sections 5.2.1-5.2.5.

==== Collapse ====
This menu item allows the user to collapse a node to hide its descendants, as well as all of the incoming edges to this node in the network (Figures 5.13 and 5.14). If a descendant of this node has another visible parent in the network (with a visible connecting edge to that parent), the child node remains visible, but the edge connecting this child and the collapsed node is hidden.

|| {{attachment:context_menu_collapse_before.jpg}} || {{attachment:context_menu_collapse_after.jpg}} ||
|| ~-'''Figure 5.13:''' Biological Process network before the 'gene expression' node is collapsed.-~ || ~-'''Figure 5.14:''' Biological Process network after the 'gene expression' node is collapsed.-~ ||

==== Prune ====
This menu item allows the user to selectively hide a node in the network without deleting it from the network (Figures 5.15 and 5.16). When performed on a node, the ‘prune’ function first collapses the node to hide all of its children and incoming edges, and then hides the pruned node itself, as well as all of its outgoing edges. Once a node has been pruned, it will not be displayed if one of its parents is expanded unless a ‘full’ expand is executed (see [[#Expand|Expand]] (Section 5.2.3)).
|| {{attachment:context_menu_prune_before.jpg}} || {{attachment:context_menu_prune_after.jpg}} ||
|| ~-'''Figure 5.15:''' Biological Process network before the 'gene expression' node is pruned.-~ || ~-'''Figure 5.16:''' Biological Process network after the 'gene expression' node is pruned.-~ ||

==== Expand ====
This menu item allows the user to expand a node to show its descendants, as well as the incoming edges to this node in the network (Figures 5.17 and 5.18). The expansion depth can be limited, with the depth limit specified by the user. Parameters to control node expansion are found in the [[#AdvancedViewSettings|Advanced View Settings]] panel of the GOSlimmer panel (Section 5.1.4).

Child nodes that have been pruned are not displayed when the parent node is expanded unless a ‘full’ expand is performed. To execute a ‘full’ expand and display all child nodes of a node (including those that have been pruned), the user must use the ‘control’ modifier key while expanding the parent node (i.e. right click on the parent node to display the context menu and hold down the ‘control’ key while selecting the ‘expand’ option).

|| {{attachment:context_menu_expand_before.jpg}} || {{attachment:context_menu_expand_after.jpg}} ||
|| ~-'''Figure 5.17:''' Biological Process network before the 'translation' node is expanded.-~ || ~-'''Figure 5.18:''' Biological Process network after the 'translation' node is expanded.-~ ||

==== Select ====
This menu item allows the user to add a node to the current GO slim set (Figures 5.19 and 5.20). When a node is selected, it is highlighted in blue in the network. In addition, the [[#CoverageStatisticPanel|coverage statistic panel]] (Section 5.1.3) for the corresponding ontology is automatically updated to reflect the new statistics, and the name and gene ontology ID of the selected GO term are added to the list of selected terms.
|| {{attachment:context_menu_select_before.jpg}} ||
|| ~-'''Figure 5.19:''' Biological Process network before the 'biological_process' node is selected. Note that the inferred and direct coverage statistics in the 'Biological Process' coverage statistic panel are both 0%.-~||
|| {{attachment:context_menu_select_after.jpg}} ||
|| ~-'''Figure 5.20:''' Biological Process network after the 'biological_process' node is selected. The selected node is highlighted in blue. In addition, the 'Biological Process' coverage statistic panel shows the new inferred and direct coverage statistics, and the 'biological_process' node appears in the list of selected GO terms.-~||

==== Deselect ====
This menu item allows the user to remove a node from the current GO slim set (Figures 5.21 and 5.22). When a node is deselected, the [[#CoverageStatisticPanel|coverage statistic panel]] (Section 5.1.3) for the corresponding ontology is automatically updated to reflect the new statistics, and the name and gene ontology ID of the deselected GO term are removed from the list of selected terms. In addition, the deselected GO term is returned to its original default color.

|| {{attachment:context_menu_deselect_before.jpg}} ||
|| ~-'''Figure 5.21:''' Biological Process network before the 'biological_process' node is deselected. Note that the inferred coverage is 100%, the direct coverage is 26.13%, and the 'biological_process' term is in the list of selected GO terms.-~||
|| {{attachment:context_menu_deselect_after.jpg}} ||
|| ~-'''Figure 5.22:''' Biological Process network after the 'biological_process' node is deselected. Note that the inferred and direct coverage statistics in the 'Biological Process' coverage statistic panel have been updated, and that the 'biological_process' term has been removed from the list of selected GO terms.-~||

== Appendix A: Gene Annotation File ==
Gene Annotation files are used to link gene products (proteins, genes, etc.) to Gene Ontology terms.
They use a tab-delimited format with 15 fields, and each line represents a single association between a gene product and a GO term. Table 1 presents a list of the fields used by GOSlimmer when importing the gene annotation file. For complete definitions and requirements of the field contents, or for more information on Gene Annotation files, please visit http://www.geneontology.org/GO.format.annotation.shtml.

||Field Number||Name||Definition||Example||
||2||DB_Object_ID||A unique identifier for the item being annotated (i.e. the gene product).||S000000296||
||3||DB_Object_Symbol||A unique symbol to which the DB_Object_ID is matched. It is not just an identifier, and should have a biologically relevant meaning (for example, a gene symbol).||PHO3||
||5||GO ID||The Gene Ontology identifier for the term associated with this gene product.||GO:0003993||
||9||Aspect||The namespace for the GO ID specified in field 5. The value for this field can be one of P (biological process), F (molecular function) or C (cellular component).||F||
||10||DB_Object_Name||Name of the gene or gene product.||acid phosphatase||
||11||DB_Object_Synonym||DB_Object_Symbol for genes or gene products that are synonyms for this object.||YBR092C||
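To make the field numbering concrete, the following Python sketch extracts the six fields of Table 1 from one line of a gene annotation file. It is an illustrative sketch only: the `Annotation` type and the function name are hypothetical, and treating lines that begin with `!` as comments is an assumption based on the Gene Ontology annotation file format rather than something stated in this document.

```python
from typing import NamedTuple


class Annotation(NamedTuple):
    object_id: str  # field 2, DB_Object_ID
    symbol: str     # field 3, DB_Object_Symbol
    go_id: str      # field 5, GO ID
    aspect: str     # field 9, Aspect (P, F or C)
    name: str       # field 10, DB_Object_Name
    synonym: str    # field 11, DB_Object_Synonym


def parse_annotation_line(line):
    """Return an Annotation for one tab-delimited line, or None if skipped."""
    # Assumption: comment lines start with "!" and blank lines carry no data.
    if line.startswith("!") or not line.strip():
        return None
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 11:
        raise ValueError("expected at least 11 tab-delimited fields")
    # Field numbers in Table 1 are 1-based; Python list indexes are 0-based.
    return Annotation(fields[1], fields[2], fields[4],
                      fields[8], fields[9], fields[10])


# Build a 15-field example line from the example values in Table 1;
# fields not used by GOSlimmer are left empty here.
example = [""] * 15
example[1] = "S000000296"
example[2] = "PHO3"
example[4] = "GO:0003993"
example[8] = "F"
example[9] = "acid phosphatase"
example[10] = "YBR092C"
record = parse_annotation_line("\t".join(example) + "\n")
```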