BGI WEGO: Web Gene Ontology Annotation Plotting

WEGO Document

Introduction:

Fig 1. Sample WEGO output, from the rice genome paper published in Science.

The GO (Gene Ontology) project began as a collaboration between FlyBase, the Saccharomyces Genome Database (SGD) and the Mouse Genome Database (MGD), and it has since grown well beyond those origins. Many GO resources and tools now help biologists analyze genes in depth, from a handful of genes to large-scale datasets.

WEGO (Web Gene Ontology Annotation Plot) is a tool for plotting GO annotation results. It has been used in major biological research projects, such as the rice genome project [Yu, J. et al. Science 296, 79-92 (2002); Yu, J. et al. PLoS Biol 3, e38 (2005)] and the silkworm genome project [Xia, Q. et al. Science 306, 1937-40 (2004)]. It has become a routine tool for downstream gene annotation analysis, especially for comparative genomics tasks. WEGO and two companion tools, External to GO Query and GO Archive Query, are freely available to all users. Suggestions are welcome at wego@genomics.org.cn.

Fig. 1 shows a sample output generated by WEGO.

Working with WEGO takes three steps. The first is to upload the annotation result(s). The input file(s) can be in WEGO native format, or, if you use InterProScan as the annotation tool, its results can be used directly: the InterProScan text, raw and XML output formats are all accepted as WEGO input. You are then redirected to a webpage showing a hierarchical GO tree that includes every GO term found in the uploaded files. On this page you can choose any GO terms of interest to display in the output histogram. The last step is to set up the figure: its caption, histogram color(s) and legend description. Currently, WEGO supports SVG, PNG, PostScript, EPS and GIF as output graph formats. You can also receive the results by e-mail.

Cite WEGO: Ye J, Fang L, et al. Nucleic
Acids Res., 2006, 34(Web Server issue), W293-W297

Manual:

[Input of WEGO]

Currently, WEGO accepts its own native format as well as the InterProScan text, raw and XML output formats. WEGO native format is a simple tab-delimited text file with one gene record per line. The first column is the gene name and the remaining columns are GO IDs in the form GO:0000015. The annotation columns may be empty if no annotation result is available for the gene. Comment lines, which start with an exclamation point (!), are supported. A sample file in WEGO native format can be downloaded from the WEGO homepage.

Fig 2. WEGO native format. The first column is the protein ID; the following column(s) are the GO IDs.

[Output of WEGO]

SVG is the default output format of WEGO because it is widely supported by commercial and open source software, such as CorelDRAW, Illustrator, Inkscape and ImageMagick. With an SVG plug-in, SVG graphs can be viewed in a browser. SVG is also easy to convert to other graph formats and is suitable for publishing. WEGO also supports other graph formats: the bitmap formats PNG, JPEG and GIF, which are suitable for on-screen display, and the vector formats PostScript and EPS. The file is compressed for downloading. Some useful links to SVG tools are listed on the WEGO homepage.

[Uses of WEGO]

There are two ways to work with WEGO. The first is to upload the annotation files (up to three files at a time). The input files must be in one of the formats described above. Choosing the GO archive version is optional, but it is recommended to use the same version that was used for the annotation. The second way is to enter the job ID of a previous analysis, provided it was run on the WEGO web site within the last three days. Through this job ID, WEGO lets users change almost all of the settings from their prior session. Even the GO archive version can be changed without re-uploading the input files.

Fig 3. Step 1 of WEGO.

Fig 4. Step 2 of WEGO, GO tree edit page.

2. The user is then redirected to a webpage showing a hierarchical GO tree that includes every GO term found in the uploaded files. Any GO IDs that do not exist in the GO archive are listed on the "view error" page. A companion tool, GO Archive Query, helps users deal with this common error, especially those who do not know which GO version was used for the annotation.

At the top of the page are the ontology type selection box, the GO level input box and the view error button. Users can switch among the three ontology trees via the ontology type selection box. The number entered in the GO level input box limits the depth of the GO tree displayed on the page. The user can choose any GO terms of interest on this page to display in the output histogram. (The second level is chosen by default.)

The hierarchical GO tree is the main body of this page. Each line of the GO tree represents one GO term. From left to right, each line shows the selection accelerating toolbar, the number of genes associated with the GO term, the percentage of genes for the GO term, the Pearson Chi-Square test p-value for each pair of input datasets, the GO ID and the GO term annotation. If there is only one input dataset, the Pearson Chi-Square test p-value is omitted. If there are three input datasets, three columns give the p-values for each pair of datasets in the order one-two, one-three, two-three.

Unlike Fisher's exact test, the Pearson Chi-Square test is appropriate for a 2x2 table only when all of the expected counts are greater than 5. WEGO therefore reports 'Ml' (meaningless) if any of the expected counts is less than 5, and 'Na' when the p-value of the Pearson Chi-Square test is not available. Red arrows mark items with a significant difference, that is, items whose p-value is below the 0.05 significance level.

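The p-value columns described above can be illustrated with a small sketch. It is not WEGO's own code: it assumes that the 2x2 table for one GO term has one row per dataset and one column each for genes with and without the term, that no continuity correction is applied, and that 'Na' is reported when a row or column total is zero. The function names pearson_chi_square_2x2 and pairwise_p_values are hypothetical.

```python
import math
from itertools import combinations


def pearson_chi_square_2x2(hits_a, total_a, hits_b, total_b):
    """Return a p-value (float), or the flag 'Ml' or 'Na', for one GO term.

    hits_x  = genes in dataset x annotated with the term
    total_x = all genes in dataset x
    """
    # Assumed table layout: rows are datasets, columns are with/without the term.
    table = [
        [hits_a, total_a - hits_a],
        [hits_b, total_b - hits_b],
    ]
    row_sums = [sum(row) for row in table]
    col_sums = [table[0][j] + table[1][j] for j in range(2)]
    grand = sum(row_sums)

    # Assumption: the statistic is undefined when a marginal total is zero.
    if grand == 0 or 0 in row_sums or 0 in col_sums:
        return "Na"

    expected = [
        [row_sums[i] * col_sums[j] / grand for j in range(2)] for i in range(2)
    ]
    # The test is flagged as meaningless if any expected count is below 5.
    if any(e < 5 for row in expected for e in row):
        return "Ml"

    chi2 = sum(
        (table[i][j] - expected[i][j]) ** 2 / expected[i][j]
        for i in range(2)
        for j in range(2)
    )
    # With one degree of freedom, P(X > chi2) equals erfc(sqrt(chi2 / 2)).
    return math.erfc(math.sqrt(chi2 / 2.0))


def pairwise_p_values(counts):
    """counts is a list of (hits, total) tuples, one per dataset.

    combinations() yields the pairs in the order one-two, one-three, two-three.
    """
    return [pearson_chi_square_2x2(*a, *b) for a, b in combinations(counts, 2)]


if __name__ == "__main__":
    # Hypothetical counts for one GO term in three datasets.
    for value in pairwise_p_values([(30, 100), (10, 100), (30, 100)]):
        flag = isinstance(value, float) and value < 0.05
        print(value, "significant" if flag else "")
```

A term would be marked with a red arrow in this sketch when the returned value is a number below 0.05; the 'Ml' and 'Na' flags are never treated as significant.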
The 'arrowed' button helps users select all the significant items. Users can switch among the three ontology trees to choose the GO terms of interest. The selections are saved on the server automatically and are available in the "summary". The output figure can also be previewed before its settings are finalized. When all the selections are complete, the user clicks the "plot" button to enter the export decoration page and configure the output figure.

Fig 5. Step 3 of WEGO, export decoration setting page.

Fig 6. Step 4 of WEGO, output format conversion.

[External to GO Query]

Fig 7. External to GO Query.

[GO Archive Query]

Fig 8. GO Archive Query.

References:

1. Xia, Q. et al. A draft sequence for the genome of the domesticated silkworm (Bombyx mori). Science 306, 1937-40 (2004).
2. Yu, J. et al. A draft sequence of the rice genome (Oryza sativa L. ssp. indica). Science 296, 79-92 (2002).
3. Yu, J. et al. The Genomes of Oryza sativa: A History of Duplications. PLoS Biol 3, e38 (2005).
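
The WEGO native format described in the manual (one gene per line, tab-delimited, gene name first, GO IDs after it, comment lines starting with "!") is simple enough to read or check before uploading. The following is a minimal sketch, not part of WEGO: the function name parse_wego_native and the gene names in the sample are hypothetical, and skipping blank lines, merging repeated gene names and rejecting IDs that are not "GO:" plus seven digits are assumptions rather than documented WEGO behavior.

```python
import re

# GO IDs in the form shown in the manual, e.g. GO:0000015.
GO_ID = re.compile(r"^GO:\d{7}$")


def parse_wego_native(text):
    """Map each gene name to its list of GO IDs."""
    annotations = {}
    for line in text.splitlines():
        # Comment lines start with "!"; skipping blank lines is an assumption.
        if not line.strip() or line.startswith("!"):
            continue
        gene, *go_ids = line.split("\t")
        # Annotation columns may be empty when a gene has no annotation.
        terms = [t.strip() for t in go_ids if t.strip()]
        bad = [t for t in terms if not GO_ID.match(t)]
        if bad:
            raise ValueError(f"{gene}: malformed GO ID(s) {bad}")
        # Assumption: a gene listed on several lines accumulates its terms.
        annotations.setdefault(gene, []).extend(terms)
    return annotations


if __name__ == "__main__":
    sample = (
        "! hypothetical sample file\n"
        "gene001\tGO:0000015\tGO:0005737\n"
        "gene002\n"
        "gene003\tGO:0006096\t\n"
    )
    for gene, terms in parse_wego_native(sample).items():
        print(gene, terms)
```

Running a check like this locally can catch malformed IDs early; IDs that are well formed but missing from the chosen GO archive would still only show up on the "view error" page.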