Help: Data Analysis - Upload your own data
- Choose the type of analysis you want to perform from the Data Analysis menu (Pathway, Gene Ontology, Network, Interactor, TFBS), then click the "Upload File" button to upload a tab-delimited file of protein/gene identifiers or accession numbers (human, murine or bovine gene/protein identifiers only).
Alternatively, click the "Web Form" button and paste your tab-delimited data into the text box (max. 1000 lines).
Note: There should be only one accession number per row. Probes that map to multiple genes should be removed.
Accession numbers from the following databases are currently accepted:
- Ensembl
- RefSeq
- Entrez
- UniProt
- InnateDB (gene IDs only)
We strongly recommend using Ensembl identifiers because they have a one-to-one mapping to InnateDB gene identifiers. Identifiers that map to multiple genes (e.g. some UniProt identifiers) will be ignored.
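The input rules above (tab-delimited rows, one accession number per row, at most 1000 lines for the Web Form) can be checked locally before uploading. The following is only a minimal sketch and not part of InnateDB: the function name `check_upload_rows`, the assumption that the identifier sits in the first column, and the heuristic that a comma, semicolon, pipe or space inside the identifier cell signals several accessions are all illustrative assumptions.

```python
import csv
import io

MAX_WEB_FORM_LINES = 1000  # limit stated for the Web Form text box


def check_upload_rows(text, id_column=0, max_lines=MAX_WEB_FORM_LINES):
    """Split tab-delimited text into usable rows and a list of problems.

    Returns (ok_rows, problems), where each problem is (line_number, message).
    This is a hypothetical pre-upload check, not InnateDB's own validation.
    """
    ok_rows, problems = [], []
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    for line_no, row in enumerate(reader, start=1):
        if not row or all(not cell.strip() for cell in row):
            continue  # ignore blank lines
        if id_column >= len(row):
            problems.append((line_no, "identifier column is missing"))
            continue
        identifier = row[id_column].strip()
        # Assumed heuristic: these separators suggest more than one accession.
        if not identifier or any(sep in identifier for sep in ",;| "):
            problems.append((line_no, "expected exactly one accession number"))
            continue
        ok_rows.append(row)
    if max_lines is not None and len(ok_rows) > max_lines:
        problems.append((None, "more than %d data lines" % max_lines))
    return ok_rows, problems
```

Rows reported in `problems` (for example, a probe that maps to two genes) can then be removed from the file before it is uploaded.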
- Click on the column headers to specify which column in your data file contains the identifiers/accession numbers for each gene (and which database they come from). This is called the "Cross-reference ID".
You can only specify one cross-reference ID column. Please note that when using identifiers from InnateDB, only gene IDs are allowed, not interaction IDs.
- Specify the Cross-reference database. This is the database that the identifiers in the cross-reference column come from.
- If you have included gene expression data, identify which columns contain the gene expression values and their associated p-values.
You may also identify the column containing the probe IDs if you have included them in your file.
Including quantitative data such as gene expression data is optional, but it is a very useful way to investigate quantitative data in a pathway and interaction network context and to carry out subsequent analyses such as Pathway Over-representation Analysis. The gene expression values in your file are mapped to molecules via the cross-reference IDs.
Expression values must be given as fold changes, where a value of +2 represents a 2-fold increase in expression and a value of -2 a 2-fold decrease in expression.
You can specify values from up to ten different conditions or time-points. You can also specify a name for each condition.
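Expression data is often stored as a ratio (treated/control) or as a log2 ratio rather than in the signed fold-change format described above. The sketch below shows one way to convert such values so that +2 means a 2-fold increase and -2 a 2-fold decrease. The function names are illustrative, and mapping an unchanged ratio of 1 to +1 is an assumption, since this help page does not say how unchanged genes should be encoded.

```python
def ratio_to_signed_fold_change(ratio):
    """Convert an expression ratio (treated / control) to a signed fold change.

    A ratio of 2.0 becomes +2.0 (2-fold increase); a ratio of 0.5 becomes
    -2.0 (2-fold decrease). A ratio of 1.0 is returned as +1.0 (assumption).
    """
    if ratio <= 0:
        raise ValueError("expression ratio must be positive")
    return ratio if ratio >= 1 else -1.0 / ratio


def log2_to_signed_fold_change(log2_ratio):
    """Convert a log2 ratio to a signed fold change via the plain ratio."""
    return ratio_to_signed_fold_change(2.0 ** log2_ratio)
```

For instance, a log2 ratio of -1 corresponds to a ratio of 0.5 and is therefore written to the upload file as -2.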
Filter the Network Analysis results
You can choose to filter the results by using one of the following methods:
- Do not filter the results
This will return all interactions that involve genes/proteins in the uploaded list.
- Only show interactions between uploaded molecules
This will ONLY return interactions BETWEEN genes/proteins in the user-uploaded list.
For example, if molecule A interacts with B and C but only A and B are in your file, the interaction between A and C will not appear in the returned results.
This is very useful for constructing a network of interactions only between molecules in the uploaded list (e.g. differentially expressed genes).
- Filter for interactions in pathway
This option limits the interactions returned to a particular pathway. You can search for any of the pathways from all data sources by typing the name of the pathway in the text box and selecting one of the suggested choices.
- Include orthologous interactions
Checking this box will return interactions that have been inferred via orthology in other species (human, mouse & cow only).
- Return InnateDB-curated interactions only
This will limit the results returned to only interactions that have been annotated by the InnateDB curation team.
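The difference between "Do not filter the results" and "Only show interactions between uploaded molecules" can be illustrated with a small sketch. This is not InnateDB's implementation. The function `filter_interactions` and the representation of interactions as pairs of molecule names are assumptions made for illustration only.

```python
def filter_interactions(interactions, uploaded, between_only=False):
    """Select interactions relative to an uploaded list of molecules.

    interactions: iterable of (molecule_a, molecule_b) pairs (assumed format).
    uploaded:     iterable of uploaded molecule identifiers.
    between_only: if False, keep interactions involving at least one uploaded
                  molecule; if True, keep only those where both are uploaded.
    """
    uploaded = set(uploaded)
    if between_only:
        return [(a, b) for a, b in interactions
                if a in uploaded and b in uploaded]
    return [(a, b) for a, b in interactions
            if a in uploaded or b in uploaded]
```

With interactions A-B, A-C and C-D and an uploaded list containing only A and B, the unfiltered mode keeps A-B and A-C, while `between_only=True` keeps only A-B.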