Windows - To start Path2, double-click startPath2.bat. This opens Path2 from a terminal so that you can see extra feedback from the program. (You can also run the JAR file Path2.jar directly and refer to the log.txt file for the same feedback.)
Linux - To run Path2 from a terminal you may use: java -jar Path2.jar
With a fresh install of Path2, the first thing that you will see on start-up is the Import Data tab.
Import Data tab
** PLEASE NOTE! Once you have imported data into Path2, you must have Path2 delete that data before you can import more data and run a new analysis. To have Path2 delete all data and analysis results and return to the Import Data tab, select File → Clean the Database. The files that will be deleted are listed below. Please make sure you have saved all desired analysis files before cleaning the database! **
Importing genotype data
Path2 currently accepts genotype data in the form of PLINK PED/MAP files, binary PLINK BIM/BED/FAM files, or association results files. PED/MAP files are specified in Import Genotype Information → Standard PLINK, binary PLINK files in Import Genotype Information → Binary PLINK, and association results files in Import Genotype Information → Summary Results.
If specifying an association results file, it should have at least two tab-delimited columns titled exactly SNP and P:
(Please note: future versions of Path2 will add support for DATA/PED/MAP files. A DATA file lets you specify the format of the PED file used.)
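Before importing an association results file, it can help to confirm that its header meets the requirement described above. The following is a minimal sketch of such a check, not part of Path2 itself; the function name missing_columns and the sample row (including its P value) are made up for illustration.

```python
import csv
import io

# Column titles that the tutorial says must appear exactly as written.
REQUIRED_COLUMNS = ("SNP", "P")


def missing_columns(handle):
    """Return the required column titles that are absent from the header row."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader, [])  # an empty file yields an empty header
    return [name for name in REQUIRED_COLUMNS if name not in header]


# Made-up file content, used only to exercise the check.
sample = "SNP\tP\nrs1234\t0.05\n"
print(missing_columns(io.StringIO(sample)))  # an empty list means both titles are present
```

To check a real file, pass an open file handle instead, for example open("results.txt", newline=""), where results.txt stands for your own file. The comparison is case-sensitive because the titles must match exactly.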
Using the file choosers in the Standard PLINK tab, select the sample PED and MAP files asthma_pathway.ped and asthma_pathway.map. You can find them in: Path2/data/sampleData
How are insertions, deletions, and copy number variants treated?
The "SNP" column must contain rs numbers of the form rs1234 (rs[0-9]+). These rs numbers *must* correspond only to SNPs, and not, for example, to insertions or deletions. All entries not of the rs form will be safely ignored by Path2. For example, copy number variants (CNVs) of the form cnv1234 will not be included in PLINK output files or in any subsequent step of running Path2.
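The rs form described above can be checked with a regular expression. The sketch below is only an illustration of the pattern rs[0-9]+; the function name keep_rs_ids is hypothetical, and it assumes a case-sensitive match of the whole identifier, which may differ from how Path2 applies the pattern internally.

```python
import re

# The pattern from the text; fullmatch() requires the whole identifier to match.
RS_ID = re.compile(r"rs[0-9]+")


def keep_rs_ids(identifiers):
    """Return only the identifiers of the form rs1234, in their original order."""
    return [name for name in identifiers if RS_ID.fullmatch(name)]


print(keep_rs_ids(["rs1234", "cnv1234", "rs", "RS99", "rs5678"]))
# prints ['rs1234', 'rs5678']
```

Note that a pattern check only tests the form of the identifier. It cannot tell whether a given rs number refers to a SNP or to an insertion or deletion, so that still has to be verified separately.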
Path2 performs genetic pathway association analyses using “mapping files”, which describe how SNPs, genes, KEGG pathways, and gene ontologies are related. You may select these mapping files manually via the Import Mapping files tab. Alternatively, you may select the “Download from Internet?” check box and Path2 will download the SNP-to-gene and KEGG pathway mapping information from the internet. At this time, Path2 cannot download Gene Ontology information, so the next step is to manually specify two gene ontology information files that work with our asthma pathway sample dataset.
** PLEASE NOTE! The mapping files that Path2 generates automatically are dataset-specific (Path2 fetches only the information it needs for the specified dataset) and must be regenerated for each dataset that is loaded into the application. In addition, it is a good idea to re-download any needed mapping files periodically (for example, weekly), since the KEGG and NCBI databases are updated quite frequently. **
Using the file choosers in the Import Mapping files tab, select the “Gene to gene ontology category” and “Gene ontology category to type” files. For this tutorial these are:
(Please note: future versions of Path2 will add support for automatically downloading gene ontology information.)
At this point, your Import Data tab should look like the following figure:
Figure: Import Data tab with fields specified
Your selections for the various file choosers will be saved in the plaintext file Path2/path.properties and reloaded upon subsequent runs of Path2 for convenience. You may edit this file while Path2 is not running.
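If you want to inspect the saved selections before editing them, a small script can list what the file contains. The sketch below assumes that path.properties uses the usual key=value layout of Java properties files, which is not confirmed here; the function name parse_properties and the key names pedFile and mapFile are hypothetical. It is a simplified reader that ignores escape sequences and line continuations.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and comment lines."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Java properties files mark comments with '#' or '!'.
        if not line or line.startswith(("#", "!")):
            continue
        key, separator, value = line.partition("=")
        if separator:  # skip lines that have no '=' at all
            settings[key.strip()] = value.strip()
    return settings


# Hypothetical content; the real key names in path.properties may differ.
example = (
    "# saved file chooser selections\n"
    "pedFile=data/sampleData/asthma_pathway.ped\n"
    "mapFile=data/sampleData/asthma_pathway.map\n"
)
for key, value in parse_properties(example).items():
    print(key, "->", value)
```

To read the real file, pass its contents instead of the example string, for example open("Path2/path.properties").read(). Because the sketch does not decode escape sequences, values containing backslashes (such as Windows paths) are shown exactly as they are stored.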
Press the “Next” button to start processing the provided data files.
When the Next button is pressed in the Import Data tab, and assuming all needed files are specified, Path2 calls your system’s installed version of PLINK to process the genotype data files. The command used to run PLINK depends on which sub-tab is visible in the Import Genotype Information panel:
If the “Download from Internet?” check box is enabled, Path2 will also attempt to download mapping files from the internet. It uses NCBI’s dbSNP database via Entrez queries to obtain SNP-to-gene information, and KEGG’s database to obtain KEGG pathway-to-gene information. By default, it saves the files to the following locations, overwriting any existing copies:
If you are not connected to the internet, or the mapping files do not download correctly for some reason, for the sake of the tutorial you may use the provided mapping files:
To use them: