diff --git a/Histology/stainingGUS.md b/Histology/stainingGUS.md
new file mode 100644
index 0000000..56aa625
--- /dev/null
+++ b/Histology/stainingGUS.md
@@ -0,0 +1,70 @@
+# GUS Staining Protocol
+*Written by Ciera Martinez*
+
+This protocol combines the protocols of Julie Kang and Yasu Ichihashi. Although it works with many plant tissue types, this particular protocol was used on *Solanum lycopersicum* leaf primordia and apices.
+## Protocol
+
+1. Prepare Solutions (see below)
+2. Dissect leaf tissue into 90% acetone on ice and incubate for 15 minutes.
+3. Wash the sample with NaH2PO4 solution.
+4. Incubate in GUS staining solution for 13 hours / overnight at 37ºC under vacuum.
+5. Place sample in 70% ethanol for at least 5 min. *Optional stopping point: store at 4ºC.*
+6. Incubate in 6:1 ethanol/acetic acid for at least 1 hr.
+7. Clearing
+
+
+
+
+
+
+
+
+## Preparation of Solutions
+
+### 1. GUS Staining Buffer
+
+50 mL 0.1M NaPO4
+1 mL DMSO
+1 mL Triton X-100
+2 mL 0.5M EDTA
+47 mL H2O
+
+### 2. Phosphate buffers
+
+**Stock Solution**
+
+A. NaH2PO4 (monobasic, monohydrate) = 27.59 g/L (13.79 g per 500 mL H2O)
+B. Na2HPO4 (dibasic, anhydrous) = 28.39 g/L (14.19 g per 500 mL H2O)
+
+**Working Buffer: 0.1 M**
+
+Combine
+39 mL NaH2PO4 stock solution
+61 mL Na2HPO4 stock solution
+100 mL H2O
+
+### 3. Sensitive Solutions
+
+*Dilute and add to the GUS staining buffer right before using, and cover solutions with aluminum foil.*
+
+a. Potassium (K3) Ferricyanide = 0.0167g / 1mL H2O
+b. Potassium (K2) Ferrocyanide = 0.02112g / 1mL H2O
+c. X-gluc = 0.005g/ 100μL DMSO
+
+### GUS Staining Solution
+
+This will depend on the tissue and GUS construct under investigation. There is room for optimization.
+
+10 mL GUS staining buffer
+300 μL K3
+300 μL K2
+192 μL X-gluc
+
+
+
+
+
+
+
+
+
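The GUS staining solution recipe above is written for 10 mL of staining buffer. When less is needed, the volumes can be scaled proportionally. The following Python sketch is only an illustration: the function name `scale_recipe` and the dictionary labels are made up for this example, and it assumes the X-gluc volume in the recipe is in μL.

```python
# Recipe from the protocol above: component -> volume in microliters.
# Labels are illustrative; the X-gluc volume is assumed to be in microliters.
GUS_STAINING_SOLUTION_UL = {
    "GUS staining buffer": 10000.0,  # 10 mL
    "K3 (ferricyanide stock)": 300.0,
    "K2 (ferrocyanide stock)": 300.0,
    "X-gluc stock": 192.0,
}


def scale_recipe(recipe_ul, buffer_ml):
    """Scale every component so the staining buffer volume equals buffer_ml."""
    factor = (buffer_ml * 1000.0) / recipe_ul["GUS staining buffer"]
    return {name: volume * factor for name, volume in recipe_ul.items()}


if __name__ == "__main__":
    # Example: prepare enough solution for 2.5 mL of staining buffer.
    scaled = scale_recipe(GUS_STAINING_SOLUTION_UL, buffer_ml=2.5)
    for name, volume in scaled.items():
        print(f"{name}: {volume:.1f} uL")
```

Scaling keeps the ratios of the recipe fixed; any optimization for a particular tissue or construct would change the base dictionary rather than the scaling function.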
diff --git a/Molecular/PCR.cleanup.md b/Molecular/PCR.cleanup.md
new file mode 100644
index 0000000..f45060f
--- /dev/null
+++ b/Molecular/PCR.cleanup.md
@@ -0,0 +1,10 @@
+# PCR Clean Up
+
+This is a fast protocol that cleans up PCR products with AMPure beads.
+
+1. Add 17.6 μL AMPure beads and pipet to mix.
+2. Incubate for 5 minutes at room temperature.
+3. Place on the magnetic strip and let sit until the solution is clear (about 1 min).
+4. Wash once with 200 μL 75% EtOH.
+5. Remove EtOH and let dry for 5 minutes.
+6. Elute in 15 μL DI water.
diff --git a/README.md b/README.md
index 6abd419..441ca7d 100644
--- a/README.md
+++ b/README.md
@@ -3,7 +3,7 @@ Scripts-and-Protocols
Sinha lab community scripts and protocols.
-Contibute
+Contribute
---------
You do not need to be part of the Sinha lab to upload protocols.
diff --git a/RNAseq/Instructions/BWA.md b/RNAseq/Instructions/BWA.md
index 36c9bd1..f268073 100644
--- a/RNAseq/Instructions/BWA.md
+++ b/RNAseq/Instructions/BWA.md
@@ -27,27 +27,27 @@ Samtools: [http://samtools.sourceforge.net/samtools.shtml](http://samtools.sourc
There should now be five new versions of the file (.amb, .ann, .bwt, .pac, sa).
-2. Organize your files so that they are named how you would like them. In my case, I am going to name each barcode. For now I will just cancatfiles with the same library and rep all in one .fq file. In order to name the libraries in order of their barcode, I wrote a script below:
-
- #!/usr/bin/env python
- #renameFiles.py
- #Ciera Martinez
- #This script takes in a .csv file as a key on how to rename files in current directory.
-
- import os
- import csv
-
- #open file
- with open('lane1Key.csv','rU') as csvfile:
- reader = csv.reader(csvfile, delimiter = ',')
- mydict = {rows[1]:rows[0] for rows in reader}
-
- # renaming
- for fileName in os.listdir( '.' ):
- newName = mydict.get(fileName) if mydict.get(fileName) else "empty" #can not read 'typeNone' from the keys that do not have matching files.
- list(newName)
- #print newName
- os.rename(fileName, newName)
+2. Organize your files so that they are named how you would like them. In my case, I am going to name each file by its barcode. For now I will just concatenate files with the same library and rep into one .fq file. To name the libraries in order of their barcode, I wrote the script below:
+
+    #!/usr/bin/env python
+    #renameFiles.py
+    #Ciera Martinez
+    #This script takes in a .csv file as a key on how to rename files in current directory.
+
+    import os
+    import csv
+
+    #open file; each row of the key is newName,oldName
+    with open('lane1Key.csv', newline='') as csvfile:
+        reader = csv.reader(csvfile, delimiter=',')
+        mydict = {rows[1]: rows[0] for rows in reader}
+
+    # renaming
+    for fileName in os.listdir('.'):
+        newName = mydict.get(fileName)
+        #skip files with no entry in the key; renaming them all to "empty" would overwrite files
+        if newName:
+            os.rename(fileName, newName)
4. Now I need to concatenate all the reads from each specific library/rep into one file ie combine the lanes. I used this shell command to accomplish this. I first make a key folder that contained empty names with all possible libraries.
diff --git a/RNAseq/Instructions/iplant.FTP.SSH.md b/RNAseq/Instructions/iplant.FTP.SSH.md
index 2d2a87b..8c862b1 100644
--- a/RNAseq/Instructions/iplant.FTP.SSH.md
+++ b/RNAseq/Instructions/iplant.FTP.SSH.md
@@ -1,6 +1,6 @@
# Guide to Working in iPlant
-##Overview
+## Overview
These are the steps you must take *before* you can begin to process your data.
@@ -9,13 +9,13 @@ These are the steps you must take *before* you can begin to process your data.
3. Mount iplant Volume
4. Mount IRODS Volume (for data storage)
-#Iplant Atmosphere
+# Iplant Atmosphere
Sign up for iplant using an educational e-mail. After logging in go to Atmosphere and start an instance, in this case I used Maloof08.
Can take up to 30 min. You will get an email verifying that your instance is up and running. Then you can proceed.
-##SSH connection
+## SSH connection
There are two ways in which you can interact with your iplant enviroment. 1. Command line on your own desktop. This is faster, especially if you are familiar with command line. I also like using this option because I have more control over my terminal appearance and keyboard shortcuts 2. The other option and more frequently used option is to have a virtual desktop running through VNC viewer. Overall it is easier to use the VNC viewer, mostly because you can allow programs to run without worrying about disconnecting ssh, which can stop longer programs from running.
@@ -37,7 +37,7 @@ They will ask for your iplant password.
Now you are remotely connected to your iplant instance. Use normal Unix commands to navigate your instance file directory system.
-##To transfer files between your computer to your iplant instance
+## To transfer files between your computer to your iplant instance
To transfer files you simply use [`scp`](http://linux.die.net/man/1/scp).
@@ -53,9 +53,11 @@ or if you need to do it from server to local do the opposite.
scp iamciera@128.196.142.74:~/Desktop/ /Users/iamciera/Desktop/RNAseqAnalysis/sinhaLab/Barcode-tools-3.2.tgz
-##How to Attach extra space to instance
+## How to Attach extra space to instance
-##Volumes
+## Volumes
+
+You can think of a volume as a hard drive you attach to your computing environment. Iplant will allow you to have a certain amount of space. The great part about volumes is that you can move a volume, with your data, from one instance to another by detaching and attaching it.
[How to attach a volume](https://pods.iplantcollaborative.org/wiki/display/atmman/Attaching+a+Volume+to+an+Instance)
@@ -83,7 +85,7 @@ To quit
command + D
-##IRODs ("Unlimited GB")
+## IRODs ("Unlimited GB")
IRODS is where you want to backup everything. It is a good idea to back up the raw files right away. IRODS is another file directory in which you have access to. You mount IRODS similarly to how you would mount an iplant volume, but you access it differently, through Icommands, which is basically regular unix commands with the an "i" in front.
@@ -93,7 +95,7 @@ In order to use IRODS there are two steps.
[Using Icommands](https://pods.iplantcollaborative.org/wiki/display/start/Using+icommands)
-###Uploading multiple files or a directory (with recursion)
+### Uploading multiple files or a directory (with recursion)
iput -P -V -b -r -T -X --lfrestart localDirectory dataDirectory
@@ -107,7 +109,7 @@ If you want to get files from irods simply use iget in a similar way. For exampl
-##FTP download of Berkeley files
+## FTP download of Berkeley files
[*MAC FTP tutorial*](http://www.maclife.com/article/howtos/how_use_ftp_through_command_line_mac_os_x)
@@ -150,11 +152,11 @@ To quit the FTP connection
mget * ~/Desktop
-##Permission settings
+## Permission settings
sudo chown iamciera /home/iamciera/lcm
-##Running and Interacting with Processes
+## Running and Interacting with Processes
From Vince's Book
To run a program in the background include ampersand in the background.
@@ -180,7 +182,7 @@ Place in Background. To do this, we need to suspend the process, and then use th
$ bg
[1]+ program1 input.txt > results.txt
-##Running Programs where disconnecting ssh could happen
+## Running Programs where disconnecting ssh could happen
disown -h a %job #maintain ownership until you disconnect. This only works when the job is running in the background.
@@ -202,7 +204,7 @@ First you have to make standard error and standard out put files, then you can s
Cody suggested I use [GNU Screen](http://www.gnu.org/software/screen/). But I haven't looked into that just yet.
-##Basic Unix and Tools
+## Basic Unix and Tools
ls -l -h #list long human readible
@@ -228,7 +230,7 @@ Yes. When running a command add time to the end.
I need to seriously figure out bin and usr folders.
[usr_bin](http://www.linfo.org/usr_bin.html)
-##Permissions
+## Permissions
In order to change the permission of an entire directory use chown. In the example below we are allowing to change owner recursively through all sub directories to the owner iamciera of the directory Data.
diff --git a/RNAseq/Instructions/preprocessing.md b/RNAseq/Instructions/preprocessing.md
index b6d1d8c..01f1cec 100644
--- a/RNAseq/Instructions/preprocessing.md
+++ b/RNAseq/Instructions/preprocessing.md
@@ -57,7 +57,7 @@ Remove adapter contamination sequences
$ python adapterEffectRemover.py 41 Nremoved.fq AdaptersRemoved.fq b #10 min
- $ python ~/lcm/scripts/barcoded_data_toolbox/adapterEffectRemover.py 41 Nremoved.fq AdaptersRemoved.fq b #ciera
+ $ python adapterEffectRemover.py 41 Nremoved.fq AdaptersRemoved.fq b #ciera's specifics
FastQC on AdaptersRemoved.fq file.
@@ -87,16 +87,16 @@ Remove the portion of the reads containing the barcode so that the reads can be
$ for n in ./*.fq; do ~/lcm/scripts/bin/fastx_trimmer -f 5 -Q 33 -i $n -o ./BCRemoved/$n; done
-##2. Mike Covington
+## 2. Mike Covington
-###Mike's way
+### Mike's way
You first need to switch the columns of your BCfile.txt, because mike's program needs them a different way. Ie.
ATAGG Barcode1
GCTAT Barcode2
-Copy the BCfile so you don't aucutally fuck it up.
+Copy the BCfile first so you don't actually mess up the original.
cp BCfile1.txt BCfiletest.txt
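The column switch described above can be sketched in Python. This is a minimal illustration, not part of the original workflow: the function name `swap_columns` and the demo file names are hypothetical, and it assumes the barcode file has two whitespace-separated columns per line. It writes to a new file, so the original barcode file is never modified.

```python
import os
import tempfile


def swap_columns(src_path, dst_path):
    """Write dst_path with the first two columns of src_path swapped.

    Assumes whitespace-separated columns; blank or one-column lines are skipped.
    """
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            fields = line.split()
            if len(fields) < 2:
                continue  # nothing to swap on this line
            fields[0], fields[1] = fields[1], fields[0]
            dst.write("\t".join(fields) + "\n")


if __name__ == "__main__":
    # Demo on a throwaway file so no real data is touched.
    with tempfile.TemporaryDirectory() as tmpdir:
        src = os.path.join(tmpdir, "BCfile.demo.txt")  # hypothetical name
        dst = os.path.join(tmpdir, "BCfile.swapped.txt")  # hypothetical name
        with open(src, "w") as handle:
            handle.write("Barcode1\tATAGG\nBarcode2\tGCTAT\n")
        swap_columns(src, dst)
        with open(dst) as handle:
            print(handle.read(), end="")
```

Writing the swapped columns to a separate file serves the same purpose as the `cp` step: the input is only ever read.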
diff --git a/RNAseq/scripts/sam2counts.R b/RNAseq/scripts/sam2counts.R
new file mode 100644
index 0000000..c456605
--- /dev/null
+++ b/RNAseq/scripts/sam2counts.R
@@ -0,0 +1,61 @@
+##R script to obtain counts per transcript when reads have been mapped to cDNAs
+
+##searches working directory for .sam files and summarizes them.
+
+#BEFORE running the lines below change the working directory to
+#the directory with sam files. If your files end in something
+#other than ".sam" change the command below
+
+#get a list of sam files in the working directory
+#the "$" denotes the end of line
+files <- list.files(pattern="\\.sam$")
+
+#look at files to make sure it is OK
+print(files)
+
+#create an empty object to hold our results
+results <- NULL
+
+#loop through each file...
+for (f in files) {
+  print(f) #print the current file
+
+  #read the file. We only care about the third column (reference name).
+  #also discard the header info (rows starting with "@")
+  tmp <- scan(f,what=list(NULL,NULL,""),
+              comment.char="@",sep="\t",flush=TRUE)[[3]]
+
+  #use table() to count the occurrences of each gene.
+  #convert to data frame for better formatting later
+  tmp.table <- as.data.frame(table(tmp))
+  colnames(tmp.table) <- c("gene",f) #get column name specified
+  #not needed, in fact a mistake, I think.
+  #tmp.table$gene <- rownames(tmp.table)
+
+  #add current results to previous results table, if appropriate
+  if (is.null(results)) { #first time through
+    results <- as.data.frame(tmp.table) #format
+  } else { #not first time through
+    results<-merge(results,tmp.table,all=TRUE,
+                   by="gene") #combine results
+    #rownames(results) <- results$Row.names #reset rownames for next time through
+  } #else
+} #for
+rm(list=c("tmp","tmp.table")) #remove objects no longer needed
+
+#summarize mapped and unmapped reads:
+print("unmapped")
+unmapped <- results[results$gene=="*",-1]
+unmapped
+results.map <- results[results$gene!="*",]
+print("mapped")
+mapped <- apply(results.map[-1],2,sum,na.rm=TRUE)
+mapped
+print("percent mapped")
+round(mapped/(mapped+unmapped)*100,1)
+
+
+write.table(results.map,file="sam2countsResults.tsv",sep="\t",row.names=FALSE)
+
+
+
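To sanity-check the output of `sam2counts.R` on a single file, the same idea (count reads per reference name in column 3, then report the percent mapped) can be sketched in Python. This is an illustrative cross-check, not a replacement for the R script: the function names are made up for this example, and unlike the `scan()` call it skips only lines that start with "@" rather than treating "@" as a comment character anywhere in the line.

```python
import os
import tempfile
from collections import Counter


def count_references(sam_path):
    """Count reads per reference name (column 3), skipping '@' header lines."""
    counts = Counter()
    with open(sam_path) as handle:
        for line in handle:
            if line.startswith("@"):
                continue  # SAM header line
            fields = line.rstrip("\n").split("\t")
            if len(fields) >= 3:
                counts[fields[2]] += 1
    return counts


def percent_mapped(counts):
    """Percent of reads whose reference name is not '*' (unmapped)."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    mapped = total - counts.get("*", 0)
    return round(100.0 * mapped / total, 1)


if __name__ == "__main__":
    # Demo on a tiny made-up SAM file.
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "demo.sam")
        with open(path, "w") as handle:
            handle.write("@SQ\tSN:gene1\tLN:100\n")
            handle.write("read1\t0\tgene1\t1\t60\t5M\t*\t0\t0\tACGTA\tIIIII\n")
            handle.write("read2\t4\t*\t0\t0\t*\t*\t0\t0\tACGTA\tIIIII\n")
        counts = count_references(path)
        print(dict(counts))
        print(percent_mapped(counts))
```

As in the R script, a reference name of `*` is treated as unmapped.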