Generating Random Protein Sequences Using Python and Biopython

This program generates random protein sequences, each 100 amino acids long, and stores them in a database-style file. Sequence analysis has given bioscience research a new direction, and this project contributes by generating new protein sequences. A generated sequence may already exist in nature or resemble a sequence from some organism. The software writes the random sequences to a file on the user's machine. Keeping them in one file makes it easier to compare sequences and keeps all the information in one place.
The protein generation part is a simple program that asks the user for the number of random protein sequences to generate and for a filename or file path. From these inputs it generates random protein sequences of 100 residues each, using the one-letter amino acid codes of the IUPAC protein alphabet. The sequences are saved in FASTA format to the file the user named, with the .fasta extension appended; a bare filename is created in the current directory. If the path cannot be opened, the sequences are printed to the console instead.
The program is written in Python using Biopython modules. The sequence is the central object in bioinformatics, so the program builds on Biopython's mechanisms for handling sequences.
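To make the output format concrete, here is a minimal sketch of what one FASTA record looks like, written with the standard library only. The function name format_fasta and the 60-column line width are assumptions made for illustration; they are not part of the program described here, which leaves the formatting to Biopython.

```python
def format_fasta(seq_id, description, sequence, width=60):
    """Return one FASTA record: a '>' header line, then the wrapped sequence."""
    header = ">" + seq_id
    if description:
        header += " " + description
    # Split the sequence into fixed-width lines (hypothetical default: 60 columns)
    body = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return "\n".join([header] + body) + "\n"


# A 100-residue example sequence built by repeating ten amino acid codes
example_sequence = "ACDEFGHIKL" * 10
record = format_fasta("randSeq1", "A random sequence", example_sequence)
print(record, end="")
```

Each record starts with a header line holding the identifier and description, followed by the residues on one or more lines. A file with several sequences is simply several such records one after another.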
# File Name RandonProteinSequences.py
# Written for Python 3 and Biopython 1.78 or later, where Bio.Alphabet
# and the old Bio.writers FASTA writer are no longer available.

# standard library
import random
import sys

# biopython
from Bio import SeqIO
from Bio.Seq import Seq
from Bio.SeqRecord import SeqRecord

SEQ_LENGTH = 100  # residues per generated sequence

# The 20 standard amino acids, split into three pools
residueList1 = ["C","D","E","F","G","H","I"]
residueList2 = ["A","K","L","M","N","S"]
residueList3 = ["P","Q","R","T","V","W","Y"]
# RNA nucleotides; not used when generating proteins
residueList4 = ["C","A","G","U"]

def getProteinSequence(residue):
    # Draw SEQ_LENGTH residues uniformly, with replacement, from the pool
    strSeq = "".join(random.choice(residue) for _ in range(SEQ_LENGTH))
    return Seq(strSeq)

def getProteinSeqRecord(residue, seqcount):
    # Wrap a random sequence in a SeqRecord so it can be written as FASTA
    sequence = getProteinSequence(residue)
    seqRec = SeqRecord(sequence, id='randSeq' + str(seqcount), description='A random sequence using Amino acid residues.')
    return seqRec

def randomProteinSeqRecord(index):
    # Even indices use pool 1, odd multiples of 3 use pool 2, the rest use pool 3
    if index % 2 == 0:
        return getProteinSeqRecord(residueList1, index)
    elif index % 3 == 0:
        return getProteinSeqRecord(residueList2, index)
    else:
        return getProteinSeqRecord(residueList3, index)

# information
print('--- This is a Python based program to generate random sequences ---')
print('--- Provide the number of random sequences to generate. Default 10 ---')
print('--- In order to save to a file, provide a file path or filename ---')
print('--- If no path or an invalid path is provided, results are displayed on the console ---')
print('--- The file will be created in FASTA format ---')
print()

# input() returns the user input as a string
filepathProvided = False
filepath = input('Enter filepath to save sequences ... ').strip()
if filepath:
    filepath = filepath + '.fasta'
    try:
        # Open once just to confirm that the path is writable
        handle = open(filepath, "w")
        handle.close()
        filepathProvided = True
    except IOError:
        pass
if not filepathProvided:
    print('Invalid or no file provided, will print results to console')
print()

ranSeqCount = 10
try:
    ranSeqCount = int(input('Enter number of random sequences to generate ... '))
except ValueError:
    # Keep the default when the input is not a whole number
    ranSeqCount = 10

print('Sequence Count : ')
print(ranSeqCount)

# Record ids run from randSeq1 up to randSeq<ranSeqCount>
records = [randomProteinSeqRecord(i + 1) for i in range(ranSeqCount)]

# SeqIO.write accepts an open handle, so the same call serves file and console
if filepathProvided:
    with open(filepath, "w") as handle:
        SeqIO.write(records, handle, "fasta")
    print('File created at : ' + filepath)
else:
    SeqIO.write(records, sys.stdout, "fasta")

print()
input('Press Enter to exit ...')
print()

The software also helps the user create protein sequences from amino acids of their own choice, fairly distributed across each sequence. The user can thus build new protein sequence database files to support study and research across a variety of species and organisms.
Proteins and genes depend on each other in the working of an organism, and studies of this kind help to explore that relationship.
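As a rough illustration of "fairly distributed amino acids of the user's choice", the sketch below draws residues uniformly from a pool the caller supplies and then reports how often each residue occurs. It uses only the standard library. The names AMINO_ACIDS, random_protein and composition are hypothetical helpers introduced for this example, and the fixed seed is there only to make the run repeatable.

```python
import random
from collections import Counter

# The 20 standard amino acids as one-letter codes
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def random_protein(length=100, residues=AMINO_ACIDS, rng=random):
    """Draw `length` residues uniformly, with replacement, from `residues`."""
    return "".join(rng.choice(residues) for _ in range(length))


def composition(sequence):
    """Return each residue's share of the sequence as a fraction of its length."""
    counts = Counter(sequence)
    return {res: n / len(sequence) for res, n in sorted(counts.items())}


rng = random.Random(42)  # assumed seed, only so the example is repeatable
full_pool_seq = random_protein(100, rng=rng)                    # all 20 amino acids
custom_seq = random_protein(100, residues="AKLMNS", rng=rng)    # a pool chosen by the user

print(full_pool_seq)
print(composition(custom_seq))
```

With a uniform draw, each residue in a pool of six is expected to make up about one sixth of a long sequence, although any single 100-residue sequence will deviate from that. Note that the listing above draws every sequence from one of three fixed pools of six or seven residues, so a single generated sequence never contains all 20 amino acids.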

