Newest 'bioinformatics' Questions

1

vote

0answers

46 views

Edit distance (Optimal Alignment) - follow up

This is a follow-up to the question Optimal substructure of ED. Here is the reasoning behind my solution: let \$x = (\alpha _{1},\alpha _{2},\alpha _{3},...,\alpha _{m})\$ and \$y = (\beta_{1},\...

asked Oct 1 at 5:03

MAG

2,039219

3

votes

1answer

39 views

Calculating how similar two objects are according to a database

I want to calculate how similar two objects are according to a database. However, the code is very slow; it takes around 2 minutes to analyze just 3 objects. How can I speed it up? I have tried to ...

matrix graph combinatorics r bioinformatics

asked Sep 26 at 10:52

Llopis

1377

0

votes

0answers

579 views

Needleman–Wunsch algorithm in Rust

The Needleman–Wunsch algorithm is an algorithm used in bioinformatics to align protein or nucleotide sequences. Here is an implementation in Rust: ...

algorithm rust bioinformatics

asked Sep 20 at 10:48

qed

687416

2

votes

1answer

40 views

Returning substrings (motifs) and original strings (sequences) from a file of strings (sequences)

I would like to get some help/tips on writing better and more pythonic code, as well as variable naming. Code info: .backtranscribe() is just a method to convert ...

python bioinformatics

asked Aug 18 at 16:31

Kevin Chen

133

6

votes

1answer

74 views

Counting K-mers (words) in a sequence string

I recently found out how to use joblib for parallelization. I want to iterate through this string seq in steps of 1 and count ...

python performance strings multithreading bioinformatics

asked Aug 8 at 23:41

O.rka

1312

2

votes

2answers

64 views

Reading genetic data in VCF and Tabix formats using an asynchronous library

I'm working with an open-source library for processing and parsing genetic data in VCF and Tabix formats. It contains functions and classes that make it easy to read an index file (a Tabix) and load ...

javascript performance asynchronous bioinformatics

asked Jul 19 at 17:51

5813

1213

6

votes

1answer

71 views

Calculating protein mass

This question is part of a series solving the Rosalind challenges. For the previous question in this series, see A sequence of mistakes. The repository with all my up-to-date solutions so far can be ...

strings ruby programming-challenge hash-table bioinformatics

asked Jun 23 at 22:20

Mast

5,89432671

4

votes

3answers

92 views

Nucleotide count in Scala

This is my second day of learning Scala, and I still need to develop a taste for functional programming; I often find myself writing imperative code. Below is the result of my TDD practice. Code ...

beginner unit-testing scala bioinformatics

asked Jun 16 at 17:05

CodeYogi

1,505431

5

votes

3answers

45 views

Counting GUAG introns in chromosomes

I have this code that works fine, but it uses nearly 100% of my CPU and takes around 25 minutes to run. I'd really like to optimize it but don't know which parts I could improve. The main ...

time-limit-exceeded perl bioinformatics

asked Jun 10 at 19:55

Tomb

313

2

votes

2answers

51 views

A sequence of mistakes

This question is part of a series solving the Rosalind challenges. For the previous question in this series, see The Genetic Code. The repository with all my up-to-date solutions so far can be found ...

strings ruby bioinformatics rspec edit-distance

asked Jun 5 at 21:50

Mast

5,89432671

6

votes

3answers

473 views

The Genetic Code

This question is part of a series solving the Rosalind challenges. For the previous question in this series, see Wascally wabbits. The repository with all my up-to-date solutions so far can be found ...

strings ruby programming-challenge bioinformatics

asked May 12 at 18:14

Mast

5,89432671

2

votes

0answers

76 views

Needleman Wunsch algorithm in Scala

The Needleman–Wunsch algorithm is an algorithm used in bioinformatics to align protein or nucleotide sequences. Here is an implementation in Scala: ...

algorithm scala bioinformatics

asked Feb 26 at 22:41

qed

687416

2

votes

0answers

25 views

Adding information to a compressed file and compressing the output

I wrote this script for adding information to a compressed file and compressing the output: ...

python performance compression bioinformatics

asked Feb 16 at 15:45

tommy.carstensen

21528

1

vote

1answer

36 views

Compare sequence & maps headers in fasta file

This is the Perl code that compares the sequence in a FASTA file and maps the header. Though the code works well, I would still like to make it more efficient. Since the files I compare have >...

performance perl bioinformatics

asked Feb 12 at 9:38

Arun

463

3

votes

0answers

55 views

Determining if a genetic sequence is palindromic

Adding another level to my previous question on 'normal' palindrome identification, in this one I'm interested in identifying genetic palindromes. Here's my attempt: ...

c++ algorithm c++14 palindrome bioinformatics

asked Jan 27 at 18:14

Daniel

480212

1

vote

1answer

48 views

Collating creature descriptors spread across multiple stanzas

I have been using Python for only a few days, so I am trying to learn about some best practices. An explanation of what this code is supposed to do is at the bottom of this post. It is an exercise to ...

python beginner strings parsing bioinformatics

asked Jan 15 at 20:33

bio_inf_dreamer

133

5

votes

1answer

81 views

VCF parser for eventual genomic data visualization

I've just started out writing an app that will visualize genomic data for anybody to understand. When you get your genome sequenced the raw data usually comes in the form of a VCF file. I started out ...

oop parsing node.js bioinformatics ecmascript-6

asked Jan 12 at 2:31

Melanchroes

285

4

votes

1answer

91 views

Rosalind problem “Consensus and Profile”

Source: Rosalind("Consensus and Profile") Brief summary ...

ruby programming-challenge bioinformatics

asked Jan 2 at 23:28

Sergii K

5949

6

votes

3answers

75 views

Categorizing gene sequences read from a CSV file

I am relatively new to programming and would love to get some feedback on the following section of my code. ...

python beginner csv bioinformatics

asked Dec 4 '15 at 23:13

koreebay

312

3

votes

1answer

62 views

V Snare T Snare Model

In the beginning, everything is defined to be of value 10, but I have to change them to suit them for different possible values, hence those are changing. I'm a (Im)mature C coder, hence there might ...

c bioinformatics

asked Dec 4 '15 at 18:43

user2754673

163

2

votes

1answer

38 views

Compare a sequence with the reference frequency of hexamers

I have written this function (and others similar to it), but I am not sure I am using references to their full power. My current concern is whether I am using a huge amount of memory. The subroutine ...

algorithm perl bioinformatics memory-optimization

asked Nov 12 '15 at 8:15

Llopis

1377

4

votes

2answers

190 views

DNA base pair match counter

My code is done and outputs exactly what it needs to. I'm just wondering whether it is possible to make it a lot simpler using objects. If so, could someone tell me what I would need member-wise ...

c++ beginner homework bioinformatics

asked Nov 3 '15 at 14:11

Motorscooter

233

3

votes

2answers

71 views

Rosalind string algorithm problems

I've been starting to learn Rust by going through some of the Rosalind String Algorithm problems. If anyone would like to point out possible improvements, or anything else, that would be great. There ...

beginner algorithm strings bioinformatics rust

asked Oct 6 '15 at 15:14

user673679

21819

7

votes

3answers

159 views

Prefix Sum in Ruby, Genomic Range Query from Codility

I'm currently going through some lessons on Codility. I've just spent a couple of hours with GenomicRangeQuery, which is intended to demonstrate the use of prefix sums. The task description is here. ...

beginner algorithm ruby programming-challenge bioinformatics

asked Aug 17 '15 at 17:24

Dae

1363

6

votes

1answer

322 views

High performance parsing for large, well-formatted text files

I am looking to optimize the performance of a big data parsing problem I have using Python. The example data I show are segments of whole genome DNA sequence alignments for six primate species. Each ...

python performance parsing bioinformatics

asked Jul 26 '15 at 17:39

isosceleswheel

1334

6

votes

3answers

92 views

Building a report of DNA sites and chunks

Here is the slow part of my code: ...

python performance csv formatting bioinformatics

asked Jul 9 '15 at 15:34

Remi.b

21419

4

votes

1answer

85 views

Find allele frequencies at each site for each iteration for each population from FASTA file

The script takes a FASTA-format file as input and outputs the frequencies of each amino acid (A, C, ...

python beginner parsing bioinformatics

asked Jun 30 '15 at 18:33

Remi.b

21419

0

votes

2answers

771 views

Comparing two columns in two different rows

I want to go through each line of a .csv file and check whether the first field of line 1 is the same as the first field of the next line, and so on. If it finds a match then I would like to ignore ...

python csv hash-table bioinformatics

asked Jun 12 '15 at 17:26

upendra

1353

2

votes

2answers

76 views

RNA/DNA transcriber

I've been going through some of the exercises over on exercism and this is one of my solutions: a basic RNA/DNA transcriber. I was happy enough at first but now, looking at it again, the solution ...

ruby regex bioinformatics

asked May 12 '15 at 14:56

SoSimple

1724

4

votes

2answers

130 views

Fast comparison of molecular structures and deleting duplicates

I have a program that reads in two xyz-files (molecular structures) and compares them by an intramolecular distance measure (dRMSD, Fig. 22). A friend told me that my program structure is bad, and as ...

python beginner bioinformatics

asked May 11 '15 at 13:02

pH13 - Yet another Philipp

1215

5

votes

2answers

64 views

Converting domain-specific regular-expressions to a list of all matching instances

There seem to be several questions floating around Stack Exchange regarding how to take a Python regular expression and list the matching instances. This problem is a bit different because 1) I need to ...

python regex bioinformatics

asked Apr 27 '15 at 3:55

user809695

1341

7

votes

1answer

147 views

Statistics about gaps in DNA sequences

Numba newbie here. I'm trying to get faster code from an existing function, but the result is not faster. 10 times faster would be heaven, but I know nothing about optimization. This is code about ...

python performance numpy bioinformatics numba

asked Apr 6 '15 at 11:59

Julien Cochennec

663

3

votes

1answer

379 views

Python Longest Repeat

I am trying to find the longest repeated string in a text with Python, both quickly and space-efficiently. I created an implementation of a suffix tree in order to make the processing fast, but the ...

python algorithm strings tree bioinformatics

asked Mar 6 '15 at 21:24

mls3590712

161

4

votes

2answers

190 views

bash script for constructing RNA pipeline

I have written a bash script that consists of multiple commands and Python scripts. The goal is to make a pipeline for detecting long non-coding RNA from a certain input. Ultimately I would like to ...

bash bioinformatics

asked Feb 27 '15 at 21:13

upendra

1353

5

votes

1answer

283 views

Reading an Excel file and comparing the amino acid sequence of each data pair

Since I am fairly new to Python, I was wondering whether anyone can help me make the code more efficient. I know the output stinks; I will be using Pandas to make this a little nicer. ...

python beginner excel bioinformatics pandas

asked Feb 22 '15 at 9:44

Timo

261

1

vote

2answers

106 views

Counting adenine and cytosine bases

I've started a little challenge on a website, and the first one was about counting different DNA letters. I've done it, but I found my method very brutal. I have a little experience, and I know that ...

python beginner bioinformatics

asked Feb 15 '15 at 23:24

Chirac

1255

4

votes

2answers

311 views

Reflecting emotion classification based on the Lövheim cube

Background I created a simple class to reflect emotion classification based on the Lövheim cube. The code is not scientific at all, and I just did it for fun, but I want all code I write to be as ...

python classes python-2.7 bioinformatics

asked Dec 29 '14 at 21:18

user2589328

554

2

votes

1answer

43 views

A Java class for reading MaCH dosage files v2.0

Version 2 of A Java class for reading MaCH dosage files ...

java parsing io bioinformatics

asked Nov 16 '14 at 20:30

qed

687416

3

votes

1answer

79 views

A Java class for reading MaCH dosage files

A dosage file (used in computational genetics) is formatted like this: ...

java parsing io bioinformatics

asked Nov 16 '14 at 1:51

qed

687416

4

votes

2answers

145 views

Convert impute2 files to mach format

Here is a program for converting Impute2 files into MaCH format (related to genetics). Source files include one xxx_haps file and one xxx_samples file, for example: ...

java converting file io bioinformatics

asked Nov 6 '14 at 12:12

qed

687416

3

votes

0answers

77 views

Finding the Cox regression coefficients in a mixed model for microarray data

I have written code for a project that aims to find the Cox regression coefficients in a mixed model for microarray data. The study was carried out on the Affymetrix Hgu133a platform. In the ...

performance beginner r bioinformatics

asked Oct 6 '14 at 20:03

Nilotpal

164

2

votes

1answer

230 views

Slow Python text-processing script

This script of mine merges columns 1 and 2 from one input file and sees if these merged combinations exist in the other infile (and vice versa). I know I get stuck in appending. It did not get past ...

python csv file bioinformatics time-limit-exceeded

asked Sep 2 '14 at 6:53

AWE

1164

4

votes

0answers

109 views

Vectorize Fisher's Exact Test

I have two data frames/ lists of data, humanSplit and ratSplit, and they are of the form ...

optimization csv r statistics bioinformatics

asked Aug 10 '14 at 6:31

hmi

1212

0

votes

1answer

231 views

Faster way to parse file to array, compare to array in second file, write final file

I currently have an MGF file containing MS2 spectral data (QE_2706_229_sequest_high_conf.mgf). The file template is here, as well as a snippet of example: ...

python performance bioinformatics

asked Jul 15 '14 at 11:41

user2277435

224

8

votes

1answer

1k views

Genetic Algorithm in Python

I'm a new programmer, so any help is welcome. Preferably to make it faster, avoid heavy memory usage, and so on. ...

python algorithm beginner ai bioinformatics

asked Jun 25 '14 at 8:21

f.rodrigues

412215

6

votes

2answers

217 views

Comparing 2 lists of peptide to spectrum rankings generated by 2 different algorithms

I'm seeking a general review, but I'm particularly interested in style. This program gets 2 lists of peptide to spectrum matches, so every spectrum title is linked to a list of 1 or 10 possible ...

java bioinformatics

asked Jun 13 '14 at 21:10

user3700660

313

11

votes

3answers

1k views

Counting DNA nucleotides in C

I have written code to solve the following Rosalind problem. This is my first time writing in C and I would like a review of my code, particularly in regard to correctness and performance. ...

c beginner bioinformatics

asked Jun 13 '14 at 9:01

jma1991

585

3

votes

1answer

338 views

Calculating overlap of segments in chromosome data

I wrote R code that basically performs 2 operations: for each segment in file A, find all segments in file B that lie in that segment; then find the percentage of overlap for each case in the previous ...

r performance bioinformatics

asked Jun 10 '14 at 10:06

Jason

1

vote

2answers

443 views

Parsing BLAST output in XML format using Regular Expression

There are many better ways to parse BLAST output in .xml format, but I was curious to try using regex, even if it is not so straightforward or common. Here is the code to extract translated ...

python regex bioinformatics

asked May 27 '14 at 15:58

user3224522

405

3

votes

2answers

409 views

Rosalind's 3rd problem in Scheme

I have an imperative programming background and I've decided to study functional programming by applying it to problems found on sites such as Project Euler and Rosalind. My language of choice is ...

beginner scheme bioinformatics

asked May 9 '14 at 20:10

user29120
