The Computer Language Benchmarks Game

regex-redux Matz's Interpreter #9 program

source code

# The Computer Language Benchmarks Game
# http://benchmarksgame.alioth.debian.org
#
# regex-dna program contributed by jose fco. gonzalez
# corrected to use regex instead of string substitution
# converted from regex-dna program

seq = STDIN.readlines.join
ilen = seq.size

seq.gsub!(/>.*\n|\n/,"")
clen = seq.length

[
  /agggtaaa|tttaccct/i,
  /[cgt]gggtaaa|tttaccc[acg]/i,
  /a[act]ggtaaa|tttacc[agt]t/i,
  /ag[act]gtaaa|tttac[agt]ct/i,
  /agg[act]taaa|ttta[agt]cct/i,
  /aggg[acg]aaa|ttt[cgt]ccct/i,
  /agggt[cgt]aa|tt[acg]accct/i,
  /agggta[cgt]a|t[acg]taccct/i,
  /agggtaa[cgt]|[acg]ttaccct/i
].each {|f| puts "#{f.source} #{seq.scan(f).size}" }

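# The substitutions are order-dependent: later patterns match the <N> tags
# and | markers produced by earlier ones. Hash iteration follows insertion
# order only on Ruby 1.9 and later.
{
/tHa[Nt]/ => '<4>', /aND|caN|Ha[DS]|WaS/ => '<3>', /a[NSt]|BY/ => '<2>',
/<[^>]*>/ => '|', /\|[^|][^|]*\|/ => '-'
}.each { |f,r| seq.gsub!(f,r) }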

puts
puts ilen
puts clen
puts seq.length

notes, command-line, and program output

NOTES:
64-bit Ubuntu quad core
ruby 1.8.7 (2008-08-11 patchlevel 72) [x86_64-linux]


Mon, 27 Nov 2017 16:46:42 GMT

COMMAND LINE:
/usr/bin/ruby regexredux.mri-9.mri 0 < regexredux-input50000.txt

UNEXPECTED OUTPUT 

13c13
< 535239
---
> 273927
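
One plausible reading of this mismatch, which the page itself does not confirm: the final length depends on the order in which the five substitutions run, and the program iterates a Hash literal, whose order Ruby 1.8.7 does not guarantee. The sketch below only illustrates that order sensitivity. The helper `apply_in_order`, the constant `RULES`, and the sample string are made up for the illustration and are not part of the benchmark program or its input.

```ruby
# Hypothetical helper (not in the benchmark program): applies
# [pattern, replacement] pairs strictly in the order given.
def apply_in_order(text, rules)
  rules.inject(text) { |acc, (pattern, replacement)| acc.gsub(pattern, replacement) }
end

# Same patterns as the program, held in an Array so the order is explicit.
RULES = [
  [/tHa[Nt]/, '<4>'],
  [/aND|caN|Ha[DS]|WaS/, '<3>'],
  [/a[NSt]|BY/, '<2>'],
  [/<[^>]*>/, '|'],
  [/\|[^|][^|]*\|/, '-']
]

sample = "tHaNaNDaN"  # made-up input, not benchmark data

in_order = apply_in_order(sample, RULES)
reversed = apply_in_order(sample, RULES.reverse)

puts in_order.length   # 3  ("|||")
puts reversed.length   # 12 ("tH<2><2>D<2>")
```

The same rules give different lengths depending on their order, so an Array of pairs (or Ruby 1.9 and later, where Hash preserves insertion order) would make the sequence explicit.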

PROGRAM OUTPUT:
agggtaaa|tttaccct 3
[cgt]gggtaaa|tttaccc[acg] 12
a[act]ggtaaa|tttacc[agt]t 43
ag[act]gtaaa|tttac[agt]ct 27
agg[act]taaa|ttta[agt]cct 58
aggg[acg]aaa|ttt[cgt]ccct 16
agggt[cgt]aa|tt[acg]accct 15
agggta[cgt]a|t[acg]taccct 18
agggtaa[cgt]|[acg]ttaccct 20

508411
500000
535239