Shortest parser
By Emmett on Saturday, August 23 2008, 18:03 - Permalink
We have a programming problem we give out at Justin.tv: write a parser to convert arithmetic expressions from infix to prefix (should deal with parens and arbitrary variable names). I wrote a version when I first came up with the problem, but I've lost it. So I thought I'd try writing it again and see how short I could get it.
# parse simple (no parens) arithmetic expressions => prefix expressions
def parse_simple(expr, opers=%w(+ - * /))
return expr if opers.empty?
return parse_simple(expr, opers[1..-1]) unless expr.include?(opers.first)
"(#{opers.first} " +
expr.split(opers.first).map {|sub_expr| parse_simple(sub_expr, opers[1..-1]).strip}.join(" ") +
")"
end
# remove parens, then parse simple
def parse(expr, exprs = [])
while expr.gsub!(/\(([^\(\)]+)\)/) {"expr:#{(exprs << parse_simple($1)).size - 1}"}; end
expr = parse_simple(expr)
while expr.gsub!(/expr:(\d+)/) {exprs[$1.to_i]}; end
expr
end
I'm fairly satisfied with it; you could remove some lines through stupid ruby tricks, but I don't see any more concise way to do it. I'd love to see your stab at it though - how short can you get it? Language of choice.
Comments
buggy code...
irb(main):022:0> parse_simple("2+4 * 45 / 2 + 9")
=> "(+ 2 (* 4 (/ 45 2)) 9)"
That seems to be working as intended. 2+(4*(45/2))+9
no, the order of operations is wrong. it should result in
"(+ 2 (/ (* 4 45) 2) 9)"
I thought you didn't need parans. Isn't that the whole point of prefix or postfix expressions. So compilers don't face ambiguity when it comes to evaluating expressions?
http://www.allinterview.com/showans...
Anon, you don't need parenthesis if you consider each operand as binary, like in + 1 + 2 + 3 4, but here they are making them n-ary, as in (+ 1 2 3 4). That way you need parenthesis to delimit the scope of the operand.
If you don't mind a regex-heavy approach, here's a single-method version in Ruby that handles precedence correctly and maintains the ability to pass in (single-character, precedence-grouped) operators:
<code>
def in_to_pre_fix(expr, opers=%w(*/ +-))
token = "\\s*(?:(?:<([^>]+)>)|([\\d]+))\\s*"
while expr !~ /^<[^>]*>$/ and expr.sub!(/(?:^|\()([^)(]*)(?:\)|$)/) { | m |
opers.each { | oper |
while m.sub!(Regexp.new("#{token}([#{Regexp.quote(oper)}])#{token}"),
'<\3 \1\2 \4\5>'); end }
m.tr_s("() "," ").strip() }; end
expr.delete("<>")
end
</code>
I'm new to Ruby, so this may not be typical idiom or whitespace.