Peggy
Generate Packrat PEG parsers for Julia
Features:
- pretty good syntax error messages.
- detects indirect left-recursive rules during compilation, so only left-recursive rules pay a performance cost.
- both combinator functions and a macro are provided.
A Peggy.Parser is a function that takes a string as input and returns its parsed value.
Create parsers using either a succinct Peggy expression via the @peg macro or lower-level functions.
Index
Peggy.Parser
Base.:!
Base.:*
Base.:|
Base.parse
Base.tryparse
Peggy.ANY
Peggy.CHAR
Peggy.END
Peggy.fail
Peggy.followedby
Peggy.grammar
Peggy.many
Peggy.not
Peggy.oneof
Peggy.peggy
Peggy.@peg
Peggy.Parser — Type
(parser::Parser)(input)
A Peggy.Parser is a function that takes a string as input and returns its parsed value.
Base.:! — Method
!(p::Parser) == not(p)
Base.:* — Method
p::Parser * n == many(p; min=n)
p::Parser * (a:b) == many(p; min=a, max=b)
Base.:| — Method
p1::Parser | p2::Parser == oneof(p1, p2)
A short form for oneof.
Base.parse — Method
Base.parse(p::Parser, s::AbstractString)
(p::Parser)(s::AbstractString)
Parse the input with the parser.
Returns the resulting value or throws a ParseException.
Base.tryparse — Method
tryparse(parser, input)
Like parser(input), but returns nothing if the parse fails.
Peggy.ANY — Method
A PEG parser that matches any character and yields it as a string.
Peggy.CHAR — Method
CHAR(charclass::String)
Create a parser for a single character matching a regex character class.
Functionally identical to Regex("[charclass]") except that it is known to never match an empty string. This matters because it avoids unnecessary and expensive left-recursion overhead.
Examples
julia> g = @grammar begin
           number = [ digit ds:(digit...) { parse(Int, *(digit, ds...)) } ]
           digit = CHAR("[:digit:]")
       end;

julia> g("1234")
1234
Peggy.END — Method
A PEG parser that matches the end of the input; yields result ().
Peggy.fail — Method
fail(message) => Parser
A parser that always fails with the given message.
Useful for producing custom error messages.
Peggy.followedby — Method
followedby(expr...)
!!(e::Parser)
Create a parser that matches expr but consumes nothing.
Peggy.grammar — Method
grammar([start::Symbol], (symbol => expr)...)
Create a parser from a set of productions, which are named, mutually recursive parsers.
Parsers that are members of a grammar can reference other member parsers by their symbols.
If start is omitted, the symbol of the first production is used.
Peggy.many — Method
many(exprs...; min=0, max=missing)
Create a parser that matches zero or more repetitions of the sequence exprs... and returns a vector of results.
Peggy.not — Method
not(expr)
Create a parser that fails if expr succeeds. Otherwise it succeeds with value ().
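The two lookahead forms, not and followedby, can be pictured with a toy model in plain Julia. This is only a sketch and not Peggy's implementation: the (input, pos) calling convention and the names literal, negative_lookahead and positive_lookahead are hypothetical. The value () for the positive case is assumed from the documented equivalence with !!.

```julia
# Toy model, not Peggy's implementation: a parser is a function
# (input, pos) -> (value, next_pos), or `nothing` on failure.
literal(s) = function (input, pos)
    rest = SubString(input, pos)
    return startswith(rest, s) ? (s, pos + ncodeunits(s)) : nothing
end

# Negative lookahead: succeed with () only when `p` fails; the position never advances.
negative_lookahead(p) = (input, pos) -> p(input, pos) === nothing ? ((), pos) : nothing

# Positive lookahead: succeed with () only when `p` succeeds; the position never advances.
positive_lookahead(p) = (input, pos) -> p(input, pos) === nothing ? nothing : ((), pos)

negative_lookahead(literal("a"))("b", 1)    # ((), 1)
positive_lookahead(literal("a"))("abc", 1)  # ((), 1)
```

In both cases the returned position equals the starting position, which is what "consumes nothing" means.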
Peggy.oneof — Method
oneof(pegexpr...)
Create a parser for ordered alternatives.
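"Ordered" means the alternatives are tried left to right and the first one that succeeds is taken, even if a later one would match more input. The following plain-Julia sketch illustrates that semantics; it is a toy model rather than Peggy's code, and the names literal and ordered_choice as well as the (input, pos) convention are hypothetical.

```julia
# Toy model, not Peggy's implementation: a parser is a function
# (input, pos) -> (value, next_pos), or `nothing` on failure.
literal(s) = function (input, pos)
    rest = SubString(input, pos)
    return startswith(rest, s) ? (s, pos + ncodeunits(s)) : nothing
end

# Ordered choice: try alternatives left to right and commit to the first success.
function ordered_choice(parsers...)
    return function (input, pos)
        for p in parsers
            result = p(input, pos)
            result !== nothing && return result
        end
        return nothing
    end
end

keyword = ordered_choice(literal("in"), literal("int"))
keyword("int", 1)   # ("in", 3): the first alternative wins, "int" is never tried
```

Because of this, longer alternatives generally need to be listed before their prefixes.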
Peggy.peggy — Method
peggy(expr...; whitespace=r"[[:space:]]*")
Create a Parser from a PEG expression.
The parser matches each expr sequentially and returns the combined results (details below).
Each expr can be one of the following.
- String: matches and yields the string literal.
- Regex: matches the Regex and yields the match value (but avoid this).
- Symbol: matches the expression associated with the symbol in a grammar.
- Symbol => expr: matches expr and assigns it a name.
- expr => callable: matches expr and yields the result of applying callable to its value.
- expr => k: shorthand for expr => _ -> k.
- (expr, exprs...): same as peggy(expr, exprs...).
- [expr, exprs...]: same as many(expr, exprs...; max=1).
- Parser: any expression that yields a parser.
Names and sequence results
Each element of a sequence has a name. Symbol and Symbol => expr take the name of the symbol. All other expressions are named ":_". The value of the sequence is then formed as follows.
Values whose names start with "_" are discarded, provided at least one value has a name that does not. If a single value remains, that is the sequence value. Otherwise the value is a Vector of the remaining values.
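The rule above can be sketched in plain Julia. The helper below is hypothetical and not part of Peggy's API; it assumes each sequence element is represented as a name => value pair and simply restates the documented rule as code.

```julia
# Hypothetical helper illustrating the documented rule; not part of Peggy's API.
# `items` is a vector of name => value pairs, one per sequence element.
function sequence_value(items)
    isanonymous(name) = startswith(String(name), "_")
    # Drop "_"-prefixed names only if at least one element has a real name.
    if any(!isanonymous(first(it)) for it in items)
        kept = [it for it in items if !isanonymous(first(it))]
    else
        kept = items
    end
    vals = [last(it) for it in kept]
    # A single surviving value is returned bare; otherwise return a Vector.
    return length(vals) == 1 ? vals[1] : vals
end

sequence_value([:_ => "(", :x => 1, :_ => ")"])   # 1
sequence_value([:x => 1, :_ => "+", :y => 2])     # [1, 2]
sequence_value([:_ => "a", :_ => "b"])            # ["a", "b"]
```

The last call shows the "if there are any that do not" clause: when every element is anonymous, nothing is discarded.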
Whitespace
By default, string literals ignore surrounding whitespace. Use the option whitespace=r"" to disable this.
Peggy.@peg — Macro
Create a Peggy.Parser from a Peggy expression.