Peggy

Generate Packrat PEG parsers for Julia

Features:

  • pretty good syntax error messages.
  • detects indirect left-recursive rules during compilation; only left-recursive rules pay a performance cost.
  • provides both combinator functions and a macro.

A Peggy.Parser is a function that takes a string as input and returns its parsed value.

Create parsers using either a succinct Peggy expression via the @peg macro or lower-level functions.

Index

Peggy.Parser - Type
(parser::Parser)(input)

A Peggy.Parser is a function that takes a string as input and returns its parsed value.

Base.:* - Method
p::Parser * n == many(p; min=n)
p::Parser * (a:b) == many(p; min=a, max=b)

Base.parse - Method
Base.parse(p::Parser, s::AbstractString)
(p::Parser)(s::AbstractString)

Parse the input with the parser.

Returns the resulting value or throws a ParseException.

Base.tryparse - Method
tryparse(parser, input)

Like parser(input), but returns nothing if the parse fails.

Peggy.ANY - Method

A PEG parser that matches any character and yields it as a string.

Peggy.CHAR - Method
CHAR(charclass::String)

Create a parser for a single character matching a regex character class.

Functionally identical to Regex("[charclass]"), except that it is known never to match an empty string. This matters because it avoids unnecessary and expensive left-recursion overhead.

Examples

julia> g = @grammar begin
       number = [ digit ds:(digit...)  { parse(Int, *(digit, ds...)) } ]
       digit = CHAR("[:digit:]")
       end;

julia> g("1234")
1234

Peggy.END - Method

A PEG parser that matches the end of the input; yields result ().

Peggy.fail - Method
fail(message) => Parser

A parser that always fails with the given message.

Useful for error messages.

Peggy.followedby - Method
followedby(expr...)
!!(e::Parser)

Create a parser that matches expr but consumes nothing.

Peggy.grammar - Method
grammar([start::Symbol], (symbol => expr)...)

Create a parser from a set of productions, which are named, mutually recursive parsers.

Parsers that are members of a grammar can reference other member parsers by their symbols.

If start is omitted, the symbol of the first production is used.

Peggy.many - Method
many(exprs...; min=0, max=missing)

Create a parser that matches repetitions of the sequence exprs... and returns a vector of results. By default it matches zero or more repetitions; min and max bound the number of repetitions.

Peggy.not - Method
not(expr)

Create a parser that fails if expr matches. Otherwise it succeeds with value ().
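
The two lookahead forms, not and followedby, can be pictured with a toy model that does not use Peggy at all. In this sketch, which is an illustration under stated assumptions and not Peggy's implementation, a parser is a plain function of (input, pos) that returns (value, newpos) on success or nothing on failure. The names literal, negative_lookahead and positive_lookahead are hypothetical, and the () value of the positive form is an assumption that follows from writing it as a double negation, mirroring the !! notation.

```julia
# Toy model, independent of Peggy: a parser is a function
# (input, pos) -> (value, newpos) on success, or nothing on failure.
function literal(s)
    return (input, pos) ->
        startswith(SubString(input, pos), s) ? (s, pos + ncodeunits(s)) : nothing
end

# Negative lookahead: succeeds with () only when `p` fails.
# The position is returned unchanged, so nothing is consumed.
function negative_lookahead(p)
    return (input, pos) -> p(input, pos) === nothing ? ((), pos) : nothing
end

# Positive lookahead written as a double negation (assumption: value is ()).
positive_lookahead(p) = negative_lookahead(negative_lookahead(p))

not_a = negative_lookahead(literal("a"))
ahead_a = positive_lookahead(literal("a"))
```

In both cases the returned position equals the starting position, which is what "consumes nothing" means in this model.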

Peggy.oneof - Method
oneof(pegexpr...)

Create a parser for ordered alternatives.
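
"Ordered" means that alternatives are tried left to right and the first one that matches wins, even if a later alternative would match more input. The sketch below shows this with a toy model that is independent of Peggy: a parser is assumed to be a function of (input, pos) returning (value, newpos) or nothing, and literal and ordered_choice are hypothetical helper names, not Peggy functions.

```julia
# Toy model, independent of Peggy: a parser is a function
# (input, pos) -> (value, newpos) on success, or nothing on failure.
function literal(s)
    return (input, pos) ->
        startswith(SubString(input, pos), s) ? (s, pos + ncodeunits(s)) : nothing
end

# Ordered choice: try each alternative in order and return the first success.
function ordered_choice(alternatives...)
    return function (input, pos)
        for alt in alternatives
            result = alt(input, pos)
            result !== nothing && return result
        end
        return nothing  # every alternative failed
    end
end

short_first = ordered_choice(literal("a"), literal("ab"))
long_first = ordered_choice(literal("ab"), literal("a"))
```

On the input "ab", short_first stops after "a" while long_first matches "ab", so when one alternative is a prefix of another, the longer one generally belongs first.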

Peggy.peggy - Method
peggy(expr...; whitespace=r"[[:space:]]*")

Create a Parser from a PEG expression.

The parser matches each expr sequentially and returns the combined results (details below).

Each expr can be one of the following.

  • String - matches & yields the string literal
  • Regex - matches the Regex and yields the match value (but avoid this)
  • Symbol - matches the expression associated with the symbol in a grammar.
  • Symbol => expr - matches expr and assigns it a name.
  • expr => callable - matches expr and yields result of applying callable to its value.
  • expr => k - short-hand for expr => _ -> k
  • (expr, exprs...) - same as peggy(expr, exprs...)
  • [expr, exprs...] - same as many(expr, exprs...; max=1)
  • Parser - any expression that yields a parser

Names and sequence results

Each element of a sequence has a name. Symbol and Symbol => expr take the name of the symbol. All other expressions are named ":_". The value of the sequence is then formed as follows.

Discard values with names starting with "_" if there are any that do not. If a single value remains, that is the sequence value. Otherwise the value is a Vector of the remaining values.
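
The rule above can be written out as a small plain-Julia function. This is only a sketch of the rule as stated, not Peggy's code: sequence_value is a hypothetical helper, and representing a matched sequence as a vector of name => value pairs is an assumption made for illustration.

```julia
# Hypothetical helper, not part of Peggy: applies the naming rule
# to a vector of `name => value` pairs, one pair per sequence element.
function sequence_value(items)
    isunnamed(name) = startswith(String(name), "_")
    named = [v for (n, v) in items if !isunnamed(n)]
    # Values named "_..." are discarded only if some other value has a real name.
    kept = isempty(named) ? [v for (_, v) in items] : named
    # A single remaining value is returned bare; otherwise return the vector.
    return length(kept) == 1 ? kept[1] : kept
end

parenthesized = sequence_value([:_ => "(", :expr => 42, :_ => ")"])
all_unnamed = sequence_value([:_ => "a", :_ => "b"])
two_named = sequence_value([:x => 1, :_ => "+", :y => 2])
```

Here parenthesized is the bare value 42, all_unnamed keeps both strings because nothing has a real name, and two_named keeps only the values of x and y.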

Whitespace

String literals by default ignore surrounding whitespace. Use option whitespace=r"" to disable this.
