Introduction

Rena is a library for parsing text.
Rena can describe not only regular languages but also recursive patterns such as mathematical formulae.
Rena can also handle synthesized and inherited attributes.
'Rena' is an acronym for REpetition (or REcursion) Notation API.

Creating a Parser with Rena

A parser made by Rena consists of matching objects.

Matching Object Factory

Class Rena is a factory of matching objects.
The type of the attribute is specified by a generic type parameter.

Rena<Double> fac = new Rena<Double>();

Creating Matching Objects

Interface PatternMatcher is the base interface of matching objects. Matching objects are created by the matching object factory.
A simple string, a regular expression, or another matching object can be given as an argument.
The methods which create matching objects are as follows.

Table 1. Matching Object
Method    Description
string    Simple string
regex     Regular expression
then      PatternMatcher itself

Rena<Double> fac = new Rena<Double>();
PatternMatcher<Double> parser = fac.string("string");

Matching String

Method parse matches a pattern against a string.
It returns an object with the properties shown below, or null if the string does not match.

Table 2. Return Value of Method parse
Property   Description
match      Matched string
lastIndex  Last matched position in the string
attribute  Attribute

The matching methods of a matching object are as follows.

Table 3. Matching Methods
Method           Description
parse            Matches the entire string
parsePart        Matches a part of the string
parsePartGlobal  Matches all parts of the string
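
As a rough analogy for the three matching modes in Table 3, the following sketch uses the standard java.util.regex package instead of Rena. The class MatchModes and its method names are made up for this illustration and are not part of Rena's API. The sketch only shows the difference between matching the entire string, matching one part, and matching all parts.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration class; it does not use Rena.
class MatchModes {
    private static final Pattern DIGITS = Pattern.compile("[0-9]+");

    // Entire match: succeeds only if the pattern covers the whole input.
    public static boolean matchesEntire(String input) {
        return DIGITS.matcher(input).matches();
    }

    // Partial match: returns the first matching part, or null if there is none.
    public static String firstPart(String input) {
        Matcher m = DIGITS.matcher(input);
        return m.find() ? m.group() : null;
    }

    // Global partial match: collects every matching part from left to right.
    public static List<String> allParts(String input) {
        List<String> parts = new ArrayList<>();
        Matcher m = DIGITS.matcher(input);
        while (m.find()) {
            parts.add(m.group());
        }
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(matchesEntire("765"));      // true
        System.out.println(matchesEntire("765pro"));   // false
        System.out.println(firstPart("pro765and346")); // 765
        System.out.println(allParts("pro765and346"));  // [765, 346]
    }
}
```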

Concatenation

Method then concatenates matching objects.

Rena<Double> fac = new Rena<Double>();
PatternMatcher<Double> parser = fac.string("765").then(fac.string("pro"));
System.out.println(parser.parse("765pro").getMatch());  // "765pro"

Alternation

Method or chooses between alternative matching objects.

Rena<Double> fac = new Rena<Double>();
PatternMatcher<Double> parser = fac.or(fac.string("765"), fac.string("346"), fac.string("283"));
System.out.println(parser.parse("765").getMatch());  // "765"

Repetition

Repeating n times or more

Method atLeast repeats a pattern n times or more.
Method thenAtLeast is a combination of then and atLeast.

Rena<Double> fac = new Rena<Double>();
PatternMatcher<Double> parser = fac.regex("[0-9]").atLeast(3);
System.out.println(parser.parse("765").getMatch());  // "765"

Method zeroOrMore repeats a pattern zero times or more, and method oneOrMore repeats it one time or more.

Caution

Method then cannot be chained after a repetition method.

Repeating from n times to m times

Method times repeats a pattern from n times to m times.
Method thenTimes is a combination of then and times.

Rena<Double> fac = new Rena<Double>();
PatternMatcher<Double> parser = fac.regex("[0-9]").times(2, 4);
System.out.println(parser.parse("765").getMatch());  // "765"

Method atMost repeats a pattern from zero times to m times, and method maybe repeats it zero times or one time.
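
For readers who know regular expressions, the repetition bounds above correspond roughly to regex quantifiers. The sketch below uses java.util.regex, not Rena. The class RepetitionBounds, its methods, and the convention that a negative max means "no upper bound" are assumptions of this illustration only.

```java
import java.util.regex.Pattern;

// Hypothetical illustration class; it does not use Rena.
class RepetitionBounds {
    // Builds a regex that repeats a single digit between min and max times.
    // A negative max stands for "no upper bound" (a convention of this sketch).
    public static Pattern digits(int min, int max) {
        String quantifier = max < 0
                ? "{" + min + ",}"
                : "{" + min + "," + max + "}";
        return Pattern.compile("[0-9]" + quantifier);
    }

    // True if the entire input consists of between min and max digits.
    public static boolean matchesEntire(int min, int max, String input) {
        return digits(min, max).matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(matchesEntire(3, -1, "765"));  // at least 3: true
        System.out.println(matchesEntire(3, -1, "76"));   // at least 3: false
        System.out.println(matchesEntire(2, 4, "765"));   // 2 to 4: true
        System.out.println(matchesEntire(2, 4, "76543")); // 2 to 4: false
        System.out.println(matchesEntire(0, 1, ""));      // zero or one: true
    }
}
```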

Pattern Delimited by Delimiter

Method delimit matches patterns which are separated by a delimiter.
Method thenDelimit is a combination of then and delimit.

Rena<Double> fac = new Rena<Double>();
PatternMatcher<Double> parser = fac.regex("[0-9]").delimit(",");
System.out.println(parser.parse("7,6,5").getMatch());  // "7,6,5"

Lookahead

Method lookahead matches a pattern but does not consume the input.
Method lookaheadNot matches if a pattern is not matched.

Rena<Double> fac = new Rena<Double>();
PatternMatcher<Double> parser = fac.string("765").lookahead(fac.string("pro"));
System.out.println(parser.parseStart("765pro").getMatch());  // "765"
System.out.println(parser.parseStart("765pr"));              // null
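
The same idea exists in regular expressions as the lookahead groups (?=...) and (?!...). The sketch below uses java.util.regex rather than Rena, purely as an analogy. The class LookaheadSketch and the helper matchAtStart are hypothetical names introduced for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration class; it does not use Rena.
class LookaheadSketch {
    // "765" only when immediately followed by "pro"; "pro" is not consumed.
    public static final Pattern POSITIVE = Pattern.compile("765(?=pro)");
    // "765" only when NOT immediately followed by "pro".
    public static final Pattern NEGATIVE = Pattern.compile("765(?!pro)");

    // Matches at the start of the input; returns the matched text or null.
    public static String matchAtStart(Pattern pattern, String input) {
        Matcher m = pattern.matcher(input);
        return m.lookingAt() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(matchAtStart(POSITIVE, "765pro")); // 765
        System.out.println(matchAtStart(POSITIVE, "765pr"));  // null
        System.out.println(matchAtStart(NEGATIVE, "765pr"));  // 765
        System.out.println(matchAtStart(NEGATIVE, "765pro")); // null
    }
}
```

Note that the matched text is "765" in the successful cases: the lookahead checks what follows without adding it to the match.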

Keywords

Method key matches a token which was given to the factory's constructor.

Rena<Double> fac = new Rena<Double>(new String[] { "+", "++", "+=" });
PatternMatcher<Double> parser = fac.key("+");
System.out.println(parser.parse("+=").getMatch());  // "+="

Recursion

Method Rena.letrec is used for recursion.
The arguments of each lambda function are the matching objects returned by the lambda functions, so a pattern can refer to itself.

Here is an example which matches balanced parentheses.

Rena<Double> fac = new Rena<Double>();
var parser = Rena.letrec(
  x -> fac.maybe(fac.string("(").then(x).then(fac.string(")")))
);
System.out.println(parser.parse("((()))").getMatch());  // "((()))"
System.out.println(parser.parse("((())"));              // null
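
To make the recursion explicit, here is a hand-written sketch of the same grammar, x = ( "(" x ")" )?, as a plain recursive Java method. It does not use Rena and is not how Rena is implemented. The class BalancedParens and its methods are hypothetical names for this illustration.

```java
// Hypothetical illustration class; it does not use Rena.
class BalancedParens {
    // Tries to match x = ( "(" x ")" )? starting at pos.
    // Returns the index just past the match; an empty match returns pos.
    public static int matchX(String s, int pos) {
        if (pos < s.length() && s.charAt(pos) == '(') {
            int afterInner = matchX(s, pos + 1); // recursive reference to x
            if (afterInner < s.length() && s.charAt(afterInner) == ')') {
                return afterInner + 1;
            }
        }
        // The optional part is absent or failed: match the empty string.
        return pos;
    }

    // True only if the pattern covers the entire string.
    public static boolean matchesEntire(String s) {
        return matchX(s, 0) == s.length();
    }

    public static void main(String[] args) {
        System.out.println(matchesEntire("((()))")); // true
        System.out.println(matchesEntire("((())"));  // false
    }
}
```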

Skipping Whitespaces

If you specify a regular expression to skip, the parts of the string which match that pattern are skipped.

Rena<Double> fac = new Rena<Double>(" +");
PatternMatcher<Double> parser = fac.oneOrMore(fac.regex("[0-9]"));
System.out.println(parser.parse("7   6  5").getMatch());  // "7   6  5"

Attribute

One of the features of Rena is its handling of attributes.
Both inherited and synthesized attributes can be used.
To use attributes, add an action function after a matching method. The arguments of the action function are as follows.

Table 4. Action Function
Argument  Description
1st       Matched string
2nd       Attribute of the matched pattern (synthesized attribute)
3rd       Attribute before concatenation (inherited attribute)

An initial attribute value can be given to repetition methods.

Here is an example of an action function which parses an integer.

Rena<Double> fac = new Rena<Double>();
PatternMatcher<Double> parser = fac.oneOrMore(
    fac.regex("[0-9]", (match, synthesize, inherit) -> Double.parseDouble(match)),
    (match, synthesize, inherit) -> inherit * 10 + synthesize, 0.0);
System.out.println(parser.parse("765").getAttribute());  // 765.0
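
The way the attribute flows through the repetition can be traced with an ordinary loop. The sketch below is plain Java, not Rena, and assumes that the repetition applies the action once per matched digit, starting from the initial value. The class DigitFold and the method parseDigits are hypothetical names for this illustration.

```java
// Hypothetical illustration class; it does not use Rena.
class DigitFold {
    // Folds the digits of the input from left to right.
    // "inherit" is the attribute carried in from the previous step and
    // "synthesize" is the attribute produced by the current digit.
    public static double parseDigits(String digits) {
        double inherit = 0.0; // initial attribute value
        for (char c : digits.toCharArray()) {
            double synthesize = Double.parseDouble(String.valueOf(c));
            inherit = inherit * 10 + synthesize;
        }
        return inherit;
    }

    public static void main(String[] args) {
        // Steps for "765": 0*10+7 = 7, 7*10+6 = 76, 76*10+5 = 765
        System.out.println(parseDigits("765")); // 765.0
    }
}
```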

Examples

Arithmetic operations

Rena<Double> r = new Rena<Double>();
PatternMatcher<Double> expr = r.then(Rena.letrec(
  (term, factor, element) -> r.then(factor).thenZeroOrMore(r.or(
               r.string("+").then(
                   factor,
                   (match, synthesize, inherit) -> inherit + synthesize),
               r.string("-").then(
                   factor,
                   (match, synthesize, inherit) -> inherit - synthesize))),
  (term, factor, element) -> r.then(element).thenZeroOrMore(r.or(
               r.string("*").then(
                   element,
                   (match, synthesize, inherit) -> inherit * synthesize),
               r.string("/").then(
                   element,
                   (match, synthesize, inherit) -> inherit / synthesize))),
  (term, factor, element) -> r.or(
               r.regex("[0-9]+",
                   (match, synthesize, inherit) -> Double.parseDouble(match)),
               r.string("(").then(term).then(r.string(")"))))).end();

// outputs 7.0
System.out.println(expr.parse("1+2*3", 0).getAttribute());

// outputs 1.0
System.out.println(expr.parse("4-6/2", 0).getAttribute());
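
For comparison, the same grammar (term, factor, element) can be written by hand as a recursive descent evaluator. The sketch below is plain Java and does not use Rena. The class ArithmeticSketch and all of its members are hypothetical names for this illustration. The local variable inherit plays the role of the inherited attribute in the actions above.

```java
// Hypothetical illustration class; it does not use Rena.
class ArithmeticSketch {
    private final String src;
    private int pos = 0;

    private ArithmeticSketch(String src) {
        this.src = src;
    }

    // Evaluates the whole input; throws if any input is left over.
    public static double evaluate(String src) {
        ArithmeticSketch p = new ArithmeticSketch(src);
        double value = p.term();
        if (p.pos != src.length()) {
            throw new IllegalArgumentException("unexpected input at index " + p.pos);
        }
        return value;
    }

    // Consumes the character c if it is next in the input.
    private boolean eat(char c) {
        if (pos < src.length() && src.charAt(pos) == c) {
            pos++;
            return true;
        }
        return false;
    }

    // term = factor (("+" | "-") factor)*
    private double term() {
        double inherit = factor();
        while (true) {
            if (eat('+')) {
                inherit = inherit + factor();
            } else if (eat('-')) {
                inherit = inherit - factor();
            } else {
                return inherit;
            }
        }
    }

    // factor = element (("*" | "/") element)*
    private double factor() {
        double inherit = element();
        while (true) {
            if (eat('*')) {
                inherit = inherit * element();
            } else if (eat('/')) {
                inherit = inherit / element();
            } else {
                return inherit;
            }
        }
    }

    // element = [0-9]+ | "(" term ")"
    private double element() {
        if (eat('(')) {
            double value = term();
            if (!eat(')')) {
                throw new IllegalArgumentException("expected ')' at index " + pos);
            }
            return value;
        }
        int start = pos;
        while (pos < src.length() && src.charAt(pos) >= '0' && src.charAt(pos) <= '9') {
            pos++;
        }
        if (start == pos) {
            throw new IllegalArgumentException("expected a number at index " + pos);
        }
        return Double.parseDouble(src.substring(start, pos));
    }

    public static void main(String[] args) {
        System.out.println(evaluate("1+2*3"));   // 7.0
        System.out.println(evaluate("4-6/2"));   // 1.0
        System.out.println(evaluate("(1+2)*3")); // 9.0
    }
}
```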