PlantUML Grammar

Irwansyah
Sep 25, 2018

I work with microservices, so I sometimes write code that acts as a composer: it calls one or more microservices and processes the results. I also have to create a sequence diagram for the front-end developer so that he can understand the flow of the calls.

One day a colleague showed me how he creates his sequence diagrams. He uses PlantUML: he writes a small DSL and the editor renders the resulting sequence diagram in real time. That got me thinking that I could use the same DSL to generate the composer code as well, along with its configuration, the integration tests, and probably the unit tests.

To do that, I need a PlantUML grammar. I searched the net and couldn't find one. PlantUML only has a language reference, and it parses the language with regular expressions rather than a formal grammar. If I want to compile the language into my own code, it seems easier to start from a grammar.
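To make the regex approach concrete, here is a minimal sketch of matching a single call line with one pattern. The pattern and the `matchCall` helper are my own illustration, not code taken from PlantUML.

```javascript
// Hypothetical pattern (not from PlantUML's source):
//   <name> <arrow> <name> : <message>
const CALL_LINE = /^\s*(\w+)\s*(<?-+>?)\s*(\w+)\s*:\s*(.*)$/;

// Returns the parts of a call line, or null when the line does not match.
function matchCall(line) {
  const m = CALL_LINE.exec(line);
  if (m === null) return null;
  return { from: m[1], arrow: m[2], to: m[3], message: m[4] };
}

console.log(matchCall("Bob -> Alice: adasd"));
// An arrow variant the pattern was not written for is simply rejected.
console.log(matchCall("Bob ->x Alice: asdad"));
```

Every new arrow style or modifier means touching the pattern again, which is the part a grammar with named rules should make easier to maintain.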

So I looked for a parser toolkit that is convenient to use in the `nodejs` world. I found nearleyjs, and someone had already built an online playground here, so I could start hacking right away.

After a while I had what seems to be a working PlantUML grammar. I put it in a gist here. If you want to see it directly, I have pasted it below:

statement -> element_expr:*  {% id %}
| null {% id %}

element_expr -> actor_expr {% id %}
| boundary_expr {% id %}
| call_expr {% id %}
| participant_expr {% id %}


actor_expr -> ACTOR {% d => d[0] %}
| ACTOR _ ELEMENT_COLOR {% d => [ d[0][0], d[0][1], d[2] ] %}
| ACTOR _ ELEMENT_ORDER {% d => [ d[0][0], d[0][1], d[2] ] %}
| ACTOR _ ELEMENT_COLOR _ ELEMENT_ORDER {% d => [ d[0][0], d[0][1], d[2], d[4] ] %}
| ACTOR _ ELEMENT_ORDER _ ELEMENT_COLOR {% d => [ d[0][0], d[0][1], d[2], d[4] ] %}

boundary_expr -> BOUNDARY {% d => d[0] %}
| BOUNDARY _ ELEMENT_COLOR {% d => [ d[0][0], d[0][1], d[2] ] %}
| BOUNDARY _ ELEMENT_ORDER {% d => [ d[0][0], d[0][1], d[2] ] %}
| BOUNDARY _ ELEMENT_COLOR _ ELEMENT_ORDER {% d => [ d[0][0], d[0][1], d[2], d[4] ] %}
| BOUNDARY _ ELEMENT_ORDER _ ELEMENT_COLOR {% d => [ d[0][0], d[0][1], d[2], d[4] ] %}

participant_expr -> PARTICIPANT {% d => d[0] %}
| PARTICIPANT _ ELEMENT_COLOR {% d => [ d[0][0], d[0][1], d[2] ] %}
| PARTICIPANT _ ELEMENT_ORDER {% d => [ d[0][0], d[0][1], d[2] ] %}
| PARTICIPANT _ ELEMENT_COLOR _ ELEMENT_ORDER {% d => [ d[0][0], d[0][1], d[2], d[4] ] %}
| PARTICIPANT _ ELEMENT_ORDER _ ELEMENT_COLOR {% d => [ d[0][0], d[0][1], d[2], d[4] ] %}

call_expr -> CALL {% d => d[0] %}

id -> STRING {% d => ["IDENTIFIER", d[0][1] ] %}
| dqstring {% d => ["IDENTIFIER", d[0] ] %}
integer -> INT {% id %}

CALL_ARROW -> CALL_ARROW_BODY CALL_ARROW_HEAD CALL_ARROW_HEAD_MODIFIER
| CALL_ARROW_BODY CALL_ARROW_HEAD
| CALL_ARROW_HEAD_MODIFIER CALL_ARROW_HEAD CALL_ARROW_BODY



CALL_ARROW_BODY -> "-"
| "--"
| "->"
| "<-"

CALL_ARROW_HEAD -> ">"
| "<"
| "\\"
| "\\\\"
| "/"

CALL_ARROW_HEAD_MODIFIER -> "x"
| ">"
| "<"
| "\\"
| "/"
| "o"

CALL -> id _ CALL_ARROW _ id _ ":" _ MESSAGE_NAME _ {% d => ["CALL", d[0], d[4], d[8]] %}
ACTOR -> "actor" _ id _ {% d => ["ACTOR", d[2] ] %}
BOUNDARY -> "boundary" _ id _ {% d => ["BOUNDARY", d[2] ] %}
PARTICIPANT -> "participant" _ id _ {% d => ["PARTICIPANT", d[2] ] %}
ELEMENT_ORDER -> "order" _ INT _ {% d => ["ELEMENT_ORDER", d[2] ] %}
MESSAGE_NAME -> dstrchar:* {% d => ["MESSAGE_NAME", d[0].join('')] %}
ELEMENT_COLOR -> "#" STRING _ {% d => ["ELEMENT_COLOR", d[1][1] ] %}

INT -> [0-9]:+ {% d => ["INT", d[0].join('')] %}
STRING -> [a-zA-Z]:+ {% d => ["STRING", d[0].join('')] %}
STRINGW -> [a-zA-Z ]:+ {% d => ["STRING", d[0].join('')] %}
WHITESPACE -> _

_ -> wschar:* {% function(d) {return null;} %}
__ -> wschar:+ {% function(d) {return null;} %}

wschar -> [ \t\n\v\f] {% id %}

# Matches various kinds of string literals

# Double-quoted string
dqstring -> "\"" dstrchar:* "\"" {% function(d) {return d[1].join(""); } %}
sqstring -> "'" sstrchar:* "'" {% function(d) {return d[1].join(""); } %}
btstring -> "`" [^`]:* "`" {% function(d) {return d[1].join(""); } %}

dstrchar -> [^\\"\n] {% id %}
| "\\" strescape {%
function(d) {
return JSON.parse("\""+d.join("")+"\"");
}
%}

sstrchar -> [^\\'\n] {% id %}
| "\\" strescape
{% function(d) { return JSON.parse("\""+d.join("")+"\""); } %}
| "\\'"
{% function(d) {return "'"; } %}

strescape -> ["\\/bfnrt] {% id %}
| "u" [a-fA-F0-9] [a-fA-F0-9] [a-fA-F0-9] [a-fA-F0-9] {%
function(d) {
return d.join("");
}
%}
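
The postprocessors in the grammar build tagged tuples such as `["CALL", from, to, message]`, and that shape is what makes the code generation idea workable. Below is a minimal sketch of walking such a tree and emitting composer stubs. The sample tree is hand-written (not captured parser output) and assumes the `CALL` tuple carries the `MESSAGE_NAME` node in its last slot; `callService` is a hypothetical helper name.

```javascript
// Hand-written sample in the tuple shape the postprocessors aim for.
// This is an assumption for illustration, not real parser output.
const tree = [
  ["PARTICIPANT", ["IDENTIFIER", "rob"]],
  ["CALL", ["IDENTIFIER", "Bob"], ["IDENTIFIER", "Alice"], ["MESSAGE_NAME", "adasd"]],
  ["CALL", ["IDENTIFIER", "Alice"], ["IDENTIFIER", "Bob"], ["MESSAGE_NAME", "ok"]],
];

// Keep only CALL nodes and flatten them into plain objects.
function collectCalls(nodes) {
  return nodes
    .filter((node) => Array.isArray(node) && node[0] === "CALL")
    .map(([, from, to, message]) => ({
      from: from[1],
      to: to[1],
      message: message ? message[1] : "",
    }));
}

// Emit one stub line per call; callService is a hypothetical helper.
function emitComposerStub(calls) {
  return calls
    .map((c) => `await callService(${JSON.stringify(c.to)}, ${JSON.stringify(c.message)}); // from ${c.from}`)
    .join("\n");
}

console.log(emitComposerStub(collectCalls(tree)));
```

The same list of calls could feed other generators, for example one that writes integration test skeletons, since each entry already names the caller, the callee, and the message.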

Disclaimer: I haven't run the grammar through nearley myself; I only used the playground to test it. Here is some of the PlantUML code I am using for testing:

autonumber 40 20 "<asd>"
actor "new Aliceasdadad()" order 20 #yellow
boundary "Bob\nvery long" #yellow order 20
Alice -[#red]> Bob: Authentication\n request
participant Juki #yellow order 20
participant rob
Bob ->x Alice: asdad
Bob -> Alice: adasd
Bob <->o Lice: adsad
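
Since the grammar has only been exercised in the playground, here is a minimal sketch of how it might be run from Node. It assumes the grammar was first compiled with nearley's `nearleyc` tool; the file names and the `parsePlantUml` helper are hypothetical, and the nearley module is passed in as a parameter so the function itself has no hard dependency.

```javascript
// Assumes a compiled grammar, e.g. produced with:
//   nearleyc plantuml.ne -o plantuml-grammar.js
// (both file names are hypothetical).
function parsePlantUml(nearley, compiledGrammar, source) {
  const parser = new nearley.Parser(nearley.Grammar.fromCompiled(compiledGrammar));
  parser.feed(source); // throws when it hits input the grammar does not allow
  if (parser.results.length === 0) {
    throw new Error("Input ended before a complete parse was found");
  }
  // More than one result means the grammar is ambiguous for this input;
  // this sketch just takes the first one.
  return parser.results[0];
}

// Usage inside a project that has nearley installed:
//   const nearley = require("nearley");
//   const grammar = require("./plantuml-grammar.js");
//   console.log(parsePlantUml(nearley, grammar, "Bob -> Alice: adasd"));
```

Feeding the test lines one construct at a time is a quick way to see which ones the grammar accepts and which ones still need rules.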
