PLOW 2014
TXL Lab #1 - Monday, 3 March 2014
In this lab we will begin by installing and testing TXL, and then
write a simple first transformation of PHP programs.
Requirements
In order to do the
TXL labs, we will need to install FreeTXL and the TXL
grammar for the PHP language we will be transforming. All
work will be run on the command line (in a Terminal window)
on your own machine, using whatever editor and file system
you are comfortable with.
If you have a
choice, it is recommended that you work in Linux or MacOSX
rather than Windows.
Part I: Check TXL is Installed
We begin by simply
checking that TXL is properly installed.
- Open a
command-line terminal window on your
computer.
- Test that TXL is installed by asking it for its version, using the command:
txl -V
- The
version of TXL installed on your machine should
print on the terminal.
Part II: Check the PHP Grammar is Installed - Parse a program
Next we try parsing
a PHP program using TXL.
- In the terminal window, change directory to the PHP
grammar directory you previously downloaded
(called PHP345):
cd PHP345
- Try parsing one of the example programs that comes
with the grammar:
txl Examples/ApiBase.php
- The
parsed output should appear in the terminal
output.
Part III: Understanding the Parser
It's important to understand what the parser does before we begin writing
transformations in TXL.
- You can
save the parsed output to a file by redirecting
it, like so:
txl Examples/ApiBase.php > parsedApiBase.php
- Open the parsed output and notice some things: it is
consistently formatted, with the spacing and indentation specified in the
grammar Txl/php.grm, and with comments removed.
- If you want to see the actual parse tree, you can use
TXL's XML output form to see the details. Use the command:
txl Examples/ApiBase.php -xml > parsedApiBase.xml
- Open
the XML file and you can see the actual parse
tree of the parsed PHP program. Usually we don't
need to look at this,
but it can help when crafting patterns to match
in TXL.
Part IV: Make a Simple TXL program
Now it's time to
make an actual TXL program.
- Using whatever editor you like, begin editing a new
file called "first.txl" in the PHP345 directory.
- We need a program that uses the PHP grammar, so it
should begin with an include of the grammar, like so:
% Use the PHP grammar
include "php.grm"
- We also
need a TXL main rule or function to match inputs
to our program.
Let's make a simple one that just tries to match
the input as a PHP program:
% Main rule to match PHP programs
function main
    replace [program]
        PHPprogram [program]
    construct Message [id]
        _ [message "matched it!"]
    by
        PHPprogram
end function
- The
program doesn't really do anything other than
match an input PHP program and transform it to
its parsed self,
exactly once (because this main rule is a TXL
function, not a rule).
- Save this program and exit the editor, then run the
program to check that it is working:
txl Examples/ApiBase.php first.txl > firstApiBase.php
- If
everything is right, the message "matched it!"
should appear on the terminal, indicating that
the TXL main function
found and pattern matched the input as a PHP
program. Simple messaging constructors like the
Message one in our program can be
helpful when debugging TXL programs.
Part V: Make a Simple TXL Transformation
Of course, it is
boring to just parse programs and not do anything with them.
So let's try to turn our matching program into a TXL
transformation that actually does something.
What we are going to
do is transform any input PHP program into one that traces
function calls when it is run. When the transformed PHP
program is run, it will output a message indicating that
each function has been called. This might for example be
used when debugging a complex PHP program.
For example, we want
the PHP function "getParameter()" in "Examples/ApiBase.php",
which looks like this:
protected function getParameter($paramName) {
    $params = $this->getAllowedParams();
    $paramSettings = $params[$paramName];
    return $this->getParameterFromSettings($paramName, $paramSettings);
}
To be transformed to
this:
protected function getParameter($paramName) {
    print ">>> called function getParameter \n" ;
    $params = $this->getAllowedParams();
    $paramSettings = $params[$paramName];
    return $this->getParameterFromSettings($paramName, $paramSettings);
}
And similarly for
all other functions in the PHP program.
To do this, we need
to use a sub-rule, a TXL rule that will search within the
input program we matched in the main function, to find and
transform PHP functions.
- Looking at the PHP grammar "Txl/php.grm" we can see that
functions are parsed as [FunctionDecl],
and have this grammatical form:
define FunctionDecl
    'function [opt '&] [id] '( [Param,] ') [NL]
    [Block]
end define
That is, the keyword "function" followed by an
optional "&", the name of the function as an
identifier,
and a list of parameters in parentheses.
- The
body of the function is a [Block], which
has the form:
define Block
    '{ [NL][IN]
        [TopStatement*] [EX]
    '} [NL]
end define
That is, a sequence of [TopStatement]s
enclosed in brace brackets "{ }".
- We can
ignore the formatting directives, [NL],
[IN], [EX], because they play no
role in the grammar
other than to specify how output programs should
be formatted.
- Now we
know enough to make a pattern for our
transformation rule. The pattern and replacement
can look like this:
rule addFunctionTracing
    replace $ [FunctionDecl]
        'function OptAmp [opt '&] FunctionName [id] '( Params [Param,] ')
        '{
            Statements [TopStatement*]
        '}
    by
        'function OptAmp FunctionName '( Params ')
        '{
            'print "We are here";
            Statements
        '}
end rule
- To make the rule, all we have done is copy the
grammatical form of [FunctionDecl] and
[Block] into the pattern,
and add a name to each nonterminal, for
example "FunctionName" for the [id].
These names (TXL "variables")
are used in the replacement to copy the parts
matched in the pattern into the replacement
(after the "by").
- The "$"
after "replace" in the rule specifies that each
function declaration is to be matched only
once.
If a TXL rule has no $, then it will match and
transform recursively until a fixed point is
reached.
- The
quotes ' before items in the pattern and
replacement tell TXL that the word "function",
for example,
is not intended to be the TXL function keyword
starting a new TXL function, but rather part of
the PHP pattern.
These leading quotes are optional unless the
word is a TXL keyword.
- For
now, we have just made a transformation rule to
add a PHP "print" statement to each PHP function
that prints
"We are here". Once that is working, we will
refine it to print the actual name of the
function in the message.
- Copy
your "first.txl" program to a new one called
"trace.txl", and edit it to add the sub-rule
above.
Change the main function in "trace.txl" to call
the sub-rule in its replacement:
    by
        PHPprogram [addFunctionTracing]
- Run the
new TXL program to make sure it is working:
txl Examples/ApiBase.php trace.txl > tracedApiBase.php
- Open "tracedApiBase.php" in your editor and check
that the output program now has a new print statement
at the beginning of each function.
If so, you have made your first TXL transformation!
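Putting the pieces from Parts IV and V together, the whole of "trace.txl" might look like the sketch below. It is assembled only from the fragments shown above (the include, the main function from "first.txl" with its replacement changed, and the addFunctionTracing sub-rule); the layout and comments are our own.

```txl
% trace.txl - sketch assembled from the fragments in Parts IV and V

% Use the PHP grammar
include "php.grm"

% Main function: match the whole input once and apply the sub-rule to it
function main
    replace [program]
        PHPprogram [program]
    construct Message [id]
        _ [message "matched it!"]
    by
        PHPprogram [addFunctionTracing]
end function

% Sub-rule: add a print statement at the start of each PHP function
rule addFunctionTracing
    replace $ [FunctionDecl]
        'function OptAmp [opt '&] FunctionName [id] '( Params [Param,] ')
        '{
            Statements [TopStatement*]
        '}
    by
        'function OptAmp FunctionName '( Params ')
        '{
            'print "We are here";
            Statements
        '}
end rule
```

Run it from the PHP345 directory with the txl command shown above.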
Part VI: Refine Your TXL Transformation
The transformation rule we have made so far simply adds the PHP
statement:
print "We are here";
to the beginning of
each PHP function in the program. What we really want is
the name of the function in the message.
- To do that, we need to use a TXL constructor to make a
string literal with the name of the function in it.
Since the pattern of the rule captured the name
as FunctionName [id], we can simply
concatenate that name into a string
to be put in the transformed result.
- By
convention a TXL string literal constructor
looks like this:
construct MessageString [stringlit]
    _ [+ "We are entering function "] [+ FunctionName]
It begins with an empty string "", denoted by
"_", which means "empty item" in TXL
constructors.
We concatenate the message and the function name
to it to make the message string.
- Edit
your "trace.txl" program and add the constructor
above following the pattern in your
"addFunctionTracing" rule.
- Change
the print statement in the replacement of the
rule to print the constructed MessageString
instead of "We are here".
- Now,
run your modified TXL program once again:
txl Examples/ApiBase.php trace.txl > newTracedApiBase.php
- Open
the transformed result "newTracedApiBase.php" in
your editor and check that the output
program
actually has print statements that print out the
function name message at the beginning of each
PHP function.
- If so,
congratulations! You have made your first useful
TXL program.
- Tomorrow
we will try something more interesting, and you
will work more independently to solve a real
problem in TXL.
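For reference, the refined rule from this part could end up looking something like the sketch below. It combines the original rule with the MessageString constructor exactly as described in the steps above; the only assumption is that the PHP grammar accepts a [stringlit] as the operand of a print statement, which the "We are here" version already relies on.

```txl
% Sketch of the refined sub-rule for trace.txl (replaces the Part V version)
rule addFunctionTracing
    replace $ [FunctionDecl]
        'function OptAmp [opt '&] FunctionName [id] '( Params [Param,] ')
        '{
            Statements [TopStatement*]
        '}
    % Build the message text from the function name captured in the pattern
    construct MessageString [stringlit]
        _ [+ "We are entering function "] [+ FunctionName]
    by
        'function OptAmp FunctionName '( Params ')
        '{
            'print MessageString ;
            Statements
        '}
end rule
```

The constructor sits between the pattern and the "by", so FunctionName is already bound when the string is built.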
Part VII: Another PHP Transformation
Once we have a TXL
transformation to find and modify every PHP function in an
input program, we can use it as
a framework to do other things involving functions. In this
next problem, we modify our trace.txl transformation
to extract the interface of a PHP class.
- Copy your "trace.txl" program to a new TXL program
named "interface.txl".
- Edit your new "interface.txl" program, and change the
addFunctionTracing rule to remove all the statements
in each function instead of adding a print
statement. Rename addFunctionTracing to removeFunctionBodies
in both the rule's declaration and the call to
it in the main function of the TXL program.
- Now,
run your new TXL program on the ApiBase.php
example PHP program:
txl Examples/ApiBase.php interface.txl > interfaceApiBase.php
- Open the transformed result "interfaceApiBase.php"
in your editor and note that we now have a version
of the main class in the program containing only
the signatures and an empty body for each
function declaration.
This is an approximation to an interface
extraction transformation for PHP classes.
- To
further refine the interface extractor, we
should remove everything except the public
functions in the result.
Edit your "interface.txl" program to add another
rule, called "removeNonPublic", and add a call
to it following the
call to removeFunctionBodies in the main
function of the TXL program.
- In
order to remove all non-public members of class
declarations, we need to look at the PHP grammar
again.
Note that PHP class declarations are parsed as
[ClassDecl], defined as follows:
define ClassDecl
    [ClassType] [id] [NL]
    [ExtendsClause?]
    [ImplementsClause?]
    '{ [NL][IN]
        [ClassMember*] [EX]
    '} [NL]
end define
- The target type of our new rule should therefore be
[ClassMember*], since what we want to do
is remove all
non-public members from our result. Write the
new rule to match a [ClassMember*]
sequence as a
ClassMember [ClassMember] followed by
MoreClassMembers [ClassMember*].
- We have
now captured a [ClassMember] we may want
to remove (by not copying it into the
replacement of the rule).
However, we only want to remove ones that are
not public. Returning to the grammar, we see
that the definition of
[ClassMember] uses the nonterminal
[VarModifiers?], defined as a sequence
of [VarModifier], to specify public
functions.
- Thus we need to guard the removal of the
[ClassMember] captured by the rule's
pattern by insisting that it not
have a [VarModifier] which is "public".
We can use a TXL deconstructor to say this:
deconstruct not * [VarModifier] ClassMember
    'public
This deconstructor will succeed only if we can
NOT find a "public" [VarModifier] in
ClassMember.
- Add
this deconstructor before the by clause of your
new rule.
- Run the
transformation with your new "removeNonPublic"
rule and check that only public function
signatures
are left in the result.
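One possible shape for the two rules of "interface.txl" is sketched below, following the steps above. The rule and variable names are the ones used in the text; the decision to write removeNonPublic without a "$" is ours, so that after a member is dropped the rule re-examines the shortened sequence.

```txl
% Sketch of the rules for interface.txl; the main function's replacement
% is assumed to be:  PHPprogram [removeFunctionBodies] [removeNonPublic]

% Replace the body of every PHP function with an empty block
rule removeFunctionBodies
    replace $ [FunctionDecl]
        'function OptAmp [opt '&] FunctionName [id] '( Params [Param,] ')
        '{
            Statements [TopStatement*]
        '}
    by
        'function OptAmp FunctionName '( Params ')
        '{
        '}
end rule

% Drop each class member that has no "public" modifier
rule removeNonPublic
    replace [ClassMember*]
        ClassMember [ClassMember]
        MoreClassMembers [ClassMember*]
    % Succeeds only if no 'public [VarModifier] occurs in ClassMember
    deconstruct not * [VarModifier] ClassMember
        'public
    by
        MoreClassMembers
end rule
```

Because a TXL rule searches its scope for matches, the pattern in removeNonPublic is tried at every tail of each member sequence, so non-public members anywhere in a class are removed, not just the first one.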
TXL Lab #2 - Tuesday, 4 March 2014
In this lab we will
undertake some more interesting transformations, and begin
to craft them more independently.
Requirements
To do this lab you
will need everything you installed for Lab #1 yesterday, and
your results from doing Lab #1.
Part I: A Restructuring Transformation
Open a command line
terminal, and change directory to the PHP345 directory we
were working in yesterday.
Copy the "interface.txl" TXL program you created yesterday
to a new program "restruct.txl" and edit the new
file.
In this
transformation, we are going to reorganize the members of
each PHP class in the input PHP program
to have all private and protected members first, and all
public members following them. Although this is just
an example problem, we might want to do it for various
reasons, including program understanding tasks.
- Without removing
anything from your previous TXL program, change the
replacement of the main function to
call only a new rule, called "publicAfterPrivate". The
target type of this rule must be a class declaration,
which according to the PHP grammar is defined as:
define ClassDecl
    [ClassType] [id] [NL]
    [ExtendsClause?]
    [ImplementsClause?]
    '{ [NL][IN]
        [ClassMember*] [EX]
    '} [NL]
end define
- So the rule we
have in mind should look something like this:
rule publicAfterPrivate
    replace $ [ClassDecl]
        ClassType [ClassType] ClassName [id]
        Extends [ExtendsClause?]
        Implements [ImplementsClause?]
        '{
            ClassMembers [ClassMember*]
        '}
    by
        ClassType ClassName
        Extends
        Implements
        '{
            ClassMembers
        '}
end rule
- Make this rule,
save the program, and run it on the PHP example
ApiBase.php.
txl Examples/ApiBase.php restruct.txl > restructApiBase.php
- Edit the result
"restructApiBase.php" and check that the output is the
same as the input, because so far the
rule simply transforms each class to itself.
Part II: Making Multiple Copies
An advantage of
TXL's functional paradigm is that it does not work directly
on a single copy of the input, and
it costs nothing to work with multiple copies of parts of an
input program. We are going to exploit this in
implementing this transformation.
- Edit your
"restruct.txl" program and insert two constructors before
the by clause of the new publicAfterPrivate rule.
Both constructors should be of type
[ClassMember*], and both should use the matched
ClassMembers as their value.
construct PrivateMembers [ClassMember*]
    ClassMembers
construct PublicMembers [ClassMember*]
    ClassMembers
For now, this will give us two copies of all of the
members of the matched class.
- What we want in
our result is the PrivateMembers followed by the
PublicMembers, so change the ClassMembers
in the replacement of the rule to say:
PrivateMembers [. PublicMembers]
- The "PrivateMembers [. PublicMembers]" part says that
the sequence of class members made in the second
constructor
is to be appended to the sequence made by the first. The
sequence concatenation operator [. ] is needed
because the result we want, two sequences of type
[ClassMember*], would otherwise be of type
[ClassMember*] [ClassMember*],
but the PHP grammar only allows a single [ClassMember*] in
the class body, so we must concatenate them to make one
sequence.
- Save the program
and run it on the example again:
txl Examples/ApiBase.php restruct.txl > restructApiBase.php
- Edit the result
"restructApiBase.php" again, and notice that the
transformed class declaration in the example
now has two copies of all of the members of the class.
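At this stage the rule in "restruct.txl" might look like the sketch below. It is simply the publicAfterPrivate rule from Part I with the two constructors and the changed replacement from the steps above; nothing beyond the text is assumed.

```txl
% Sketch: publicAfterPrivate after Part II (both copies still unfiltered)
rule publicAfterPrivate
    replace $ [ClassDecl]
        ClassType [ClassType] ClassName [id]
        Extends [ExtendsClause?]
        Implements [ImplementsClause?]
        '{
            ClassMembers [ClassMember*]
        '}
    % Two independent copies of the matched members
    construct PrivateMembers [ClassMember*]
        ClassMembers
    construct PublicMembers [ClassMember*]
        ClassMembers
    by
        ClassType ClassName
        Extends
        Implements
        '{
            % Concatenate the two sequences into a single [ClassMember*]
            PrivateMembers [. PublicMembers]
        '}
end rule
```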
Part III: Filtering Public from Private
Copying large parts
of the input and then filtering for the parts we want is
often the easiest and most efficient
way to transform in TXL. In this case, the remaining problem
is now to modify the two constructors such that
the first one keeps only the private members of the class,
and the second one keeps only the public members.
- Yesterday we
figured out how to isolate all the public members by
removing the non-public ones using our rule
"removeNonPublic". Today we will use that rule
again.
- Edit your
"restruct.txl" program and modify the constructor for
PublicMembers to call [removeNonPublic]
on the ClassMembers in it. This will leave only the
public members in that copy.
- Now, copy your
entire rule "removeNonPublic" to a new rule, which we
will call removePublic.
Modify this new rule to not remove non-public members,
but rather to remove the public members
from its scope.
- Modify the
constructor for PrivateMembers to call this new
[removePublic] rule on the ClassMembers in
it.
This will leave only the non-public members in that
copy.
- Save the program
and run it on the example again:
txl Examples/ApiBase.php restruct.txl > restructApiBase.php
- Edit the result
"restructApiBase.php" again, and check that the class
members of the example class have been
reordered to have all private and protected members
before the public ones.
- If so, congrats!
You have succeeded in making a restructuring
transformation.
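As a sketch of the filtering step, the new removePublic rule can be obtained from removeNonPublic by dropping the "not" from its deconstructor, and the two constructors then each call one of the filters. This assumes your removeNonPublic from Lab #1 has the shape described there (a [ClassMember*] pattern guarded by a deconstructor); adapt it if yours differs.

```txl
% Sketch: remove every class member that has a "public" modifier
rule removePublic
    replace [ClassMember*]
        ClassMember [ClassMember]
        MoreClassMembers [ClassMember*]
    % Succeeds only if a 'public [VarModifier] occurs in ClassMember
    deconstruct * [VarModifier] ClassMember
        'public
    by
        MoreClassMembers
end rule

% In publicAfterPrivate, the two constructors then become:
%
%     construct PrivateMembers [ClassMember*]
%         ClassMembers [removePublic]
%     construct PublicMembers [ClassMember*]
%         ClassMembers [removeNonPublic]
```

Each filter works on its own copy of ClassMembers, so neither affects the other's result.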
Part IV: More Restructuring, On Your Own!
Now it's time for
you to do something completely on your own. Restructuring
transformations often involve modifying
things in multiple dimensions at once. In this problem, we
will reorganize the body of each PHP class declaration to
have
constants first, private members next, protected functions
next, and public functions last. This reorganization can
be useful as part of an analysis to catalogue elements in a
program, or in design recovery when identifying kinds of
entities.
- Copy your
"restruct.txl" program to a new TXL program "reorg.txl".
- Change your
"publicAfterPrivate" and other rules to restructure the
members of class declarations in the
order constants, variables, private and protected
functions, and public functions.
- Remember to do
it step by step, and to test at each step as you iterate
towards a complete solution.
- Good
luck!
TXL Lab #3 - Wednesday, 5 March 2014
In this lab you are
challenged to solve some real problems using TXL, entirely
on your own!
Challenge I: Declared Variables in PHP
Because PHP
variables are by default global, and because PHP programs
are built using often large sets of additive
plugins to add functionality, a serious problem when
maintaining PHP programs is "plugin interference", where a
variable
used locally by one plugin accidentally has the same name as
a variable of the main application or another
plugin.
This problem comes
about because PHP does not really have any good notion of
variable declaration - you can't say
that you intend that this variable be a new one in your
program. In this challenge, we are going to address this
problem
by adding an explicit "var" statement to PHP that declares a
new variable. The syntax will be just like a normal PHP
assignment statement, except that the word "var" will be
before it.
To implement our new
dialect of PHP, we are going to transform it to an
equivalent pure PHP program that dynamically
checks that the variable is new when the assignment marked
"var" is executed. For example, in our dialect we might
write:
protected function getParameter($paramName) {
    var $params = $this->getAllowedParams();
    var $paramSettings = $params[$paramName];
    return $this->getParameterFromSettings($paramName, $paramSettings);
}
Which expresses our
intention that $params and $paramSettings be new, unset
variables, so that we are not
accidentally trashing the values of any existing variables.
To implement this intention, your TXL program will
transform these var statements to:
protected function getParameter($paramName) {
    {
        if (isset($params)) print "*** Error, var already defined: $params";
        $params = $this->getAllowedParams();
    }
    {
        if (isset($paramSettings)) print "*** Error, var already defined: $paramSettings";
        $paramSettings = $params[$paramName];
    }
    return $this->getParameterFromSettings($paramName, $paramSettings);
}
In order to
implement this transformation, you will have to:
- Make a TXL
grammar override for [Statement] to add the new
"var" statement to the language.
Hint: the declared var should be of type
[SimpleVariableName].
- Write a TXL rule
that matches var statements and transforms them to the
checked version in pure PHP
as shown in the example above.
A good solution to
this challenge will be no more than 25 lines in
TXL.
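As a starting hint for the first step, a TXL grammar override uses "redefine" with "..." standing for the existing alternatives. The fragment below is only a sketch: [SimpleVariableName] comes from the hint above, but [Expr] is an assumed name for the nonterminal on the right-hand side of an assignment, so check Txl/php.grm for the real one before using it.

```txl
% Fragment for Challenge I: extend [Statement] with a declared-variable form.
% ASSUMPTION: [Expr] stands for whatever nonterminal Txl/php.grm uses for the
% right-hand side of an assignment; look up the actual name in the grammar.
redefine Statement
        ...                                       % all existing statement forms
    |   'var [SimpleVariableName] '= [Expr] ';    % the new "var" statement
end redefine
```

The override goes after the include of the grammar, and the transformation rule can then use the new form as its pattern.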
Challenge II: Static Call Graph Extraction for PHP
Design recovery is
the process of extracting an ER model of a program's
architecture from its source code.
One aspect of this extraction is the creation of the static
call graph of the program from the source.
In this challenge, you will transform a PHP program to its
static call graph in the form of Prolog-like facts
representing the edges of the graph as "calls" facts. So, for
example, the simple program fragment:
function foo($x) {
    bar ();
    if ($x) {
        blat ();
    }
}
function bar() {
    blat ();
    ding ();
}
Would be transformed
to the facts:
calls (foo, bar)
calls (foo, blat)
calls (bar, blat)
calls (bar, ding)
In order to
implement this transformation, you will have to:
- Make a TXL
nonterminal definition for "calls" facts. Remember to put
an [NL] at the end of it so output is
readable.
- Redefine the PHP
nonterminal [VariableOrFunctionCall] to allow an
optional "calls" fact following it.
- Redefine the
[program] nonterminal to allow the output form, a
sequence of "calls" facts.
- You will need to
write a rule that matches each function in the program to
get its name, then calls a subrule
with the function name as parameter.
- The subrule should match every function call (a
[VariableOrFunctionCall] that is an identifier followed
by a parameter list in parentheses) and add a "calls"
fact following it.
- Your main
function should match the whole program and call the
function matching rule on it in a constructor.
(You need a constructor since we don't intend the
transformed program to be output.)
- Finally, you
should write another constructor that uses the extract
built-in function [^ ] to extract all the
calls facts from your transformed program, and use these
as the replacement for the program.
This is a challenging problem, but nevertheless, a good solution in TXL should be no more than 50 lines.
Good luck!
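To get started, the grammar side and the outer shape of the main function might be sketched as below. The names CallsFact, Marked and Facts are our own, and the redefinition of [VariableOrFunctionCall] and the two rules are deliberately left out, so as written this scaffold parses a PHP program and outputs an empty list of facts.

```txl
% Scaffold for Challenge II (names CallsFact, Marked, Facts are hypothetical)
include "php.grm"

% A "calls" fact, one per output line
define CallsFact
    'calls '( [id] ', [id] ') [NL]
end define

% Allow a sequence of facts as an alternative form of a whole program
redefine program
        ...
    |   [CallsFact*]
end redefine

function main
    replace [program]
        P [program]
    % Placeholder: apply your own function-matching rule to P here
    construct Marked [program]
        P
    % Collect every [CallsFact] embedded anywhere in the marked program
    construct Facts [CallsFact*]
        _ [^ Marked]
    by
        Facts
end function
```

The extract built-in [^ ] gathers items of the element type of its scope, here [CallsFact], so the facts come out in the order they occur in the marked program.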