PLOW 2014 TXL Workshop


  PLOW 2014 TXL Workshop Presentations

  TXL References

  TXL Lab #1 - Monday, 3 March 2014

In this lab we will begin by installing and testing TXL, and then make a simple first transformation of PHP programs.

  Requirements

In order to do the TXL labs, we will need to install FreeTXL and the TXL grammar for the PHP language we will be transforming. All work will be run on the command line (in a Terminal window) on your own machine, using whatever editor and file system you are comfortable with.

If you have a choice, it is recommended that you work in Linux or MacOSX rather than Windows.

  Part I: Check TXL is Installed

We begin by simply checking that TXL is properly installed.

  • Open a command-line terminal window on your computer.
  • Test that TXL is installed, using the command:
    txl -V
  • The version of TXL installed on your machine should print on the terminal.

  Part II: Check the PHP Grammar is Installed - Parse a program

Next we try parsing a PHP program using TXL.

  • In the terminal window, change directory to the PHP grammar directory you previously downloaded (called PHP345):
    cd PHP345
  • Try parsing one of the example programs that comes with the grammar.
    txl Examples/ApiBase.php
  • The parsed output should appear in the terminal output.

  Part III: Understanding the Parser

It's important to understand what the parser does before we begin writing transformations in TXL.

  • You can save the parsed output to a file by redirecting it, like so:
    txl Examples/ApiBase.php > parsedApiBase.php
  • Open the parsed output and notice some things: it is consistently formatted, with the spacing and
    indentation specified in the grammar Txl/php.grm, and with comments removed.
  • If you want to see the actual parse tree, you can use TXL's XML output form to see the details. Use the command:
    txl Examples/ApiBase.php -xml > parsedApiBase.xml
  • Open the XML file and you can see the actual parse tree of the parsed PHP program. Usually we don't need to look at this,
    but it can help when crafting patterns to match in TXL.

  Part IV: Make a Simple TXL program

Now it's time to make an actual TXL program.

  • Using whatever editor you like, begin editing a new file called "first.txl" in the PHP345 directory.
  • We need a program that uses the PHP grammar, so it should begin with an include of the grammar, like so:

    % Use the PHP grammar
    include "php.grm"
  • We also need a TXL main rule or function to match inputs to our program.
    Let's make a simple one that just tries to match the input as a PHP program:

    % Main rule to match PHP programs
    function main
        replace [program]
            PHPprogram [program]
        construct Message [id]
            _ [message "matched it!"]
        by
            PHPprogram
    end function
  • The program doesn't really do anything other than match an input PHP program and transform it to its parsed self,
    exactly once (because this main rule is a TXL function, not a rule).
  • Save this program and exit the editor, then run the program to see it is working:
    txl Examples/ApiBase.php first.txl > firstApiBase.php
  • If everything is right, the message "matched it!" should appear on the terminal, indicating that the TXL main function
    found and pattern matched the input as a PHP program. Simple messaging constructors like the Message one in our program can be
    helpful when debugging TXL programs.
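
Put together, the two pieces above make up the whole of "first.txl". The following is a sketch of the complete file, assuming the grammar is found as "php.grm" from the PHP345 directory, as in the include shown above:

```txl
% first.txl - parse a PHP program and output its parsed self

% Use the PHP grammar
include "php.grm"

% Main function: matches the whole input exactly once
function main
    replace [program]
        PHPprogram [program]
    % Constructor used only for its side effect of printing a message
    construct Message [id]
        _ [message "matched it!"]
    by
        PHPprogram
end function
```

Because the output program is redirected to "firstApiBase.php" while the message still shows up on the terminal, the two do not get mixed together.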

  Part V: Make a Simple TXL Transformation

Of course, it is boring to just parse programs and not do anything with them. So let's try to turn our matching program into a TXL transformation that actually does something.

What we are going to do is transform any input PHP program into one that traces function calls when it is run. When the transformed PHP program is run, it will output a message indicating that each function has been called. This might for example be used when debugging a complex PHP program.

For example, we want the PHP function "getParameter()" in "Examples/ApiBase.php", which looks like this:

	protected function getParameter($paramName) {
		$params = $this->getAllowedParams();
		$paramSettings = $params[$paramName];
		return $this->getParameterFromSettings($paramName, $paramSettings);
	}

To be transformed to this:

	protected function getParameter($paramName) {
		print ">>> called function getParameter \n" ;
		$params = $this->getAllowedParams();
		$paramSettings = $params[$paramName];
		return $this->getParameterFromSettings($paramName, $paramSettings);
	}

And similarly for all other functions in the PHP program.

To do this, we need to use a sub-rule, a TXL rule that will search within the input program we matched in the main function, to find and transform PHP functions.

  • Looking at the PHP grammar "Txl/php.grm" we can see that functions are parsed as [FunctionDecl],
    and have this grammatical form:

    define FunctionDecl
        'function [opt '&] [id] '( [Param,] ') [NL]
        [Block]
    end define

    That is, the keyword "function" followed by an optional "&", the name of the function as an identifier,
    and a list of parameters in parentheses.
  • The body of the function is a [Block], which has the form:

    define Block
        '{ [NL][IN]
            [TopStatement*] [EX]
        '} [NL]
    end define

    That is, a sequence of [TopStatement]s enclosed in brace brackets "{ }".
  • We can ignore the formatting directives, [NL], [IN], [EX], because they play no role in the grammar
    other than to specify how output programs should be formatted.
  • Now we know enough to make a pattern for our transformation rule. The pattern and replacement can look like this:

    rule addFunctionTracing
        replace $ [FunctionDecl]
            'function OptAmp [opt '&] FunctionName [id] '( Params [Param,] ')
            '{
                Statements [TopStatement*]
            '}
        by
            'function OptAmp FunctionName '( Params ')
            '{
                'print "We are here";
                Statements
            '}
    end rule
  • To make the rule, all we have done is copied the grammatical form of [FunctionDecl] and [Block] into the pattern,
    and added names to each nonterminal, for example, "FunctionName" for the [id]. These names (TXL "variables")
    are used in the replacement to copy the parts matched in the pattern into the replacement (after the "by").
  • The "$" after "replace" in the rule specifies that each function declaration is to be matched only once.
    If a TXL rule has no $, then it will match and transform recursively until a fixed point is reached.
  • The quotes ' before items in the pattern and replacement tell TXL that the word "function", for example,
    is not intended to be the TXL function keyword starting a new TXL function, but rather part of the PHP pattern.
    These leading quotes are optional unless the word is a TXL keyword.
  • For now, we have just made a transformation rule to add a PHP "print" statement to each PHP function that prints
    "We are here". Once that is working, we will refine it to print the actual name of the function in the message.
  • Copy your "first.txl" program to a new one called "trace.txl", and edit it to add the sub-rule above.
    Change the main function in "trace.txl" to call the sub-rule in its replacement:

    by PHPprogram [addFunctionTracing]
  • Run the new TXL program to make sure it is working:
    txl Examples/ApiBase.php trace.txl > tracedApiBase.php
  • Open "tracedApiBase.php" in your editor and check that the transformed output program has been transformed
    to have a new print statement at the beginning of each function in the program.
    If so, you have made your first TXL transformation!
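
For reference, the steps above can be assembled into a complete "trace.txl" along the following lines. This is a sketch built only from the pieces shown in this lab; it has not been checked against the PHP345 grammar here, and it assumes that the print statement in the replacement parses as part of a [TopStatement*] sequence.

```txl
% trace.txl - add a print statement at the start of every PHP function

include "php.grm"

% Main function: match the program once, then apply the sub-rule to it
function main
    replace [program]
        PHPprogram [program]
    by
        PHPprogram [addFunctionTracing]
end function

% Sub-rule: searches the program for function declarations.
% The $ makes it a one-pass rule, so each function is transformed once only.
rule addFunctionTracing
    replace $ [FunctionDecl]
        'function OptAmp [opt '&] FunctionName [id] '( Params [Param,] ')
        '{
            Statements [TopStatement*]
        '}
    by
        'function OptAmp FunctionName '( Params ')
        '{
            'print "We are here";
            Statements
        '}
end rule
```

Without the "$", the replacement would itself contain a function declaration that matches the pattern again, so the rule would keep adding print statements and never reach a fixed point.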

  Part VI: Refine Your TXL Transformation

The transformation rule we have made so far simply adds the PHP statement:

	print "We are here";

to the beginning of each PHP function in the program. What we really want is the name of the function in the message.

  • To do that, we need to use a TXL constructor to make a string literal with the name of the function in it.
    Since the pattern of the rule captured the name as FunctionName [id], we can simply concatenate that name into a string
    to be put in the transformed result.
  • By convention a TXL string literal constructor looks like this:

    construct MessageString [stringlit]
        _ [+ "We are entering function "] [+ FunctionName]
    It begins with an empty string "", denoted by "_", which means "empty item" in TXL constructors.
    We concatenate the message and the function name to it to make the message string.
  • Edit your "trace.txl" program and add the constructor above following the pattern in your "addFunctionTracing" rule.
  • Change the print statement in the replacement of the rule to print the constructed MessageString instead of "We are here".
  • Now, run your modified TXL program once again:
    txl Examples/ApiBase.php trace.txl > newTracedApiBase.php
  • Open the transformed result "newTracedApiBase.php" in your editor and check that the output program
    actually has print statements that print out the function name message at the beginning of each PHP function.
  • If so, congratulations! You have made your first useful TXL program.
  • Tomorrow we will try something more interesting, and you will work more independently to solve a real problem in TXL.
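
With the constructor in place, the refined rule might look like the sketch below. It assumes the rest of "trace.txl" (the include and the main function) is unchanged, and that a [stringlit] variable is accepted where the literal "We are here" appeared before.

```txl
rule addFunctionTracing
    replace $ [FunctionDecl]
        'function OptAmp [opt '&] FunctionName [id] '( Params [Param,] ')
        '{
            Statements [TopStatement*]
        '}
    % Build the message: start from the empty string, append the fixed
    % text, then append the matched function name
    construct MessageString [stringlit]
        _ [+ "We are entering function "] [+ FunctionName]
    by
        'function OptAmp FunctionName '( Params ')
        '{
            'print MessageString;
            Statements
        '}
end rule
```

The constructor sits between the pattern and the "by" because it can only use variables that the pattern has already bound, here FunctionName.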

  Part VII: Another PHP Transformation

Once we have a TXL transformation to find and modify every PHP function in an input program, we can use it as
a framework to do other things involving functions. In this next problem, we modify our trace.txl transformation
to make a transformation to extract the interface of a PHP class.

  • Copy your "trace.txl" program to a new TXL program named "interface.txl".
  • Edit your new "interface.txl" program, and change the addFunctionTracing rule to remove all the statements
    in each function instead of adding a print statement. Rename addFunctionTracing to removeFunctionBodies
    in both the rule's declaration and the call to it in the main function of the TXL program.
  • Now, run your new TXL program on the ApiBase.php example PHP program:
    txl Examples/ApiBase.php interface.txl > interfaceApiBase.php
  • Open the transformed result "interfaceApiBase.php" in your editor and note that we now have a version
    of the main class in the program containing only the signatures and an empty body for each function declaration.
    This is an approximation to an interface extraction transformation for PHP classes.
  • To further refine the interface extractor, we should remove everything except the public functions in the result.
    Edit your "interface.txl" program to add another rule, called "removeNonPublic", and add a call to it following the
    call to removeFunctionBodies in the main function of the TXL program.
  • In order to remove all non-public members of class declarations, we need to look at the PHP grammar again.
    Note that PHP class declarations are parsed as [ClassDecl], defined as follows:

    define ClassDecl
        [ClassType] [id] [NL] [ExtendsClause?] [ImplementsClause?]
        '{ [NL][IN]
            [ClassMember*] [EX]
        '} [NL]
    end define
  • The target type of our new rule should therefore be [ClassMember*], since what we want to do is remove all
    non-public members from our result. Write the new rule to match a [ClassMember*] sequence as a
    ClassMember [ClassMember] followed by MoreClassMembers [ClassMember*].
  • We have now captured a [ClassMember] we may want to remove (by not copying it into the replacement of the rule).
    However, we only want to remove ones that are not public. Returning to the grammar, we see that the definition of
    [ClassMember] uses the nonterminal [VarModifiers?], defined as a sequence of [VarModifier], to specify public functions.
  • Thus we need to guard the removal of the [ClassMember] captured by the rule's pattern by insisting that it not
    have a [VarModifier] which is "public". We can use a TXL deconstructor to say this:

    deconstruct not * [VarModifier] ClassMember
        'public

    This deconstructor will succeed only if we can NOT find a "public" [VarModifier] in ClassMember.
  • Add this deconstructor before the by clause of your new rule.
  • Run the transformation with your new "removeNonPublic" rule and check that only public function signatures
    are left in the result.
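
One possible shape for the finished "interface.txl" is sketched below. It follows the steps above literally and has not been checked against the PHP345 grammar here; in particular it assumes an empty "{ }" is accepted as a [Block], which holds if [TopStatement*] may be empty.

```txl
% interface.txl - approximate interface extraction for PHP classes

include "php.grm"

function main
    replace [program]
        PHPprogram [program]
    by
        PHPprogram [removeFunctionBodies] [removeNonPublic]
end function

% Replace the body of every function by an empty block
rule removeFunctionBodies
    replace $ [FunctionDecl]
        'function OptAmp [opt '&] FunctionName [id] '( Params [Param,] ')
        '{
            Statements [TopStatement*]
        '}
    by
        'function OptAmp FunctionName '( Params ')
        '{
        '}
end rule

% Drop every class member that has no "public" modifier
rule removeNonPublic
    replace [ClassMember*]
        ClassMember [ClassMember]
        MoreClassMembers [ClassMember*]
    % Guard: succeed only if no 'public [VarModifier] occurs in ClassMember
    deconstruct not * [VarModifier] ClassMember
        'public
    by
        MoreClassMembers
end rule
```

Note that removeNonPublic needs no "$": each application makes the sequence one member shorter, so the rule reaches a fixed point once only public members remain.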

  TXL Lab #2 - Tuesday, 4 March 2014

In this lab we will undertake some more interesting transformations, and begin to craft them more independently.

  Requirements

To do this lab you will need everything you installed for Lab #1 yesterday, and your results from doing Lab #1.

  Part I: A Restructuring Transformation

Open a command line terminal, and change directory to the PHP345 directory we were working in yesterday.
Copy the "interface.txl" TXL program you created yesterday to a new program "restruct.txl" and edit the new file.

In this transformation, we are going to reorganize the members of each PHP class in the input PHP program
to have all private and protected members first, and all public members following them. Although this is just
an example problem, we might want to do it for various reasons, including program understanding tasks.

  • Without removing anything from your previous TXL program, change the replacement of the main function to
    call only a new rule, called "publicAfterPrivate". The target type of this rule must be a class declaration,
    which according to the PHP grammar is defined as:

    define ClassDecl
        [ClassType] [id] [NL] [ExtendsClause?] [ImplementsClause?]
        '{ [NL][IN]
            [ClassMember*] [EX]
        '} [NL]
    end define
  • So the rule we have in mind should look something like this:

    rule publicAfterPrivate
        replace $ [ClassDecl]
            ClassType [ClassType] ClassName [id] Extends [ExtendsClause?] Implements [ImplementsClause?]
            '{
                ClassMembers [ClassMember*]
            '}
        by
            ClassType ClassName Extends Implements
            '{
                ClassMembers
            '}
    end rule
  • Make this rule, save the program, and run it on the PHP example ApiBase.php:
    txl Examples/ApiBase.php restruct.txl > restructApiBase.php
  • Edit the result "restructApiBase.php" and check that the output is the same as the input, because so far the
    rule simply transforms each class to itself.

  Part II: Making Multiple Copies

An advantage of TXL's functional paradigm is that it does not work directly on a single copy of the input, and
it costs nothing to work with multiple copies of parts of an input program. We are going to exploit this in
implementing this transformation.

  • Edit your "restruct.txl" program and insert two constructors before the by clause of the new publicAfterPrivate rule.
    Both constructors should be of type [ClassMember*], and both should use the matched ClassMembers as their value.

    construct PrivateMembers [ClassMember*]
        ClassMembers
    construct PublicMembers [ClassMember*]
        ClassMembers
    For now, this will give us two copies of all of the members of the matched class.
  • What we want in our result is the PrivateMembers followed by the PublicMembers, so change the ClassMembers
    in the replacement of the rule to say:
        PrivateMembers [. PublicMembers]
  • The "PrivateMembers [. PublicMembers]" part says that the sequence of class members made in the second construct
    is to be appended to the sequence made by the first. The reason the sequence concatenation operator [. ] is needed
    is that the result we want, two sequences of type [ClassMember*], would be of type [ClassMember*] [ClassMember*],
    but the PHP grammar only allows [ClassMember*] in the class body, so we must concatenate them to make one sequence.
  • Save the program and run it on the example again:
    txl Examples/ApiBase.php restruct.txl > restructApiBase.php
  • Edit the result "restructApiBase.php" again, and notice that the transformed class declaration in the example
    now has two copies of all of the members of the class.

  Part III: Filtering Public from Private

Copying large parts of the input and then filtering for the parts we want is often the easiest and most efficient
way to transform in TXL. In this case, the remaining problem is now to modify the two constructors such that
the first one keeps only the private members of the class, and the second one keeps only the public members.

  • Yesterday we figured out how to isolate all the public members by removing the non-public ones using our rule
    "removeNonPublic". Today we will use that rule again.
  • Edit your "restruct.txl" program and modify the constructor for PublicMembers to call [removeNonPublic]
    on the ClassMembers in it. This will leave only the public members in that copy.
  • Now, copy your entire rule "removeNonPublic" to a new rule, which we will call removePublic.
    Modify this new rule to not remove non-public members, but rather to remove the public members
    from its scope.
  • Modify the constructor for PrivateMembers to call this new [removePublic] rule on the ClassMembers in it.
    This will leave only the non-public members in that copy.
  • Save the program and run it on the example again:
    txl Examples/ApiBase.php restruct.txl > restructApiBase.php
  • Edit the result "restructApiBase.php" again, and check that the class members of the example class have been
    reordered to have all private and protected members before the public ones.
  • If so, congrats! You have succeeded in making a restructuring transformation.
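
At this point the two rules at the heart of "restruct.txl" might look like the sketch below. It assumes that the "removeNonPublic" rule from Lab #1 is still present in the file, and that "removePublic" is obtained from it simply by dropping the "not" from the deconstructor; neither has been checked against the PHP345 grammar here.

```txl
rule publicAfterPrivate
    replace $ [ClassDecl]
        ClassType [ClassType] ClassName [id] Extends [ExtendsClause?] Implements [ImplementsClause?]
        '{
            ClassMembers [ClassMember*]
        '}
    % Two filtered copies of the same member sequence
    construct PrivateMembers [ClassMember*]
        ClassMembers [removePublic]
    construct PublicMembers [ClassMember*]
        ClassMembers [removeNonPublic]
    by
        ClassType ClassName Extends Implements
        '{
            PrivateMembers [. PublicMembers]
        '}
end rule

% The mirror image of removeNonPublic: drop members that ARE public
rule removePublic
    replace [ClassMember*]
        ClassMember [ClassMember]
        MoreClassMembers [ClassMember*]
    deconstruct * [VarModifier] ClassMember
        'public
    by
        MoreClassMembers
end rule
```

Since each member either has a "public" modifier or does not, every member ends up in exactly one of the two copies, so nothing is lost or duplicated in the result.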

  Part IV: More Restructuring, On Your Own!

Now it's time for you to do something completely on your own. Restructuring transformations often involve modifying
things in multiple dimensions at once. In this problem, we will reorganize the body of each PHP class declaration to have
constants first, private members next, protected functions next, and public functions last. This reorganization can
be useful as part of an analysis to catalogue elements in a program, or in design recovery when identifying kinds of entities.

  • Copy your "restruct.txl" program to a new TXL program "reorg.txl".
  • Change your "publicAfterPrivate" and other rules to restructure the members of class declarations in the
    order constants, variables, private and protected functions, and public functions.
  • Remember to do it step by step, and to test at each step as you iterate towards a complete solution.
  • Good luck!

  TXL Lab #3 - Wednesday, 5 March 2014

In this lab you are challenged to solve some real problems using TXL, entirely on your own!

  Challenge I: Declared Variables in PHP

Because PHP variables are by default global, and because PHP programs are built using often large sets of additive
plugins to add functionality, a serious problem when maintaining PHP programs is "plugin interference", where a variable
used locally by one plugin accidentally has the same name as a variable of the main application or another plugin.

This problem comes about because PHP does not really have any good notion of variable declaration - you can't say
that you intend that this variable be a new one in your program. In this challenge, we are going to address this problem
by adding an explicit "var" statement to PHP that declares a new variable. The syntax will be just like a normal PHP
assignment statement, except that the word "var" will be before it.

To implement our new dialect of PHP, we are going to transform it to an equivalent pure PHP program that dynamically
checks that the variable is new when the assignment marked "var" is executed. For example, in our dialect we might
write:

    protected function getParameter($paramName) {
        var $params = $this->getAllowedParams();
        var $paramSettings = $params[$paramName];
        return $this->getParameterFromSettings($paramName, $paramSettings);
    }

Which expresses our intention that $params and $paramSettings be new, unset variables, so that we are not
accidentally trashing the values of any existing variables. To implement this intention, your TXL program will
transform these var statements to:

    protected function getParameter($paramName) {
        {
            if (isset($params)) print "*** Error, var already defined: $params";
            $params = $this->getAllowedParams();
        }
        {
            if (isset($paramSettings)) print "*** Error, var already defined: $paramSettings";
            $paramSettings = $params[$paramName];
        }
        return $this->getParameterFromSettings($paramName, $paramSettings);
    }

In order to implement this transformation, you will have to:

  • Make a TXL grammar override for [Statement] to add the new "var" statement to the language.
    Hint: the declared var should be of type [SimpleVariableName].
  • Write a TXL rule that matches var statements and transforms them to the checked version in pure PHP
    as shown in the example above.

A good solution to this challenge will be no more than 25 lines in TXL.

  Challenge II: Static Call Graph Extraction for PHP

Design recovery is the process of extracting an ER model of a program's architecture from its source code.
One aspect of this extraction is the creation of the static call graph of the program from the source.
In this challenge, you will transform a PHP program to its static call graph in the form of Prolog-like facts
representing the edges of the graph as "calls" facts. So, for example, the simple program fragment:

    function foo($x) {
        bar ();
        if ($x) {
            blat ();
        }
    }

    function bar() {
        blat ();
        ding ();
    }

Would be transformed to the facts:

     calls (foo, bar)
     calls (foo, blat)
     calls (bar, blat)
     calls (bar, ding)

In order to implement this transformation, you will have to:

  • Make a TXL nonterminal definition for "calls" facts. Remember to put an [NL] at the end of it so output is readable.
  • Redefine the PHP nonterminal [VariableOrFunctionCall] to allow an optional "calls" fact following it.
  • Redefine the [program] nonterminal to allow the output form, a sequence of "calls" facts.
  • You will need to write a rule that matches each function in the program to get its name, then calls a subrule
    with the function name as parameter.
  • The subrule should match every function call (a [VariableOrFunctionCall] that is an identifier followed
    by a parameter list in parentheses), and add a "calls" fact following it.
  • Your main function should match the whole program and call the function matching rule on it in a constructor.
    (You need a constructor since we don't intend the transformed program to be output.)
  • Finally, you should write another constructor that uses the extract built-in function [^ ] to extract all the
    calls facts from your transformed program, and use these as the replacement for the program.
This is a challenging problem, but nevertheless, a good solution in TXL should be no more than 50 lines.
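
As a starting point only, the grammar and extraction steps listed above might be sketched as follows. The nonterminal name CallsFact and the variable names are hypothetical choices, not part of the PHP grammar; the [VariableOrFunctionCall] redefinition and the two rules are deliberately left for you to write.

```txl
include "php.grm"

% Hypothetical nonterminal for one edge of the call graph
define CallsFact
    'calls '( [id] ', [id] ') [NL]
end define

% Allow the output form: a program may also be a sequence of facts
redefine program
        ...
    |   [CallsFact*]
end redefine

function main
    replace [program]
        P [program]
    % Apply your function-matching rule to P here; as written, this copy
    % is unchanged and so contains no facts yet
    construct AnnotatedP [program]
        P
    % The extract built-in [^ ] gathers every [CallsFact] found in AnnotatedP
    construct Facts [CallsFact*]
        _ [^ AnnotatedP]
    by
        Facts
end function
```

Until the rules are added, this skeleton should produce empty output for any input, which makes it a convenient first step to test before iterating.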
         
Good luck!