In the last few weeks I’ve been working on Crystal Gazer, a Node.js console application that gathers information from your Git repository. You can find it on GitHub and on NPM.

One of the things I’d like to do is to track the evolution of a function. Has it been modified a lot? How many people have worked on it? Is the function too long?

To answer the first question, we can rely on the git log -L:function:file command to give us all the changes a function has undergone. We need a couple of things to run this command: the name of the file and the name of the function.
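As a rough sketch of how that command could be driven from Node.js, the snippet below builds the -L argument and runs git through child_process. The helper names buildLogArgs and functionHistory are hypothetical (they are not part of Crystal Gazer), and the sketch assumes git is on the PATH and that Git can locate the function in the file (for C# this may require a diff driver configured for .cs files).

```javascript
const { execFileSync } = require('child_process');

// Hypothetical helper: builds the arguments for `git log -L:<function>:<file>`.
// Note that Git treats the function name as a regular expression.
function buildLogArgs(functionName, file) {
    return ['log', `-L:${functionName}:${file}`];
}

// Hypothetical helper: runs git inside repoPath and returns the raw log output.
function functionHistory(repoPath, functionName, file) {
    return execFileSync('git', buildLogArgs(functionName, file), {
        cwd: repoPath,
        encoding: 'utf8'
    });
}

console.log(buildLogArgs('Error', 'ErrorLambda.cs').join(' '));
```

Calling functionHistory('.', 'Error', 'ErrorLambda.cs') inside a repository would return the same text you get from running the command by hand.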

The name of the file is the easy part. We have the names of the files in the original git log (see the README.md of Crystal Gazer to know how it works) and we can always ask the user to input it. But given a file, we’d like to list its functions with their statistics. So we need an automated way to extract the names of the functions from a source code file (we’re going to work with .cs files).

First attempt

My first approach was to try to code something myself: a combination of regular expressions, a couple of ifs, maybe some splits… That didn’t work well. The number of cases to take into account is big enough to make such a function very difficult to write.
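To give an idea of why this gets difficult, here is a minimal sketch of the kind of regular expression one might start with. The pattern is purely illustrative (it is not the code I actually wrote) and it already misses perfectly valid C# methods.

```javascript
// Hypothetical naive pattern: access modifier, optional "static", return type, name, "(".
const methodPattern = /\b(?:public|private|protected|internal)\s+(?:static\s+)?[\w<>\[\]]+\s+(\w+)\s*\(/g;

function naiveMethodNames(source) {
    return Array.from(source.matchAll(methodPattern), match => match[1]);
}

const sample = `
class Greeter {
    public void Hello() {}
    int Count(int x) { return x; }
    public async Task<string> LoadAsync() { return null; }
}`;

// Count has no access modifier and LoadAsync has an extra "async" keyword,
// so the pattern only finds Hello.
console.log(naiveMethodNames(sample)); // prints [ 'Hello' ]
```

Every fix for one of these cases (generics, attributes, expression-bodied members, comments, strings…) tends to break another, which is what pushed me towards a real parser.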

ANTLR

Then, one acronym came to my mind: AST. An Abstract Syntax Tree is a tree representation of the abstract syntactic structure of source code written in a programming language (Wikipedia). This is what, for example, escomplex uses to compute its metrics. Can we create the AST of C# code in JavaScript?

The answer is ANTLR. ANTLR is a parser generator: given a grammar, it generates the code that parses a file into a tree we can traverse. Fortunately, we don’t need to write the C# grammar ourselves; we can grab it from here.

These are the steps I followed to be able to get the function names from a C# file.

Create the lexer, parser and the listener from the grammar.

In a VERY short way, ANTLR uses a lexer to create tokens from your input, then feeds these tokens to the parser, which builds the tree. When the tree is traversed, the listener gets notified about the different things found in it.
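The listener idea can be illustrated with a toy example. This is not ANTLR’s API, just the same pattern in miniature: a walker visits every node of a hand-built tree and calls enter/exit callbacks on a listener whenever they exist. The tree shape and the rule names are made up for the illustration.

```javascript
// Toy walker: calls listener.enter<Rule> / listener.exit<Rule> if they are defined.
function walk(listener, node) {
    const enter = listener['enter' + node.rule];
    if (enter) enter.call(listener, node);
    (node.children || []).forEach(child => walk(listener, child));
    const exit = listener['exit' + node.rule];
    if (exit) exit.call(listener, node);
}

// Hand-built tree standing in for what a parser would produce.
const tree = {
    rule: 'Class',
    text: 'Greeter',
    children: [
        { rule: 'Method', text: 'Hello' },
        { rule: 'Method', text: 'Goodbye' }
    ]
};

// A listener that only cares about methods.
const names = [];
walk({ enterMethod(node) { names.push(node.text); } }, tree);

console.log(names.join(',')); // prints Hello,Goodbye
```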

To create these three files, we’ll need to use the antlr tool. So, let’s download the tool and make it accessible. Follow these instructions just changing 4.5.3 to 4.7.

Now we can use the tool to generate the lexer. Download CSharpLexer.g4 from the grammars repository and copy it to your project folder. Run the following command:

antlr4 -Dlanguage=JavaScript CSharpLexer.g4

This will generate CSharpLexer.js and CSharpLexer.tokens.

We need to do the same for CSharpParser.g4. So run the following command:

antlr4 -Dlanguage=JavaScript CSharpParser.g4

In this case this command will generate CSharpParser.js, CSharpParser.tokens and CSharpParserListener.js.

Prepare a Node.js project

We’re going to create a Node.js project to use all this stuff. Let’s add antlr4 to it:

npm install antlr4 --save

Create a file called index.js and import the necessary modules:

const antlr4 = require('antlr4/index');
const fs = require('fs');
const CSharpParser = require('./CSharpParser.js');
const CSharpLexer = require('./CSharpLexer.js');

If you now run this script you’ll get errors loading the CSharpLexer module. That’s because the file generated has some invalid identifiers:

  • On line 5 there’s an
    import java.util.Stack;
    
    statement, which is obviously invalid in JavaScript. Just comment out (or delete) the line.
  • Lines 1770 to 1773 (inclusive) contain some C# code. Just replace each variable declaration with
    var <variable_name>
    
  • Lines 1866 to 1883 (inclusive) contain more C# code. Comment it out or delete it.

Creating the listener

With the help of ANTLR we’ve created a CSharpParserListener. We can use that listener as a base class for more specific listeners. As we want to get the names of the different methods in the file, let’s create a listener for that.

Create a file called CSharpFunctionListener.js and copy the following code:

const antlr4 = require('antlr4/index');
const CSharpLexer = require('./CSharpLexer');
const CSharpParser = require('./CSharpParser');
var CSharpListener = require('./CSharpParserListener').CSharpParserListener;

var CSharpFunctionListener = function(res) {
    this.Res = res;    
    CSharpListener.call(this); // inherit default listener
    return this;
};
 
// inherit default listener
CSharpFunctionListener.prototype = Object.create(CSharpListener.prototype);
CSharpFunctionListener.prototype.constructor = CSharpFunctionListener;


CSharpFunctionListener.prototype.enterMethod_member_name = function(ctx){
    this.Res.push(ctx.getText());
}

exports.CSharpFunctionListener = CSharpFunctionListener;

Nothing too special here. The important part is that we’re overriding the enterMethod_member_name method, which is the method that will be called when the parser finds a method name.

Putting it all together

Time to use all this stuff. Go back to your index.js file and add the following code:

const CSharpFunctionListener = require('./CSharpFunctionListener').CSharpFunctionListener;

var input = fs.readFileSync('aFile.cs').toString();

var chars = new antlr4.InputStream(input);
var lexer = new CSharpLexer.CSharpLexer(chars);
var tokens  = new antlr4.CommonTokenStream(lexer);
var parser = new CSharpParser.CSharpParser(tokens);

var tree = parser.namespace_member_declarations();   
var res = [];
var csharpClass = new CSharpFunctionListener(res);
antlr4.tree.ParseTreeWalker.DEFAULT.walk(csharpClass, tree);

console.log("Function names: ", res.join(','));

As you can see, we’re creating the lexer and the parser. Then, we use antlr to traverse the tree and pass the listener in order to be notified.

If you run it, you will get the names of the functions.

Summary

This is the first thing I’ve done with ANTLR and it’s been really pleasant. The only problem was removing the invalid code from the lexer, and that’s all. Expect a new version of Crystal Gazer using some ANTLR magic soon!!

Until now we’ve seen how to create a Step Function, but we’ve always called it using the serverless framework. In this article we’re going to see how to call it programmatically.

We have two options to call a Step Function: the first one is to use the API Gateway and create an HTTP endpoint as the Event source of the Step Function. The second one is to call the step function from a Lambda function using the AWS SDK.

HTTP endpoint

We can create the HTTP endpoint easily just by modifying the serverless.yml file. The basic code we need is the following:

stepFunctions:
    stateMachines:
        hello:
            events:
                - http:
                    path: hello
                    method: GET

If we want to pass some data to the Step Function, just change GET to POST. Now, when you deploy the function using sls deploy you’ll see something like this in the output:

........................
Serverless: Stack update finished...
Service Information
service: TestCallStepFunction
stage: dev
region: us-east-1
api keys:
None
endpoints:
GET - https://awo2ongx54.execute-api.us-east-1.amazonaws.com/dev/hello
functions:
hello: TestCallStepFunction-dev-hello

So, if you open that link in the browser you will see the output of the call.
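If the endpoint is declared with POST, it can also be called from code instead of the browser. The following is a minimal sketch using only Node’s https module; buildRequest and startViaHttp are hypothetical helper names, and the exact response body returned by the endpoint is not shown here because it depends on the integration the plugin generates.

```javascript
const https = require('https');

// Hypothetical helper: turns an endpoint URL and a payload into request options.
function buildRequest(endpoint, payload) {
    const url = new URL(endpoint);
    const body = JSON.stringify(payload);
    return {
        options: {
            hostname: url.hostname,
            path: url.pathname,
            method: 'POST',
            headers: {
                'Content-Type': 'application/json',
                'Content-Length': Buffer.byteLength(body)
            }
        },
        body
    };
}

// Hypothetical helper: sends the payload and hands the raw response to the callback.
function startViaHttp(endpoint, payload, callback) {
    const { options, body } = buildRequest(endpoint, payload);
    const req = https.request(options, res => {
        let data = '';
        res.on('data', chunk => { data += chunk; });
        res.on('end', () => callback(null, res.statusCode, data));
    });
    req.on('error', callback);
    req.end(body);
}
```

You would call startViaHttp with the URL printed by sls deploy and an object such as { value: 'hello!' }.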

Call via the SDK

The other option is to use the SDK to call the Step Function from a Lambda function. Let’s do that using Node.js.

First of all, install the AWS SDK as a dev dependency in your project using npm install --save-dev aws-sdk or yarn add --dev aws-sdk.

Now you need to import the sdk in your Lambda function code:

const AWS = require('aws-sdk');

Create the object you’ll need to call the methods associated to the step functions:

var stepfunctions = new AWS.StepFunctions();

And call the step function:

var params = {
    stateMachineArn: '<the step function arn>',
    input: '{"value": "hello!"}'
};

stepfunctions.startExecution(params, function(err, data) {
    if (err) console.log(err, err.stack); // an error occurred
    else console.log(data); // successful response
});

You can get the Step Function arn from the dashboard.

If you now call the Lambda with sls invoke -f hello you will see the output of the Lambda. To check whether the Step Function was called, we need to go to CloudWatch and look at the logs of the invocation. So, go to your AWS console -> CloudWatch -> Logs, select the Lambda function and select the last execution.

You will see an error like this:

{ 
    AccessDeniedException: User: arn:aws:sts::165940758985:assumed-role/TestCallStepFunction-dev-us-east-1-lambdaRole/TestCallStepFunction-dev-hello is not authorized to perform: states:StartExecution on resource: arn:aws:states:us-east-1:165940758985:stateMachine:TestCall-Z249EWN421QQ
    ...
}

So, the role serverless is creating for us to run the Lambda and the Step Function does not have permissions to start the execution of the Step Function from the Lambda. We need then to add that permission. Fortunately, serverless makes it easy. Just add this piece of code inside the provider section of the serverless.yml file:

iamRoleStatements:
-  Effect: "Allow"
   Action:
     - "states:StartExecution"
   Resource: <the step function arn>

Now, if we redeploy, invoke the Lambda again and go to the CloudWatch logs, we will see no error.

Making the code robust

What’s the problem with the current code? The problem is that we’re hardcoding the ARN of the Step Function in the serverless.yml file and in the code of the Lambda function. If we re-deploy the Step Function, its ARN will change and we’ll need to change both files. Not very practical. Wouldn’t it be better if we could export the ARN of the Step Function and use it everywhere? Environment variables to the rescue!

Let’s start by defining an output that will export the arn of the Step Function to an environment variable. Open the serverless.yml file and add:

resources:
    Outputs:
        TestCall:
            Description: The ARN of the state machine
            Value:
                Ref: TestCall

Now we need to use this value when we’re defining the IAM role. Let’s change the Resource value so it references the output:

iamRoleStatements:
-  Effect: "Allow"
   Action:
     - "states:StartExecution"
   Resource: ${self:resources.Outputs.TestCall.Value}

The next step is to pass the environment variable to the Lambda function. We can do that easily by adding a couple of lines to the Lambda function definition:

functions:
    hello:
        handler: handler.hello
        events:
            - http:
                path: hello
                method: GET
        environment:
            testcall_arn: ${self:resources.Outputs.TestCall.Value}

And finally, we need to change the code of the Lambda to use the environment variable:

var params = {
    stateMachineArn: process.env.testcall_arn,
    input: '{"value": "hello!"}'
};

With these changes, we can redeploy the Step Function and the Lambda function and we’ll continue to be able to call the Step Function from the Lambda function.
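Putting the pieces together, the Lambda could look roughly like the sketch below. It is written as a factory so the Step Functions client can be injected; makeHandler and buildParams are hypothetical names, and the commented wiring assumes version 2 of the aws-sdk package, where request objects expose .promise().

```javascript
// Hypothetical helper: builds the startExecution parameters from a payload object.
function buildParams(stateMachineArn, payload) {
    return {
        stateMachineArn: stateMachineArn,
        input: JSON.stringify(payload)
    };
}

// Hypothetical factory: returns a Lambda handler that starts the Step Function.
// The client only needs a startExecution(params).promise() method.
function makeHandler(stepfunctions, stateMachineArn) {
    return async function hello(event) {
        const params = buildParams(stateMachineArn, { value: 'hello!' });
        const data = await stepfunctions.startExecution(params).promise();
        return {
            statusCode: 200,
            body: JSON.stringify({ executionArn: data.executionArn })
        };
    };
}

// Wiring inside handler.js (assumes aws-sdk v2 is installed):
// const AWS = require('aws-sdk');
// exports.hello = makeHandler(new AWS.StepFunctions(), process.env.testcall_arn);
```

Injecting the client keeps the handler easy to exercise locally with a fake object instead of a real AWS call.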

Summary

We’ve seen how we can interact programmatically with a Step Function, using the API Gateway or using the AWS SDK inside a Lambda function.

This will be the last article explaining the different states we can use in a step function. We’ll see three simple states (Pass, Fail and Succeed) and, finally, a more complex one: Choice. And obviously, we’re going to use the http://serverless.com framework to deploy them.

Pass state

The pass state is a simple state that just passes its input to its output, without performing any work. Apart from the common fields, you can specify two optional fields:

  • Result: The result to pass to the next state, filtered by the ResultPath field.
  • ResultPath: Specifies where (in the input) to put the output specified in the Result field.
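To make the interaction between Result and ResultPath more concrete, here is a small simulation of what a Pass state produces. It is only a sketch of the behaviour described above, not AWS code: applyPass is a made-up name, it only understands simple "$.a.b" paths, and it assumes the input is a JSON object.

```javascript
// Simulates the output of a Pass state for a given input (simplified).
function applyPass(state, input) {
    if (!('Result' in state)) return input;            // no Result: input passes through
    if (state.ResultPath === null) return input;       // null ResultPath discards the Result
    const path = state.ResultPath === undefined ? '$' : state.ResultPath;
    if (path === '$') return state.Result;             // Result replaces the whole input

    const keys = path.replace(/^\$\./, '').split('.');
    const output = JSON.parse(JSON.stringify(input));  // deep copy, input stays untouched
    let target = output;
    keys.slice(0, -1).forEach(key => {
        if (typeof target[key] !== 'object' || target[key] === null) target[key] = {};
        target = target[key];
    });
    target[keys[keys.length - 1]] = state.Result;      // Result is inserted at ResultPath
    return output;
}

console.log(applyPass({ Type: 'Pass', Result: { ok: true }, ResultPath: '$.check' }, { value: 1 }));
// prints { value: 1, check: { ok: true } }
```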

Fail state

The fail state stops the execution of the Step Function and marks it as a failure. It only allows the Type and Comment fields from the common fields, and you can use a couple of optional fields:

  • Cause: a custom failure string
  • Error: an error name that can be used for error handling.

Succeed state

The succeed state stops the execution of the Step Function successfully. It’s a terminal state, so it doesn’t have a Next or End field. It’s a good target for a choice branch where you just want to stop the execution.

Choice state

The choice state allows you to add decision logic to your state machine. You can specify different branches, each with its own condition. In addition to the common fields it adds a couple of fields:

  • Choices (required): an array of choice rules that determines the next state.
  • Default (optional but recommended): the state to transition to if no choice rule is satisfied.

Choice rules

Each choice rule contains a comparison and a Next field. Comparisons can be composed using And or Or operators. You can check all the operators available here.

When defining a comparison you must specify two fields:

  • Variable: the value you are going to compare. It is a path into the input of the state.
  • Operator: the field name will be the operator you want to use and the value will be the value you want to compare with.
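The way choice rules are evaluated can be sketched in a few lines of JavaScript. This is a simplified simulation, not the real engine: nextState and resolvePath are made-up names, only a handful of operators are included, And/Or composition is left out, and only simple "$.field" paths are supported.

```javascript
// A few of the comparison operators, keyed by the field name used in the rule.
const comparators = {
    NumericEquals: (a, b) => a === b,
    NumericGreaterThan: (a, b) => a > b,
    NumericLessThan: (a, b) => a < b,
    StringEquals: (a, b) => a === b
};

// Resolves simple "$.field.subfield" paths against the input.
function resolvePath(input, path) {
    return path.replace(/^\$\.?/, '').split('.').filter(Boolean)
        .reduce((value, key) => (value == null ? undefined : value[key]), input);
}

// Returns the Next of the first rule that matches, or Default if none does.
function nextState(choiceState, input) {
    for (const rule of choiceState.Choices) {
        const value = resolvePath(input, rule.Variable);
        const operator = Object.keys(rule).find(key => key in comparators);
        if (operator && comparators[operator](value, rule[operator])) return rule.Next;
    }
    return choiceState.Default;
}

const doChoice = {
    Type: 'Choice',
    Choices: [
        { Variable: '$.value', NumericGreaterThan: 0, Next: 'PositiveNumber' },
        { Variable: '$.value', NumericLessThan: 0, Next: 'NegativeNumber' }
    ],
    Default: 'Zero'
};

console.log(nextState(doChoice, { value: 1 })); // prints PositiveNumber
```

Rules are checked in order, so the first one that matches wins and Default is only used when none of them does.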

Let’s put everything together

So, let’s put all we’ve learned so far in a single step function (without any lambda this time):

stepFunctions:
    stateMachines:
        testChoiceStepFunction:
            definition:
                StartAt: DoChoice
                States:
                    DoChoice:
                        Type: Choice
                        Choices:
                            - Variable: "$.value"
                              NumericGreaterThan: 0
                              Next: PositiveNumber
                            - Variable: "$.value"
                              NumericLessThan: 0
                              Next: NegativeNumber
                        Default: Zero
                    PositiveNumber:
                        Type: Pass
                        Result: {"result": "It's a positive number!"}
                        Next: FinalState
                    NegativeNumber:
                        Type: Pass
                        Result: {"result": "It's a negative number!"}
                        Next: FinalState
                    Zero:
                        Type: Fail
                        Cause: "It's a zero!"
                    FinalState:
                        Type: Succeed

What we’re defining here is a step function with two choices (number greater than zero and number less than zero) and a default state. Every choice has a next state. Both PositiveNumber and NegativeNumber are Pass states with a different result, the Zero state is a Fail state and FinalState is a Succeed state.

[Image: step function]

Let’s deploy and run the function and see what happens.

sls invoke stepf --name testChoiceStepFunction --data '{"value": 1}'

[Image: choice positive result]

sls invoke stepf --name testChoiceStepFunction --data '{"value": -1}'

[Image: choice negative result]

sls invoke stepf --name testChoiceStepFunction --data '{"value": 0}'

[Image: choice zero result]

Summary

We’ve seen how easy it is to set up a choice state in Step Functions, allowing us to choose different paths depending on the input of the state.

As we’ve seen in previous articles, Step Functions help us orchestrate lambda functions. One of the most important aspects when we’re developing a system, distributed or not, is handling errors and retries. In this article we’ll see how easy it is to do that using Step Functions and the serverless framework.

Catching errors

Coding the lambda

First of all we’re going to catch some errors. Let’s create a new project with one lambda inside it named ErrorLambda with the following code:

using System;

public class ErrorLambda
{
    public void Error(string input){
        throw new CustomException(input);
    }
}

public class CustomException : Exception{
    public CustomException(string message) : base(message){}
}

Creating the Step Function

Let’s create a Step Function that catches errors. As always, create the serverless.yml and copy the following definition of the Step Function:

stepFunctions:
    stateMachines:
        testErrorStepFunction:
            definition:
                StartAt: Error
                States:
                    Error:
                        Type: Task
                        Resource: arn:aws:lambda:${opt:region}:${self:custom.accountId}:function:${self:custom.errorService}-${opt:stage}
                        Catch:
                            - ErrorEquals: ["CustomException"]
                              Next: CustomErrorFallback
                            - ErrorEquals: ["States.TaskFailed"]
                              Next: "ReservedTypeFallback"
                            - ErrorEquals: ["States.ALL"]
                              Next: "CatchAllFallback"
                        End: true
                    CustomErrorFallback:
                        Type: Pass
                        Result: "This is a fallback from a custom Lambda function exception"
                        End: true
                    ReservedTypeFallback:
                        Type: Pass
                        Result: "This is a fallback from a reserved error code"
                        End: true
                    CatchAllFallback:
                        Type: "Pass"
                        Result: "This is a fallback from any error code"
                        End: true

We can split what we’re doing here in two parts. At the end of the definition we’re defining the steps that we’ll run after we catch an error. In this case, we’ll define three types of errors and three different next states.

First, we defined our catchers. In this case we defined three of them. The first one catches exceptions that we throw from our Lambda; the string that helps us filter the error is the class name of the exception we want to catch. In the second and third catchers we’re using predefined error codes. We know that they are predefined error codes because they start with States. The possible values for predefined error codes include States.ALL, States.Timeout, States.TaskFailed and States.Permissions. You can read more about them here.

If we now deploy the step function, run it and go to the AWS console, we’ll see its graphical representation:

[Image: retries]

As you can see, it is similar to a parallel step, but in this case only one of the branches runs.

If we click on one of the catchers and inspect the input, we’ll see the type of error and the stack trace:

{
    "Error":"CustomException",
    "Cause":{
        "errorType": "CustomException",
        "errorMessage": "an error",
        "stackTrace": [
            "at StepFunctionsHandleErrors.ErrorLambda.Error(String input) in /StepFunctionsHandleErrors/ErrorLambda/ErrorLambda.cs:line 12",
                "at lambda_method(Closure , Stream , Stream , ContextInfo )"
                ]
    }
}

The string in the errorType field is the one that you have to use in the definition of the catcher.
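The selection of a catcher can be sketched as follows. This is a simplified model, not the real engine: findFallback is a made-up name, and the only wildcard it understands is States.ALL (the special handling of other predefined codes is left out).

```javascript
// Returns the Next state of the first catcher that matches the error name.
function findFallback(catchers, errorName) {
    const match = catchers.find(catcher =>
        catcher.ErrorEquals.includes(errorName) ||
        catcher.ErrorEquals.includes('States.ALL'));
    return match ? match.Next : null;
}

const catchers = [
    { ErrorEquals: ['CustomException'], Next: 'CustomErrorFallback' },
    { ErrorEquals: ['States.TaskFailed'], Next: 'ReservedTypeFallback' },
    { ErrorEquals: ['States.ALL'], Next: 'CatchAllFallback' }
];

console.log(findFallback(catchers, 'CustomException')); // prints CustomErrorFallback
```

Catchers are scanned in order, which is why the catch-all entry goes last.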

Retrying

It’s possible that when you detect an error (or a specific type of error) you want to retry a lambda to see if the error was a transient one. Configuring retries is very simple in Step Functions: you just need to add the following code just before the Catch element:

Retry:
    - ErrorEquals: [ "CustomException" ]
      IntervalSeconds: 1
      BackoffRate: 2.0
      MaxAttempts: 4
    - ErrorEquals: [ "States.ALL" ]
      IntervalSeconds: 5

In this case we’re specifying that, if we get a CustomException error we’re going to retry 4 times at most (default 3), with an initial interval of 1 second (default 1) and a back-off rate of 2.0 (default 2.0).
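To see what those numbers mean in practice, the sketch below computes how long the state machine waits before each retry: the first wait is IntervalSeconds and every following wait is multiplied by BackoffRate. retryDelays is a made-up helper that just does the arithmetic, using the defaults mentioned above.

```javascript
// Returns the wait (in seconds) before each retry attempt of a retrier.
function retryDelays(retrier) {
    const interval = retrier.IntervalSeconds === undefined ? 1 : retrier.IntervalSeconds;
    const backoff = retrier.BackoffRate === undefined ? 2.0 : retrier.BackoffRate;
    const attempts = retrier.MaxAttempts === undefined ? 3 : retrier.MaxAttempts;
    return Array.from({ length: attempts }, (_, i) => interval * Math.pow(backoff, i));
}

console.log(retryDelays({ ErrorEquals: ['CustomException'], IntervalSeconds: 1, BackoffRate: 2.0, MaxAttempts: 4 }));
// prints [ 1, 2, 4, 8 ]

console.log(retryDelays({ ErrorEquals: ['States.ALL'], IntervalSeconds: 5 }));
// prints [ 5, 10, 20 ]
```

So with the CustomException retrier, a task that keeps failing is retried after 1, 2, 4 and 8 seconds before the error is handed over to the catchers.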

If we want, we can specify a list of errors in the ErrorEquals field. If we want to specify a States.ALL retrier, States.ALL must appear alone in its ErrorEquals list and the retrier must be the last one.

Summary

We’ve seen how easy it is to deal with errors and retries using Step Functions and, as always, how the serverless framework helps us deploy them.

In the last article we saw how to use the parallel state in a Step Function. In this article we’ll see how to use the Wait state with the serverless framework.

The wait state delays the execution of the step function for a certain amount of time. By default, it returns the same object that it receives.

What are we going to code

We are going to code the following step function:

[Image: step function with parallel state]

As you can see we’re going to have an initial function that creates a result with a field called DelaySeconds. Then, we’ll have the wait state and finally a result state that will format the output.

Coding the lambdas

Following the steps of the following article, create two lambdas called InitLambda and ResultLambda with the following code:

InitLambda

public class InitLambda
{
    public InitResult Init(string input){
        return new InitResult(int.Parse(input));
    }
}
    
public class InitResult{

    public InitResult(int delay)
    {
        DelaySeconds = delay;
    }
    public int DelaySeconds {get;set;}
}

ResultLambda

public class ResultLambda
{
    public string Result(InitResult input){
        return $"The seconds delayed are {input.DelaySeconds}";
    }
}

public class InitResult{

    public InitResult(int delay)
    {
        DelaySeconds = delay;
    }
    public int DelaySeconds {get;set;}
}

// You can have the result class in a shared library

Creating the step function

Now it’s time to create the step function. The code is very similar to our original code, but we’re now introducing a new kind of state. Let’s put here the interesting bits:

stepFunctions:
    stateMachines:
        testParallelStepFunction:
            definition:
                StartAt: Init
                States:
                    Init:
                        Type: Task
                        Resource: arn:aws:lambda:${opt:region}:${self:custom.accountId}:function:${self:custom.initService}-${opt:stage}-Init
                        Next: WaitSeconds
                    WaitSeconds:
                        Type: Wait
                        Seconds: 10
                        Next: Result
                    Result:
                        Type: Task
                        Resource: arn:aws:lambda:${opt:region}:${self:custom.accountId}:function:${self:custom.resultService}-${opt:stage}-Result
                        End: true

As you can see, we have a new state called WaitSeconds, which is of type Wait. In this first case we are specifying that we want to wait 10 seconds. Let’s run the step function from the UI and see if it waits the desired time.

[Image: wait 10 seconds]

It works!

Let’s see what other alternatives we have.

Specifying a timestamp

We may need a step to be executed at a certain time. In that case, we can specify the Timestamp field:

WaitSeconds:
    Type: Wait
    Timestamp: "2017-06-20T20:58:00Z"
    Next: Result

The timestamp, as the documentation says, must conform to the RFC3339 profile of ISO 8601, with the further restrictions that an uppercase T must separate the date and time portions, and an uppercase Z must denote that a numeric time zone offset is not present. In our case, we’re saying that we want to wait until 2017/06/20 20:58 UTC.
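If the timestamp is generated from code, a string in that exact shape can be produced from a JavaScript Date. The sketch below is just one way of doing it; toWaitTimestamp is a made-up helper that takes the ISO string and drops the milliseconds so the result looks like the value used above.

```javascript
// Formats a Date as an uppercase-T, uppercase-Z UTC timestamp without milliseconds.
function toWaitTimestamp(date) {
    return date.toISOString().replace(/\.\d{3}Z$/, 'Z');
}

// Months are zero-based in Date.UTC, so 5 means June.
console.log(toWaitTimestamp(new Date(Date.UTC(2017, 5, 20, 20, 58, 0))));
// prints 2017-06-20T20:58:00Z
```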

Let’s deploy and execute the step function from the UI to see if it works:

[Image: wait timestamp]

Non-hardcoded duration

We don’t always need to hardcode the duration or the timestamp. We can use a path into the state’s input data to specify it. If we want to do that, we need to define the state this way:

WaitSeconds:
    Type: Wait
    SecondsPath: "$.DelaySeconds"
    Next: Result

In our case, we’re going to use the field DelaySeconds of the input data to read the amount of seconds we want to wait. We can do the same thing with the timestamp using the field TimestampPath.
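The four variants can be summarised with a small simulation that works out the moment at which a Wait state would let the execution continue. This is only an illustrative sketch: waitDeadline and resolveField are made-up names and only simple "$.field" paths are handled.

```javascript
// Resolves simple "$.field.subfield" paths against the state's input.
function resolveField(input, path) {
    return path.replace(/^\$\.?/, '').split('.').filter(Boolean)
        .reduce((value, key) => (value == null ? undefined : value[key]), input);
}

// Returns the Date at which the Wait state would finish, given the current time.
function waitDeadline(state, input, now) {
    if (state.Seconds !== undefined) {
        return new Date(now.getTime() + state.Seconds * 1000);
    }
    if (state.SecondsPath !== undefined) {
        return new Date(now.getTime() + resolveField(input, state.SecondsPath) * 1000);
    }
    if (state.Timestamp !== undefined) {
        return new Date(state.Timestamp);
    }
    if (state.TimestampPath !== undefined) {
        return new Date(resolveField(input, state.TimestampPath));
    }
    throw new Error('A Wait state needs Seconds, SecondsPath, Timestamp or TimestampPath');
}

const start = new Date(Date.UTC(2017, 5, 20, 20, 57, 50));
const deadline = waitDeadline({ Type: 'Wait', SecondsPath: '$.DelaySeconds' }, { DelaySeconds: 10 }, start);
console.log(deadline.toISOString()); // prints 2017-06-20T20:58:00.000Z
```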

Summary

We’ve seen another state that you can use when defining a Step Function: the wait state. We’ve seen how to configure this kind of state in four different ways.