Thursday, June 7, 2018

SOLID principles, in layman's terms: Liskov Substitution

Raison d'être: I set out to write about the SOLID software development principles. Specifically, my aim is to make them more understandable to other developers who, much like yours truly, found it troublesome to decipher lengthy, complex articles and books on the matter. These principles are for everyone to learn and use, but I found them hard to fully grasp, quite possibly because I come from a non-English speaking background. So with this series of articles I'm setting out to de-mystify the principles, with only the best intentions in mind. The principles apply to many layers of software development. In my articles, I specifically aim to describe them as they relate to programming. I hope it'll be of use to you. Thank you for stopping by.

This will be a five-article series about SOLID. SOLID is all the rage, at least as far as the job ads I'm reading are concerned: "you are expected to honor the SOLID principles" etc. So what exactly is SOLID about? Plain and simple, it's a set of guidelines for developing object-oriented systems. They are a group of concepts that have proven themselves valuable for a great many people coding a great many pieces of software. A tale told by your elders in software engineering, if you will, that you will want to pay heed to so you can boast on your CV that you're into the SOLID principles - and you'll be a better developer for knowing them, I promise you that. Heck, you're a better developer for simply _wanting_ to know them!

SO[L]ID - The Liskov Substitution principle

The Liskov substitution principle is the third in the set of principles that make up the SOLID acronym. According to the Wikipedia entry, it states that [quote] "in a computer program, if S is a subtype of T, then objects of type T may be replaced with objects of type S (i.e., objects of type S may be substituted for objects of type T) without altering any of the desirable properties of that program" [/quote]. OK. Cool. But what the heck does that imply, exactly, and why is it important?

It implies that a derived class must be substitutable for its base class. That is to say, whatever methods you define in your base class, you must implement them faithfully in all your implementing classes (if you're programming to an interface) or derived classes (if you're programming to an abstract class). You can't, by way of the principle, choose to NOT implement a method, and you mustn't make callers depend on functionality in an implementation that isn't available through the base class.

Basically, it's a code of honor: You mustn't 'violate' the base class in your implementations, rather honor it and fulfill its every method.

Alright - so that's the principle. But why is it important? Well, it's important because we would like to always be able to refer to the base class or the interface, as opposed to dealing with implementations directly. This leads to greater de-coupling in our design, which in turn gives us flexibility, a blessing in systems development (all things being equal!). But if we can't trust our implementations to deliver the result promised by our abstraction or interface - albeit in their own unique ways - that's when we begin to throw ugly if-else lines around. And that's when code becomes bloated, recompilations abound... Ragnarök!

I'll demonstrate by way of example. Please consider the below:

public interface ILogLocation
{
    bool LogVerbose { get; set; }
    void Log(string message);
}

public class DiskLogLocation : ILogLocation
{
    public DiskLogLocation()
    {
    }

    public bool LogVerbose { get; set; }

    public void Log(string message)
    {
        // do something to log to a file here

        if (LogVerbose)
        {
            // do something to log a heck of a lot more log-text here
        }
        else
        {
            // do something to log not very much log-text here
        }
    }
}

public class EventLogLocation : ILogLocation
{
    public EventLogLocation()
    {
    }

    public bool LogVerbose { get; set; }

    public void Log(string message)
    {
        // do something to log to the event-viewer here

        // notice we don't check the LogVerbose bool here
    }
}

So here we have an interface, ILogLocation, with two different implementations. But only one of the implementations makes use of the LogVerbose boolean. That's our danger sign: this behaviour alters the design logic of our interface, which explicitly specifies that boolean as a property with a getter and a setter, for all intents and purposes because it's meant to be used. But the EventLogLocation class 'dishonors' the interface inasmuch as it doesn't make use of it at all. Wonder if that will come back and haunt us? Let's see below how the implementations might be instantiated:


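// Assumed App.config entry: <add key="type_of_log_location" value="eventLog" />
using System;
using System.Configuration;

public static class Program
{
    // static, so that the static Main() below can reach it
    private static ILogLocation logLocation;

    static void Main(string[] args)
    {
        // ConfigurationManager replaces the obsolete ConfigurationSettings class
        string logLocationType = ConfigurationManager.AppSettings.Get("type_of_log_location");

        switch (logLocationType)
        {
            case "disk":
                logLocation = new DiskLogLocation();
                break;
            case "eventLog":
                logLocation = new EventLogLocation();
                break;
            default:
                // fail early rather than hit a null reference further down
                throw new InvalidOperationException("Unknown log location: " + logLocationType);
        }

        logLocation.LogVerbose = true;
        logLocation.Log("Hoping to get a lot more text logged");
    }
}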

Seems we could land in hot water here: in the above, we instantiate an ILogLocation implementation by way of a configuration file. We're setting the LogVerbose property and thus would expect the Log() method to log a lengthy text with lots of debug hints. But as our configuration file has us instantiating an EventLogLocation, that's never going to happen!

The Liskov Substitution principle is great for keeping us out of trouble down the road. We'll share our code with others, and they needn't know about potential pitfalls like "oh, yeah, there's this thing you need to know about the EventLogLocation's Log()-method...". More importantly, it's a reminder to focus on the code quality of our implementations, and to re-design our interfaces if we feel the need to extend or degrade what the base class promises. The danger lies particularly in altering the behaviour of our base class, which may for example necessitate ugly type-checks such as the below:

if ( logLocation is EventLogLocation)
{
    // we, the developer, know there's no check on the 'LogVerbose' bool
    // on the EventLogLocation implementation
    logLocation.Log("Hoping not so much to get a verbose log");
}
else
{
    logLocation.LogVerbose = true;
    logLocation.Log("Hoping to get a verbose log");
}

In the above we're doing a type-check on the logLocation, and reacting one way if it's this type and another if it's not. Not so good - we've gone beyond having to deal with only a base class or interface representation, and set ourselves up for maintenance problems down the road.

So that's basically what the Liskov Substitution principle holds: Honor the contract as set up by your base class or interface. Don't extend the implementations to hold logic that's not called for, and don't skip implementing methods by simply calling 'return' or throwing an exception from them. If you feel that urge, look into re-designing your 'contract' with the base class or interface instead.
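One possible re-design of the contract, as a minimal sketch rather than the one right answer: LogVerbose moves into a second interface, so only the implementations that really honor it promise it. The name IVerboseLogLocation and the in-memory lists (stand-ins for the file and the event log, so the sketch runs anywhere) are assumptions made for illustration only.

```csharp
using System;
using System.Collections.Generic;

// The basic contract: every log location can log a message.
public interface ILogLocation
{
    void Log(string message);
}

// Hypothetical second contract: only locations that really
// support verbose logging sign up for it.
public interface IVerboseLogLocation : ILogLocation
{
    bool LogVerbose { get; set; }
}

public class DiskLogLocation : IVerboseLogLocation
{
    // Stand-in for a file on disk.
    public List<string> Lines { get; } = new List<string>();

    public bool LogVerbose { get; set; }

    public void Log(string message)
    {
        Lines.Add(LogVerbose ? "[verbose] " + message : message);
    }
}

public class EventLogLocation : ILogLocation
{
    // Stand-in for the Windows event log.
    public List<string> Entries { get; } = new List<string>();

    public void Log(string message)
    {
        Entries.Add(message);
    }
}

public static class Program
{
    public static void Main()
    {
        var disk = new DiskLogLocation();
        IVerboseLogLocation verbose = disk; // verbosity is part of this contract
        verbose.LogVerbose = true;
        verbose.Log("Hoping to get a verbose log");

        var events = new EventLogLocation();
        ILogLocation plain = events; // no LogVerbose to set, so no false promise
        plain.Log("A plain log entry");

        Console.WriteLine(disk.Lines[0]);
        Console.WriteLine(events.Entries[0]);
    }
}
```

With this split, code that needs verbose output asks for an IVerboseLogLocation and can trust every implementation it is handed, while the EventLogLocation no longer carries a property it ignores.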

When I was first introduced to it, the Liskov Substitution principle seemed to me to hold little value. "Who does this", I thought, "who creates a base class with methods and then chooses NOT to implement them - what are they even for, then?" As it turns out, lots of coders. Myself included. It slowly creeps up on you, as various factors such as "you need to finish this project NOW!" or "it'll just be this one time I'm splitting my logic based on this type-check" affect the way you work, and code. But let me assure you that paying attention to it pays dividends down the road - never more so than when you're part of a bigger team, and must be able to trust your colleagues' code, and they yours.

Hope this helps you in your further career. And good luck with it, too!

SOLID principles, in layman's terms: Interface Segregation

SOL[I]D - The Interface Segregation principle

Round 4 of our series of articles on the SOLID principles brings us to the Interface Segregation principle. And once more we turn to Wikipedia for an official explanation, where it states that [quote] "no client should be forced to depend on methods it does not use." [/quote] Let's get down and dirty - what's it all about, then?

It's very simple, really: basically this one means that if you're implementing an interface and your implementation has no meaningful use for some of that interface's methods, that's a sign that your interface is bloated and might benefit from a redesign.

A brief example to the fore. Again, these are very simple examples, solely for the purpose of getting a grasp on the concept. Let's say we've got this interface, that we wish to implement, and two such implementations:

    public interface ILogLocation
    {
        string LogName { get; set; }
        void Log(string message);
        void ChangeLogLocation(string location);
    }

    public class DiskLogLocation : ILogLocation
    {
        public string LogName { get; set; }

        public DiskLogLocation(string _logName)
        {
            LogName = _logName;
        }

        public void Log(string message)
        {
            // do something to log to  a file here
        }

        public void ChangeLogLocation(string location)
        {
            // do something to change to a new log-file path here
        }

    }

    public class EventLogLocation : ILogLocation
    {
        public string LogName { get; set; }

        public EventLogLocation()
        {
              LogName = "WindowsEventLog";
        }

        public void Log(string message)
        {
            // do something to log to the event-viewer here
        }


        public void ChangeLogLocation(string location)
        {
            // we can't change the location of the windows event log, 
            // so we'll simply return
            return;
        }
    }

Above are two implementations of the ILogLocation interface. Both implement the ChangeLogLocation() method, but only our DiskLogLocation implementation brings something meaningful to the table. This is messy; it's like putting a winter beanie over your summer cap: they're both hats, but you needn't wear them both at the same time. Let's say that we have three DiskLogLocations and one EventLogLocation. Now, for configuration purposes we may run through them all by addressing them by their interface; that would be a perfectly reasonable thing to do...

var logLocations = new List<ILogLocation>() {
    new DiskLogLocation("DiskWriteLog"),
    new DiskLogLocation("DiskReadLog"),
    new EventLogLocation(),
    new DiskLogLocation("FileCheckLog")
};

foreach (var logLocation in logLocations)
{
    logLocation.ChangeLogLocation(@"c:\" + logLocation.LogName + ".txt");
}

... so that's what we do in the above, but as you can see that leads to an unnecessary call to ChangeLogLocation() on the EventLogLocation object. The call achieves nothing, carries the potential of introducing bugs, is to be considered a code smell, and should generally be 'uncalled for' - pardon the pun.

So here's what we can do instead:

public interface ILogLocation
{
    string LogName { get; set; }
    void Log(string message);
}

public interface ILogLocationChanger
{
    void ChangeLogLocation(string location);
}

public class DiskLogLocation : ILogLocation,  ILogLocationChanger
{
    public string LogName { get; set; }

    public DiskLogLocation(string _logName)
    {
        LogName = _logName;
    }

    public void Log(string message)
    {
        // do something to log to  a file here
    }

    public void ChangeLogLocation(string location)
    {
        // do something to change to a new log-file path here
    }
}

public class EventLogLocation : ILogLocation
{
    public string LogName { get; set; }

    public EventLogLocation()
    {
        LogName = "WindowsEventLog";
    }

    public void Log(string message)
    {
        // do something to log to the event-viewer here
    }
}

Above we've moved the ChangeLogLocation() method into its own interface, ILogLocationChanger, which is implemented only by the DiskLogLocation class. That's a much nicer way of going about it; no more unnecessary method implementations! Also, we now have a way of distinguishing the implementations by way of the abstractions they implement. We could potentially re-use one or more of the interfaces for other classes.
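To see what the split buys us in the configuration loop, here is a minimal sketch under a few assumptions: the CurrentLocation property is hypothetical and exists only so the result can be printed, and the capability check via pattern matching (C# 7 or later) is just one way of picking out the changers.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public interface ILogLocation
{
    string LogName { get; set; }
    void Log(string message);
}

public interface ILogLocationChanger
{
    void ChangeLogLocation(string location);
}

public class DiskLogLocation : ILogLocation, ILogLocationChanger
{
    public string LogName { get; set; }

    // Hypothetical property, added only so the sketch can show the result.
    public string CurrentLocation { get; private set; }

    public DiskLogLocation(string logName)
    {
        LogName = logName;
    }

    public void Log(string message)
    {
        // do something to log to a file here
    }

    public void ChangeLogLocation(string location)
    {
        CurrentLocation = location;
    }
}

public class EventLogLocation : ILogLocation
{
    public string LogName { get; set; } = "WindowsEventLog";

    public void Log(string message)
    {
        // do something to log to the event-viewer here
    }
}

public static class Program
{
    public static void Main()
    {
        var logLocations = new List<ILogLocation>
        {
            new DiskLogLocation("DiskWriteLog"),
            new DiskLogLocation("DiskReadLog"),
            new EventLogLocation(),
            new DiskLogLocation("FileCheckLog")
        };

        foreach (var logLocation in logLocations)
        {
            // Ask for the capability, not for a concrete class:
            // the EventLogLocation simply doesn't offer it and is skipped.
            if (logLocation is ILogLocationChanger changer)
            {
                changer.ChangeLogLocation(@"c:\" + logLocation.LogName + ".txt");
            }
        }

        foreach (var disk in logLocations.OfType<DiskLogLocation>())
        {
            Console.WriteLine(disk.LogName + " -> " + disk.CurrentLocation);
        }
    }
}
```

The check is against an interface rather than a concrete type, so a future class that can also change its location joins the loop without the loop being edited.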

So there you have it - the Interface Segregation principle, which basically means 'split up your interfaces if you find that you have to implement methods that aren't required for your class, or that you simply return, or throw an exception, from'. Easy!

As an aside, you might ponder the difference between this Interface Segregation principle and the Single Responsibility principle; as a reminder, the Single Responsibility principle informs us that a module should carry a single responsibility only. Isn't that about the same thing as the Interface Segregation principle we've just taken on? Well, they're somewhat similar. There's a difference, but I won't go into it. I'll instead refer to StackOverflow user Andreas Hallberg, who I thought penned an explanation superior to anything I could've come up with (ref: http://stackoverflow.com/questions/8099010/is-interface-segregation-principle-only-a-substitue-for-single-responsibility-pr): [quote] "Take the example of a class whose responsibility is persisting data on e.g. the harddrive. Splitting the class into a read- and a write part would not make practical sense. But some clients should only use the class to read data, some clients only to write data, and some to do both. Applying ISP here with three different interfaces would be a nice solution." [/quote] Great stuff! So we gather that the SR principle is about 'one thing, and one thing only', whereas the IS principle is about 'what we need, and only what we need'. Related principles, both valid, both important.


Hope this helps you in your further career. And good luck with it, too!

SOLID principles, in layman's terms: Open/Closed

S[O]LID - The Open/Closed principle

The open/closed principle is the second in the set of principles that make up the SOLID acronym. According to the Wikipedia entry, it states that [quote] "software entities (classes, modules, functions, etc.) should be open for extension, but closed for modification" [/quote]. What it basically comes down to is this: you'll want to write your classes in such a way that you can add new behaviour without having to change them once they're done.

And why is this a good thing? Because changing code that already works is an inherently risky thing. Or, to put it differently, if you find yourself having to make changes to your source code after it's gone into production, that's when this principle comes in handy. Why? Because once source code has been tested, reviewed and released, it doesn't lend itself well to changes. Add a feature, tweak a business rule, and you might find that something, somewhere breaks, and you could be in a heap of trouble.

So what to do about it? It's easy: program to abstractions (i.e. interfaces or inherit from abstract classes), not implementations. When your variables are interface types, you won't need to specify until run-time which kind of implementation will actually represent the interface type. Consider this basic example:

First here's doing it wrong:

public class Logger
{
    private DiskLogLocation diskLogLocation;

    public Logger()
    {
        diskLogLocation = new DiskLogLocation(@"c:\logs");
    }

    public void LogMessage(string message)
    {
        diskLogLocation.Log(message);
    }
}


Above we have a Logger class which news up a 'DiskLogLocation' and writes a message to it - supposedly to the directory specified in the constructor. The problem lies in the fact that we've now limited ourselves to the DiskLogLocation. We haven't fulfilled the open/closed principle, inasmuch as we cannot extend the Logger class to log to the event-viewer, for example, without modifying it. Then here's doing it right:


public class Logger
{
    // use an interface, and provide an implementation of
    // this interface at runtime
    private ILogLocation logLocation;

    public Logger()
    {
    }

    public void LogMessage(string message)
    {
        logLocation.Log(message);
    }
}



In the above, logLocation acts as a placeholder for whichever implementation of ILogLocation we eventually choose to use (see "As an aside" below on how to do just that). That's basically it - the open/closed principle in a nutshell. That's - basically - what's meant by "open for extension": we program to interfaces, not implementations, so that we can pull in whichever code we desire at a later time by simply replacing the interface implementation - for example, with an 'EventLogLocation' class.

There are more ways than one to go about extending a class; the above is just one - but a fine place to start nonetheless, if I may say so myself. You can do magical stuff with some so-called 'design patterns', for example the strategy pattern, but that's for a different article.

So now you know about the open/closed principle: that you should program to interfaces and thus leave your options open for extension. Easy to learn, difficult to master; it takes some experience to recognize which inner workings of a class should be left open for extension. And there's always the risk of going overboard and putting an interface on everything, when the business requirements just don't call for it. These discussions can get right religious at times.
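To make "open for extension, closed for modification" a bit more tangible, here is a minimal sketch. The Logger is written once and left alone, while new log locations are added next to it. ConsoleLogLocation and InMemoryLogLocation are hypothetical classes invented for this illustration, and handing the implementation over through the constructor is just one of several options.

```csharp
using System;
using System.Collections.Generic;

public interface ILogLocation
{
    void Log(string message);
}

// Written once, then left alone: "closed for modification".
public class Logger
{
    private readonly ILogLocation logLocation;

    public Logger(ILogLocation logLocation)
    {
        this.logLocation = logLocation;
    }

    public void LogMessage(string message)
    {
        logLocation.Log(message);
    }
}

// Hypothetical implementation added later: "open for extension".
public class ConsoleLogLocation : ILogLocation
{
    public void Log(string message)
    {
        Console.WriteLine("console: " + message);
    }
}

// Another hypothetical one, handy in tests: it keeps messages in memory.
public class InMemoryLogLocation : ILogLocation
{
    public List<string> Messages { get; } = new List<string>();

    public void Log(string message)
    {
        Messages.Add(message);
    }
}

public static class Program
{
    public static void Main()
    {
        // Same Logger class, two different behaviours.
        new Logger(new ConsoleLogLocation()).LogMessage("hello log");

        var memory = new InMemoryLogLocation();
        new Logger(memory).LogMessage("hello again");
        Console.WriteLine(memory.Messages.Count);
    }
}
```

Neither new class required a single change to Logger, which is the whole point: the tested, released class stays as it is.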

______________

As an aside: how do you then, at run-time, specify which implementation will be used? Well, one way is to have a method on your utilizing class which offers the possibility of exchanging the implementation with another one. For example, like so:

public class Logger
{
    private ILogLocation logLocation;

    public Logger()
    {
    }

    public void SetLogLocation(ILogLocation _logLocation)
    {
        logLocation = _logLocation;
    }

    public void LogMessage(string message)
    {
        logLocation.Log(message);
    }
}


Another way is to allow for this in the utilizing class' constructor, like so:

public class Logger
{
    private readonly ILogLocation logLocation;

    // allow the Logger instantiator to determine which
    // implementation to use
    public Logger(ILogLocation _logLocationDeterminedAtRunTime)
    {
        logLocation = _logLocationDeterminedAtRunTime;
    }

    public void LogMessage(string message)
    {
        logLocation.Log(message);
    }
}

Given the above example you might determine which implementing class to use via a configuration file, which you read at your composition root - which is just a fancy way of saying 'where your app starts', typically main() or ApplicationStart() or what have you. You could do this, for a simplistic example:

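// requires: using System; using System.Configuration;
static void Main(string[] args)
{
    // ConfigurationManager replaces the obsolete ConfigurationSettings class
    string logLocationType = ConfigurationManager.AppSettings.Get("type_of_log_location");

    // declare the variable as the interface type
    ILogLocation logLocation;
    if (logLocationType == "Disk")
    {
        logLocation = new DiskLogLocation(@"c:\logs");
    }
    else
    {
        // no other types are wired up in this simplistic example
        throw new InvalidOperationException("Unknown log location: " + logLocationType);
    }

    Logger logger = new Logger(logLocation);
    logger.LogMessage("hello log");
}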

Here we're asking a configuration setting which type of log-location we wish to use, and we instantiate that and pass it to the logger on construction. Now you're well on your way to using something fancy called 'Dependency Injection' - which in turn is just a fancy way of saying 'we are giving our classes the variables they need in their constructor*'. A topic for a different article entirely.

*) Usually the case, but not always.
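As the number of log locations grows, the if-statement at the composition root can be swapped for a lookup table. The following is only a sketch under stated assumptions: the CompositionRoot class and its Create() method are hypothetical names, the two log locations just write to the console so the example runs anywhere, and the setting is read from a command-line argument as a stand-in for the configuration file.

```csharp
using System;
using System.Collections.Generic;

public interface ILogLocation
{
    void Log(string message);
}

public class DiskLogLocation : ILogLocation
{
    private readonly string directory;

    public DiskLogLocation(string directory)
    {
        this.directory = directory;
    }

    public void Log(string message)
    {
        // Stand-in for writing to a file in the directory.
        Console.WriteLine("disk(" + directory + "): " + message);
    }
}

public class EventLogLocation : ILogLocation
{
    public void Log(string message)
    {
        // Stand-in for writing to the event-viewer.
        Console.WriteLine("eventLog: " + message);
    }
}

public class Logger
{
    private readonly ILogLocation logLocation;

    public Logger(ILogLocation logLocation)
    {
        this.logLocation = logLocation;
    }

    public void LogMessage(string message)
    {
        logLocation.Log(message);
    }
}

public static class CompositionRoot
{
    // Hypothetical lookup table: configuration value -> factory.
    private static readonly Dictionary<string, Func<ILogLocation>> Factories =
        new Dictionary<string, Func<ILogLocation>>(StringComparer.OrdinalIgnoreCase)
        {
            { "disk", () => new DiskLogLocation(@"c:\logs") },
            { "eventLog", () => new EventLogLocation() }
        };

    public static ILogLocation Create(string logLocationType)
    {
        Func<ILogLocation> factory;
        if (logLocationType == null || !Factories.TryGetValue(logLocationType, out factory))
        {
            throw new ArgumentException("Unknown log location type: " + logLocationType);
        }
        return factory();
    }
}

public static class Program
{
    public static void Main(string[] args)
    {
        // Assumption: the setting arrives as a command-line argument,
        // standing in for the configuration file used in the article.
        string logLocationType = args.Length > 0 ? args[0] : "disk";

        var logger = new Logger(CompositionRoot.Create(logLocationType));
        logger.LogMessage("hello log");
    }
}
```

Supporting a new log location then means writing one new class and adding one line to the table; the Logger and the rest of the program stay untouched.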


Hope this helps you in your further career. And good luck with it, too!

Generating a Pie Chart with Grafana and MS SQL

Struggled with this, hope it helps someone.

In Grafana, if we base a visualization on an MS SQL query and set 'Format as' to 'Time series' (for use in a Graph panel, for example), then the query must return a column named time.

This can make life difficult if we need to generate a chart such as a Pie Chart, where we have no need for the time axis that a bar chart would use.

A way around this is to return an aggregate 'time' column, like in the below:


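SELECT sum(exceptionCounts.hasException), exceptionCounts.title as metric, count(exceptionCounts.time) as time
FROM
(
-- assumed: 'title' and 'hasException' were not selected in the original inner query;
-- they are derived here so that the outer query has something to refer to
SELECT sessionnumber, 
        operationResult as title, 
        1 as hasException, 
        min(startdatetime) as time
  FROM  foobarDatabase.dbo.fullSessionLog
  where operationResult like '%xception%' 
    AND $__timeFilter(customDateTimeStamp)
  group by sessionnumber, operationResult
) as exceptionCounts
group by exceptionCounts.title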

In the above, we execute an inner query, which returns a timestamp. In the outer query we can then aggregate - 'count' - the timestamp. And the Pie Chart should work.

Took me some time and then some to figure that one out. Hope it helps!

Monday, June 4, 2018

Using Docker to maintain a development environment

Would you contemplate using a containerized environment, that you might tear down at any moment, only to build it right back up again, indistinguishable from how it was before?

I have recently begun to do just that. And it has greatly enhanced my workflow, and productivity. And I’ll try and describe the hows and whys below.

When we think of Docker, we usually think of reasonably light-weight container instances, that we may assemble and combine to form various applications. Each container usually yields a particular and highly individual service, for example a REST API or a database instance or a memory-cache or what have you, and we can compose these into loosely coupled solutions. And there’s the potential of scaling, of the low overhead and cost that comes with only keeping the containers running for the duration of their workload. That’s the norm. But really one of the fundamental aspects of using Docker is that we can create an image of anything we wish, and that the brevity - all things being equal - of the Docker image recipe makes it simple to conjure up the exact same environment, to the detailed spec that is the Docker-file. Right now I’m running a number of only slightly different Ubuntu-based development environments in Docker containers, that I access remotely. Each is based on Ubuntu 17.10 and has Visual Studio Code installed, a few extensions to that, a browser, git, and Postman, which I recommend for API testing.

Why would one wish to do that, 'dockerize' a full-fledged Ubuntu, weighing in at some odd gigabytes? I can think of several arguments in favor:

- The transparency of the environment. Setting up a dev machine can be tedious, and after a while it inevitably becomes cluttered with updates and remains of old installs and such. A Docker-initiated container won’t suffer from this; any changes are made in the Docker-file and we’ll get into the habit of tearing the machine down, and building it anew, soon enough. A Docker-container will never become polluted over time; it doesn’t live long.

- The versioning of Docker-files becomes a major factor. We can have several distinct Docker-files, each suited to their individual project. Some will hold a different set of tools than others. Also, significantly, a container won’t hold applications that aren’t needed for solving the particular problem you’re working on; no e-mail app, no chat app, no office-app if that’s not required. It’s easy to not become distracted when the apps that might distract us aren’t installed.

- Sharing is caring. I can work on the exact same development environment as my colleague, as we’re both tied in to the same dev setup. I’ve so suffered in the past when for some reason I wasn’t able to access the same resources, or install the same apps, as my colleague. Everything is the same; it narrows the focus on the project. And when I need to hand the project over to the developer next to me, I can be sure the environment will work the way I intended.

- I needn’t worry about my machine breaking down. Not that this has happened many times during my career, but there has been the odd irrecoverable crash. With a Docker environment, I just need to get my hands on a machine, any damn machine that will run Docker. Pull the Docker image, spin up the container, get to work.

There are, of course, alternatives. I’ve often used Virtual Machines, that I’ve provisioned and enhanced with scripts, be it apt-get’s on Linux or Choco on Windows. But they’ve never worked for me in the same fashion as Docker-containers do. They too became polluted over time, or it was a hassle to take them on the road with me. Docker containers are easier; the fire-and-forget nature appeals to me. A virtual machine takes up a chunk of space that you’ll only get back when you stop using it; a Docker-container only takes up space for as long as you use it - though we do need to remember to clean up after ourselves.

It’s not all pros, there are some cons, of course there are. If I’m not disciplined enough to commit my development changes to source control, well, let’s just say that a Docker-container doesn’t have a long attention-span! But that’s really all I can hold against the notion.

The potential in tying in the development environment with project-specific software and -configuration is, I can testify, quite productivity enhancing - and I, for one, won’t be going back to maintaining my own development machine.

Enough talk, let’s try it out. Install the Docker community edition if you don’t have it already. Then take a glance at my template Docker-file, below, for an Ubuntu 17.10 with VS Code and a few related extensions installed, as well as a Firefox Browser, the Postman API development tool and a few other tools of my personal liking. It’s heavily annotated so I’d like to try and let the comments speak for themselves:

# Get the distro:
FROM ubuntu:17.10
ENV DEBIAN_FRONTEND=noninteractive

# Do some general updates.
RUN apt-get clean && rm -rf /var/lib/apt/lists/* &&  apt-get clean &&  apt-get update -y &&  apt-get upgrade -y
RUN apt-get install -y software-properties-common 
RUN add-apt-repository universe
RUN apt-get install -y cups curl sudo libgconf2-4 iputils-ping libxss1 wget xdg-utils libpango1.0-0 fonts-liberation
RUN apt-get update -y && apt-get install -y software-properties-common && apt-get install -y locales 

# Let's add a user, so we have a chance to, well, actually use the machine.
ENV USER='thecoder'
ENV PASSWORD='password'
RUN groupadd -r $USER -g 433 \
    && useradd -u 431 -r -g $USER -d /home/$USER -s /bin/bash -c "$USER" $USER \
    && adduser $USER sudo \
    && mkdir /home/$USER \
    && chown -R $USER:$USER /home/$USER \
    && echo $USER':'$PASSWORD | chpasswd

# Let's add a super user, too. Same password
ENV SUDOUSER='theadmin'
ENV PASSWORD='password'
RUN groupadd $SUDOUSER \
    && useradd -r -g $SUDOUSER -d /home/$SUDOUSER -s /bin/bash -c "$SUDOUSER" $SUDOUSER \
    && adduser $SUDOUSER sudo \
    && mkdir /home/$SUDOUSER \
    && chown -R $SUDOUSER:$SUDOUSER /home/$SUDOUSER \
    && echo $SUDOUSER':'$PASSWORD | chpasswd

# Configure the locale (en_US) and the timezone (Europe/Copenhagen). __Change locale and timezone to whatever you want.__
ENV LANG="en_US.UTF-8"
ENV LANGUAGE=en_US
RUN locale-gen en_US.UTF-8 && locale-gen en_US
# $LANG already ends in .UTF-8, so it must not be appended a second time in the sed replacement.
RUN echo "Europe/Copenhagen" > /etc/timezone && \
    apt-get install -y locales && \
    sed -i -e "s/# $LANG.*/$LANG UTF-8/" /etc/locale.gen && \
    dpkg-reconfigure --frontend=noninteractive locales && \
    update-locale LANG=$LANG

# Create a keyboard-layout file, so we won't have to set it every time the machine starts. Just replace XKBLAYOUT="dk" with your layout, ex. "us".
# RUN printf '# Consult the keyboard(5) manual page.\nXKBMODEL="pc105"\nXKBLAYOUT="dk"\nXKBVARIANT=""\nXKBOPTIONS=""\nBACKSPACE="guess"\n'"" > /etc/default/keyboard
# - doesn't work :(
# set danish keyboard layout
# RUN setxkbmap dk  - doesnt work :(

# Install some much needed programs - nano, midnight commander, guake terminal
RUN apt-get install nano -y
RUN apt-get install mc -y
RUN apt-get install guake -y

# Use the Xfce desktop. Because it's nice to look at, in my opinion.
RUN apt-get update -y && \
    apt-get install -y xfce4
# There's also the MATE desktop environment. A bit more light-weight.
#RUN apt-get update -y && \
#   apt-get install -y mate-desktop-environment-extras    

# Install git
RUN apt-get install -y git
# Install Python.
# Whoa, waitaminute - Ubuntu 17.10 already comes with Python 3.6 as default. Just run python3 to invoke it.

# Install Firefox
RUN apt-get install firefox -y

# Install Postman to 'opt' dir
RUN wget https://dl.pstmn.io/download/latest/linux64 -O postman.tar.gz
RUN tar -xzf postman.tar.gz -C /opt
RUN rm postman.tar.gz

# Install Visual Studio Code
ENV VSCODEPATH="https://go.microsoft.com/fwlink/?LinkID=760868"
RUN curl -fSL "${VSCODEPATH}" -o vscode.deb && dpkg -i vscode.deb

# To make it easier to automate and configure VS Code, it is possible to list, install, 
# and uninstall extensions from the command line. When identifying an extension, provide 
# the full name of the form publisher.extension, for example donjayamanne.python.
USER $USER
WORKDIR /home/$USER

# Enable viewing git log, file history, compare branches and commits - https://marketplace.visualstudio.com/items?itemName=donjayamanne.githistory
RUN code --install-extension donjayamanne.githistory
# Install Microsoft's Python extension - linting, debugging, IntelliSense, etc.
RUN code --install-extension ms-python.python
# Install code outline provider - better code visualization in the explorer pane
RUN code --install-extension patrys.vscode-code-outline


# Annnnnd back to root for the remainder of this session.
USER root

# Install nomachine, the remote-desktop server that enables us to remote into the container image.
# You don't have to rely on my choice of NoMachine - just go to their website and get a different one, if you want.
ENV NOMACHINE_PACKAGE_NAME nomachine_6.1.6_9_amd64.deb
ENV NOMACHINE_MD5 00b7695404b798034f6a387cf62aba84

RUN curl -fSL "http://download.nomachine.com/download/6.1/Linux/${NOMACHINE_PACKAGE_NAME}" -o nomachine.deb \
&& echo "${NOMACHINE_MD5} *nomachine.deb" | md5sum -c - \
&& dpkg -i nomachine.deb


# Create an executable file that starts the NoMachine remote desktop server.
# A shell script should start with a shebang line such as #!/bin/bash. '\n' means 'newline'.
# Note how the file ends with a /bin/bash command. That's deliberate: the shell stays
# in the foreground, which keeps the container running when it is started with -d and -t.
RUN printf '#!/bin/bash\n/etc/NX/nxserver --startup\n/bin/bash'"" > /etc/NX/nxserverStart.sh
# Now make the executable _actually_ executable ...
RUN chmod +x /etc/NX/nxserverStart.sh
# ... and start the nomachine-remote server when the container runs, and ...
CMD ["/etc/NX/nxserverStart.sh"]
#... happy developing! Use a NoMachine-client program to log into the server.
# PS: remember to run the container with the -d and -t arguments. 
# Check the readme.md file, https://github.com/harleydk/linuxRemoteDocker/blob/master/README.md

If you’re entirely new to Docker, the comments should hopefully make it clearer what’s going on here. We start out by pulling a Linux distribution as the base image, then go through a series of RUN commands, which execute as the image is built. Among other things they set up a user and a super-user. Save the file as ‘Dockerfile’, without extension, and, in the directory of this new Docker recipe, execute the ...

docker build -t linux_remote_pc_image .

... command. Then sit back and wait for a bit; the individual steps of the recipe will be executed and the image built. Now’s the time to put it to use. The image installs and runs a NoMachine service, enabling remote desktop connections via a NoMachine-client application. It’s fast and free. So go ahead and install a client from their web-site, https://www.nomachine.com/download, and spin up a Docker container from the Docker image just built, by issuing the following command:


docker run -d -t -p 4000:4000 --name=linux_remote_pc --cap-add=SYS_PTRACE linux_remote_pc_image

This will spin up an instance, a ‘Docker container’ as it's called, with the port 4000 of the Docker host mapped to port 4000 on the container - the port that we’ll connect to, as we remote access the container. So, on a machine on the same network as the Docker host, start the NoMachine client and connect:



...Where 'localhost' would of course be the host-name of the machine where the Docker container is running. Having connected to the container, this should be the first thing you see:



The machine tells us it lacks some basic configuration - that’s because we installed it unattended, so as not to have to deal initially with all those setup-choices. Just press ‘use default config’ and be done with it.
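Should the NoMachine client fail to connect in the first place, it can help to check whether the mapped port is reachable at all. The following is a minimal sketch in Python, using only the standard library; the host name 'localhost' and port 4000 simply mirror the docker run command above and are assumptions about your setup, and the function name is made up for illustration.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolvable host names.
        return False


if __name__ == "__main__":
    # Assumed values: replace 'localhost' with the Docker host's name if needed.
    print("NoMachine port reachable:", port_is_open("localhost", 4000))
```

A result of False suggests looking at the container status and the -p port mapping before troubleshooting the NoMachine client itself.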

That’s it - you now have a machine with VS Code installed:





If you’re new to Docker, there are lots of tutorials and documentation available. I highly recommend trying the build commands on a ‘hello world’ Dockerfile before the slightly more complex one above, if only to get a feel for the command-line interface and its responses.

The Docker-file above is maintained at my github repository, https://github.com/harleydk/linuxRemoteDocker, which you can clone if you like. There’s a bit of further documentation available there.

I’d very much like to hear from you if you find it useful and/or if I can help out in some way. So drop me a note in the comments or create a new GitHub issue, https://github.com/harleydk/linuxRemoteDocker/issues, and I’ll see what I can do.

Thanks for reading & happy developing.



Saturday, April 28, 2018

Going serverless with Internet-of-Things and Azure Functions


The internet-of-things concept features in nearly every article on future IT prospects. And a fascinating topic it is, too: the scenarios are limitless and the use-cases so palpable that it feels as if we should have been able to fulfill them ages ago. But let’s put the philosophical notions aside; I’d like to demonstrate my choice of technology for an IoT solution based on the Azure serverless offerings. So, without further ado.

The internet-of-things - IoT from here on - may be realized in a myriad of fashions. Today I’ll cover the use of Microsoft’s Azure cloud functionality, specifically Azure Function Apps, Azure Service Bus and Azure Storage - with a very little bit of Azure Logic Apps thrown in for good measure. So it’s a whole lot of Azure, and the title speaks of ‘serverless’, too. In case you’re unfamiliar with the term, here’s my own best attempt at a definition: ‘serverless’ refers to resources you do not need to carefully provision on your own. Of course, no request for a specific resource is ever truly ‘serverless’; somewhere deep down in the many-layered stack there is still a piece of physical hardware that provides the response to the request, and there’s no getting around that yet. So the term should not be taken literally, but rather as a moniker for getting resources up and running, and responding to requests, without having to care much about physical hardware constraints. You do not need to set up an actual server to respond to requests; you simply deploy the response mechanism itself to the Azure cloud. Instead of tending to hardware-specific concerns - how many CPUs, whether to deploy load balancing, and so on - you can focus on how you would like the response mechanism to behave, for example how it should scale.

But I digress; you’re likely fully aware of the term and its implications. For the sake of this article, however, serverless also implies a consideration about the setup: my IoT devices will send data to, and receive data from, the aforementioned cloud services directly, as opposed to letting requests be handled by a central broker such as an MQTT implementation. This is a wholly deliberate choice; this particular IoT architecture is suited for smaller projects, where the requirements can be more readily fixed and we can thus make assumptions that we would otherwise abstract away. A further limitation of this implementation, to be amended later, is the lack of focus on device security and provisioning, both complicated topics in their own right. Make no mistake, this IoT implementation will work fine and cost very little to run. But if you’re dealing with tens of thousands of distinct sensors and devices, across a multitude of locations, you will likely be better served by Microsoft’s dedicated IoT portfolio, and I recommend you then investigate that route further.

Tedious limitations aside, let’s dive into the fun stuff! Please keep the below architecture-diagram in mind as we go through a simple use-case, along the way deliberating on the technological choices.




The example will be of a temperature-sensor that sends a temperature-reading to our cloud-based back-end, optionally in turn retrieving a device-specific command to execute.

The device sends its reading to an Azure Function (1) by way of an HTTPS REST call. The function is responsible for storing the raw data in an Azure Storage queue (2). Azure Functions are tremendously cheap to execute - you can have millions of executions at very low cost - and thus ideal for a network of sensors that frequently send data. A similar financial argument applies to the Azure Storage account that we will use to hold our sensor data. The queue storage is ideal for our purpose; it is designed to hold work items to be processed at some later stage, at which point the data is simply removed from the queue - and we can also specify a dedicated ‘time-to-live’ for the data, if needed. But, most importantly, we can make use of a dedicated Azure Function App trigger (3) that activates on new items in the queue. In this specific case it retrieves the data from the queue and, from the raw sensor data, creates a more specific, enriched data model. We could do this in the first Azure Function, certainly, but the abstraction point is important in that it enables us to inject business logic here, should this be needed later. At present the only logic lies in retrieving sensor-device information from an Azure Storage table (4) and adding a bit of this information to the device message, which then goes into an Azure Storage table that holds sensor data (5). Later on we might add sensor authentication in there, and at least we now have an abstraction point in which to implement it. Azure Functions scale well; if your queue becomes crowded and you’re running the functions on a so-called consumption plan, for example, the platform will simply spin up more instances to handle the load. That really speaks to the core of the serverless term.
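To make the decoupling of steps (1) through (5) a little more concrete, here is a minimal in-memory sketch in Python. It is not Azure code: the standard library's queue.Queue stands in for the Azure Storage queue, a dictionary and a list stand in for the two storage tables, and the device table's contents as well as the function names are made up for illustration.

```python
import json
import queue

# Stand-in for the Azure Storage queue (2); the real queue lives outside the process.
raw_queue = queue.Queue()

# Stand-in for the device-information table (4); contents are invented for this sketch.
DEVICE_TABLE = {"sensor-1": {"DeviceType": "TemperatureMeasurementDevice"}}

# Stand-in for the table that holds sensor data (5).
sensor_data_table = []


def ingest(device_id: str, value: str) -> None:
    """Step (1): accept a raw reading and enqueue it untouched."""
    raw_queue.put(json.dumps({"DeviceId": device_id, "RawValue": value}))


def process_next() -> dict:
    """Step (3): dequeue one raw reading, enrich it, and store it."""
    raw = json.loads(raw_queue.get_nowait())
    device_info = DEVICE_TABLE.get(raw["DeviceId"], {})
    enriched = {
        "DeviceId": raw["DeviceId"],
        "TextValue": raw["RawValue"],
        "NumericalValue": None,
        "DeviceType": device_info.get("DeviceType", "Unknown"),
    }
    try:
        enriched["NumericalValue"] = float(raw["RawValue"])
    except ValueError:
        pass  # Non-numeric readings keep only their text value.
    sensor_data_table.append(enriched)
    return enriched
```

The point of the split is that ingest does next to nothing and can therefore answer quickly, while process_next can grow more business logic later without the devices noticing.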

So, the second Azure Function creates a more meaningful piece of data (6) - I add information about the specific type of sensor, for example - and sends this to an Azure Service Bus topic (7). The Azure Service Bus is a data ingestion and distribution mechanism, quite capable of receiving and handling millions of messages within an ambitious time-frame - just what we might need for a sensor-rich IoT solution. It is not the only choice of tool for mass-message ingestion; Microsoft offers dedicated services such as Azure IoT Hub and Azure Event Hubs, for example, and other vendors have their own offerings. I chose it for the following reasons: it’s cheap, fast and simple, and it plays extremely well with Azure Functions, as we’ll get around to shortly. The Azure Service Bus receives messages in two ways. The first is directly into a queue, not unlike the Azure Storage queue, albeit with some significantly enhanced features. For our purpose, however, we’ll use the second, the Service Bus topic feature, where we send messages into a so-called ‘topic’, to which we may then subscribe. This is the general publish-subscribe mechanism most often associated with service bus implementations, and it works well with an IoT scenario such as this. In my specific implementation, I create a generic ‘message received’ topic, and into this I send every piece of sensor data that is received and enriched. This enrichment of data is mainly what facilitates a meaningful filtering of the messages into dedicated subscriptions (8). A simple example might be how a temperature sensor sends a reading to the receiving Azure Function. The raw data is enriched with device-type information, so that we can infer the sensor type - a temperature sensor - from the device-id. This enriched message is then sent to the service bus. A subscription to this topic will have been created and will pick up, for example, any messages from temperature sensors with a temperature exceeding a threshold of x degrees.
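The filtering into subscriptions (8) can be illustrated with a toy publish-subscribe topic in Python. This is a sketch of the idea only, not of the Service Bus API: on the real service the filter is configured as a rule on the subscription, whereas here it is a plain Python predicate. The threshold of 30 degrees and the 'allMessages' subscription are assumptions made for the example.

```python
from typing import Callable, Dict, List

Message = Dict[str, object]


class Topic:
    """Toy publish-subscribe topic: each subscription has a filter and its own inbox."""

    def __init__(self) -> None:
        self._filters: Dict[str, Callable[[Message], bool]] = {}
        self.inboxes: Dict[str, List[Message]] = {}

    def subscribe(self, name: str, message_filter: Callable[[Message], bool]) -> None:
        self._filters[name] = message_filter
        self.inboxes[name] = []

    def publish(self, message: Message) -> None:
        # Every subscription whose filter matches receives its own copy of the message.
        for name, message_filter in self._filters.items():
            if message_filter(message):
                self.inboxes[name].append(dict(message))


THRESHOLD = 30.0  # Hypothetical value for the 'x degrees' threshold.

topic = Topic()
topic.subscribe(
    "temperatureHighSubscription",
    lambda m: m.get("DeviceType") == "TemperatureMeasurementDevice"
    and isinstance(m.get("NumericalValue"), (int, float))
    and m["NumericalValue"] > THRESHOLD,
)
topic.subscribe("allMessages", lambda m: True)  # Hypothetical catch-all subscription.
```

Note that the filter relies entirely on the enriched properties: without the DeviceType added in the previous step, the topic would have no way of telling a hot temperature reading from any other large number.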

The advantages of this, in conjunction with the use of the Azure Function App, become quite clear as we react to messages being picked up by our various topic subscriptions, such as the ‘TemperatureHigh’ subscription. The subscriptions act as nothing more than message filters and routers. In order to consume the messages we have, as is almost always the case with the Azure platform, multiple ways of going about it. For our implementation we’ll add another Azure Function, specific to messages being sent to the 'TemperatureHigh' subscription. It’s that simple - we specify a subscription name as we create the function, then deploy it, and the Azure infrastructure sees to it that the function is triggered appropriately (9). We do not have to poll continuously; we’re always hooked up, so to speak. This is a major advantage of integrating these two technologies: the possibility of quickly building an infrastructure that’s both capable and reliable. The downside remains, of course, that it’s a very efficient yet tightly coupled architecture - there’s no easy way of replacing the service bus component with another cloud provider’s similar product. There’s always a trade-off to be made; for my purposes, this coupling works extremely well: as messages arrive at their dedicated subscription, an equally dedicated Azure Function is triggered and the message is thus consumed and acted upon. The act, in this scenario, lies in issuing an appropriate command (10) for the device itself, or possibly for another device. For example, given a higher than usual temperature, we might issue a command to the device to sound an alert buzzer. The command goes into an Azure Storage table (11), where we keep the history of issued commands for audit-trail and visualization purposes. It’s from this table that the final Azure Function retrieves the command, upon a periodic request (12) from the sensor device, which then executes it.

So that’s an example of an IoT architecture based on Azure cloud technologies, without a central broker. Once again, it’s not to be considered best practice for all scenarios; please don’t implement it without regard for the circumstances pertaining to your own demands.

Now, for some technical aspects, I’d like to present some bits of the code of the Azure Functions, and thus go further into the details behind the implementation.

The Azure Function that receives the raw data from the sensor is an HTTP-triggered one. Here’s the code, annotated further below:



[FunctionName("QueueRawValue")]
public static async Task<HttpResponseMessage> Run([HttpTrigger(AuthorizationLevel.Anonymous, "get", "post", Route = "QueueRawValue")]HttpRequestMessage req, TraceWriter log)
{
    try
    {
        log.Info("C# QueueRawValue http trigger function processed a request.");

        string deviceId = req.GetQueryNameValuePairs().FirstOrDefault(q => string.Compare(q.Key, "deviceId", true) == 0).Value;
        string value = req.GetQueryNameValuePairs().FirstOrDefault(q => string.Compare(q.Key, "value", true) == 0).Value;

        // Reject the request before anything is queued if the device cannot be identified.
        if (deviceId == null)
        {
            return req.CreateResponse(HttpStatusCode.BadRequest, "Please pass a deviceId on the query string");
        }

        CloudStorageAccount cloudStorageAccount = CloudConfigurationFactory.GetCloudStorageAccount();
        var queueClient = cloudStorageAccount.CreateCloudQueueClient();
        var queueReference = queueClient.GetQueueReference("iotl15queue");
        // Create the queue if it doesn't already exist
        await queueReference.CreateIfNotExistsAsync();

        // Wrap the raw reading and put it on the queue as JSON.
        RawIngestDataModel data = new RawIngestDataModel
        {
            DeviceId = deviceId,
            RawValue = value
        };
        string queueMessage = JsonConvert.SerializeObject(data, Formatting.None);
        var message = new CloudQueueMessage(queueMessage);
        await queueReference.AddMessageAsync(message);

        return req.CreateResponse(HttpStatusCode.OK);
    }
    catch (Exception e)
    {
        // Log the failure, then rethrow with 'throw;' to preserve the original stack trace.
        log.Error("QueueRawValue failed.", e);
        throw;
    }
}



Please disregard the deliberately blatant lack of security considerations and other such concerns, and focus on the basic functionality. The raw data from the sensor is put onto a storage queue as JSON, for later processing. That’s all the function does: it grabs the raw input data from the request parameters and stores it in a 'RawIngestDataModel' object, representing just that - raw input data in any shape or form. So we have a very basic way of capturing information and storing it in a queue for later - hopefully swift and efficient - processing. We could process the raw data at this stage, but this design provides us with an extension point we might make good use of later on: if the number of requests were to suddenly sky-rocket, the queue would absorb the load by virtue of its built-in cloud scaling capabilities, and the processing could catch up at its own pace.
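For completeness, here is what the device side of that call might look like, sketched in Python with the standard library only. The host name is a placeholder, and the '/api/' prefix assumes the default route prefix of an Azure Function App; the two query parameters, deviceId and value, are the ones the function above reads. The function names are made up for illustration.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical endpoint; replace with the URL of your own deployed function app.
BASE_URL = "https://example.azurewebsites.net/api/QueueRawValue"


def build_reading_url(device_id: str, value: str, base_url: str = BASE_URL) -> str:
    """Encode the reading as the deviceId and value query parameters."""
    query = urlencode({"deviceId": device_id, "value": value})
    return f"{base_url}?{query}"


def send_reading(device_id: str, value: str) -> int:
    """Send the reading over HTTPS and return the HTTP status code."""
    with urlopen(build_reading_url(device_id, value), timeout=10) as response:
        return response.status
```

A device would then call something like send_reading("sensor-1", "31.5") at whatever interval suits the sensor. Going through urlencode matters because device ids or values containing spaces or ampersands would otherwise break the query string.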

The next function, in turn, is then triggered by this queue-adding:

[FunctionName("ProcessRawQueueMessage")]
public static async Task Run([QueueTrigger("iotl15queue", Connection = "AzureStorageConnectionString")]string myQueueItem, TraceWriter log)
{
    try
    {
        RawIngestDataModel rawIngestData = JsonConvert.DeserializeObject<RawIngestDataModel>(myQueueItem);
        
        // Store the registered value in table storage.
        CloudStorageAccount cloudStorageAccount = CloudConfigurationFactory.GetCloudStorageAccount();
        var cloudService = new AzureTableStorageService(cloudStorageAccount);
        RegisteredValueModel registeredValueModel = CreateRegisteredDatamodelFromRawInput(rawIngestData);
        cloudService.SendRegisteredDataToTableStorage(registeredValueModel);
  
        // Send to the service bus.
        // The connection string is hard-coded for brevity; in practice, read it from the app settings.
        string ServiceBusConnectionString = @"Endpoint=sb://myservicesbus.windows.net/;SharedAccessKeyName=RootManageSharedAccessKey;SharedAccessKey=ssharedkeyD/2zayhog=";
        string TopicName = @"devicemessagetopic";
        ITopicClient topicClient = new TopicClient(ServiceBusConnectionString, TopicName);

        // Create a new message to send to the topic.
        string messageBody = JsonConvert.SerializeObject(registeredValueModel);
        var message = new Message(Encoding.UTF8.GetBytes(messageBody));

        // The user properties are what the topic subscriptions can filter on.
        message.UserProperties.Add("DeviceId", registeredValueModel.DeviceId);
        message.UserProperties.Add("TextValue", registeredValueModel.TextValue);
        message.UserProperties.Add("NumericalValue", registeredValueModel.NumericalValue);
        DeviceModel deviceInformation = GetDataAboutDevice(registeredValueModel.DeviceId);
        message.UserProperties.Add("DeviceType", deviceInformation.DeviceType);

        // TODO: enrich with device-type, etc.

        // Send the message to the topic. Await the call, so the function does not
        // complete before the message has actually been sent.
        await topicClient.SendAsync(message);

        log.Info($"C# Queue trigger function processed: {myQueueItem}");
    }
    catch (Exception e)
    {
        // Log the failure, then rethrow with 'throw;' to preserve the original stack trace.
        log.Error("ProcessRawQueueMessage failed.", e);
        throw;
    }
}

private static RegisteredValueModel CreateRegisteredDatamodelFromRawInput(RawIngestDataModel rawIngestData)
{
    RegisteredValueModel registeredValueModel = new RegisteredValueModel()
    {
        DeviceId = rawIngestData.DeviceId,
        TextValue = rawIngestData.RawValue,
    };

    float attemptToParseValueAsNumerical;
    if (float.TryParse(rawIngestData.RawValue, out attemptToParseValueAsNumerical))
        registeredValueModel.NumericalValue = attemptToParseValueAsNumerical;

    return registeredValueModel;
}

/// <summary>
/// Get device-data from table storage
/// </summary>
/// <remarks>
/// Return dummy data for now.
/// </remarks>
private static DeviceModel GetDataAboutDevice(string deviceId)
{
    // TODO: implement this. Consider memory caching.

    DeviceModel temporaryDeviceModel = new DeviceModel()
    {
        DeviceId = deviceId,
        DeviceType = "TemperatureMeasurementDevice"
    };
    return temporaryDeviceModel;
}


The above function dequeues data from the queue, for further processing. The bulk of the work is already done for us, in terms of how to connect to the queue and react to new entries in it; all this rather crucial functionality is wired up for us and ready to be made good use of. Almost seems too good to be true, does it not? It’s well worth remembering the old adage, ‘if it seems too good to be true…’: we do get a tremendous amount of proven functionality ‘for free’, so to speak, but of course we also give up the possibility of having a say in how all this is achieved; we’re tied into the Azure platform. This is an acceptable choice for my particular IoT implementation, but may not be for you - there are pros and cons, and it is something you should take into serious consideration for your particular scenario.

The above code retrieves the first available raw data from the queue and transforms it into a RegisteredValueModel object. Note how this class inherits from TableEntity, so that we can store it in an Azure Table Storage table. For my purposes I'm using the device-id as partition key on the table, as this seems a natural fit. That's behind the scenes and not shown here, for brevity's sake. From this table we'll later be able to do visualizations and historic compilations on the device data, though that's for a later blog entry. The most important bit, for now, is to note how the registered device data is sent to the Azure Service Bus topic named 'devicemessagetopic', a name that indicates how, indeed, this topic receives all messages from all devices. Here, then, the responsibility of the Azure Function ends. Now we can go and create subscriptions to this topic, as pertains to our specific use-cases, for example the aforementioned subscription to dangerously high temperatures from my temperature sensors. "temperatureHighSubscription" is my name for it, and given this name and a valid connection to the service bus, we can easily create an Azure Function that triggers when the Azure Service Bus filters messages to this subscription:


[FunctionName("GeneralHighTempTriggerFunction")]
public static async Task Run([ServiceBusTrigger("devicemessagetopic", "temperatureHighSubscription", Connection = "AzureServiceBusConnectionString")]string mySbMsg, TraceWriter log)
{
    log.Info($"C# ServiceBus topic trigger function processed message: {mySbMsg}");

    RegisteredValueModel dataFromServiceBusSubscription = JsonConvert.DeserializeObject<RegisteredValueModel>(mySbMsg);

    // Add to commandModel history data table
    DeviceCommandModel deviceCommand = new DeviceCommandModel()
    {
        DeviceId = dataFromServiceBusSubscription.DeviceId,
        CommandText = "SoundAlarm",
        SentToDevice = false
    };

    CloudStorageAccount cloudStorageAccount = CloudConfigurationFactory.GetCloudStorageAccount();
    var cloudService = new AzureTableStorageService(cloudStorageAccount);
    cloudService.SendDeviceCommandToTableStorage(deviceCommand);

    // Send notification of high temperature to azure logic app:
    INotificationService azureLogicAppNotificationService = new AzureLogicAppSendPushOverNotificationService();
    NotificationModel notification = new NotificationModel()
    {
        Title = "Temperature-alarm",
        Message = $"Temperature-device {dataFromServiceBusSubscription.DeviceId} at {dataFromServiceBusSubscription.NumericalValue:F} degrees",
        From = "IOT",
        To = "pushOverUserId" 
    };
    await azureLogicAppNotificationService.SendNotification(notification);
}

Couldn't be much easier: the trigger is already wired up by design, and the functionality to act on it is all that remains for us to implement. In my case, the subscription is a call to action, namely issuing a device-specific command, "SoundAlarm", to act on the high temperature. All commands are stored in an Azure Storage table, which serves as both audit trail and command repository: all devices may, if so configured, continuously poll this table for any command that needs to be executed by them - identified by their device-id. A quick Azure Function, HTTP-triggered, delivers the goods:


/// <summary>
/// Retrieves the latest not-yet-retrieved command for a device, if any such command exists.
/// </summary>
[FunctionName("GetCommandFromServicesBus")]
public static HttpResponseMessage Run([HttpTrigger(AuthorizationLevel.Anonymous, "get", "post", Route = "GetCommandFromServicesBus")]HttpRequestMessage req, TraceWriter log)
{
    string deviceId = req.GetQueryNameValuePairs().FirstOrDefault(q => string.Compare(q.Key, "deviceId", true) == 0).Value; //req.Content["deviceId"];

    CloudStorageAccount cloudStorageAccount = CloudConfigurationFactory.GetCloudStorageAccount();
    var cloudService = new AzureTableStorageService(cloudStorageAccount);
    DeviceCommandModel commandForDevice = cloudService.GetCommandFromTableStorage(deviceId);

    return commandForDevice == null ?
        req.CreateResponse(HttpStatusCode.OK) // no commands found, just OK status
        :
        req.CreateResponse(HttpStatusCode.OK, JsonConvert.SerializeObject(commandForDevice)); // command found, return as json.
}
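The table lookup behind GetCommandFromTableStorage is not shown in the post. As a rough idea of the logic involved, here is an in-memory sketch in Python, where a plain list stands in for the Azure Storage table of commands. Two things are assumptions on my part: that a command is flagged as sent the moment a device retrieves it, and that commands are handed out oldest first (reverse the iteration to get the newest first instead). The names are illustrative only.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DeviceCommand:
    device_id: str
    command_text: str
    sent_to_device: bool = False


# Stand-in for the Azure Storage table of issued commands (11).
command_table: List[DeviceCommand] = []


def issue_command(device_id: str, command_text: str) -> None:
    """Record a command; it stays in the table afterwards as an audit trail."""
    command_table.append(DeviceCommand(device_id, command_text))


def get_command_for_device(device_id: str) -> Optional[DeviceCommand]:
    """Return the oldest not-yet-retrieved command for the device, marking it as sent."""
    for command in command_table:
        if command.device_id == device_id and not command.sent_to_device:
            command.sent_to_device = True
            return command
    return None
```

Because retrieved commands are only flagged, never deleted, the same table can serve both purposes mentioned above: command repository for the devices and history for the audit trail.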

And so round and round it goes: devices ship data, the data is enriched, sent to a service bus and maybe, maybe not, picked up by a subscription, which in turn triggers a command, and so on and so forth.

I haven't touched upon the use of Azure Logic Apps, and I shan't go into them save to note that I do implement a couple, for notification purposes - for example in the above 'GeneralHighTempTriggerFunction' code. Azure Logic Apps give us the possibility of gluing many Azure offerings together, but that's not my use-case as yet. You can have an Azure Logic App listen for subscription hits on your service bus, for example, and compile multiple messages into a single command to a device, or vice versa. The graphical interface with which you create the Logic Apps is intuitive, yet allows for great complexity in what they execute. You could also make use of them as an elaborate extension point, and out-source business logic to others while you take care of the data yourself, for example.

So that's a bit of inspiration, I hope, on going serverless with Azure and getting those IoT messages flowing. I won't lie, getting just a dozen devices up and running and sending data and commands back and forth is fun to watch - and those Azure offerings make it simple and mostly intuitive to get started. Of course there's tons of stuff I haven't covered in detail in the above, and I'll leave you to work out the missing functions that'll enable the code to compile. It's meant as an appetizer, and I look forward to learning about your particular 'main course', so please by all means drop me a note about what you're doing with Azure and IoT.

IoT projects are fun to be part of. I wish I could do more of them but, to my chagrin, my career path never led me down that road, besides trying it out for fun at home. I hope the above will inspire you in your endeavors. Thanks for checking it out, and if there's anything I can do to help out, get in touch with me and I'll try and do that.



Hope it helps!
